Creating excerpts in Astro

This blog is running on Hugo. It had previously been running on Jekyll. Both these SSGs ship with the ability to create excerpts from your markdown content in 1 line or thereabouts.

/* Hugo */
{{ .Summary | truncate 130 }}

# Ruby
{{ post.content | markdownify | strip_html | truncatewords: 20 }}

I was mildly surprised Astro does not have a corresponding way to do it. To be fair, there is an integration for it: Post Excerpt component for 🚀 Astro. And I’m all for keeping the core product streamlined.

But I’m also THAT annoying developer that likes to keep the dependency count as low as I can. Which means I’m constantly playing this balance game in my head of building the feature myself versus installing it.

I usually give myself an hour or so, and if I feel it’s going to take me more than an hour, I’ll fold.

In this particular case, I’m like, I just need to access the post content, right? That has to exist already, right?

We-ll, kind of?

Someone else already did it!

Cool kids use ChatGPT for all the things now. But I’m not cool. And not young. So I still Google my shit. Which brought me to Paul Scanlon’s post How to Create Excerpts With Astro.

Paul’s website is pretty. Go visit Paul’s website.

The gist of the self-rolled solution is:

Grab the post.body, which is in Markdown
Parse it into HTML using markdown-it
Extract usable text content from the HTML
Cut off the text to whatever length you’d like
Use excerpt and profit

I tried Paul’s instructions exactly, but it didn’t quite work out for my particular use-case. Because I had articles where the <figure> and <img> tag show up very early. And that somehow got parsed into the excerpt.

My solution deviates at step 3. Because I’m not a Computer Science major. I am not well-versed in the art of regex and parsing. Therefore, I cede the responsibility to a professional: html-to-text. Who am I to doubt more than 1 million downloads a week?

Same but different

If it isn’t broke, don’t fix it. So I used a similar implementation strategy as Paul. The source of the script goes into an utils folder and I import it into the layout file that needs it.

The script itself isn’t rocket science. No, I did not use Typescript for this. Don’t @ me.

import MarkdownIt from "markdown-it";
import { convert } from "html-to-text";
const parser = new MarkdownIt();

export const createExcerpt = (body) => {
  const html = parser.render(body);
  const options = {
    wordwrap: null,
    selectors: [
      { selector: "a", options: { ignoreHref: true } },
      { selector: "img", format: "skip" },
      { selector: "figure", format: "skip" },
    ],
  };
  const text = convert(html, options);
  const distilled = convert(text, options);
  return distilled;
};

The fun part is distilled. Why on earth would I run convert() twice? That took me the better part of the hour to figure out.

At first I thought I wasn’t configuring the options correctly, but after reading the documentation and this issue, I realised it was more likely a source issue.

After a couple rounds of console.log, I realised that the first parse from Markdown to HTML sanitised the <figure> and <img> tags to <figure> and <img> because they were wrapped in a <p> tag.

So the first convert() returned all the text content plus these tags. That’s why a second round is needed to clean out these caused-by-sanitisation tags.

Naming things is hard. I just called it distilled. Because you distil booze multiple times.

Actual usage on the [...page].astro file looks something like this:

import { createExcerpt } from '../../utils/create-excerpt';
---

<ol class="postlist">
  {((page as any).data || []).map((blogPostEntry: any) => {
    const excerpt = `${createExcerpt(blogPostEntry.body).substring(0, 300)}...`;
    return (
      <li class="postlist-item">
        <a href={`${blogPostEntry.slug}`} class="postlist-link heading--6">{blogPostEntry.data.title}</a>
        <p>{blogPostEntry.data.description ? blogPostEntry.data.description : excerpt}</p>
      </li>
    )}
  )}
</ol>

Wrapping up

Was it worth the effort to roll this feature myself? I do think so. The code wasn’t complicated. And yes, I succumbed to installing 2 parsers. What can I say, I’m not a rational human being. 乁 ⁠(⁠ ⁠•⁠_⁠•⁠ ⁠) ㄏ

astro