This blog is running on Hugo. It had previously been running on Jekyll. Both of these SSGs ship with the ability to create excerpts from your Markdown content in one line or thereabouts.
/* Hugo */
{{ .Summary | truncate 130 }}
# Jekyll (Liquid)
{{ post.content | markdownify | strip_html | truncatewords: 20 }}
I was mildly surprised Astro does not have a corresponding way to do it. To be fair, there is an integration for it: Post Excerpt component for 🚀 Astro. And I’m all for keeping the core product streamlined.
But I’m also THAT annoying developer that likes to keep the dependency count as low as I can. Which means I’m constantly playing this balance game in my head of building the feature myself versus installing it.
I usually give myself an hour or so, and if I feel it’s going to take me more than an hour, I’ll fold.
In this particular case, I’m like, I just need to access the post content, right? That has to exist already, right?
We-ll, kind of?
Someone else already did it!
Cool kids use ChatGPT for all the things now. But I’m not cool. And not young. So I still Google my shit. Which brought me to Paul Scanlon’s post How to Create Excerpts With Astro.
Paul’s website is pretty. Go visit Paul’s website.
The gist of the self-rolled solution is:
- Grab the post.body, which is in Markdown
- Parse it into HTML using markdown-it
- Extract usable text content from the HTML
- Cut off the text to whatever length you’d like
- Use excerpt and profit
I tried Paul’s instructions exactly, but they didn’t quite work out for my particular use case, because I had articles where the <figure> and <img> tags show up very early. And those somehow got parsed into the excerpt.
My solution deviates at step 3. Because I’m not a Computer Science major. I am not well-versed in the art of regex and parsing. Therefore, I cede the responsibility to a professional: html-to-text. Who am I to doubt more than 1 million downloads a week?
Same but different
If it isn’t broke, don’t fix it. So I used a similar implementation strategy to Paul’s. The source of the script goes into a utils folder and I import it into the layout file that needs it.
The script itself isn’t rocket science. No, I did not use TypeScript for this. Don’t @ me.
import MarkdownIt from "markdown-it";
import { convert } from "html-to-text";

const parser = new MarkdownIt();

export const createExcerpt = (body) => {
  // Render the Markdown body into HTML.
  const html = parser.render(body);
  const options = {
    wordwrap: null, // no hard line wrapping in the output
    selectors: [
      { selector: "a", options: { ignoreHref: true } }, // keep link text, drop the URL
      { selector: "img", format: "skip" },
      { selector: "figure", format: "skip" },
    ],
  };
  // First pass: strip the real HTML tags.
  const text = convert(html, options);
  // Second pass: clean out the tags that only surface after the first pass.
  const distilled = convert(text, options);
  return distilled;
};
The fun part is distilled. Why on earth would I run convert() twice? That took me the better part of the hour to figure out.
At first I thought I wasn’t configuring the options correctly, but after reading the documentation and this issue, I realised it was more likely a source issue.
After a couple of rounds of console.log, I realised that the first parse from Markdown to HTML sanitised the <figure> and <img> tags to &lt;figure&gt; and &lt;img&gt; because they were wrapped in a <p> tag.
So the first convert() returned all the text content plus these tags. That’s why a second round is needed to clean out these caused-by-sanitisation tags.
Naming things is hard. I just called it distilled. Because you distil booze multiple times.
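If the double pass still feels like magic, here is a dependency-free toy model of it. To be clear, stripTags, decodeEntities and toyConvert are names I made up for illustration, and this is not how html-to-text works internally. It only assumes that one pass removes real tags and decodes escaped entities, which is enough to show why the escaped tags survive round one and disappear in round two.

```javascript
// Toy stand-ins (assumptions, not the html-to-text internals).
// stripTags removes anything that looks like a real tag.
const stripTags = (html) => html.replace(/<[^>]*>/g, "");

// decodeEntities turns escaped characters back into literal ones.
// &amp; is decoded last so nothing gets decoded twice in one pass.
const decodeEntities = (text) =>
  text
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&amp;/g, "&");

// One "convert" pass in this toy model: drop real tags, then decode entities.
const toyConvert = (html) => decodeEntities(stripTags(html)).trim();

// Assumed shape of the rendered HTML when the raw tags were escaped.
const escaped =
  "<p>&lt;figure&gt;&lt;img src=&quot;cat.jpg&quot;&gt;&lt;/figure&gt; Hello world</p>";

// Round one removes <p> but turns the escaped tags into literal ones.
const firstPass = toyConvert(escaped);
// Round two sees those literal tags and removes them.
const secondPass = toyConvert(firstPass);

console.log(firstPass);
console.log(secondPass);
```

The first log still has the figure and img tags in it, the second one is just the text. Same idea as running convert() twice.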
Actual usage in the [...page].astro file looks something like this:
---
import { createExcerpt } from '../../utils/create-excerpt';
---
<ol class="postlist">
  {((page as any).data || []).map((blogPostEntry: any) => {
    const excerpt = `${createExcerpt(blogPostEntry.body).substring(0, 300)}...`;
    return (
      <li class="postlist-item">
        <a href={`${blogPostEntry.slug}`} class="postlist-link heading--6">{blogPostEntry.data.title}</a>
        <p>{blogPostEntry.data.description ? blogPostEntry.data.description : excerpt}</p>
      </li>
    )}
  )}
</ol>
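One nitpick on my own code: substring(0, 300) will happily chop a word in half, and the ellipsis gets added even when the post is shorter than 300 characters. If that bothers you, a small helper along these lines might do. truncateAtWord is a hypothetical name and not part of my actual setup, so treat it as a sketch rather than what runs on this site.

```javascript
// Hypothetical helper: trims text to at most maxLength characters without
// cutting a word in half, and only adds an ellipsis when something was cut.
const truncateAtWord = (text, maxLength = 300) => {
  // Collapse runs of whitespace (including newlines) into single spaces.
  const clean = text.replace(/\s+/g, " ").trim();
  if (clean.length <= maxLength) return clean;

  // Look one character past the limit so a word ending exactly at the limit is kept.
  const slice = clean.slice(0, maxLength + 1);
  const lastSpace = slice.lastIndexOf(" ");

  // Fall back to a hard cut when the slice is one very long word.
  const cut = lastSpace > 0 ? slice.slice(0, lastSpace) : clean.slice(0, maxLength);
  return `${cut}...`;
};

console.log(truncateAtWord("The quick brown fox jumps", 12));
```

In the template, that would replace the substring() call with something like truncateAtWord(createExcerpt(blogPostEntry.body), 300).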
Wrapping up
Was it worth the effort to roll this feature myself? I do think so. The code wasn’t complicated. And yes, I succumbed to installing 2 parsers. What can I say, I’m not a rational human being. 乁 ( •_• ) ㄏ