Serve Markdown to AI Agents (Hugo + Vercel)

A robot hand reaching for a markdown document

The problem

AI coding agents like Claude Code use tools like WebFetch to read web pages. These tools convert HTML to Markdown internally, but the conversion is lossy — navigation, sidebars, and other chrome get mixed into the content, and the structure of the original page can get mangled.

Meanwhile, Hugo sites already have the original Markdown source for every page. Why not serve it directly to clients that want it?

The detection signal

Claude Code’s WebFetch sends:

Accept: text/markdown, text/html, */*

Browsers don’t send text/markdown in their Accept header by default, and neither do bots that don’t want Markdown. Its presence is therefore a clean, opt-in signal, and there’s no need to inspect User-Agent strings at all.
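A plain substring check is enough for this signal, but a slightly stricter version can parse the header so that an explicit refusal such as text/markdown;q=0 doesn’t count. Here’s a minimal sketch of that idea. The helper name wantsMarkdown is hypothetical, and this is not a complete Accept parser:

```javascript
// Hypothetical helper: true if the Accept header lists text/markdown
// with a non-zero quality value. Not a full HTTP Accept parser.
function wantsMarkdown(acceptHeader) {
  if (!acceptHeader) return false;
  return acceptHeader.split(',').some((entry) => {
    const [type, ...params] = entry.split(';').map((s) => s.trim().toLowerCase());
    if (type !== 'text/markdown') return false;
    // An absent q parameter means the default quality of 1.
    const q = params.find((p) => p.startsWith('q='));
    return q === undefined || parseFloat(q.slice(2)) > 0;
  });
}

console.log(wantsMarkdown('text/markdown, text/html, */*')); // true
console.log(wantsMarkdown('text/html,application/xhtml+xml,*/*;q=0.8')); // false
console.log(wantsMarkdown('text/markdown;q=0, text/html')); // false
```

This still ignores relative preference between formats, on the assumption that any client listing text/markdown would rather have it.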

Hugo setup

First, Hugo needs to output Markdown alongside HTML. That’s easy, because for most websites the Markdown is already there. In hugo.toml, add a Markdown output format and configure pages and sections to use it:

[outputFormats.markdown]
  mediaType = "text/markdown"
  baseName = "index"
  isPlainText = true
  notAlternative = true

[mediaTypes."text/markdown"]
  suffixes = ["md"]

[outputs]
  page = ["HTML", "markdown"]
  section = ["HTML", "markdown"]

Then create a Markdown template at layouts/_default/single.markdown:

---
title: {{ .Title }}
url: {{ .Permalink }}
date: {{ .Date.Format "2006-01-02" }}
---

{{ .RawContent }}

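Hugo will now generate an index.md next to every page’s index.html. Section pages are rendered from list templates, so they need a matching layouts/_default/list.markdown to get one as well. For most people, that’s all you need.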

Edge Middleware

The middleware itself is minimal. You’ll just need @vercel/functions. Here’s middleware.js, which goes into the repo root:

import { rewrite, next } from '@vercel/functions';

export default function middleware(request) {
  const accept = request.headers.get('accept') || '';
  // this assumes that no agent in its right mind would ask for markdown
  // and then prefer any other format over markdown
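  if (accept.includes('text/markdown')) {
    const url = new URL(request.url);
    // Ensure a trailing slash so '/articles/foo' maps to '/articles/foo/index.md'
    const dir = url.pathname.endsWith('/') ? url.pathname : url.pathname + '/';
    return rewrite(new URL(dir + 'index.md', url.origin));
  }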
  return next();
}

export const config = {
  matcher: ['/((?!.*\\.(?:css|js|png|jpg|jpeg|gif|svg|ico|webp|woff|woff2|ttf|eot|xml|txt|json|md)).*)'],
};

The matcher excludes static assets, so the middleware only runs on page requests.

Note that middleware.js is only used by Vercel to run the middleware; it isn’t served as a static file.
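To see which paths the matcher pattern lets through, it can be approximated locally with an anchored JavaScript RegExp. This is only a sketch: it assumes Vercel matches the pattern against the full pathname, and the last path in the list is a made-up slug.

```javascript
// Approximation of the matcher as an anchored JavaScript RegExp.
// Assumption: the pattern is applied to the whole pathname.
const matcher = new RegExp(
  '^/((?!.*\\.(?:css|js|png|jpg|jpeg|gif|svg|ico|webp|woff|woff2|ttf|eot|xml|txt|json|md)).*)$'
);

const paths = [
  '/',
  '/articles/hash-table-attack/',
  '/css/main.css',
  '/articles/hash-table-attack/index.md',
  '/articles/node.js-tips/', // hypothetical slug that contains ".js"
];

for (const p of paths) {
  console.log(p, matcher.test(p));
}
```

Under this approximation the first two paths match and the other three don’t. The last one is skipped because the extension list isn’t anchored to the end of the path, so a slug that merely contains ".js" looks like an asset. If that matters for your URLs, adding a $ right after the extension group inside the lookahead would limit the exclusion to paths that actually end in one of those extensions.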

We also need to add response headers in vercel.json so the Markdown files have the right content type:

{
  "headers": [
    {
      "source": "/(.*)/index.md",
      "headers": [
        {
          "key": "Content-Type",
          "value": "text/markdown; charset=utf-8"
        },
        {
          "key": "Vary",
          "value": "Accept"
        }
      ]
    }
  ]
}

The Vary: Accept header tells caches that the response varies based on the Accept header, so a cached Markdown response won’t be served to a browser.
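A toy model makes the effect of Vary easier to see. The sketch below only models how a cache might build its key; real CDNs are more involved, and the function name cacheKey is hypothetical:

```javascript
// Toy model of a cache key: the URL plus the value of every request
// header named in the response's Vary header.
function cacheKey(url, requestHeaders, varyHeader) {
  const varied = (varyHeader || '')
    .split(',')
    .map((h) => h.trim().toLowerCase())
    .filter(Boolean)
    .map((h) => `${h}=${requestHeaders[h] || ''}`);
  return [url, ...varied].join('|');
}

const url = '/articles/hash-table-attack/';
const agent = { accept: 'text/markdown, text/html, */*' };
const browser = { accept: 'text/html' };

// Without Vary, both requests share one key, so one cached body serves both.
console.log(cacheKey(url, agent, '') === cacheKey(url, browser, '')); // true

// With Vary: Accept, the keys differ and each client gets its own entry.
console.log(cacheKey(url, agent, 'Accept') === cacheKey(url, browser, 'Accept')); // false
```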

Cost

Edge Middleware on Vercel uses the fluid compute pricing model. The Hobby (free) tier includes 1 million invocations per month. The middleware runs on every page request (static assets are excluded by the matcher), so this should be plenty for a personal site.

Testing

# Should return Markdown
curl -s -H "Accept: text/markdown" https://tashian.com/articles/hash-table-attack/

# Should return HTML
curl -s -H "Accept: text/html" https://tashian.com/articles/hash-table-attack/

# Bare curl — returns HTML (curl sends Accept: */*)
curl -s https://tashian.com/articles/hash-table-attack/ | head -5