Dynamic Route Generation for Headless CMS

Headless CMS platforms store content as API payloads with no inherent URL structure. Without an explicit build-time mapping step, a JavaScript framework has no way to know which content paths exist, so a request for a content URL typically returns either a client-rendered blank shell or a 404. Dynamic route generation converts CMS slugs into static HTML files at build time, so search engine crawlers receive fully resolved, link-followable pages on the first request.

Prerequisites

Before configuring route generation, verify the following are in place:

Framework version: Next.js 13+ (App Router), SvelteKit 2.x, Nuxt 3.x, or Astro 3+
Environment variables: CMS_URL (base API endpoint) and CMS_API_KEY (bearer token) set in .env.local / CI secrets
CMS pagination: confirm the API supports cursor or offset pagination — unbounded single-request fetches cause build timeouts on catalogs over ~500 items
Webhook endpoint: a deployable URL that can receive publish / unpublish events from the CMS to trigger on-demand revalidation
Node version: 18+ (native fetch, no polyfill required)
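
A missing CMS_URL or CMS_API_KEY usually surfaces as a confusing fetch error halfway through the build, so it can help to check both up front. The sketch below is one way to do that; requireEnv is a hypothetical helper (not part of any framework), and the environment object is passed in explicitly so the function stays easy to test.

```typescript
// Hypothetical helper, not a framework API: fail fast when a required
// environment variable is missing or blank.
type Env = Record<string, string | undefined>;

function requireEnv(env: Env, names: string[]): Record<string, string> {
  const missing = names.filter((name) => (env[name] ?? '').trim() === '');
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  const resolved: Record<string, string> = {};
  for (const name of names) {
    resolved[name] = env[name] as string;
  }
  return resolved;
}

// Possible usage at the top of a build script:
// const { CMS_URL, CMS_API_KEY } = requireEnv(process.env, ['CMS_URL', 'CMS_API_KEY']);
```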

Architecture: CMS-to-URL Execution Path

The diagram below shows how a CMS content item becomes a pre-rendered static HTML file — and how ISR handles slugs published after the build.

Step-by-Step Implementation Workflow

Step 1 — Build the route manifest

Fetch all published items from your CMS and reduce them to a flat array of path segments. Use cursor-based pagination to stay within API rate limits and avoid CI build timeouts.

// lib/cms-routes.ts
interface CMSItem {
  slug: string;
  locale: string;
  publishedAt: string | null;
}

export async function fetchRouteManifest(): Promise<Array<{ slug: string; locale: string }>> {
  const routes: Array<{ slug: string; locale: string }> = [];
  let cursor: string | null = null;

  do {
    const url = new URL(`${process.env.CMS_URL}/posts`);
    url.searchParams.set('limit', '200');
    if (cursor) url.searchParams.set('after', cursor);

    const res = await fetch(url.toString(), {
      headers: {
        Authorization: `Bearer ${process.env.CMS_API_KEY}`,
        Accept: 'application/json',
      },
      next: { revalidate: 0 }, // Next.js-only fetch option: bypass the data cache so builds see fresh content
    });

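    // Abort the build on an API error instead of silently generating a partial manifest
    if (!res.ok) {
      throw new Error(`CMS request failed: ${res.status} ${res.statusText}`);
    }

    const { data, meta } = await res.json();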
    const published = (data as CMSItem[]).filter((item) => item.publishedAt !== null);
    routes.push(...published.map((item) => ({ slug: item.slug, locale: item.locale })));
    cursor = meta.nextCursor ?? null;
  } while (cursor !== null);

  return routes;
}

Validation: Log routes.length during build and compare against the CMS admin dashboard published count. A mismatch indicates a pagination bug or filter issue.
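
The count comparison above can also be automated so that a mismatch fails the build rather than relying on someone reading the log. This is a minimal sketch; checkManifestParity and its result shape are hypothetical names, and the expected count is assumed to come from wherever your CMS reports its published total.

```typescript
// Hypothetical build-time guard: compare the manifest length with the
// published count reported by the CMS.
interface ParityResult {
  ok: boolean;
  built: number;
  expected: number;
  message: string;
}

function checkManifestParity(built: number, expected: number): ParityResult {
  if (built === expected) {
    return { ok: true, built, expected, message: `Route manifest matches CMS count (${built}).` };
  }
  const direction = built < expected ? 'fewer' : 'more';
  const delta = Math.abs(built - expected);
  return {
    ok: false,
    built,
    expected,
    message: `Manifest has ${delta} ${direction} routes than the CMS reports (${built} vs ${expected}).`,
  };
}

// Possible usage after fetchRouteManifest():
// const result = checkManifestParity(routes.length, cmsPublishedCount);
// if (!result.ok) throw new Error(result.message);
```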

Step 2 — Register routes with the framework

Pass the manifest to the framework’s static-path hook. Each framework has a different API but the same contract: return an array of param objects that map to URL segments.

Next.js App Router

// app/blog/[slug]/page.tsx
import { fetchRouteManifest } from '@/lib/cms-routes';

export async function generateStaticParams(): Promise<Array<{ slug: string }>> {
  const routes = await fetchRouteManifest();
  return routes.map((r) => ({ slug: r.slug }));
}

// Allow on-demand ISR for slugs published after the build
export const dynamicParams = true;
export const revalidate = 3600;

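SEO impact: All known routes are served as static HTML from the CDN edge with low, predictable TTFB. With dynamicParams = true, a slug published after the build is rendered on its first request and then cached, so it is crawlable without a full rebuild. revalidate = 3600 regenerates cached pages at most once per hour, so edits to existing posts appear within roughly that window.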

Validation: Inspect .next/server/app/blog/ — each slug should have a corresponding .html file. On deployed pages, confirm x-nextjs-cache: HIT on the second request.

SvelteKit

// src/routes/blog/[slug]/+page.ts
import { error } from '@sveltejs/kit';
import type { EntryGenerator, PageLoad } from './$types';
import { fetchRouteManifest } from '$lib/cms-routes';

export const prerender = true;

export const entries: EntryGenerator = async () => {
  const routes = await fetchRouteManifest();
  return routes.map((r) => ({ slug: r.slug }));
};

export const load: PageLoad = async ({ fetch, params }) => {
  const res = await fetch(`/api/articles/${params.slug}`);
  // Fail with an explicit status instead of rendering an empty template (a soft 404)
  if (!res.ok) error(res.status === 404 ? 404 : 500, 'Article could not be loaded');
  return { article: await res.json() };
};

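SEO impact: prerender = true produces static HTML at build time for all listed entries. Crawlers receive fully resolved HTML that does not depend on hydration, which helps more pages get indexed within a given crawl budget. SvelteKit still ships a client bundle for hydration by default; add export const csr = false to the page if you also want no client-side JavaScript.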

Validation: Run npm run build. Inspect .svelte-kit/output/prerendered/pages/blog/. Run curl -o /dev/null -w "%{http_code}" https://staging.example.com/blog/<slug> for a sample of slugs — all must return 200.

Nuxt 3

// nuxt.config.ts
export default defineNuxtConfig({
  routeRules: {
    '/blog/**': { prerender: true },
    '/products/**': { isr: 3600 },
  },
  nitro: {
    prerender: {
      crawlLinks: true,
      routes: ['/sitemap.xml'],
    },
  },
});

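SEO impact: prerender: true on /blog/** marks blog routes for static generation at build time. With crawlLinks: true, Nitro discovers those routes by following links from pages it has already prerendered, so a post that nothing links to is only generated if it is listed explicitly. /products/** uses ISR with a 1-hour TTL, which suits inventory data that changes frequently. The result is a rendering strategy split by content type.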

Validation: Run npx nuxi build. Check .output/public/blog/ for prerendered HTML. The Nitro build log should confirm the expected route count under Prerendering routes.
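
Unlike the other frameworks, the Nuxt config above relies on link crawling rather than on the route manifest. If you want every manifest entry prerendered regardless of internal linking, one option is to turn the manifest into explicit paths. The sketch below shows only that mapping; toPrerenderRoutes is a hypothetical helper, and the /blog prefix is an assumption taken from the routeRules above.

```typescript
// Hypothetical helper: convert manifest entries into explicit prerender paths.
// Duplicate slugs (for example the same slug in two locales) are emitted once.
interface ManifestRoute {
  slug: string;
  locale: string;
}

function toPrerenderRoutes(routes: ManifestRoute[], prefix: string): string[] {
  const seen: Record<string, true> = Object.create(null);
  const paths: string[] = [];
  for (const route of routes) {
    const path = `${prefix}/${encodeURIComponent(route.slug)}`;
    if (!seen[path]) {
      seen[path] = true;
      paths.push(path);
    }
  }
  return paths;
}

// Possible usage (verify against your Nuxt version): append the result to
// nitro.prerender.routes, for example from a 'nitro:config' hook in nuxt.config.ts.
```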

Astro

// src/pages/blog/[slug].astro
---
import { fetchRouteManifest } from '../../lib/cms-routes';
import type { GetStaticPaths } from 'astro';

export const getStaticPaths: GetStaticPaths = async () => {
  const routes = await fetchRouteManifest();
  return routes.map((r) => ({
    params: { slug: r.slug },
    props: { slug: r.slug },
  }));
};

const { slug } = Astro.props;
---

SEO impact: Astro generates static HTML with no client-side JavaScript by default. No hydration bundle is shipped unless a component opts in with a client: directive, so crawlers never wait on script execution to see content. This is among the lightest outputs available for SEO-critical content.

Validation: Inspect dist/blog/ for .html files. Verify Content-Type: text/html and the absence of <script type="module"> hydration tags in the rendered source.

HTTP Headers and CDN Directives Reference

Header / Directive	Required value	Rationale
`Authorization`	`Bearer ${CMS_API_KEY}`	Authenticate CMS API calls during build and ISR
`Cache-Control` (static pages)	`public, s-maxage=86400, stale-while-revalidate=3600`	Long CDN TTL; background refresh for changed content
`Cache-Control` (ISR pages)	`public, s-maxage=3600, stale-while-revalidate=60`	Shorter TTL matches revalidate interval
`Cache-Control` (error pages)	`no-store`	Prevent 404/410 responses from being cached by mistake
`X-Robots-Tag`	`index, follow`	Confirm indexation intent on all dynamically generated routes
`Vary`	`Accept-Language`	Required when the same URL serves different locales based on the request header; unnecessary when each locale has its own URL
CDN cache key	Vary on `locale`, `content-type`	Prevent locale collisions in shared CDN caches
Webhook purge	Tag-based purge on `publish` event	Invalidate stale ISR pages when content updates arrive
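
To keep these values in one place rather than scattered across route files, the table can be encoded as a small lookup. This is a sketch under the assumption that your framework or CDN adapter lets you attach a plain header map to a response; PageKind and headersFor are hypothetical names.

```typescript
// Hypothetical lookup that mirrors the Cache-Control rows of the table above.
type PageKind = 'static' | 'isr' | 'error';

const CACHE_CONTROL: Record<PageKind, string> = {
  static: 'public, s-maxage=86400, stale-while-revalidate=3600',
  isr: 'public, s-maxage=3600, stale-while-revalidate=60',
  error: 'no-store',
};

function headersFor(kind: PageKind, localized: boolean): Record<string, string> {
  const headers: Record<string, string> = { 'Cache-Control': CACHE_CONTROL[kind] };
  // Only successful content pages should advertise indexation intent.
  if (kind !== 'error') headers['X-Robots-Tag'] = 'index, follow';
  // Vary is only meaningful when one URL serves several locales.
  if (localized) headers['Vary'] = 'Accept-Language';
  return headers;
}
```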

URL Structure and Slug Processing

Raw CMS slugs often contain uppercase characters, special symbols, or trailing whitespace that produce duplicate URL variants. Apply slug normalization strategies before inserting slugs into the route manifest:

Lowercase the entire string: slug.toLowerCase()
Replace spaces and underscores with hyphens: .replace(/[\s_]+/g, '-')
Strip non-alphanumeric characters (except hyphens): .replace(/[^a-z0-9-]/g, '')
Collapse consecutive hyphens: .replace(/-{2,}/g, '-')
Trim leading and trailing hyphens: .replace(/^-|-$/g, '')

Configure a middleware redirect for any request arriving at a non-normalized variant:

// middleware.ts (Next.js)
import { NextRequest, NextResponse } from 'next/server';

function normalizeSlug(slug: string): string {
  return slug
    .toLowerCase()
    .replace(/[\s_]+/g, '-')
    .replace(/[^a-z0-9-]/g, '')
    .replace(/-{2,}/g, '-')
    .replace(/^-|-$/g, '');
}

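export function middleware(req: NextRequest) {
  const { pathname } = req.nextUrl;
  const segments = pathname.split('/');
  const normalized = segments.map(normalizeSlug).join('/');

  if (normalized !== pathname) {
    // Clone the URL so the query string survives the redirect
    const target = req.nextUrl.clone();
    target.pathname = normalized;
    return NextResponse.redirect(target, 301);
  }
}

// Restrict the middleware to content routes. Without a matcher it would also
// rewrite /_next asset paths and files such as /favicon.ico.
export const config = { matcher: ['/blog/:path*'] };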

Issue 301 redirects for malformed variants so link equity consolidates on the canonical form. Set Cache-Control: public, max-age=31536000, immutable on the redirect response — the CDN can then serve the redirect from cache, avoiding an origin roundtrip on every crawl.
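
Normalization can also make two distinct CMS entries collapse onto the same URL (for example "My Post" and "my_post"), in which case only one of them can be served. The sketch below checks a list of raw slugs for such collisions before they reach the manifest. findSlugCollisions is a hypothetical helper, and normalizeSlug is repeated here only so the example is self-contained.

```typescript
// Same normalization steps as described above, repeated for self-containment.
function normalizeSlug(slug: string): string {
  return slug
    .toLowerCase()
    .replace(/[\s_]+/g, '-')
    .replace(/[^a-z0-9-]/g, '')
    .replace(/-{2,}/g, '-')
    .replace(/^-|-$/g, '');
}

// Hypothetical helper: group raw slugs by normalized form and return only
// the groups that contain more than one raw slug.
function findSlugCollisions(rawSlugs: string[]): Record<string, string[]> {
  const groups: Record<string, string[]> = Object.create(null);
  for (const raw of rawSlugs) {
    const key = normalizeSlug(raw);
    if (!groups[key]) groups[key] = [];
    groups[key].push(raw);
  }
  const collisions: Record<string, string[]> = Object.create(null);
  for (const key of Object.keys(groups)) {
    if (groups[key].length > 1) collisions[key] = groups[key];
  }
  return collisions;
}
```

Failing the build when the returned object is non-empty is one reasonable policy; the right resolution (renaming one entry, or redirecting it) is an editorial decision.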

Pagination and Archive Route Handling

Blog and product archive routes follow a sequential pattern (/blog/page/2, /blog/page/3) that requires explicit generation. Align this with the pagination handling for headless APIs reference for full header and canonical configuration.

// lib/pagination-routes.ts
export async function fetchPaginationManifest(
  contentType: string,
  pageSize = 20
): Promise<Array<{ page: string }>> {
  const res = await fetch(`${process.env.CMS_URL}/${contentType}/count`, {
    headers: { Authorization: `Bearer ${process.env.CMS_API_KEY}` },
  });
  const { total } = await res.json();
  const totalPages = Math.ceil(total / pageSize);

  // Exclude page 1 — that's the canonical archive index
  return Array.from({ length: totalPages - 1 }, (_, i) => ({ page: String(i + 2) }));
}

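On each paginated page, add rel="prev" / rel="next" links and give the page a self-referencing rel="canonical". Google's pagination guidance advises against pointing every paginated page's canonical at page 1, because that asks it to disregard the deeper pages and the items linked only from them:

// Rendered in <head>
<link rel="canonical" href={`/blog/page/${page}/`} />
<link rel="prev" href={page > 2 ? `/blog/page/${page - 1}/` : '/blog/'} />
{hasNextPage && <link rel="next" href={`/blog/page/${page + 1}/`} />}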

You can also serve the same relationships as Link response headers alongside the HTML, for consumers that read headers rather than <head> elements:

Link: </blog/page/2/>; rel="next"
Cache-Control: public, s-maxage=86400
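
The prev/next logic is easy to get wrong at the boundaries (page 1 lives at the archive index, and the last page has no next link), so it can be worth isolating in a pure function. The following is a sketch; buildPaginationLinks is a hypothetical helper, and it assumes the trailing-slash URL pattern used in the <head> example.

```typescript
// Hypothetical helper: derive prev/next URLs and a Link header value for an
// archive page. Page 1 is assumed to live at the archive index (basePath + '/').
interface PaginationLinks {
  prev: string | null;
  next: string | null;
  linkHeader: string;
}

function buildPaginationLinks(basePath: string, page: number, totalPages: number): PaginationLinks {
  const urlFor = (n: number): string => (n === 1 ? `${basePath}/` : `${basePath}/page/${n}/`);
  const prev = page > 1 ? urlFor(page - 1) : null;
  const next = page < totalPages ? urlFor(page + 1) : null;

  const parts: string[] = [];
  if (prev) parts.push(`<${prev}>; rel="prev"`);
  if (next) parts.push(`<${next}>; rel="next"`);

  return { prev, next, linkHeader: parts.join(', ') };
}
```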

Validation Protocol

Run the following checks after each build and before any production deployment.

1. Manifest count parity

# Count generated routes in the build output (each prerendered slug is written as <slug>.html)
find .next/server/app/blog -name '*.html' | wc -l
# Should match CMS published count
curl -s "${CMS_URL}/posts/count" \
  -H "Authorization: Bearer ${CMS_API_KEY}" | jq '.total'

2. HTTP status sampling

# Verify a random sample of 20 routes return 200
jq -r '.[].slug' build-manifest.json | shuf | head -20 | \
  xargs -I{} curl -o /dev/null -s -w "%{http_code} {}\n" \
  "https://staging.example.com/blog/{}"

3. Cache header verification

curl -I https://staging.example.com/blog/my-post | \
  grep -E 'cache-control|x-nextjs-cache|x-cache|cf-cache-status'

Expected output: cache-control: public, s-maxage=86400 and cf-cache-status: HIT on the second request.

4. XML sitemap coverage

curl -s https://staging.example.com/sitemap.xml | \
  xmllint --noout - && echo "Valid XML"
# Count URLs in sitemap and compare to manifest length
curl -s https://staging.example.com/sitemap.xml | \
  grep -c '<loc>'

For automated XML sitemap generation from the route manifest, see the sitemap configuration reference.
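
Comparing counts catches missing URLs but not which ones are missing. The sketch below lists manifest URLs that are absent from a sitemap document. Both function names are hypothetical, and the regex-based extraction is a deliberate simplification that assumes plain <loc> elements with no CDATA sections or XML entities that need decoding.

```typescript
// Hypothetical helper: pull every <loc> value out of a sitemap string.
function extractSitemapLocs(xml: string): string[] {
  const locs: string[] = [];
  const pattern = /<loc>\s*([^<]+?)\s*<\/loc>/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(xml)) !== null) {
    locs.push(match[1]);
  }
  return locs;
}

// Hypothetical helper: return the expected URLs that the sitemap does not list.
function missingFromSitemap(xml: string, expectedUrls: string[]): string[] {
  const present = extractSitemapLocs(xml);
  return expectedUrls.filter((url) => present.indexOf(url) === -1);
}
```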

5. GSC URL Inspection

After deploying, use Google Search Console URL Inspection on a sample of newly generated routes. Confirm the indexing status shows URL is on Google or Discovered — currently not indexed (not Page with redirect or Soft 404).

Lighthouse CI threshold: mobile Performance score ≥ 90, LCP ≤ 2.5 s. Pre-rendered static routes should clear this easily; a failure usually points to heavy client-side hydration or unoptimised assets on pages that were meant to be static.

Troubleshooting

Symptom	Root cause	Fix
Build timeout on large catalogs	Single unbounded API call fetching all routes	Switch to cursor-based pagination; chunk in batches of 200
Route count mismatch (build < CMS)	CMS count includes draft items that the manifest filters out	Compare against a published-only count; keep the `publishedAt !== null` filter when mapping to the route array
Soft 404 in GSC (200 status but no content)	CMS item deleted after build; framework serves empty template	Add an existence check in `load()` or the page component and return a `404` status explicitly (for example `notFound()` in the Next.js App Router)
`x-nextjs-cache: MISS` on every request	`revalidate = 0` set during debugging and not reverted	Restore `export const revalidate = 3600` on the page file
Duplicate URLs indexed (`/Blog/Post` and `/blog/post`)	Missing slug normalisation step	Apply lowercase + hyphen middleware redirect before route generation
Paginated pages missing from index	Pagination routes not included in sitemap	Generate paginated route array and append to sitemap XML
ISR pages not updating after CMS publish	Webhook not triggering on-demand revalidation	Verify the webhook URL is reachable and that the handler calls `revalidatePath('/blog/<slug>')` (App Router; `res.revalidate()` is the Pages Router equivalent)
`410 Gone` pages still indexed	CDN still serves a cached 200 for the removed slug, so crawlers never see the 410	Purge the CDN cache for deleted slugs; set `Cache-Control: no-store` on error responses

Error Handling and Fallback Routing

When a request arrives for a slug that does not exist in the CMS, the framework must return an explicit 404 or 410, not a 200 with empty content. Soft 404s waste crawl budget and are a common source of indexation problems on headless sites. See fixing 404s in headless dynamic routes for the full diagnostic workflow.

// app/blog/[slug]/page.tsx — explicit 404 on missing content
import { notFound } from 'next/navigation';

export default async function BlogPost({ params }: { params: { slug: string } }) {
  const res = await fetch(`${process.env.CMS_URL}/posts/${params.slug}`, {
    headers: { Authorization: `Bearer ${process.env.CMS_API_KEY}` },
  });

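  if (res.status === 404) notFound();         // renders Next.js not-found.tsx → HTTP 404
  if (res.status === 410) {
    // A page component cannot return a Response or set an arbitrary status code.
    // Fall back to 404 here; to send a true 410, answer the request earlier,
    // for example in middleware.
    notFound();
  }
  // Surface any other upstream failure instead of rendering an empty article
  if (!res.ok) throw new Error(`CMS request failed: ${res.status}`);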

  const post = await res.json();
  return <Article post={post} />;
}

Return 410 Gone (not 404) for content that previously existed and has been permanently removed. A 410 signals to redirect chain management workflows that no redirect is warranted, and Googlebot tends to drop the URL from its index somewhat faster than it does after a 404.

Set Cache-Control: no-store on all 4xx responses to prevent CDN caching of transient error states.
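
Deciding between 404 and 410 requires knowing which slugs used to exist. The sketch below assumes you can obtain two lists, live slugs and permanently removed slugs (how you track removals is up to your CMS or a tombstone store, and is not covered by this guide). errorInitForSlug is a hypothetical helper that returns a status and headers you could pass to a response constructor in middleware.

```typescript
// Hypothetical helper: choose the error status for a slug, or null when the
// slug is live and the page should render normally.
interface ErrorResponseInit {
  status: 404 | 410;
  headers: Record<string, string>;
}

function errorInitForSlug(
  slug: string,
  liveSlugs: string[],
  removedSlugs: string[]
): ErrorResponseInit | null {
  if (liveSlugs.indexOf(slug) !== -1) return null;
  // 410 for content that existed and was removed; 404 for everything else.
  const status = removedSlugs.indexOf(slug) !== -1 ? 410 : 404;
  return { status, headers: { 'Cache-Control': 'no-store' } };
}

// Possible usage in Next.js middleware (sketch):
// const init = errorInitForSlug(slug, liveSlugs, removedSlugs);
// if (init) return new NextResponse(null, init);
```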

Automating Route Generation in CI/CD

Manual builds break the contract between CMS publishes and live URLs. Wire the full route-generation pipeline into your CI/CD system so every content change triggers a targeted rebuild. The automating dynamic route generation for headless blogs guide covers the full webhook and pipeline configuration.

High-level CI integration:

# .github/workflows/build.yml (excerpt)
on:
  repository_dispatch:
    types: [cms-publish, cms-unpublish]

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm run build
        env:
          CMS_URL: ${{ secrets.CMS_URL }}
          CMS_API_KEY: ${{ secrets.CMS_API_KEY }}
      - run: npm run validate-routes  # cross-reference manifest vs CMS count

For ISR-based stacks, prefer on-demand revalidation over full rebuilds: call revalidatePath('/blog/<slug>') (from next/cache in the App Router; res.revalidate() is the Pages Router equivalent) from a webhook handler so only the changed route is regenerated, keeping build times flat as the catalog grows.
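
A webhook handler first has to translate the CMS event into the paths that need regenerating. The sketch below covers only that translation. The payload shape is hypothetical (real CMS webhooks differ), and the event names are borrowed from the repository_dispatch types in the workflow excerpt.

```typescript
// Hypothetical webhook payload; adapt the fields to what your CMS sends.
interface CmsWebhookEvent {
  type: 'cms-publish' | 'cms-unpublish';
  section: string; // URL section the item lives under, e.g. 'blog'
  slug: string;
}

// Return the paths affected by the event: the item page itself and the
// archive index, whose listing changes when an item appears or disappears.
function pathsToRevalidate(event: CmsWebhookEvent): string[] {
  return [`/${event.section}/${event.slug}`, `/${event.section}`];
}

// Possible usage inside the handler, using your framework's on-demand
// revalidation call (for example revalidatePath in the Next.js App Router):
// for (const path of pathsToRevalidate(event)) revalidatePath(path);
```

Paginated archive pages may also shift when an item is added or removed; whether to revalidate them eagerly or let the timed interval catch up is a trade-off between freshness and regeneration cost.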

Pages in This Section

Automating Dynamic Route Generation for Headless Blogs — webhook configuration, CI/CD pipeline hooks, and scheduled rebuild patterns for production CMS deployments
Fixing 404s in Headless Dynamic Routes — diagnosing and resolving soft 404s, explicit 404/410 status handling, and GSC coverage monitoring

Frequently Asked Questions

How does dynamic route generation affect crawl budget?

Pre-rendered static routes give crawlers instantly accessible HTML with predictable TTFB, which maximises the number of pages Googlebot indexes per crawl session. Unoptimised SSR routes that trigger slow upstream CMS calls can exhaust crawl budget in headless deployments on large catalogs.

Should I use SSG or ISR for headless dynamic routes?

SSG is optimal for SEO-critical, evergreen content because every URL is pre-rendered at build time and served from cache with consistently low TTFB. ISR is better for large catalogs or content that changes frequently; set a revalidate interval matched to your publish cadence. The ISR vs SSG vs CSR routing guide covers the full decision matrix.

How do I validate dynamically generated routes after deployment?

Cross-reference the build manifest count with the CMS published item count, run curl -I against a representative sample to confirm 200 OK and correct cache headers, then use GSC URL Inspection to verify successful indexation.

What causes build timeouts on large CMS datasets?

Fetching the entire route list in a single unbounded API call is the most common cause. Use cursor-based pagination on the CMS side to chunk requests, and cap static generation limits with ISR fallback for slugs outside the initial build set.
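
Capping the build set can be as simple as splitting the manifest in two. The sketch below assumes the manifest is already ordered by priority (for example newest first, which depends on how your CMS sorts results); selectBuildSet is a hypothetical helper.

```typescript
// Hypothetical helper: prebuild the first `limit` routes and leave the rest
// to be rendered on demand (ISR fallback).
function selectBuildSet<T>(routes: T[], limit: number): { prebuilt: T[]; deferred: T[] } {
  const cap = Math.max(0, Math.floor(limit));
  return { prebuilt: routes.slice(0, cap), deferred: routes.slice(cap) };
}

// Possible usage in a static-path hook:
// const { prebuilt } = selectBuildSet(await fetchRouteManifest(), 1000);
// return prebuilt.map((r) => ({ slug: r.slug }));
```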


Related

Slug Normalization Strategies — sanitization pipeline, middleware redirect patterns, and duplicate-URL prevention
Pagination Handling in Headless — sequential archive route generation, rel=prev/next headers, and canonical configuration
XML Sitemap Generation for Headless — building and serving dynamic sitemaps from the route manifest
Canonical URL Enforcement — enforcing canonical tags across dynamically generated route variants
Crawl Budget Impact in Headless — how rendering strategy choices affect Googlebot’s crawl allocation