Dynamic Route Generation for Headless CMS
Headless CMS platforms store content as API payloads with no inherent URL structure. Without an explicit build-time mapping step, JavaScript frameworks produce no crawlable paths: every URL either returns a client-rendered blank shell or a 404. Dynamic route generation is the process that converts CMS slugs into static HTML files at build time, so that search engine crawlers receive fully resolved, link-followable paths on the first request.
Prerequisites
Before configuring route generation, verify the following are in place:
- Framework version: Next.js 13+ (App Router), SvelteKit 2.x, Nuxt 3.x, or Astro 3+
- Environment variables: CMS_URL (base API endpoint) and CMS_API_KEY (bearer token) set in .env.local / CI secrets
- CMS pagination: confirm the API supports cursor or offset pagination; unbounded single-request fetches cause build timeouts on catalogs over ~500 items
- Webhook endpoint: a deployable URL that can receive publish/unpublish events from the CMS to trigger on-demand revalidation
- Node version: 18+ (native fetch, no polyfill required)
Architecture: CMS-to-URL Execution Path
The diagram below shows how a CMS content item becomes a pre-rendered static HTML file, and how ISR handles slugs published after the build.
Step-by-Step Implementation Workflow
Step 1: Build the route manifest
Fetch all published items from your CMS and reduce them to a flat array of path segments. Use cursor-based pagination to stay within API rate limits and avoid CI build timeouts.
// lib/cms-routes.ts
interface CMSItem {
  slug: string;
  locale: string;
  publishedAt: string | null;
}

export async function fetchRouteManifest(): Promise<Array<{ slug: string; locale: string }>> {
  const routes: Array<{ slug: string; locale: string }> = [];
  let cursor: string | null = null;

  do {
    const url = new URL(`${process.env.CMS_URL}/posts`);
    url.searchParams.set('limit', '200');
    if (cursor) url.searchParams.set('after', cursor);

    const res = await fetch(url.toString(), {
      headers: {
        Authorization: `Bearer ${process.env.CMS_API_KEY}`,
        Accept: 'application/json',
      },
      next: { revalidate: 0 }, // Next.js-only option: always fresh during build; drop it in other frameworks
    });
    // Fail the build loudly instead of emitting a truncated manifest
    if (!res.ok) throw new Error(`CMS request failed with status ${res.status}`);

    const { data, meta } = (await res.json()) as {
      data: CMSItem[];
      meta: { nextCursor?: string | null };
    };
    // Drafts have no publishedAt and must not become routes
    const published = data.filter((item) => item.publishedAt !== null);
    routes.push(...published.map((item) => ({ slug: item.slug, locale: item.locale })));
    cursor = meta.nextCursor ?? null;
  } while (cursor !== null);

  return routes;
}
Validation: Log routes.length during build and compare against the CMS admin dashboard published count. A mismatch indicates a pagination bug or filter issue.
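The count comparison can also be automated rather than read off the build log. The following is a minimal sketch, assuming the CMS published count has already been fetched as a plain number; `checkManifestParity` and its report shape are hypothetical names for illustration, not part of any framework API. It also flags duplicate locale/slug pairs, which are one way a paginated fetch can inflate the manifest.

```typescript
interface RouteEntry {
  slug: string;
  locale: string;
}

interface ParityReport {
  manifestCount: number;
  cmsCount: number;
  duplicates: string[]; // "locale/slug" keys that appear more than once
  ok: boolean;
}

// Compare the manifest length with the CMS published count and
// report any locale/slug pair that occurs more than once.
export function checkManifestParity(routes: RouteEntry[], cmsCount: number): ParityReport {
  const seen = new Set<string>();
  const duplicates = new Set<string>();

  for (const route of routes) {
    const key = `${route.locale}/${route.slug}`;
    if (seen.has(key)) duplicates.add(key);
    seen.add(key);
  }

  return {
    manifestCount: routes.length,
    cmsCount,
    duplicates: Array.from(duplicates),
    ok: routes.length === cmsCount && duplicates.size === 0,
  };
}
```

A build script could call this after `fetchRouteManifest()` and exit non-zero when `ok` is false.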
Step 2: Register routes with the framework
Pass the manifest to the framework's static-path hook. Each framework has a different API but the same contract: return an array of param objects that map to URL segments.
Next.js App Router
// app/blog/[slug]/page.tsx
import { fetchRouteManifest } from '@/lib/cms-routes';

export async function generateStaticParams(): Promise<Array<{ slug: string }>> {
  const routes = await fetchRouteManifest();
  return routes.map((r) => ({ slug: r.slug }));
}

// Allow on-demand ISR for slugs published after the build
export const dynamicParams = true;
export const revalidate = 3600;
SEO impact: All known routes render as static HTML and are served from the CDN edge with low TTFB. With dynamicParams = true, a slug published after the build is rendered on its first request and then cached; revalidate = 3600 refreshes already-generated pages at most once per hour, so neither case needs a full rebuild.
Validation: Inspect .next/server/app/blog/; each slug should have a corresponding .html file. On deployed pages, confirm x-nextjs-cache: HIT on the second request.
SvelteKit
// src/routes/blog/[slug]/+page.ts
import { error } from '@sveltejs/kit';
import type { EntryGenerator, PageLoad } from './$types';
import { fetchRouteManifest } from '$lib/cms-routes';

export const prerender = true;

export const entries: EntryGenerator = async () => {
  const routes = await fetchRouteManifest();
  return routes.map((r) => ({ slug: r.slug }));
};

export const load: PageLoad = async ({ fetch, params }) => {
  const res = await fetch(`/api/articles/${params.slug}`);
  // Throw a real HTTP error instead of rendering an empty page (a soft 404)
  if (!res.ok) error(res.status, 'Article not found');
  return { article: await res.json() };
};
SEO impact: prerender = true emits static HTML for every listed entry at build time. Crawlers receive fully resolved HTML that does not depend on hydration, maximising the pages indexed per crawl budget allocation. The page still ships its client-side JavaScript unless csr = false is also exported.
Validation: Run npm run build. Inspect .svelte-kit/output/prerendered/pages/blog/. Run curl -o /dev/null -w "%{http_code}" https://staging.example.com/blog/<slug> for a sample of slugs; all must return 200.
Nuxt 3
// nuxt.config.ts
export default defineNuxtConfig({
  routeRules: {
    '/blog/**': { prerender: true },
    '/products/**': { isr: 3600 },
  },
  nitro: {
    prerender: {
      crawlLinks: true,
      routes: ['/sitemap.xml'],
    },
  },
});
SEO impact: prerender: true on /blog/** tells the Nitro engine to generate static HTML for every blog route at build time, while /products/** uses ISR with a 1-hour TTL, which is appropriate for inventory data that changes frequently. This splits crawl budget allocation by content type.
Validation: Run npx nuxi build. Check .output/public/blog/ for prerendered HTML. The Nitro build log should confirm the expected route count under Prerendering routes.
Astro
---
// src/pages/blog/[slug].astro
import { fetchRouteManifest } from '../../lib/cms-routes';
import type { GetStaticPaths } from 'astro';

export const getStaticPaths: GetStaticPaths = async () => {
  const routes = await fetchRouteManifest();
  return routes.map((r) => ({
    params: { slug: r.slug },
    props: { slug: r.slug },
  }));
};

const { slug } = Astro.props;
---
SEO impact: Astro generates zero-JS static HTML by default. No hydration bundle is shipped to crawlers, which removes render-blocking scripts from the page. This is among the lightest outputs available for SEO-critical content.
Validation: Inspect dist/blog/ for .html files. Verify Content-Type: text/html and the absence of <script type="module"> hydration tags in the rendered source.
HTTP Headers and CDN Directives Reference
| Header / Directive | Required value | Rationale |
|---|---|---|
| Authorization | Bearer ${CMS_API_KEY} | Authenticate CMS API calls during build and ISR |
| Cache-Control (static pages) | public, s-maxage=86400, stale-while-revalidate=3600 | Long CDN TTL; background refresh for changed content |
| Cache-Control (ISR pages) | public, s-maxage=3600, stale-while-revalidate=60 | Shorter TTL matches revalidate interval |
| Cache-Control (error pages) | no-store | Prevent 404/410 responses from being cached by mistake |
| X-Robots-Tag | index, follow | Confirm indexation intent on all dynamically generated routes |
| Vary | Accept-Language | Required when generating locale-specific route variants |
| CDN cache key | Vary on locale, content-type | Prevent locale collisions in shared CDN caches |
| Webhook purge | Tag-based purge on publish event | Invalidate stale ISR pages when content updates arrive |
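To keep these values in one place, the header choices can be expressed as a small lookup. This is a minimal sketch using the values from the reference table; the `PageKind` type and the `responseHeadersFor` helper are hypothetical names, and how the headers get attached to a response depends on your framework and CDN.

```typescript
type PageKind = 'static' | 'isr' | 'error';

// Cache-Control values taken from the reference table.
const CACHE_CONTROL: Record<PageKind, string> = {
  static: 'public, s-maxage=86400, stale-while-revalidate=3600',
  isr: 'public, s-maxage=3600, stale-while-revalidate=60',
  error: 'no-store',
};

// Build the response headers for a page of the given kind.
// `localized` marks routes that vary by locale.
export function responseHeadersFor(kind: PageKind, localized: boolean): Record<string, string> {
  const headers: Record<string, string> = { 'Cache-Control': CACHE_CONTROL[kind] };
  // Only successful pages should advertise indexation intent
  if (kind !== 'error') headers['X-Robots-Tag'] = 'index, follow';
  if (localized) headers['Vary'] = 'Accept-Language';
  return headers;
}
```

Centralising the values this way makes it harder for the ISR TTL and the revalidate interval to drift apart.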
URL Structure and Slug Processing
Raw CMS slugs often contain uppercase characters, special symbols, or trailing whitespace that produce duplicate URL variants. Apply slug normalization strategies before inserting slugs into the route manifest:
- Lowercase the entire string: slug.toLowerCase()
- Replace spaces and underscores with hyphens: .replace(/[\s_]+/g, '-')
- Strip non-alphanumeric characters (except hyphens): .replace(/[^a-z0-9-]/g, '')
- Collapse consecutive hyphens: .replace(/-{2,}/g, '-')
- Trim leading and trailing hyphens: .replace(/^-|-$/g, '')
Configure a middleware redirect for any request arriving at a non-normalized variant:
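// middleware.ts (Next.js)
import { NextRequest, NextResponse } from 'next/server';

function normalizeSlug(slug: string): string {
  return slug
    .toLowerCase()
    .replace(/[\s_]+/g, '-')
    .replace(/[^a-z0-9-]/g, '')
    .replace(/-{2,}/g, '-')
    .replace(/^-|-$/g, '');
}

export function middleware(req: NextRequest) {
  const { pathname } = req.nextUrl;
  const normalized = pathname.split('/').map(normalizeSlug).join('/');

  if (normalized !== pathname) {
    // Clone the URL so the query string survives the redirect
    const target = req.nextUrl.clone();
    target.pathname = normalized;
    return NextResponse.redirect(target, 301);
  }
}

// Scope the middleware to content routes; running it on every path would also
// rewrite asset URLs such as /_next/static/... and /sitemap.xml
export const config = { matcher: ['/blog/:path*'] };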
Issue 301 redirects for malformed variants so link equity consolidates on the canonical form. Set Cache-Control: public, max-age=31536000, immutable on the redirect response so the CDN can serve the redirect from cache, avoiding an origin roundtrip on every crawl.
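Normalization can itself create duplicates when two distinct raw slugs collapse to the same canonical form. The sketch below shows one way to detect this before the manifest is built; it reuses the normalization pipeline from this section, and `findSlugCollisions` is a hypothetical helper name. How to resolve a collision (rename in the CMS, append a suffix) is left to the editorial workflow.

```typescript
// Same pipeline as the normalizeSlug function used in the middleware.
function normalizeSlug(slug: string): string {
  return slug
    .toLowerCase()
    .replace(/[\s_]+/g, '-')
    .replace(/[^a-z0-9-]/g, '')
    .replace(/-{2,}/g, '-')
    .replace(/^-|-$/g, '');
}

// Group raw slugs by their normalized form and return only the groups
// where more than one raw slug maps to the same URL segment.
export function findSlugCollisions(rawSlugs: string[]): Map<string, string[]> {
  const byNormalized = new Map<string, string[]>();
  for (const raw of rawSlugs) {
    const key = normalizeSlug(raw);
    const group = byNormalized.get(key) ?? [];
    group.push(raw);
    byNormalized.set(key, group);
  }

  const collisions = new Map<string, string[]>();
  byNormalized.forEach((group, key) => {
    if (group.length > 1) collisions.set(key, group);
  });
  return collisions;
}
```

Failing the build when the returned map is non-empty keeps two CMS items from silently competing for one URL.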
Pagination and Archive Route Handling
Blog and product archive routes follow a sequential pattern (/blog/page/2, /blog/page/3) that requires explicit generation. Align this with the pagination handling for headless APIs reference for full header and canonical configuration.
// lib/pagination-routes.ts
export async function fetchPaginationManifest(
  contentType: string,
  pageSize = 20
): Promise<Array<{ page: string }>> {
  const res = await fetch(`${process.env.CMS_URL}/${contentType}/count`, {
    headers: { Authorization: `Bearer ${process.env.CMS_API_KEY}` },
  });
  if (!res.ok) throw new Error(`CMS count request failed with status ${res.status}`);

  const { total } = await res.json();
  const totalPages = Math.ceil(total / pageSize);

  // Exclude page 1: that's the canonical archive index.
  // Math.max keeps the length non-negative when the archive is empty.
  return Array.from({ length: Math.max(0, totalPages - 1) }, (_, i) => ({ page: String(i + 2) }));
}
On each paginated page, add rel="prev" / rel="next" link elements and a self-referencing rel="canonical". Pointing every page's canonical at the first page of the archive tells crawlers to disregard the deeper pages, along with the links to posts that appear only there:
// Rendered in <head>
<link rel="canonical" href={`/blog/page/${page}/`} />
<link rel="prev" href={page > 2 ? `/blog/page/${page - 1}/` : '/blog/'} />
{hasNextPage && <link rel="next" href={`/blog/page/${page + 1}/`} />}
Serve Link response headers alongside the HTML, since some crawlers read headers in preference to <head> elements:
Link: </blog/page/2>; rel="next"
Cache-Control: public, s-maxage=86400
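The Link header value can be derived from the current page number. This is a minimal sketch, assuming the /blog/ and /blog/page/N/ URL pattern used in this section; `buildPaginationLinkHeader` is a hypothetical helper name.

```typescript
// Build the value of the Link response header for one archive page.
// Page 1 lives at `${base}/`; later pages live at `${base}/page/N/`.
export function buildPaginationLinkHeader(
  page: number,
  totalPages: number,
  base = '/blog',
): string {
  const hrefFor = (n: number): string => (n === 1 ? `${base}/` : `${base}/page/${n}/`);

  const links: string[] = [];
  if (page > 1) links.push(`<${hrefFor(page - 1)}>; rel="prev"`);
  if (page < totalPages) links.push(`<${hrefFor(page + 1)}>; rel="next"`);

  // Multiple link values are comma-separated within a single header
  return links.join(', ');
}
```

An empty string means the archive has a single page, in which case the header can be omitted.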
Validation Protocol
Run the following checks after each build and before any production deployment.
1. Manifest count parity
# Count generated routes in the build output
find .next/server/app/blog -name '*.html' | wc -l
# Should match CMS published count
curl -s "${CMS_URL}/posts/count" \
-H "Authorization: Bearer ${CMS_API_KEY}" | jq '.total'
2. HTTP status sampling
# Verify a random sample of 20 routes return 200
jq -r '.[].slug' build-manifest.json | shuf | head -20 | \
xargs -I{} curl -o /dev/null -s -w "%{http_code} {}\n" \
"https://staging.example.com/blog/{}"
3. Cache header verification
curl -I https://staging.example.com/blog/my-post | \
grep -E 'cache-control|x-nextjs-cache|x-cache|cf-cache-status'
Expected output: cache-control: public, s-maxage=86400 and cf-cache-status: HIT on the second request.
4. XML sitemap coverage
curl -s https://staging.example.com/sitemap.xml | \
xmllint --noout - && echo "Valid XML"
# Count URLs in sitemap and compare to manifest length
curl -s https://staging.example.com/sitemap.xml | \
grep -c '<loc>'
For automated XML sitemap generation from the route manifest, see the sitemap configuration reference.
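As a rough illustration of how the route manifest can feed a sitemap, the sketch below turns manifest entries into a sitemap document. It assumes every entry lives under /blog/ on a single origin and ignores locale prefixes; `buildSitemapXml` and `escapeXml` are hypothetical helper names, not part of the referenced sitemap configuration.

```typescript
interface RouteEntry {
  slug: string;
  locale: string;
}

// Escape the five characters that are special in XML text content.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Emit one <url><loc> entry per manifest route.
export function buildSitemapXml(routes: RouteEntry[], origin: string): string {
  const urls = routes
    .map((r) => {
      const loc = `${origin}/blog/${encodeURIComponent(r.slug)}`;
      return `  <url><loc>${escapeXml(loc)}</loc></url>`;
    })
    .join('\n');

  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    urls,
    '</urlset>',
    '',
  ].join('\n');
}
```

Because the sitemap and the static paths come from the same manifest, the `<loc>` count check above should match the manifest length by construction.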
5. GSC URL Inspection
After deploying, use Google Search Console URL Inspection on a sample of newly generated routes. Confirm the indexing status shows "URL is on Google" or "Discovered - currently not indexed" (not "Page with redirect" or "Soft 404").
Lighthouse CI threshold: mobile Performance score ≥ 90, LCP ≤ 2.5 s. Pre-rendered static routes should clear this easily; a failure usually points to a hydration issue leaking into static pages.
Troubleshooting
| Symptom | Root cause | Fix |
|---|---|---|
| Build timeout on large catalogs | Single unbounded API call fetching all routes | Switch to cursor-based pagination; chunk in batches of 200 |
| Route count mismatch (build > CMS) | Draft items included in API response | Filter by publishedAt !== null before mapping to route array |
| Soft 404 in GSC (200 status but no content) | CMS item deleted after build; framework serves empty template | Add existence check in load() or the page component; return 404 status explicitly |
| x-nextjs-cache: MISS on every request | revalidate = 0 set during debugging and not reverted | Restore export const revalidate = 3600 on the page file |
| Duplicate URLs indexed (/Blog/Post and /blog/post) | Missing slug normalisation step | Normalise slugs before route generation and add the lowercase + hyphen middleware redirect |
| Paginated pages missing from index | Pagination routes not included in sitemap | Generate paginated route array and append to sitemap XML |
| ISR pages not updating after CMS publish | Webhook not triggering on-demand revalidation | Verify webhook URL is accessible and revalidatePath('/blog/<slug>') is called (res.revalidate('/blog/<slug>') in the Pages Router) |
| 410 Gone pages still indexed | 410 response cached by CDN | Set Cache-Control: no-store on error responses; purge CDN cache for deleted slugs |
Error Handling and Fallback Routing
When a request arrives for a slug that does not exist in the CMS, the framework must return an explicit 404 or 410, not a 200 with empty content. Soft 404s are a leading cause of indexation penalties on headless sites. See fixing 404s in headless dynamic routes for the full diagnostic workflow.
// app/blog/[slug]/page.tsx: explicit 404 on missing content
import { notFound } from 'next/navigation';
import { Article } from '@/components/Article'; // hypothetical component path

export default async function BlogPost({ params }: { params: { slug: string } }) {
  const res = await fetch(`${process.env.CMS_URL}/posts/${params.slug}`, {
    headers: { Authorization: `Bearer ${process.env.CMS_API_KEY}` },
  });

  // A page component must return JSX and cannot return a Response, so it has
  // no way to set a 410 status. Both "missing" and "gone" render not-found.tsx
  // (HTTP 404) here; emit a true 410 from middleware or a route handler.
  if (res.status === 404 || res.status === 410) notFound();
  if (!res.ok) throw new Error(`CMS request failed with status ${res.status}`);

  const post = await res.json();
  return <Article post={post} />;
}
Return 410 Gone (not 404) for content that previously existed and has been permanently removed. This signals to redirect chain management workflows that no redirect is warranted, and Googlebot tends to drop the URL from its index faster than it would after a 404.
Set Cache-Control: no-store on all 4xx responses to prevent CDN caching of transient error states.
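One way to apply both rules consistently is to decide the status and headers in a single function that middleware or a route handler can call. This is a minimal sketch, assuming two hypothetical lookups maintained from CMS events: `liveSlugs` (currently published) and `goneSlugs` (permanently removed). The `errorSpecForSlug` name and the return shape are illustrative only.

```typescript
interface ErrorResponseSpec {
  status: 404 | 410;
  headers: Record<string, string>;
}

// Decide which error response a slug should receive.
// Returns null when the slug is live and the request should proceed normally.
export function errorSpecForSlug(
  slug: string,
  liveSlugs: ReadonlySet<string>,
  goneSlugs: ReadonlySet<string>,
): ErrorResponseSpec | null {
  if (liveSlugs.has(slug)) return null;

  // 410 for content that existed and was permanently removed; 404 otherwise
  const status = goneSlugs.has(slug) ? 410 : 404;

  // no-store keeps CDNs from caching a transient error state
  return { status, headers: { 'Cache-Control': 'no-store' } };
}
```

The returned object has the same fields as a standard `ResponseInit`, so a handler could pass it as the second argument of `new Response(null, spec)`.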
Automating Route Generation in CI/CD
Manual builds break the contract between CMS publishes and live URLs. Wire the full route-generation pipeline into your CI/CD system so every content change triggers a targeted rebuild. The automating dynamic route generation for headless blogs guide covers the full webhook and pipeline configuration.
High-level CI integration:
# .github/workflows/build.yml (excerpt)
on:
  repository_dispatch:
    types: [cms-publish, cms-unpublish]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci
      - run: npm run build
        env:
          CMS_URL: ${{ secrets.CMS_URL }}
          CMS_API_KEY: ${{ secrets.CMS_API_KEY }}
      - run: npm run validate-routes # cross-reference manifest vs CMS count
For ISR-based stacks, prefer on-demand revalidation over full rebuilds: call revalidatePath('/blog/<slug>') (App Router; res.revalidate('/blog/<slug>') in the Pages Router) from a webhook handler so only the changed route is regenerated, keeping build times flat as the catalog grows.
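The framework-independent part of such a webhook handler is validating the payload and deciding which paths to regenerate. The sketch below assumes a hypothetical payload shape of `{ type, slug }`; real CMS webhooks differ, so adapt the parsing to your provider. In a Next.js App Router route handler, each returned path would then be passed to `revalidatePath` from `next/cache`.

```typescript
interface CmsWebhookEvent {
  type: 'publish' | 'unpublish';
  slug: string;
}

// Validate an untrusted webhook body. Returns null for anything that
// does not match the assumed { type, slug } shape.
export function parseWebhookEvent(payload: unknown): CmsWebhookEvent | null {
  if (typeof payload !== 'object' || payload === null) return null;
  const record = payload as Record<string, unknown>;
  const type = record.type;
  const slug = record.slug;
  if (type !== 'publish' && type !== 'unpublish') return null;
  if (typeof slug !== 'string' || slug === '') return null;
  return { type, slug };
}

// The changed post plus the archive index that links to it.
export function pathsToRevalidate(event: CmsWebhookEvent, archivePath = '/blog'): string[] {
  return [`${archivePath}/${event.slug}`, archivePath];
}
```

Revalidating the archive index alongside the post keeps listing pages from linking to content that has just been unpublished.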
Pages in This Section
- Automating Dynamic Route Generation for Headless Blogs: webhook configuration, CI/CD pipeline hooks, and scheduled rebuild patterns for production CMS deployments
- Fixing 404s in Headless Dynamic Routes: diagnosing and resolving soft 404s, explicit 404/410 status handling, and GSC coverage monitoring
Frequently Asked Questions
How does dynamic route generation affect crawl budget?
Pre-rendered static routes give crawlers instantly accessible HTML with predictable TTFB, which maximises the number of pages Googlebot indexes per crawl session. Unoptimised SSR routes that trigger slow upstream CMS calls can exhaust crawl budget in headless deployments on large catalogs.
Should I use SSG or ISR for headless dynamic routes?
SSG is optimal for SEO-critical, evergreen content because every URL is pre-rendered at build time, guaranteeing instant TTFB. ISR is better for large catalogs or content that changes frequently; set a revalidate interval matched to your publish cadence. The ISR vs SSG vs CSR routing guide covers the full decision matrix.
How do I validate dynamically generated routes after deployment?
Cross-reference the build manifest count with the CMS published item count, run curl -I against a representative sample to confirm 200 OK and correct cache headers, then use GSC URL Inspection to verify successful indexation.
What causes build timeouts on large CMS datasets?
Fetching the entire route list in a single unbounded API call is the most common cause. Use cursor-based pagination on the CMS side to chunk requests, cap the number of pages generated statically, and let ISR render slugs outside the initial build set on demand.
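Capping the build set amounts to choosing which slugs to prerender and leaving the rest to ISR. This is a minimal sketch, assuming the most recently published items are the ones worth prebuilding and that each item carries an ISO 8601 `publishedAt` timestamp; `selectPrebuildSlugs` and the `limit` parameter are hypothetical, and a different ranking signal such as traffic may suit your catalog better.

```typescript
interface PublishedItem {
  slug: string;
  publishedAt: string; // ISO 8601 timestamp
}

// Pick the `limit` most recently published slugs for static generation.
// Slugs left out are expected to be rendered on demand by ISR.
export function selectPrebuildSlugs(items: PublishedItem[], limit: number): string[] {
  return items
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => Date.parse(b.publishedAt) - Date.parse(a.publishedAt))
    .slice(0, Math.max(0, limit))
    .map((item) => item.slug);
}
```

In Next.js, for example, the result could be returned from `generateStaticParams` while `dynamicParams = true` lets the remaining slugs render on first request.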
Related
- Slug Normalization Strategies: sanitization pipeline, middleware redirect patterns, and duplicate-URL prevention
- Pagination Handling in Headless: sequential archive route generation, rel=prev/next headers, and canonical configuration
- XML Sitemap Generation for Headless: building and serving dynamic sitemaps from the route manifest
- Canonical URL Enforcement: enforcing canonical tags across dynamically generated route variants
- Crawl Budget Impact in Headless: how rendering strategy choices affect Googlebot's crawl allocation