Pagination Handling in Headless Architectures

Without an explicit pagination contract between your CMS API and frontend, crawlers encounter orphaned endpoints, infinite scroll traps, and index-bloating query-string duplicates. This page documents how to design the full pagination stack, from API contract to CDN header, so every /page/{n}/ route resolves predictably, carries the correct indexation signals, and doesn't squander crawl budget on low-value archive pages.

Prerequisites

Before implementing pagination, confirm each item below is in place:

  • Node.js 20+ and the CLI for your framework (Next.js 14+, Nuxt 3.12+, SvelteKit 2+, or Astro 4+)
  • CMS_URL environment variable set to your headless CMS base endpoint in .env.local (and in your CI secret store)
  • CMS API returns pagination metadata: at minimum totalPages or a nextCursor field in every list response
  • Edge/CDN middleware support: Vercel Edge Middleware, Cloudflare Workers, or Netlify Edge Functions for header injection
  • Screaming Frog or similar crawler available for post-deploy validation
  • Google Search Console property verified so URL Inspection is accessible

Execution Path: Offset Pagination vs Cursor Pagination

The pagination strategy you choose at the API layer determines which URL shapes and indexation patterns are feasible downstream.

Figure: Offset vs cursor pagination decision tree. The flowchart starts from the CMS API pagination type (totalPages or cursor in the response) and the dataset size. For datasets under 100k items it selects offset pagination (/page/1/, /page/2/, …): pre-render all routes with generateStaticParams() and inject rel=prev/next plus canonical per page. Otherwise it selects cursor pagination with a URL translation layer: build the cursor → page map at build time or via ISR and apply the same URL signals (canonical, noindex on page 2+). Both branches end with validation via curl and GSC: 200 status, correct headers, rel tags in <head>.
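
The cursor branch depends on a cursor-to-page map, which can be sketched as follows. This is a minimal illustration, not a vendor SDK: the `CursorPage` shape, the `fetchByCursor` callback, and the `maxPages` cap are all assumptions made for the example.

```typescript
// Hypothetical shape of one cursor-paginated CMS response (an assumption).
interface CursorPage {
  nextCursor?: string;
}

// Assumed fetcher signature: an undefined cursor requests the first page.
type CursorFetcher = (cursor?: string) => Promise<CursorPage>;

// Walk the cursor chain once and record which cursor opens each page number.
// Page 1 maps to undefined because the first request carries no cursor.
async function buildCursorMap(
  fetchByCursor: CursorFetcher,
  maxPages: number = 10000 // safety cap so a cyclic cursor cannot loop forever
): Promise<Map<number, string | undefined>> {
  const map = new Map<number, string | undefined>();
  let cursor: string | undefined = undefined;
  for (let page = 1; page <= maxPages; page++) {
    map.set(page, cursor);
    const res: CursorPage = await fetchByCursor(cursor);
    if (!res.nextCursor) break; // last page reached
    cursor = res.nextCursor;
  }
  return map;
}
```

A route handler for /page/{n}/ can then look up the cursor for page n in the map and pass it to the CMS, so crawlers only ever see stable numeric URLs.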

Step-by-Step Implementation

Step 1: Define the API pagination contract

Create a TypeScript interface that normalises both offset and cursor responses into a single shape your route generators can consume:

// lib/pagination.ts
export interface PaginationMeta {
  totalPages: number;
  currentPage: number;
  nextCursor?: string;
  prevCursor?: string;
}

export interface PaginatedResponse<T> {
  items: T[];
  pagination: PaginationMeta;
}

export async function fetchPage<T>(
  endpoint: string,
  page: number
): Promise<PaginatedResponse<T>> {
  const res = await fetch(`${process.env.CMS_URL}${endpoint}?page=${page}&pageSize=10`);
  if (!res.ok) throw new Error(`CMS fetch failed: ${res.status}`);
  const raw = await res.json();
  // Normalise vendor-specific shapes
  return {
    items: raw.data ?? raw.items ?? raw.results,
    pagination: {
      totalPages: raw.totalPages ?? raw.meta?.total_pages,
      currentPage: page,
      nextCursor: raw.nextCursor,
      prevCursor: raw.prevCursor,
    },
  };
}

Validation: Call fetchPage('/articles', 1) in a test script and assert pagination.totalPages is a positive integer before wiring it into route generation.
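
The validation step above could be sketched as a small guard. `assertValidPagination` is a hypothetical helper name, and the `PaginationMeta` interface is repeated from lib/pagination.ts only so the block stands alone.

```typescript
// Mirrors the PaginationMeta interface from lib/pagination.ts.
interface PaginationMeta {
  totalPages: number;
  currentPage: number;
  nextCursor?: string;
  prevCursor?: string;
}

// Throws if the normalised metadata cannot safely drive route generation.
function assertValidPagination(meta: PaginationMeta): void {
  if (!Number.isInteger(meta.totalPages) || meta.totalPages < 1) {
    throw new Error(`Invalid totalPages: ${String(meta.totalPages)}`);
  }
  if (
    !Number.isInteger(meta.currentPage) ||
    meta.currentPage < 1 ||
    meta.currentPage > meta.totalPages
  ) {
    throw new Error(`currentPage ${meta.currentPage} is outside 1..${meta.totalPages}`);
  }
}
```

Calling it on the result of fetchPage('/articles', 1) during the build fails fast when the CMS response shape changes, instead of silently generating a single page.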

Step 2: Generate static routes at build time

Static pre-rendering of every /page/{n}/ route guarantees 200 responses and eliminates client-side routing fallbacks that block crawlers. This is the core technique covered in dynamic route generation for headless builds.

See framework-specific implementations in the next section.

Step 3: Inject rel=prev/next and canonical per page

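These link tags declare the pagination sequence to crawlers. Google has stated that it no longer uses rel=prev/next as an indexing signal, but other crawlers and audit tools may still read them, and the self-referencing canonical keeps each page from being treated as a duplicate of page 1.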
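
A framework-neutral way to compute these tags might look like the sketch below. `HeadLink` and `buildPaginationLinks` are illustrative names, and the trailing-slash URL shape follows the /page/{n}/ convention used on this page.

```typescript
// Descriptor for one <link> element in the document head.
interface HeadLink {
  rel: 'canonical' | 'prev' | 'next';
  href: string;
}

// Build the canonical, prev, and next links for a given page number.
// `base` is the URL prefix before the page number, without a trailing slash.
function buildPaginationLinks(base: string, current: number, totalPages: number): HeadLink[] {
  const links: HeadLink[] = [{ rel: 'canonical', href: `${base}/${current}/` }];
  if (current > 1) {
    links.push({ rel: 'prev', href: `${base}/${current - 1}/` });
  }
  if (current < totalPages) {
    links.push({ rel: 'next', href: `${base}/${current + 1}/` });
  }
  return links;
}
```

The first page gets no prev link and the last page gets no next link, so the sequence never points at a route that does not exist.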

Step 4: Apply indexation directives to deep pages

Pages 2 and beyond rarely warrant independent index slots because the first page owns the keyword ranking. Set noindex, follow on page 2+ to consolidate signals and protect crawl budget in headless deployments.
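
The page boundary is easy to get wrong by one, so it can help to isolate the rule in a single function. This is a small sketch with a hypothetical helper name, `robotsDirectiveFor`.

```typescript
// Returns the robots directive for a paginated route, or null when the page
// should stay indexable. Only page 1 is left without a directive.
function robotsDirectiveFor(page: number): string | null {
  if (!Number.isInteger(page) || page < 1) {
    throw new RangeError(`Invalid page number: ${page}`);
  }
  return page > 1 ? 'noindex, follow' : null;
}
```

The same return value can feed both the robots meta tag and the X-Robots-Tag header, which keeps the two signals from drifting apart.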

Step 5: Enforce canonical URL patterns at the edge

Query-string variants (?page=2) must redirect 301 to path-based equivalents (/page/2/). Apply this at the CDN layer before requests hit your origin. This is part of the broader canonical URL enforcement strategy.

Step 6: Validate and monitor

Run curl -sI header checks, Lighthouse CI, and GSC URL Inspection against every /page/{n}/ route after each deploy. See the Validation Protocol section below.

Framework-Specific Implementations

Next.js App Router

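generateStaticParams pre-renders all paginated routes at build time. An App Router page renders inside the layout's <body>, so it cannot output its own <head> element. Emit canonical and robots through generateMetadata instead, and render the rel=prev/next links as <link> elements:

// app/blog/page/[page]/page.tsx
import type { Metadata } from 'next';
import { fetchPage } from '@/lib/pagination';

const base = 'https://seo-architecture.com/blog/page';

// Next.js 15+ passes params as a Promise; await it there
type Props = { params: { page: string } };

export async function generateStaticParams(): Promise<Array<{ page: string }>> {
  const { pagination } = await fetchPage('/articles', 1);
  return Array.from({ length: pagination.totalPages }, (_, i) => ({
    page: (i + 1).toString(),
  }));
}

// Canonical and robots go through the Metadata API so they land in <head>
export function generateMetadata({ params }: Props): Metadata {
  const current = parseInt(params.page, 10);
  return {
    alternates: { canonical: `${base}/${current}/` },
    ...(current > 1 ? { robots: { index: false, follow: true } } : {}),
  };
}

export default async function BlogPage({ params }: Props) {
  const current = parseInt(params.page, 10);
  const { items, pagination } = await fetchPage('/articles', current);

  return (
    <>
      {/* React 19 hoists <link> elements into <head>; on earlier versions confirm placement in the raw HTML */}
      {current > 1 && <link rel="prev" href={`${base}/${current - 1}/`} />}
      {current < pagination.totalPages && (
        <link rel="next" href={`${base}/${current + 1}/`} />
      )}
      {/* render items */}
    </>
  );
}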

SEO impact: Pre-renders all paginated routes as static HTML served from the CDN edge. Crawlers receive immediate 200 responses with correct rel signals already in the markup, so no JavaScript execution is required.

Validation: Inspect .next/server/app/blog/page/ β€” one directory per page number must exist. Run curl -s https://your-domain.com/blog/page/2/ | grep -E 'rel="(prev|next|canonical)"' to confirm tags are present in the raw HTML.

SvelteKit

SvelteKit's +page.server.ts load function handles server-side page resolution and lets you inject X-Robots-Tag headers directly:

// src/routes/blog/page/[page]/+page.server.ts
import type { PageServerLoad } from './$types';
import { fetchPage } from '$lib/pagination';

export const load: PageServerLoad = async ({ params, setHeaders }) => {
  const current = parseInt(params.page, 10);
  const { items, pagination } = await fetchPage('/articles', current);

  if (current > 1) {
    setHeaders({ 'X-Robots-Tag': 'noindex, follow' });
  }

  return { items, pagination, current };
};

<!-- src/routes/blog/page/[page]/+page.svelte -->
<script lang="ts">
  export let data;
  const { items, pagination, current } = data;
  const base = 'https://seo-architecture.com/blog/page';
</script>

<svelte:head>
  <link rel="canonical" href="{base}/{current}/" />
  {#if current > 1}<link rel="prev" href="{base}/{current - 1}/" />{/if}
  {#if current < pagination.totalPages}<link rel="next" href="{base}/{current + 1}/" />{/if}
</svelte:head>

SEO impact: Header injection via setHeaders happens before the HTML response leaves the server, so CDN and crawler both see X-Robots-Tag without any client-side dependency.

Validation: curl -sI https://your-domain.com/blog/page/2/ and assert x-robots-tag: noindex, follow appears in the response headers.

Nuxt 3

Nuxt's useHead composable applies rel tags during SSR, ensuring they appear in the initial HTML payload rather than being injected by client JavaScript:

<!-- pages/blog/page/[page].vue -->
<script setup lang="ts">
import { fetchPage } from '~/lib/pagination';

const route = useRoute();
const current = Number(route.params.page);
const { data } = await useAsyncData(`blog-page-${current}`, () =>
  fetchPage('/articles', current)
);

const base = 'https://seo-architecture.com/blog/page';

useHead({
  link: [
    { rel: 'canonical', href: `${base}/${current}/` },
    ...(current > 1
      ? [{ rel: 'prev', href: `${base}/${current - 1}/` }]
      : []),
    ...(data.value && current < data.value.pagination.totalPages
      ? [{ rel: 'next', href: `${base}/${current + 1}/` }]
      : []),
  ],
  meta: current > 1
    ? [{ name: 'robots', content: 'noindex, follow' }]
    : [],
});
</script>

SEO impact: SSR-rendered rel tags are visible in the raw HTML, so crawlers see correct pagination signals on the first parse without waiting for hydration.

Validation: nuxi generate then check dist/blog/page/2/index.html for rel="prev", rel="canonical", and <meta name="robots" content="noindex, follow"> in the <head>.

Astro

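Astro's built-in paginate() helper handles route generation and exposes the previous and next URLs on the page prop, which you then render into the <head> yourself:

---
// src/pages/blog/[...page].astro
export async function getStaticPaths({ paginate }) {
  const pageSize = 10;
  // Fetch one CMS page; assumes the { items, pagination } response shape
  const fetchItems = async (n) => {
    const res = await fetch(`${import.meta.env.CMS_URL}/articles?pageSize=${pageSize}&page=${n}`);
    if (!res.ok) throw new Error(`CMS fetch failed: ${res.status}`);
    return res.json();
  };

  const first = await fetchItems(1);
  // Collect every item so paginate() can slice them into pages
  const allItems = [...first.items];
  for (let n = 2; n <= first.pagination.totalPages; n++) {
    const next = await fetchItems(n);
    allItems.push(...next.items);
  }
  return paginate(allItems, { pageSize });
}

const { page } = Astro.props;
const isDeepPage = page.currentPage > 1;
---
<html lang="en">
<head>
  <link rel="canonical" href={page.url.current} />
  {page.url.prev && <link rel="prev" href={page.url.prev} />}
  {page.url.next && <link rel="next" href={page.url.next} />}
  {isDeepPage && <meta name="robots" content="noindex, follow" />}
</head>
<body>
  <!-- render page.data -->
</body>
</html>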

SEO impact: With the [...page] rest parameter, paginate() serves the first page at /blog/ and later pages at /blog/2/, /blog/3/, and so on, with page.url.prev and page.url.next automatically populated. This reduces the manual routing errors common in hand-rolled implementations.

Validation: astro build then verify dist/blog/ contains an index.html for page 1 plus one directory per later page number, each with a correct index.html.

HTTP Headers and CDN Directives

The table below documents every header relevant to paginated headless routes. Configure these at the CDN or edge middleware layer so they apply to all responses, regardless of framework:

| Header | Required value | Rationale |
| --- | --- | --- |
| X-Robots-Tag | noindex, follow | Applied on page 2+ as a server-side fallback when the <meta> tag may be delayed by hydration |
| Link (canonical) | <https://domain.com/page/{n}/>; rel="canonical" | HTTP-level canonical signal, picked up by crawlers before HTML parsing |
| Link (pagination) | <…/page/1/>; rel="prev", <…/page/3/>; rel="next" | Explicit pagination sequence for cross-engine compatibility |
| Cache-Control | public, max-age=86400, stale-while-revalidate=604800 | Serves page 1 from CDN cache; stale-while-revalidate prevents crawler timeouts during revalidation |
| Vary | Accept-Encoding | Prevents cache poisoning when serving both brotli and gzip variants |
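
The header set in the table can be assembled by one function and then applied from whichever edge runtime you use. The sketch below is framework-neutral; `buildPaginationHeaders` is a hypothetical helper, and the values are copied from the table rather than from any CDN's defaults.

```typescript
// Assemble the response headers for one paginated route.
// `base` is the URL prefix before the page number, without a trailing slash.
function buildPaginationHeaders(
  base: string,
  page: number,
  totalPages: number
): Record<string, string> {
  // Several link relations can share one Link header, separated by commas.
  const link: string[] = [`<${base}/${page}/>; rel="canonical"`];
  if (page > 1) link.push(`<${base}/${page - 1}/>; rel="prev"`);
  if (page < totalPages) link.push(`<${base}/${page + 1}/>; rel="next"`);

  const headers: Record<string, string> = {
    Link: link.join(', '),
    'Cache-Control': 'public, max-age=86400, stale-while-revalidate=604800',
    Vary: 'Accept-Encoding',
  };
  // Only deep pages carry the noindex directive.
  if (page > 1) headers['X-Robots-Tag'] = 'noindex, follow';
  return headers;
}
```

Keeping the header values in one place means the HTML tags and the HTTP headers are derived from the same page number and cannot contradict each other.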

Redirect rule (Cloudflare Workers or Vercel middleware):

// middleware.ts (Next.js / Vercel Edge)
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';

export function middleware(request: NextRequest) {
  const url = request.nextUrl.clone();
  const pageParam = url.searchParams.get('page');

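  // Redirect ?page=2 → /blog/page/2/ (positive integers only, so junk values cannot build arbitrary paths)
  if (pageParam && /^[1-9]\d*$/.test(pageParam) && url.pathname.startsWith('/blog')) {
    url.searchParams.delete('page');
    url.pathname = `/blog/page/${pageParam}/`;
    return NextResponse.redirect(url, 301);
  }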

  return NextResponse.next();
}

export const config = {
  matcher: ['/blog/:path*'],
};

This redirect rule is an extension of the redirect chain management patterns that apply across all headless routing scenarios.

Sitemap Chunking for Paginated Routes

Your XML sitemap generation pipeline needs an explicit rule for /page/{n}/ routes on sites with deep archives. Sitemap files cap at 50,000 URLs each, so keep paginated archive entries in a dedicated sitemap-paginated.xml and reference it from your sitemap_index.xml:

<!-- sitemap_index.xml -->
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>https://seo-architecture.com/sitemap-articles.xml</loc>
    <lastmod>2026-06-22</lastmod>
  </sitemap>
  <sitemap>
    <loc>https://seo-architecture.com/sitemap-paginated.xml</loc>
    <lastmod>2026-06-22</lastmod>
  </sitemap>
</sitemapindex>

Only include page 1 of each section in sitemaps. Pages 2+ carry noindex and their inclusion in the sitemap sends contradictory signals to Googlebot.
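
Splitting URLs at the 50,000 limit and rendering each chunk could be sketched as below. `chunkUrls`, `renderSitemap`, and `escapeXml` are hypothetical helpers for illustration, not part of any sitemap library.

```typescript
const SITEMAP_URL_LIMIT = 50000; // per-file URL cap from the sitemap protocol

// Escape the five XML special characters so query strings stay valid in <loc>.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Split a URL list into chunks that each fit in one sitemap file.
function chunkUrls(urls: string[], limit: number = SITEMAP_URL_LIMIT): string[][] {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError(`Invalid chunk size: ${limit}`);
  }
  const chunks: string[][] = [];
  for (let i = 0; i < urls.length; i += limit) {
    chunks.push(urls.slice(i, i + limit));
  }
  return chunks;
}

// Render one chunk as a <urlset> document.
function renderSitemap(urls: string[]): string {
  const entries = urls.map((u) => `  <url><loc>${escapeXml(u)}</loc></url>`).join('\n');
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    entries,
    '</urlset>',
  ].join('\n');
}
```

Pass only the page 1 URL of each section into chunkUrls so the generated files match the noindex policy on deeper pages.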

Validation Protocol

Run these checks after every deploy touching pagination routes:

1. Header audit (all paginated routes)

for page in 1 2 3 5 10; do
  echo "=== /blog/page/${page}/ ==="
  curl -sI "https://your-domain.com/blog/page/${page}/" \
    | grep -iE '(x-robots-tag|link:|cache-control|location|http/)'
done

Expected output for page 2+: x-robots-tag: noindex, follow and no location: (meaning no unexpected redirect).
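
The same expectations can be encoded as a pure check for use in a CI script. This sketch assumes the status code and headers come from whichever HTTP client you already use; `auditPageHeaders` is a hypothetical helper name.

```typescript
// Compare one response against the pagination rules and list any violations.
function auditPageHeaders(
  page: number,
  status: number,
  headers: Record<string, string>
): string[] {
  // Header names are case-insensitive, so normalise them first.
  const h: Record<string, string> = {};
  for (const key of Object.keys(headers)) {
    h[key.toLowerCase()] = headers[key];
  }

  const problems: string[] = [];
  if (status !== 200) problems.push(`expected status 200, got ${status}`);
  if ('location' in h) problems.push('unexpected redirect (location header present)');

  const robots = (h['x-robots-tag'] ?? '').toLowerCase();
  const hasNoindex = robots.includes('noindex');
  if (page > 1 && !hasNoindex) problems.push('page 2+ is missing x-robots-tag: noindex, follow');
  if (page === 1 && hasNoindex) problems.push('page 1 must not carry noindex');
  return problems;
}
```

An empty array means the route passed; a CI job can fail the deploy when any route returns a non-empty list.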

2. Markup verification

curl -s "https://your-domain.com/blog/page/2/" \
  | grep -oE '<(link|meta)[^>]+(rel="(canonical|prev|next)"|name="robots")[^>]*>'

All four tags (canonical, prev, next, robots) must appear in the raw HTML, not injected post-hydration.

3. Redirect chain check

# Confirm ?page=2 redirects 301 to /page/2/
curl -sI "https://your-domain.com/blog?page=2" \
  | grep -iE '(HTTP/|location:)'

Expected: HTTP/2 301 followed by location: .../blog/page/2/.

4. GSC URL Inspection

Use the Google Search Console URL Inspection API to confirm page 1 is indexed and pages 2+ return EXCLUDED with reason NOINDEX.

5. Lighthouse CI threshold

Add Lighthouse CI assertions to your pipeline to catch regressions:

# lighthouserc.yml
ci:
  assert:
    assertions:
      canonical: ['error', { minScore: 1 }]
      robots-txt: ['error', { minScore: 1 }]

Troubleshooting

| Symptom | Root cause | Fix |
| --- | --- | --- |
| Pages 2–N return 404 after deploy | generateStaticParams fetched page 1 only and totalPages resolved to 1 | Assert totalPages > 0 and log the raw CMS response during build; check CMS_URL env var is set in CI |
| rel="next" missing from <head> | Tag injected by client JS after hydration, so it is not present in initial HTML | Move rel injection to SSR layer (useHead, getServerSideProps, or +page.server.ts) |
| Query-string URLs (?page=2) being indexed | 301 redirect rule not applied at edge before HTML response | Deploy middleware redirect and verify with curl -sI that the location header is present |
| Duplicate canonical URLs across page 2+ | All pages setting canonical to page 1 without a unique self-canonical | Each page must carry its own self-referential canonical; page 1 is NOT the canonical for page 2 |
| X-Robots-Tag: noindex applied to page 1 | Off-by-one error in page comparison | Change page >= 1 condition to page > 1; validate with header check script above |
| Sitemap includes all page numbers | noindex pages included in sitemap creates contradictory signals | Filter sitemap generator to page === 1 only; run xmllint against generated sitemap |
| ISR revalidation exposes stale totalPages | New content published but page count not updated until next revalidation | Set revalidate to match CMS publish frequency; add on-demand revalidation webhook triggered by CMS publish events |

Frequently Asked Questions

Should I use offset-based or cursor-based pagination for SEO?
Offset-based pagination producing /page/2/, /page/3/ paths is strongly preferred for SEO. It creates predictable, crawlable URLs that search engines can discover and re-crawl without complex state tracking. Cursor-based pagination suits API performance at scale but requires a secondary URL translation layer to remain crawlable.

How do I handle noindex for paginated pages beyond page 1?
Apply noindex, follow to pages 2+. This preserves crawl budget by keeping Googlebot focused on the canonical first page, while the follow directive lets link equity pass through to deeper content pages.

Do headless frameworks automatically inject rel=prev/next?
No. All major headless frameworks (Next.js, Nuxt, SvelteKit, Astro) require explicit rel=prev/next injection via their respective head-management APIs. The framework routes pages but does not add pagination link tags without configuration.

How does pagination affect Core Web Vitals in headless setups?
Poorly implemented pagination causes CLS from dynamic content injection and delayed LCP from client-side data fetching. Pre-rendering /page/{n}/ routes as static HTML eliminates both problems by serving fully-rendered pages from the CDN edge.

