Handling Large Datasets with Pagination
When working with large datasets in Neo4j, returning all data at once can be slow, consume excessive memory, and overwhelm clients. Pagination solves this by returning data in smaller, manageable chunks.
Why Pagination?
Consider the following tool that lists all movies:
server.registerTool("listAllMovies", {
  description: "List ALL movies in the database",
}, async () => {
  const { records } = await driver.executeQuery(
    "MATCH (m:Movie) RETURN m.title AS title ORDER BY m.title",
    {},
    { database }
  );
  // What if there are 100,000 movies?
  const movies = records.map(r => r.get("title"));
  return {
    content: [{ type: "text", text: movies.join("\n") }],
  };
});
This approach breaks down with large datasets. It loads every movie into memory, serializes a massive string, and sends it all at once to the client. Pagination lets you fetch and return data in smaller pages instead.
Understanding Cursor-Based Pagination
MCP uses cursor-based pagination: the server hands back a cursor (an opaque string) that marks your position in the dataset, and the client passes that cursor back to request the next page.
How it works:
- Client requests the first page (no cursor)
- Server returns the first batch plus a cursor to the next page
- Client requests the next page using the cursor
- Server returns the next batch plus a new cursor
- Process repeats until no cursor is returned (end of data)
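The request loop above can be sketched from the client's side as follows. This is a minimal illustration, not the MCP client API: `Page`, `fetchPage`, and `fetchAll` are hypothetical names, the "server" is an in-memory array, and the cursor is assumed to be a stringified offset. A real client would await a tool call where this sketch calls `fetchPage`.

```typescript
// Hypothetical page shape: one batch of items plus an optional cursor.
interface Page<T> {
  items: T[];
  nextCursor: string | null;
}

// In-memory stand-in for a paginated server.
// Assumption: the cursor is the offset of the next item, as a string.
function fetchPage(data: string[], cursor: string | null, pageSize: number): Page<string> {
  const start = cursor === null ? 0 : parseInt(cursor, 10);
  const items = data.slice(start, start + pageSize);
  const end = start + items.length;
  // Only hand out a cursor when there is something left to read.
  return { items, nextCursor: end < data.length ? String(end) : null };
}

// Client loop: keep requesting pages until no cursor comes back.
function fetchAll(data: string[], pageSize: number): { items: string[]; requests: number } {
  const items: string[] = [];
  let cursor: string | null = null; // first request carries no cursor
  let requests = 0;
  do {
    const page = fetchPage(data, cursor, pageSize);
    items.push(...page.items);
    cursor = page.nextCursor;
    requests++;
  } while (cursor !== null);
  return { items, requests };
}
```

The client never inspects the cursor; it only stores the latest value and sends it back, which is what lets the server change the cursor's encoding later without breaking clients.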
Implementing Pagination in Neo4j
To implement pagination in a Cypher query, use Neo4j's SKIP and LIMIT clauses.
The following query returns the first 100 movies:
MATCH (m:Movie)
RETURN m.title
ORDER BY m.title
SKIP 0 LIMIT 100 // First page (0-99)
The following query skips the first 100 movies and returns the next 100 movies:
MATCH (m:Movie)
RETURN m.title
ORDER BY m.title
SKIP 100 LIMIT 100 // Second page (100-199)
The cursor is simply the skip value encoded as a string.
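To make the relationship between page number, SKIP value, and cursor concrete, here is a small sketch. The helper names (`skipForPage`, `encodeCursor`, `decodeCursor`) are hypothetical and exist only for illustration; the page index is assumed to be zero-based, matching the two queries above.

```typescript
// Zero-based page index -> number of rows to SKIP.
function skipForPage(pageIndex: number, pageSize: number): number {
  return pageIndex * pageSize;
}

// The cursor is the skip value written as a string.
function encodeCursor(skip: number): string {
  return String(skip);
}

// Reading a cursor back gives the skip value for the next query.
function decodeCursor(cursor: string): number {
  return parseInt(cursor, 10);
}

// Second page of 100 movies: skip 100 rows, so the cursor is "100".
const secondPageCursor = encodeCursor(skipForPage(1, 100));
```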
Paginated Tool Implementation
You can implement pagination as a tool with cursor and pageSize parameters:
import { z } from "zod";
import neo4j from "neo4j-driver";
server.registerTool("listMoviesPaginated", {
  description: "List movies with pagination support",
  inputSchema: {
    cursor: z.string().default("0").describe("Pagination cursor (skip value as string, default '0')"),
    pageSize: z.number().int().positive().default(50).describe("Number of movies per page (default 50)"),
  },
}, async ({ cursor, pageSize }) => {
  // Convert cursor to skip value; reject cursors that do not parse to a non-negative number
  const skip = parseInt(cursor, 10);
  if (Number.isNaN(skip) || skip < 0) {
    return {
      content: [{ type: "text", text: `Invalid cursor: ${cursor}` }],
      isError: true,
    };
  }
  console.error(`Fetching movies ${skip} to ${skip + pageSize - 1}...`);
  // Query with SKIP and LIMIT (both must be passed as Neo4j integers)
  const { records } = await driver.executeQuery(
    `
    MATCH (m:Movie)
    RETURN m.title AS title, m.released AS released
    ORDER BY m.title
    SKIP $skip
    LIMIT $limit
    `,
    { skip: neo4j.int(skip), limit: neo4j.int(pageSize) },
    { database }
  );
  const movies = records.map(record => record.toObject());
  // Calculate next cursor
  // If we got a full page, there might be more data
  const nextCursor = movies.length === pageSize
    ? String(skip + pageSize)
    : null;
  console.error(`Returned ${movies.length} movies`);
  return {
    content: [{
      type: "text",
      text: JSON.stringify({
        movies,
        nextCursor,
        currentPage: Math.floor(skip / pageSize), // zero-based page index
        pageSize,
      }, null, 2),
    }],
  };
});
The key elements of this implementation:
- Cursor parsing - The cursor string is converted to a numeric skip value using parseInt()
- SKIP and LIMIT - The Cypher query uses these clauses to fetch only the requested page
- Next cursor calculation - If the result set is full (equal to pageSize), there may be more data. The next cursor is the current skip value plus the page size
- Structured response - The response includes both the data and pagination metadata
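One consequence of the "full page" rule is worth spelling out: when the total number of movies is an exact multiple of the page size, the last real page is full, so the client receives a cursor and has to make one extra request that returns an empty page. The sketch below shows the rule as a pure function next to a common alternative, which is not what the tool above does: ask the database for one extra row (LIMIT pageSize + 1) and use its presence as the signal. Both function names are hypothetical.

```typescript
// The rule used by the tool: a full page means there might be more data.
function nextCursorFullPage(returned: number, skip: number, pageSize: number): string | null {
  return returned === pageSize ? String(skip + pageSize) : null;
}

// Look-ahead variant (an assumption, not part of the tool above):
// the caller queried with LIMIT pageSize + 1, so an extra row proves more data exists.
function nextCursorLookahead<T>(
  rows: T[],
  skip: number,
  pageSize: number
): { items: T[]; nextCursor: string | null } {
  const hasMore = rows.length > pageSize;
  return {
    items: rows.slice(0, pageSize), // drop the look-ahead row before returning
    nextCursor: hasMore ? String(skip + pageSize) : null,
  };
}
```

The look-ahead version costs one extra row per query but never sends the client to an empty page, which may matter if each request is expensive.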
Best Practices for Pagination
- Consistent ordering - Always use ORDER BY to ensure consistent results across pages
- Reasonable page sizes - Default to 20-50 items per page for good user experience
- Include metadata - Return page number, total pages (if known), and a hasMore flag
- Handle invalid cursors - Validate cursor values and handle errors gracefully
- Optimize queries - Use indexes on properties used in ORDER BY and WHERE clauses
- Consider total counts - For some UIs, include total count (but this adds query overhead)
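A few of these practices can be sketched as small helpers. Everything here is illustrative: the function names are hypothetical, the maximum page size of 100 and fallback of 50 are assumed values rather than recommendations from this lesson, and the total is assumed to come from a separate count query such as MATCH (m:Movie) RETURN count(m) AS total.

```typescript
// Validate a cursor strictly: digits only, within the safe integer range.
function parseCursor(cursor: string | undefined): number {
  if (cursor === undefined || cursor === "") return 0; // no cursor means first page
  if (!/^\d+$/.test(cursor)) throw new Error(`Invalid cursor: ${cursor}`);
  const skip = Number(cursor);
  if (!Number.isSafeInteger(skip)) throw new Error(`Cursor out of range: ${cursor}`);
  return skip;
}

// Keep page sizes within sane bounds (fallback and max are assumed values).
function clampPageSize(requested: number, fallback = 50, max = 100): number {
  if (!Number.isFinite(requested)) return fallback;
  return Math.min(Math.max(Math.trunc(requested), 1), max);
}

// Pagination metadata when the total count is known.
function pageMetadata(skip: number, pageSize: number, total: number) {
  return {
    page: Math.floor(skip / pageSize) + 1, // one-based page number for display
    totalPages: Math.ceil(total / pageSize),
    hasMore: skip + pageSize < total,
  };
}
```

Unlike a bare parseInt(), the regular expression check rejects inputs such as "12abc" or "-5" instead of silently accepting a partial or negative number.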