Document Index
This workflow shows how to navigate a document's structure and retrieve verbatim text from specific provisions without pulling the entire document.
Overview
1. Get a document → POST /documents/list (or upload one first)
2. Find relevant nodes → POST /clauses/search
3. Inspect the index → GET /documents/{id}/index?mode=structure
4. Pull verbatim text → GET /documents/{id}/index?mode=full&nodeIds=...
All requests require x-api-key, x-workspace-id, and x-app-name headers. See Authentication.
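The three headers can be assembled once and reused for every call. The following is a minimal sketch; `AuthConfig` and `buildHeaders` are hypothetical names introduced here for illustration and are not part of the API.

```typescript
// Hypothetical helper (not part of the API): bundles the three required headers.
type AuthConfig = { apiKey: string; workspaceId: string; appName: string };

function buildHeaders(cfg: AuthConfig): Record<string, string> {
  // Fail early rather than sending a request that is missing a required header.
  if (!cfg.apiKey || !cfg.workspaceId || !cfg.appName) {
    throw new Error('apiKey, workspaceId, and appName are all required');
  }
  return {
    'x-api-key': cfg.apiKey,
    'x-workspace-id': cfg.workspaceId,
    'x-app-name': cfg.appName,
  };
}
```

Checking for empty values up front surfaces a missing environment variable at startup instead of as a rejected request later.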
Step 1 — Get a document
If you haven't uploaded a document yet, follow the Document Management workflow first.
To list documents already in your workspace:
API Reference: POST /documents/list
POST /documents/list
Content-Type: application/json
{
  "limit": 25,
  "offset": 0,
  "orderBy": [{ "field": "created", "direction": "DESC" }]
}
Pick the documentId of the document you want to index.
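If the workspace holds more than one page of documents, the same request body can be generated per page. This sketch assumes that `offset` counts the number of documents to skip, which is the usual meaning but is not stated above. `listRequestForPage` is a hypothetical helper, and the response shape of POST /documents/list is not shown in this workflow, so the sketch stops at building the body.

```typescript
// Shape of the request body shown above. `direction` is typed as string
// because only "DESC" appears in this workflow.
type ListRequest = {
  limit: number;
  offset: number;
  orderBy: { field: string; direction: string }[];
};

// Hypothetical helper: body for the zero-based page `page`.
// Assumption: `offset` is the number of documents to skip.
function listRequestForPage(page: number, pageSize = 25): ListRequest {
  return {
    limit: pageSize,
    offset: page * pageSize,
    orderBy: [{ field: 'created', direction: 'DESC' }],
  };
}

console.log(JSON.stringify(listRequestForPage(0)));
```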
Step 2 — Find relevant nodes with clause search
Use clause search to discover which nodes (clauses and provisions) exist across your workspace, filtered by tag. This is useful for identifying the clauseId values — which correspond to node IDs in the index — before fetching full text.
API Reference: POST /clauses/search
POST /clauses/search
Content-Type: application/json
Request body:
{
  "filter": {
    "tagIds": ["tag_001", "tag_002"]
  }
}
| Field | Required | Description |
|---|---|---|
| filter.tagIds | Yes | Filter clauses by one or more tag IDs |
| filter.tagCategoryId | No | Further narrow by tag category |
Response (200):
{
  "count": 12,
  "clauses": [
    {
      "clauseId": "98001",
      "documentId": "10567",
      "title": "Payment Terms",
      "text": "Payment shall be made within 30 days...",
      "depth": 2,
      "tagIds": ["tag_001"],
      "textBlockId": "70318",
      "date": "2024-03-01",
      "endorsed": false
    }
  ]
}
Key fields to note:
| Field | Description |
|---|---|
| clauseId | The node ID used in document index requests |
| documentId | Which document this clause belongs to |
| title | Clause heading |
| text | Clause text as extracted |
| depth | Nesting depth (1 = top-level section) |
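Because clause search spans the whole workspace while the index endpoint works on one document at a time, it can help to group the returned clauseId values by documentId before moving on. A minimal sketch, where `ClauseHit` covers only the fields used here and `nodeIdsByDocument` is a hypothetical helper:

```typescript
// Subset of the clause fields from the search response above.
type ClauseHit = { clauseId: string; documentId: string };

// Hypothetical helper: documentId -> clauseIds (usable as node IDs in index requests).
function nodeIdsByDocument(clauses: ClauseHit[]): Record<string, string[]> {
  const grouped: Record<string, string[]> = {};
  for (const clause of clauses) {
    const ids = grouped[clause.documentId] ?? [];
    ids.push(clause.clauseId);
    grouped[clause.documentId] = ids;
  }
  return grouped;
}

// Illustrative data only; "10999" and "99100" are made-up IDs.
const hits: ClauseHit[] = [
  { clauseId: '98001', documentId: '10567' },
  { clauseId: '98002', documentId: '10567' },
  { clauseId: '99100', documentId: '10999' },
];
console.log(nodeIdsByDocument(hits));
```

Each entry in the result maps directly onto one index request: the key is the `{id}` path segment and the array is the nodeIds list.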
Step 3 — Inspect the document index
The document index endpoint has three modes: structure, xrefs, and full. Start with structure to get a navigation map of a document without fetching its full text.
API Reference: GET /documents/{id}/index
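Since all three modes share one path and differ only in query parameters, a single URL builder can serve every step of this workflow. The sketch below is an illustration, not an official client: `IndexQuery` and `buildIndexUrl` are hypothetical names, and the builder does not check which parameters each mode accepts.

```typescript
// Query options used across this workflow; every name here mirrors a documented parameter.
type IndexQuery = {
  mode: 'structure' | 'xrefs' | 'full';
  nodeIds?: string[];
  depth?: number;
  includeDocumentMetadata?: boolean;
};

// Hypothetical helper: builds the request URL for GET /documents/{id}/index.
function buildIndexUrl(base: string, documentId: string, q: IndexQuery): string {
  const d = q.depth;
  if (d !== undefined && (!isFinite(d) || Math.floor(d) !== d || d <= 0)) {
    throw new Error('depth must be an integer greater than 0');
  }
  const parts: string[] = [`mode=${q.mode}`];
  if (q.nodeIds && q.nodeIds.length > 0) {
    // Node IDs are sent as one comma-separated value.
    parts.push(`nodeIds=${encodeURIComponent(q.nodeIds.join(','))}`);
  }
  if (d !== undefined) parts.push(`depth=${d}`);
  if (q.includeDocumentMetadata) parts.push('includeDocumentMetadata=true');
  return `${base}/documents/${encodeURIComponent(documentId)}/index?${parts.join('&')}`;
}

// Prints: https://api.your-domain.com/documents/10567/index?mode=full&nodeIds=98001%2C98002
console.log(buildIndexUrl('https://api.your-domain.com', '10567', { mode: 'full', nodeIds: ['98001', '98002'] }));
```

The comma is percent-encoded as %2C by encodeURIComponent, matching the approach in the end-to-end example.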
structure mode — navigation map
Returns the section tree: node IDs, titles, clause references, snippets, keywords, and cross-references. No full verbatim text is included.
GET /documents/{id}/index?mode=structure
Optional query parameters:
| Parameter | Description |
|---|---|
| depth | Limit results to nodes at this depth or shallower (integer > 0) |
| includeDocumentMetadata | When true, adds a documentTags object (tags grouped by category) |
Response (200):
{
  "spec": "syntheia/v1-draft",
  "documentId": "10567",
  "documentTitle": "Acme NDA 2024",
  "extraction_mode": "acceptedOnly",
  "documentIndex": [
    {
      "nodeId": "98001",
      "title": "Payment Terms",
      "clauseReference": "4",
      "snippet": "Payment shall be made within 30 days...",
      "crossReferencedIds": ["98045"],
      "keywords": ["Payment"],
      "children": [
        {
          "nodeId": "98002",
          "title": "Late Payment",
          "clauseReference": "4.1",
          "snippet": "Any payment not received within the due date...",
          "crossReferencedIds": [],
          "keywords": [],
          "children": []
        }
      ]
    }
  ]
}
Use the nodeId values from documentIndex to target specific nodes in the next step.
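The documentIndex tree is nested, so locating a provision such as clause 4.1 means walking the children arrays. One way to do that is to flatten the tree first. This is a sketch under assumptions: `flattenIndex` and `findByClauseReference` are hypothetical helpers, the nullability of `title` and `clauseReference` is assumed, and depth is counted from 1 for top-level nodes, following the clause search convention.

```typescript
// Subset of the structure-mode node fields shown above.
type IndexNode = {
  nodeId: string;
  title: string | null;
  clauseReference: string | null; // nullability is an assumption
  children: IndexNode[];
};

type FlatNode = { nodeId: string; title: string | null; clauseReference: string | null; depth: number };

// Depth-first walk; parents are listed before their children.
function flattenIndex(nodes: IndexNode[], depth = 1): FlatNode[] {
  const out: FlatNode[] = [];
  for (const n of nodes) {
    out.push({ nodeId: n.nodeId, title: n.title, clauseReference: n.clauseReference, depth });
    for (const child of flattenIndex(n.children, depth + 1)) out.push(child);
  }
  return out;
}

// Returns the first node whose clauseReference matches, or undefined.
function findByClauseReference(nodes: IndexNode[], ref: string): FlatNode | undefined {
  return flattenIndex(nodes).filter((n) => n.clauseReference === ref)[0];
}
```

With the sample response above, `findByClauseReference(documentIndex, '4.1')` would yield the node with nodeId "98002".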
xrefs mode — cross-references
Returns cross-reference IDs for specified nodes — useful for tracing how provisions refer to each other — without fetching their text.
GET /documents/{id}/index?mode=xrefs&nodeIds=98001,98002
Response (200):
{
  "spec": "syntheia/v1-draft",
  "documentId": "10567",
  "documentTitle": "Acme NDA 2024",
  "extraction_mode": "acceptedOnly",
  "nodes": [
    { "nodeId": "98001", "crossReferencedIds": ["98045", "98067"] }
  ]
}
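To trace references one hop further, the crossReferencedIds from an xrefs response can be reduced to the node IDs that have not been looked at yet. A minimal sketch, with `unseenXrefTargets` as a hypothetical helper:

```typescript
// Node shape from the xrefs-mode response above.
type XrefNode = { nodeId: string; crossReferencedIds: string[] };

// Hypothetical helper: IDs referenced by `nodes` that are not in `alreadySeen`,
// de-duplicated and kept in first-seen order.
function unseenXrefTargets(nodes: XrefNode[], alreadySeen: string[]): string[] {
  const seen: Record<string, boolean> = {};
  for (const id of alreadySeen) seen[id] = true;
  const targets: string[] = [];
  for (const node of nodes) {
    for (const id of node.crossReferencedIds) {
      if (!seen[id]) {
        seen[id] = true;
        targets.push(id);
      }
    }
  }
  return targets;
}

const next = unseenXrefTargets(
  [{ nodeId: '98001', crossReferencedIds: ['98045', '98067'] }],
  ['98001'],
);
console.log(next); // [ '98045', '98067' ]
```

Passing the result back as nodeIds in another xrefs request, and adding it to `alreadySeen`, walks the reference graph one level at a time without revisiting nodes.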
Step 4 — Pull verbatim text
Once you have the nodeId values you need, use full mode to fetch the exact text for those nodes only.
GET /documents/{id}/index?mode=full&nodeIds=98001,98002
Query parameters:
| Parameter | Required | Description |
|---|---|---|
| mode | Yes | Must be full |
| nodeIds | No | Comma-separated node IDs to fetch text for |
| depth | No | Limit candidate nodes to this depth or shallower |
Response (200):
{
  "spec": "syntheia/v1-draft",
  "documentId": "10567",
  "documentTitle": "Acme NDA 2024",
  "extraction_mode": "acceptedOnly",
  "nodes": [
    {
      "nodeId": "98001",
      "text": "Payment shall be made within 30 days of the invoice date..."
    },
    {
      "nodeId": "98002",
      "text": "Any payment not received within the due date shall incur..."
    }
  ]
}
full mode returns a flat nodes array with one entry per requested nodeId. When a node has no direct text, the text of its descendants is joined together.
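Entries in the full-mode nodes array carry only nodeId and text, so titles have to come from an earlier structure response. The sketch below joins the two. `titleLookup` and `labelText` are hypothetical helpers, and the nullable field types are assumptions.

```typescript
// Minimal shapes for the structure-mode tree and the full-mode nodes array.
type StructureNode = { nodeId: string; title: string | null; children: StructureNode[] };
type TextNode = { nodeId: string; text: string | null };
type LabelledText = { nodeId: string; title: string | null; text: string | null };

// Walks the tree and records nodeId -> title for every node.
function titleLookup(
  nodes: StructureNode[],
  into: Record<string, string | null> = {},
): Record<string, string | null> {
  for (const n of nodes) {
    into[n.nodeId] = n.title;
    titleLookup(n.children, into);
  }
  return into;
}

// Attaches the title from the structure tree to each full-mode entry.
// Entries whose nodeId is not in the tree get a null title.
function labelText(structure: StructureNode[], textNodes: TextNode[]): LabelledText[] {
  const titles = titleLookup(structure);
  return textNodes.map((t) => ({
    nodeId: t.nodeId,
    title: titles[t.nodeId] ?? null,
    text: t.text,
  }));
}
```

Keeping the structure response around makes this join free. If the structure request used a depth limit, deeper nodes will be missing from the lookup and come back with a null title.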
End-to-end example (TypeScript)
Map a document with structure, then pull verbatim text for the top-level nodes with full.
const BASE = 'https://api.your-domain.com';
const HEADERS = {
  'x-api-key': process.env.SYNTHEIA_API_KEY!,
  'x-workspace-id': process.env.SYNTHEIA_WORKSPACE_ID!,
  'x-app-name': 'my-app',
};

type IndexNode = { nodeId: string; title: string | null; children: IndexNode[] };

// structure mode: fetch the section tree, limited to `depth` levels
async function getStructure(docId: string, depth = 3): Promise<{ documentIndex: IndexNode[] }> {
  const res = await fetch(`${BASE}/documents/${docId}/index?mode=structure&depth=${depth}`, {
    headers: HEADERS,
  });
  // Throw on non-2xx responses instead of parsing an error body as data
  if (!res.ok) throw new Error(`structure request failed: ${res.status} ${res.statusText}`);
  return res.json();
}

// full mode: fetch verbatim text for the given nodes only
async function getText(docId: string, nodeIds: string[]): Promise<{ nodes: { nodeId: string; text: string | null }[] }> {
  const ids = encodeURIComponent(nodeIds.join(','));
  const res = await fetch(`${BASE}/documents/${docId}/index?mode=full&nodeIds=${ids}`, {
    headers: HEADERS,
  });
  if (!res.ok) throw new Error(`full request failed: ${res.status} ${res.statusText}`);
  return res.json();
}

// Map the document, then fetch text for the first 5 top-level nodes
const { documentIndex } = await getStructure('10567');
const topNodeIds = documentIndex.slice(0, 5).map((n) => n.nodeId);
const full = await getText('10567', topNodeIds);
console.log(full.nodes);
Next steps
- Search — find clauses across the workspace by tag
- API Reference — Document Index — full parameter reference