Document Index

This workflow shows how to navigate a document's structure and retrieve verbatim text from specific provisions — without pulling the entire document.

Overview

1. Get a document → POST /documents/list (or upload one first)
2. Find relevant nodes → POST /clauses/search
3. Inspect the index → GET /documents/{id}/index?mode=structure
4. Pull verbatim text → GET /documents/{id}/index?mode=full&nodeIds=...

All requests require x-api-key, x-workspace-id, and x-app-name headers. See Authentication.


Step 1 — Get a document

If you haven't uploaded a document yet, follow the Document Management workflow first.

To list documents already in your workspace:

API Reference: POST /documents/list

POST /documents/list
Content-Type: application/json
{
  "limit": 25,
  "offset": 0,
  "orderBy": [{ "field": "created", "direction": "DESC" }]
}

Pick the documentId of the document you want to index.
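
To page through a larger workspace, the same request body can be generated per page. The following is a minimal sketch, not a documented client: `buildListBody` and `listDocuments` are hypothetical helper names, it assumes `offset` counts items to skip, and the response is returned as `unknown` because its shape is not described on this page.

```typescript
type ListBody = {
  limit: number;
  offset: number;
  orderBy: { field: string; direction: string }[];
};

// Builds the JSON body for one zero-based page of POST /documents/list.
// Assumption: offset is the number of documents to skip.
function buildListBody(page: number, pageSize = 25): ListBody {
  if (!Number.isInteger(page) || page < 0) {
    throw new RangeError('page must be a non-negative integer');
  }
  return {
    limit: pageSize,
    offset: page * pageSize,
    orderBy: [{ field: 'created', direction: 'DESC' }],
  };
}

// Sends the request. `base` and `headers` are supplied by the caller
// (x-api-key, x-workspace-id, x-app-name).
async function listDocuments(
  base: string,
  headers: Record<string, string>,
  page = 0,
): Promise<unknown> {
  const res = await fetch(`${base}/documents/list`, {
    method: 'POST',
    headers: { ...headers, 'Content-Type': 'application/json' },
    body: JSON.stringify(buildListBody(page)),
  });
  if (!res.ok) throw new Error(`POST /documents/list failed with status ${res.status}`);
  return res.json();
}
```

Keeping the body builder separate from the network call makes the paging arithmetic easy to check without hitting the API.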


Step 2 — Find relevant nodes

Use clause search to discover which nodes (clauses and provisions) exist across your workspace, filtered by tag. Each result carries a clauseId, which corresponds to a node ID in the document index, so this step lets you identify the nodes you need before fetching any full text.

API Reference: POST /clauses/search

POST /clauses/search
Content-Type: application/json

Request body:

{
  "filter": {
    "tagIds": ["tag_001", "tag_002"]
  }
}

| Field | Required | Description |
| --- | --- | --- |
| filter.tagIds | Yes | Filter clauses by one or more tag IDs |
| filter.tagCategoryId | No | Further narrow by tag category |

Response (200):

{
  "count": 12,
  "clauses": [
    {
      "clauseId": "98001",
      "documentId": "10567",
      "title": "Payment Terms",
      "text": "Payment shall be made within 30 days...",
      "depth": 2,
      "tagIds": ["tag_001"],
      "textBlockId": "70318",
      "date": "2024-03-01",
      "endorsed": false
    }
  ]
}

Key fields to note:

| Field | Description |
| --- | --- |
| clauseId | The node ID used in document index requests |
| documentId | Which document this clause belongs to |
| title | Clause heading |
| text | Clause text as extracted |
| depth | Nesting depth (1 = top-level section) |
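
Because search results span the whole workspace while index requests target one document at a time, it can help to bucket the returned clauseId values by documentId. A minimal sketch, assuming the response fields shown above; `groupClauseIdsByDocument` is a hypothetical helper name, not part of the API.

```typescript
// Subset of the clause fields shown in the response above.
type Clause = { clauseId: string; documentId: string; title: string; depth: number };

// Groups clause IDs by the document they belong to, preserving result order.
function groupClauseIdsByDocument(clauses: Clause[]): Map<string, string[]> {
  const byDocument = new Map<string, string[]>();
  for (const clause of clauses) {
    const ids = byDocument.get(clause.documentId) ?? [];
    ids.push(clause.clauseId);
    byDocument.set(clause.documentId, ids);
  }
  return byDocument;
}
```

Each map entry then gives the document ID to put in the index URL and the node IDs to pass to that document's index request.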

Step 3 — Inspect the document index

The document index endpoint has three modes: structure, xrefs, and full. Use structure first to get a full navigation map of a document without fetching any text.

API Reference: GET /documents/{id}/index

structure mode — navigation map

Returns the section tree: node IDs, titles, clause references, snippets, keywords, and cross-references. No full verbatim text is included.

GET /documents/{id}/index?mode=structure

Optional query parameters:

| Parameter | Description |
| --- | --- |
| depth | Limit results to nodes at this depth or shallower (integer > 0) |
| includeDocumentMetadata | When true, adds a documentTags object (tags grouped by category) |

Response (200):

{
  "spec": "syntheia/v1-draft",
  "documentId": "10567",
  "documentTitle": "Acme NDA 2024",
  "extraction_mode": "acceptedOnly",
  "documentIndex": [
    {
      "nodeId": "98001",
      "title": "Payment Terms",
      "clauseReference": "4",
      "snippet": "Payment shall be made within 30 days...",
      "crossReferencedIds": ["98045"],
      "keywords": ["Payment"],
      "children": [
        {
          "nodeId": "98002",
          "title": "Late Payment",
          "clauseReference": "4.1",
          "snippet": "Any payment not received within the due date...",
          "crossReferencedIds": [],
          "keywords": [],
          "children": []
        }
      ]
    }
  ]
}

Use the nodeId values from documentIndex to target specific nodes in the next step.
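
Since documentIndex is a tree, picking out nodes usually means walking the children arrays. The sketch below flattens the tree and filters by title; it assumes only the fields shown in the response above, and `flattenIndex` and `findByTitle` are hypothetical helper names. The depth it reports is computed locally, with 1 for top-level nodes.

```typescript
type IndexNode = { nodeId: string; title: string | null; children: IndexNode[] };
type FlatNode = { nodeId: string; title: string | null; depth: number };

// Depth-first, pre-order walk: each parent is followed by its descendants.
function flattenIndex(nodes: IndexNode[], depth = 1): FlatNode[] {
  return nodes.flatMap((node) => [
    { nodeId: node.nodeId, title: node.title, depth },
    ...flattenIndex(node.children, depth + 1),
  ]);
}

// Case-insensitive substring match on the node title; untitled nodes never match.
function findByTitle(nodes: IndexNode[], term: string): FlatNode[] {
  const needle = term.toLowerCase();
  return flattenIndex(nodes).filter((node) =>
    (node.title ?? '').toLowerCase().includes(needle),
  );
}
```

For the response above, `findByTitle(documentIndex, 'late')` would select node 98002, whose ID can then be passed to xrefs or full mode.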

xrefs mode — cross-references

Returns cross-reference IDs for specified nodes — useful for tracing how provisions refer to each other — without fetching their text.

GET /documents/{id}/index?mode=xrefs&nodeIds=98001,98002

Response (200):

{
  "spec": "syntheia/v1-draft",
  "documentId": "10567",
  "documentTitle": "Acme NDA 2024",
  "extraction_mode": "acceptedOnly",
  "nodes": [
    { "nodeId": "98001", "crossReferencedIds": ["98045", "98067"] }
  ]
}
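
One way to trace how provisions refer to each other is to repeat the xrefs request on any newly discovered IDs until nothing new turns up. The sketch below shows only the bookkeeping for a single round: given the nodes array from an xrefs response and the set of IDs already visited, it returns the referenced IDs that have not been seen yet. `unseenReferences` is a hypothetical helper name.

```typescript
type XrefNode = { nodeId: string; crossReferencedIds: string[] };

// Returns cross-referenced IDs that are not in `seen`, de-duplicated,
// in first-encounter order. Does not mutate `seen`.
function unseenReferences(nodes: XrefNode[], seen: Set<string>): string[] {
  const next = new Set<string>();
  for (const node of nodes) {
    for (const id of node.crossReferencedIds) {
      if (!seen.has(id)) next.add(id);
    }
  }
  return Array.from(next);
}
```

The caller would add the returned IDs to its visited set and use them as the nodeIds of the next xrefs request.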

Step 4 — Pull verbatim text

Once you have the nodeId values you need, use full mode to fetch the exact text for those nodes only.

GET /documents/{id}/index?mode=full&nodeIds=98001,98002

Query parameters:

| Parameter | Required | Description |
| --- | --- | --- |
| mode | Yes | Must be full |
| nodeIds | No | Comma-separated node IDs to fetch text for |
| depth | No | Limit candidate nodes to this depth or shallower |

Response (200):

{
  "spec": "syntheia/v1-draft",
  "documentId": "10567",
  "documentTitle": "Acme NDA 2024",
  "extraction_mode": "acceptedOnly",
  "nodes": [
    {
      "nodeId": "98001",
      "text": "Payment shall be made within 30 days of the invoice date..."
    },
    {
      "nodeId": "98002",
      "text": "Any payment not received within the due date shall incur..."
    }
  ]
}

full mode returns a flat nodes array — one entry per requested nodeId. When a node has no direct text, its descendants' text is joined together.
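
The query parameters above can be assembled with URLSearchParams, and the flat nodes array turned into a lookup table. This is a rough sketch under two assumptions: the server accepts a percent-encoded comma in nodeIds, and text may come back as null for some nodes. `fullModeUrl` and `textByNodeId` are hypothetical helper names.

```typescript
// Builds the full-mode URL; nodeIds and depth are only added when provided.
function fullModeUrl(base: string, docId: string, nodeIds: string[], depth?: number): string {
  const params = new URLSearchParams({ mode: 'full' });
  if (nodeIds.length > 0) params.set('nodeIds', nodeIds.join(','));
  if (depth !== undefined) params.set('depth', String(depth));
  return `${base}/documents/${encodeURIComponent(docId)}/index?${params.toString()}`;
}

type FullNode = { nodeId: string; text: string | null };

// Indexes the flat nodes array by nodeId, skipping entries without text.
function textByNodeId(nodes: FullNode[]): Map<string, string> {
  const byId = new Map<string, string>();
  for (const node of nodes) {
    if (node.text !== null) byId.set(node.nodeId, node.text);
  }
  return byId;
}
```

A map keyed by nodeId makes it straightforward to join the verbatim text back onto the titles and clause references obtained from structure mode.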


End-to-end example (TypeScript)

Map a document with structure, then pull verbatim text for the top-level nodes with full.

const BASE = 'https://api.your-domain.com';
const HEADERS = {
  'x-api-key': process.env.SYNTHEIA_API_KEY!,
  'x-workspace-id': process.env.SYNTHEIA_WORKSPACE_ID!,
  'x-app-name': 'my-app',
};

type IndexNode = { nodeId: string; title: string | null; children: IndexNode[] };
type StructureResponse = { documentIndex: IndexNode[] };
type FullResponse = { nodes: { nodeId: string; text: string | null }[] };

// Shared GET helper: throws on non-2xx responses instead of parsing an error body as data.
async function getJson<T>(path: string): Promise<T> {
  const res = await fetch(`${BASE}${path}`, { headers: HEADERS });
  if (!res.ok) throw new Error(`GET ${path} failed with status ${res.status}`);
  return (await res.json()) as T;
}

// structure mode: section tree down to the given depth, no verbatim text.
function getStructure(docId: string, depth = 3): Promise<StructureResponse> {
  return getJson<StructureResponse>(`/documents/${docId}/index?mode=structure&depth=${depth}`);
}

// full mode: verbatim text for the requested node IDs only.
function getText(docId: string, nodeIds: string[]): Promise<FullResponse> {
  const ids = encodeURIComponent(nodeIds.join(','));
  return getJson<FullResponse>(`/documents/${docId}/index?mode=full&nodeIds=${ids}`);
}

// Map the document, then fetch text for the first 5 top-level nodes.
// Top-level await requires an ES module context.
const { documentIndex } = await getStructure('10567');
const topNodeIds = documentIndex.slice(0, 5).map((n) => n.nodeId);
const full = await getText('10567', topNodeIds);
console.log(full.nodes);

Next steps