Document Management

This workflow covers uploading a document into a Syntheia workspace — from checking for duplicates through to confirming ingestion — and then listing available documents.

Overview

1. Check if document already exists → GET /documents/check
2. Request a presigned upload URL → POST /documents/uploadURL
3. Upload the file → PUT <presigned_url>
4. Register the document → POST /documents
5. List workspace documents → POST /documents/list

All requests require the following headers:

| Header | Description |
| --- | --- |
| x-api-key | Your API key (raw value, no prefix) |
| x-workspace-id | The workspace the document belongs to |
| x-app-name | Your application identifier |
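
As a minimal sketch, the three required headers can be built once and reused for every API call. The function name `build_headers` is hypothetical and only serves this illustration.

```python
def build_headers(api_key: str, workspace_id: str, app_name: str) -> dict[str, str]:
    """Return the headers that every Syntheia API request in this workflow needs."""
    if not (api_key and workspace_id and app_name):
        raise ValueError("api_key, workspace_id and app_name are all required")
    return {
        "x-api-key": api_key,            # raw key value, no "Bearer " or other prefix
        "x-workspace-id": workspace_id,  # workspace the document belongs to
        "x-app-name": app_name,          # your application identifier
    }
```

The returned dictionary can be passed unchanged as the `headers` argument of most Python HTTP clients.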

Step 1 — Check if the document exists

Before uploading, check whether a document with the same title already exists in the workspace.

API Reference: GET /documents/check

GET /documents/check?title=Acme+NDA+2024

Response (200) — document found:

{
  "document": {
    "documentId": "10234",
    "title": "Acme NDA 2024",
    "processing": false,
    "failed": false,
    "endorsed": true,
    "created": "2024-03-01T10:00:00Z",
    "date": "2024-03-01",
    "tagIds": [],
    "predictedTagIds": [],
    "extractionType": "standard",
    "documentProcessingData": {},
    "user": { "userId": "55021", "email": "user@example.com" },
    "snippet": [],
    "similarity": 0,
    "approvalStatus": "final",
    "pageCount": 12
  }
}

If the document already exists, you can skip the remaining upload steps (Steps 2 to 4). If the response is 404 or document is null, proceed to Step 2.
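
The check above could be sketched in Python with the standard library as follows. `BASE_URL` is a placeholder (the real API host is not given here), and `check_url` and `find_document` are hypothetical helper names. The function returns `None` both for a 404 and for a null `document`, matching the two "not found" cases described above.

```python
import json
import urllib.error
import urllib.parse
import urllib.request

BASE_URL = "https://api.example.com"  # assumption: replace with your Syntheia API host


def check_url(title: str, base_url: str = BASE_URL) -> str:
    """Build the GET /documents/check URL with the title URL-encoded."""
    query = urllib.parse.urlencode({"title": title})
    return f"{base_url}/documents/check?{query}"


def find_document(title: str, headers: dict, base_url: str = BASE_URL):
    """Return the existing document dict, or None if no document has this title."""
    request = urllib.request.Request(check_url(title, base_url), headers=headers, method="GET")
    try:
        with urllib.request.urlopen(request) as response:
            payload = json.load(response)
    except urllib.error.HTTPError as err:
        if err.code == 404:  # not found: proceed to Step 2
            return None
        raise
    return (payload or {}).get("document")
```

For the example title, `check_url("Acme NDA 2024")` produces the same query string as the request shown above (`title=Acme+NDA+2024`).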


Step 2 — Request a presigned upload URL

Generate a presigned URL to upload the file directly to storage.

API Reference: POST /documents/uploadURL

POST /documents/uploadURL
Content-Type: application/json

Request body:

{
  "fileName": "acme-nda-2024.pdf",
  "fileHash": "sha256-abc123...",
  "title": "Acme NDA 2024"
}

| Field | Type | Description |
| --- | --- | --- |
| fileName | string | The original filename including extension |
| fileHash | string | SHA-256 hash of the file content |
| title | string | Display title for the document |

Response (200):

"https://storage.your-domain.com/uploads/acme-nda-2024.pdf?X-Amz-Signature=..."

The response is a plain string containing the presigned URL.
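
A rough sketch of this step, under a few assumptions: `BASE_URL` is a placeholder host, the helper names are hypothetical, and the hash is formatted as `sha256-` followed by the hex digest. The example value above is truncated, so confirm the exact encoding the API expects. Because the response is a plain string, the sketch accepts it either JSON-quoted or bare.

```python
import hashlib
import json
import urllib.request

BASE_URL = "https://api.example.com"  # assumption: replace with your Syntheia API host


def file_hash_for(data: bytes) -> str:
    """SHA-256 of the file content. The 'sha256-' + hex format is an assumption."""
    return "sha256-" + hashlib.sha256(data).hexdigest()


def parse_presigned_url(raw: str) -> str:
    """Accept the URL either as a JSON-quoted string or as bare text."""
    text = raw.strip()
    return json.loads(text) if text.startswith('"') else text


def request_upload_url(file_name: str, file_hash: str, title: str,
                       headers: dict, base_url: str = BASE_URL) -> str:
    """POST /documents/uploadURL and return the presigned URL."""
    body = json.dumps(
        {"fileName": file_name, "fileHash": file_hash, "title": title}
    ).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/documents/uploadURL",
        data=body,
        headers={**headers, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return parse_presigned_url(response.read().decode("utf-8"))
```

Keep the `fileName` and `fileHash` values you send here, since Step 4 expects the same ones.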


Step 3 — Upload the file

PUT the raw file bytes to the presigned URL. No Authorization header is needed for this request, because authentication is embedded in the URL's signature.

PUT <presigned_url>
Content-Type: application/pdf

<raw file bytes>

A 200 response confirms the file was stored successfully.
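
The upload could look like the sketch below. The helper names are hypothetical, and sending only a `Content-Type` header (without the API headers) is an assumption based on the note that authentication is embedded in the URL.

```python
import mimetypes
import urllib.request


def build_upload_request(presigned_url: str, data: bytes,
                         content_type: str = "application/pdf") -> urllib.request.Request:
    """Build the PUT request that carries the raw file bytes."""
    return urllib.request.Request(
        presigned_url,
        data=data,
        headers={"Content-Type": content_type},
        method="PUT",
    )


def upload_file(presigned_url: str, path: str) -> None:
    """Upload the file at `path`; raise if storage does not answer 200."""
    # Guess the content type from the extension, falling back to a generic one.
    content_type = mimetypes.guess_type(path)[0] or "application/octet-stream"
    with open(path, "rb") as handle:
        request = build_upload_request(presigned_url, handle.read(), content_type)
    # urlopen raises HTTPError for 4xx/5xx responses on its own.
    with urllib.request.urlopen(request) as response:
        if response.status != 200:
            raise RuntimeError(f"unexpected upload status {response.status}")
```

Reading the whole file into memory keeps the sketch short; for large files a streaming upload may be preferable.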


Step 4 — Register the document

After the file is uploaded, notify Syntheia to begin processing. This triggers text extraction, indexing, and enrichment.

API Reference: POST /documents

POST /documents
Content-Type: multipart/form-data

Form fields:

FieldTypeDescription
documentJSON stringDocument metadata (see below)

The document field is a JSON string with the following shape:

{
  "title": "Acme NDA 2024",
  "tagIds": [],
  "predictedTagIds": [],
  "date": "2024-03-01",
  "fileData": {
    "fileName": "acme-nda-2024.pdf",
    "fileHash": "sha256-abc123..."
  }
}

| Field | Required | Description |
| --- | --- | --- |
| title | Yes | Document title |
| tagIds | Yes | Tag IDs to apply (empty array if none) |
| predictedTagIds | Yes | Predicted tag IDs (empty array if none) |
| fileData.fileName | Yes | Filename used in Step 2 |
| fileData.fileHash | Yes | File hash used in Step 2 |
| date | No | Document date (ISO 8601) |
| extractionType | No | Extraction mode override |
| data | No | Arbitrary metadata object |

Response (200):

{
  "document": {
    "documentId": "10567",
    "title": "Acme NDA 2024",
    "processing": true,
    "failed": false,
    "endorsed": false,
    "created": "2024-06-24T12:00:00Z",
    "date": "2024-03-01",
    "tagIds": [],
    "predictedTagIds": [],
    "extractionType": "standard",
    "documentProcessingData": {},
    "user": { "userId": "55021", "email": "user@example.com" },
    "snippet": [],
    "similarity": 0,
    "approvalStatus": "draft",
    "pageCount": null
  }
}
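
The registration request might be sketched as follows. Python's standard library has no multipart encoder for `urllib`, so the sketch builds the body by hand. `BASE_URL` is a placeholder host and all helper names are hypothetical.

```python
import json
import urllib.request
import uuid

BASE_URL = "https://api.example.com"  # assumption: replace with your Syntheia API host


def build_document_metadata(title, file_name, file_hash, date=None,
                            tag_ids=(), predicted_tag_ids=()):
    """Build the metadata object that goes into the `document` form field."""
    metadata = {
        "title": title,
        "tagIds": list(tag_ids),                     # required, may be empty
        "predictedTagIds": list(predicted_tag_ids),  # required, may be empty
        "fileData": {"fileName": file_name, "fileHash": file_hash},  # same values as Step 2
    }
    if date is not None:
        metadata["date"] = date  # optional, ISO 8601
    return metadata


def encode_multipart(fields: dict) -> tuple[bytes, str]:
    """Encode plain text fields as multipart/form-data; return (body, content type)."""
    boundary = uuid.uuid4().hex
    parts = [
        f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n{value}\r\n'
        for name, value in fields.items()
    ]
    parts.append(f"--{boundary}--\r\n")
    return "".join(parts).encode("utf-8"), f"multipart/form-data; boundary={boundary}"


def register_document(metadata: dict, headers: dict, base_url: str = BASE_URL) -> dict:
    """POST /documents and return the `document` object from the response."""
    body, content_type = encode_multipart({"document": json.dumps(metadata)})
    request = urllib.request.Request(
        f"{base_url}/documents",
        data=body,
        headers={**headers, "Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["document"]
```

Note that the `document` field carries the metadata as a JSON string, so it is serialized with `json.dumps` before being placed in the form.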

Note "processing": true — the document is queued for ingestion. Poll the document or listen to events to track when processing completes.
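
One way to poll is sketched below. How you fetch the document is left to the caller as a `fetch_document` callable (for example, a function wrapping the Step 1 check endpoint, which is an assumption rather than a documented polling method). The names, interval, and timeout are illustrative only.

```python
import time


class IngestionFailed(RuntimeError):
    """Raised when the document reports "failed": true."""


def wait_until_processed(fetch_document, interval_s: float = 5.0, timeout_s: float = 600.0,
                         sleep=time.sleep, clock=time.monotonic) -> dict:
    """Call fetch_document() until "processing" is false, then return the document."""
    deadline = clock() + timeout_s
    while True:
        document = fetch_document()
        if document is not None:
            if document.get("failed"):
                raise IngestionFailed(f"ingestion failed for {document.get('documentId')}")
            if not document.get("processing"):
                return document  # ingestion finished
        if clock() >= deadline:
            raise TimeoutError("document is still processing")
        sleep(interval_s)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised in tests without real waiting.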


Step 5 — List workspace documents

Retrieve the documents available in the workspace, one page at a time, with optional sorting and filters.

API Reference: POST /documents/list

POST /documents/list
Content-Type: application/json

Request body:

{
  "limit": 25,
  "offset": 0,
  "orderBy": [{ "field": "created", "direction": "DESC" }]
}

| Field | Required | Description |
| --- | --- | --- |
| limit | Yes | Number of results to return |
| offset | Yes | Pagination offset |
| orderBy | No | Array of { field, direction } sort rules |
| search | No | Full-text search string |
| tagIds | No | Filter by tag IDs |
| processing | No | Filter to documents currently processing |
| failed | No | Filter to documents that failed ingestion |
| endorsed | No | Filter to endorsed documents only |

Response (200):

{
  "count": 42,
  "documents": [
    {
      "documentId": "10567",
      "title": "Acme NDA 2024",
      "processing": false,
      "failed": false,
      "endorsed": false,
      "created": "2024-06-24T12:00:00Z",
      "date": "2024-03-01",
      "tagIds": [],
      "predictedTagIds": [],
      "extractionType": "standard",
      "documentProcessingData": {},
      "user": { "userId": "55021", "email": "user@example.com" },
      "snippet": [],
      "similarity": 0,
      "approvalStatus": "draft",
      "pageCount": 12
    }
  ]
}

count is the total number of matching documents across all pages, not just the current page.
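
Because count covers all pages, it can serve as the stopping condition when paging through results. The sketch below assumes that offset counts documents rather than pages. `BASE_URL` is a placeholder host and the helper names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://api.example.com"  # assumption: replace with your Syntheia API host


def list_page(body: dict, headers: dict, base_url: str = BASE_URL) -> dict:
    """POST /documents/list with the given request body and return the parsed response."""
    request = urllib.request.Request(
        f"{base_url}/documents/list",
        data=json.dumps(body).encode("utf-8"),
        headers={**headers, "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def iter_documents(fetch_page, limit: int = 25):
    """Yield every document, requesting pages until `count` documents have been seen."""
    offset = 0
    while True:
        page = fetch_page({
            "limit": limit,
            "offset": offset,  # assumption: offset is measured in documents
            "orderBy": [{"field": "created", "direction": "DESC"}],
        })
        documents = page.get("documents", [])
        yield from documents
        offset += len(documents)
        # Stop on an empty page or once all matching documents were returned.
        if not documents or offset >= page.get("count", 0):
            return
```

A typical call would be `iter_documents(lambda body: list_page(body, headers))`, where `headers` holds the three required request headers.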