Document Management
This workflow covers uploading a document into a Syntheia workspace — from checking for duplicates through to confirming ingestion — and then listing available documents.
Overview
1. Check if document already exists → GET /documents/check
2. Request a presigned upload URL → POST /documents/uploadURL
3. Upload the file → PUT <presigned_url>
4. Register the document → POST /documents
5. List workspace documents → POST /documents/list
All requests to the Syntheia API require the following headers (the Step 3 upload to the presigned URL is authenticated by the URL itself):
| Header | Description |
|---|---|
| x-api-key | Your API key (raw value, no prefix) |
| x-workspace-id | The workspace the document belongs to |
| x-app-name | Your application identifier |
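As a minimal sketch, the three headers can be assembled once and reused for each API call. The function name `build_headers` and its argument names are illustrative assumptions, not part of the Syntheia API.

```python
def build_headers(api_key: str, workspace_id: str, app_name: str) -> dict[str, str]:
    """Return the headers listed in the table above (hypothetical helper)."""
    for name, value in (
        ("api_key", api_key),
        ("workspace_id", workspace_id),
        ("app_name", app_name),
    ):
        if not value:
            raise ValueError(f"{name} must not be empty")
    return {
        "x-api-key": api_key,            # raw value, no prefix
        "x-workspace-id": workspace_id,  # workspace the document belongs to
        "x-app-name": app_name,          # your application identifier
    }
```

Failing early on an empty value tends to be easier to debug than a rejected request later in the workflow.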
Step 1 — Check if the document exists
Before uploading, check whether a document with the same title already exists in the workspace.
API Reference:
GET /documents/check
GET /documents/check?title=Acme+NDA+2024
Response (200) — document found:
{
"document": {
"documentId": "10234",
"title": "Acme NDA 2024",
"processing": false,
"failed": false,
"endorsed": true,
"created": "2024-03-01T10:00:00Z",
"date": "2024-03-01",
"tagIds": [],
"predictedTagIds": [],
"extractionType": "standard",
"documentProcessingData": {},
"user": { "userId": "55021", "email": "user@example.com" },
"snippet": [],
"similarity": 0,
"approvalStatus": "final",
"pageCount": 12
}
}
If the document already exists, you can skip the upload steps (Steps 2 to 4). If the response is 404 or document is null, proceed to Step 2.
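The check can be sketched as two small helpers: one that builds the query URL with the title encoded, and one that applies the 404 or null rule described above. `BASE_URL`, `build_check_url`, and `needs_upload` are hypothetical names, and the base URL is a placeholder to replace with your own.

```python
import json
from urllib.parse import urlencode

# Assumption: placeholder base URL; substitute your actual Syntheia API host.
BASE_URL = "https://api.example.com"


def build_check_url(title: str, base_url: str = BASE_URL) -> str:
    """Build the GET /documents/check URL with the title safely encoded."""
    return f"{base_url}/documents/check?{urlencode({'title': title})}"


def needs_upload(status: int, body: str) -> bool:
    """Return True when no document was found (404, or `document` is null)."""
    if status == 404:
        return True
    if status != 200:
        raise RuntimeError(f"unexpected status {status} from /documents/check")
    return json.loads(body).get("document") is None
```

Treating any status other than 200 or 404 as an error is a design choice of this sketch, so that an authentication failure is not mistaken for a missing document.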
Step 2 — Request a presigned upload URL
Generate a presigned URL to upload the file directly to storage.
API Reference:
POST /documents/uploadURL
POST /documents/uploadURL
Content-Type: application/json
Request body:
{
"fileName": "acme-nda-2024.pdf",
"fileHash": "sha256-abc123...",
"title": "Acme NDA 2024"
}
| Field | Type | Description |
|---|---|---|
| fileName | string | The original filename including extension |
| fileHash | string | SHA-256 hash of the file content |
| title | string | Display title for the document |
Response (200):
"https://storage.your-domain.com/uploads/acme-nda-2024.pdf?X-Amz-Signature=..."
The response is a plain string containing the presigned URL.
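A rough sketch of preparing the request body and reading the response follows. The helper names are hypothetical. Whether the API expects a bare hex digest or a prefixed form like the sample value above is not stated here, so the bare hex digest is an assumption.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the file content."""
    return hashlib.sha256(data).hexdigest()


def build_upload_url_body(file_name: str, data: bytes, title: str) -> str:
    """JSON body for POST /documents/uploadURL."""
    return json.dumps(
        {
            "fileName": file_name,
            # Assumption: bare hex digest; adjust if a prefix is required.
            "fileHash": sha256_hex(data),
            "title": title,
        }
    )


def parse_presigned_url(response_text: str) -> str:
    """Accept the URL either as raw text or as a JSON-quoted string."""
    text = response_text.strip()
    url = json.loads(text) if text.startswith('"') else text
    if not url.startswith("https://"):
        raise ValueError("response does not look like a presigned URL")
    return url
```

Keep the `fileName` and `fileHash` values around: Step 4 expects the same ones.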
Step 3 — Upload the file
PUT the raw file bytes to the presigned URL. No Authorization header is needed — authentication is embedded in the URL.
PUT <presigned_url>
Content-Type: application/pdf
<raw file bytes>
A 200 response confirms the file was stored successfully.
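One way to sketch the upload with only the Python standard library is shown below. `build_upload_request` and `upload` are hypothetical helpers, and guessing the content type from the filename is an assumption of this sketch. Note that no API key header is attached.

```python
import mimetypes
import urllib.request


def build_upload_request(
    presigned_url: str, data: bytes, file_name: str
) -> urllib.request.Request:
    """Prepare a PUT of the raw file bytes to the presigned URL."""
    # Assumption: derive the content type from the file extension.
    content_type = mimetypes.guess_type(file_name)[0] or "application/octet-stream"
    return urllib.request.Request(
        presigned_url,
        data=data,
        method="PUT",
        headers={"Content-Type": content_type},
    )


def upload(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code."""
    # urlopen raises urllib.error.HTTPError for 4xx and 5xx responses.
    with urllib.request.urlopen(request) as response:
        return response.status
```

Presigned URLs typically expire, so it is usually safest to upload soon after Step 2 rather than caching the URL.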
Step 4 — Register the document
After the file is uploaded, notify Syntheia to begin processing. This triggers text extraction, indexing, and enrichment.
API Reference:
POST /documents
POST /documents
Content-Type: multipart/form-data
Form fields:
| Field | Type | Description |
|---|---|---|
| document | JSON string | Document metadata (see below) |
The document field is a JSON string with the following shape:
{
"title": "Acme NDA 2024",
"tagIds": [],
"predictedTagIds": [],
"date": "2024-03-01",
"fileData": {
"fileName": "acme-nda-2024.pdf",
"fileHash": "sha256-abc123..."
}
}
| Field | Required | Description |
|---|---|---|
| title | Yes | Document title |
| tagIds | Yes | Tag IDs to apply (empty array if none) |
| predictedTagIds | Yes | Predicted tag IDs (empty array if none) |
| fileData.fileName | Yes | Filename used in Step 2 |
| fileData.fileHash | Yes | File hash used in Step 2 |
| date | No | Document date (ISO 8601) |
| extractionType | No | Extraction mode override |
| data | No | Arbitrary metadata object |
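The multipart body can be sketched by hand with the standard library, since it carries a single text field. `build_register_request` is a hypothetical helper. It returns the Content-Type header value (with boundary) and the encoded body, and leaves sending to your HTTP client of choice.

```python
import json
import uuid
from typing import Optional, Sequence, Tuple


def build_register_request(
    title: str,
    file_name: str,
    file_hash: str,
    date: Optional[str] = None,
    tag_ids: Sequence[str] = (),
    predicted_tag_ids: Sequence[str] = (),
) -> Tuple[str, bytes]:
    """Encode the `document` form field for POST /documents."""
    document = {
        "title": title,
        "tagIds": list(tag_ids),                     # required, may be empty
        "predictedTagIds": list(predicted_tag_ids),  # required, may be empty
        "fileData": {"fileName": file_name, "fileHash": file_hash},
    }
    if date is not None:
        document["date"] = date  # optional, ISO 8601

    boundary = uuid.uuid4().hex
    body = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="document"\r\n'
        "\r\n"
        f"{json.dumps(document)}\r\n"
        f"--{boundary}--\r\n"
    ).encode("utf-8")
    return f"multipart/form-data; boundary={boundary}", body
```

Most HTTP libraries can build this encoding for you; the manual version is shown only to make the wire format explicit.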
Response (200):
{
"document": {
"documentId": "10567",
"title": "Acme NDA 2024",
"processing": true,
"failed": false,
"endorsed": false,
"created": "2024-06-24T12:00:00Z",
"date": "2024-03-01",
"tagIds": [],
"predictedTagIds": [],
"extractionType": "standard",
"documentProcessingData": {},
"user": { "userId": "55021", "email": "user@example.com" },
"snippet": [],
"similarity": 0,
"approvalStatus": "draft",
"pageCount": null
}
}
Note that "processing" is true and "pageCount" is still null: the document is queued for ingestion. Poll the document or listen to events to track when processing completes.
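A polling loop might look like the sketch below. Which endpoint to poll is not specified here, so `fetch_document` is a hypothetical callable that you supply and that returns the current document object. The interval and attempt limit are arbitrary defaults.

```python
import time
from typing import Callable


def wait_until_processed(
    fetch_document: Callable[[], dict],
    interval_seconds: float = 5.0,
    max_attempts: int = 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until `processing` is false; raise if `failed` becomes true."""
    for attempt in range(max_attempts):
        document = fetch_document()
        if document.get("failed"):
            raise RuntimeError("document ingestion failed")
        if not document.get("processing"):
            return document
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # wait before the next poll
    raise TimeoutError("document is still processing after max_attempts polls")
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.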
Step 5 — List workspace documents
Retrieve all documents available in the workspace.
API Reference:
POST /documents/list
POST /documents/list
Content-Type: application/json
Request body:
{
"limit": 25,
"offset": 0,
"orderBy": [{ "field": "created", "direction": "DESC" }]
}
| Field | Required | Description |
|---|---|---|
| limit | Yes | Number of results to return |
| offset | Yes | Pagination offset |
| orderBy | No | Array of { field, direction } sort rules |
| search | No | Full-text search string |
| tagIds | No | Filter by tag IDs |
| processing | No | Filter to documents currently processing |
| failed | No | Filter to documents that failed ingestion |
| endorsed | No | Filter to endorsed documents only |
Response (200):
{
"count": 42,
"documents": [
{
"documentId": "10567",
"title": "Acme NDA 2024",
"processing": false,
"failed": false,
"endorsed": false,
"created": "2024-06-24T12:00:00Z",
"date": "2024-03-01",
"tagIds": [],
"predictedTagIds": [],
"extractionType": "standard",
"documentProcessingData": {},
"user": { "userId": "55021", "email": "user@example.com" },
"snippet": [],
"similarity": 0,
"approvalStatus": "draft",
"pageCount": 12
}
]
}
count is the total number of matching documents across all pages, not just the current page.
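Because count covers all pages, it can serve as the stopping condition when paging through a workspace. The sketch below assumes offset counts documents rather than pages, and uses a hypothetical `fetch_page` callable that posts the given body to /documents/list and returns the parsed response.

```python
from typing import Callable, Iterator


def iter_documents(
    fetch_page: Callable[[dict], dict], limit: int = 25
) -> Iterator[dict]:
    """Yield every document, requesting one page at a time."""
    offset = 0
    while True:
        page = fetch_page(
            {
                "limit": limit,
                "offset": offset,  # assumption: offset is a document index
                "orderBy": [{"field": "created", "direction": "DESC"}],
            }
        )
        documents = page["documents"]
        yield from documents
        offset += len(documents)
        # Stop on an empty page or once all `count` documents were seen.
        if not documents or offset >= page["count"]:
            break
```

The empty-page check guards against looping forever if documents are deleted while you iterate.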