Quick Start
Get up and running with Refractr in 60 seconds. You'll need an API key; contact us to get one.
Python
import requests

BASE_URL = "https://api.refractr.io"
API_KEY = "dm_your_api_key_here"

# Template keys name the fields to extract; None marks a scalar value.
payload = {
    "document_text": "Invoice #2847\nDate: 2026-01-15\nTotal: €1,249.00",
    "template": {
        "invoice_number": None,
        "date": None,
        "total_amount": None,
    },
}

response = requests.post(
    f"{BASE_URL}/api/v1/extract/",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json=payload,
    timeout=30,
)
response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error body
result = response.json()
print(result["extracted_data"])
curl
curl -X POST https://api.refractr.io/api/v1/extract/ \
  -H "Authorization: Bearer dm_your_api_key_here" \
  -H "Content-Type: application/json" \
  -d '{
    "document_text": "Invoice #2847\nDate: 2026-01-15\nTotal: €1,249.00",
    "template": {
      "invoice_number": null,
      "date": null,
      "total_amount": null
    }
  }'
JavaScript
const response = await fetch(
  "https://api.refractr.io/api/v1/extract/",
  {
    method: "POST",
    headers: {
      "Authorization": "Bearer dm_your_api_key_here",
      "Content-Type": "application/json"
    },
    body: JSON.stringify({
      document_text: "Invoice #2847\nDate: 2026-01-15\nTotal: €1,249.00",
      template: {
        invoice_number: null,
        date: null,
        total_amount: null
      }
    })
  }
);
const result = await response.json();
console.log(result.extracted_data);
The API returns your extracted data structured exactly as you defined it. In the example above, you'd get back an object with invoice_number, date, and total_amount fields filled in.
Authentication
All API requests must include your API key in the Authorization header.
Authorization: Bearer dm_your_api_key_here
Tip: Keep your API key secret and never commit it to version control. If it is compromised, regenerate it immediately from your dashboard.
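One common way to keep the key out of source files is to read it from an environment variable at runtime. The sketch below assumes a variable named REFRACTR_API_KEY and a helper called auth_headers; both names are illustrative and are not part of the Refractr API.

```python
import os


def auth_headers(env_var: str = "REFRACTR_API_KEY") -> dict:
    """Build the Authorization header from an environment variable.

    The variable name REFRACTR_API_KEY is an assumption made for this sketch.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        # Fail early with a clear message rather than sending an empty key.
        raise RuntimeError(f"Set {env_var} before calling the API")
    return {"Authorization": f"Bearer {api_key}"}
```

The returned dictionary can be passed directly as the headers argument of requests.post.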
Extract Data
The core endpoint. Submit a document and get back structured data matching your template.
POST /api/v1/extract/
Request Body
Example
{
  "document_text": "BREAKING — Nvidia soars 12% to ~$187 after crushing Q4 earnings. Revenue hit $22.1B vs $20.4B expected.",
  "template": {
    "company": null,
    "stock_move_pct": null,
    "revenue": null
  },
  "wait": true
}
document_text required
The raw text to extract from. Max 50,000 characters.
template required
A JSON object defining the structure you want. Use null for scalar values, [] for lists, and nested objects for hierarchical data.
wait
Whether to wait for the extraction to complete before responding (default: true). Set to false to use async mode and poll for the result later.
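The three parameters above can be assembled and checked on the client before a request is sent. This is a minimal sketch: build_payload and the template fields line_items and vendor are hypothetical names chosen for illustration, and the length check counts Python string characters, which may not match how the API counts them.

```python
MAX_DOCUMENT_CHARS = 50_000  # limit stated in the parameter reference above


def build_payload(document_text: str, template: dict, wait: bool = True) -> dict:
    """Validate the inputs locally and return the JSON body for the extract endpoint."""
    if not document_text:
        raise ValueError("document_text is required")
    if len(document_text) > MAX_DOCUMENT_CHARS:
        raise ValueError(
            f"document_text has {len(document_text)} characters; "
            f"the maximum is {MAX_DOCUMENT_CHARS}"
        )
    if not isinstance(template, dict):
        raise ValueError("template must be a JSON object (a dict)")
    return {"document_text": document_text, "template": template, "wait": wait}


# A template mixing the three shapes described above (field names are illustrative).
example_template = {
    "invoice_number": None,                      # scalar value
    "line_items": [],                            # list of values
    "vendor": {"name": None, "address": None},   # nested object
}

payload = build_payload("Invoice #2847\nTotal: €1,249.00", example_template)
```

None serializes to JSON null when the payload is sent with the json argument of requests.post.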
Response (Sync Mode)
200 OK
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "success",
  "extracted_data": {
    "company": "Nvidia",
    "stock_move_pct": 12.0,
    "revenue": "$22.1B"
  },
  "metadata": {
    "latency_ms": 340,
    "credits_charged": 1
  }
}
Response (Async Mode)
202 Accepted
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending"
}
When wait: false, the API returns immediately with a job ID. Use the Poll Status endpoint to check when your result is ready.
Poll Status
Check the status of an async extraction job.
GET /api/v1/extract/{job_id}/
Response (Pending)
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending"
}
Response (Complete)
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "success",
  "extracted_data": { ... },
  "metadata": { ... }
}
Best Practice: Start with a 100 ms delay between polls, then back off exponentially. Most extractions complete within 1–5 seconds.
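The polling advice above could be sketched as follows. The function poll_until_done and its parameters are hypothetical, and the doubling factor, 5-second delay cap, and 60-second overall timeout are assumptions rather than documented values. The status request is passed in as a callable so the loop can be exercised without network access. Only the "pending" and "success" statuses appear in this reference, so the sketch returns any non-pending response to the caller.

```python
import time
from typing import Callable


def poll_until_done(
    fetch_status: Callable[[], dict],
    initial_delay: float = 0.1,   # 100 ms, as recommended above
    factor: float = 2.0,          # assumed backoff multiplier
    max_delay: float = 5.0,       # assumed cap on a single wait
    timeout: float = 60.0,        # assumed overall budget in seconds
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch_status until the job is no longer pending, backing off exponentially."""
    delay = initial_delay
    deadline = clock() + timeout
    while True:
        job = fetch_status()
        if job.get("status") != "pending":
            return job
        if clock() + delay > deadline:
            raise TimeoutError("job still pending after the polling timeout")
        sleep(delay)
        delay = min(delay * factor, max_delay)


# Usage with requests (job_id, BASE_URL and headers come from the earlier examples):
#   job = poll_until_done(
#       lambda: requests.get(
#           f"{BASE_URL}/api/v1/extract/{job_id}/", headers=headers, timeout=30
#       ).json()
#   )
```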
Error Codes
400 Bad Request
Invalid request (e.g., malformed JSON, missing required fields, template too complex). Check the error message for details.
401 Unauthorized
Invalid or missing API key. Verify your key is correct and included in the Authorization header.
402 Payment Required
Insufficient API credits. Purchase more at /billing/.
429 Too Many Requests
Rate limited. Requests are throttled to 100 per minute. Retry with exponential backoff.
503 Service Unavailable
GPU servers are temporarily unavailable. Retry in a few moments.
504 Gateway Timeout
Extraction took longer than 30 seconds. Use async mode (wait: false) for large documents.
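A client might map the status codes above to a next step along these lines. This is a sketch under stated assumptions: classify_response, backoff_delay, and the action labels are hypothetical names, and the 1-second base and 30-second cap are illustrative values that the API does not prescribe.

```python
def classify_response(status_code: int) -> str:
    """Map an HTTP status code from the table above to a suggested client action."""
    if status_code in (200, 202):
        return "ok"
    if status_code in (429, 503):
        return "retry"              # transient: wait, then send the same request again
    if status_code == 504:
        return "use_async"          # resubmit with wait set to false and poll
    if status_code == 402:
        return "add_credits"
    if status_code in (400, 401):
        return "fix_request"        # retrying unchanged will fail again
    return "unexpected"


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based), doubling up to a cap."""
    return min(cap, base * (2 ** attempt))
```

In practice, adding a small random jitter to each delay helps keep many clients from retrying in lockstep; it is omitted here to keep the sketch deterministic.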