
Supported Formats

Type    Extensions      Max Size
PDF     .pdf            50 MB
Word    .docx           50 MB
Text    .txt, .md       50 MB
Data    .csv, .json     50 MB
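
Checking a file locally before uploading can avoid a rejected request. The sketch below encodes the formats and size limit listed above; the function name `check_upload` is hypothetical, and treating "50 MB" as 50 × 1024 × 1024 bytes is an assumption (the API may count 50,000,000 bytes instead).

```python
from pathlib import Path

# Extensions and size cap taken from the Supported Formats list.
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt", ".md", ".csv", ".json"}
# Assumption: "50 MB" means 50 * 1024 * 1024 bytes.
MAX_SIZE_BYTES = 50 * 1024 * 1024


def check_upload(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    ext = Path(filename).suffix.lower()  # case-insensitive match, e.g. ".PDF"
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {ext or '(none)'}")
    if size_bytes > MAX_SIZE_BYTES:
        problems.append(f"file is {size_bytes} bytes, over the {MAX_SIZE_BYTES}-byte limit")
    return problems
```

For a file on disk, `check_upload(path.name, path.stat().st_size)` supplies both arguments from a `pathlib.Path`.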

Quick Start

1. Create a Dataset

curl -X POST https://api.cuadra.ai/v1/datasets \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: create-ds-001" \
  -d '{"name": "Support KB", "description": "FAQs and guides"}'
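
The same request can be assembled from Python using only the standard library. This is a minimal sketch, not an official client: the helper name `build_create_dataset_request` is hypothetical, and generating a UUID when no idempotency key is supplied is an assumption about what the API accepts as a key. The function only builds the request so it can be inspected before anything is sent.

```python
import json
import urllib.request
import uuid

API_BASE = "https://api.cuadra.ai/v1"


def build_create_dataset_request(token, name, description, idempotency_key=None):
    """Build (but do not send) the POST /datasets request from the curl example."""
    # Assumption: any unique string is a valid idempotency key; default to a UUID.
    key = idempotency_key or str(uuid.uuid4())
    body = json.dumps({"name": name, "description": description}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/datasets",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
            "Idempotency-Key": key,
        },
    )
```

Passing the returned object to `urllib.request.urlopen` sends it. If a request is retried after a network error, reusing the same key rather than generating a new one is the usual way idempotency keys are meant to be used.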

2. Upload Documents

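Use the Files API to upload documents, then associate them with the dataset.

Once the dataset has content, attach it to a model. The request below references dataset ds_xyz from model model_abc with usageType set to "rag":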
curl -X POST https://api.cuadra.ai/v1/models/model_abc/datasets \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"datasetId": "ds_xyz", "usageType": "rag"}'

Document Processing

Documents are processed asynchronously after upload:

Status        Description
processing    Chunking and embedding in progress
ready         Available for queries
failed        Processing error (check file integrity)

Processing time depends on document size. PDFs with complex layouts take longer. Poll the file status or use webhooks to know when processing completes.
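
A polling loop for the three statuses might look like the sketch below. The file-status endpoint is not shown on this page, so the loop takes a caller-supplied `fetch_status` function (hypothetical) that returns the current status string; the timeout and interval defaults are arbitrary assumptions, not documented limits.

```python
import time

# "ready" and "failed" end processing; "processing" means keep waiting.
TERMINAL_STATUSES = {"ready", "failed"}


def wait_until_processed(fetch_status, timeout_s=300.0, interval_s=2.0, sleep=time.sleep):
    """Poll fetch_status() until it returns 'ready' or 'failed', or time runs out.

    fetch_status is a caller-supplied (hypothetical) function that asks the API
    for the file's current status and returns it as a string.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        # Stop if the next wait would run past the deadline.
        if time.monotonic() + interval_s > deadline:
            raise TimeoutError(f"still '{status}' after {timeout_s} seconds")
        sleep(interval_s)
```

The caller still has to handle a returned "failed", for example by checking file integrity and uploading again. Webhooks avoid the repeated requests entirely where they are available.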


Best Practices

Organize by topic

Create separate datasets for different knowledge domains (e.g., “Product Docs”, “Legal”, “HR Policies”). This improves retrieval relevance and lets you control which knowledge each model can access.

Keep documents focused

Prefer multiple focused documents over one large document. The chunking algorithm works best with well-structured content.
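
One way to break up a large Markdown file before uploading is to split it at its section headings. The helper below is an illustrative sketch, not part of the API: `split_by_heading` is a hypothetical name, and it assumes ATX-style headings (`## Title`) that do not appear inside code blocks. Text before the first matching heading is dropped.

```python
import re


def split_by_heading(markdown_text, level=2):
    """Split Markdown into (title, section_text) pairs at headings of one level."""
    marker = "#" * level
    # Match lines such as "## Title"; deeper headings stay inside their section.
    pattern = re.compile(rf"^{marker} +(.+)$", re.MULTILINE)
    matches = list(pattern.finditer(markdown_text))
    sections = []
    for i, match in enumerate(matches):
        # A section runs until the next heading of the same level, or end of text.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(markdown_text)
        title = match.group(1).strip()
        sections.append((title, markdown_text[match.start():end].strip()))
    return sections
```

Each returned section can then be written to its own file and uploaded as a separate document.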

Use descriptive filenames

Filenames appear in source citations. Use descriptive names like password-reset-guide.pdf instead of doc123.pdf.
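
If documents are exported from another system, filenames can be derived from their titles. The sketch below is one possible convention (lowercase words joined by hyphens), not a requirement of the API; `descriptive_filename` is a hypothetical helper, and it drops any character outside a-z and 0-9, including accented letters.

```python
import re


def descriptive_filename(title, extension):
    """Turn a human-readable title into a lowercase, hyphenated filename."""
    # Collapse every run of characters outside a-z / 0-9 into a single hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    if not slug:
        raise ValueError("title contains no letters or digits")
    return f"{slug}.{extension.lstrip('.')}"
```

For example, `descriptive_filename("Password Reset Guide", ".pdf")` returns `password-reset-guide.pdf`.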