Document Data Extraction API

Streamline your workflows with our Document Data Extraction API, designed to transform any structured or unstructured document into actionable, structured data.

Elevate your data handling capabilities with Extracta.ai's Document Data Extraction API. Our cutting-edge solution empowers your systems to automatically extract structured data from a myriad of documents - whether they are scanned images, PDFs, emails, invoices, contracts, or any digital file format you can think of. Tailored to meet the needs of various industries, our API facilitates the seamless automation of workflows, significantly reducing manual efforts and enhancing overall efficiency.

Features:

Universal Compatibility: Process documents in any format - PDF, DOCX, TXT, JPG, PNG, and more.
High Accuracy and Speed: Leverage state-of-the-art technology that requires no pre-training, ensuring rapid extraction with superior accuracy.
Customizable Data Extraction: Define specific extraction criteria to meet your unique business needs, from extracting specific text sections to complex data points.
Easy Integration: With developer-friendly API documentation, integrate our service smoothly into your existing software or workflow.
Scalability: From a few documents to thousands, our API can handle batches of any size efficiently.
Security: Your data's privacy and security are paramount. We ensure that your information is never used for training purposes and is handled with the highest confidentiality.

Whether you're a software developer, a business analyst, or a data scientist, our Document Data Extraction API is designed to streamline your data processing tasks, allowing you to focus on what truly matters - driving your business forward. Start with Extracta.ai today and transform the way you handle documents forever.

API Documentation

Endpoints

Process Document

Structure your request with mandatory parameters: 'name', 'language', 'fields' and 'file'. Each field requires a 'key', with 'description' and 'example' being optional. The document must be provided as either 'base64String' or a 'fileUrl'.
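As a minimal sketch of the two ways to supply the document, the helper below returns either a URL unchanged or the base64 encoding of raw file bytes. The function name `file_param` is hypothetical. Sending plain base64 text without a data-URI prefix is an assumption, since the exact encoding the API expects is not stated here.

```python
import base64
from typing import Optional


def file_param(url: Optional[str] = None, data: Optional[bytes] = None) -> str:
    """Return a value for the 'file' parameter from a URL or raw bytes.

    Assumption: base64 input is sent as plain text with no data-URI prefix.
    """
    if (url is None) == (data is None):
        raise ValueError("provide exactly one of url or data")
    if url is not None:
        return url
    return base64.b64encode(data).decode("ascii")
```

For a local file, the bytes could be read with `pathlib.Path(path).read_bytes()` and passed as `data`.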

## API Documentation
This section provides guidelines for structuring your Document Parsing API requests to Extracta.ai. Follow the format below for successful data extraction:

## Request Format
```
{
"extractionDetails": {
"name": "Extraction Name", // required - Name your extraction process
"language": "Supported Language", // required - Choose from the supported languages
"fields": [
{
"key": "Field Key", // required - Define the key for data extraction
"description": "Field Description", // optional - Describe the field
"example": "Field Example" // optional - Provide an example value
},
...
]
},
"file": "base64String or file URL" // required - Provide the document in base64String format or as a URL
}
```
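The request format can be assembled programmatically. The sketch below builds the body as a Python dictionary and serializes it with `json.dumps`. The `//` comments in the format above are annotations only, since JSON itself has no comment syntax. The helper name `build_request` is hypothetical and the check it performs is only the "every field needs a key" rule stated earlier.

```python
import json


def build_request(name, language, fields, file):
    """Assemble the request body in the documented shape."""
    for field in fields:
        if not field.get("key"):
            raise ValueError("every field needs a 'key'")
    return {
        "extractionDetails": {
            "name": name,
            "language": language,
            "fields": fields,
        },
        "file": file,
    }


body = build_request(
    name="CV - Extraction",
    language="English",
    fields=[
        {"key": "name", "description": "the name of the person in the CV"},
        {"key": "email"},  # description and example are optional
    ],
    file="https://deveatery.com/extracta/cv.png",
)
print(json.dumps(body, indent=2))
```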
## Advanced Format
In addition to the basic format outlined in the previous sections, Extracta.ai also supports more complex data structures for specialized extraction needs. This advanced format allows the definition of **nested objects and arrays**, catering to a broader range of data representation.

### Type `object`
The **object** type represents a structured object with multiple **properties**. Each property is defined as an object within an array, and can include its own **key**, **description**, **type**, and **example**.
```
{
"key": "personal_info",
"description": "Personal information of the person", // optional
"type": "object",
"properties": [
{
"key": "name",
"description": "Name of the person", // optional
"example": "Alex Smith", // optional
"type": "string" // optional
},
{
"key": "email",
"description": "Email of the person",
"example": "[email protected]",
"type": "string"
},
.....
]
}
```

### Type `array`
The **array** type is used for lists of **items**, such as a collection of work experiences. The items key contains an object defining the structure of each item in the array.
```
{
"key": "work_experience",
"description": "Work experience of the person", // optional
"type": "array",
"items": {
"type": "object",
"properties": [
{
"key": "title",
"description": "Title of the job", // optional
"example": "Software Engineer", // optional
"type": "string" // optional
},
{
"key": "start_date",
"description": "Start date of the job",
"example": "2022",
"type": "string"
},
...
]
}
}
```

### Notes on Usage
- For both `object` and `array` types, the `example` parameter is applicable only for their inner properties/items.
- When defining fields, if no `type` is specified, it defaults to `string`.
- For `object` and `array` types, the inner fields can only be of type `string`. This means that each property within an object or each item within an array should be a string type, ensuring consistency and simplicity in data representation.
- These advanced field types enable more detailed and structured data representation, enhancing the capabilities of Extracta.ai's data extraction process.
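The usage notes can be turned into a small client-side check before a request is sent. This is a sketch under the rules listed above only: `type` defaults to `string`, inner fields of an `object` or `array` must be strings, and `example` belongs on inner fields rather than on the container. The function name `check_field` is hypothetical, and the API may apply further validation that is not described here.

```python
def check_field(field, nested=False):
    """Return a list of problems found in one field definition."""
    problems = []
    key = field.get("key")
    if not key:
        problems.append("missing 'key'")
    field_type = field.get("type", "string")  # documented default
    if nested and field_type != "string":
        problems.append(f"{key}: inner fields must be strings")
    if field_type in ("object", "array") and "example" in field:
        problems.append(f"{key}: 'example' applies only to inner fields")
    if field_type == "object":
        for prop in field.get("properties", []):
            problems += check_field(prop, nested=True)
    elif field_type == "array":
        # The 'items' object describes each element; only its properties are fields.
        for prop in field.get("items", {}).get("properties", []):
            problems += check_field(prop, nested=True)
    return problems
```

An empty list means the definition is consistent with the notes above.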

## Supported File Types

Extracta.ai is capable of processing documents in **image (JPG, PNG), PDF, and DOCX formats**. This enhancement allows for a wider range of document types to be submitted for extraction.

## Supported Languages

Extracta.ai currently supports document extraction in the following languages: **Romanian, English, French, Spanish, Arabic, Portuguese, German, Italian**. Additional support for 20 more languages is planned.

**Note**: If an unsupported language is specified, the API will return an error message indicating an invalid language choice. Check our API documentation for new language additions.
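Since an unsupported language produces an error, it can be cheaper to check the value locally first. The sketch below uses the eight languages listed above. The exact spelling and capitalization the API accepts is an assumption taken from that list, and `check_language` is a hypothetical helper name.

```python
SUPPORTED_LANGUAGES = {
    "Romanian", "English", "French", "Spanish",
    "Arabic", "Portuguese", "German", "Italian",
}


def check_language(language: str) -> str:
    """Return the language unchanged, or raise if it is not in the listed set."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    return language
```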

                                                                            
POST https://zylalabs.com/api/3606/document+data+extraction+api/4000/process+document

Process Document - Endpoint Features

Object	Description
`Request Body`	[Required] Json

Request Body

{
	"extractionDetails": {
		"name": "CV - Extraction",
		"language": "English",
		"fields": [
			{
				"key": "name",
				"description": "the name of the person in the CV",
				"example": "Johan Smith"
			},
			{
				"key": "email",
				"description": "the email of the person in the CV",
				"example": "johan@gmail.com"
			},
			{
				"key": "phone",
				"description": "the phone number of the person",
				"example": "123 333 4445"
			},
			{
				"key": "address",
				"description": "the complete address of the person",
				"example": "1234 Main St, New York, NY 10001"
			},
			{
				"key": "soft_skills",
				"description": "the soft skills of the person",
				"example": ""
			},
			{
				"key": "hard_skills",
				"description": "the hard skills of the person",
				"example": ""
			},
			{
				"key": "last_job",
				"description": "the last job of the person",
				"example": "Software Engineer"
			},
			{
				"key": "years_of_experience",
				"description": "the years of experience of last job",
				"example": "5"
			}
		]
	},
	"file": "https://deveatery.com/extracta/cv.png"
}

API EXAMPLE RESPONSE

       
                                                                                                        
{
	"name": "Darren Charles",
	"email": "[email protected]",
	"phone": "+1-709-680-9033",
	"address": "9 Corpus Christi, Texas",
	"soft_skills": "highly motivated, ability to translate business strategies, learn new things",
	"hard_skills": "Matlab, MeVisLab, Keras, CUDA, Git, DataStage, MQTT",
	"last_job": "Trainee With English Communications",
	"years_of_experience": "Ongoing"
}
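In the example response, the keys mirror the `key` values from the request and every value is a string. Assuming responses follow that flat shape in general (which this page shows only by example), a client could check which requested fields came back. The helper name `missing_keys` is hypothetical.

```python
import json


def missing_keys(requested_fields, response_text):
    """List requested field keys that are absent from the JSON response."""
    result = json.loads(response_text)
    return [f["key"] for f in requested_fields if f["key"] not in result]


sample = '{"name": "Darren Charles", "last_job": "Trainee With English Communications"}'
print(missing_keys([{"key": "name"}, {"key": "phone"}], sample))
```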

Process Document - CODE SNIPPETS


curl --location --request POST 'https://zylalabs.com/api/3606/document+data+extraction+api/4000/process+document' \
--header 'Authorization: Bearer YOUR_API_KEY' \
--data-raw '{
	"extractionDetails": {
		"name": "CV - Extraction",
		"language": "English",
		"fields": [
			{
				"key": "name",
				"description": "the name of the person in the CV",
				"example": "Johan Smith"
			},
			{
				"key": "email",
				"description": "the email of the person in the CV",
				"example": "johan@gmail.com"
			},
			{
				"key": "phone",
				"description": "the phone number of the person",
				"example": "123 333 4445"
			},
			{
				"key": "address",
				"description": "the complete address of the person",
				"example": "1234 Main St, New York, NY 10001"
			},
			{
				"key": "soft_skills",
				"description": "the soft skills of the person",
				"example": ""
			},
			{
				"key": "hard_skills",
				"description": "the hard skills of the person",
				"example": ""
			},
			{
				"key": "last_job",
				"description": "the last job of the person",
				"example": "Software Engineer"
			},
			{
				"key": "years_of_experience",
				"description": "the years of experience of last job",
				"example": "5"
			}
		]
	},
	"file": "https://deveatery.com/extracta/cv.png"
}'
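A rough Python equivalent of the curl command, using only the standard library, might look like the sketch below. The `Content-Type` header is an assumption (the curl command does not set it), and `make_request` and `send` are hypothetical helper names. Building the request is separated from sending it so the request can be inspected without a network call.

```python
import json
import urllib.request

ENDPOINT = "https://zylalabs.com/api/3606/document+data+extraction+api/4000/process+document"


def make_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build the POST request with the bearer token and a JSON body."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",  # assumption: not shown in the curl command
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON response (performs a network call)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `send(make_request("YOUR_API_KEY", body))` would perform the actual call. `urlopen` raises `urllib.error.HTTPError` for non-2xx responses.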

API Access Key & Authentication

After signing up, every developer is assigned a personal API access key, a unique combination of letters and digits used to access our API endpoint. To authenticate with the Document Data Extraction REST API, include your bearer token in the Authorization header.

Headers

Header	Description
`Authorization`	[Required] Should be `Bearer access_key`. See "Your API Access Key" above when you are subscribed.

Simple Transparent Pricing

No long term commitments. One click upgrade/downgrade or cancellation. No questions asked.

(Save 2 months with annual billing 🎉)

🏃Pay as you go

$ 0.00/Month

Starts at $0.4331600 per request
No upfront cost, pay only for what you use
Specialized Customer Support
Real-Time API Monitoring

No commitment. Cancel anytime

💫Basic

$ 24.99/Month

150 Requests / Month
Then $0.2165800 per request if limit exceeded.
Specialized Customer Support
Real-Time API Monitoring
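To illustrate the Basic plan arithmetic, the sketch below adds the overage charge to the monthly fee using the figures listed above. Rounding the total to cents is an assumption, as the page does not say how overage is rounded or invoiced, and `basic_monthly_cost` is a hypothetical name.

```python
from decimal import Decimal

BASIC_PRICE = Decimal("24.99")        # monthly fee
BASIC_INCLUDED = 150                  # requests included per month
BASIC_OVERAGE = Decimal("0.2165800")  # per request beyond the limit


def basic_monthly_cost(requests_made: int) -> Decimal:
    """Estimated monthly cost on the Basic plan, rounded to cents (assumption)."""
    extra = max(0, requests_made - BASIC_INCLUDED)
    return (BASIC_PRICE + extra * BASIC_OVERAGE).quantize(Decimal("0.01"))


print(basic_monthly_cost(160))  # 10 requests over the limit
```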

Custom Volume
Specialized Customer Support
Real-Time API Monitoring

Customer favorite features

✔︎ Only Pay for Successful Requests
✔︎ Free 7-Day Trial
✔︎ Multi-Language Support
✔︎ One API Key, All APIs.
✔︎ Intuitive Dashboard

✔︎ Comprehensive Error Handling
✔︎ Developer-Friendly Docs
✔︎ Postman Integration
✔︎ Secure HTTPS Connections
✔︎ Reliable Uptime

What is Extracta.ai?

Extracta.ai represents an advanced technological platform dedicated to the extraction of structured data from diverse documents, such as resumes and invoices. This service aims to streamline workflows, eliminate the need for manual data entry, and boost productivity in numerous sectors.

What types of documents can we process?

We are capable of handling a broad spectrum of documents, encompassing both structured and unstructured formats, such as PDFs, Word documents, text files, and scanned images (in PNG, JPG formats), employing OCR technology as required.

Can it be integrated into existing systems?

Indeed, Extracta.ai is built for effortless integration. Our service can be easily connected to your current software and workflows via our API. Furthermore, we intend to provide options for local system deployment in the future to increase data privacy.

How do we differ from our competitors?

Diverging from the approach of competitors who depend on fixed templates and models, Extracta.ai employs meticulously adjusted Large Language Models (LLMs) for extracting data from any document without the need for previous training, achieving an accuracy rate of up to 99%. This method ensures enhanced flexibility, quicker deployment, and reduced costs.

Where can I find technical support or have other questions answered?

Our dedicated support team is available to assist you with any technical queries or further information. For support or any inquiries, please email us at: [email protected]

What is Zyla API Hub?

Zyla API Hub is like a big store for APIs, where you can find thousands of them all in one place. We also offer dedicated support and real-time monitoring of all APIs. Once you sign up, you can pick and choose which APIs you want to use. Just remember, each API needs its own subscription. But if you subscribe to multiple ones, you'll use the same key for all of them, making things easier for you.

What currencies and payment methods are allowed?

Prices are listed in USD (United States Dollar), EUR (Euro), CAD (Canadian Dollar), AUD (Australian Dollar), and GBP (British Pound). We accept all major debit and credit cards. Our payment system uses the latest security technology and is powered by Stripe, one of the world’s most reliable payment companies. If you have any trouble paying by card, just contact us at [email protected]

Additionally, if you already have an active subscription in any of these currencies (USD, EUR, CAD, AUD, GBP), that currency will remain for subsequent subscriptions. You can change the currency at any time as long as you don't have any active subscriptions.

Why can't I pay with my local currency even though I see it on the pricing page?

The local currency shown on the pricing page is based on the country of your IP address and is provided for reference only. The actual prices are in USD (United States Dollar). When you make a payment, the charge will appear on your card statement in USD, even if you see the equivalent amount in your local currency on our website. This means you cannot pay directly with your local currency.

My payment was declined, what should I do?

Occasionally, a bank may decline the charge due to its fraud protection settings. We suggest contacting your bank first to check whether it is blocking our charges. You can also open the Billing Portal and change the card associated with the payment. If neither option works and you need further assistance, please contact our team at [email protected]

How will I be charged for my API subscription?

Prices are determined by a recurring monthly or yearly subscription, depending on the chosen plan.

How will my API calls be deducted from my plan?

API calls are deducted from your plan based on successful requests. Each plan comes with a specific number of calls that you can make per month. Only successful calls, indicated by a Status 200 response, will be counted against your total. This ensures that failed or incomplete requests do not impact your monthly quota.

How does your billing cycle work?

Zyla API Hub works on a recurring monthly subscription system. Your billing cycle starts the day you purchase one of the paid plans and renews on the same day of the following month. Cancel your subscription beforehand if you want to avoid future charges.

How do I upgrade my current subscription plan with an API?

To upgrade your current subscription plan, simply go to the pricing page of the API and select the plan you want to upgrade to. The upgrade will be instant, allowing you to immediately enjoy the features of the new plan. Please note that any remaining calls from your previous plan will not be carried over to the new plan, so be aware of this when upgrading. You will be charged the full amount of the new plan.

How can I see the remaining number of API calls I can make this month?

To check how many API calls you have left for the current month, refer to the ‘X-Zyla-API-Calls-Monthly-Remaining’ field in the response header. For example, if your plan allows 1000 requests per month and you've used 100, this field in the response header will indicate 900 remaining calls.

How do I find out the maximum number of API requests allowed in my subscription plan?

To see the maximum number of API requests your plan allows, check the ‘X-Zyla-RateLimit-Limit’ response header. For instance, if your plan includes 1000 requests per month, this header will display 1000.

How do I know when my rate limit will reset?

The ‘X-Zyla-RateLimit-Reset’ header shows the number of seconds until your rate limit resets. This tells you when your request count will start fresh. For example, if it displays 3600, it means 3600 seconds are left until the limit resets.
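The three quota headers described above can be read together from a response. The sketch below takes any mapping of header names to values and matches names case-insensitively. It assumes the values are plain integers, as in the examples given, and `quota_from_headers` is a hypothetical helper name.

```python
def quota_from_headers(headers):
    """Extract quota information from a mapping of response headers."""
    lowered = {name.lower(): value for name, value in headers.items()}

    def as_int(name):
        value = lowered.get(name.lower())
        return int(value) if value is not None else None

    return {
        "remaining": as_int("X-Zyla-API-Calls-Monthly-Remaining"),
        "limit": as_int("X-Zyla-RateLimit-Limit"),
        "reset_seconds": as_int("X-Zyla-RateLimit-Reset"),
    }
```

With `urllib.request`, the mapping could be obtained as `dict(response.headers.items())`.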

Can I cancel anytime?

Yes, you can cancel your plan anytime by going to your account and selecting the cancellation option on the Billing page. Please note that upgrades, downgrades, and cancellations take effect immediately. Additionally, upon cancellation, you will no longer have access to the service, even if you have remaining calls left in your quota.

If I have any problems, who should I contact?

You can contact us through our chat channel to receive immediate assistance. We are always online from 8 am to 5 pm (EST). If you reach us after that time, we will get back to you as soon as possible. Additionally, you can contact us via email at [email protected]

How does the 7-day free trial work?

To let you experience our APIs without any commitment, we offer a 7-day free trial that allows you to make API calls at no cost during this period. Please note that you can only use this trial once, so make sure to use it with the API that interests you the most. Most of our APIs provide a free trial, but some may not support it.

What happens if I forget to cancel my free trial?

After 7 days, you will be charged the full amount for the plan you were subscribed to during the trial. Therefore, it’s important to cancel before the trial period ends. Refund requests for forgetting to cancel on time are not accepted.

How many calls can I make during the free trial?

When you subscribe to an API trial, you can make only 25% of the calls allowed by that plan. For example, if the API plan offers 1000 calls, you can make only 250 during the trial. To access the full number of calls offered by the plan, you will need to subscribe to the full plan.
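The 25% rule is simple arithmetic. The sketch below uses integer math, and rounding down when the plan size is not a multiple of four is an assumption, since the page does not say how fractions are handled. `trial_calls` is a hypothetical name.

```python
def trial_calls(plan_calls: int) -> int:
    """Calls available during the trial: 25% of the plan's monthly calls."""
    return plan_calls * 25 // 100  # rounds down (assumption)


print(trial_calls(1000))  # 250, matching the example above
```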

Service Level: 100%
Response Time: 1,529ms
Category: NLP
Tags: #dataExtraction, #documentParsing

Plan prices

Plan	Monthly billing	Annual billing (per month)
🏃Pay as you go	$ 0.00/Month	not listed
💫Basic	$ 24.99/Month	$ 20.83/Month
⚡Pro	$ 49.99/Month	$ 41.66/Month
🔥Pro Plus	$ 99.99/Month	$ 83.33/Month
⚜️Premium	$ 199.99/Month	$ 166.66/Month
🌟Elite	$ 499.99/Month	$ 416.66/Month
💎Ultimate	$ 999.99/Month	$ 833.33/Month

🚀 Enterprise: Starts at $ 10,000/Year

Related APIs

Document Parser API

Web Data Extractor API

Doc to Text API

Domain Data Extractor API

Deep File Detective API

Passport Data Extractor API

ID Document OCR API

Text Extractor API

Exif Extractor API

Content Extractor API

Web Content Extractor API

Article Data Extractor API

Web Extractor API

Metadata Extractor API

Business Data Extractor API

Passport Extractor API

Passport Information Capture API
