Text Purify API

Text Purify API extracts clean text from web pages by removing ads and irrelevant content, facilitating automated reading and processing.

About the API:  

Text Purify API extracts the relevant text from articles and web pages, giving users clean, meaningful data without the clutter of ads, menus and other unwanted elements.
The Text Purify API is a cloud-based service that extracts the core content of web articles. It suits applications that collect and analyse content from news sites, blogs, research publications and similar sources. It uses natural language processing (NLP) and machine learning techniques to identify the main body text of a web page and to exclude ads, menus, sidebars and other non-essential elements automatically.

It can handle a wide variety of web page formats and layout styles, so extraction does not depend on a particular website design. The API works with content in different languages, which makes it suitable for global applications. The interface is simple and documented, so it is easy to integrate with your existing applications and workflows. The API is designed for fast responses, which matters for real-time applications and large-scale data analysis.

 

What does this API receive and what does it provide (input/output)?

The Text Purify API receives a URL and optional settings, and provides clean text of the article, excluding ads, along with metadata such as title and author.

 

What are the most common use cases of this API?

  1. Extract the main text of articles from multiple news sources and present it on a unified platform, improving the user experience by avoiding ads and irrelevant content.

  2. Collect information from academic and research articles, so researchers can extract the essential content for analysis and review without the distraction of advertising.

  3. Build applications that generate concise summaries of web articles by extracting only the main, relevant content, offering users more digestible versions of long texts.

  4. Let content curators extract and present only the most relevant text from articles and publications, so their audiences receive high-quality information without distracting elements.

  5. Extract relevant content from online reviews and articles for sentiment analysis, helping companies better understand public perception of their products or services.

     

Are there any limitations to your plans?

Basic Plan: 50 requests per minute.

Pro Plan: 100 requests per minute.

Pro Plus Plan: 240 requests per minute.

Premium Plan: 360 requests per minute.
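
As a rough sketch, the per-minute limits above translate into a minimum spacing between requests. The helper name below is hypothetical, and even spacing is only one way to stay under a limit; this page does not say how the service enforces it.

```python
# Requests per minute for each plan, as listed above.
PLAN_LIMITS = {"Basic": 50, "Pro": 100, "Pro Plus": 240, "Premium": 360}


def min_interval_seconds(plan):
    """Return the even spacing (in seconds) that keeps a client within the plan limit.

    Hypothetical helper: it only divides one minute by the plan's request limit.
    """
    return 60.0 / PLAN_LIMITS[plan]
```

A client could call time.sleep(min_interval_seconds("Basic")) between requests to stay at or below 50 requests per minute.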

API Documentation

Endpoints


To use this endpoint, provide the URL of the article to extract its main content, cleaning out advertisements and non-relevant elements.

 

word_per_minute (optional): reading speed used to calculate the "time to read" value. The default is 300 words per minute. Adjust it to match the reading speed you want to estimate for.

desc_truncate_len (optional): maximum length of the generated description. The default is 210 characters. A longer description is truncated to this length.

desc_len_min (optional): minimum character count for the description. The default is 180 characters. If the extracted description is shorter than this, the API returns "null".

content_len_min (optional): minimum character count for the extracted content. The default is 200 characters. If the content is shorter than this, the API returns "null".
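
The parameters above can be assembled into a request URL along these lines. This is a minimal sketch: the function name build_extract_url is hypothetical, and the choice to send only the options the caller sets (leaving the rest to the API defaults) is an assumption, not documented behaviour.

```python
from urllib.parse import urlencode

# Endpoint URL as listed on this page.
ARTICLE_EXTRACT = "https://zylalabs.com/api/4949/text+purify+api/6229/article+extract"

# Optional parameters and their documented defaults.
DEFAULTS = {
    "word_per_minute": 300,
    "desc_truncate_len": 210,
    "desc_len_min": 180,
    "content_len_min": 200,
}


def build_extract_url(article_url, endpoint=ARTICLE_EXTRACT, **options):
    """Return the full request URL for article_url (hypothetical helper)."""
    unknown = set(options) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    # Send only what the caller set; urlencode percent-encodes the article URL.
    query = {"url": article_url, **options}
    return f"{endpoint}?{urlencode(query)}"
```

For example, build_extract_url("https://css-tricks.com/empathetic-animation/", word_per_minute=250) produces a URL whose url value is percent-encoded, which avoids ambiguity when the article address contains its own query string.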



                                                                            
GET https://zylalabs.com/api/4949/text+purify+api/6229/article+extract
                                                                            
                                                                        

Article Extract - Endpoint Features

Object Description
url [Required]
word_per_minute [Optional]
desc_truncate_len [Optional]
desc_len_min [Optional]
content_len_min [Optional]

API EXAMPLE RESPONSE

       
                                                                                                        
                                                                                                                                                                                                                            {"error":0,"message":"Article extraction success","data":{"url":"https://css-tricks.com/empathetic-animation/","title":"Empathetic Animation | CSS-Tricks","description":"Animation on the web is often a contentious topic. I think, in part, it’s because bad animation is blindingly obvious, whereas well-executed animation fades seamlessly into the background. When handled well,...","links":["https://css-tricks.com/empathetic-animation/","https://css-tricks.com/?p=358975"],"image":"https://css-tricks.com/wp-json/social-image-generator/v1/image/358975","content":"<div>\n<p>Animation on the web is often a contentious topic. I think, in part, it’s because bad animation is blindingly obvious, whereas well-executed animation fades seamlessly into the background. When handled well, animation can really elevate a website, whether it’s just adding a bit of personality or providing visual hints and lessening cognitive load. Unfortunately, it often feels like there are two camps, <strong>accessibility</strong> vs. <strong>animation</strong>. This is such a shame because we can have it all! All it requires is a little consideration.</p>\n<p>Here’s a couple of important questions to ask when you’re creating animations.</p>\n<h3 id=\"does-this-animation-serve-a-purpose\">Does this animation serve a purpose?</h3>\n<p>This sounds serious, but don’t worry — the site’s purpose is key. If you’re building a personal portfolio, go wild! However, if someone’s trying to file a tax return, whimsical loading animations aren’t likely to be well-received. 
On the other hand, an animated progress bar could be a nice touch while providing visual feedback on the user’s action.</p>\n<h3 id=\"is-it-diverting-focus-from-important-information\">Is it diverting focus from important information?</h3>\n<p>It’s all too easy to get caught up in the excitement of whizzing things around, but remember that the web is primarily an information system. When people are trying to read, animating text or looping animations that play nearby can be hugely distracting, especially for people with ADD or ADHD. Great animation aids focus; it doesn’t disrupt it.</p>\n<p>So! Your animation’s passed the test, what next? Here are a few thoughts…</p>\n<h3 id=\"did-we-allow-users-to-optout\">Did we allow users to opt-out?</h3>\n<p>It’s important that our animations are safe for people with motion sensitivities. Those with vestibular (inner ear) disorders can experience dizziness, headaches, or even nausea from animated content.</p>\n<p>Luckily, we can tap into operating system settings with the <a target=\"_blank\" href=\"https://css-tricks.com/introduction-reduced-motion-media-query/\"><code>prefers-reduced-motion</code></a> media query. 
This media query detects whether the user has requested the operating system to minimize the amount of animation or motion it uses.</p>\n<figure><img src=\"https://paper-attachments.dropbox.com/s_7D92D8EBF340C60751235A0E3B9FF212DD6FAF34423CBF01BCA6C02C8A4FAF71_1638134946293_Screenshot+2021-11-28+at+21.28.58.png\" alt=\"Screenshot of the user preferences settings in MacOS, open to Accessibility and displaying options for how to display things, including one option for reduce motion, which is checked.\" /><figcaption>The reduced motion settings in macOS.</figcaption></figure>\n<p>Here’s an example:</p>\n<pre><code>@media (prefers-reduced-motion: reduce) {\n  *,\n  *::before,\n  *::after {\n    animation-duration: 0.01ms !important;\n    animation-iteration-count: 1 !important;\n    transition-duration: 0.01ms !important;\n    scroll-behavior: auto !important;\n  }\n}</code></pre>\n<p>This snippet taps into that user setting and, if enabled, it gets rid of <strong>all</strong> your CSS animations and transitions. It’s a bit of a sledgehammer approach though — remember, the key word in this media query is <em>reduced</em>. Make sure functionality isn’t breaking and that users aren’t losing important context by opting out of the animation. I prefer tailoring reduced motion options for those users. Think simple opacity fades instead of zooming or panning effects.</p>\n<h3 id=\"what-about-javascript-though\">What about JavaScript, though?</h3>\n<p>Glad you asked! We can make use of the reduced motion media query in JavaScript land, too!</p>\n<pre><code>let motionQuery = matchMedia('(prefers-reduced-motion)');\nconst handleReduceMotion = () =&gt; {\n  if (motionQuery.matches) {\n    // reduced motion options\n  }\n}\nmotionQuery.addListener(handleReduceMotion);\nhandleReduceMotion()</code></pre>\n<p>Tapping into system preferences isn’t bulletproof. After all, it’s there’s no guarantee that everyone affected by motion knows how to change their settings. 
To be extra safe, it’s possible to add a reduced motion toggle in the UI and put the power back in the user’s hands to decide. <a target=\"_blank\" href=\"https://www.wethecollective.com/\">We {the collective}</a> has a really nice implementation on their site</p>\n<figure></figure>\n<p>Here’s a straightforward example:</p>\n<h3 id=\"scroll-animations\">Scroll animations</h3>\n<p>One of my favorite things about animating on the web is hooking into user interactions. It opens up a world of creative possibilities and really allows you to engage with visitors. But it’s important to remember that not all interactions are opt-in — some (like scrolling) are inherently tied to how someone navigates around your site.</p>\n<p>The Nielson Norman Group has done some <a target=\"_blank\" href=\"https://www.nngroup.com/articles/scrolling-and-attention/\">great </a><a target=\"_blank\" href=\"https://www.nngroup.com/articles/scroll-animations/\">research</a> on scroll interactions. One particular part really stuck out for me. They found that a lot of task-focused users couldn’t tell the difference between slow load times and scroll-triggered entrance animations. All they noticed was a frustrating delay in the interface’s response time. I can relate to this; it’s annoying when you’re trying to scan a website for some information and you have to wait for the page to slowly ease and fade into view.</p>\n<p>If you’re using GreenSock’s <a target=\"_blank\" href=\"https://greensock.com/scrolltrigger/\">ScrollTrigger</a> plugin for your animations, you’re in luck. We’ve added a cool little property to help avoid this frustration: <strong><code>fastScrollEnd</code></strong>.</p>\n<p><code>fastScrollEnd</code> detects the users’ scroll velocity. ScrollTrigger skips the entrance animations to their end state when the user scrolls super fast, like they’re in a hurry. 
Check it out!</p>\n<p>There’s also a super easy way to make your scroll animations reduced-motion-friendly with <code><a target=\"_blank\" href=\"https://greensock.com/docs/v3/Plugins/ScrollTrigger/static.matchMedia()\">ScrollTrigger.matchMedia()</a></code>:</p>\n<hr />\n<p>I hope these snippets and insights help. Remember, consider the purpose, lead with empathy, and use your animation powers responsibly!</p>\n      </div>","author":"@cassiecodes","favicon":"https://i0.wp.com/css-tricks.com/wp-content/uploads/2021/07/star.png?fit=180%2C180&ssl=1","source":"css-tricks.com","published":"2022-01-18T09:38:10-07:00","ttr":150,"type":"article"}}
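
A response shaped like the example above could be handled roughly as follows. This is a sketch under assumptions: the function name parse_article is hypothetical, a non-zero "error" value is treated as a failure, and a too-short description or content is assumed to arrive as JSON null in its own field.

```python
import json


def parse_article(body):
    """Parse a response body and return selected fields from its "data" object.

    Raises ValueError when "error" is not 0 (assumed here to signal failure).
    """
    payload = json.loads(body)
    if payload.get("error") != 0:
        raise ValueError(payload.get("message", "extraction failed"))
    data = payload["data"]
    # "description" and "content" may be None when the API returned null for them.
    return {
        "title": data.get("title"),
        "author": data.get("author"),
        "published": data.get("published"),
        "ttr": data.get("ttr"),
        "description": data.get("description"),
        "content": data.get("content"),
    }
```

Note that "content" in the example is an HTML fragment, so callers that need plain text still have to strip the markup themselves.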
                                                                                                                                                                                                                    
                                                                                                    

Article Extract - CODE SNIPPETS


curl --location --request GET 'https://zylalabs.com/api/4949/text+purify+api/6229/article+extract?url=https://css-tricks.com/empathetic-animation/&word_per_minute=300&desc_truncate_len=210&desc_len_min=180&content_len_min=200' --header 'Authorization: Bearer YOUR_API_KEY' 
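
The curl command above could be written in Python using only the standard library, roughly as below. The function names make_request and fetch_article are hypothetical, the 30-second timeout is an arbitrary choice, and YOUR_API_KEY stands in for a real key.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Endpoint URL as listed on this page.
ENDPOINT = "https://zylalabs.com/api/4949/text+purify+api/6229/article+extract"


def make_request(article_url, api_key):
    """Build a GET request that carries the bearer token (no network access)."""
    query = urlencode({"url": article_url})
    return urllib.request.Request(
        f"{ENDPOINT}?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def fetch_article(article_url, api_key, timeout=30):
    """Send the request and decode the JSON body (performs network I/O)."""
    request = make_request(article_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_article("https://css-tricks.com/empathetic-animation/", "YOUR_API_KEY") would send the same request as the curl example, minus the optional parameters.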

    

To use this endpoint, provide the URL of the article to extract its main content through a proxy, which helps with sites that restrict access.

This additional endpoint can be helpful for extracting articles from websites that restrict access based on user geography or session.

When you call this endpoint, the extractor engine will randomly select a proxy agent from our pool, then attempt to load the target webpage through the chosen proxy.

Due to the nature of proxy servers, loading times may vary depending on the selected proxy's location and performance.
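
One way to combine the two endpoints is to try the direct one first and fall back to the proxy only when it fails. This is purely an illustrative pattern, not something this page prescribes: extract_with_fallback is a hypothetical name, and fetch is a caller-supplied function assumed to return the decoded JSON response or raise OSError on a network failure.

```python
# Endpoint URLs as listed on this page.
DIRECT = "https://zylalabs.com/api/4949/text+purify+api/6229/article+extract"
PROXY = "https://zylalabs.com/api/4949/text+purify+api/6230/article+proxy+extract"


def extract_with_fallback(article_url, fetch):
    """Try the direct endpoint, then the proxy endpoint (hypothetical helper).

    fetch(endpoint, article_url) must return the decoded JSON response
    or raise OSError when the request itself fails.
    """
    last_error = None
    for endpoint in (DIRECT, PROXY):
        try:
            result = fetch(endpoint, article_url)
        except OSError as exc:
            last_error = exc
            continue
        # A zero "error" field is assumed to mean the extraction succeeded.
        if result.get("error") == 0:
            return result
    raise RuntimeError("both endpoints failed") from last_error
```

Because proxy loading times vary, a caller may want to give the fetch function a longer timeout for the proxy endpoint than for the direct one.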

 



                                                                            
GET https://zylalabs.com/api/4949/text+purify+api/6230/article+proxy+extract
                                                                            
                                                                        

Article Proxy Extract - Endpoint Features

Object Description
url [Required]
word_per_minute [Optional]
desc_truncate_len [Optional]
desc_len_min [Optional]
content_len_min [Optional]

API EXAMPLE RESPONSE

       
                                                                                                        
                                                                                                                                                                                                                            {"error":0,"message":"Article extraction success","data":{"url":"https://cryptobriefing.com/fidelity-ethereum-etf-dtcc-listing/","title":"Fidelity's Ethereum spot ETF listed on DTCC under ticker $FETH","description":"Fidelity's spot Ethereum fund is now listed on DTCC under ticker $FETH following SEC's approval of multiple Ethereum ETFs.","links":["https://cryptobriefing.com/fidelity-ethereum-etf-dtcc-listing/"],"image":"https://static.cryptobriefing.com/wp-content/uploads/2024/05/29232455/img-HBnmOBf0yYWOnnbZiut1I8BO-800x457.jpg","content":"<div>\n            <section>\n            <h2>SEC's approval process for Ethereum ETFs underway, trading awaits S-1 filings.</h2>\n        </section>\n            <section>\n            <picture>\n                <source media=\"(min-width: 850px)\" srcset=\"https://static.cryptobriefing.com/wp-content/uploads/2024/05/29232455/img-HBnmOBf0yYWOnnbZiut1I8BO-800x457.jpg\"></source>\n                <img src=\"https://static.cryptobriefing.com/wp-content/uploads/2024/05/29232455/img-HBnmOBf0yYWOnnbZiut1I8BO-400x228.jpg\" alt=\"Fidelity's spot Ethereum ETF listed on DTCC under ticker $FETH\" title=\"Fidelity’s spot Ethereum ETF listed on DTCC under ticker $FETH\" />\n            </picture>\n        </section>\n    <section>\n        <p>Fidelity’s Ethereum spot ETF has been listed on the Depository Trust and Clearing Corporation (DTCC) under the ticker symbol $FETH. 
This development comes on the heels of the US Securities and Exchange Commission’s (SEC) <a href=\"https://cryptobriefing.com/sec-ethereum-etf-approval/\" target=\"_blank\">approval of spot Ethereum exchange-traded funds</a> (ETFs) on May 23.</p><figure><img src=\"https://static.cryptobriefing.com/wp-content/uploads/2024/05/29225708/Fidelity-Ethereum-ETF-on-DTCC.jpg\" /><figcaption>Fidelity’s Ethereum spot ETF is now listed on <a href=\"https://www.dtcc.com/products/cs/exchange_traded_funds_plain_new.php\" target=\"_blank\">DTCC</a></figcaption></figure><p>BlackRock’s Ethereum fund, iShares Ethereum Trust, is listed on the DTCC <a href=\"https://cryptobriefing.com/blackrock-ethereum-etf-dtcc/\" target=\"_blank\">under ticker $ETHA</a>. VanEck’s Ethereum ETF is listed <a href=\"https://cryptobriefing.com/vaneck-dtcc-ethereum-etf-listing/\" target=\"_blank\">under ticker $ETHV</a> and Franklin Templeton’s <a href=\"https://cryptobriefing.com/franklin-templeton-ethereum-etf-dtcc-listing/\" target=\"_blank\">under ticker $EZET</a>.</p><p>The SEC’s acceptance of the 19b-4 forms for the spot Ethereum ETFs marks a major step, although the commencement of trading awaits the approval of each ETF’s S-1 filing.</p><p>Discussions between the SEC and ETF issuers about the S-1 forms are reportedly <a href=\"https://cryptobriefing.com/sec-engages-ethereum-etf-issuers-s-1-forms/\" target=\"_blank\">underway</a>. However, the timeframe for the trading approval is uncertain, with projections ranging from weeks to months.</p><p>VanEck was among the first to submit an amended S-1 form on May 23, with BlackRock following suit with an <a href=\"https://cryptobriefing.com/blackrock-ethereum-etf-launch/\" target=\"_blank\">updated S-1 filing</a> today. 
The S-1 form serves as an initial registration document that must be filed with the SEC before a security can be offered to the public.</p>\n                                </section>\n    <section>\n                    <a href=\"https://cryptobriefing.com/disclaimer/\" target=\"_blank\">\n                Disclaimer            </a>\n    </section>\n</div>","author":"@crypto_briefing","favicon":"https://static.cryptobriefing.com/wp-content/uploads/2020/02/02093517/ios-144.png","source":"cryptobriefing.com","published":"2024-05-30T17:14:47+00:00","ttr":40,"type":"article"}}
                                                                                                                                                                                                                    
                                                                                                    

Article Proxy Extract - CODE SNIPPETS


curl --location --request GET 'https://zylalabs.com/api/4949/text+purify+api/6230/article+proxy+extract?url=https://cryptobriefing.com/fidelity-ethereum-etf-dtcc-listing/&word_per_minute=300&desc_truncate_len=210&desc_len_min=180&content_len_min=200' --header 'Authorization: Bearer YOUR_API_KEY' 

    

API Access Key & Authentication

After signing up, every developer is assigned a personal API access key, a unique combination of letters and digits used to access our API endpoints. To authenticate with the Text Purify REST API, include your bearer token in the Authorization header.
Headers
Header Description
Authorization [Required] Must be "Bearer access_key", where access_key is the API access key you receive once you are subscribed.

Simple Transparent Pricing

No long term commitments. One click upgrade/downgrade or cancellation. No questions asked.

🚀 Enterprise

Starts at
$ 10,000/Year


  • Custom Volume
  • Dedicated account manager
  • Service-level agreement (SLA)

Customer favorite features

  • ✔︎ Only Pay for Successful Requests
  • ✔︎ Free 7-Day Trial
  • ✔︎ Multi-Language Support
  • ✔︎ One API Key, All APIs.
  • ✔︎ Intuitive Dashboard
  • ✔︎ Comprehensive Error Handling
  • ✔︎ Developer-Friendly Docs
  • ✔︎ Postman Integration
  • ✔︎ Secure HTTPS Connections
  • ✔︎ Reliable Uptime

Use the API by providing a URL to extract the main content of the article. Set optional parameters to customise the extraction and formatting.

The Text Purify API cleans and extracts relevant text from web pages, removing ads and unwanted content, providing only the main text of the article.

Several plans are available, including a free trial with a small number of requests. Request rates are limited to prevent abuse of the service.

Zyla provides a wide range of integration methods for almost all programming languages. You can use these code snippets to integrate the API into your project as needed.

The API returns the extracted article content together with metadata: the page URL, title, description, links, image, author, favicon, source, publication date, estimated reading time (ttr) and type.

Zyla API Hub is like a big store for APIs, where you can find thousands of them all in one place. We also offer dedicated support and real-time monitoring of all APIs. Once you sign up, you can pick and choose which APIs you want to use. Just remember, each API needs its own subscription. But if you subscribe to multiple ones, you'll use the same key for all of them, making things easier for you.

Prices are listed in USD (United States Dollar), EUR (Euro), CAD (Canadian Dollar), AUD (Australian Dollar), and GBP (British Pound). We accept all major debit and credit cards. Our payment system uses the latest security technology and is powered by Stripe, one of the world’s most reliable payment companies. If you have any trouble paying by card, just contact us at [email protected]

Additionally, if you already have an active subscription in any of these currencies (USD, EUR, CAD, AUD, GBP), that currency will remain for subsequent subscriptions. You can change the currency at any time as long as you don't have any active subscriptions.

The local currency shown on the pricing page is based on the country of your IP address and is provided for reference only. The actual prices are in USD (United States Dollar). When you make a payment, the charge will appear on your card statement in USD, even if you see the equivalent amount in your local currency on our website. This means you cannot pay directly with your local currency.

Occasionally, a bank may decline the charge due to its fraud protection settings. We suggest contacting your bank first to check whether it is blocking our charges. You can also open the Billing Portal and change the card used for payment. If this does not work and you need further assistance, please contact our team at [email protected]

Prices are determined by a recurring monthly or yearly subscription, depending on the chosen plan.

API calls are deducted from your plan based on successful requests. Each plan comes with a specific number of calls that you can make per month. Only successful calls, indicated by a Status 200 response, will be counted against your total. This ensures that failed or incomplete requests do not impact your monthly quota.

Zyla API Hub works on a recurring monthly subscription system. Your billing cycle starts on the day you purchase one of the paid plans and renews on the same day of the next month. Be sure to cancel your subscription beforehand if you want to avoid future charges.

To upgrade your current subscription plan, simply go to the pricing page of the API and select the plan you want to upgrade to. The upgrade will be instant, allowing you to immediately enjoy the features of the new plan. Please note that any remaining calls from your previous plan will not be carried over to the new plan, so be aware of this when upgrading. You will be charged the full amount of the new plan.

To check how many API calls you have left for the current month, look at the ‘X-Zyla-API-Calls-Monthly-Remaining’ header. For example, if your plan allows 1000 requests per month and you've used 100, this header will show 900.

To see the maximum number of API requests your plan allows, check the ‘X-Zyla-RateLimit-Limit’ header. For instance, if your plan includes 1000 requests per month, this header will display 1000.

The ‘X-Zyla-RateLimit-Reset’ header shows the number of seconds until your rate limit resets. This tells you when your request count will start fresh. For example, if it displays 3600, it means 3600 seconds are left until the limit resets.
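
The three headers described above could be read from a response as in this sketch. The function name read_quota is hypothetical, and the code assumes each header value is a plain integer string, as in the examples given.

```python
def read_quota(headers):
    """Return quota figures from response headers as integers (None if absent).

    headers is any mapping with a .get method, for example a dict or the
    headers object of an HTTP response.
    """
    def as_int(name):
        value = headers.get(name)
        return int(value) if value is not None else None

    return {
        "remaining": as_int("X-Zyla-API-Calls-Monthly-Remaining"),
        "limit": as_int("X-Zyla-RateLimit-Limit"),
        "reset_seconds": as_int("X-Zyla-RateLimit-Reset"),
    }
```

A plain dict matches header names case-sensitively, whereas the headers object returned by urllib.request.urlopen matches them case-insensitively, so the latter is the safer input when working with live responses.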

Yes, you can cancel your plan anytime by going to your account and selecting the cancellation option on the Billing page. Please note that upgrades, downgrades, and cancellations take effect immediately. Additionally, upon cancellation, you will no longer have access to the service, even if you have remaining calls left in your quota.

You can contact us through our chat channel to receive immediate assistance. We are always online from 8 am to 5 pm (EST). If you reach us after that time, we will get back to you as soon as possible. Additionally, you can contact us via email at [email protected]

To let you experience our APIs without any commitment, we offer a 7-day free trial that allows you to make API calls at no cost during this period. Please note that you can only use this trial once, so make sure to use it with the API that interests you the most. Most of our APIs provide a free trial, but some may not support it.

After 7 days, you will be charged the full amount for the plan you were subscribed to during the trial. Therefore, it’s important to cancel before the trial period ends. Refund requests for forgetting to cancel on time are not accepted.

When you subscribe to an API trial, you can make only 25% of the calls allowed by that plan. For example, if the API plan offers 1000 calls, you can make only 250 during the trial. To access the full number of calls offered by the plan, you will need to subscribe to the full plan.

Service Level: 99%
Response Time: 4,614ms
