Phrase Extraction & Named Entity Recognition

To extract phrases & named entities from text, do an HTTP POST to https://text-processing.com/api/phrases/ with form-encoded data containing the text you want to analyze. You’ll get back a JSON object containing lists of phrases and/or named entities.

Here’s an example of how to do it using curl:

$ curl -d "text=California is nice" https://text-processing.com/api/phrases/
{
        "GPE": ["California"],
        "LOCATION": ["California"]
}

You can try out the tagging and chunking demo to get a feel for the results and the kinds of phrases that can be extracted. English phrase extraction combines the results of four phrase & named entity chunkers: the default named entity chunker, a treebank-trained noun phrase chunker, a conll2000-trained phrase chunker, and an ieer-trained named entity chunker.

An example for the Spanish language is:

$ curl -d "language=spanish&text=San Francisco, California" https://text-processing.com/api/phrases/
{
        "LOC": ["San Francisco"]
}
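
Here’s a minimal sketch of the same request from Python, using only the standard library. The helper names build_phrases_request and extract_phrases are illustrative assumptions and are not part of the API; only the URL and the text and language fields come from this page.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://text-processing.com/api/phrases/"


def build_phrases_request(text, language=None):
    """Build a form-encoded POST request (hypothetical helper)."""
    fields = {"text": text}
    if language is not None:
        fields["language"] = language
    body = urllib.parse.urlencode(fields).encode("utf-8")
    # Supplying data makes urllib send a POST; the form-encoded
    # Content-Type header is added automatically when the request is opened.
    return urllib.request.Request(API_URL, data=body)


def extract_phrases(text, language=None):
    """Send the request and decode the JSON object in the response."""
    request = build_phrases_request(text, language)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling extract_phrases("California is nice") should return an object shaped like the one in the first curl example, and passing language="spanish" corresponds to the second.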

Parameters

text:

Required - the text you want to process. It must not exceed 1,000 characters.

language:

Optional, defaults to english. There are 3 other language choices for phrase extraction:

dutch uses a tagger & named entity chunker trained on the conll2002 corpus
portuguese uses a tagger & phrase chunker trained on the floresta corpus
spanish uses a tagger & named entity chunker trained on the conll2002 corpus
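
Checking the parameters before sending a request avoids a round trip that would end in an error. The sketch below mirrors the limits listed above; validate_params is a hypothetical helper, and it assumes the 1,000-character limit can be approximated with Python’s len(), since this page does not say how the server counts characters.

```python
MAX_TEXT_LENGTH = 1000
SUPPORTED_LANGUAGES = {"english", "dutch", "portuguese", "spanish"}


def validate_params(text, language="english"):
    """Return a list of problems; an empty list means the parameters look valid."""
    problems = []
    if not text:
        problems.append("text is required")
    elif len(text) > MAX_TEXT_LENGTH:
        # Assumption: the server-side count matches len() closely enough.
        problems.append("text exceeds 1,000 characters")
    if language not in SUPPORTED_LANGUAGES:
        problems.append("unsupported language: " + language)
    return problems
```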

Return Value

On success, a 200 OK response will be returned containing a JSON object that looks like this:

{
        "NP": ["noun phrase"]
}

The object will have a key for each type of phrase or named entity, and the value for that key will be a list of strings, one for each phrase of that type. If no phrases or named entities could be extracted, you’ll get an empty object.
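
Because every value is a list of strings, the response is easy to flatten into (type, phrase) pairs. Here’s a small sketch that does so; phrases_by_type is a hypothetical helper, and the sample body is copied from the first curl example rather than fetched from the API.

```python
import json

# Response body taken from the curl example for "California is nice".
sample_body = '{"GPE": ["California"], "LOCATION": ["California"]}'


def phrases_by_type(body):
    """Flatten the JSON object into a list of (type, phrase) pairs."""
    result = json.loads(body)
    return [
        (label, phrase)
        for label, phrases in result.items()
        for phrase in phrases
    ]


for label, phrase in phrases_by_type(sample_body):
    print(label, phrase)
```

An empty object, which is what you get when nothing could be extracted, produces an empty list.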

Errors

A 400 Bad Request response will be returned under the following conditions:

no value for text is provided
text exceeds 1,000 characters
an incorrect language is specified

A 503 Throttled response will be returned if you exceed the daily request limit. Sign up for the Text-Processing RapidAPI to get a plan with a higher limit.
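
A client can turn these status codes into readable messages. The sketch below assumes the request is sent with Python’s urllib, which raises urllib.error.HTTPError for 4xx and 5xx responses; describe_error and call_with_error_handling are hypothetical helpers, and the message wording is only a paraphrase of the conditions listed above.

```python
import urllib.error


def describe_error(status):
    """Map an HTTP status code to the condition documented for it."""
    if status == 400:
        return "bad request: text missing or too long, or incorrect language"
    if status == 503:
        return "throttled: daily request limit exceeded"
    return "unexpected HTTP status %d" % status


def call_with_error_handling(send):
    """Run send() and re-raise HTTP errors with a readable message.

    send is any zero-argument callable that performs the request.
    """
    try:
        return send()
    except urllib.error.HTTPError as err:
        raise RuntimeError(describe_error(err.code)) from err
```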