Part-of-Speech Tagging and Chunking
To tag and chunk text, send an HTTP POST to http://text-processing.com/api/tag/ with form-encoded data containing the text you want to tag. The response is a JSON object whose text attribute contains the tagged text. Here are some examples of how to do it using curl:
$ curl -d "text=hello world" http://text-processing.com/api/tag/
{
"text": "(S hello/NN world/NN)"
}
IOB tags output:
$ curl -d "text=hello world&output=iob" http://text-processing.com/api/tag/
{
"text": "hello NN O\nworld NN O"
}
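The IOB output above puts one token per line, with the word, its part-of-speech tag, and its IOB tag separated by spaces. A minimal sketch of splitting that string into tuples follows; the helper name `parse_iob` is hypothetical, and the three-column layout is inferred only from the example response shown here.

```python
def parse_iob(text):
    """Split IOB output into (word, pos_tag, iob_tag) tuples, one per line."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split from the right so the last two fields are always the tags.
        word, pos, iob = line.rsplit(" ", 2)
        rows.append((word, pos, iob))
    return rows


print(parse_iob("hello NN O\nworld NN O"))
```

Splitting from the right keeps the two tag columns intact even if a token were to contain a space, though that case is an assumption rather than documented behavior.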
A named entity recognition example:
$ curl -d "text=California is nice" http://text-processing.com/api/tag/
{
"text": "(S (GPE California/NNP) is/VBZ nice/JJ)"
}
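The s-expression in the response above can be walked to pull out the labelled sub-trees, such as the GPE chunk. The following is a rough sketch using only the standard library; `parse_sexpr` and `named_chunks` are hypothetical helper names, and the parser assumes balanced parentheses and tokens of the form word/TAG that contain no parentheses themselves.

```python
import re


def parse_sexpr(text):
    """Parse an s-expression into nested lists: [label, child, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    stack = [[]]
    for tok in tokens:
        if tok == "(":
            stack.append([])          # start a new sub-tree
        elif tok == ")":
            node = stack.pop()        # close the current sub-tree
            stack[-1].append(node)
        else:
            stack[-1].append(tok)     # label or word/TAG leaf
    return stack[0][0]


def named_chunks(tree):
    """Return (label, words) for each sub-tree directly under the root."""
    chunks = []
    for child in tree[1:]:
        if isinstance(child, list):
            words = [leaf.rsplit("/", 1)[0]
                     for leaf in child[1:] if isinstance(leaf, str)]
            chunks.append((child[0], " ".join(words)))
    return chunks


tree = parse_sexpr("(S (GPE California/NNP) is/VBZ nice/JJ)")
print(named_chunks(tree))
```

If NLTK is already installed, its tree class can read the same bracketed notation; the hand-rolled parser here only avoids the extra dependency.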
You can try out the tagging and chunking demo to get a feel for the results, but it does not show all the output formats available in the API.
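The same request can be made from Python. Below is a minimal sketch using only the standard library; the function names `build_request` and `tag_text` and the 10-second timeout are illustrative assumptions, while the URL, the form fields, and the text attribute of the response come from the description above.

```python
import json
import urllib.parse
import urllib.request

API_URL = "http://text-processing.com/api/tag/"


def build_request(text, **params):
    """Build a form-encoded POST request for the tagging endpoint."""
    fields = {"text": text}
    # Optional parameters such as language, tagger, or output.
    fields.update({k: v for k, v in params.items() if v is not None})
    data = urllib.parse.urlencode(fields).encode("utf-8")
    # Supplying a request body makes urllib issue a POST.
    return urllib.request.Request(API_URL, data=data)


def tag_text(text, **params):
    """Send the request and return the "text" attribute of the JSON reply."""
    request = build_request(text, **params)
    with urllib.request.urlopen(request, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))["text"]
```

A call such as `tag_text("hello world", output="iob")` corresponds to the second curl example; it needs network access and counts against the daily request limit.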
Parameters
- text:
Required - the text you want to tag. It must not exceed 2,000 characters.
- language:
Optional, defaults to english, which also uses a phrase chunker. Three languages other than english support phrase chunking and/or named entity recognition:
  - dutch
  - portuguese
  - spanish
For these 4 languages, the default output is sexpr. There are other language options that do only part-of-speech tagging, and their default output is tagged:
  - french
  - german
  - greek
  - italian
- tagger:
Optional. If you also give a language value, the tagger must be compatible with it, as outlined below:
  - default: english tagger/chunker
  - binary: english tagger/chunker
  - ieer: english tagger/chunker
  - timit: english tagger/chunker
  - conll2002_ned: dutch tagger/chunker
  - conll2002_esp: spanish tagger/chunker
  - mac_morpho: portuguese tagger/chunker
  - spacy/en_core_web_sm: spacy english tagger
  - spacy/de_core_news_sm: spacy german tagger
  - spacy/fr_core_news_sm: spacy french tagger
  - spacy/es_core_news_sm: spacy spanish tagger
  - spacy/pt_core_news_sm: spacy portuguese tagger
  - spacy/it_core_news_sm: spacy italian tagger
  - spacy/nl_core_news_sm: spacy dutch tagger
  - spacy/el_core_news_sm: spacy greek tagger
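The tagger and language pairing above can be checked before sending a request. This is a small sketch that copies the list into a lookup table; the names `TAGGER_LANGUAGE` and `is_compatible` are hypothetical, and how the server itself enforces compatibility is not documented here.

```python
# Tagger name -> language it serves, copied from the list above.
TAGGER_LANGUAGE = {
    "default": "english",
    "binary": "english",
    "ieer": "english",
    "timit": "english",
    "conll2002_ned": "dutch",
    "conll2002_esp": "spanish",
    "mac_morpho": "portuguese",
    "spacy/en_core_web_sm": "english",
    "spacy/de_core_news_sm": "german",
    "spacy/fr_core_news_sm": "french",
    "spacy/es_core_news_sm": "spanish",
    "spacy/pt_core_news_sm": "portuguese",
    "spacy/it_core_news_sm": "italian",
    "spacy/nl_core_news_sm": "dutch",
    "spacy/el_core_news_sm": "greek",
}


def is_compatible(tagger, language=None):
    """True if the tagger is listed and matches the language, when one is given."""
    if tagger not in TAGGER_LANGUAGE:
        return False
    return language is None or TAGGER_LANGUAGE[tagger] == language


print(is_compatible("conll2002_esp", "spanish"))
```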
- output:
Optional, the default depends on your language choice. The tagged (and chunked) text can be returned in one of the following output formats:
  - tagged: produces part-of-speech tagged text, ignoring any phrase chunks or named entities
  - sexpr: produces s-expressions to represent parse trees that may include sub-trees for phrases and/or named entities
  - iob: produces IOB tags for each word, but only works if the language supports phrase chunking
Return Value
On success, a 200 OK response will be returned containing a JSON object that looks like this:
{
"text": "tagged text"
}
Errors
A 400 Bad Request response will be returned under the following conditions:
- output=iob but language does not support phrase chunking or NER
- no value for text is provided
- text exceeds 2,000 characters
- an incorrect language is specified
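These conditions can be checked on the client before a request is spent. The sketch below mirrors the four cases; the function name `validate` and the message strings are hypothetical, and it assumes the 2,000 limit counts Python string characters, which the documentation does not specify.

```python
# Languages with phrase chunking and/or NER, and languages with tagging only.
CHUNKING_LANGUAGES = {"english", "dutch", "portuguese", "spanish"}
TAGGING_ONLY_LANGUAGES = {"french", "german", "greek", "italian"}
MAX_TEXT_LENGTH = 2000  # assumed to be counted in characters


def validate(text, language="english", output=None):
    """Return a list of problems that would be expected to cause a 400."""
    problems = []
    if not text:
        problems.append("no value for text")
    elif len(text) > MAX_TEXT_LENGTH:
        problems.append("text exceeds 2,000 characters")
    if language not in CHUNKING_LANGUAGES | TAGGING_ONLY_LANGUAGES:
        problems.append("unknown language")
    elif output == "iob" and language not in CHUNKING_LANGUAGES:
        problems.append("output=iob requires phrase chunking or NER")
    return problems


print(validate("bonjour le monde", language="french", output="iob"))
```

An empty list means none of the documented conditions apply; the server may still reject a request for reasons not listed here.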
A 503 Throttled response will be returned if you exceed the daily request limit. Sign up for the Text-Processing RapidAPI to get a plan with a higher limit.
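A client can tell the 400 and 503 responses apart by status code. The following is a hedged sketch using the standard library; `describe_status` and `post_with_diagnostics` are hypothetical names, the wording of the messages is illustrative, and the choice to raise RuntimeError is an assumption rather than anything the API prescribes.

```python
import urllib.error
import urllib.request


def describe_status(code):
    """Map a documented HTTP status code to a short explanation."""
    if code == 400:
        return "bad request: check text, language, and output"
    if code == 503:
        return "throttled: daily request limit exceeded"
    return "unexpected HTTP status %d" % code


def post_with_diagnostics(request):
    """Send a prepared urllib Request and re-raise HTTP errors with context."""
    try:
        with urllib.request.urlopen(request, timeout=10) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        raise RuntimeError(describe_status(err.code)) from err
```

On a 503 it is probably better to stop sending requests until the daily limit resets than to retry immediately.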