To stem text, send an HTTP POST to http://text-processing.com/api/stem/ with form-encoded data containing the text you want to stem. The response is a JSON object whose text attribute contains the stemmed text. Here are some examples using curl:
$ curl -d "text=processing" http://text-processing.com/api/stem/
{
"text": "process"
}
To specify a stemmer other than the default porter, in this case wordnet:
$ curl -d "text=processing&stemmer=wordnet" http://text-processing.com/api/stem/
{
"text": "processing"
}
Using the snowball stemmer with spanish:
$ curl -d "text=correr&stemmer=snowball&language=spanish" http://text-processing.com/api/stem/
{
"text": "corr"
}
Specifying only the language, which for portuguese defaults to the snowball stemmer:
$ curl -d "text=correr&language=portuguese" http://text-processing.com/api/stem/
{
"text": "corr"
}
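The same requests can be made from a script. The following is a minimal sketch using only the Python standard library; the function names `build_stem_request` and `stem` and the `timeout` default are illustrative assumptions, not part of the API. It form-encodes the same fields the curl examples pass with `-d` and reads the text attribute from the JSON response.

```python
import json
import urllib.parse
import urllib.request

STEM_URL = "http://text-processing.com/api/stem/"


def build_stem_request(text, stemmer=None, language=None):
    """Build (but do not send) a form-encoded POST request for the stem endpoint."""
    fields = {"text": text}
    # Optional parameters are only sent when given, mirroring the curl examples.
    if stemmer is not None:
        fields["stemmer"] = stemmer
    if language is not None:
        fields["language"] = language
    data = urllib.parse.urlencode(fields).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return urllib.request.Request(STEM_URL, data=data, headers=headers, method="POST")


def stem(text, stemmer=None, language=None, timeout=10):
    """Send the request and return the "text" attribute of the JSON response."""
    request = build_stem_request(text, stemmer, language)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["text"]
```

Calling `stem("correr", stemmer="snowball", language="spanish")` sends the same form data as the third curl example above.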
Try out the stemming demo to get a feel for the results.
Parameter | Description |
---|---|
text | Required. The text you want to stem. It must not exceed 60,000 characters. |
language | The default language is english, unless a non-english stemmer is given, in which case the value of language must be compatible with the chosen stemmer. Supported languages include english, arabic, spanish, and portuguese. The snowball stemmer is the default stemmer for all languages except english and arabic, which default to porter and isri respectively. |
stemmer | Supported values include porter, lancaster, wordnet, rslp, isri, and snowball. If you give both a stemmer and a language, the stemmer must support that language. Both porter and lancaster can be used with any language, while wordnet, rslp, and isri are limited to their respective languages. The snowball stemmer currently supports 14 languages, and is the default stemmer for those languages. |
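
The parameter rules above can be checked on the client before a request is sent. This is a rough sketch, not the server's actual validation: the names `default_stemmer` and `check_params` are hypothetical, and the mapping of wordnet, rslp, and isri to english, portuguese, and arabic is an assumption about what "their respective languages" means. The snowball language list is not reproduced here, so snowball combinations are left unchecked.

```python
MAX_TEXT_LENGTH = 60_000  # character limit stated for the text parameter

# Assumption: the single language each restricted stemmer is tied to.
SINGLE_LANGUAGE_STEMMERS = {
    "wordnet": "english",
    "rslp": "portuguese",
    "isri": "arabic",
}


def default_stemmer(language):
    """Return the documented default stemmer for a language."""
    if language == "english":
        return "porter"
    if language == "arabic":
        return "isri"
    return "snowball"


def check_params(text, stemmer=None, language=None):
    """Return a list of problems; an empty list means no problem was detected."""
    problems = []
    if not text:
        problems.append("text is required")
    elif len(text) > MAX_TEXT_LENGTH:
        problems.append(f"text exceeds {MAX_TEXT_LENGTH} characters")
    # porter and lancaster accept any language, so only restricted stemmers are checked.
    if stemmer in SINGLE_LANGUAGE_STEMMERS and language is not None:
        expected = SINGLE_LANGUAGE_STEMMERS[stemmer]
        if language != expected:
            problems.append(f"{stemmer} only supports {expected}")
    return problems
```

For example, `check_params("correr", stemmer="isri", language="spanish")` reports a mismatch, while the same text with `stemmer="porter"` passes.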
On success, a 200 OK response will be returned containing a JSON object that looks like this:
{
"text": "stemmed text"
}
A 400 Bad Request response will be returned under the following conditions:
A 503 Throttled response will be returned if you exceed the daily request limit. Sign up for the Mashape Text-Processing API to get a higher limit plan.
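A client can branch on the three documented status codes. The sketch below assumes the caller already has the numeric status and the raw response body; `StemApiError` and `interpret_response` are hypothetical names introduced for illustration, and the error messages are not the API's own wording.

```python
import json


class StemApiError(Exception):
    """Raised for any response other than 200 OK (hypothetical helper type)."""


def interpret_response(status, body):
    """Return the stemmed text for a 200 response, otherwise raise StemApiError."""
    if status == 200:
        # Success: the JSON object carries the result in its "text" attribute.
        return json.loads(body)["text"]
    if status == 400:
        raise StemApiError("400 Bad Request: check the text, language, and stemmer parameters")
    if status == 503:
        raise StemApiError("503 Throttled: daily request limit exceeded")
    # Any other status is outside what the documentation describes.
    raise StemApiError(f"unexpected status {status}")
```

With `urllib.request`, 4xx and 5xx responses surface as `urllib.error.HTTPError`, whose `code` attribute can be passed as `status`.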