Stemming
To stem text, do an HTTP POST to http://text-processing.com/api/stem/
with form-encoded data containing the text
you want to stem. You’ll get back a JSON object response whose text
attribute contains the stemmed text. Here are some examples of how to do it using curl:
$ curl -d "text=processing" http://text-processing.com/api/stem/
{
"text": "process"
}
How to specify a stemmer other than porter, in this case wordnet:
$ curl -d "text=processing&stemmer=wordnet" http://text-processing.com/api/stem/
{
"text": "processing"
}
Using the snowball stemmer with spanish:
$ curl -d "text=correr&stemmer=snowball&language=spanish" http://text-processing.com/api/stem/
{
"text": "corr"
}
Specifying just the language, which in the case of portuguese defaults to using the snowball stemmer:
$ curl -d "text=correr&language=portuguese" http://text-processing.com/api/stem/
{
"text": "corr"
}
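The curl calls above can also be reproduced from Python. The following is a minimal sketch using only the standard library; the function names `build_stem_request` and `stem` are hypothetical helpers, not part of the API, and only the endpoint URL and the `text`, `stemmer`, and `language` form fields come from this page.

```python
import json
from urllib import parse, request

# Endpoint taken from the curl examples on this page.
STEM_URL = "http://text-processing.com/api/stem/"


def build_stem_request(text, stemmer=None, language=None):
    """Build (but do not send) a form-encoded POST request.

    Hypothetical helper: optional fields are only included when given,
    mirroring the curl examples.
    """
    fields = {"text": text}
    if stemmer is not None:
        fields["stemmer"] = stemmer
    if language is not None:
        fields["language"] = language
    body = parse.urlencode(fields).encode("utf-8")
    return request.Request(STEM_URL, data=body, method="POST")


def stem(text, stemmer=None, language=None, timeout=10):
    """Send the request and return the "text" attribute of the JSON response."""
    req = build_stem_request(text, stemmer, language)
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["text"]
```

Calling `stem("correr", stemmer="snowball", language="spanish")` would send the same form data as the third curl example. Building the request separately from sending it keeps the encoding step easy to inspect without touching the network.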
Try out the stemming demo to get a feel for the results.
Parameters
- text:
Required - the text you want to stem. It must not exceed 60,000 characters.
- language:
The default language is english, unless a non-english stemmer is given. In that case, the value of language must be compatible with the chosen stemmer. Currently, the following languages are supported:
  - arabic
  - english
  - danish
  - dutch
  - finnish
  - french
  - german
  - hungarian
  - italian
  - norwegian
  - portuguese
  - romanian
  - russian
  - spanish
  - swedish
The snowball stemmer is the default stemmer for all languages except english and arabic, which default to porter and isri respectively.
- stemmer:
The stemmer parameter supports the following values:
  - porter: The default porter stemmer supports any language but defaults to english.
  - lancaster: A lancaster stemmer that supports any language but defaults to english.
  - wordnet: Lemmatization using WordNet, only supports english.
  - rslp: A portuguese stemmer.
  - isri: An arabic stemmer.
  - snowball: A stemmer that supports the following languages:
    - danish
    - dutch
    - english
    - finnish
    - french
    - german
    - hungarian
    - italian
    - norwegian
    - porter
    - portuguese
    - romanian
    - russian
    - spanish
    - swedish
If you give both a stemmer and a language, the stemmer must support that language. Both porter and lancaster can be used with any language, while wordnet, rslp, and isri are limited to their respective languages. The snowball stemmer currently supports 14 languages, and is the default stemmer for those languages (except english, which defaults to porter).
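The default and compatibility rules above can be checked on the client before sending a request. The sketch below is one possible reading of those rules, not the service's actual implementation: `resolve_stemmer` is a hypothetical helper, the snowball language set leaves out the `porter` entry from the list above (an assumption, since it is not among the supported language values), and porter and lancaster are assumed to accept any language value unchecked.

```python
# Languages listed for the snowball stemmer on this page.
# Assumption: the "porter" entry in that list is omitted here.
SNOWBALL_LANGUAGES = {
    "danish", "dutch", "english", "finnish", "french", "german",
    "hungarian", "italian", "norwegian", "portuguese", "romanian",
    "russian", "spanish", "swedish",
}
# Stemmers tied to exactly one language.
SINGLE_LANGUAGE_STEMMERS = {
    "wordnet": "english",
    "rslp": "portuguese",
    "isri": "arabic",
}
# Stemmers documented as usable with any language.
ANY_LANGUAGE_STEMMERS = {"porter", "lancaster"}


def resolve_stemmer(stemmer=None, language=None):
    """Return the (stemmer, language) pair implied by the documented defaults.

    Raises ValueError for combinations the page describes as incompatible.
    """
    if stemmer is None:
        # Default language is english; default stemmer depends on language.
        language = language or "english"
        if language == "english":
            stemmer = "porter"
        elif language == "arabic":
            stemmer = "isri"
        else:
            stemmer = "snowball"
    elif language is None:
        # A non-english stemmer implies its own language; otherwise english.
        language = SINGLE_LANGUAGE_STEMMERS.get(stemmer, "english")

    if stemmer in ANY_LANGUAGE_STEMMERS:
        return stemmer, language
    if stemmer in SINGLE_LANGUAGE_STEMMERS:
        if language != SINGLE_LANGUAGE_STEMMERS[stemmer]:
            raise ValueError(f"{stemmer} does not support {language}")
        return stemmer, language
    if stemmer == "snowball":
        if language not in SNOWBALL_LANGUAGES:
            raise ValueError(f"snowball does not support {language}")
        return stemmer, language
    raise ValueError(f"unknown stemmer: {stemmer}")
```

For example, `resolve_stemmer(language="portuguese")` yields `("snowball", "portuguese")`, matching the curl example that specifies only the language, while `resolve_stemmer(stemmer="wordnet", language="spanish")` raises `ValueError`.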
Return Value
On success, a 200 OK response will be returned containing a JSON object that looks like this:
{
"text": "stemmed text"
}
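Reading the result out of that object is a single JSON lookup. A minimal sketch, assuming the response body has already been read as bytes or a string; `extract_stemmed_text` is a hypothetical helper name.

```python
import json


def extract_stemmed_text(body):
    """Parse a response body and return its "text" attribute."""
    payload = json.loads(body)
    if not isinstance(payload, dict) or "text" not in payload:
        # Not the shape documented for a 200 OK response.
        raise ValueError("response is not a JSON object with a 'text' attribute")
    return payload["text"]
```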
Errors
A 400 Bad Request response will be returned under the following conditions:
- the language is not compatible with the stemmer
- no value for text is provided
- text exceeds 60,000 characters
A 503 Throttled response will be returned if you exceed the daily request limit. Sign up for the Text-Processing RapidAPI to get a higher limit plan.
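A client can catch the text-related 400 conditions before sending anything and turn status codes into readable messages. The following is a sketch under assumptions: `check_text` and `describe_status` are hypothetical helpers, and the 60,000 limit is measured here with Python's `len`, which may not match how the service counts characters.

```python
# Limit stated in the parameter description on this page.
MAX_TEXT_LENGTH = 60_000


def check_text(text):
    """Raise ValueError for text the service would reject with a 400."""
    if not text:
        raise ValueError("no value for text is provided")
    if len(text) > MAX_TEXT_LENGTH:
        raise ValueError("text exceeds 60,000 characters")


def describe_status(status):
    """Map an HTTP status code to the meaning documented on this page."""
    if status == 200:
        return "ok"
    if status == 400:
        return ("bad request: missing text, text too long, "
                "or language not compatible with stemmer")
    if status == 503:
        return "throttled: daily request limit exceeded"
    return f"unexpected status {status}"
```

When sending requests with `urllib.request.urlopen`, a 400 or 503 response surfaces as `urllib.error.HTTPError`, and its `code` attribute can be passed to `describe_status`.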