AnalyzerProperties Properties
The AnalyzerProperties type exposes the following members.
Name | Description
---|---
Accent | Optional. When true, accented characters are preserved. When false, accented characters are converted to their base characters.
Case | The case to use when normalizing the text. Possible values: "lower" converts to all lower-case characters; "upper" converts to all upper-case characters; "none" leaves the character case unchanged (default).
Locale | A locale in the format language[_COUNTRY][.encoding][@variant] (square brackets denote optional parts), e.g. "de.utf-8" or "en_US.utf-8". Only UTF-8 encoding is meaningful in ArangoDB. The locale is forwarded to ICU without checks, so an invalid locale does not prevent the creation of the Analyzer.
Stemming | Turns stemming on or off. When true, the Analyzer stems the text, treated as a single token, for supported languages. Stemming support is provided by Snowball, which supports the languages listed at: https://www.arangodb.com/docs/stable/analyzers.html#stemming
StopWords | The tokens that the Analyzer removes from the input. The Analyzer uses binary comparison to decide whether an input token should be discarded, so only exact matches are removed. An input that merely matches a substring of a stopword, such as a prefix of it, is not discarded. An input that is longer than a stopword, such as one that begins with it, is not discarded either.
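
The normalization and stopword rules described in the table can be illustrated with a short Python sketch. This is not ArangoDB's or ICU's implementation and does not use the ArangoDB client API; the function names `strip_accents`, `normalize`, and `remove_stopwords` are hypothetical, and stripping accents through Unicode NFD decomposition is only an approximation of the conversion to base characters.

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Approximate 'accent = false': drop combining marks after NFD decomposition."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize(text: str, *, case: str = "none", accent: bool) -> str:
    """Apply the Case and Accent settings to a single piece of text (illustrative only)."""
    if case == "lower":
        text = text.lower()
    elif case == "upper":
        text = text.upper()
    elif case != "none":
        raise ValueError(f"unsupported case value: {case!r}")
    # accent=True preserves accented characters; accent=False maps them to base characters.
    return text if accent else strip_accents(text)


def remove_stopwords(tokens: list[str], stopwords: list[str]) -> list[str]:
    """Discard only tokens that exactly equal a stopword (no prefix or substring matching)."""
    blocked = set(stopwords)
    return [token for token in tokens if token not in blocked]


if __name__ == "__main__":
    print(normalize("Crème Brûlée", case="lower", accent=False))
    print(normalize("Crème", case="upper", accent=True))
    # "android" begins with the stopword "and" and "an" is a prefix of it; both are kept.
    print(remove_stopwords(["and", "android", "an", "the"], ["and", "the"]))
```

Because the comparison is exact, a stopword list usually needs to be written in the same case and accent form that the tokens have when they reach the stopword check.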