+1 vote
in ElasticSearch by
What is a tokenizer in Elasticsearch?

2 Answers

0 votes
by

Tokenizers are used to generate tokens from a text string. A tokenizer breaks the string into individual tokens wherever it finds whitespace or punctuation symbols. Elasticsearch ships with a number of built-in tokenizers, and the standard tokenizer is the most commonly used one: it splits text on word boundaries (following the Unicode Text Segmentation rules) and removes most punctuation.
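
For example, you can see what the standard tokenizer produces by calling the _analyze API. The snippet below is a minimal sketch using Python's requests library; it assumes an Elasticsearch node running at http://localhost:9200 with security disabled, so adjust the URL and authentication for your own cluster:

```python
import requests

# Ask the (assumed local, unauthenticated) Elasticsearch node to tokenize
# a sample string with the built-in standard tokenizer.
resp = requests.post(
    "http://localhost:9200/_analyze",
    json={"tokenizer": "standard", "text": "The QUICK brown-fox jumps!"},
)
for token in resp.json()["tokens"]:
    print(token["token"], token["start_offset"], token["end_offset"])

# Expected tokens: The, QUICK, brown, fox, jumps -- punctuation is dropped,
# but case is untouched because lowercasing is done by a token filter,
# not by the tokenizer itself.
```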

Apart from that, Elasticsearch offers several other tokenizers, such as the lowercase tokenizer, whitespace tokenizer, pattern tokenizer, keyword tokenizer, N-gram tokenizer, and more. A tokenizer is usually one step of an analyzer, which can also apply character filters before tokenization and token filters afterwards. A quick comparison of a few built-in tokenizers is sketched below.
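
The sketch below runs the same sample text through a few of those tokenizers via the _analyze API, again assuming a local node at http://localhost:9200 without authentication:

```python
import requests

SAMPLE = "user_42 logged-in at 10:30"

for tok in ("standard", "whitespace", "keyword", "lowercase"):
    resp = requests.post(
        "http://localhost:9200/_analyze",
        json={"tokenizer": tok, "text": SAMPLE},
    )
    tokens = [t["token"] for t in resp.json()["tokens"]]
    print(f"{tok:10s} -> {tokens}")

# whitespace splits only on spaces, keyword emits the whole string as a single
# token, lowercase splits on non-letters and lowercases each token, and
# standard splits on Unicode word boundaries.
```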

0 votes
by
A tokenizer breaks the field values of a document down into a token stream. These tokens are then used to create and update the inverted index, which maps each token back to the documents that contain it, so the documents can be found at search time.
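
To illustrate the idea (this is a toy sketch, not Elasticsearch internals), the snippet below tokenizes each document's field value with a simple whitespace split and records which documents contain each token, which is exactly the shape of an inverted index:

```python
from collections import defaultdict

# Two toy documents, keyed by document id.
docs = {
    1: "Elasticsearch is a search engine",
    2: "A tokenizer feeds the search index",
}

# token -> set of document ids that contain it
inverted_index = defaultdict(set)
for doc_id, text in docs.items():
    for token in text.lower().split():  # stand-in for a real tokenizer
        inverted_index[token].add(doc_id)

print(sorted(inverted_index["search"]))  # -> [1, 2]
```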
...