How are keywords indexed

shekharshan · June 23, 2019, 11:25pm

I am new to text search, so a few things confuse me. As I understand it, full-text data goes through an analyzer that does the following processing:

Tokenizes the full text into individual words
Throws away the stop words
Stems each token
Updates the lexicon and the inverted index with data relevant for frequency/ranking etc
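
The steps above can be sketched in a few lines of Python. This is only a toy illustration, not how Elasticsearch or Lucene implements analysis: the stop-word list, the suffix-stripping `naive_stem`, the lowercasing step, and all function names are assumptions made up for the example.

```python
import re
from collections import defaultdict

# Assumption: a tiny hand-picked stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "of", "is", "to", "in"}


def naive_stem(token):
    """Toy stemmer: strip one common suffix. Not a real stemming algorithm."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text):
    """Tokenize, drop stop words, then stem each remaining token."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    tokens = [t for t in tokens if t not in STOP_WORDS]
    return [naive_stem(t) for t in tokens]


def index_document(index, doc_id, text):
    """Record, per term, which documents contain it and at which positions."""
    for position, term in enumerate(analyze(text)):
        index[term][doc_id].append(position)


# term -> {doc_id -> [positions]}; the keys play the role of the lexicon.
index = defaultdict(lambda: defaultdict(list))
index_document(index, 1, "The art of computer programming")
index_document(index, 2, "Programming the analytical engine")

print(sorted(index))            # the terms in the toy lexicon
print(dict(index["programm"]))  # postings for one stemmed term
```

In this sketch the lexicon holds only single-word, stemmed terms, and each posting keeps the positions needed for frequency counts.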

My question is what happens if I pick the 'keyword' data type for a field whose values contain multiple tokens. For instance, say I want to index 'full name' using the keyword data type, with values like 'Donald Knuth' or 'Ada Lovelace'. What do the lexicon and the inverted index look like in this case? Do we store 'Donald Knuth' and 'Ada Lovelace' in the lexicon (instead of single-word tokens)?

Christian_Dahlqvist · June 24, 2019, 5:29am

Yes, the full string is stored as a single term and is not tokenized.
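
To make the difference concrete, here is a minimal sketch that builds two toy inverted indexes over the names from the question. It is not Elasticsearch code: `text_terms` stands in for an analyzed text field (assumed here to just lowercase and split on whitespace), `keyword_terms` stands in for a keyword field, and both names are hypothetical.

```python
from collections import defaultdict


def text_terms(value):
    # Assumption: a simplified analysis chain (lowercase + whitespace split).
    return value.lower().split()


def keyword_terms(value):
    # The whole value becomes one term, unchanged and untokenized.
    return [value]


def build_index(docs, to_terms):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        for term in to_terms(value):
            index[term].add(doc_id)
    return dict(index)


names = {1: "Donald Knuth", 2: "Ada Lovelace"}

keyword_index = build_index(names, keyword_terms)
text_index = build_index(names, text_terms)

print(keyword_index)  # two terms, one per full name
print(text_index)     # four terms, one per word
```

With the keyword-style index, a lookup for 'Knuth' alone finds nothing; only the exact string 'Donald Knuth' matches an entry in the lexicon.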

system · July 22, 2019, 5:29am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
