WordDelimiterTokenFilter doesn't seem to be generating expected tokens

Atul_Bagga · January 22, 2018, 12:11pm

Here is my analyzer:

{
  "analysis": {
    "filter": {
      "wordDelimiter": {
        "type": "word_delimiter",
        "generate_word_parts": "true",
        "generate_number_parts": "true",
        "catenate_words": "false",
        "catenate_numbers": "false",
        "catenate_all": "false",
        "split_on_case_change": "true",
        "preserve_original": "true",
        "split_on_numerics": "true",
        "stem_english_possessive": "true"
      }
    },
    "analyzer": {
      "content_analyzer1": {
        "type": "custom",
        "tokenizer": "standard",
        "filter": [
          "asciifolding",
          "wordDelimiter",
          "lowercase"
        ]
      }
    }
  }
}

When I analyze the text "ElasticSearch.TestProject", I expect the following tokens to be generated:
elastic, search, test, project, elasticsearch, testproject, elasticsearch.testproject

I expect these because split_on_case_change and split_on_numerics are enabled, and I am using the standard tokenizer, which should split on ".".

But I actually see only the following tokens:
elasticsearch.testproject, elastic, search, test, project

Is there a way to get the tokens I expect?
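
To make the gap concrete, here is a small Python sketch (not part of my original setup) that builds a request body for the `_analyze` API with the same tokenizer and filter chain defined inline, and compares the expected and observed token lists from above. The function names `build_analyze_body` and `missing_tokens` are hypothetical helpers, and the sketch does not contact a cluster; it only prepares the body and diffs the two lists. The two missing tokens are the per-segment joins "elasticsearch" and "testproject". The preserved original "elasticsearch.testproject" suggests the standard tokenizer passed the dotted text through as a single token, so the filter never saw "ElasticSearch" and "TestProject" as separate inputs.

```python
import json

# Filter settings copied verbatim from the "wordDelimiter" definition in the post.
WORD_DELIMITER = {
    "type": "word_delimiter",
    "generate_word_parts": "true",
    "generate_number_parts": "true",
    "catenate_words": "false",
    "catenate_numbers": "false",
    "catenate_all": "false",
    "split_on_case_change": "true",
    "preserve_original": "true",
    "split_on_numerics": "true",
    "stem_english_possessive": "true",
}

# Token lists as stated in the post.
EXPECTED = [
    "elastic", "search", "test", "project",
    "elasticsearch", "testproject", "elasticsearch.testproject",
]
OBSERVED = ["elasticsearch.testproject", "elastic", "search", "test", "project"]


def build_analyze_body(text):
    """Build a body for POST /_analyze that mirrors content_analyzer1,
    with the custom filter defined inline instead of by name."""
    return {
        "tokenizer": "standard",
        "filter": ["asciifolding", WORD_DELIMITER, "lowercase"],
        "text": text,
    }


def missing_tokens(expected, observed):
    """Return the expected tokens that were not observed, in expected order."""
    seen = set(observed)
    return [token for token in expected if token not in seen]


if __name__ == "__main__":
    # Body that could be sent to the _analyze endpoint to reproduce the result.
    print(json.dumps(build_analyze_body("ElasticSearch.TestProject"), indent=2))
    # Tokens I expected but did not get.
    print(missing_tokens(EXPECTED, OBSERVED))
```

Running the diff on the lists above gives `['elasticsearch', 'testproject']`.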

