Hi, I'm trying to set up a simple analyzer to search questions and answers in our Elasticsearch cluster.
We need the English language analyzer (with its default stopwords) combined with custom synonyms and stopwords. Since the latter will change from time to time, I decided to include them only in the search_analyzer so that I can use updateable=True. Here is my setup (sorry for the weird formatting, I'm using the Elasticsearch Python DSL):
index_analyzer = analyzer(
    'english_analyzer',
    tokenizer="standard",
    filter=[
        "lowercase",
        token_filter('english_stemmer',
                     type='stemmer',
                     language='english'),
        token_filter('english_excluded_words',
                     type='stop',
                     stopwords='_english_'),
        token_filter('english_possessive_stemmer',
                     type='stemmer',
                     language='possessive_english'),
    ])
search_analyzer = analyzer(
    'english_search_analyzer',
    tokenizer="standard",
    filter=[
        "lowercase",
        token_filter('english_stemmer',
                     type='stemmer',
                     language='english'),
        token_filter('english_excluded_words',
                     type='stop',
                     stopwords='_english_'),
        token_filter('english_possessive_stemmer',
                     type='stemmer',
                     language='possessive_english'),
        token_filter('synonyms',
                     type='synonym',
                     synonyms_path="analyzers/F79458176",
                     updateable=True),
        token_filter('custom_stopwords',
                     type='stop',
                     stopwords_path="analyzers/F261792345",
                     updateable=True),
    ])
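For reference, the raw index settings that I believe this DSL code should generate look roughly like this (a sketch, not the actual output of my code — filter names and order match the Python above):

```json
{
  "settings": {
    "analysis": {
      "filter": {
        "english_stemmer":            { "type": "stemmer", "language": "english" },
        "english_excluded_words":     { "type": "stop", "stopwords": "_english_" },
        "english_possessive_stemmer": { "type": "stemmer", "language": "possessive_english" },
        "synonyms":                   { "type": "synonym", "synonyms_path": "analyzers/F79458176", "updateable": true },
        "custom_stopwords":           { "type": "stop", "stopwords_path": "analyzers/F261792345", "updateable": true }
      },
      "analyzer": {
        "english_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "english_stemmer", "english_excluded_words", "english_possessive_stemmer"]
        },
        "english_search_analyzer": {
          "tokenizer": "standard",
          "filter": ["lowercase", "english_stemmer", "english_excluded_words", "english_possessive_stemmer", "synonyms", "custom_stopwords"]
        }
      }
    }
  }
}
```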
The paths are defined like this because I'm using packages in AWS. I tested a similar setup without the custom_stopwords filter and with only a handful of synonyms, and it worked just fine. Now that I've added all the synonyms plus some custom stopwords, I get an error when initializing the index:

RequestError(400, 'illegal_argument_exception', 'failed to build synonyms')

with no additional information.
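For context, the synonyms file in the package follows the default Solr synonym format — something like this (an invented example, not my actual rules):

```
# Solr-format synonym rules
laptop, notebook
i-pod, i pod => ipod
```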
Is this caused by a failure to parse the synonyms file, or by my setup with multiple stop token filters? And do I really need to repeat the first three token filters in the search analyzer even though they are already in the index analyzer?