Query_string DSL fails when just one search term word changes - reliably

steveh · August 26, 2021, 4:23pm

Weird problem I cannot explain or solve!

On my production server, queries can fail if one word changes. This is reliably reproducible: it's just the word change that triggers the error, not a fluke of unfortunate timing (e.g. the server being overloaded).

ES 7.8 with PHP API 7.7
Error: BadRequest400Exception
Search term: very long search term here and economic
Query sent as (php print_r output btw)

[query_string] => Array
    (
        [fields] => Array
            (
                [0] => document_content
                [1] => case_summary^10
            )

        [query] => +very +long +search +term +here +and +economic
    )

)

BUT the query string
very long search term here and contains
works fine!
As does
very long search term here and continuity
but NOT
very long search term here and children

Just changing one word causes the fail, continuity OK, children nope. (those pesky kids)

Any thoughts why this may be? I am bamboozled!
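For anyone trying to reproduce this, the print_r output above corresponds roughly to the request body sketched below. This is a minimal Python sketch, not the PHP client code from the post; `build_query` is a hypothetical helper, and wrapping the clause in a top-level `query` key is an assumption, since the post only shows the `query_string` part of the request.

```python
import json


def build_query(term, fields):
    """Build a query_string body in which every word of `term` is required.

    Hypothetical helper: prefixing each word with '+' mirrors the
    '+very +long ...' string shown in the print_r output.
    """
    required = " ".join("+" + word for word in term.split())
    return {"query": {"query_string": {"fields": fields, "query": required}}}


body = build_query(
    "very long search term here and economic",
    ["document_content", "case_summary^10"],
)

# Printing the body as JSON gives something that can be pasted into a forum post.
print(json.dumps(body, indent=2))
```

Swapping `economic` for `contains`, `continuity` or `children` changes only the final word of the `query` value, so the request shape itself is identical in the working and failing cases.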

steveh · August 26, 2021, 4:53pm

P.S. I have just upgraded the PHP API to 7.11 (the latest) via composer, with no effect on the problem. Nor can I find any such thing as reserved words in Elasticsearch.

spinscale · August 30, 2021, 11:26am

Can you share the JSON queries you are sending to Elasticsearch, plus the JSON responses coming back from Elasticsearch? Otherwise debugging this will be super hard.

Thank you!

steveh · August 31, 2021, 8:14am

I'll see what logging options the PHP API gives me to grab the JSON it's sending. I don't use JSON directly.

steveh · August 31, 2021, 4:12pm

Moving forwards:
Enabling monolog has revealed this error:

field doc of [appeals3] index has exceeded [1000000] - maximum allowed to be analyzed for highlighting.

It appears a few of my documents are much larger than I originally planned for.
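This would also explain why a single word decides the outcome: highlighting only runs on the hits that are returned, so the error presumably appears only when the search term matches one of the oversized documents. A rough sketch for spotting such documents before indexing, assuming they are available as Python dicts; `oversized_docs` and the sample data are hypothetical, and 1,000,000 is the limit quoted in the error message.

```python
# Limit quoted in the error message (index.highlight.max_analyzed_offset).
MAX_ANALYZED_OFFSET = 1_000_000


def oversized_docs(docs, field, limit=MAX_ANALYZED_OFFSET):
    """Return the ids of documents whose `field` is longer than `limit` characters."""
    return [
        doc_id
        for doc_id, doc in docs.items()
        if len(doc.get(field, "")) > limit
    ]


# Hypothetical sample data: one small document and one far over the limit.
docs = {
    "a": {"document_content": "short text about continuity"},
    "b": {"document_content": "children " * 200_000},  # 1,800,000 characters
}

print(oversized_docs(docs, "document_content"))
```

In this toy data only document `b` exceeds the limit, so only a search that returns `b` would run into the highlighting error.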

Thus am I right in thinking that changing the index from this:
"document_content" : { "type" : "text", "analyzer" : "bespoke_snowball" },
to this:
"document_content" : { "type" : "text", "term_vector" : "with_positions_offsets", "analyzer" : "bespoke_snowball" },
and rebuilding my index will magically fix the problem and use the fast highlighter instead?

I realise I could just increase index.highlight.max_analyzed_offset, but this would be a "bodgit and run" solution. Adding the term vector appears straightforward.
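The two options can be put side by side as request bodies. This is a sketch only, written as Python dicts for easy serialising: the index name and field come from the posts above, `bespoke_snowball` has to be defined in the index's analysis settings (not shown in this thread), and the value 2000000 is an arbitrary illustration rather than a recommendation.

```python
import json

# Option 1: store term vectors on the field. Changing the mapping of an
# existing text field this way means rebuilding (reindexing) the index.
mapping_body = {
    "mappings": {
        "properties": {
            "document_content": {
                "type": "text",
                "term_vector": "with_positions_offsets",
                "analyzer": "bespoke_snowball",
            }
        }
    }
}

# Option 2: raise the highlighting limit on the existing index, e.g. as the
# body of PUT /appeals3/_settings. The value below is illustrative only.
settings_body = {"index": {"highlight": {"max_analyzed_offset": 2000000}}}

print(json.dumps(mapping_body, indent=2))
print(json.dumps(settings_body, indent=2))
```

As far as I understand the highlighting documentation, the default unified highlighter uses term vectors automatically once the field is mapped with `with_positions_offsets`, whereas the fast vector highlighter has to be requested explicitly with `"type": "fvh"` in the highlight section of the search request.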

Your opinion is much appreciated.

Steve

spinscale · September 1, 2021, 8:35am

Glad you found the issue! Your solution will come at the expense of a bigger index size, but that might be ok for you.

steveh · September 2, 2021, 4:05pm

Thank you for your help. The fix works and the index size is little different.

system · September 30, 2021, 4:05pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
