Lee_Gee
(Lee Gee)
October 1, 2014, 10:24am
1
I have an Elasticsearch string field configured for autocomplete like this:
autocomplete_analyzer:
  type: custom
  tokenizer: whitespace
  filter: [ lowercase, asciifolding, ending_synonym, name_synonyms, autocomplete_filter ]
autocomplete_filter:
  type: edge_ngram
  min_gram: 1
  max_gram: 20
  token_chars: [ letter, digit, whitespace, punctuation, symbol ]
search_analyzer:
  type: custom
  tokenizer: whitespace
  filter: [ lowercase, asciifolding, standard, name_synonyms, ending_synonym ]
I have a record where the field contains 'S XYZ', and lots of other records
where the field contains other words beginning S.
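For readers following along, here is a rough Python sketch of what the configuration above probably does to the text at index time and at search time. It is a simplification under stated assumptions, not Elasticsearch's own code: it ignores asciifolding and both synonym filters, and the record 'Smith' and all function names are made up for illustration.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the prefixes of token with lengths min_gram..max_gram."""
    longest = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, longest + 1)]


def index_tokens(text):
    """Approximate the index-time analyzer: whitespace split, lowercase, edge n-grams."""
    tokens = []
    for word in text.lower().split():
        tokens.extend(edge_ngrams(word))
    return tokens


def search_tokens(text):
    """Approximate the search-time analyzer: whitespace split and lowercase only."""
    return text.lower().split()


def matching_terms(doc_text, query_text):
    """List the query terms that also appear among the document's indexed terms."""
    indexed = set(index_tokens(doc_text))
    return [term for term in search_tokens(query_text) if term in indexed]


print(index_tokens("S XYZ"))             # ['s', 'x', 'xy', 'xyz']
print(matching_terms("S XYZ", "S XYZ"))  # ['s', 'xyz']
print(matching_terms("Smith", "S XYZ"))  # ['s']
```

Under these assumptions every record with a word beginning with S matches the query term 's', but only the 'S XYZ' record also matches 'xyz', which is why it would be expected to score highest.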
I do not understand why, when I search for 'S XYZ', it is not the first
result.
Could someone please explain?
Many thanks in anticipation
lee
--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com .
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/218280b1-2c9c-42db-854d-62d1c8de8862%40googlegroups.com .
For more options, visit https://groups.google.com/d/optout .
jpountz
(Adrien Grand)
October 1, 2014, 10:55pm
2
Maybe you can enable explanations to see how scores are computed and what
the difference is between these records?
--
Adrien Grand
Lee_Gee
(Lee Gee)
October 2, 2014, 8:41am
3
'explain' shows only three differences between the two results:
Hit on 'S' vs. hit on 'DqWjDCcsh S'
idf(docFreq=1, maxDocs=1) vs. idf(docFreq=10, maxDocs=10)
fieldNorm(doc=0) vs. fieldNorm(doc=9)
My possibly flawed understanding is that IDF is the inverse document
frequency of the search term across the whole index. What confuses me is
that these are results for the same term in the same index, so shouldn't
the IDF be the same?
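A small worked example may help here. It assumes the classic Lucene TF/IDF similarity, whose IDF is 1 + ln(maxDocs / (docFreq + 1)), and it assumes the two docFreq/maxDocs pairs in the explain output were each computed from a different shard's own statistics, which is how Elasticsearch scores by default. Whether that is what happened in this index is a guess; the function name is made up for illustration.

```python
import math


def classic_idf(doc_freq, max_docs):
    """Classic Lucene-style IDF: 1 + ln(maxDocs / (docFreq + 1))."""
    return 1.0 + math.log(max_docs / (doc_freq + 1.0))


# The two pairs reported by 'explain' above.
print(round(classic_idf(1, 1), 4))    # 0.3069
print(round(classic_idf(10, 10), 4))  # 0.9047
```

So the same term can get a different IDF for each hit when the statistics behind the two hits come from different sets of documents.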
tia
lee
Lee_Gee
(Lee Gee)
October 2, 2014, 9:13am
4
The problem was that my test script did not pause between creating and
populating the index and searching it. Even though there are very few
documents (10), Elasticsearch still needs a second or two to catch its
breath and mop its brow before it is ready to search.
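As an alternative to sleeping in a test script, one option is to ask Elasticsearch to refresh the index explicitly before searching. The sketch below is only an outline under assumptions: the index name, type name, field name, the localhost URL and both function names are hypothetical, and the document URLs use the /index/type/id layout of the 1.x releases current at the time of this thread.

```python
import json
import urllib.request


def send(method, path, body=None, base_url="http://localhost:9200"):
    """Send one JSON request to Elasticsearch and return the decoded reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    request = urllib.request.Request(base_url + path, data=data, method=method)
    request.add_header("Content-Type", "application/json")
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def index_refresh_search(index, doc_type, docs, query, send=send):
    """Index the documents, force a refresh, then run the search."""
    for doc_id, doc in enumerate(docs):
        send("PUT", "/%s/%s/%d" % (index, doc_type, doc_id), doc)
    # Newly indexed documents only become searchable after a refresh,
    # so request one instead of waiting for the periodic refresh.
    send("POST", "/%s/_refresh" % index)
    return send("POST", "/%s/_search" % index, query)
```

A test script could then call index_refresh_search("names", "person", docs, {"query": {"match": {"name": "S XYZ"}}}) against a local node. Passing a different send function makes it possible to exercise the call order without a running cluster.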
Now to find a way to rank shorter strings higher than longer ones... but
that's another question.
thanks
Lee