Hey everybody,
(this is my first post to this list; let me start by saying ElasticSearch
is a pleasure to work with, so thanks to everybody involved!)
I'm currently struggling with the question of how to build my index:
Basically, I have a few fields which contain pretty short "human readable
identifiers" which shall be searched.
Because I have multiple fields per document containing such identifiers,
I'm indexing them in the _all property.
Currently, the _all property is configured with the standard tokenizer, and
the edge NGram token filter -- so people can search for partial words (from
the start of the word) pretty well.
Now, the requirement has arisen to also be able to find any partial word,
not just from the start of each word. However, if the start of the word
matches, it should rank higher than a match inside the word.
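
To make the difference concrete, here is a small pure-Python sketch of what the two filters produce for a single word. It only imitates the token generation and is not ElasticSearch code; the function names and the `min_gram`/`max_gram` defaults are assumptions chosen for illustration.

```python
def edge_ngrams(word, min_gram=2, max_gram=10):
    """Prefixes of `word` with lengths min_gram..max_gram (edge n-grams)."""
    upper = min(max_gram, len(word))
    return [word[:n] for n in range(min_gram, upper + 1)]


def ngrams(word, min_gram=2, max_gram=10):
    """All substrings of `word` with lengths min_gram..max_gram (n-grams)."""
    grams = []
    for start in range(len(word)):
        upper = min(max_gram, len(word) - start)
        for n in range(min_gram, upper + 1):
            grams.append(word[start:start + n])
    return grams


def match_kind(query, word):
    """Classify how `query` matches `word`: 'prefix', 'inner', or None."""
    if query in edge_ngrams(word):
        return "prefix"  # would be found by the edge n-gram index
    if query in ngrams(word):
        return "inner"   # only found by the full n-gram index
    return None
```

Every edge n-gram is also an n-gram, so a prefix match hits both token sets while an inner match hits only one. That overlap is what a ranking scheme can exploit.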
Any thoughts on how to do that? I currently see different possibilities:
- Can I somehow set different _all indexing configurations for different
fields? I.e. the identifiers would be indexed as multi_field, once with
edgeNgram and a higher boost, once with ngram and a lower boost. That'd be
the solution I'd prefer, but from reading the docs I doubt that's possible.
- Can I somehow tell the system to use both the edge ngram and the
ngram filter in parallel, such that the tokens starting from the
beginning of every word are indexed twice per document? This should, as I
understand it, also result in a higher ranking, although it is somewhat crude.
- Should I kick out the _all field, manually concatenate the different
strings at indexing time, index the result once with edgeNgram and once with
ngram, and then boost the edgeNgram results over the others at query time?
(I would dislike this the most, as it affects every place where
such queries are built...)
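
For the third option, the query-time part could be kept in one place with a small helper, so that not every call site has to know about the two fields. The following is only a sketch: the field names `ids.edge` and `ids.ngram` and the boost values are hypothetical, and it assumes the concatenated identifiers are indexed into two separately analyzed fields. The body uses the standard `bool`/`should` and `match` query forms.

```python
def build_identifier_query(text,
                           prefix_field="ids.edge",   # hypothetical edge n-gram field
                           infix_field="ids.ngram",   # hypothetical n-gram field
                           prefix_boost=3.0,          # illustrative boost values
                           infix_boost=1.0):
    """Build a search body that ranks prefix matches above inner matches."""
    return {
        "query": {
            "bool": {
                "should": [
                    # Matches at the start of a word get the higher boost.
                    {"match": {prefix_field: {"query": text,
                                              "boost": prefix_boost}}},
                    # Matches anywhere in a word get the lower boost.
                    {"match": {infix_field: {"query": text,
                                             "boost": infix_boost}}},
                ]
            }
        }
    }
```

A call such as `build_identifier_query("abc")` returns a plain dict that can be serialized to JSON and sent as the search body. Note that the search side would normally need an analyzer without the n-gram filters, otherwise the query text itself gets split into grams.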
Thank you in advance for providing any advice,
Greets, Sebastian