Hello,
one more specific question in my quest to get ElasticSearch to do what I
want.
I have some text fields which have been thoroughly analyzed before any
indexing happens. Thus, I already know the text to be stored as well as the
tokens the text should produce. I don't want to use a Lucene/ElasticSearch
analyzer in that case because it would not be able to produce the tokens I
want.
Essentially I'm looking for the PreAnalyzedField feature available in Solr.
If there is such a feature and I just missed it, you can just tell me and
skip the rest of this post.
For ElasticSearch I thought I would exploit the multi_field feature by
doing the following:
"properties": {
    "text_stored": {
        "type": "multi_field",
        "path": "just_name",
        "fields": {
            "text": {"type": "string", "index": "no", "store": "yes"}
        }
    },
    "text_analyzed": {
        "type": "multi_field",
        "path": "just_name",
        "fields": {
            "text": {"type": "string", "index": "analyzed", "term_vector": "with_positions_offsets"}
        }
    }
}
The whole example can be found here and be copied and pasted into the
terminal after starting a fresh copy of ElasticSearch:
"text_stored" should contain the original text and "text_analyzed" my
pre-analyzed terms (I hope I could add those by using an appropriate
tokenizer plugin).
If I now have a document like this:
{
    "text_stored": "Sebastien Lorber is awesome. Yes, old Lorber.",
    "text_analyzed": "Lorber has a farm."
}
I am able to find the document by searching "text:farm" for example.
Searching for "text:awesome" would not work here, of course, because the
"text_stored" field is not analyzed.
The only thing lacking for me now is that the "text_stored" field value
should be highlighted corresponding to the analyzed tokens in
"text_analyzed".
Thus, when searching for "lorber" I would like this highlighting:
"Sebastien Lorber is awesome. Yes, old Lorber."
Instead I get
"Sebastien Lorber is awesome. Yes, old Lorber."
When searching for "farm" I'd like highlighting to be as
"Sebastien Lorber is awesome. Yes, old Lorber."
Instead I don't get any highlighting because "farm" is not found in the text.
I know that this behaviour makes sense for the default use case.
I had hoped that by specifying "term_vector": "with_positions_offsets",
highlighting would happen based on offsets only, ignoring the actual text contents.
My question is whether there is a possibility to get the behaviour I'd like to see.
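To make the behaviour I'm after a bit more concrete, here is a small Python sketch of offset-only highlighting. This is not ElasticSearch code: the function name, the token tuples and the <em> tags are all made up for illustration, and the offsets are invented by me (in reality they would come from my pre-analysis step). It only shows the idea that the highlighter should look at the token offsets and never compare the term against the stored text.

```python
from typing import List, Tuple

# A pre-analyzed token: (term, start_offset, end_offset).
# Assumption: the offsets point into the *stored* text, not the analyzed text.
Token = Tuple[str, int, int]


def highlight_by_offsets(stored_text: str, tokens: List[Token], query_term: str,
                         pre: str = "<em>", post: str = "</em>") -> str:
    """Wrap the spans of all tokens matching query_term, using offsets only."""
    wanted = query_term.lower()
    # Collect the (start, end) spans of matching tokens, left to right.
    spans = sorted((start, end) for term, start, end in tokens if term == wanted)

    parts = []
    cursor = 0
    for start, end in spans:
        if start < cursor:
            # Skip spans overlapping one that was already highlighted.
            continue
        parts.append(stored_text[cursor:start])
        parts.append(pre + stored_text[start:end] + post)
        cursor = end
    parts.append(stored_text[cursor:])
    return "".join(parts)


stored = "Sebastien Lorber is awesome. Yes, old Lorber."
# Hypothetical pre-analysis result: "farm" is deliberately mapped onto the
# span of "awesome" to show a term that never occurs in the stored text.
tokens = [("lorber", 10, 16), ("farm", 20, 27), ("lorber", 38, 44)]

print(highlight_by_offsets(stored, tokens, "lorber"))
print(highlight_by_offsets(stored, tokens, "farm"))
```

With these made-up offsets, a search for "farm" still marks a span in the stored text even though the word itself does not appear there, which is the part I cannot get out of the built-in highlighting.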
Thank you!
Erik