You can also look at developing a custom analyzer so that your phrase is
not broken up at whitespace when indexed.
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis.html
Selecting the right combination of char filters and tokenizer will
keep a phrase together as a single token.
For example, using the whitespace analyzer will separate on whitespace:
curl '192.168.w.xyz:9200/test/_analyze?pretty=1&analyzer=whitespace' -d 'foo bar baz'
{
"tokens" : [ {
"token" : "foo",
"start_offset" : 0,
"end_offset" : 3,
"type" : "word",
"position" : 1
}, {
"token" : "bar",
"start_offset" : 4,
"end_offset" : 7,
"type" : "word",
"position" : 2
}, {
"token" : "baz",
"start_offset" : 8,
"end_offset" : 11,
"type" : "word",
"position" : 3
} ]
}
However, using the keyword analyzer will retain the entire phrase:
curl '192.168.w.xyz:9200/test/_analyze?pretty=1&analyzer=keyword' -d 'foo bAr baZ'
{
"tokens" : [ {
"token" : "foo bAr baZ",
"start_offset" : 0,
"end_offset" : 11,
"type" : "word",
"position" : 1
} ]
}
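As a rough sketch of the custom-analyzer idea, the snippet below builds index settings for an analyzer that combines the built-in keyword tokenizer with the built-in lowercase token filter, so "foo bAr baZ" would be indexed as the single token "foo bar baz". The analyzer name phrase_lowercase is made up for illustration, and the snippet only prints the JSON body; how you send it to your cluster is up to you.

```python
import json

# Hypothetical analyzer name, chosen only for this example.
ANALYZER_NAME = "phrase_lowercase"


def build_index_settings(analyzer_name=ANALYZER_NAME):
    """Return index settings that define a custom analyzer.

    The "keyword" tokenizer emits the whole input as one token and the
    "lowercase" token filter normalizes its case.
    """
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    analyzer_name: {
                        "type": "custom",
                        "tokenizer": "keyword",
                        "filter": ["lowercase"],
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that would be supplied when creating the index.
    print(json.dumps(build_index_settings(), indent=2))
```

Note that a keyword-tokenized field holds the entire field value as one token, so this only helps when the field contains nothing but the phrase itself.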
On Tuesday, October 28, 2014 10:00:01 AM UTC, vineeth mohan wrote:
Hello Valergi,
This won't work as is, because the string would normally be tokenized
into "green" and "energy".
If you use the shingle token filter with a shingle size of 2, it might work.
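To illustrate why shingles might help, here is a minimal plain-Python sketch (not Elasticsearch code) that imitates what a shingle size of 2 does to a whitespace-tokenized string: each pair of adjacent words becomes one token, so "green energy" exists as a term whose frequency can be counted. The function names and the sample text are invented for the example, and unlike the real filter's default behaviour the sketch does not also emit the single words.

```python
from collections import Counter


def shingles(tokens, size=2):
    """Join every run of `size` adjacent tokens with a single space."""
    return [" ".join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]


def shingle_tf(text, phrase, size=2):
    """Count how often `phrase` appears among the shingles of `text`."""
    tokens = text.lower().split()  # crude stand-in for whitespace + lowercase analysis
    return Counter(shingles(tokens, size))[phrase]


if __name__ == "__main__":
    sample = "green energy is clean energy and green energy is cheap"
    print(shingle_tf(sample, "green energy"))
```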
Alternatively, you can read the position values of both tokens in the
script and, if they are next to each other, count that as an
occurrence.
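The position check can be sketched in plain Python as follows. This only emulates the logic; it is not the Elasticsearch scripting API, and the helper names are made up for the example. It records the positions of every token and counts each place where the second word sits exactly one position after the first.

```python
def token_positions(tokens):
    """Map each token to the list of positions at which it occurs."""
    index = {}
    for pos, tok in enumerate(tokens):
        index.setdefault(tok, []).append(pos)
    return index


def phrase_count(text, first, second):
    """Count occurrences where `second` immediately follows `first`."""
    idx = token_positions(text.lower().split())
    second_positions = set(idx.get(second, []))
    return sum(1 for p in idx.get(first, []) if p + 1 in second_positions)


if __name__ == "__main__":
    sample = "green energy is clean energy and green energy is cheap"
    print(phrase_count(sample, "green", "energy"))
```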
Thanks
Vineeth
On Tue, Oct 28, 2014 at 3:06 PM, <valerij.v...@googlemail.com> wrote:
I want to access the frequency of a phrase made up of multiple words, e.g.
"green energy".
I can access the tf of "green" and "energy", for example:
"function_score":
{
  "filter" : {
    "terms" : { "content" : ["energy","green"]}
  },
  "script_score": {
    "script": "_index['content']['energy'].tf() + _index['content']['green'].tf()",
    "lang":"groovy"
  }
}
This works fine. However, how can I find the frequency of the phrase
"green energy"? The following does not work:
_index['content']['green energy'].tf()
--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearc...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/2e4388a4-72d6-4933-9686-304dea0727f1%40googlegroups.com
.
For more options, visit https://groups.google.com/d/optout.