I think this will work great for you. Just compare the number of analyzed
tokens at index and query time. I think we were solving identical
problems.
Dim client = New ElasticConnection()
Dim result = client.Post("http://localhost:9200/chc/_analyze?analyzer=" & Analyzer, RawString)
Dim J = JObject.Parse(result.ToString()) ' Newtonsoft.Json.Linq.JObject
Return (From X In J("tokens")).Count()   ' one entry per token emitted by the analyzer
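For reference, the _analyze call just hands back a JSON body with a "tokens"
array, roughly like this for "Square Steakhouse" (trimmed to the one field
that matters here), which is why counting the array entries gives the token
count:
{
  "tokens": [
    { "token": "square" },
    { "token": "steakhouse" }
  ]
}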
Here is my query code, where "MyQuery" is my index field name. This might
be "title" or "name" or something for you.
{
  "size": 30,
  "query": {
    "filtered": {
      "query": {
        "match": {
          "MyQuery": {
            "query": "[query]",
            "operator": "AND"
          }
        }
      },
      "filter": {
        "term": {
          "TokenCount": "[tokencount]"
        }
      }
    }
  }
}
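For completeness, this is roughly how I fill in the [query] and [tokencount]
placeholders before posting to _search. SearchTemplate and UserSearchText are
just placeholder names here, and the naive Replace() assumes the search text
has already been JSON-escaped:
Dim tokenCount = GetTokenCount(UserSearchText)   ' same analyzer as at index time
Dim searchJson = SearchTemplate _
    .Replace("[query]", UserSearchText) _
    .Replace("[tokencount]", tokenCount.ToString())
Dim client = New ElasticConnection()
Dim result = client.Post("http://localhost:9200/chc/_search", searchJson)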
Sorry for the late reply.
Brian Webster | 918 633 6863
On Wed, Jul 31, 2013 at 8:52 AM, thalejacobs@gmail.com wrote:
Hello all - I am having a similar issue to the one Brian described. Brian,
did you end up including the token count in your index for filtering? Did it
work well? I am thinking about doing the same, but I have one more issue to
solve too.
I have another doc in the index that just contains "Square" (in addition to
the "Square Steakhouse" doc), so if I search only on "Square", I want to
match only the "Square" document, not the "Square Steakhouse" doc. The
results I want are:
Query: Square Steakhouse     Result: Match to "Square Steakhouse" doc
Query: Square Steakhouses    Result: Match to "Square Steakhouse" doc
Query: Squared Steakhouse    Result: Match to "Square Steakhouse" doc
Query: Steakhouse            Result: No match
Query: Square                Result: Match to "Square" doc
Query: Squared               Result: Match to "Square" doc
Any suggestions?
Thanks.
On Wednesday, January 30, 2013 2:30:52 PM UTC-5, Brian Webster wrote:
I'm going to move forward with your idea:
The only thing I can think of doing is to:
- index the number of tokens in that field
- count the number of tokens in your query string
- use a filter to make sure they are the same
Of course, that means ensuring that you're counting the same number of
tokens that would be generated by the analyzer (e.g. being aware of
stopwords, etc.).
I'm going to write a function that uses the analyze API to extract the
number of tokens for a given field value or search string, something like
GetTokenCount(Field_Or_Search_Text As String) As Integer. This function will
use the correct analyzer.
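Just to illustrate the stopword point above: with an analyzer that strips
English stopwords, a value like "The Square Steakhouse" comes back as two
tokens rather than three, which is why the query-side count has to go through
the same analyzer. For example (the analyzer name here is made up):
POST /chc/_analyze?analyzer=my_english_analyzer
The Square Steakhouse
returns something like:
{ "tokens": [ { "token": "square" }, { "token": "steakhouse" } ] }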
Then, upon indexing the document type in question, I will store the token
count of the relevant field.
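So an indexed document would end up carrying something like this (field names
match the query above; the exact count of course depends on the analyzer):
{
  "MyQuery": "Square Steakhouse",
  "TokenCount": 2
}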
Upon searching, I will use the same GetTokenCount() function to count the
user's search tokens.
Finally, I will structure the search JSON to utilize the filters as you
have suggested.
Obviously this solution is poor for some applications, but I anticipate
fewer than 10,000 searches per day and fewer than 10 index inserts per day
of the type that is involved. Besides, I'd imagine the analyze API is
rather speedy compared to running actual queries.
Thanks for the advice. This will be a little bit tedious, but not so bad.