Which type of query should I use with a given analysis setup for better results?


(dark_shadow) #1

Hi,

I'm using the following analyzers to index my documents in ES:

    "analysis" : {
        "analyzer" : {
            "str_search_analyzer" : {
                "tokenizer" : "standard",
                "filter" : ["lowercase", "asciifolding"]
            },
            "str_index_analyzer" : {
                "tokenizer" : "standard",
                "filter" : ["lowercase", "asciifolding", "edgengram"]
            }
        },
        "filter" : {
            "edgengram" : {
                "type" : "edgeNGram",
                "min_gram" : 3,
                "max_gram" : 20,
                "side" : "front"
            }
        }
    }

I'm sure the search and index analyzers serve my purpose well, but
querying the documents in the right way is also necessary for good results.
I have read about the different queries that ES provides, but I'm confused
about which query, or combination of queries, would work well for my use case.

Let's say I have a document which contains 3 fields:
city_name: Palo Alto
state_name: California
country: United States

Now, my index analyzer will create the following tokens for these 3 fields:
city_name: pal, palo, alt, alto
state_name: cal, cali, calif, califo, califor, californ, californi, california
country: uni, unit, unite, united, sta, stat, state, states
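
To make the token lists above concrete, here is a rough Python sketch of what I understand the index analyzer to do. It is not Elasticsearch code: `fold`, `edge_ngrams`, `index_tokens` and `search_tokens` are hypothetical helper names, the whitespace split only approximates the standard tokenizer, and asciifolding is approximated by stripping combining marks after NFKD normalization.

```python
import unicodedata

MIN_GRAM = 3   # "min_gram" from the edgengram filter
MAX_GRAM = 20  # "max_gram" from the edgengram filter


def fold(text):
    # Rough stand-in for the lowercase + asciifolding filters (assumption).
    decomposed = unicodedata.normalize("NFKD", text.lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def edge_ngrams(token, min_gram=MIN_GRAM, max_gram=MAX_GRAM):
    # Front-side edge n-grams: prefixes of length min_gram..max_gram.
    # In this sketch, words shorter than min_gram produce no tokens.
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_tokens(text):
    # Approximates str_index_analyzer: split, fold, then edge n-gram each word.
    tokens = []
    for word in fold(text).split():
        tokens.extend(edge_ngrams(word))
    return tokens


def search_tokens(text):
    # Approximates str_search_analyzer: split and fold only, no n-grams.
    return fold(text).split()


print(index_tokens("Palo Alto"))
print(search_tokens("palo alt"))
```

The point of the sketch is only that the search side keeps whole words while the index side stores every prefix of 3 or more characters, so a search token like alt can match the indexed prefix of alto.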

Suppose the user searches for: palo alt
My search analyzer will then tokenize it as: palo, alt

Now, I want to return only those documents that contain both of these
tokens, either in the same field (city, state or country) or spread
across 2 fields. (Not documents that merely contain one of palo, pal or
alt.)
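
In other words, the behaviour I'm after can be sketched roughly like this in Python. This only models the matching rule, it is not how ES evaluates anything: `prefixes` and `doc_matches` are hypothetical names, and the tokenization is simplified to lowercasing and splitting on whitespace.

```python
def prefixes(word, min_gram=3, max_gram=20):
    # Front edge n-grams of one word, as a set.
    upper = min(len(word), max_gram)
    return {word[:n] for n in range(min_gram, upper + 1)}


def doc_matches(doc, query):
    # Pool the indexed tokens of every field, then require that
    # ALL search tokens are present somewhere in the document.
    indexed = set()
    for value in doc.values():
        for word in value.lower().split():
            indexed |= prefixes(word)
    return all(token in indexed for token in query.lower().split())


doc = {
    "city_name": "Palo Alto",
    "state_name": "California",
    "country": "United States",
}

print(doc_matches(doc, "palo alt"))    # both tokens in city_name
print(doc_matches(doc, "palo cali"))   # tokens spread over two fields
print(doc_matches(doc, "palo texas"))  # one token missing
```

So a document should be returned when every search token is found, regardless of which field it is found in, and rejected as soon as one token is missing everywhere.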

Which query would give me the best results with this kind of indexing and
searching?

I read about the terms query, but that works on not-analyzed fields. Also,
query_string will generate some built-in regex queries for searching (I
don't want that). I want only those documents where all the tokens of the
user's search query are present, either in the same field or across multiple
fields within the same document. How can I achieve this?
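
One possible shape for such a request, sketched in Python as a plain dict, would be a `bool` query with one `must` clause per search word, where each clause is a `multi_match` over the three fields. This is an untested assumption, not a confirmed answer: it presumes the fields are mapped with str_index_analyzer at index time and str_search_analyzer at search time (the mapping is not shown here), `build_query` is a hypothetical helper, and the client-side whitespace split only approximates the search analyzer.

```python
import json

# Field names taken from the example document above.
FIELDS = ["city_name", "state_name", "country"]


def build_query(user_input, fields=FIELDS):
    # One "must" clause per search word: every word has to match
    # in at least one of the fields, but not necessarily the same one.
    terms = user_input.split()
    return {
        "query": {
            "bool": {
                "must": [
                    {"multi_match": {"query": term, "fields": list(fields)}}
                    for term in terms
                ]
            }
        }
    }


body = build_query("palo alt")
print(json.dumps(body, indent=2))
```

The idea is that each `multi_match` only needs its word in any one field, while `must` forces all words to be satisfied, which seems to correspond to the "same field or multiple fields" requirement.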

Any ideas?

Thanks

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/5e290988-db9f-4ba3-8273-f4172cd3ca3e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(dark_shadow) #2

I'd be thankful if someone could give me an idea about this.

Thanks

On Saturday, 31 May 2014 00:45:21 UTC+5:30, coder wrote:


