Need to match all terms in query, but they can be in any document property


(Gregory Schier) #1

Hello.

I have an index of menu items that I am trying to query. In a
simplified representation, an item has a name, a description,
and an array of cuisines (basically tags).
If someone searches for "baked goods", both "baked" and "goods" must
appear in the document, but "baked" could be in the name and "goods"
could be in the description. "Baked goods" could also just be one of
the item's cuisines. I was initially using an mlt query to do this,
but it doesn't seem to be returning as many results as it should
(maybe my understanding of how it works is incorrect?). It seems to
be returning all of the items where "baked" is in the description and
"baked goods" is one of its cuisines (note that no items contain
these terms in the "name" property). There are lots of items with
"baked goods" as a cuisine that are not being returned.

My mlt query has the following properties:

  • fields = [ 'cuisine', 'name', 'description' ]
  • max_query_terms = 12
  • min_term_freq = 1
  • min_doc_freq = 1
  • percent_terms_to_match = 1
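
For reference, the settings above would correspond to a request body roughly like the one below. This is only a sketch: the helper name `build_mlt_query` is made up for illustration, and it assumes the search text is passed through the query's `like_text` parameter.

```python
import json


def build_mlt_query(like_text):
    """Build a more_like_this request body using the settings listed above.

    build_mlt_query is a hypothetical helper, not part of any client library.
    """
    return {
        "query": {
            "more_like_this": {
                "fields": ["cuisine", "name", "description"],
                "like_text": like_text,  # assumed: the raw search text
                "max_query_terms": 12,
                "min_term_freq": 1,
                "min_doc_freq": 1,
                "percent_terms_to_match": 1,
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_mlt_query("baked goods"), indent=2))
```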

My feeling is that cuisines being an array of strings has something to
do with it, but I really have no idea. Is there an easy way to make
sure all query terms appear in a document, without caring which of a
specific set of fields they are found in?
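
To make the requirement concrete, here is a small plain-Python sketch of the matching rule I am after. It is not how the search engine works internally: it assumes simple lowercase whitespace tokenization, and `matches_all_terms` and the sample items are made up for illustration.

```python
def matches_all_terms(item, query, fields=("name", "description", "cuisine")):
    """Return True if every query term appears in at least one of the fields.

    Tokenization here is a naive lowercase whitespace split (an assumption),
    not a real analyzer.
    """
    terms = query.lower().split()
    words = set()
    for field in fields:
        value = item.get(field, "")
        # A field may hold a single string or an array of strings (e.g. cuisines).
        values = value if isinstance(value, list) else [value]
        for text in values:
            words.update(text.lower().split())
    return all(term in words for term in terms)


# Hypothetical sample items.
split_across_fields = {
    "name": "Baked ziti",
    "description": "made with canned goods",
    "cuisine": ["Italian"],
}
cuisine_only = {
    "name": "Croissant",
    "description": "buttery pastry",
    "cuisine": ["Baked Goods"],
}
missing_term = {
    "name": "Baked potato",
    "description": "with sour cream",
    "cuisine": ["American"],
}
```

The first two items should match "baked goods" (terms split across fields, or both in one cuisine), and the third should not, because "goods" appears nowhere in it.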

Hope I was clear enough in my explanation.

Thanks!

~Gregory


(Shay Banon) #2

Maybe just use the _all field that gets created automatically by default,
and query on it?
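
One way to try this, as a rough sketch: a `query_string` query against `_all` with `default_operator` set to `AND`, so every term has to match somewhere in the document. The helper name `build_all_field_query` is made up for illustration, and this assumes the `_all` field is still enabled in the mapping.

```python
import json


def build_all_field_query(user_input):
    """Build a request body that requires every term to match in _all.

    build_all_field_query is a hypothetical helper, not a client API.
    """
    return {
        "query": {
            "query_string": {
                "default_field": "_all",    # assumed: _all is enabled
                "query": user_input,
                "default_operator": "AND",  # all terms required
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_all_field_query("baked goods"), indent=2))
```

Two caveats: by default `_all` covers every field in the document, not only name, description, and cuisine, and `query_string` parses its own query syntax, so raw user input containing special characters may need escaping.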

On Sun, Apr 22, 2012 at 9:37 AM, Gregory Schier gschier1990@gmail.com wrote:

