I have an index of menu items that I am trying to query on. A
simplified representation is that an item has a name, a description,
and an array of cuisines (basically tags).
If someone searches for "baked goods" it is required that both "baked"
and "goods" appear in the document, but "baked" could be in the name,
and "goods" could be in the description. Alternatively, "baked goods"
could just be one of the item's cuisines. I was initially using an mlt
(more_like_this) query
to do this but it doesn't seem to be returning as many results as it
should (maybe my understanding of how it works is incorrect?). It
seems to be returning all of the items where "baked" is in the
description, and "baked goods" is one of its cuisines (note that no
items contain these terms in the "name" property). There are lots of
items with "baked goods" as a cuisine that are not being returned.
My mlt query has the following properties:
fields = [ 'cuisine', 'name', 'description' ]
max_query_terms = 12
min_term_freq = 1
min_doc_freq = 1
percent_terms_to_match = 1
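For reference, here is a rough sketch of the request body those properties correspond to, written as a Python dict. The `like_text` key and its value "baked goods" are assumptions (the search text is not part of the property list above), and the outer `query` wrapper is only there for illustration.

```python
import json

# Sketch of the mlt request body built from the properties listed above.
# "like_text" and its value are assumptions for illustration only.
mlt_query = {
    "query": {
        "more_like_this": {
            "fields": ["cuisine", "name", "description"],
            "like_text": "baked goods",  # assumed search text
            "max_query_terms": 12,
            "min_term_freq": 1,
            "min_doc_freq": 1,
            "percent_terms_to_match": 1,
        }
    }
}

# Serialize to the JSON that would be sent as the search body.
print(json.dumps(mlt_query, indent=2))
```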
My feeling is that cuisines being an array of strings has something to
do with it, but I really have no idea. Is there an easy way to make
sure all query terms appear in a document, but not care which fields
they are found in given a specific set of fields?
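To make the matching rule I am after concrete, here is a minimal sketch in plain Python (not an Elasticsearch query). The function name `matches_all_terms`, the sample items, and the lowercase whitespace tokenization are all assumptions for illustration; a real analyzer would tokenize differently.

```python
def matches_all_terms(query, doc, fields=("name", "description", "cuisine")):
    """Return True if every query term occurs in at least one of the given fields."""
    terms = query.lower().split()
    tokens = set()
    for field in fields:
        value = doc.get(field, [])
        # A field may hold a single string or an array of strings (e.g. cuisines).
        values = [value] if isinstance(value, str) else value
        for text in values:
            # Assumed tokenization: lowercase and split on whitespace.
            tokens.update(text.lower().split())
    return all(term in tokens for term in terms)


# Hypothetical sample items.
split_across_fields = {"name": "Baked ziti", "description": "Assorted goods", "cuisine": ["Italian"]}
cuisine_only = {"name": "Muffin", "description": "Blueberry", "cuisine": ["Baked Goods"]}
one_term_only = {"name": "Baked potato", "description": "With butter", "cuisine": ["American"]}

print(matches_all_terms("baked goods", split_across_fields))
print(matches_all_terms("baked goods", cuisine_only))
print(matches_all_terms("baked goods", one_term_only))
```

The first two items should match (terms split across fields, or both terms in a single cuisine), while the third should not, since "goods" appears nowhere in it.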