Matching all of a document's tokens


#1

Hi,

My documents are tiny (lists of words and phrases), and I need my queries to match a document's "whole" set of tokens ("whole" = all of its tokens once stopwords are filtered out), not just part of it.

EXAMPLE

The documents look like this:

doc1: {"text": "ministry"}
doc2: {"text": "justice"}
doc3: {"text": "ministry of justice"}

The analyzer is defined as:

{"tokenizer": "icu_tokenizer", "filter": ["icu_normalizer", "english_stopwords", "english_stemmer"]}

A basic match query would be:

GET /test/en/_search
{
  "query": {
    "match" : {
      "text": {
        "query" : "<QUERY>",
        "operator": "AND"
      }
    }
  }
}

What query should I use to get the following behavior?

  • Hit doc3 for the query "the justice ministry".
    A match query with operator: AND does the job.
    But that would not be enough if I had an extra document like {"text": "chef of ministry of justice"}: it would be matched too.
  • Hit doc1 (and not doc3) for the query "ministries".
    With a basic match query I get both doc1 and doc3, but I only want doc1.
  • Hit doc2 (and not doc3) for the query "justices".
    With a basic match query I get both doc2 and doc3, but I only want doc2.
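
To make the three cases above concrete, here is a minimal Python sketch of the matching rule I have in mind: a document is a hit only if its analyzed token set equals the query's analyzed token set. It is not an Elasticsearch query. The `analyze`, `stem` and `whole_match` helpers and the stopword list are hypothetical toy stand-ins for my real analyzer chain, and the extra `doc4` is the "chef of ministry of justice" document mentioned above.

```python
# Toy stand-ins for the analyzer chain; NOT the real ICU / English filters.
STOPWORDS = {"the", "of", "a", "an", "and"}  # assumed, tiny stopword list


def stem(token):
    """Very crude plural stripping, just enough for the examples."""
    if token.endswith("ies"):
        return token[:-3] + "y"          # ministries -> ministry
    if token.endswith("s") and not token.endswith("ss"):
        return token[:-1]                # justices -> justice
    return token


def analyze(text):
    """Lowercase, split on whitespace, drop stopwords, stem; order is ignored."""
    return frozenset(
        stem(t) for t in text.lower().split() if t not in STOPWORDS
    )


DOCS = {
    "doc1": "ministry",
    "doc2": "justice",
    "doc3": "ministry of justice",
    "doc4": "chef of ministry of justice",  # the extra document that must not match
}


def whole_match(query, docs=DOCS):
    """Return ids of documents whose token set equals the query's token set."""
    wanted = analyze(query)
    return sorted(doc_id for doc_id, text in docs.items()
                  if analyze(text) == wanted)


print(whole_match("the justice ministry"))  # ['doc3']
print(whole_match("ministries"))            # ['doc1']
print(whole_match("justices"))              # ['doc2']
```

Comparing sets (rather than token sequences) reflects the first case, where the word order of the query differs from the document.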

Is there a way to do this in a single query?

Thanks a lot for your help,


(system) #2