Matching every document's tokens

francoisguerin · August 11, 2015, 6:20pm

Hi,

My documents are tiny (lists of words and phrases), and I need my queries to match a document's "whole" tokens ("whole" = tokens with stopwords), not just part of a document.

EXAMPLE

Documents look like:

doc1: {"text": "ministry"}
doc2: {"text": "justice"}
doc3: {"text": "ministry of justice"}

The analyzer is made of:

{"tokenizer": "icu_tokenizer", "filter": ["icu_normalizer", "english_stopwords", "english_stemmer"]}
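For readers who want to reproduce the setup, here is a rough sketch of index settings that could produce such an analyzer, written as a Python dict and printed as JSON. The filter definitions are assumptions: "english_stopwords" and "english_stemmer" are custom filter names from the post, and I am guessing they are a stop filter with the "_english_" list and a stemmer filter with language "english". The analyzer name "text_analyzer" is hypothetical, and the ICU tokenizer and normalizer require the analysis-icu plugin.

```python
import json

# Sketch of index settings; filter definitions and the analyzer name
# "text_analyzer" are assumptions, not taken from the original post.
index_settings = {
    "settings": {
        "analysis": {
            "filter": {
                # Assumed definition of the custom "english_stopwords" filter
                "english_stopwords": {"type": "stop", "stopwords": "_english_"},
                # Assumed definition of the custom "english_stemmer" filter
                "english_stemmer": {"type": "stemmer", "language": "english"},
            },
            "analyzer": {
                "text_analyzer": {
                    "type": "custom",
                    "tokenizer": "icu_tokenizer",
                    "filter": [
                        "icu_normalizer",
                        "english_stopwords",
                        "english_stemmer",
                    ],
                }
            },
        }
    }
}

# Print the body that would be sent when creating the index
print(json.dumps(index_settings, indent=2))
```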

A basic match query would be:

GET /test/en/_search
{
  "query": {
    "match" : {
      "text": {
        "query" : "<QUERY>",
        "operator": "AND"
      }
    }
  }
}

What query should I use to get the following behavior?

- Hit doc3 for the query "the justice ministry".
  A match query with operator:AND does the job. But if I had an extra document like {"text": "chef of ministry of justice"}, it would be matched too, which is not what I want.
- Hit doc1 (and not doc3) for the query "ministries".
  With a basic match query, I would get doc1 and doc3, but I only want doc1.
- Hit doc2 (and not doc3) for the query "justices".
  With a basic match query, I would get doc2 and doc3, but I only want doc2.

Is there a way to do this in a single query?
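To make the expected behavior concrete, here is a small Python sketch of the matching rule I have in mind, run outside Elasticsearch. It assumes that "matching the whole document" means the set of analyzed query tokens equals the set of analyzed document tokens. The stopword list and the toy_stem function are simplified stand-ins for the real stop and stemmer filters, not their actual behavior, and doc4 is the hypothetical extra document mentioned above.

```python
# Simplified stand-in for the stop filter (assumption, not the real list)
STOPWORDS = {"the", "of", "a", "an", "and"}


def toy_stem(token):
    """Very rough plural stripping; a stand-in for the real stemmer."""
    if token.endswith("ies") and len(token) > 4:
        return token[:-3] + "y"
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token


def analyze(text):
    """Lowercase, split on whitespace, drop stopwords, then stem."""
    return [toy_stem(t) for t in text.lower().split() if t not in STOPWORDS]


def matches_whole_document(query, doc_text):
    """True only if query and document yield the same set of tokens."""
    return set(analyze(query)) == set(analyze(doc_text))


DOCS = {
    "doc1": "ministry",
    "doc2": "justice",
    "doc3": "ministry of justice",
    "doc4": "chef of ministry of justice",  # hypothetical extra document
}


def search(query):
    """Return the ids of documents whose tokens all match the query."""
    return [doc_id for doc_id, text in DOCS.items()
            if matches_whole_document(query, text)]


for q in ("the justice ministry", "ministries", "justices"):
    print(q, "->", search(q))
```

With this rule, "the justice ministry" returns only doc3, "ministries" only doc1, and "justices" only doc2, which is the behavior I am trying to get from a single Elasticsearch query.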

Thanks a lot for your help,
