Multiple complex exact phrase searches


(Douglas Menegon Cordeiro) #1

Hi, I have built an Elasticsearch index with news from several years. Before that I used Sphinx to run retrospective searches on the content, using very specific and complex filters.

Today I can search for several words/phrases in the news and exclude the news that contain words/phrases I don't want.

Something like this:

{
  "index": [
    "index_2014*,index_2015*,index_2011*,index_2001*"
  ],
  "type": "my_data",
  "fields": [
    "id_data"
  ],
  "size": 1000,
  "body": {
    "query": {
      "bool": {
        "must": [
          {
            "multi_match": {
              "query": [
                "Phrase that I'm searching"
              ],
              "fields": [
                "title",
                "text"
              ],
              "type": "phrase"
            }
          }
        ],
        "must_not": [
          {
            "multi_match": {
              "query": [
                "Phrase that must not contain",
                "Another phrase that must not contain",
                "word"
              ],
              "fields": [
                "title",
                "text"
              ],
              "type": "phrase"
            }
          }
        ]
      }
    }
  }
}

What I can't do is the following. For each primary word/phrase I'm searching for in the news, I have a list of words/phrases that I call "filters", and a news item must contain one or more of those "filters" to match. If I use should I can almost simulate that, but it still returns news that do not contain any of the "filters".

Let me show a working example.
Elasticsearch has 1M news items that contain "CACAU".
My "should contain" filters for "CACAU" are "Medical", "Health", "life style", "America", "Americas".
My "must not contain" filters for "CACAU" are "Chocolate", "Industry", "WallMart".
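
To make the intended rule explicit (primary term present, at least one "should contain" filter present, no "must not contain" filter present), here is a minimal Python sketch. It is only an illustration of the logic, not how Elasticsearch evaluates text: it assumes plain case-insensitive substring matching, and the function name `matches` is hypothetical.

```python
def matches(text, primary, should_filters, must_not_filters):
    """Return True when `text` satisfies the rule described in this post.

    Assumption: case-insensitive substring matching stands in for
    Elasticsearch phrase matching, which works on analyzed tokens.
    """
    lowered = text.lower()

    def has(phrase):
        return phrase.lower() in lowered

    # The primary word/phrase is mandatory.
    if not has(primary):
        return False
    # Any excluded filter rejects the news item.
    if any(has(phrase) for phrase in must_not_filters):
        return False
    # At least one "should contain" filter is required.
    return any(has(phrase) for phrase in should_filters)


SHOULD = ["Medical", "Health", "life style", "America", "Americas"]
MUST_NOT = ["Chocolate", "Industry", "WallMart"]

print(matches("Cacau is good for mind health", "CACAU", SHOULD, MUST_NOT))
print(matches("CACAU News", "CACAU", SHOULD, MUST_NOT))
```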

Below I show how these should work:

"Cacau is good for mind health" => OK
"Cacau is cheap at wallmart" => NOT OK
"Majority of the people don't consume cacau, cacau is bad for you says doctor Alfred" => NOT OK
"I want Cacau, I want Chocolate" => NOT OK
"CACAU News" => NOT OK

How can I achieve this behavior in Elasticsearch?
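
One direction that may be worth trying, given as a sketch and not a confirmed answer: in a bool query that already has a must clause, should clauses are optional by default, which would explain why news without any "filter" still come back. The bool query accepts a `minimum_should_match` parameter that can require at least one should clause to match. The Python below only builds such a query body. The helper names `phrase_clause` and `build_query` are hypothetical, each phrase is passed as a single string (an assumption that differs from the array form used above), and it has not been tested against my cluster.

```python
import json


def phrase_clause(phrase, fields=("title", "text")):
    """One phrase-type multi_match clause over the given fields."""
    return {
        "multi_match": {
            "query": phrase,
            "fields": list(fields),
            "type": "phrase",
        }
    }


def build_query(primary, should_filters, must_not_filters):
    """Build a bool query body: primary required, one filter required, exclusions forbidden."""
    return {
        "query": {
            "bool": {
                "must": [phrase_clause(primary)],
                "should": [phrase_clause(p) for p in should_filters],
                # Require at least one of the should clauses to match.
                "minimum_should_match": 1,
                "must_not": [phrase_clause(p) for p in must_not_filters],
            }
        }
    }


body = build_query(
    "CACAU",
    ["Medical", "Health", "life style", "America", "Americas"],
    ["Chocolate", "Industry", "WallMart"],
)
print(json.dumps(body, indent=2))
```

The resulting dictionary would go under the "body" key of the request shown earlier.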

