Multi-match query


(Philippe) #1

Hi All,
I am trying to find a value (stored in different fields) in two indexes, and to return false if the value is not present in both indexes.
Here is my example.
I have an index (known_issues) filled with the errors that have already been identified.

For example, 2 issues:

    POST /known_issues/doc
    {
      "id": "issue1",
      "description": "file not found"
    }

POST /known_issues/doc
{
  "id": "issue2",
  "description": "syntax error"
}

And another index holds the log of each test:

POST /test_log/doc
{
  "id": "id1",
  "test_case": "a test case",
  "result": "syntax error"
}


POST /test_log/doc
{
  "id": "id2",
  "test_case": "another test case",
  "result": "success"
}

I would like a single query that finds all tests (from test_log) whose result is an issue already identified (in known_issues).

My query is:

GET /known_issues,test_log/_search
{
  "query": {
    "multi_match": {
      "query": "syntax error",
      "type": "phrase",
      "fields": [
        "description",
        "result"
      ],
      "operator": "and"
    }
  }
}

I'm really happy, the request finds 2 hits, one in each index.

{
  "took": 0,
   ....
  },
  "hits": {
    "total": 2,
    "hits": [
      {
        "_index": "known_issues",
        "_source": {
          "id": "issue2",
          "description": "syntax error"
        }
      },
      {
        "_index": "test_log",
        "_source": {
          "id": "id1",
          "test_case": "a test case",
          "result": "syntax error"
        }
      }
    ]
  }
}

BUT :(
this request still returns 1 hit when there is no matching entry in the known_issues index! Why?!
The "and" operator did not do what I was expecting.
For example:

DELETE /known_issues

POST /known_issues/doc
{
  "id": "issue1",
  "description": "file not found"
}

POST /known_issues/doc
{
  "id": "issue2",
  "description": "syntax_ERROR"
}


My query now returns:
{
  "took": 0,
  "timed_out": false,
  "_shards": {...},
  "hits": {
    "total": 1,
    "max_score": 0.5753642,
    "hits": [
      {
        "_index": "test_log",
        "_type": "doc",
        "_id": "MkOK1WMBOroXuELibpK5",
        "_score": 0.5753642,
        "_source": {
          "id": "id1",
          "test_case": "a test case",
          "result": "syntax error"
        }
      }
    ]
  }
}

I tried bool queries with must, should, etc., and nothing works!
Should I change my approach? In a Python script, I could fetch all known issues and run a query against test_log for each of them, but then I lose the power of ES.
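The script-based fallback could be sketched roughly as follows. This is only a minimal illustration under assumptions: the descriptions are taken to have been fetched from known_issues in a first request (not shown), and `build_known_issue_query` is a hypothetical helper name. It builds one search body for test_log that matches any known description as a phrase in `result`, instead of sending one query per issue.

```python
import json

def build_known_issue_query(descriptions):
    """Build a search body matching test_log documents whose `result`
    contains at least one of the given descriptions as a phrase."""
    return {
        "query": {
            "bool": {
                # One match_phrase clause per known issue description.
                "should": [
                    {"match_phrase": {"result": description}}
                    for description in descriptions
                ],
                # At least one of the clauses has to match.
                "minimum_should_match": 1,
            }
        }
    }

# Assumption: these values came from a first search on known_issues.
descriptions = ["file not found", "syntax error"]
body = build_known_issue_query(descriptions)

# The body would then be sent to GET /test_log/_search.
print(json.dumps(body, indent=2))
```

With a very long list of known issues, the number of bool clauses may hit the cluster's clause limit, so this probably only suits a modest number of issues.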
Any help will be welcome.
Thanks,
Philippe.


(David Pilato) #2

Have a look at this:

POST _analyze
{
  "text": [ "syntax_ERROR" ]
}
POST _analyze
{
  "text": [ "syntax ERROR" ]
}

This is producing:

# POST _analyze
{
  "tokens": [
    {
      "token": "syntax_error",
      "start_offset": 0,
      "end_offset": 12,
      "type": "<ALPHANUM>",
      "position": 0
    }
  ]
}

# POST _analyze
{
  "tokens": [
    {
      "token": "syntax",
      "start_offset": 0,
      "end_offset": 6,
      "type": "<ALPHANUM>",
      "position": 0
    },
    {
      "token": "error",
      "start_offset": 7,
      "end_offset": 12,
      "type": "<ALPHANUM>",
      "position": 1
    }
  ]
}

As you can see, the indexed terms and the searched terms are totally different.
The query "syntax error" is analyzed into the two terms syntax and error, while the known_issues document was indexed as the single term syntax_error, so it cannot match.
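The same effect can be reproduced outside Elasticsearch with a small Python sketch. Note that `approx_standard_analyze` is a hypothetical helper and only a rough approximation of the standard analyzer (the real one follows Unicode text segmentation rules). It relies on the fact that `\w` treats the underscore as a word character, which mirrors the output shown above.

```python
import re

def approx_standard_analyze(text):
    """Rough stand-in for the standard analyzer (assumption): keep runs of
    letters, digits, and underscores, then lowercase each token."""
    return [token.lower() for token in re.findall(r"\w+", text)]

indexed_terms = approx_standard_analyze("syntax_ERROR")
query_terms = approx_standard_analyze("syntax error")

print(indexed_terms)  # ['syntax_error']
print(query_terms)    # ['syntax', 'error']

# No term in common, so the document cannot match the query.
print(set(indexed_terms) & set(query_terms))  # set()
```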


(Philippe) #3

Thanks David,

I understand your example,
but what I don't understand is why the query finds 1 hit.
The different input in known_issues is intentional: it was meant as an example where nothing matches, so I was expecting 0 hits.

What I want is:
If a result in test_log AND a description in known_issues both match, my query should return TRUE.
BUT if no entry in known_issues matches, my query should return FALSE.

I use the type "phrase" in order to match the exact string. The description/result is not always a simple "syntax error"; it can be a long text.

/philippe.


(David Pilato) #4

Because the document matches. You are searching for syntax error and test_log/doc/MkOK1WMBOroXuELibpK5 is:

{
      "id": "id1",
      "test_case": "a test case",
      "result": "syntax error"
}
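To illustrate why one hit is the expected outcome, here is a minimal Python sketch (not Elasticsearch code) of a search that tests every document on its own. The `tokens` helper is only a rough approximation of the standard analyzer, and `phrase_in` and `search` are hypothetical names used for illustration.

```python
import re

def tokens(text):
    # Assumption: rough approximation of the standard analyzer.
    return [t.lower() for t in re.findall(r"\w+", text)]

def phrase_in(field_value, query):
    """True if the query tokens appear consecutively in the field tokens."""
    field, wanted = tokens(field_value), tokens(query)
    return any(field[i:i + len(wanted)] == wanted
               for i in range(len(field) - len(wanted) + 1))

def search(docs, query, fields):
    """Each document is tested on its own; a hit in one index never
    depends on whether another index also has a hit."""
    return [d for d in docs
            if any(phrase_in(d.get(f, ""), query) for f in fields)]

known_issues = [
    {"id": "issue1", "description": "file not found"},
    {"id": "issue2", "description": "syntax_ERROR"},
]
test_log = [
    {"id": "id1", "test_case": "a test case", "result": "syntax error"},
    {"id": "id2", "test_case": "another test case", "result": "success"},
]

hits = search(known_issues + test_log, "syntax error",
              ["description", "result"])
print([h["id"] for h in hits])  # ['id1']
```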

You cannot do joins in Elasticsearch, so this is not really doable in one request IMO.
It's always better to perform the join at index time while injecting your data.

For example, index:

POST /test_log/doc
{
  "id": "id1",
  "test_case": "a test case",
  "result": "syntax error",
  "known": true
}


POST /test_log/doc
{
  "id": "id2",
  "test_case": "another test case",
  "result": "success",
  "known": false
}

How to compute known? Well, by doing lookups at index time.
I described something like this (not the same use case though) in a recent blog post: https://www.elastic.co/blog/enriching-your-postal-addresses-with-the-elastic-stack-part-2

Maybe that could help.
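As a rough illustration of such an index-time lookup, here is a minimal Python sketch. It works on in-memory data only: in a real setup the known descriptions would presumably come from a search on known_issues, and the enriched documents would then be sent to test_log. The `tokens` helper is an approximation of the standard analyzer, and `contains_phrase` and `enrich` are hypothetical names.

```python
import re

def tokens(text):
    # Assumption: rough approximation of the standard analyzer.
    return [t.lower() for t in re.findall(r"\w+", text)]

def contains_phrase(text, phrase):
    """True if the tokens of `phrase` occur consecutively in `text`."""
    hay, needle = tokens(text), tokens(phrase)
    if not needle:
        return False
    return any(hay[i:i + len(needle)] == needle
               for i in range(len(hay) - len(needle) + 1))

def enrich(test_doc, known_descriptions):
    """Return a copy of test_doc with a boolean `known` field added."""
    enriched = dict(test_doc)
    enriched["known"] = any(
        contains_phrase(test_doc["result"], description)
        for description in known_descriptions
    )
    return enriched

# Assumption: descriptions previously read from known_issues.
known_descriptions = ["file not found", "syntax error"]
raw_docs = [
    {"id": "id1", "test_case": "a test case", "result": "syntax error"},
    {"id": "id2", "test_case": "another test case", "result": "success"},
]

# These are the documents that would be indexed into test_log.
docs_to_index = [enrich(d, known_descriptions) for d in raw_docs]
for doc in docs_to_index:
    print(doc["id"], doc["known"])
```

Once the flag is stored, finding the tests with an already identified issue becomes a simple filter on known in test_log.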


(Philippe) #5

Thank you David, it's working fine!

While indexing test_log, I now check whether a known issue matches the error log.
I adapted your example to my case.

Have a nice week-end !
Best regards.

