Watcher searching for query term containing hyphens

Hi

I am trying to return the number of errors reported against a specific application in our stack. The search I am using is:

        "search" : {
            "request" : { 
                "indices" : "development-*",
                "body" : {
                    "query" : {
                        "bool" : {
                            "must" : [
                                {
                                    "match" : {"apigw.log_level": "ERROR"}
                                },
                                {
                                    "match" : {"message": 
                                      {
                                        "query": "ab-p-some-api*",
                                        "operator" : "and"
                                      }
                                    }
                                },
                                {
                                    "match" : {"tags": "apigw"}
                                }
                            ],
                            "filter" : {
                                "range" : {
                                    "@timestamp" : {
                                        "from" : "now-5m",
                                        "to" : "now"
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    },

However, it returns more results than I expect, because the hyphens in the query string are ignored.

Can anyone advise how I can make the search match the exact query string? The API version is appended to it, hence the * at the end of ab-p-some-api*.

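If you do not have any special mapping, the default standard analyzer splits text on hyphens and discards them before the data is stored in the inverted index. The same analysis is applied to the query string of a match query, so your search looks for the separate terms ab, p, some and api anywhere in the message, and the trailing * is dropped rather than treated as a wildcard. See this example: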

GET _analyze
{
  "text": "one-two-three"
}

returns

{
  "tokens" : [
    {
      "token" : "one",
      "start_offset" : 0,
      "end_offset" : 3,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "two",
      "start_offset" : 4,
      "end_offset" : 7,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "three",
      "start_offset" : 8,
      "end_offset" : 13,
      "type" : "<ALPHANUM>",
      "position" : 2
    }
  ]
}
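
One option that needs no mapping change, sketched here on the assumption that `message` is a text field using the default `standard` analyzer: because the tokens keep their positions, a `match_phrase_prefix` clause can require `ab`, `p`, `some` and `api` to appear next to each other and in that order, with the last token treated as a prefix. The clause below would replace the `match` on `message` in the watch; the trailing `*` is left out because it is not interpreted as a wildcard here.

```json
{
  "match_phrase_prefix" : {
    "message" : {
      "query" : "ab-p-some-api"
    }
  }
}
```

This still ignores the hyphens themselves, so a message containing "ab p some api" would match as well. It only rules out messages where the four terms are scattered or in a different order.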

If you want to keep those hyphens, you need a different analyzer or field type for that field, so it is worth taking some time to read about text analysis in Elasticsearch.
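
As a starting point, the same `_analyze` request can be rerun with the built-in `whitespace` analyzer, which splits only on whitespace and therefore leaves the hyphens in place. This is just an illustration of the difference, not a recommendation for your mapping:

```console
GET _analyze
{
  "analyzer": "whitespace",
  "text": "one-two-three"
}
```

This should come back as a single token, one-two-three, instead of three. Note that the `whitespace` analyzer does not lowercase, and that changing the analyzer of an existing field means updating the mapping and reindexing the data.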

The next minor version of Elasticsearch will also feature a wildcard data type that could help you here.
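
A rough sketch of how that could look, assuming your cluster runs a release that includes the `wildcard` field type. The index name `my-index` and the field name `api_name` are made up for the example, and it assumes the API name is stored in its own field rather than inside `message`:

```console
PUT my-index
{
  "mappings": {
    "properties": {
      "api_name": { "type": "wildcard" }
    }
  }
}

GET my-index/_search
{
  "query": {
    "wildcard": {
      "api_name": { "value": "ab-p-some-api*" }
    }
  }
}
```

A wildcard pattern is matched against the whole field value, so if the name sits in the middle of a longer string the pattern would need a leading `*` too.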
