Can't exactly match a keyword that ends with an equals sign

aqiao · April 14, 2016, 3:02am

I want to get all the events whose Item equals EUR=. In the Kibana 4.4.1 search box, I entered the filter Item:"EUR=", but I get the following:
EUR=
EUR=Q
EUR=P
I only want the EUR= events. I checked the HTML source in Chrome and found some interesting code:

<td class="discover-table-datafield"><mark>EUR</mark>=</td>

But when I type Item:"PQR=", it works fine. Below is the HTML source:

<td class="discover-table-datafield">PQR=</td>

So is "EUR" a keyword in Kibana, or am I making a mistake?

shaunak · April 15, 2016, 6:27pm

I think your post got mangled somehow, specifically the = bits after "interesting code:" and "other fields are:". Can you edit your post so it displays what you were trying to show?

aqiao · April 20, 2016, 9:38am

Hi shaunak, sorry for the late reply! I have edited the post and hope to hear from you. Thanks.

shaunak · April 20, 2016, 1:11pm

Hmm... that's interesting. Can you post the mapping of your Elasticsearch index here, please? You can view the mapping by calling the GET {index-name}/_mapping REST API.

aqiao · April 21, 2016, 6:50am

    {
      "tolreport-2016.04.20" : {
        "mappings" : {
          "tolcheck" : {
            "properties" : {
              "@timestamp" : {
                "type" : "date",
                "format" : "strict_date_optional_time||epoch_millis"
              },
              "@version" : {
                "type" : "string"
              },
              "Contributer" : {
                "type" : "string"
              },
              "Description" : {
                "type" : "string"
              },
              "Detail" : {
                "type" : "string"
              },
              "EventTime" : {
                "type" : "date",
                "format" : "strict_date_optional_time||epoch_millis"
              },
              "Item" : {
                "type" : "string"
              },
              "Type" : {
                "type" : "string"
              },
              "beat" : {
                "properties" : {
                  "hostname" : {
                    "type" : "string"
                  },
                  "name" : {
                    "type" : "string"
                  }
                }
              },
              "count" : {
                "type" : "long"
              },
              "host" : {
                "type" : "string"
              },
              "input_type" : {
                "type" : "string"
              },
              "message" : {
                "type" : "string"
              },
              "offset" : {
                "type" : "long"
              },
              "source" : {
                "type" : "string"
              },
              "tags" : {
                "type" : "string"
              },
              "type" : {
                "type" : "string"
              }
            }
          }
        }
      }
    }

Above is the mapping. Thanks, shaunak.

shaunak · April 21, 2016, 4:15pm

Thanks. Can you try calling the analyze API with the text you are searching for, like this:

curl -XGET 'http://localhost:9200/tolreport-2016.04.20/_analyze' -d '
{
  "field" : "Item",
  "text" : "EUR="
}'

and also:

curl -XGET 'http://localhost:9200/tolreport-2016.04.20/_analyze' -d '
{
  "field" : "Item",
  "text" : "PQR="
}'

aqiao · April 22, 2016, 8:56am

For the first GET request (EUR=):

{"tokens":[{"token":"eur","start_offset":0,"end_offset":3,"type":"<ALPHANUM>","position":0}]}

For the second GET request (XAUALL=; there is no PQR= in the current ES index, so I used XAUALL= instead):

{"tokens":[{"token":"xauall","start_offset":0,"end_offset":6,"type":"<ALPHANUM>","position":0}]}

aqiao · April 22, 2016, 9:09am

Hi, after trying a few times, I found that the only difference between "EUR=" and "XAUALL=" is that there are events similar to "EUR=", such as "EUR=M", "EUR=W" and so on, but none for "XAUALL=".
I mean there are no events like "XAUALL=M" or "XAUALL=W".
So is this a Lucene bug?

Thanks!

shaunak · April 27, 2016, 1:58pm

Hi @aqiao,

This is not a bug; it is working as expected. Let me attempt to explain.

As indicated in your mapping, the "Item" field does not specify an analyzer to use. This means Elasticsearch (really Lucene) will use the default, which is the standard analyzer. This analyzer, amongst other things, tokenizes the input string on the = sign. So the string EUR= is analyzed into one token, eur, while the string EUR=W is analyzed into two tokens, eur and w. These analyzed tokens are stored in Lucene's inverted index, which is used at search time.

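At search time, the string you want to search on goes through the same analysis process. So searching for the string EUR= causes Lucene to look up the single token eur in the inverted index, and every document whose Item produced that token matches. (Likewise, searching for EUR=W would look for eur or w.) That's why you are seeing results with EUR=, EUR=Q, EUR=P, etc. XAUALL= only appeared to match exactly because no other events in your index contain the token xauall.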
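
To make the analysis step concrete, here is a small Python sketch that imitates the behaviour described above. It is only an approximation: `analyze`, `build_index` and `search` are hypothetical helpers, and `analyze` simply lowercases and splits on non-alphanumeric ASCII characters. That reproduces the _analyze output posted earlier in this thread for these inputs, but it is not the real standard analyzer.

```python
import re
from collections import defaultdict


def analyze(text):
    """Rough stand-in for the standard analyzer:
    lowercase the text and split it on non-alphanumeric characters."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def build_index(docs):
    """Build an inverted index: token -> set of ids of docs containing it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for token in analyze(text):
            index[token].add(doc_id)
    return index


def search(index, docs, query):
    """Analyze the query the same way and return docs matching any token."""
    hits = set()
    for token in analyze(query):
        hits |= index.get(token, set())
    return [docs[i] for i in sorted(hits)]


# Item values taken from this thread.
items = ["EUR=", "EUR=Q", "EUR=P", "XAUALL="]
index = build_index(items)

print(analyze("EUR="))                   # ['eur']
print(search(index, items, "EUR="))      # ['EUR=', 'EUR=Q', 'EUR=P']
print(search(index, items, "XAUALL="))   # ['XAUALL=']
```

Because the = sign never reaches the index, the query EUR= cannot be told apart from a query for EUR, which is why all three EUR items come back.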

If you are looking to perform exact matches, you should index the "Item" field as index: not_analyzed in your mapping. I would recommend reading this section of the Elasticsearch Definitive Guide: https://www.elastic.co/guide/en/elasticsearch/guide/current/mapping-analysis.html
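
As a rough illustration of that suggestion, the sketch below builds an index template body in the Elasticsearch 2.x mapping syntax that the mapping in this thread appears to use, and prints it as JSON. The template name tolreport and the pattern tolreport-* are assumptions guessed from the index name above, not something confirmed in the thread. Note that a template only applies to indices created after it is installed, and the index setting of an existing field cannot be changed in place, so existing data would need to be reindexed.

```python
import json

# Assumption: daily indices are named like "tolreport-2016.04.20",
# so the pattern "tolreport-*" is a guess based on this thread.
template_body = {
    "template": "tolreport-*",
    "mappings": {
        "tolcheck": {
            "properties": {
                # not_analyzed stores the whole value ("EUR=") as one term,
                # so Item:"EUR=" would no longer match "EUR=Q".
                "Item": {"type": "string", "index": "not_analyzed"}
            }
        }
    },
}

# Print the JSON body. It could then be saved to a file and sent with, e.g.:
#   curl -XPUT 'http://localhost:9200/_template/tolreport' -d @template.json
print(json.dumps(template_body, indent=2))
```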

aqiao · May 3, 2016, 7:28am

Thanks @shaunak, I'll try that.