Can't exactly match a keyword that ends with an equals sign


(neal) #1

I want to get all the events whose Item equals EUR=. In the Kibana 4.4.1 search box, I entered the filter Item:"EUR=", but I get the following:
EUR=
EUR=Q
EUR=P
I only want the EUR= events. I checked the HTML source in Chrome and found some interesting code:

<td class="discover-table-datafield"><mark>EUR</mark>=</td>

But when I type Item:"PQR=", it works fine. Below is the HTML source:

<td class="discover-table-datafield">PQR=</td>

So is "EUR" a keyword in Kibana, or have I made a mistake?


(Shaunak Kashyap) #2

I think your post got mangled somehow, specifically the = bits after "interesting code:" and "other fields are:". Can you edit your post so it displays what you were trying to show?


(neal) #3

Hi Shaunak, sorry for the late reply! I have edited the post and look forward to your answer. Thanks.


(Shaunak Kashyap) #4

Hmm... that's interesting. Can you post the mapping of your Elasticsearch index here, please? You can view the mapping by calling the GET {index-name}/_mapping REST API.


(neal) #5

    {
      "tolreport-2016.04.20" : {
        "mappings" : {
          "tolcheck" : {
            "properties" : {
              "@timestamp" : {
                "type" : "date",
                "format" : "strict_date_optional_time||epoch_millis"
              },
              "@version" : {
                "type" : "string"
              },
              "Contributer" : {
                "type" : "string"
              },
              "Description" : {
                "type" : "string"
              },
              "Detail" : {
                "type" : "string"
              },
              "EventTime" : {
                "type" : "date",
                "format" : "strict_date_optional_time||epoch_millis"
              },
              "Item" : {
                "type" : "string"
              },
              "Type" : {
                "type" : "string"
              },
              "beat" : {
                "properties" : {
                  "hostname" : {
                    "type" : "string"
                  },
                  "name" : {
                    "type" : "string"
                  }
                }
              },
              "count" : {
                "type" : "long"
              },
              "host" : {
                "type" : "string"
              },
              "input_type" : {
                "type" : "string"
              },
              "message" : {
                "type" : "string"
              },
              "offset" : {
                "type" : "long"
              },
              "source" : {
                "type" : "string"
              },
              "tags" : {
                "type" : "string"
              },
              "type" : {
                "type" : "string"
              }
            }
          }
        }
      }
    }

Above is the mapping. Thanks, Shaunak.


(Shaunak Kashyap) #6

Thanks. Can you try calling the analyze API with the text you are searching for, like this:

curl -XGET 'http://localhost:9200/tolreport-2016.04.20/_analyze' -d '
{
  "field" : "Item",
  "text" : "EUR="
}'

and also:

curl -XGET 'http://localhost:9200/tolreport-2016.04.20/_analyze' -d '
{
  "field" : "Item",
  "text" : "PQR="
}'

(neal) #7

For the first GET request (EUR=):

{"tokens":[{"token":"eur","start_offset":0,"end_offset":3,"type":"<ALPHANUM>","position":0}]}

For the second GET request (XAUALL=; there is no PQR= in the current ES index, so I used XAUALL= instead):

{"tokens":[{"token":"xauall","start_offset":0,"end_offset":6,"type":"<ALPHANUM>","position":0}]}


(neal) #8

Hi, after a few more tries, I found that the only difference between "EUR=" and "XAUALL=" is that there are events similar to "EUR=", such as "EUR=M", "EUR=W" and so on, but none similar to "XAUALL=".
I mean there are no events like "XAUALL=M" or "XAUALL=W".
So is this a Lucene bug?

Thanks!


(Shaunak Kashyap) #9

Hi @aqiao,

This is not a bug; it is working as expected. Let me attempt to explain.

As indicated in your mapping, the "Item" field does not specify an analyzer to use. This means Elasticsearch (really Lucene) will use the default, which is the standard analyzer. This analyzer, amongst other things, tokenizes the input string on the = sign. So the string EUR= is analyzed into one token, eur, while the string EUR=W is analyzed into two tokens, eur and w. These analyzed tokens are stored in Lucene's inverted index, which is used at search time.

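At search time, the string you want to search on goes through the same analysis process. So searching for the string EUR= looks up the single token eur, which matches every document whose "Item" contains that token, and searching for the string EUR=W causes Lucene to search in the inverted index for eur or w. That's why you are seeing results with EUR=, EUR=W, EUR=Q, etc. XAUALL= only appears to match exactly because no other "Item" value in your index produces the token xauall.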
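To make the index-time and search-time behaviour concrete, here is a small Python sketch of a toy inverted index. It is only an illustration: `approx_standard_analyze` is a hypothetical stand-in that splits on non-alphanumeric characters and lowercases, which approximates what the standard analyzer does for these particular strings but is not Lucene's actual tokenizer.

```python
import re


def approx_standard_analyze(text):
    """Rough stand-in for the standard analyzer (assumption, not Lucene's real
    tokenizer): keep runs of ASCII letters/digits and lowercase them."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]


# Toy inverted index: token -> set of original Item values containing it.
items = ["EUR=", "EUR=Q", "EUR=P", "XAUALL="]
inverted_index = {}
for item in items:
    for token in approx_standard_analyze(item):
        inverted_index.setdefault(token, set()).add(item)


def search(query):
    """Analyze the query the same way, then return every Item that shares
    at least one token with it."""
    hits = set()
    for token in approx_standard_analyze(query):
        hits |= inverted_index.get(token, set())
    return sorted(hits)


print(approx_standard_analyze("EUR="))  # the "=" is dropped during analysis
print(search("EUR="))                   # every Item that produced the token "eur"
print(search("XAUALL="))                # only one Item produced "xauall"
```

Because the `=` never reaches the index, a query for `EUR=` cannot be told apart from a query for `eur`.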

If you are looking to perform exact matches, you should index the "Item" field as index: not_analyzed in your mapping. I would recommend reading this section of the Elasticsearch Definitive Guide: https://www.elastic.co/guide/en/elasticsearch/guide/current/mapping-analysis.html
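As a rough sketch of what that could look like for an Elasticsearch 2.x index, the Python snippet below just builds and prints the two request bodies. The index name `tolreport-exact` is a hypothetical placeholder. The mapping of an existing field cannot be changed in place, so the assumption here is that the data would be reindexed into a newly created index.

```python
import json

# Hypothetical name for a fresh index (assumption); existing documents would
# have to be reindexed into it, since the "Item" mapping of the current index
# cannot be changed in place.
NEW_INDEX = "tolreport-exact"

# Index-creation body: store "Item" as a single, unanalyzed term.
create_index_body = {
    "mappings": {
        "tolcheck": {
            "properties": {
                "Item": {"type": "string", "index": "not_analyzed"}
            }
        }
    }
}

# A term query is not analyzed, so it compares against the stored value as-is.
exact_query_body = {"query": {"term": {"Item": "EUR="}}}

print("PUT /" + NEW_INDEX)
print(json.dumps(create_index_body, indent=2))
print("GET /" + NEW_INDEX + "/_search")
print(json.dumps(exact_query_body, indent=2))
```

With a mapping like this, the whole value `EUR=` is kept as one term, so `EUR=Q` and `EUR=P` are different terms and should no longer match a search for `EUR=`.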


(neal) #10

Thanks @shaunak, I'll try that.

