We have FIX protocol log messages that look like "8=FIX.4.4|9=67|35=5|49=OMNIMKT001|56=GEMINIMKT|34=357821|52=20181109-23:00:59.800|10=056|". We would like to search for tag=value strings such as "35=5". To do this, I created a custom analyzer in my proof of concept:
PUT /tokenizer-test-2018.11.14/
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "pattern",
          "pattern": "\\|"
        }
      }
    }
  }
}
I populated the index with a couple of messages:
POST /tokenizer-test-2018.11.14/_doc
{
  "@timestamp": "2018-11-04T11:26:45-08:00",
  "message": "8=FIX.4.4|9=67|35=5|49=OMNIMKT001|56=GEMINIMKT|34=357821|52=20181109-23:00:59.800|10=056|"
}

POST /tokenizer-test-2018.11.14/_doc
{
  "@timestamp": "2018-11-04T11:26:45-08:00",
  "message": "8=FIX.4.4|9=67|35=0|49=OMNIMKT001|56=GEMINIMKT|34=357821|52=20181109-23:00:59.800|10=056|"
}
I used the _analyze API to verify that tokenization occurs as expected, and 35=5 does come out as a single token.
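For reference, the _analyze request I ran was along these lines (the text shown is just one of the sample messages):

GET /tokenizer-test-2018.11.14/_analyze
{
  "analyzer": "my_analyzer",
  "text": "8=FIX.4.4|9=67|35=5|49=OMNIMKT001|56=GEMINIMKT|34=357821|52=20181109-23:00:59.800|10=056|"
}

The response lists "35=5" as one of the tokens, so the pattern tokenizer itself appears to be splitting on the pipes correctly.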
However, when I search:
GET /tokenizer-test-2018.11.14/_search
{
  "query": {
    "match": {
      "message": {
        "query": "35=5",
        "analyzer": "my_analyzer"
      }
    }
  }
}
I get 0 hits. When I remove the analyzer from the search, I get both documents: the one with 35=0 (which we don't want) and the one with 35=5 (which we do want).