Filebeat ingest pipeline for Elasticsearch slow logs

Hello, I've found a bug (and a possible fix) where the Elasticsearch ingest pipeline does not capture data from some of the regex groups in the grok processor.

My setup: Elasticsearch, Kibana, and Filebeat, all on version 7.6.2.

I have this in elasticsearch/log4j2.properties

### Search Slowlog ###
appender.index_search_slowlog_rolling.type = RollingFile
appender.index_search_slowlog_rolling.name = index_search_slowlog_rolling
appender.index_search_slowlog_rolling.fileName = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_index_search_slowlog.json
appender.index_search_slowlog_rolling.layout.type = ESJsonLayout
appender.index_search_slowlog_rolling.layout.type_name = index_search_slowlog
appender.index_search_slowlog_rolling.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}_index_search_slowlog-%i.json.gz
appender.index_search_slowlog_rolling.policies.type = Policies
appender.index_search_slowlog_rolling.policies.size.type = SizeBasedTriggeringPolicy
appender.index_search_slowlog_rolling.policies.size.size = 512MB
appender.index_search_slowlog_rolling.strategy.type = DefaultRolloverStrategy
appender.index_search_slowlog_rolling.strategy.max = 4

logger.index_search_slowlog_rolling.name = index.search.slowlog
logger.index_search_slowlog_rolling.level = trace
logger.index_search_slowlog_rolling.appenderRef.index_search_slowlog_rolling.ref = index_search_slowlog_rolling
logger.index_search_slowlog_rolling.additivity = false
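
Worth noting: this appender only receives events once slowlog thresholds are set on the index. For reference, a minimal setup (my-index is just a placeholder index name here) looks something like:

```console
PUT /my-index/_settings
{
  "index.search.slowlog.threshold.query.warn": "10s",
  "index.search.slowlog.threshold.query.trace": "500ms",
  "index.search.slowlog.level": "trace"
}
```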

This is the Filebeat configuration:

filebeat.inputs:
filebeat.config:
  modules:
    path: /etc/filebeat/modules.d/elasticsearch-elasticsearch.yml
    reload.enabled: true
    reload.period: 10s
output.elasticsearch:
  hosts: ["localhost:9200"]
  loadbalance: false
  index: elasticsearch-logs-%{+YYYY.MM.dd}

  bulk_max_size: 0

filebeat.shutdown_timeout: 10s

logging.to_files: true
logging.files:
  name: filebeat-elasticsearch.log

tags: []

setup:
  template:
    enabled: false
  ilm:
    enabled: false

http.enabled: true
http.host: localhost
http.port: 15047

monitoring.enabled: false

Next, here is the relevant part of the ingest pipeline named filebeat-7.6.2-elasticsearch-slowlog-pipeline-json:

      {
        "grok": {
          "field": "elasticsearch.slowlog.message",
          "pattern_definitions": {
            "GREEDYMULTILINE": "(.|\n)*",
            "INDEXNAME": "[a-zA-Z0-9_.-]*"
          },
          "patterns": [
            "(\\[%{INDEXNAME:elasticsearch.index.name}\\]\\[%{NUMBER:elasticsearch.shard.id}\\])?(%{SPACE})(\\[%{INDEXNAME:elasticsearch.index.name}\\/%{DATA:elasticsearch.index.id}\\])?(%{SPACE})%{SPACE}(took\\[%{DATA:elasticsearch.slowlog.took}\\],)?%{SPACE}(took_millis\\[%{NUMBER:elasticsearch.slowlog.duration:long}\\],)?%{SPACE}(type\\[%{DATA:elasticsearch.slowlog.type}\\],)?%{SPACE}(id\\[%{DATA:elasticsearch.slowlog.id}\\],)?%{SPACE}(routing\\[%{DATA:elasticsearch.slowlog.routing}\\],)?%{SPACE}(total_hits\\[%{NUMBER:elasticsearch.slowlog.total_hits:int}\\],)?%{SPACE}(types\\[%{DATA:elasticsearch.slowlog.types}\\],)?%{SPACE}(stats\\[%{DATA:elasticsearch.slowlog.stats}\\],)?%{SPACE}(search_type\\[%{DATA:elasticsearch.slowlog.search_type}\\],)?%{SPACE}(total_shards\\[%{NUMBER:elasticsearch.slowlog.total_shards:int}\\],)?%{SPACE}(source\\[%{GREEDYMULTILINE:elasticsearch.slowlog.source_query}\\])?,?%{SPACE}(extra_source\\[%{DATA:elasticsearch.slowlog.extra_source}\\])?,?",
            "\\[%{INDEXNAME:elasticsearch.index.name}\\]\\[%{NUMBER:elasticsearch.shard.id}\\]"
          ]
        }
      },

Next, the results from Kibana's Grok Debugger. I will use this line from https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-slowlog.html#_identifying_search_slow_log_origin as the test string:

[index6][0] took[78.4micros], took_millis[0], total_hits[0 hits], stats[], search_type[QUERY_THEN_FETCH], total_shards[1], source[{"query":{"match_all":{"boost":1.0}}}], id[MY_USER_ID],

After execution, I see this as a result:

{
  "elasticsearch": {
    "slowlog": {
      "duration": 0,
      "took": "78.4micros"
    },
    "index": {
      "name": "index6"
    },
    "shard": {
      "id": "0"
    }
  }
}

There is no search_type, total_hits, stats, or id. So I changed the pattern to this:

(\[%{INDEXNAME:elasticsearch.index.name}\]\[%{NUMBER:elasticsearch.shard.id}\])?(%{SPACE})(\[%{INDEXNAME:elasticsearch.index.name}\/%{DATA:elasticsearch.index.id}\])?(%{SPACE})%{SPACE}(took\[%{DATA:elasticsearch.slowlog.took}\],)?%{SPACE}(took_millis\[%{NUMBER:elasticsearch.slowlog.duration:long}\],)?%{SPACE}(type\[%{DATA:elasticsearch.slowlog.type}\],)?%{SPACE}(id\[%{DATA:elasticsearch.slowlog.id}\],)?%{SPACE}(routing\[%{DATA:elasticsearch.slowlog.routing}\],)?%{SPACE}(total_hits\[%{NUMBER:elasticsearch.slowlog.total_hits:int}\s*hits\],)?%{SPACE}(types\[%{DATA:elasticsearch.slowlog.types}\],)?%{SPACE}(stats\[%{DATA:elasticsearch.slowlog.stats}\],)?%{SPACE}(search_type\[%{DATA:elasticsearch.slowlog.search_type}\],)?%{SPACE}(total_shards\[%{NUMBER:elasticsearch.slowlog.total_shards:int}\],)?%{SPACE}(source\[%{GREEDYMULTILINE:elasticsearch.slowlog.source_query}\])?,?%{SPACE}(extra_source\[%{DATA:elasticsearch.slowlog.extra_source}\])?,?


(I added \s*hits in the total_hits section.)
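
The mismatch can be reproduced with plain Python re (a simplified stand-in for grok's %{NUMBER}, not the exact grok definition): the stock pattern expects total_hits[0], while the log line contains total_hits[0 hits],.

```python
import re

# Fragment of the sample slow log line above.
line = "took_millis[0], total_hits[0 hits], stats[],"

# Simplified stand-in for grok's %{NUMBER}.
NUMBER = r"\d+(?:\.\d+)?"

# Stock pattern: expects the bracket to close right after the number.
old = re.compile(r"total_hits\[(%s)\]," % NUMBER)
# Adjusted pattern: tolerates the " hits" suffix present in the log line.
new = re.compile(r"total_hits\[(%s)\s*hits\]," % NUMBER)

print(old.search(line))           # None: the " hits" suffix breaks the match
print(new.search(line).group(1))  # '0'
```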

Result:

{
  "elasticsearch": {
    "slowlog": {
      "duration": 0,
      "took": "78.4micros",
      "total_shards": 1,
      "source_query": "{\"query\":{\"match_all\":{\"boost\":1.0}}}], id[MY_USER_ID",
      "total_hits": 0,
      "stats": "",
      "search_type": "QUERY_THEN_FETCH"
    },
    "index": {
      "name": "index6"
    },
    "shard": {
      "id": "0"
    }
  }
}

It can now handle everything, but source_query got corrupted: the capture swallows the trailing ], id[MY_USER_ID part. After fixing the GREEDYMULTILINE capture, we get almost everything except the id:
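
To see why the capture overran, here is a minimal sketch in plain Python re (simplified patterns, not the exact grok definitions). GREEDYMULTILINE is defined as (.|\n)*, which is greedy, so it runs to the last ], in the line; making it non-greedy is one way to stop at the first closing bracket:

```python
import re

# Tail of the sample slow log line from the post.
msg = 'source[{"query":{"match_all":{"boost":1.0}}}], id[MY_USER_ID],'

greedy = re.compile(r'source\[((?:.|\n)*)\],')   # like GREEDYMULTILINE: runs to the last "],"
lazy = re.compile(r'source\[((?:.|\n)*?)\],')    # non-greedy: stops at the first "],"

print(greedy.search(msg).group(1))  # swallows "], id[MY_USER_ID"
print(lazy.search(msg).group(1))    # '{"query":{"match_all":{"boost":1.0}}}'
```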

{
  "elasticsearch": {
    "slowlog": {
      "duration": 0,
      "took": "78.4micros",
      "total_shards": 1,
      "source_query": "{\"query\":{\"match_all\":{\"boost\":1.0}}}",
      "total_hits": 0,
      "stats": "",
      "search_type": "QUERY_THEN_FETCH"
    },
    "index": {
      "name": "index6"
    },
    "shard": {
      "id": "0"
    }
  }
}

The final regex then looks like this:

(\[%{INDEXNAME:elasticsearch.index.name}\]\[%{NUMBER:elasticsearch.shard.id}\])?(%{SPACE})(\[%{INDEXNAME:elasticsearch.index.name}\/%{DATA:elasticsearch.index.id}\])?(%{SPACE})%{SPACE}(took\[%{DATA:elasticsearch.slowlog.took}\],)?%{SPACE}(took_millis\[%{NUMBER:elasticsearch.slowlog.duration:long}\],)?%{SPACE}(type\[%{DATA:elasticsearch.slowlog.type}\],)?%{SPACE}(id\[%{DATA:elasticsearch.slowlog.id}\],)?%{SPACE}(routing\[%{DATA:elasticsearch.slowlog.routing}\],)?%{SPACE}(total_hits\[%{NUMBER:elasticsearch.slowlog.total_hits:int}\s*hits\],)?%{SPACE}(types\[%{DATA:elasticsearch.slowlog.types}\],)?%{SPACE}(stats\[%{DATA:elasticsearch.slowlog.stats}\],)?%{SPACE}(search_type\[%{DATA:elasticsearch.slowlog.search_type}\],)?%{SPACE}(total_shards\[%{NUMBER:elasticsearch.slowlog.total_shards:int}\],)?%{SPACE}(source\[%{GREEDYMULTILINE:elasticsearch.slowlog.source_query}\],)?%{SPACE}(id\[%{DATA:elasticsearch.slowlog.user_id}\])?,?%{SPACE}(extra_source\[%{DATA:elasticsearch.slowlog.extra_source}\])?,?


(I copied the id section to the end, capturing it as elasticsearch.slowlog.user_id.)

This makes the grok pattern capture all the necessary information:

{
  "elasticsearch": {
    "slowlog": {
      "duration": 0,
      "took": "78.4micros",
      "total_shards": 1,
      "source_query": "{\"query\":{\"match_all\":{\"boost\":1.0}}}",
      "total_hits": 0,
      "user_id": "MY_USER_ID",
      "stats": "",
      "search_type": "QUERY_THEN_FETCH"
    },
    "index": {
      "name": "index6"
    },
    "shard": {
      "id": "0"
    }
  }
}

My questions:

  • Are there any tests for this part of the functionality? Can you point me to them, so that I can test my new pattern against different lines?
  • Are there corner cases with different slow log line formats that my pattern might not handle? Can you point them out?
  • Should I open a PR for this fix? As described above, I hope it will not break anything else, but I need help with proper testing.

Also, this issue looks similar to this one: Filebeat pipeline for elasticsearch slowlog not indexing/parsing slowlog_took (slowlog.duration)

I found a case where this regex fails: when the source contains something like "param": [1, 2], the pattern matches that inner ], and therefore captures the wrong data.

So here is the new version of the regex, with a lookahead added to the source section:

(\[%{INDEXNAME:elasticsearch.index.name}\]\[%{NUMBER:elasticsearch.shard.id}\])?(%{SPACE})(\[%{INDEXNAME:elasticsearch.index.name}\/%{DATA:elasticsearch.index.id}\])?(%{SPACE})%{SPACE}(took\[%{DATA:elasticsearch.slowlog.took}\],)?%{SPACE}(took_millis\[%{NUMBER:elasticsearch.slowlog.duration:long}\],)?%{SPACE}(type\[%{DATA:elasticsearch.slowlog.type}\],)?%{SPACE}(id\[%{DATA:elasticsearch.slowlog.id}\],)?%{SPACE}(routing\[%{DATA:elasticsearch.slowlog.routing}\],)?%{SPACE}(total_hits\[%{NUMBER:elasticsearch.slowlog.total_hits:int}\s*hits\],)?%{SPACE}(types\[%{DATA:elasticsearch.slowlog.types}\],)?%{SPACE}(stats\[%{DATA:elasticsearch.slowlog.stats}\],)?%{SPACE}(search_type\[%{DATA:elasticsearch.slowlog.search_type}\],)?%{SPACE}(total_shards\[%{NUMBER:elasticsearch.slowlog.total_shards:int}\],)?%{SPACE}(source\[%{GREEDYMULTILINE:elasticsearch.slowlog.source_query}(?=\s[a-z_]*\[)\],)?%{SPACE}(id\[%{DATA:elasticsearch.slowlog.user_id}\])?,?%{SPACE}(extra_source\[%{DATA:elasticsearch.slowlog.extra_source}\])?,?
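
A rough sketch of the idea in plain Python re (patterns simplified, and the lookahead placement adapted so it composes with the literal closing bracket): a plain non-greedy capture truncates at the inner ],, while a greedy capture whose closing bracket is anchored by a lookahead for the next key[ section recovers the full query:

```python
import re

# Hypothetical source containing "]," inside the query JSON itself.
msg = 'source[{"terms":{"ids":[1, 2],"boost":1.0}}], id[MY_USER_ID],'

# Non-greedy capture: stops at the first "],", truncating the query.
lazy = re.compile(r'source\[((?:.|\n)*?)\],')
# Greedy capture whose closing "]" must be followed by ", <key>[",
# in the spirit of the lookahead added to the grok pattern above.
anchored = re.compile(r'source\[((?:.|\n)*)\](?=,\s*[a-z_]+\[)')

print(lazy.search(msg).group(1))      # '{"terms":{"ids":[1, 2' -- truncated
print(anchored.search(msg).group(1))  # full query JSON
```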
