EQL sequence performance issue

Environment:
Version: 7.16.2
JVM: 31g (one node)
RAM: 64G

Hello,

I'm trying to use EQL to detect changes in a specific field.
A concrete use case would be detecting a regression of the administrative protocol on a piece of network equipment.

I'm facing performance issues: the EQL search takes around 21 hours to execute.

In a single index I store the state of each piece of equipment for yesterday and for today.
I have around 600k pieces of equipment, so about 1.2M documents.
The documents look like this for one piece of equipment (feuillet is the equipment ID):

      {
        "_index" : "test_kibana_alert4",
        "_type" : "_doc",
        "_id" : "RQKNvH8B3UBu97xvB-I2",
        "_score" : 12.169369,
        "_source" : {
          "type" : "test_eql2_bad_eqpts",
          "proto_cli" : "telnet",
          "feuillet" : "CDXX0",
          "@timestamp" : "2022-03-24T08:00:00.000Z",
          "event.category" : "host"
        }
      },
      {
        "_index" : "test_kibana_alert4",
        "_type" : "_doc",
        "_id" : "TwKNvH8B3UBu97xvB-I2",
        "_score" : 12.169369,
        "_source" : {
          "type" : "test_eql_bad_eqpts",
          "proto_cli" : "ssh",
          "feuillet" : "CDXX0",
          "@timestamp" : "2022-03-23T08:00:00.000Z",
          "event.category" : "host"
        }
      }

I am trying to use this kind of EQL query:

sequence by feuillet
     [ host where proto_cli == "ssh" ] 
     [ host where proto_cli == "telnet" ]

It takes around 21 hours to execute with the default settings. If we increase fetch_size to 50k, it takes around 30 minutes, which is clearly better but still not usable once more use cases run in parallel.

Here are the steps to reproduce:

Set the mapping:

PUT test_kibana_alert4
{
  "mappings": {
    "properties": {
      "event.category": {
        "type": "keyword"
      },
      "@timestamp": {
        "type": "date"
      },
      "proto_cli": {
        "type": "keyword"
      },
      "feuillet": {
        "type": "keyword"
      }
    }
  }
}

Use this Logstash configuration:

input
{
    generator {
        count => 600000
        type => "test_eql_good_eqts"
    }
    generator {
        count => 10
        type => "test_eql_bad_eqpts"
    }

}
filter
{   
    if [type] =~ "test_eql_good_eqts"
    {
        if [sequence] < 10 
        {
            drop {}
        }
        clone
        {
            clones => [  "test_eql2_good_eqts" ]
        }
        mutate 
        {
            replace => { "event.category" => "host"
             "timestamp_bis" => "2022-03-23T08:00:00.000Z"
             "proto_cli" => "telnet"}
             replace => { "feuillet" => "CDXX%{sequence}" }
        }
        if [type] == "test_eql2_good_eqts" {
            mutate 
            {
                replace => {  "timestamp_bis" => "2022-03-24T08:00:00.000Z"
                 "proto_cli" => "ssh" }
            }
        }        
    }
    if [type] =~ "test_eql_bad_eqpts"
    {
        
        clone
        {
            clones => [  "test_eql2_bad_eqpts" ]
        }
        mutate 
        {
            replace => { "event.category" => "host"
             "timestamp_bis" => "2022-03-23T08:00:00.000Z"
             "proto_cli" => "ssh"}
             replace => { "feuillet" => "CDXX%{sequence}" }
        }
        if [type] == "test_eql2_bad_eqpts" {
            mutate 
            {
                replace => {  "timestamp_bis" => "2022-03-24T08:00:00.000Z"
                 "proto_cli" => "telnet" }
            }
        }        
    }
    if [type] =~ "test_eql" {
        date {
            match => [ "timestamp_bis", "ISO8601" ]
        }
        mutate 
        {
            remove_field => [  "timestamp_bis","sequence" , "host", "message", "@version"]
        }
    }
}
output
{
    if [type] =~ "test_eql" {

        elasticsearch
        {
            hosts => "<host>"
            user => "<user>"
            password => "<pwd>"
            action => "index"
            index => "test_kibana_alert4"
            ssl => true
            ssl_certificate_verification => true
            cacert => "<ca_path>"
        }
    }
}

Execute this query:

GET /test_kibana_alert4/_eql/search
{
  "wait_for_completion_timeout": "1s",
  "query": """
    sequence by feuillet
     [ host where proto_cli == "ssh" ] 
     [ host where proto_cli == "telnet" ]
  """
}
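
Because the request sets wait_for_completion_timeout to 1s, a search that runs longer than that returns as an async search with an id instead of results. A minimal sketch of how such a search could be followed up, assuming the response contained an id (`<search_id>` below is a placeholder, to be replaced with the id from your own response):

```console
# Check whether the async EQL search is still running
GET /_eql/search/status/<search_id>

# Fetch the results once it has completed (or wait up to 10s for them)
GET /_eql/search/<search_id>?wait_for_completion_timeout=10s
```

The 10s wait is an arbitrary illustrative value, not something taken from the original post.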

Have I missed something?
Are there any features that I need to consider?
Any suggestions are welcome :)

Interested as well...

Might this be related to the GitHub issue "EQL: Sequence performance improvements" (#60833, elastic/elasticsearch)?

Support advised me to add an index sort on the field with the highest cardinality.
Here is the new mapping:

PUT test_kibana_alert4
{
  "mappings": {
    "properties": {
      "event.category": {
        "type": "keyword"
      },
      "@timestamp": {
        "type": "date"
      },
      "proto_cli": {
        "type": "keyword"
      },
      "feuillet": {
        "type": "keyword"
      }
    }
  },
  "settings": {
    "number_of_replicas": 0,
    "index": {
      "sort.field": "feuillet",
      "sort.order": "asc"
    }
  }
}

I also force-merged the index.
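
For anyone reproducing this: index sort settings can only be set when an index is created, so the index has to be deleted, recreated with the settings above, and the data ingested again before merging. A sketch of the force merge and a settings check, assuming a merge down to a single segment (the thread does not say which max_num_segments value was used):

```console
# Merge the index down to one segment (max_num_segments=1 is an assumption)
POST /test_kibana_alert4/_forcemerge?max_num_segments=1

# Confirm that the sort settings were applied at index creation
GET /test_kibana_alert4/_settings/index.sort.*
```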

The execution time decreased from 21 hours to 7 minutes!
On top of that, if I increase the fetch_size parameter of the EQL query, it takes around 1 minute to execute.
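
As an illustration, fetch_size is set in the body of the EQL search request. The sketch below reuses the 50k value mentioned earlier in the thread; the value behind the 1 minute result is not stated, so treat it as an assumption:

```console
GET /test_kibana_alert4/_eql/search
{
  "wait_for_completion_timeout": "1s",
  "fetch_size": 50000,
  "query": """
    sequence by feuillet
      [ host where proto_cli == "ssh" ]
      [ host where proto_cli == "telnet" ]
  """
}
```

A larger fetch_size means fewer round trips per sequence stage but more memory per request, so the right value likely depends on the heap available on the node.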

For me this issue is resolved. I will close the case.

