Looking for suggestions on tuning

funtimes_ninja · February 15, 2017, 2:21am

Hello all,

I'm reaching out to the community for suggestions and opinions on how best to tune a small cluster that I have built in my home lab. I do very little querying of the data, mainly just indexing, and I would like a higher indexing rate than I currently get.

Machine specs:
2x Xeon 5675 hex core
72 GB RAM
5x SSD

Currently all 4 nodes are LXC containers on a Proxmox server. Each node is on its own SSD, all nodes share the 24 cores, and each node has 16 GB of RAM.
Node 1 -> ES, LS, KI
Node 2 -> ES
Node 3 -> ES
Node 4 -> ES

I seem to be hitting a bottleneck at around 1.5k documents indexed per second, and I haven't been able to pinpoint where it is. I doubt it's disk I/O: with 4 nodes, each on its own SSD, I would expect the cluster to manage more than 1.5k/s. CPU usage is barely above 20% for the entire system. Should I be getting more than 1.5k/s, or are my rates on par with the resources I currently have?
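
One way to narrow this down is to measure the rate on the Elasticsearch side rather than from Logstash. Below is a minimal Python sketch, assuming two samples of the `_all.primaries.indexing` section of `GET /_stats/indexing` taken some seconds apart. The function name and the sample numbers are invented for illustration and are not measurements from this cluster.

```python
def indexing_rate(before, after, elapsed_s):
    """Return (docs per second, avg ms of indexing time per doc) between two samples.

    `before` and `after` are the `_all.primaries.indexing` objects from two
    `GET /_stats/indexing` responses taken `elapsed_s` seconds apart.
    """
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    docs = after["index_total"] - before["index_total"]
    millis = after["index_time_in_millis"] - before["index_time_in_millis"]
    docs_per_s = docs / elapsed_s
    ms_per_doc = millis / docs if docs else 0.0
    return docs_per_s, ms_per_doc


# Hypothetical samples taken 10 seconds apart (numbers invented for illustration).
sample_1 = {"index_total": 1_000_000, "index_time_in_millis": 400_000}
sample_2 = {"index_total": 1_015_000, "index_time_in_millis": 403_000}

rate, cost = indexing_rate(sample_1, sample_2, elapsed_s=10)
print(f"{rate:.0f} docs/s, {cost:.2f} ms of indexing time per doc")
```

If the indexing time per document stays low while the rate is flat, the limit is probably upstream of Elasticsearch (for example in the Logstash filters) rather than in the cluster itself.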

All indices have had their refresh interval set to 30s:

curl -XPUT localhost:9200/_settings -d '{
    "index" : {
        "refresh_interval" : "30s"
    }
}'

Nodes 1-4 elasticsearch.yml

# cat /etc/elasticsearch/elasticsearch.yml | egrep -v "(^#.*|^$)"
cluster.name: Cluster1
node.name: ${HOSTNAME}
network.host: [_site_, _local_]
discovery.zen.ping.unicast.hosts: ["xxx.xxx.xxx.200", "xxx.xxx.xxx.201", "xxx.xxx.xxx.202", "xxx.xxx.xxx.203"]
indices.memory.index_buffer_size: 30%
indices.fielddata.cache.size:  10%
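
For reference, the percentage in `indices.memory.index_buffer_size` is taken from the JVM heap of the node, not from the container's 16 GB of RAM, and the resulting buffer is shared by all shards on that node. The heap size is not stated here, so the 8 GiB heap in this small Python sketch is purely an assumption for the arithmetic:

```python
def index_buffer_bytes(heap_bytes, percent):
    """Size of a node's shared indexing buffer for a given heap size and percentage."""
    return heap_bytes * percent // 100


GIB = 1024 ** 3
assumed_heap = 8 * GIB  # assumption: the actual heap size is not given in the post

# 10% is the Elasticsearch default, 30% is the value configured above.
for pct in (10, 30):
    size = index_buffer_bytes(assumed_heap, pct)
    print(f"{pct}% of an 8 GiB heap = {size / GIB:.1f} GiB of indexing buffer")
```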

Node 1 logstash/conf.d/01-file.conf

input {
    beats {
        port => 5045
        ssl => true
        ssl_certificate => "/etc/ssl/logstash-forwarder.crt"
        ssl_key => "/etc/ssl/logstash-forwarder.key"
    }
}
filter {
    if [type] == "cowrie" {
        json {
            source => message
        }
        date {
            match => [ "timestamp", "ISO8601" ]
        }
        if [src_ip]  {
            dns {
                reverse => [ "src_host", "src_ip" ]
                action => "append"
            }
            geoip {
                source => "src_ip"  # With the src_ip field
                target => "geoip"   # Add the geoip one
                database => "/opt/logstash/vendor/geoip/GeoLite2-City.mmdb"
            }
        }
    }
}
output {
    if [type] == "cowrie" {
        # Output to elasticsearch
        elasticsearch {
            hosts => ["xxx.xxx.xxx.200:9200","xxx.xxx.xxx.201:9200","xxx.xxx.xxx.202:9200","xxx.xxx.xxx.203:9200"]
            sniffing => true
            manage_template => false
            index => "%{[@metadata][beat]}-%{+YYYY.MM.dd}"
            document_type => "%{[@metadata][type]}"
        }
        # For debugging
        stdout {
            codec => rubydebug
        }
    }
}
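
As a side note on the `index` setting above: Logstash fills `%{+YYYY.MM.dd}` from the event's `@timestamp` in UTC, so each beat writes to one index per UTC day. A rough Python equivalent follows; the function name and the `filebeat` value are illustrative assumptions, not something taken from this setup.

```python
from datetime import datetime, timedelta, timezone


def daily_index_name(beat, timestamp):
    """Mimic "%{[@metadata][beat]}-%{+YYYY.MM.dd}" for a timezone-aware timestamp."""
    utc = timestamp.astimezone(timezone.utc)
    return f"{beat}-{utc:%Y.%m.%d}"


# An event stamped late in the evening at UTC-5 already belongs to the next UTC day.
local = datetime(2017, 2, 14, 21, 21, tzinfo=timezone(timedelta(hours=-5)))
print(daily_index_name("filebeat", local))  # filebeat-2017.02.15
```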

jpountz · February 15, 2017, 10:30am

It is hard to say whether this ingestion rate is high or not, as it depends on many factors, such as the complexity of your documents and mappings, which differ from one use case to the next.

Maybe look at https://www.elastic.co/guide/en/elasticsearch/reference/current/general-recommendations.html and https://www.elastic.co/guide/en/elasticsearch/reference/current/tune-for-indexing-speed.html.
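
Among the recommendations on the indexing-speed page is sending documents in bulk requests rather than one at a time (Logstash's elasticsearch output already uses the bulk API). For anyone indexing from their own code, here is a minimal Python sketch of the newline-delimited body that `POST /_bulk` expects; the function name, index name and sample documents are made up for illustration.

```python
import json


def build_bulk_body(index, docs):
    """Build the newline-delimited JSON payload for POST /_bulk.

    Each document becomes two lines: an action line and the source line.
    Older Elasticsearch versions also expect a `_type` in the action line
    or in the request URL; that is left out of this sketch.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # Every line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)


docs = [{"src_ip": "192.0.2.1"}, {"src_ip": "192.0.2.2"}]
body = build_bulk_body("filebeat-2017.02.15", docs)
print(body, end="")
```

The best batch size depends on document size and hardware, so it is worth benchmarking a few values instead of assuming one.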

system · March 15, 2017, 10:30am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
