Controlling Elasticsearch _bulk queue size

(Quintus maximus) #1

I've set up Elasticsearch on a machine with the following specs:
Storage: 500 GB HDD
RAM: 8 GB
CPU: Intel i7 Quad-Core

On the same machine I have an application that listens to netflow data and exports it to Elasticsearch. I've noticed, however, that this application complains about data being dropped because the bulk queue is too long.

I've tried increasing the bulk queue size to something like 10000, but it doesn't seem to help.

New indices are built daily.

Is there any way to troubleshoot this? How can I find out if Elasticsearch is actually dropping the data?

(Christian Dahlqvist) #2

How many indices/shards are you actively indexing into? How many indexing threads do you have?

(Quintus maximus) #3

Hi Christian,

Sorry I thought I'd get e-mails for topics I'm watching.

I haven't changed any of the default settings. I have one index created per day, "ntopng-%y-%m-%d-*", and I guess each one uses the default number of shards (so 5 shards)?

  1. Is there a way I can find the right numbers?
  2. What options do I have to resolve it?

(Christian Dahlqvist) #4

Do you have monitoring installed? How many client connections are indexing in parallel?

(Quintus maximus) #5

I have one application currently communicating with Elasticsearch, and it's on the same host.
I just ran netstat and I see a huge number of connections to localhost:9200. Could this be the reason?

I do not have any sort of monitoring installed.

(Christian Dahlqvist) #6

If you are indexing over too many connections in parallel, you will fill up the queues. In order to find the ideal number of indexing connections, start with a few connections and then gradually increase until the indexing throughput stops increasing or you start seeing bulk rejections.
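
A rough sketch of that ramp-up in Python, not a definitive tool. It assumes a hypothetical callback `measure(n)` that runs the indexing application with `n` parallel connections for a fixed period and returns the observed documents per second. `read_rejected` is an illustrative helper that sums the `rejected` counters returned by the node stats API (`GET /_nodes/stats/thread_pool`); the pool is called `bulk` on older releases and `write` on newer ones, so both names are checked. The URL, the doubling step and the 5% gain threshold are assumptions chosen for illustration.

```python
import json
from urllib.request import urlopen


def rejected_from_stats(stats, pools=("bulk", "write")):
    """Sum the 'rejected' counters of the bulk/write pools over all nodes."""
    total = 0
    for node in stats.get("nodes", {}).values():
        thread_pool = node.get("thread_pool", {})
        for pool in pools:
            total += thread_pool.get(pool, {}).get("rejected", 0)
    return total


def read_rejected(base_url="http://localhost:9200"):
    """Fetch node stats from a running cluster and return total rejections."""
    with urlopen(base_url + "/_nodes/stats/thread_pool") as resp:
        return rejected_from_stats(json.load(resp))


def find_connection_count(measure, count_rejected=read_rejected,
                          max_connections=32, min_gain=0.05):
    """Raise the connection count until throughput plateaus or rejections appear.

    measure(n) is a hypothetical callback: run the indexer with n parallel
    connections for a while and return the observed docs/second.
    Returns the last connection count that still helped (0 if none did).
    """
    best_n, best_rate = 0, 0.0
    n = 1
    while n <= max_connections:
        before = count_rejected()
        rate = measure(n)
        if count_rejected() > before:
            break  # the queue overflowed during this run: n is too many
        if best_n and rate < best_rate * (1 + min_gain):
            break  # throughput stopped increasing
        best_n, best_rate = n, rate
        n *= 2
    return best_n
```

If the rejection counter keeps climbing even at one or two connections, the limit is more likely the node itself (disk, CPU, heap) than the number of connections.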

(Quintus maximus) #7

Thanks, I'll check this right away. How can I control the number of connections to Elasticsearch?

(Christian Dahlqvist) #8

That depends on the application that is writing data to Elasticsearch.
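
Whichever client is used, the `_bulk` response shows whether anything was rejected: the body carries a top-level `errors` flag plus one entry per action with its own `status`, and an action turned away because the queue was full comes back with status 429. A minimal Python sketch of how a client might pick out those actions so it can retry them; `rejected_items` is an illustrative name and the sample response is made up.

```python
def rejected_items(bulk_response):
    """Return the positions of actions that were rejected with HTTP 429."""
    if not bulk_response.get("errors"):
        return []
    rejected = []
    for position, item in enumerate(bulk_response.get("items", [])):
        # Each item has a single key naming the action: index, create, update or delete.
        (result,) = item.values()
        if result.get("status") == 429:
            rejected.append(position)
    return rejected


# Made-up response: the first action succeeded, the second was rejected.
sample = {
    "errors": True,
    "items": [
        {"index": {"_id": "1", "status": 201}},
        {"index": {"_id": "2", "status": 429,
                   "error": {"type": "es_rejected_execution_exception"}}},
    ],
}
print(rejected_items(sample))  # [1]
```

A client that ignores these per-item statuses will lose the rejected documents silently, so whether data is dropped depends on how the writing application handles them.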

(Quintus maximus) #9

One more question. If I can't control the application, what do I need to change on the Elasticsearch side to handle this? Do I need something like an SSD instead of an HDD, or do I need to set up more indices? I'm just trying to figure out how Elasticsearch works in this case.

Here's a reference to what I'm talking about:
