I looked around the internet and the forums but couldn't find anything similar...
Background
I have a cluster with two data nodes (plus one tie-breaker) and about 10 indexes, all holding the same kind of documents (same properties).
Nine of the indexes each hold about 50-100GB of data (10-100M docs). I send batched requests to ingest docs (always index, not update) and we get speeds of around 15K docs/s.
The tenth index is 500GB with 700M docs. We send data in exactly the same way and get speeds of 1-3K docs/s.
The two data nodes run on separate hardware, each backed by NVMe drives with 50GB of available RAM, connected over 10GbE.
I've compared the settings of the fast and slow indexes to see what the differences might be. What I found is:
fast:
settings.index.number_of_shards = "2"
slow:
settings.index.number_of_shards = "4"
settings.index.refresh_interval = "600s"
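
For anyone who wants to reproduce the comparison, here is a minimal sketch of how the two settings documents could be diffed. It assumes the settings have already been fetched as nested dictionaries (for example from a `GET /<index>/_settings` response); the `fast` and `slow` dictionaries below contain only the two values quoted above, and the helper names `flatten` and `diff_settings` are my own, not part of any client library.

```python
def flatten(settings, prefix=""):
    """Flatten nested settings into dotted keys such as settings.index.number_of_shards."""
    flat = {}
    for key, value in settings.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_settings(a, b):
    """Return {dotted_key: (value_in_a, value_in_b)} for every key whose value differs.

    A key that is missing on one side shows up as None on that side.
    """
    flat_a, flat_b = flatten(a), flatten(b)
    return {
        key: (flat_a.get(key), flat_b.get(key))
        for key in sorted(flat_a.keys() | flat_b.keys())
        if flat_a.get(key) != flat_b.get(key)
    }


# Only the values quoted in this post; a real settings document has many more keys.
fast = {"settings": {"index": {"number_of_shards": "2"}}}
slow = {"settings": {"index": {"number_of_shards": "4", "refresh_interval": "600s"}}}

for key, (fast_value, slow_value) in diff_settings(fast, slow).items():
    print(f"{key}: fast={fast_value!r} slow={slow_value!r}")
```

A `None` on the fast side means the setting is simply not set on that index, which is the case for `refresh_interval` here.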
Is it the fact that there are twice the number of shards or that there is a 10min refresh interval... or something else I don't know yet?
The lack of the refresh_interval setting on the fast indexes makes me think that might have something to do with it... but before I start making changes in my production cluster I want to have a better understanding of what it does.
I googled it and came to the docs about updating index settings, which tell me I can update it, and that to remove the setting I can set it to null, but then for bulk indexing I should set it to "-1"?!?!
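
To make the two options concrete, here is a minimal sketch of what the two requests would look like, assuming the update index settings endpoint `PUT /<index>/_settings`. The index name `my-slow-index` and the helper `refresh_interval_request` are placeholders of mine, and the code only builds the request; it does not send anything to a cluster.

```python
import json


def refresh_interval_request(index_name, value):
    """Build (method, path, body) for an update-index-settings call.

    value=None is serialized as JSON null, which removes the explicit setting;
    value="-1" is sent as the string "-1".
    """
    body = json.dumps({"index": {"refresh_interval": value}})
    return "PUT", f"/{index_name}/_settings", body


# Option 1: remove the explicit setting (null).
print(refresh_interval_request("my-slow-index", None))

# Option 2: the value the docs suggest for bulk indexing ("-1").
print(refresh_interval_request("my-slow-index", "-1"))
```

The only difference between the two requests is the value under `index.refresh_interval`: JSON `null` in the first case and the string `"-1"` in the second.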
From a Stack Overflow thread it seems that maybe it has to do with how quickly the ingested data becomes available in search results? Or maybe how frequently it's flushed to disk?
It would make some sense if the slowness I'm seeing came from translogs being read more frequently or from writing to the drive more frequently, but 600s is 10 minutes and this is not a sawtooth performance issue; it is a steady 3K docs/s all day long.
Can anyone help me know
- is this the smoking gun for my performance issue?
- if not, any guesses of where to turn next?
- what is the setting settings.index.refresh_interval meant to be/do when changed?
- is the default for settings.index.refresh_interval its absence (I presume null) or "-1"?
- is it safe to change while the bulk indexing requests are streaming in or should I stop the bulk requests first?
Thank you for any help.