How much does index size affect search speed?

I set up Elasticsearch to create a new index every day, but when I first started the stack I "synchronized" a large backlog of old log data right away. As a result, the first index is a LOT larger than the ones that follow it (50 GB instead of 1 GB). Is this going to make my searches a lot slower? I'm worried that a single worker thread will be assigned to search the whole thing by itself and will take much longer than the rest. Would it make a difference if I reindexed the data so that the initial upload was spread over several indices, or is this not even a factor?

Thanks,
Brandon

It won't matter. A shard is multithreaded.

Alright, thanks. I'll check that one off the list of possible slowdown causes.

Each query runs single-threaded against each shard, but multiple queries and shards can be processed in parallel. The size of a shard therefore does affect query latency, which is why we generally recommend benchmarking to find the ideal shard size.
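
To make that concrete, here is a rough toy model in Python, not a benchmark and not anything from Elasticsearch itself. It assumes that the time to search a shard grows linearly with its size (real queries often don't behave that way) and that each shard is handled by exactly one thread at a time. The function name `estimate_query_latency` and the `seconds_per_gb` cost are made up for illustration.

```python
import heapq


def estimate_query_latency(shard_sizes_gb, search_threads, seconds_per_gb=1.0):
    """Toy estimate of how long one query takes across a set of shards.

    Assumptions (illustrative only): one thread per shard, cost linear in
    shard size, no caching, no other queries competing for the threads.
    """
    if search_threads < 1:
        raise ValueError("search_threads must be at least 1")

    # Min-heap of the accumulated busy time of each worker thread.
    workers = [0.0] * search_threads
    heapq.heapify(workers)

    # Hand out the largest shards first, always to the least busy thread.
    for size_gb in sorted(shard_sizes_gb, reverse=True):
        busy = heapq.heappop(workers)
        heapq.heappush(workers, busy + size_gb * seconds_per_gb)

    # The query is done when the busiest thread finishes.
    return max(workers)


# One 50 GB shard: extra threads cannot help, one thread does all the work.
single = estimate_query_latency([50], search_threads=4)
# The same 50 GB split into five 10 GB shards on four threads.
split = estimate_query_latency([10] * 5, search_threads=4)
print(single, split)
```

Under these assumptions the single large shard is bounded by one thread no matter how many are idle, while the split layout lets several threads share the work. Actual numbers depend heavily on the queries and hardware, which is why benchmarking is the real answer.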

Interesting. Thanks for following up. I'll take a look at the presentation. Do you think this could be the cause of my search speed problems?

If the shard size has changed significantly, that is certainly possible.

To follow up on this issue, it does seem that removing the one large shard stabilized the Elasticsearch node. After increasing the number of shards, I ran into an additional issue: the search worker thread queue ran out of space. Increasing the search queue size from 1000 to 5000 resolved this, but it did result in slightly longer-running searches.
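
For anyone hitting the same thing, here is a back-of-the-envelope sketch in Python of why more shards can fill the search queue. It is a simplification under stated assumptions: every search fans out into one task per shard all at once, and anything beyond the running threads plus the queue is rejected. The function `simulate_burst` and the example numbers (20 searches, 100 shards, 13 threads) are hypothetical, not taken from my cluster. As far as I know, the queue size itself is controlled by `thread_pool.search.queue_size` in `elasticsearch.yml`.

```python
def simulate_burst(concurrent_searches, shards_per_search, search_threads, queue_size):
    """Toy model of a burst of searches hitting a fixed-size search thread pool.

    Assumes every search creates one task per shard at the same moment,
    which is a simplification of how requests are actually scheduled.
    """
    tasks = concurrent_searches * shards_per_search
    running = min(tasks, search_threads)          # picked up immediately
    queued = min(tasks - running, queue_size)     # wait for a free thread
    rejected = tasks - running - queued           # no room left in the queue
    return {"running": running, "queued": queued, "rejected": rejected}


# Hypothetical burst: 20 searches over 100 shards each, 13 search threads.
print(simulate_burst(20, 100, 13, queue_size=1000))
print(simulate_burst(20, 100, 13, queue_size=5000))
```

In this model the larger queue turns rejections into waiting, which would be consistent with the searches succeeding but taking a bit longer.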
