Search performance dramatically decreases (+15s) when the indexing process is running

Hello,

We have an Elasticsearch cluster of 7 servers on Amazon: two client nodes, three master nodes (which also hold data), and two data nodes.

All master and data nodes are m4.2xlarge instances with 32GB of RAM, 8 cores, and SSD storage. The 2 client nodes are m4.xlarge with 4 cores and 16GB of RAM. The Elasticsearch heap is set to 15GB (half of the machine's memory) and swap is disabled.

We have three indices; the most important one holds 100GB of data. Every index has 5 shards distributed across the 5 data-holding nodes (masters included), and everything is replicated correctly across all nodes.
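
For reference, roughly how one of these indices would be defined with the Python client (the host and index name are made up; 4 replicas matches our setup):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://es-client-1:9200"])  # hypothetical client node

# 5 primary shards; with 4 replicas every data-holding node ends up
# carrying a full copy of the index.
es.indices.create(index="products", body={
    "settings": {
        "number_of_shards": 5,
        "number_of_replicas": 4,
    }
})
```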

Our frontend uses the cluster through one of the client nodes, and we get search times of around 2-3 seconds, which is acceptable for us. The client seems to distribute our queries correctly; we are using the paramedic plugin (https://github.com/karmi/elasticsearch-paramedic) to watch the balancing.

The problem:

The problem begins when we start the indexing process (we index using bulk requests of 2000 documents each): search times then climb to around 17 seconds. At the same time, every cluster node shows a load average of at most 0.5.

What we tried:

  • Increasing the index refresh interval (currently set to 30s).
  • Checking the thread pools, but in the output of _nodes/stats/thread_pool?pretty I can't see anything problematic. (A rough sketch of both checks follows this list.)
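
Roughly what we are doing for both, sketched with the Python client (host and index names are made up):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://es-client-1:9200"])  # hypothetical client node

# Publish new segments less often while bulk indexing runs.
es.indices.put_settings(
    index="products",
    body={"index": {"refresh_interval": "30s"}},
)

# Same data as GET _nodes/stats/thread_pool?pretty
stats = es.nodes.stats(metric="thread_pool")
for node in stats["nodes"].values():
    pools = node["thread_pool"]
    print(node["name"],
          "search rejected:", pools["search"]["rejected"],
          "bulk rejected:", pools["bulk"]["rejected"])
```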

Does anyone have any idea of what could be happening, or what we can check/change to address this issue?
Is there any setting that could limit search performance while we are indexing at the same time?
Is our current cluster properly sized and configured?

Thanks in advance

As both indexing and querying compete for the same resources, primarily CPU and disk IO, you may need to throttle your indexing so that it has less impact on querying. Have you tried reducing the bulk size and/or reducing the number of indexing threads to see what impact that has?
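
For example, something along these lines with the Python bulk helper (host, index, and field names are illustrative):

```python
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch(["http://es-client-1:9200"])  # hypothetical client node

documents = [{"id": 1, "name": "example"}]  # placeholder for the real feed

def actions(docs):
    # Wrap plain documents as bulk index actions (index/type names are made up).
    for doc in docs:
        yield {
            "_index": "products",
            "_type": "product",
            "_id": doc["id"],
            "_source": doc,
        }

# Smaller chunks than 2000 (e.g. 500) spread the indexing load over time and
# leave more headroom for concurrent searches.
helpers.bulk(es, actions(documents), chunk_size=500)
```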

We tried reducing the indexing threads and we still have the same problem.

We've seen other people complaining about the same problem, and one suggestion seems to be to switch the client to the transport client (https://www.elastic.co/guide/en/elasticsearch/guide/current/_transport_client_versus_node_client.html)

However, if we use just 1 thread for the indexing process, does it really make sense to use the transport client instead of the node client?

Thanks

What ES version?

Do you write documents into the same index you are searching?

To get optimal performance, use this workload pattern (a rough sketch follows the list):

  1. Create new index
  2. Bulk index documents
  3. Refresh/optimize
  4. Search
  5. Create another index
  6. Bulk index into the other index
  7. Refresh/optimize the other index
  8. Switch from old to new index or set index alias
  9. Search on new (or on both) indices
  10. Remove unused indices
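
A rough sketch of that pattern using the Python client (all host, index, and alias names are made up):

```python
from elasticsearch import Elasticsearch, helpers

es = Elasticsearch(["http://es-client-1:9200"])  # hypothetical client node

# 1. Create the new index (refresh disabled while the bulk run is in progress)
es.indices.create(index="products_v2", body={
    "settings": {"number_of_shards": 5, "refresh_interval": "-1"},
})

# 2. Bulk index documents into it
docs = [{"id": 1, "name": "example"}]  # placeholder for the real data feed
helpers.bulk(es, ({"_index": "products_v2", "_type": "product",
                   "_id": d["id"], "_source": d} for d in docs))

# 3. Re-enable refresh and refresh once the bulk run is done
es.indices.put_settings(index="products_v2",
                        body={"index": {"refresh_interval": "30s"}})
es.indices.refresh(index="products_v2")

# 8. Atomically switch the search alias from the old index to the new one
es.indices.update_aliases(body={"actions": [
    {"remove": {"index": "products_v1", "alias": "products"}},
    {"add": {"index": "products_v2", "alias": "products"}},
]})

# 9. Searches only ever go through the alias
es.search(index="products", body={"query": {"match_all": {}}})

# 10. Remove the unused index
es.indices.delete(index="products_v1")
```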

It is quite likely that the increased latencies you are experiencing when indexing are due to resource contention in the cluster, so I do not think it will matter which client you use. How many replicas do you have configured for the indices? How many queries per second is the cluster serving?

Hi

@jprante We currently use Elasticsearch 1.7.2, and we write to the same index we search. We do incremental indexing because the index is big (more than 100GB) and we have a lot of updates. We cannot reindex everything into a new index and then swap them.

What do people do with big indices that receive continuous updates?
We haven't run an optimize on this index, but I don't think that is the origin of the problem.

@christian_dahlqvist We have 4 replicas for the index. The cluster is in a testing environment, so right now there are only a few queries.

We have 2 client nodes; we tried indexing through one client node and searching through the other, but we got the same results so far.

Thanks

You do not need to reindex everything. Just create a new index for the incremental data and add it to the existing index alias.
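
For example, roughly like this with the Python client (host, index, and alias names are made up):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://es-client-1:9200"])  # hypothetical client node

# New index that receives only the incremental writes (name is made up)
es.indices.create(index="products_2015_10")

# Add it to the alias the frontend already searches against
es.indices.update_aliases(body={"actions": [
    {"add": {"index": "products_2015_10", "alias": "products"}},
]})

# Searches keep hitting the alias and now cover both indices
es.search(index="products", body={"query": {"match_all": {}}})
```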

Your issue is that indexing produces a lot of moving parts (segments) that invalidate your searches, especially when you use filters and aggregations. It is expected to kill performance when you index into an index that is being searched at the same time.

For optimal search performance, you need a small number of segments. After a massive bulk indexing run, the segment count may be too high.
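
A rough sketch of checking the segment count and merging segments down after a bulk run, assuming the Python client and a made-up index name (on 1.x the API is _optimize; newer versions renamed it to _forcemerge):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://es-client-1:9200"])  # hypothetical client node

# Inspect how many searchable segments each shard currently has
seg = es.indices.segments(index="products")
for index_info in seg["indices"].values():
    for shard_copies in index_info["shards"].values():
        for shard in shard_copies:
            print("segments:", shard["num_search_segments"])

# Merge segments down once the bulk run is finished. This is expensive,
# so run it off-peak.
es.indices.optimize(index="products", max_num_segments=1)
```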

BTW, you might find this post by Alan Woodward at Flax on improving Elasticsearch indexing performance enlightening.

Agreed with what @jprante said! Indexing creates new segments (and triggers segment merges, which create further new segments)... this definitely impacts the caches and hence the queries.

If you have a lot of data and it is properly partitioned, you can try balancing the indexing requests so that no single shard/index is loaded at any point in time.

@softwaredoug Thanks, we'll check the link, it could help us to index faster.

@jprante Then, if I understood you correctly, you suggest indexing the new documents into another index and using both indices for searching through an alias.

At some point the new index holding the incremental data will grow big as well; is the standard solution then to index into a third index? Later into a fourth? Is that what you suggest?

Our incremental data contains not only new documents but also updates to existing documents. That means that with your approach we would first need to run a delete query against the first index before indexing the updated document into the second one.

What is the point of having so many replicas if you have only 5 data nodes?

@Igor_Berman We wanted to increase search performance. Before starting the indexing process, the cluster was able to handle a big volume of queries; once the indexing process started, it was not.

Are you suggesting that having that many replicas decreases search performance during indexing? We can try decreasing the number of replicas and see how the cluster performs.

Avoid "delete by query" at all cost. This is a very expensive operation and will also affect your searches in a bad way. Organize your data so you can keep them in separate incremental steps, or introduce special filter tags, so you can filter out old documents at search time.

Reindexing everything is better and simpler in the vast majority of cases where the overall data volume is limited.
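
A rough sketch of the filter-tag idea with the Python client, using the 1.x filtered query syntax (the host, index, and the "superseded" field are made up):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://es-client-1:9200"])  # hypothetical client node

# Instead of deleting the old version of a document, mark it with a flag
# (the "superseded" field is made up) and filter it out at search time.
query = {
    "query": {
        "filtered": {
            "query": {"match": {"name": "example"}},
            "filter": {"not": {"term": {"superseded": True}}},
        }
    }
}
es.search(index="products", body=query)
```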


Albert, I'm not sure, but replication isn't free, especially while you are indexing. I have two thoughts:

a) ES has to copy each piece of data from one node to 4 others in your case, which takes resources. Usually when rebuilding an index (which is not your case, but still) the advice is to turn replication off and turn it back on at the end.
b) With only 5 data-holding nodes and 4 replicas, you make almost every node hold the full index volume, which IMHO cancels out the partitioning of the data (yes, it probably improves search and gives you fault tolerance).
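
A rough sketch of (a) with the Python client (host and index names are made up):

```python
from elasticsearch import Elasticsearch

es = Elasticsearch(["http://es-client-1:9200"])  # hypothetical client node

# Drop replicas while the heavy bulk indexing runs...
es.indices.put_settings(index="products",
                        body={"index": {"number_of_replicas": 0}})

# ... run the bulk indexing here ...

# ...and restore them afterwards, so the cluster copies the new segments once.
es.indices.put_settings(index="products",
                        body={"index": {"number_of_replicas": 4}})
```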

Have you tried reducing the number of replicas in order to see how this affects the cluster?

Not yet. We'll try it and see how it impacts the overall performance.

Thanks