Need help to improve performance with ES


(Deep-2) #1

Hi,

I have a single-node Elasticsearch deployment with 15K documents. The machine has 4 cores and 8 GB of RAM. The node is handling 1300 requests per second at 25% CPU utilization and 75% memory utilization. In the current deployment the query response time is 100 ms.

We need the search query to run in < 30 ms.

The search query is essentially a geo-location search that fetches documents within x miles of the input lat/lon, with some additional filters, and the documents are sorted on distance (nearest to furthest). Each document has multiple lat/lon points. It seems geo_distance uses only the first lat/lon in the array, so the simple geo_distance filter was not usable in the query.
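For reference, the kind of query body being described (ES 1.x filtered-query DSL with a geo_distance filter and a _geo_distance sort) looks roughly like the sketch below. The field name `locations`, the `status` term filter, and the coordinates are assumptions for illustration, not details from this thread.

```python
import json

# Sketch of a "within x miles, sorted nearest-first" search body (ES 1.x DSL).
# "locations" and the extra "status" term filter are hypothetical names.
query = {
    "query": {
        "filtered": {
            "filter": {
                "bool": {
                    "must": [
                        {"term": {"status": "active"}},  # hypothetical extra filter
                        {"geo_distance": {
                            "distance": "5mi",
                            "locations": {"lat": 40.71, "lon": -74.0},
                        }},
                    ]
                }
            }
        }
    },
    "sort": [
        {"_geo_distance": {
            "locations": {"lat": 40.71, "lon": -74.0},
            "order": "asc",   # nearest to furthest
            "unit": "mi",
        }}
    ],
}

print(json.dumps(query, indent=2))
```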

Need help to optimize the query.

Regards,
Deep


(Christian Dahlqvist) #2

Have you tried denormalising and storing a copy of each document for each location, possibly with the array of locations in a separate field if your application needs them? I suspect this would give better performance than using a Groovy script, which you mention in your other post.
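A minimal sketch of that denormalisation step, splitting one document with N locations into N documents that each carry a single searchable point plus the full array in a separate field. The field names and the `-i` id suffix scheme are assumptions:

```python
def denormalise(doc):
    """Split one document with many locations into one document per
    location, keeping the full array in a separate field for the app."""
    for i, point in enumerate(doc["locations"]):
        child = dict(doc)
        child["location"] = point                   # single geo_point used for search
        child["all_locations"] = doc["locations"]   # full array kept for the application
        del child["locations"]
        child["_id"] = f'{doc["_id"]}-{i}'          # hypothetical id scheme
        yield child

source = {"_id": "42", "name": "store", "locations": [
    {"lat": 40.7, "lon": -74.0},
    {"lat": 34.0, "lon": -118.2},
]}
docs = list(denormalise(source))
# two documents, each with one searchable point and the original array
```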


(Deep-2) #3

Hi Christian,
Thank you for your response.

As suggested, let me denormalise and try.

But denormalising would add more documents to the index; will that impact query performance? In some cases I would have 50 lat/lon points in a document. Will this adversely impact query performance?

Regards,
Deep


(Christian Dahlqvist) #4

Even if all 15k documents have 50 geopoints in them, 750,000 documents in an index is still not much (unless the documents are huge). Given your amount of memory I would expect it all to be cached anyway.


(Deep-2) #5

Ok.
In Marvel I can see the data size is 100 MB. Let me denormalise and share the results.

Thanks.


(Deep-2) #6

Hi Christian,

In our search query we need to use the geo_range query, and the to and from values are part of the document. In the geo_range query, is it possible to access the to and from values from within the document? We can access these values using a script.
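Assuming this refers to comparing the computed distance against per-document range fields in ES 1.x, one way is a script filter that reads the document's own `from`/`to` values. The field names here (`location`, `from`, `to`) and the Groovy snippet are illustrative assumptions:

```python
import json

# Sketch of an ES 1.x script filter: compute the distance from the input
# lat/lon and compare it against per-document "from"/"to" fields.
# Field names and the Groovy snippet are assumptions, not from the thread.
script_filter = {
    "filtered": {
        "filter": {
            "script": {
                "script": (
                    "def d = doc['location'].distanceInMiles(lat, lon); "
                    "d >= doc['from'].value && d <= doc['to'].value"
                ),
                "params": {"lat": 40.71, "lon": -74.0},
            }
        }
    }
}
print(json.dumps(script_filter, indent=2))
```

Note that the earlier advice in this thread was to avoid scripts for performance, so this is the fallback rather than the preferred approach.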

Regards,
Deep


(Deep-2) #7

Hi Christian,

As suggested by you, after removing scripts and denormalising the data I can see a reduction in response time. But the response time fluctuates, and I can correlate the increase in response time with an increase in the merge rate in Marvel. Whenever the merge rate goes up, the response time increases.

I tried to control the increase in merge rate by increasing the index refresh interval. I increased it to 60s, but I still see spikes in the merge rate.
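For the record, the 60s refresh-interval change above corresponds to a settings body like the following (applied with PUT to the index's _settings endpoint); the index name is whatever your index is called:

```python
import json

# Settings body for raising the refresh interval to the 60s value used above.
# Applied via: PUT /<index>/_settings
settings = {"index": {"refresh_interval": "60s"}}

print(json.dumps(settings, indent=2))
```

A longer refresh interval creates new segments less often, but merging is triggered by segment creation from any indexing, so spikes can still occur while documents are being written.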

Merging should happen only when the index is updated. Is this understanding correct?

How can I control the merge rate?

Regards,
Deep


(Christian Dahlqvist) #8

If you are not continuously indexing, e.g. if you perform periodic bulk uploads, you can force Elasticsearch to consolidate segments through the force merge API. This can be resource intensive in terms of CPU and disk I/O, but once you have consolidated into one segment, there should be no more merging until you index/update/delete more data.

You also have the option to tune how aggressive you want merging to be by throttling merges.


(Deep-2) #9

Hi Christian,

I am using Elasticsearch 1.4.

The force merge API returns an error.

The API curl -XPOST 'http://localhost:9200/search/_forcemerge'
{"error":"InvalidTypeNameException[mapping type name [forcemerge] can't start with '']","status":400}

Is this API supported in newer versions of Elasticsearch?

Regards,
Deep


(Deep-2) #10

Hi Christian,

Just figured out that in version 1.4 this API was called optimize.

The API curl -XPOST "http://localhost:9200/search/_optimize?max_num_segments=1" worked.

Regards,
Deep


(Deep-2) #11

Hi Christian,

After running optimize, the index automatically creates multiple segments again. There are no deletes/updates to the index.

What could be the reason for this?

Regards,
Deep


(Christian Dahlqvist) #12

If you look at index statistics, do the number of documents or the number of deleted documents change when segments are created?
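To check that, the docs counts can be read from the index stats response (GET /&lt;index&gt;/_stats/docs). The sample response below is illustrative, using the index name from earlier in the thread:

```python
# Sketch of reading document counts from an ES _stats response.
# The numbers in this sample response are illustrative only.
stats = {
    "indices": {
        "search": {
            "primaries": {
                "docs": {"count": 750000, "deleted": 120}
            }
        }
    }
}

docs = stats["indices"]["search"]["primaries"]["docs"]
print("count:", docs["count"], "deleted:", docs["deleted"])
```

If `count` or `deleted` changes while segments appear, something is still writing to the index; if both are flat, the new segments need another explanation.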

