ES 5.5.1 extremely slow on multiple indices

SimonC · May 28, 2018, 8:47am

Hi there,

We're using ES 5.5.1 for archival purposes.
Lately, the search engine has become extremely slow: some queries run for hours on end until they finally time out, while others time out rather quickly.

In this one particular instance, the index is > 600GiB with just under 2 million documents.
Below are the index's settings:
"settings" : {
  "index" : {
    "mapping" : {
      "total_fields" : {
        "limit" : "65535"
      }
    },
    "refresh_interval" : "1m",
    "number_of_shards" : "9",
    "auto_expand_replicas" : "2-15",
    "provided_name" : "de_gdsk",
    "creation_date" : "1492077704119",
    "store" : {
      "type" : "mmapfs"
    },
    "number_of_replicas" : "2",
    "queries" : {
      "cache" : {
        "enabled" : "true"
      }
    },
    "uuid" : "4LB_JsIdToWwly9JZzX_WQ",
    "version" : {
      "created" : "5020299",
      "upgraded" : "5050199"
    }
  }
}

This setup has worked for over a year, and now it is failing.
This index is rather old and is not used very often, but it contains documents vital to a customer's ability to stay in business, and after a recent file transfer (from Linux to Windows) certain paths in the documents require modifications.

A query on said index takes extremely long to complete, and as of today it ends with a shard failure. Mostly the request simply times out.

This is the shard failure:
[2018-05-28T10:42:07,408][DEBUG][o.e.a.s.TransportSearchAction] [node-windows] [85998] Failed to execute fetch phase
org.elasticsearch.transport.RemoteTransportException: [node-windows-3][192.168.253.3:9300][indices:data/read/search[phase/fetch/id]]
Caused by: org.elasticsearch.search.SearchContextMissingException: No search context found for id [85998]
at org.elasticsearch.search.SearchService.findContext(SearchService.java:443) ~[elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:410) ~[elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.action.search.SearchTransportService$12.messageReceived(SearchTransportService.java:393) ~[elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.action.search.SearchTransportService$12.messageReceived(SearchTransportService.java:390) ~[elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:69) ~[elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.transport.TcpTransport$RequestHandler.doRun(TcpTransport.java:1544) ~[elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:638) ~[elasticsearch-5.5.1.jar:5.5.1]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-5.5.1.jar:5.5.1]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_144]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_144]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]

Furthermore, the cluster is experiencing a lot of GC overhead, spewing logs like the following to the console on all three nodes:
[2018-05-28T09:58:44,288][INFO ][o.e.m.j.JvmGcMonitorService] [node-windows] [gc][235] overhead, spent [468ms] collecting in the last [1.2s]
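
To put a number on that overhead, a log line like the one above can be turned into a fraction of wall-clock time spent in garbage collection. The following is only a rough sketch: the regular expression and the function name `gc_overhead_fraction` are illustrative assumptions, and it handles only durations printed in `ms` or `s`, as in the sample line.

```python
import re

# Assumed pattern, written to match the sample JvmGcMonitorService line above.
# Only "ms" and "s" units are handled; other units would need extra cases.
GC_LINE = re.compile(
    r"overhead, spent \[(\d+(?:\.\d+)?)(ms|s)\] "
    r"collecting in the last \[(\d+(?:\.\d+)?)(ms|s)\]"
)


def to_ms(value: str, unit: str) -> float:
    """Convert a duration such as ('1.2', 's') to milliseconds."""
    return float(value) * (1000.0 if unit == "s" else 1.0)


def gc_overhead_fraction(line: str):
    """Return time spent in GC divided by the observed window, or None if no match."""
    match = GC_LINE.search(line)
    if match is None:
        return None
    spent_ms = to_ms(match.group(1), match.group(2))
    window_ms = to_ms(match.group(3), match.group(4))
    return spent_ms / window_ms


sample = (
    "[2018-05-28T09:58:44,288][INFO ][o.e.m.j.JvmGcMonitorService] "
    "[node-windows] [gc][235] overhead, spent [468ms] collecting in the last [1.2s]"
)
fraction = gc_overhead_fraction(sample)
print(f"GC overhead: {fraction:.0%}")
```

For the sample line this works out to 468 ms out of 1200 ms, so roughly 39% of that window went to garbage collection.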

Any help on the matter is very much appreciated!
Without the cluster working, no documents can be indexed or retrieved, and several other services also fail.

Thanks in advance!

mjunaidmuzammil · May 30, 2018, 6:54am

Your index data size is too large. I think you should consider time-based indices. It is recommended to keep shard size below 50G. Based on your data size, your average shard size is around 600G / 9 ≈ 67G.
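
As a rough sketch of that arithmetic, the snippet below computes the average shard size and the smallest primary shard count that keeps each shard at or below a target size. It assumes the 600G figure covers primary data only (the thread does not say whether replicas are included), and the function names are purely illustrative.

```python
import math


def average_shard_size(index_size_gb: float, primary_shards: int) -> float:
    """Average size per primary shard, assuming data is spread evenly."""
    return index_size_gb / primary_shards


def min_primary_shards(index_size_gb: float, max_shard_gb: float = 50.0) -> int:
    """Smallest shard count that keeps each shard at or below max_shard_gb."""
    return math.ceil(index_size_gb / max_shard_gb)


# Figures from the thread: roughly 600G of data in 9 primary shards.
# Assumption: the 600G is primary data only, not primaries plus replicas.
avg = average_shard_size(600, 9)
needed = min_primary_shards(600)

print(f"average shard size: {avg:.1f}G")
print(f"primary shards needed to stay at or below 50G each: {needed}")
```

With these numbers the average comes out just under 67G per shard, and at least 12 primary shards would be needed to stay within the 50G guideline.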

system · June 27, 2018, 6:54am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
