Search API returns no documents

jgeek · May 8, 2024, 11:50am

We're encountering a delay in real-time data retrieval from Elasticsearch despite successfully inserting data through our Java Spring Boot producer application.

Our consumer application relies on Elasticsearch's /_search REST APIs to retrieve data instantly, but there's a noticeable 5-6 second lag in receiving search query results.

Our Elasticsearch setup consists of version 8.12 with 2 master nodes and 2 data nodes in the cluster. In an attempt to address the delay, we experimented with reducing the refresh interval for a specific index to as low as 40 milliseconds and also introduced replicas in the index settings. However, neither approach alleviated the latency issue. Note this response time is when the server is not under load. We are trying with just 1 user.
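For reference, a refresh interval like the one described above is changed through the index settings API (`PUT /<index>/_settings`). The following is only a minimal sketch of that call: the index name `events` and the address `http://localhost:9200` are placeholders rather than details of our setup, and the helper name `build_refresh_interval_request` is made up for illustration. It builds the request without sending it.

```python
import json
import urllib.request


def build_refresh_interval_request(base_url, index, interval):
    """Build (but do not send) a PUT <index>/_settings request.

    base_url, index and interval are caller-supplied; the values used
    below are illustrative placeholders, not taken from a real cluster.
    """
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical index name and host.
settings_req = build_refresh_interval_request("http://localhost:9200", "events", "40ms")
print(settings_req.get_method(), settings_req.full_url)
print(settings_req.data.decode("utf-8"))
# To apply it against a live cluster: urllib.request.urlopen(settings_req)
```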

Note that the documents are inserted by the producer application through the bulk API.

We're seeking guidance on configuring Elasticsearch to enable real-time search for documents that were inserted a few milliseconds earlier. Any insights or suggestions would be greatly appreciated.

Thanks.

Christian_Dahlqvist · May 8, 2024, 12:20pm

You can pass the refresh parameter to the bulk request so that it returns only once a refresh has made the documents searchable. Be aware that refreshing frequently or for every request will add significant load on the node and impact performance.
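As a rough sketch of what that looks like at the REST level: the bulk endpoint accepts a `refresh` query parameter, and `refresh=wait_for` makes the call return only after a refresh has made the changes visible to search (`refresh=true` forces an immediate refresh instead). The index name `events`, the host, the sample documents, and the helper `build_bulk_request` below are all illustrative assumptions, and the request is only built, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_bulk_request(base_url, index, docs, refresh="wait_for"):
    """Build (but do not send) a POST <index>/_bulk?refresh=... request.

    The bulk body is newline-delimited JSON: one action line followed by
    one document line per item, and it must end with a newline.
    """
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {}}))  # index action, auto-generated _id
        lines.append(json.dumps(doc))
    body = ("\n".join(lines) + "\n").encode("utf-8")
    query = urllib.parse.urlencode({"refresh": refresh})
    return urllib.request.Request(
        url=f"{base_url}/{index}/_bulk?{query}",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-ndjson"},
    )


# Hypothetical index, host and documents.
bulk_req = build_bulk_request(
    "http://localhost:9200",
    "events",
    [{"user": "a", "value": 1}, {"user": "b", "value": 2}],
)
print(bulk_req.get_method(), bulk_req.full_url)
print(bulk_req.data.decode("utf-8"), end="")
# To send it against a live cluster: urllib.request.urlopen(bulk_req)
```

With `wait_for`, the indexing call blocks until the next refresh rather than triggering one per request, so its latency depends on the index's refresh interval.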

Note that this does not offer any high availability. You should always look to have 3 master eligible nodes on 3 separate hosts in an Elasticsearch cluster.

jgeek · May 8, 2024, 12:55pm

Thanks for the input. In our use case, a few thousand documents per second would be pushed to Elasticsearch during peak load, spread across multiple bulk API calls. So I'm not sure whether we should add the refresh parameter to the bulk insert API call.

On a related note, the data push is done by a third-party application, so modifying its source code would create maintenance problems.

"Be aware that refreshing frequently or for every request will add significant load on the node and impact performance."

Right, I understand. Does that mean we should not aim for near-real-time search capability in our use case?

"Note that this does not offer any high availability. You should always look to have 3 master eligible nodes on 3 separate hosts in an Elasticsearch cluster."

Right. I can make the master replicaCount 3. I set up 2 data nodes thinking that search requests would speed up, but it didn't seem to work that way.

jgeek · May 10, 2024, 11:17am

@Christian_Dahlqvist - any input on the above replies?
I am fairly new to Elasticsearch. The documentation describes near-real-time search, but I am not sure whether Elasticsearch can be tuned to deliver it during peak load, and I cannot get it working even for 1 user.

So your input would be valuable. Thanks in advance.

Christian_Dahlqvist · May 10, 2024, 11:22am

Elasticsearch is not designed for real-time search, so trying to achieve this adds significant overhead, as I explained. I have not personally tried to achieve this, so modifying the bulk requests to include a refresh is my best recommendation if this is truly required.
