Missing Data in Elasticsearch - Need Help Resolving

Hello Elasticsearch community,

I'm facing an issue with missing data in my Elasticsearch index. I have been investigating, but no errors are logged in Logstash itself. The cluster is in "yellow" status with a large number of unassigned shards, and when I sync data from my database to Elasticsearch, some records are missing.

Cluster Health:

  • Status: Yellow
  • Number of Nodes: 1
  • Number of Data Nodes: 1
  • Active Shards: 7093
  • Unassigned Shards: 7058
  • Active Shards Percent: 50.12%
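For anyone following along, the figures above come from the cluster health API; on a default local setup (host and port here are assumptions) the Dev Tools call looks like:

```
GET _cluster/health

# or from a shell:
# curl -s "http://localhost:9200/_cluster/health?pretty"
```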

I found this in the Elasticsearch logs:

[2025-03-20T01:38:15,483][WARN ][r.suppressed ] [AUDITTRAIL-SRV] path: /.kibana_task_manager/_update_by_query, params: {ignore_unavailable=true, refresh=true, index=.kibana_task_manager}
org.elasticsearch.action.search.SearchPhaseExecutionException: all shards failed
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:729) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.executeNextPhase(AbstractSearchAsyncAction.java:419) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseDone(AbstractSearchAsyncAction.java:761) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.action.search.AbstractSearchAsyncAction.onShardFailure(AbstractSearchAsyncAction.java:513) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.action.search.AbstractSearchAsyncAction$1.onFailure(AbstractSearchAsyncAction.java:350) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.action.ActionListenerImplementations.safeAcceptException(ActionListenerImplementations.java:59) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.action.ActionListenerImplementations.safeOnFailure(ActionListenerImplementations.java:71) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.action.DelegatingActionListener.onFailure(DelegatingActionListener.java:27) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.action.ActionListenerResponseHandler.handleException(ActionListenerResponseHandler.java:48) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.action.search.SearchTransportService$ConnectionCountingHandler.handleException(SearchTransportService.java:620) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.transport.TransportService$UnregisterChildTransportResponseHandler.handleException(TransportService.java:1688) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.transport.TransportService$ContextRestoreResponseHandler.handleException(TransportService.java:1405) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.transport.TransportService$DirectResponseChannel.processException(TransportService.java:1541) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.transport.TransportService$DirectResponseChannel.sendResponse(TransportService.java:1516) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.transport.TaskTransportChannel.sendResponse(TaskTransportChannel.java:51) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.action.support.ChannelActionListener.onFailure(ChannelActionListener.java:37) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.action.ActionRunnable.onFailure(ActionRunnable.java:92) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.onFailure(ThreadContext.java:966) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:28) ~[elasticsearch-8.8.2.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136) ~[?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635) ~[?:?]
at java.lang.Thread.run(Thread.java:833) ~[?:?]
Caused by: org.elasticsearch.ElasticsearchException: Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting.
at org.elasticsearch.search.SearchService.createAndPutReaderContext(SearchService.java:919) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.search.SearchService.createOrGetReaderContext(SearchService.java:897) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.search.SearchService.executeQueryPhase(SearchService.java:632) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.search.SearchService.lambda$executeQueryPhase$2(SearchService.java:496) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.action.ActionRunnable$2.accept(ActionRunnable.java:50) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.action.ActionRunnable$2.accept(ActionRunnable.java:47) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.action.ActionRunnable$3.doRun(ActionRunnable.java:72) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:983) ~[elasticsearch-8.8.2.jar:?]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:26) ~[elasticsearch-8.8.2.jar:?]
... 3 more
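The "Caused by" line points at the open scroll context limit. As a stop-gap (assuming some client is leaking scroll contexts rather than closing them), you could inspect and clear open scrolls, and raise the dynamic limit while the client is fixed; these are standard Elasticsearch APIs, sketched in Dev Tools syntax:

```
# See how many scroll contexts are currently open
GET _nodes/stats/indices/search

# Free all open scroll contexts (safe only if no client still needs them)
DELETE _search/scroll/_all

# Raise the limit temporarily; the real fix is closing scrolls client-side
PUT _cluster/settings
{
  "persistent": {
    "search.max_open_scroll_context": 1000
  }
}
```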

The reason you have yellow status indices is that you have the replica shard count set to 1, and replicas can never be allocated on a single-node cluster.
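On a single-node cluster you could drop the replica count to clear the yellow status; a sketch, where `my-index-*` is a placeholder pattern for your own indices:

```
PUT /my-index-*/_settings
{
  "index": {
    "number_of_replicas": 0
  }
}
```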

It seems like you have overridden the default per-node shard count limits, which are there for a reason, and now have an exceptionally high number of small shards. I would recommend you carefully read this section in the official documentation and reconsider how you are sharding your data. Shards are not free, and each one adds overhead to the cluster.
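To see where the shards are going, the `_cat` APIs give a quick overview; many tiny indices in this output would confirm over-sharding:

```
# Indices sorted by size on disk, with doc and shard counts
GET _cat/indices?v&s=store.size&h=index,pri,rep,docs.count,store.size
```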

I would also recommend that you upgrade to the latest version, as there have been improvements in how Elasticsearch handles large shard counts in recent versions (I am not sure whether the version you are using includes these).

Do unassigned shards potentially cause missing data in Elasticsearch?

Not while the cluster is in yellow status, since all primary shards are still allocated and only replicas are missing. A red status means one or more primary shards are unassigned, and that does make data unavailable.
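You can verify which shards are unassigned and why with the standard APIs, for example:

```
# List shards with their state and the reason they are unassigned
GET _cat/shards?v&h=index,shard,prirep,state,unassigned.reason&s=state

# Ask the cluster to explain the first unassigned shard it finds
GET _cluster/allocation/explain
```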

How have you identified that data is missing? How are you ingesting data? Do you see any errors there?

What is the hardware specification of the node? What type of storage are you using? Is it local SSD as recommended here?

I use Logstash to sync data from Oracle to Elasticsearch. I’ve checked the Logstash logs, and there are no exceptions. Logstash syncs the data from Oracle every 30 minutes. To verify the synchronized records, I retrieve the minimum ID from the Oracle table and the maximum synchronized ID from Elasticsearch.

I then check the count in Oracle with the following query:

SELECT COUNT(1) FROM table_name WHERE id BETWEEN (minimum_table_id) AND (maximum_synced_id_in_Elasticsearch);

Next, I retrieve the count of records with the same IDs from Elasticsearch and compare the two counts. If the counts match, I delete the synchronized records from the Oracle table to prevent significant table growth.
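One caveat with comparing counts alone: two different sets of IDs can have the same size, so matching counts do not prove the same records are present on both sides. A minimal sketch (plain Python; the two ID lists are assumed to be fetched for the same range, e.g. with a SELECT in Oracle and a scroll or aggregation in Elasticsearch) that diffs the actual ID sets before deleting anything from Oracle:

```python
def missing_ids(oracle_ids, es_ids):
    """Return the IDs present in Oracle but absent from Elasticsearch.

    Both arguments are iterables of record IDs covering the same
    ID range on each side.
    """
    return sorted(set(oracle_ids) - set(es_ids))


# Counts match (4 vs 4), but record 3 never made it to Elasticsearch:
print(missing_ids([1, 2, 3, 5], [1, 2, 4, 5]))  # [3]
```

Only deleting from Oracle when `missing_ids(...)` comes back empty is a stricter check than comparing the two counts.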

Storage type: HDD

Hardware specs of the node:

 "os": {
    "timestamp": 1742464425478,
    "cpu": {
      "percent": 11
    },
    "mem": {
      "total_in_bytes": 107373125632,
      "adjusted_total_in_bytes": 107373125632,
      "free_in_bytes": 27726508032,
      "used_in_bytes": 79646617600,
      "free_percent": 26,
      "used_percent": 74
    },
    "swap": {
      "total_in_bytes": 147557699584,
      "free_in_bytes": 31037906944,
      "used_in_bytes": 116519792640
    }
  }

As you have a lot of small indices, it is possible that you will end up with a lot of small writes when updating/indexing. I would recommend looking at I/O statistics, as storage may be a bottleneck. On Linux you could run iostat -x while indexing is in progress.
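Something like this, run on the Elasticsearch host while Logstash is writing (device names and useful intervals will differ on your system):

```
# Extended device stats every 5 seconds; watch %util and the await columns
iostat -x 5
```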

I am not sure how Logstash reports and handles different types of errors, so I can't comment on that. It may be wise to increase Logstash logging verbosity to check for issues in more detail.

When running Elasticsearch it is recommended to have swap completely disabled. I would recommend you address this as swap can cause serious performance issues.
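For reference, the usual approaches are disabling swap at the OS level, or locking the process memory in elasticsearch.yml (the latter also needs memlock limits raised for the elasticsearch user):

```
# Disable swap immediately; remove or comment out swap entries
# in /etc/fstab to make this persist across reboots
sudo swapoff -a
```

```
# elasticsearch.yml — alternative: lock the JVM heap in RAM
bootstrap.memory_lock: true
```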