Occasional spikes in Elasticsearch indexing time

Hi, we are currently working with a single-node ES instance, to which our application connects using ESJavaClient. Recently we re-indexed all our indexes after enabling search on a time-based field. Since the migration, we have been observing a sudden increase in indexing time, specifically for the newly created indexes. The average indexing time for the new indexes is consistently greater than 300ms, compared to an average of around 10ms for the other indexes.

Some basic debugging details:

  • We tried deleting the whole index and letting a new one get created with no initial data. Even then, the average time to index a document into it was over 250ms, sometimes reaching 1000ms, whereas older indexes were updated in under 10ms at the same time.
  • There is no correlation between index size and the delay; indexes 20 times the size are not facing these delays.
  • There has been an overall increase in indexing time across the board, but it is not extremely large.
  • RAM usage is very high for the single node, usually above 95%.
ip        heap.percent ram.percent cpu load_1m load_5m load_15m node.role   master name
127.0.0.1           11          96  34   21.43   22.86    23.39 cdfhilmrstw *      socrates
  • We also checked the hot threads output, and it usually shows 100% CPU utilization in the write and flush threads (the API calls behind both outputs are shown right after this list).
100.0% [cpu=99.7%, other=0.3%] (500ms out of 500ms) cpu usage by thread 'elasticsearch[socrates08][write][T#18]'
 100.0% [cpu=99.0%, other=1.0%] (500ms out of 500ms) cpu usage by thread 'elasticsearch[socrates08][flush][T#8]'
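
For reference, both outputs above come from the standard node monitoring APIs:

# node overview (heap, RAM, CPU, load averages)
GET _cat/nodes?v

# per-thread CPU breakdown
GET _nodes/hot_threads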

Extra details on implementation and memory allocation:

  • Maximum number of concurrent indexing requests from the application: 6
  • Provided number of CPU cores: 10 (increased from 5 before the migration)
  • Provided memory to the ES process: 80GB (increased from 64GB before the migration)
  • Heap memory allocated: 30GB
  • Number of primary shards allocated to each index: 1
  • Number of replica shards allocated to each index: 0 (see the settings sketch after this list)
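
For completeness, each index is created with a single primary shard and no replicas, i.e. settings of roughly this form (shown here for index1):

PUT /index1
{
  "settings": {
    "index": {
      "number_of_shards": 1,
      "number_of_replicas": 0
    }
  }
}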

We understand there are some basic optimizations we can make, such as moving to a cluster or using bulk indexing more efficiently. However, given the nature of the issue, we wanted to know whether adding resources or making some other change could fix it for now. Please let us know if any other details would help with further debugging.

Thanks,
Priyansh Maheshwari

Are you indexing using bulk requests? If so, what size are you using for your bulk requests?

Are you using dynamic mappings? If this is the case it is possible that smaller indices will result in more mapping updates, which need to be persisted to the cluster state.

RAM usage approaching 100% is expected as the page cache fills up - it is normal and not a problem by itself.

Your load average is very high. Do you by any chance have very slow storage, e.g. HDD? For optimal indexing speed it is recommended to use local SSDs (or storage with the same level of performance).

What kind of disks are you using? Elasticsearch is very disk intensive, so slow disks (HDD) might heavily impact performance. If I remember correctly, you should be able to see this as the system spending a lot of time waiting for I/O (top on a Linux machine will show this, for example).

Thanks for the quick response.

Are you indexing using bulk requests? If so, what size are you using for your bulk requests?

No, we are sending individual document requests. The mappings are the same for all indexes.
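
Each write from the application ends up as a single-document request of roughly this shape (field names are placeholders):

PUT /index1/_doc/external-doc-id
{
  "timestamp": "2023-01-01T00:00:00Z",
  "field1": "value1"
}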

Are you using dynamic mappings? If this is the case it is possible that smaller indices will result in more mapping updates, which need to be persisted to the cluster state.

No, the mappings are defined when the index is created, and any new dynamic field is not indexed by default.
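
Concretely, each index is created with an explicit mapping and dynamic fields left unindexed, roughly along these lines (field names are placeholders). With "dynamic": false, unexpected fields are kept in _source but do not trigger mapping updates:

PUT /index1
{
  "mappings": {
    "dynamic": false,
    "properties": {
      "timestamp": { "type": "date" },
      "field1":    { "type": "keyword" }
    }
  }
}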

Your load average is very high. Do you by any chance have very slow storage, e.g. HDD? For optimal indexing speed it is recommended to use local SSDs (or storage with the same level of performance).

We are using GPFS partitions to store our ES data. It is not as fast as SSD, but it performs better than HDD.

While going through your suggestions, I also noticed that the index with the highest delay also has an unusually high average update size. I presume this could be a contributing factor.

Average Index Update Sizes

  • index1: 79,208 bytes
  • index2: 5,170 bytes
  • index3: 6,295 bytes

Average Indexing Times

  • index1: 283 milliseconds
  • index2: 14 milliseconds
  • index3: 13 milliseconds

I will check the slow logs for that index and see if I can spot any discrepancy.

Indexing and updating individual documents adds a lot of overhead compared to using bulk requests, as each request must be committed to disk. In my experience this amplifies the impact of slow storage, so I would recommend switching to bulk requests.
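
For illustration, a single bulk request batches many of those operations into one round trip and one commit; a few hundred to a few thousand documents per request is a common starting point (index names, IDs and values below are only an example):

POST /_bulk
{ "index": { "_index": "index1", "_id": "id-1" } }
{ "field1": "value1" }
{ "index": { "_index": "index1", "_id": "id-2" } }
{ "field1": "value2" }
{ "update": { "_index": "index1", "_id": "id-3" } }
{ "doc": { "field1": "value3" } }

The Java client exposes an equivalent bulk API, so the application does not need to build this payload by hand.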

Check await and other I/O stats on the nodes to see if the storage is the bottleneck.

Indexing and updating individual documents adds a lot of overhead compared to using bulk requests, as each request must be committed to disk.

Yeah, that makes sense, and we are planning to migrate to bulk updates. However, the fact that the indexing time is high only for certain indexes suggests there could be another cause as well, since the storage and await values are similar for all of them.

The correlation between update size and latency seems quite clear. Is there any difference in document size (not just the size of updates) or mappings, e.g. the type of fields and features (nested documents, parent-child, vectors etc.) used?

Is the indexing rate the same for the indices you are comparing? As you are indexing/updating with an external ID, might a more frequently updated index be cached to a greater extent?

Is there any difference in document size (not just the size of updates) or mappings, e.g. the type of fields and features (nested documents, parent-child, vectors etc.) used?

There is no difference there; in fact, after making some modifications to the data we are pushing, the document sizes for all indexes have become similar. However, it seems that internal segment merging is being triggered very frequently for index1, which is slowing down indexing. We verified this by checking that slow-log entries with 500ms+ durations show up more often while the merge thread for index1 is visible in hot_threads.

Is there any recommended setting for segments? For now, we have kept the default segment settings for all indexes.

If merging slows down indexing, it sounds to me like the storage you are using may provide inadequate performance. Have you looked at await and other I/O statistics, e.g. using iostat -x?


avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.07    0.01    1.77    0.10    0.00   95.05

Device            r/s     w/s     rkB/s     wkB/s   rrqm/s   wrqm/s  %rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz  svctm  %util
sda              8.46   14.41    890.47    215.79     2.27     3.18  21.15  18.06   16.02    0.32   0.14   105.26    14.97   0.16   0.37
dm-0             8.37   17.23    890.29    215.76     0.00     0.00   0.00   0.00   22.79    1.08   0.21   106.42    12.52   0.14   0.35

Going through the stats, I believe the disks are not the bottleneck causing the delay, as they are mostly idle.

Moreover, the delay has recently increased for all indexes. Of course it is extremely high for that one particular index, but it is fairly high for the other indexes as well.

I tried making some minor modifications to the indexes' merge settings:

{
  "index": {
    "refresh_interval": "3s",
    "merge": {
      "policy": {
        "max_merge_at_once": "20",
        "segments_per_tier": "20",
        "max_merged_segment": "10gb"
      },
      "scheduler": {
        "max_thread_count": "1"
      }
    }
  }
}

I also tried setting the flush operation to async, but none of these changes led to any noticeable improvement in performance.
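
For reference, the async change was along the lines of the translog durability setting (async fsyncs the translog in the background instead of on every request, trading a small durability window for latency):

PUT /index1/_settings
{
  "index": {
    "translog": {
      "durability": "async"
    }
  }
}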

As you are performing updates or indexing with an external document ID, each indexing operation will need one or more reads in addition to a write. The r_await values are quite high and could be affecting indexing latencies if the files to be read are not cached.
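
To make the read/write split concrete: with an explicit ID Elasticsearch has to check whether that ID already exists (and, for an update, fetch the current document) before writing, whereas with auto-generated IDs that lookup can be skipped. Compare the two forms below; the second is only an option for documents that are genuinely new, since updates must reference an existing ID.

# explicit external ID: lookup + write
PUT /index1/_doc/external-doc-id
{ "field1": "value1" }

# auto-generated ID: write only
POST /index1/_doc
{ "field1": "value1" }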

I would recommend switching to bulk requests and see what difference that makes.