I have a Logstash ingest process that normally ingests records at 4k events per second. Recently we added a field that is an object consisting of unique keys, which I store as the flattened data type. With the new field, the ingest rate starts at about 3k events per second and within an hour drops to 60 events per second, so I can't get more than 3 million of the 200 million records in. An example of the field looks like this:
{
  "que": [13],
  "cinq.": [30],
  "qui": [18],
  "toutes": [29],
  "beaucoup": [9],
  "du": [22],
  "des": [10],
  "lettres": [12],
  "Simon": [31],
  "(et": [17],
  "m'avez": [15],
  "parvenues)": [25],
  "sont": [21],
  "me": [20],
  "cheres": [3],
  "nombreuses": [11],
  "Je": [6],
  "tantes": [4],
  "vous": [7, 14, 27],
  "Dimanche": [0],
  "pas": [24],
  "reste": [23],
  "envoyees": [16],
  "et": [26],
  "embrasse": [28],
  "ne": [19],
  "remercie": [8],
  "Mes": [2]
}
The object has different keys and numbers for just about every record.
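For context, the pipeline writes to Elasticsearch with the standard elasticsearch output plugin, roughly like this (a sketch; the hosts and index name below are placeholders, not the real values):

output {
  elasticsearch {
    hosts => ["https://localhost:9200"]   # placeholder host
    index => "my-index"                   # placeholder index name
  }
}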
Since this field does not need to be filtered or sorted, I set it as:
"abstract_inverted_index": {
"type": "flattened",
"index": false,
"doc_values": false
}
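Since the keys differ per document, my understanding is that flattened maps the entire object as a single field, so the index mapping should not grow with each new key. That can be checked with the field mapping API (index name is a placeholder):

GET my-index/_mapping/field/abstract_inverted_index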
Any idea what could be making ingest slow down over time? I noticed that event latency gradually climbs from 5 ms to over 250 ms fairly quickly.
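In case it helps with diagnosis, the per-event latency can be pulled from the Logstash node stats API on the default monitoring port (each filter/output reports events.duration_in_millis and events.out, and their ratio gives a rough per-event processing time):

curl -s 'localhost:9600/_node/stats/pipelines?pretty'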