Latency in seeing a document

It seems to be a delay between the time a doc is put and the time it is searchable in the Kibana. I wanted to find the reason.

      "indexing": {
        "index_total": 3568228,
        "index_time_in_millis": 1947151,
        "index_current": 0,
        "index_failed": 0,
        "delete_total": 0,
        "delete_time_in_millis": 0,
        "delete_current": 0,
        "noop_update_total": 0,
        "is_throttled": false,
        "throttle_time_in_millis": 0
      }

index_current is zero so it is not indexing anything. If so then were the docs are being held back. I also like to know how much the latency is.

Making documents searchable in Elasticsearch is a quite expensive operation, so data is batched up and published periodically. This is described in the docs.

Sure but I am looking for an indicator in the stats to give a hint on refresh/indexing workload.

In other words if I didn't know I am one hour behind, which indicator could have shown that I was?

The default refresh interval is set to 1 seconds, so that should approximately be the max delay. You should be able to check the index settings to see the refresh interval. This concerns the delay between a document being indexed into Elasticsearch and it becoming available for search. Is this what you are referring to or are you looking at the delay of a full indexing pipeline, e.g. including Filebeat, Logstash etc?

My refresh interval is 1 sec but I see more than minutes delay.

Go to the Kibana console and create a new document with a specified document ID. Then immediately get the document based on the custom ID you set . This will return the document even if it is not searchable. Then repeatedly run a search query based on the document ID (or some other query based on data in the document that is unique to the document) and see how long it takes until it appears.

The numbers in your 1st post indicate the each document is taking ~.5ms to actually index into elastic... so the longest delay you would see by the time a document is ready to be indexed and when it would show up in a search is 1s Refresh + .5ms (so about 1s)

Therefore, you Most likely have a delay in your ingest architecture.
I when the logs is produced, scraped, shipped, transformed and arrives at Elasticsearch

1 Like

Delay in injection is what I thought about. But it is direct calls from a Javascript app to the ES API. There is nothing in between except network. Delays I see are in order of an hour.

Show us the complete document... are you accounting for timezone?

How are you searching for the expected document?

Are you specific enough so that you can be sure the document is in the returned result set?

Another thing that can I have seen when hour long delays have been reported is timezone issues for timestamps in the document.

I make a query in the database and make one in Kibana. They won't match for today but for the yesterday they do.
Beside that I make a Kibana query for some time span and count the hits. After a while rerun it and see hit count is increased.

That sound a lot like delay in ingest architecture.. .

How Many Docs / Sec etc
How Are you Ingesting?
What is in the middle?

Most likely this is not on the Elasticsearch Side...

1 Like

Docs/Sec is a lot. More than 1000 in a minute and they are nested JSONs. There is a JS client that calls ES API. To be precise a library (bunyan-elasticsearch) is calling ES JS client. But they do so in realtime. There is no buffering/processing in the middle.

1000/minute is not massive, but it's enough that you should be using the bulk API. I looked at the bunyan-elasticsearch code and I think it's not doing so. This doesn't directly impact document visibility, but it might mean that you are building up a large client-side backlog of indexing which would explain a delay. The bunyan-elasticsearch API supports a callback which is called when the write completes. Are you using this callback to check that writes are happening as fast as you think?

Don't think that callback is assignable by me. The only place logs may get piled up is stream buffer. But this also would happen if the ES API respond slowly.

You can run the test I outlined earlier where you bypass your client to test how long it takes for Elasticsearch to respond. Run it from Kibana or a separate script/process. If that is fast the issue is most likely buffering in the client.

Exactly. You have to rule this out.

curious if any answers/explanations found here. This is sort of issue that often gets chucked my way. "more than minutes delay" often means a lack of understanding, rather than an actual bug.

Another idea (tho take all ideas under advisement) is too say inject the same dummy document (only difference is the timestamp and _id of course) into exact same index every X seconds, even X=1, using a completely different client, maybe directly with a PUT or POST call in a simple shell script. Log the precise time of the acknowledgement received, checking it is a success of course.

Then in kibana search for that document, you should see one per second with a short indexing delay. if you do, the "problem" is elsewhere. If you dont, then there's some other issue with your elasticsearch/kibana setup.

We had lots of people confused that say a log entry appears in a /var/log/blabla.log file, and only appears in kibana say 30-60 seconds later. Yes, it went through Logstash/kinesis/whatever pipelines during that 30-60 seconds - it's not a bug.

1 Like
% cat silly.sh
#!/bin/sh
EUS="xxx"
EPW="yyy"
EHOST="localhost"
EPORT="9200"
day=`date +"%Y-%m-%d"`
timestamp=`date +"%Y-%m-%d %T"`
value1=$(echo $RANDOM | md5)
curl \
    -s -k -u "${EUS}":"${EPW}" -XPOST \
    "https://${EHOST}:${EPORT}/dummy-"${day}"/_doc" \
    -H 'Content-Type: application/json' \
    -d "{\"timestamp\":\"$timestamp\",\"key1\": \"$value1\"}"

is such a silly little shell script, assuming elasticsearch is running locally, adjust accordingly.

I run it via something like:

watch -n 1 ./silly.sh

to run every second.

In my kibana the "delay" between the indexing command returning and result appearing in kibana search is hard to measure its so short, around 1s seems about right.

I can immediately see the doc I insert in one of the indices. For some reason I can't see it in Discover tab in Kibana. Is there a buffer of something between Kibana and ES?