Making documents searchable in Elasticsearch is quite an expensive operation, so data is batched up and published periodically. This is described in the docs.
The default refresh interval is 1 second, so that should be roughly the maximum delay. You can check the index settings to see the refresh interval. Note that this covers the delay between a document being indexed into Elasticsearch and it becoming available for search. Is that what you are referring to, or are you looking at the delay of the full indexing pipeline, e.g. including Filebeat, Logstash etc.?
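For reference, one way to check it with curl (assuming a local cluster and a hypothetical index named my-logs-index):

# Show the effective refresh interval; include_defaults=true is needed because
# the setting only appears under "defaults" when it has never been overridden.
curl -s 'http://localhost:9200/my-logs-index/_settings?include_defaults=true&filter_path=**.refresh_interval'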
The numbers in your first post indicate that each document takes ~0.5 ms to actually index into Elasticsearch, so the longest delay between a document being ready to index and it showing up in a search is the 1 s refresh interval + ~0.5 ms (so about 1 s).
Therefore, you most likely have a delay in your ingest architecture, i.e. between when the logs are produced, scraped, shipped, transformed and when they arrive at Elasticsearch.
Delay in ingestion is what I thought about too, but these are direct calls from a JavaScript app to the ES API; there is nothing in between except the network. The delays I see are on the order of an hour.
I run a query in the database and the same one in Kibana. They don't match for today, but for yesterday they do.
Besides that, I run a Kibana query for some time span and count the hits. When I rerun it a while later, the hit count has increased.
The document rate is fairly high: more than 1000 per minute, and they are nested JSON documents. There is a JS client that calls the ES API; to be precise, a library (bunyan-elasticsearch) is calling the ES JS client. But they do so in real time, with no buffering/processing in the middle.
1000/minute is not massive, but it's enough that you should be using the bulk API. I looked at the bunyan-elasticsearch code and I don't think it does. That doesn't directly affect document visibility, but it could mean you are building up a large client-side backlog of indexing requests, which would explain a delay. The bunyan-elasticsearch API supports a callback that is called when the write completes. Are you using this callback to check that writes are happening as fast as you think?
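For comparison, this is roughly what a bulk request looks like at the HTTP level; the index name my-logs-index and the documents are just placeholders, not bunyan-elasticsearch output:

# Write the documents as NDJSON: one action line plus one source line per
# document, with a trailing newline (required by the bulk API).
cat > bulk.ndjson <<'EOF'
{"index":{"_index":"my-logs-index"}}
{"@timestamp":"2024-01-01T00:00:00Z","msg":"first log line"}
{"index":{"_index":"my-logs-index"}}
{"@timestamp":"2024-01-01T00:00:01Z","msg":"second log line"}
EOF

curl -s -H 'Content-Type: application/x-ndjson' \
  -X POST 'http://localhost:9200/_bulk' \
  --data-binary @bulk.ndjson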
I don't think that callback is assignable by me. The only place logs may get piled up is the stream buffer, but that would also happen if the ES API responded slowly.
You can run the test I outlined earlier, where you bypass your client, to check how long it takes Elasticsearch to respond. Run it from Kibana or a separate script/process. If that is fast, the issue is most likely buffering in the client.
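A minimal sketch of such a test with curl, assuming a local cluster and a hypothetical index named delay-test; curl's -w flag prints how long the request took:

# 1) Index one document directly and print how long Elasticsearch took to respond.
curl -s -o /dev/null -w 'index call took %{time_total}s\n' \
  -H 'Content-Type: application/json' \
  -X POST 'http://localhost:9200/delay-test/_doc' \
  -d '{"msg":"timing test"}'

# 2) Wait out the refresh interval (default 1 s), then confirm the doc is searchable.
sleep 2
curl -s 'http://localhost:9200/delay-test/_search?q=msg:timing&filter_path=hits.total'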
Curious if any answers/explanations were found here. This is the sort of issue that often gets chucked my way; "more than minutes of delay" often means a lack of understanding rather than an actual bug.
Another idea (though take all ideas under advisement) is to, say, inject the same dummy document (the only differences being the timestamp and _id, of course) into the exact same index every X seconds, even X=1, using a completely different client, maybe directly with a PUT or POST call in a simple shell script. Log the precise time of the acknowledgement received, checking it is a success, of course.
Then search for that document in Kibana; you should see one per second with a short indexing delay. If you do, the "problem" is elsewhere. If you don't, then there is some other issue with your Elasticsearch/Kibana setup.
We have had lots of confused people: say a log entry appears in a /var/log/blabla.log file and only shows up in Kibana 30-60 seconds later. Yes, it went through Logstash/Kinesis/whatever pipelines during those 30-60 seconds; it's not a bug.
Here is such a silly little shell script, assuming Elasticsearch is running locally; adjust accordingly.
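Something along these lines, with a hypothetical index name delay-test and one dummy document per invocation:

#!/bin/sh
# Index one dummy document stamped with the current time and log the response,
# so the acknowledged index time can be compared with when it shows up in Kibana.
NOW=$(date -u +%Y-%m-%dT%H:%M:%SZ)
RESPONSE=$(curl -s -H 'Content-Type: application/json' \
  -X POST 'http://localhost:9200/delay-test/_doc' \
  -d "{\"@timestamp\":\"$NOW\",\"msg\":\"delay probe\"}")
# Expect "result":"created" in the response if the write succeeded.
echo "$NOW $RESPONSE" >> /tmp/delay-test.log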
I run it via something like:
watch -n 1 ./silly.sh
to run every second.
In my Kibana, the "delay" between the indexing command returning and the result appearing in a Kibana search is so short it is hard to measure; around 1 s seems about right.
I can immediately see the doc I insert in one of the indices, but for some reason I can't see it in the Discover tab in Kibana. Is there a buffer or something between Kibana and ES?
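One way to cross-check is to run an equivalent query directly against the index, from Kibana Dev Tools or curl, with a time filter similar to what Discover applies; a sketch using the hypothetical delay-test index from above:

# Count documents from the last 15 minutes and compare with what Discover
# shows for the same index pattern and time range.
curl -s -H 'Content-Type: application/json' \
  'http://localhost:9200/delay-test/_count' \
  -d '{"query":{"range":{"@timestamp":{"gte":"now-15m"}}}}'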