recvdTime is not a standard field from Beats or Logstash as far as I know; it is most likely coming from a field in your log or from a step in your processing pipeline.
@timestamp can be set by Beats or Logstash. For example, with Windows event logs the @timestamp field is set to the time from the log itself.
If there is no @timestamp field in the document when it arrives in Elasticsearch, then Elasticsearch will set the timestamp to the time it received the document.
Thanks. Yes, recvdTime is a field coming from the logs.
So, to my question: if @timestamp is a field introduced by Elasticsearch, does a difference of 6 to 7 hours between recvdTime and @timestamp indicate slowness or buffering, either in the Filebeat-to-Logstash communication or due to network slowness?
If you have Logstash in your pipeline, I believe it will set the @timestamp field to the time it received the event. There are several things that can cause differences in timestamps, e.g.:
Differences in system time on the hosts that the components run on (see the quick check after this list).
Timezone issues when parsing timestamps. This often results in a large and largely static difference, typically a whole number of hours.
Delays in reading the initial log.
Issues sending data down the pipeline, e.g. from Filebeat to Logstash or from any of these components to Elasticsearch. This can happen if some component goes down or is temporarily unavailable.
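For the first point, a quick way to check for clock skew is to compare UTC time across the hosts in the chain. A minimal sketch, assuming SSH access and with placeholder host names:

# Print UTC time on each host in the chain and compare the results
for host in filebeat-host logstash-host elasticsearch-host; do
  echo -n "$host: "
  ssh "$host" date -u
done

If the hosts are synced via NTP or chrony they should agree to within a second or so.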
I would recommend analysing the logs to see if there is any pattern across the ones that show a delay, e.g. a specific log format, source or host. That may help you narrow down where the issue is. Also look at the logs from the various components to see if there are any issues reported.
If you are running a load test it is possible that some part of the chain is a bottleneck. If that were the case I would expect to see the delay grow over time. Maybe you can create an ingest pipeline that calculates the delay in seconds and stores this in a new field on the events (sketched below), then use it to update the data already ingested as well as any new data. That way you can easily plot delay over time and slice by source and/or component in Kibana to identify patterns. I believe this is described in this blog post as well as here.
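A minimal sketch of such an ingest pipeline, assuming your Elasticsearch version supports ingest pipelines and that recvdTime arrives as an ISO 8601 string (the pipeline name calc_ingest_delay and the field name ingest_delay_seconds are just placeholders):

// Hypothetical names; adjust the parsing if recvdTime is not ISO 8601
PUT _ingest/pipeline/calc_ingest_delay
{
  "description": "Store the delay between recvdTime and @timestamp in seconds",
  "processors": [
    {
      "script": {
        "lang": "painless",
        "source": "ZonedDateTime received = ZonedDateTime.parse(ctx['recvdTime']); ZonedDateTime indexed = ZonedDateTime.parse(ctx['@timestamp']); ctx['ingest_delay_seconds'] = ChronoUnit.SECONDS.between(received, indexed);"
      }
    }
  ]
}

Depending on your version you could then apply it to data that is already indexed with _update_by_query?pipeline=calc_ingest_delay, and to new data by setting index.default_pipeline on the index.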
It could be because Elasticsearch is not able to keep up. Even though the version you are using is very old and has been EOL for a long time (you should look to upgrade as soon as possible), I do not think the version in itself is the cause; I have benchmarked high ingest loads on significantly older versions.
When it comes to indexing, it is often disk I/O that is the limiting factor. I would recommend you monitor disk I/O, e.g. using iostat -x on the Elasticsearch nodes, to see if this looks like a possible cause. What type of storage are you using? Do you have monitoring installed so you can see the indexing rate you are achieving?
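For example, something like this on each Elasticsearch data node (the 5-second interval is just a suggestion):

# Extended device statistics, refreshed every 5 seconds
iostat -x 5

Sustained %util close to 100 or consistently high await values would point to disk I/O as the bottleneck.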