We would like to process old IIS log files and have the data stored in indices that correspond to each event's date, not the date on which it was harvested. We are using version 6.5.3 of Elasticsearch, Filebeat, and the rest of our stack. The IIS module is enabled in filebeat.yml.
We recently updated to use Monthly Indices, and this is working great. Based on some research and preliminary testing we are working on the following assumptions. If we're off on any of this we'd appreciate knowing.
The default.json file in the module\iis\access\ingest folder is where we update our grok pattern. In the same file we can see that the built-in @timestamp field is reset to the event's iis.access.time value, and the original @timestamp value is preserved in read_timestamp.
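To make sure we are reading the pipeline correctly, here is a toy Python model of the two timestamp steps as we understand them. It is only a sketch of our understanding, not the module's actual code: the function name apply_iis_pipeline, the sample values, the "yyyy-MM-dd HH:mm:ss" time layout, and the UTC time zone are all assumptions made for illustration.

```python
from datetime import datetime, timezone


def apply_iis_pipeline(doc):
    """Toy model of the two timestamp steps we see in default.json."""
    out = dict(doc)
    # Step 1: keep the harvest (read) time under read_timestamp.
    out["read_timestamp"] = out.pop("@timestamp")
    # Step 2: parse the event's own time and use it as @timestamp.
    # The time layout and UTC are assumptions for this sketch.
    event_time = datetime.strptime(out["iis.access.time"], "%Y-%m-%d %H:%M:%S")
    out["@timestamp"] = event_time.replace(tzinfo=timezone.utc)
    return out


# Hypothetical event: logged in June, harvested in December.
harvested = {
    "@timestamp": datetime(2018, 12, 20, 15, 0, 0, tzinfo=timezone.utc),
    "iis.access.time": "2018-06-15 08:30:00",
}
result = apply_iis_pipeline(harvested)
print(result["@timestamp"].isoformat())
print(result["read_timestamp"].isoformat())
```

After these steps the document carries the June event time in @timestamp and the December harvest time in read_timestamp, which matches what we see in the stored documents.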
Within filebeat.yml, we set up our output to create monthly indices.
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["#{Elasticsearch.Dns.Name}:9200"]
  index: "filebeat-monthly-%{[beat.version]}-%{+yyyy.MM}"
However, when we tried to process IIS logs from June, they were put into filebeat-monthly-6.5.3-2018.12. Based on the Format String (sprintf) documentation, we expected the date in the index name to be based on the @timestamp field. Looking at the documents in the index, both @timestamp and read_timestamp are accurate: they show that the event is from June but was processed today.
Our assumption is that @timestamp should drive which index receives the documents. We must be mistaken, because documents are going into the index that matches read_timestamp, not the event's @timestamp. What are we missing?
Also, at what point is the index name determined? We are curious whether it used the @timestamp field while that field still held the read time. Would that mean the index name is decided before the ingest pipeline processors have a chance to update fields?
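To illustrate what we suspect is happening, here is a small Python sketch that compares the two possible orderings. It is a toy model of our guess, not Filebeat's or Elasticsearch's real behavior or code: resolve_index and ingest_pipeline are hypothetical helpers, and the dates are made-up sample values.

```python
from datetime import datetime, timezone


def resolve_index(beat_version, timestamp):
    # Mimics "filebeat-monthly-%{[beat.version]}-%{+yyyy.MM}".
    return f"filebeat-monthly-{beat_version}-{timestamp:%Y.%m}"


def ingest_pipeline(doc, event_time):
    # Mimics the pipeline: keep the read time, then overwrite @timestamp.
    out = dict(doc)
    out["read_timestamp"] = out.pop("@timestamp")
    out["@timestamp"] = event_time
    return out


read_time = datetime(2018, 12, 20, 15, 0, tzinfo=timezone.utc)   # harvest time
event_time = datetime(2018, 6, 15, 8, 30, tzinfo=timezone.utc)   # time in the log line
doc = {"@timestamp": read_time}

# Suspected order: the index name is chosen first, the pipeline runs afterwards.
index_suspected = resolve_index("6.5.3", doc["@timestamp"])
stored = ingest_pipeline(doc, event_time)

# Order we expected: the index name is chosen from the updated @timestamp.
index_expected = resolve_index("6.5.3", stored["@timestamp"])

print(index_suspected)
print(index_expected)
```

If the suspected ordering is right, the document lands in the December index even though its stored @timestamp is from June, which is exactly what we observe.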
Thank you!