Logstash Output to Elasticsearch using today's date instead of @timestamp

By default, when Logstash outputs to Elasticsearch, the index is "logstash-%{+YYYY.MM.dd}".

From what I can tell, the %{+YYYY.MM.dd} is coming from the document's @timestamp field.

My question: Is it possible to force Logstash to use "Today" as the date, as opposed to the @timestamp field?

Reason: I'm ingesting logs from a ton of devices I don't manage, and sometimes they come in with really out-of-whack timestamps. Then Elasticsearch creates new indices all over the place with 1-2 logs in each. That ends up with a ton of extra shards and the memory needed to manage them, which feels dirty :slight_smile:

(Yes, I know a "right" thing to do is to fix the sources....but that's just not as simple as it sounds :smiley: )
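What I'm imagining is something along these lines: an untested sketch where a ruby filter stamps the pipeline's "now" into a metadata field ([@metadata][index_day] is just a name I made up, and the hosts value is only an example) so the index name no longer depends on @timestamp:

filter {
  # write today's (UTC) date into metadata instead of using whatever
  # @timestamp the device sent
  ruby {
    code => "event.set('[@metadata][index_day]', Time.now.utc.strftime('%Y.%m.%d'))"
  }
}

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    index => "logstash-%{[@metadata][index_day]}"
  }
}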

AFAIK, if you have not modified @timestamp, it will by default be the time Logstash received your events.

However, most people modify the @timestamp field to match the timestamp of the event as logged by the log source, rather than the actual time the log is received by Logstash. This is particularly helpful if you pull the logs on a schedule rather than getting them in near real time, because then the original timestamp is retained.
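The usual pattern is a date filter along these lines (the source field name and format here are just assumptions about your logs):

filter {
  date {
    # parse the timestamp the device logged and overwrite @timestamp with it
    match  => ["log_time", "ISO8601"]
    target => "@timestamp"
  }
}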

So the %{+YYYY.MM.dd} is actually the day the log is received by Logstash, as per the docs.

We're definitely updating @timestamp to match whatever is in the logs.

The documentation is super vague in its description of what the source of %{+YYYY.MM.dd} is. I assumed the date it used would be "today".

Until I had a device with its date set to January 1, 1970 send a log, and Logstash happily created a new index called "logstash-1970-01-01". This is why I think the date it's using comes from @timestamp.

The environment I'm logging has about 10,000 devices. If even 1% of them have incorrect timestamps set, that's enough to create thousands of unnecessary shards in my cluster. Our daily indices are created with 30 primary shards, so a single log entry on a weird date ends up causing 30 extra shards.
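One workaround I'm toying with (untested sketch, and the one-day window is an arbitrary number I picked) is to fall back to ingest time whenever the parsed @timestamp is too far from "now":

filter {
  ruby {
    code => "
      # if the event's timestamp is more than a day away from now, assume the
      # device clock is wrong and use the time Logstash received the event
      drift = (Time.now.to_f - event.get('@timestamp').to_f).abs
      if drift > 86400
        event.set('@timestamp', LogStash::Timestamp.now)
        event.tag('timestamp_out_of_range')
      end
    "
  }
}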

I just noticed in the documentation that there is an option to add this field to a document:

[@metadata][target_index]

This might be a solution to my problem. I wonder, though: what would happen if a document had this set, but Logstash itself had a specific setting for index in its output config?
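Something like this is what I had in mind, though I'm assuming (not sure) that the metadata field only takes effect if the output's index setting actually references it; the value here is made up:

filter {
  mutate {
    # decide the destination per event and stash it in metadata
    add_field => { "[@metadata][target_index]" => "logstash-current" }
  }
}

output {
  elasticsearch {
    index => "%{[@metadata][target_index]}"
  }
}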

Out of curiosity, I index old documents to Elasticsearch. My output is simple:

output {
  elasticsearch {}
}

The docs I'm indexing have an @timestamp field, and Logstash created the index with today's date.

Weird...now I'm wondering if the behaviour I'm seeing is coming from somewhere else. I'll see if I can get a screenshot.
