Hi, I need to set different ILM policies based on a log field value (one for PROD, another for NONPROD), so I created an ingest pipeline that renames the `_index` field, as @ruflin suggested in a GitHub issue. It works well when I manually reindex the data stream through the ingest pipeline with the new processors.
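The pipeline does roughly the following (a simplified sketch; the `environment` field and the index names are illustrative placeholders, not the exact config):

```
PUT _ingest/pipeline/route-by-env
{
  "processors": [
    { "set": { "if": "ctx.environment == 'PROD'",    "field": "_index", "value": "logs-myapp-prod" } },
    { "set": { "if": "ctx.environment == 'NONPROD'", "field": "_index", "value": "logs-myapp-nonprod" } }
  ]
}
```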
The problem is that new incoming logs are not being ingested at all, neither under the original index name set in Fleet nor under the renamed index names.
Also, how can I recover the logs I lost during the process? With standalone Filebeat I would remove the registry folder, but I'm not sure how to do that with Fleet, or whether there is a different way to achieve it.
Update on what I did to work around it:

- Created a new integration per environment, so now we have one for PROD and another for NONPROD. This change separates the indices and sends the documents to different ingest pipelines.
- Created two ingest pipelines: one that drops the PROD docs and another that drops the NONPROD docs (see the sketch after this list).
- Adjusted the index templates to pick up these new data streams.
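A sketch of the two drop pipelines (again using the illustrative `environment` field; the real condition depends on how the env is tagged):

```
PUT _ingest/pipeline/logs-nonprod
{
  "processors": [
    { "drop": { "if": "ctx.environment == 'PROD'" } }
  ]
}

PUT _ingest/pipeline/logs-prod
{
  "processors": [
    { "drop": { "if": "ctx.environment == 'NONPROD'" } }
  ]
}
```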
Hi @Gustavo_Llermaly, for Filebeat the registry still exists, but it is now in the data directory of Elastic Agent. Stop Elastic Agent, find the registry file, and remove it. If you start the agent again, all data should be shipped again.
For the pipeline you shared, I'm surprised the docs did not show up at all, because you have on_failure handling. So even if there was a failure, I would expect the data to be ingested.
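(For reference, an on_failure block catches processor failures so the document is still indexed, with the error attached; a minimal illustration with placeholder names:)

```
PUT _ingest/pipeline/route-by-env-safe
{
  "processors": [
    { "set": { "field": "_index", "value": "logs-myapp-{{environment}}" } }
  ],
  "on_failure": [
    { "set": { "field": "error.message", "value": "{{ _ingest.on_failure_message }}" } }
  ]
}
```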
An integration per env makes sense. I assume you also set different namespaces to separate the data. The part I don't understand is why both kinds of logs end up in the same pipeline, so that you have to drop one of them there. Are these different hosts or log files, or are the events mixed together?
Thanks @ruflin for your answer. I tried to find the registry file on my Mac with no success (the filebeat folder with the logs was there). I will try the same on the actual machine.
I'm also surprised the docs were not going in; I tried locally and the result was the same. I expected the pipeline to do nothing if the index could not be renamed, but the documents just disappeared. It would be great if you could test this locally and validate it.
Yes, I set different namespaces, with the same dataset for each integration. And yes, the logs are mixed together. I have seen this use case twice in the last month, and my concern is processing the same file twice. Index renaming sounded more performant, but I was not able to make it work.
I will just close this one, thanks for your answer @ruflin. We removed the registry and are ingesting data again. I'm not happy about processing everything twice, but it is what it is.
Index splitting, plus the ability to ignore logs older than some threshold based on a field, should be a must. We see "all history ingested in one day" cases all the time, and it makes sense: people install Fleet, point it at a folder full of logs, and expect ILM to solve everything.
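Today this can be approximated with a drop processor; a rough sketch (it assumes `@timestamp` is an ISO-8601 date, uses an arbitrary 7-day cutoff, and `event.ingested` is just a scratch field holding the ingest timestamp):

```
PUT _ingest/pipeline/drop-old-logs
{
  "processors": [
    { "set": { "field": "event.ingested", "value": "{{{_ingest.timestamp}}}" } },
    {
      "drop": {
        "if": "ctx['@timestamp'] != null && ZonedDateTime.parse(ctx['@timestamp']).isBefore(ZonedDateTime.parse(ctx.event.ingested).minusDays(7))"
      }
    }
  ]
}
```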
Hi @Gustavo_Llermaly, I'm not fully happy with the solution you had to use. We are currently working on quite a few efforts around data routing / index splitting, and hopefully we can soon provide a better experience around the custom log integrations.
On ignoring old data: agreed, in the context of Fleet and integrations we need to think about ways to make this configurable.