Hi,
I've been asked to save all events in Elasticsearch in two separate indexes:
one that is "classic" with a dynamic mapping
one that is "raw": the whole incoming event should be stored as a string, without any attempt to parse it in any way
Basis for the request: incoming events can have unexpected structures, and those must never be lost, which is what the "raw" index would guarantee.
I assume one way to do this would be to move the whole event input into a subfield of the event.
Trying to describe the situation:
incoming event: "this is not a json, just a basic string"
Should be stored in a given index as an entry similar to: {"@timestamp":1473694864119,"message":"this is not a json, just a basic string"}
I realize this is a very peculiar use case, but it makes sense in the context of "no lost events" and would still allow some searching in Kibana.
My research so far hasn't turned up an obvious way to achieve this, which is why I'm hoping the community will be able to help.
Use the clone filter to split each input event into two copies. Parse one of the copies and send it to one index, and send the other copy directly to another index without any filtering.
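Something along these lines should work as a starting point (a minimal sketch: the hosts and index names are placeholders, and it assumes the clone filter's default behaviour of setting the type field of the copy to the clone name):

    input {
      # your existing input(s) go here
    }

    filter {
      clone {
        clones => ["raw"]            # creates a second copy with type set to "raw"
      }
      if [type] != "raw" {
        # parsing for the "classic" copy only, e.g. grok/json/date filters
      }
    }

    output {
      if [type] == "raw" {
        elasticsearch {
          hosts => ["localhost:9200"]              # placeholder
          index => "events-raw-%{+YYYY.MM.dd}"     # placeholder index name
        }
      } else {
        elasticsearch {
          hosts => ["localhost:9200"]              # placeholder
          index => "events-%{+YYYY.MM.dd}"         # placeholder index name
        }
      }
    }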
Thanks for the information; it allows for a cleaner approach to the event duplication than what I had originally planned.
This leaves the issue of storing the whole event in a given index without Elasticsearch trying to parse it at all.
It's possible this question should be asked in the Elasticsearch group; I had assumed it would be easier to achieve by "mapping" the whole content of the event into a new field at the Logstash level.
This leaves the issue of storing the whole event in a given index without Elasticsearch trying to parse it at all.
I'm not sure what you mean by this. ES stores JSON documents and never modifies their contents. If you don't apply any filtering in Logstash, what's sent to ES will look exactly like what you describe,
{"@timestamp":1473694864119,"message":"this is not a json, just a basic string"}
except that @timestamp will be in a different date format (ISO 8601) than in your example.
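For example, a completely unfiltered plain-string event would be indexed roughly like this (the timestamp value here is illustrative, converted from the one in your example):
{"@timestamp":"2016-09-12T15:41:04.119Z","message":"this is not a json, just a basic string"}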
Hi again Magnus,
I probably wasn't fully functional when I wrote this request.
You are indeed correct, and I now have everything I need to achieve my goal in a clean and efficient way; thank you for this.
The reason for my confusion is that I hadn't created a Logstash configuration from scratch in a while; I usually had a full config to start from, so I hadn't had to review the basics for a long time.
What I found strange when reading the Logstash documentation is that, to my knowledge, there is no clear indication anywhere of the name of the field that stores your complete input; it has to be deduced from the examples: the "message" field.
The example I gave actually referenced that field intuitively, but it would be better if this information were stated at the beginning of the Logstash documentation.