I've written a Beat and it's working like a champ for our time-series data. However, we also have management data we'd like to index and I've hit a problem.
The management data has timestamps and ID values for every document that we're indexing. This data changes over time, both in terms of modification (requiring a re-index, not an update) and deletion (when the object is removed altogether).
From what I've seen (in libbeat/outputs/elasticsearch/client.go), the document will only ever be created if you specify the ID. Specifically, this portion of code:
	if id != "" {
		return bulkCreateAction{meta}, nil
	}
	return bulkIndexAction{meta}, nil
Am I correct in that conclusion or am I missing something? If the libbeat client doesn't support this, is there another way I can implement it? We could live without deletion by simply marking the document as deleted and filtering on that flag in queries, but that risks showing deleted data whenever a query omits the filter.
You are correct in your understanding that libbeat cannot perform updates or deletes.
For the management data, roughly how many documents are we talking about here? Would it be an option to periodically index the latest management data into a new index, then delete the old one?
Thanks for confirming. I've created an issue in the repo asking if there's a non-obvious reason why this is the case.
We're not certain how much of this type of data we're expecting to receive. We won't find out until we deploy in a live environment.
There are other ways I could get the data into the indices but I'd rather have only one code base to maintain.
I have forked libbeat and implemented this functionality myself for now. I added support for an op_type metadata property which can be blank, create, index or delete. The current behaviour in master still works the same so it's a backward compatible change.
If no-one comes back to me with a technical reason this shouldn't be allowed, I'll open a PR.
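To make the op_type idea concrete, here is a minimal sketch of how such a property could be mapped onto the action line of an Elasticsearch bulk request. It is illustrative only and is not the code from the fork: bulkMeta and actionLine are hypothetical names, and the metadata is reduced to _index and _id. A blank op_type keeps the behaviour quoted earlier in the thread (create when an ID is present, index otherwise).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// bulkMeta is a hypothetical, simplified stand-in for the per-document
// metadata that precedes each document in a bulk request.
type bulkMeta struct {
	Index string `json:"_index"`
	ID    string `json:"_id,omitempty"`
}

// actionLine builds the action/metadata line of a bulk request for the
// given op_type. An empty op_type falls back to the existing behaviour:
// "create" when an ID is set, "index" otherwise.
func actionLine(opType string, meta bulkMeta) (string, error) {
	switch opType {
	case "":
		if meta.ID != "" {
			opType = "create"
		} else {
			opType = "index"
		}
	case "create", "index":
		// Accepted as given.
	case "delete":
		// A delete targets an existing document, so it needs an ID.
		if meta.ID == "" {
			return "", fmt.Errorf("op_type delete requires a document ID")
		}
	default:
		return "", fmt.Errorf("unsupported op_type %q", opType)
	}

	line, err := json.Marshal(map[string]bulkMeta{opType: meta})
	if err != nil {
		return "", err
	}
	return string(line), nil
}

func main() {
	meta := bulkMeta{Index: "management", ID: "42"}
	for _, op := range []string{"", "index", "delete"} {
		line, err := actionLine(op, meta)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(line)
	}
}
```

In the bulk API a delete action has no source line after it, so a real implementation would also need to skip the document body for that case. With "index" and an explicit ID, a changed management document overwrites the previous version, which covers the re-index requirement.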