Has anyone improvised a solution for post-indexing searching? (Splunk-like field extraction)


Assuming a log in the format time, host, event: Splunk lets you extract fields from event AFTER it has been indexed. So if event is CSV or K=V style data, you can extract fields from it as long as you can specify what the format looks like. I'm dealing with data whose format I don't control. An idea like this is promising to me because I can set aside the details of my data until they are actually needed.
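To make the kind of extraction I mean concrete, here is a minimal Python sketch of pulling fields out of an event string at read time. The function names, the regex, and the sample events are all made up for illustration; this is not Splunk's or Elasticsearch's implementation, and it assumes K=V pairs are separated by whitespace.

```python
import csv
import re

# Hypothetical pattern: key=value pairs separated by whitespace,
# where the value may be wrapped in double quotes.
KV_PATTERN = re.compile(r'(\w+)=("[^"]*"|\S+)')


def extract_kv(event):
    """Extract K=V pairs from a raw event string into a dict."""
    return {key: value.strip('"') for key, value in KV_PATTERN.findall(event)}


def extract_csv(event, columns):
    """Extract CSV fields from a raw event string, given the column names."""
    row = next(csv.reader([event]))
    return dict(zip(columns, row))


kv_fields = extract_kv('user=alice action=login status="ok done"')
csv_fields = extract_csv("alice,login,ok", ["user", "action", "status"])
print(kv_fields)
print(csv_fields)
```

The point is that the stored document only needs time, host, and the raw event; the format description (regex or column list) is supplied when the data is read.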

From what I can tell, there isn't currently a good way to do this inside Elasticsearch: it's all index time or nothing.

Is anyone doing anything to approximate this behavior on large bodies of data? Does anyone have thoughts about how it might be done? What is the most flexible way to do this based on what is already available in Elasticsearch?

If I'm thinking of doing this, is that a sign that Elasticsearch isn't the right solution for my data?

What I've envisioned so far is to index the data as described above, search for the general terms I'm interested in, and then re-index the returned documents to break up the parts of event as needed. So if I'm searching for "key=value", I can first look for documents that contain value (even if they are not what I'm looking for), reindex them with the fields extracted, and then find "key=value". Is an approach like this viable?
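The two-phase idea above can be sketched roughly as follows. This is an in-memory stand-in, not real Elasticsearch client code: `coarse_filter` takes the place of a full-text query on the raw event field, `extract_fields` takes the place of the re-index step, and all names and sample documents are hypothetical.

```python
import re

# Hypothetical pattern: whitespace-separated key=value pairs.
KV_PATTERN = re.compile(r"(\w+)=(\S+)")


def coarse_filter(docs, term):
    """Phase 1: find every document whose raw event mentions the term.

    Stand-in for a full-text search; it over-matches on purpose.
    """
    return [doc for doc in docs if term in doc["event"]]


def extract_fields(doc):
    """Phase 2: return a copy of the document with event broken into fields.

    Stand-in for re-indexing the document with extracted fields.
    """
    enriched = dict(doc)
    enriched["fields"] = dict(KV_PATTERN.findall(doc["event"]))
    return enriched


def search_key_value(docs, key, value):
    """Narrow by value first, extract fields, then match key=value exactly."""
    candidates = coarse_filter(docs, value)
    enriched = [extract_fields(doc) for doc in candidates]
    return [doc for doc in enriched if doc["fields"].get(key) == value]


docs = [
    {"host": "web1", "event": "user=alice action=login"},
    {"host": "web2", "event": "owner=alice action=delete"},
    {"host": "web3", "event": "user=bob action=login"},
]

hits = search_key_value(docs, "user", "alice")
print([doc["host"] for doc in hits])
```

Here the coarse search on "alice" returns two candidates, and only the one where the key is actually user survives the exact match. The cost I'm worried about is phase 2: if the coarse term is common, a large share of the index ends up being re-processed.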
