How to store dictionaries with sparse fields

(Pee Wee2201) #1


Not sure my title makes it crystal clear, so here is a more detailed explanation :slight_smile:
I am also very new to Elasticsearch.

We receive events from our software as dictionaries of key/value pairs in the form:
date=xyz origin=xyz key1=value1 key2=value2 ... keyn=valuen

So all events are guaranteed to have date and origin fields, but the key(x) fields will differ from one event to the next.
On average, an event will have about 3 keys out of a dictionary of 10,000 possible keys.
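
To make the format concrete, here is a minimal Python sketch of how one such line could be turned into a dictionary. The name `parse_event` is made up for illustration, and the sketch assumes that values never contain spaces or `=` characters, which may not hold for real events.

```python
def parse_event(line):
    """Split a 'k1=v1 k2=v2 ...' line into a dict.

    Assumption: tokens are separated by whitespace and values contain no spaces.
    """
    event = {}
    for token in line.split():
        # partition splits on the first '=' only
        key, _, value = token.partition("=")
        event[key] = value
    return event


event = parse_event("date=xyz origin=xyz key1=value1 key2=value2")
```

Here `event` always contains `date` and `origin`, plus whichever few of the 10,000 possible keys happen to be present.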

I've read that documents in the same index in ES should have most of their fields in common, otherwise the Lucene index will be very sparse and inefficient. So, simply using dynamic mapping is not an option, right?

I thought about generating an eventId and storing multiple documents for one event, one per key:
eventId=1 date=xyz origin=xyz key=key1 value=value1
eventId=1 date=xyz origin=xyz key=key2 value=value2
eventId=1 date=xyz origin=xyz key=keyn value=valuen
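
As a rough sketch of that idea in Python, an event dictionary could be split into one small document per key before indexing. The function name `explode_event` is hypothetical, and the output field names (`eventId`, `key`, `value`) simply mirror the layout above. How the documents are then sent to Elasticsearch is left out.

```python
def explode_event(event_id, event):
    """Turn one event dict into one document per sparse key.

    Every document repeats the fields all events share (date, origin)
    and carries exactly one key/value pair.
    """
    common = {
        "eventId": event_id,
        "date": event["date"],
        "origin": event["origin"],
    }
    return [
        {**common, "key": key, "value": value}
        for key, value in event.items()
        if key not in ("date", "origin")
    ]


docs = explode_event(
    1,
    {"date": "xyz", "origin": "xyz", "key1": "value1", "key2": "value2"},
)
```

With this layout every document has the same five fields, so the mapping stays small no matter how many distinct keys exist. The cost is that one event becomes several documents that have to be regrouped by `eventId` at query time.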

I have also read about custom analyzers that could help.

Finally, I am a bit lost on how to do this with Elasticsearch (or whether I should do it at all).

Has anybody done the same kind of thing?

