Should I store strings directly or their numeric tokens in Elasticsearch?


(Fong Ng) #1

I can't decide how to save a growing volume of event information to Elasticsearch. Each filterable field has a limited number of options, but multiple options can be selected at once. Should I store the information directly like this:

{
"id":"1",
"name":"Event A",
"type":"Training,Workshop,Meeting",
"industrialSector":"Energy,Transport",
"country":"China"
// + 80 fields alike
}

Or do some backend work to turn the string values into numeric tokens before saving to Elasticsearch:

{
"id":"1",
"name":"Event A",
"type":"1 3 5",
"industrialSector":"2 3",
"country":"7"
// + 80 fields alike
}

There will be a map object to look up the field options before saving or after fetching:

let options = {
  type: {
    Training: 1,
    Fair: 2,
    Workshop: 3,
    Brokerage: 4,
    Meeting: 5
  },
  industrialSector: {
    Tech: 1,
    Energy: 2,
    Transport: 3
  }
}
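
For reference, the translation step in the second option could look roughly like the sketch below. It is only an illustration: the helper names `encodeField` and `decodeField` are hypothetical, and the comma-separated and space-separated formats are assumed from the two example documents above.

```javascript
// Lookup table mirroring the options map in the post (labels -> tokens).
const options = {
  type: { Training: 1, Fair: 2, Workshop: 3, Brokerage: 4, Meeting: 5 },
  industrialSector: { Tech: 1, Energy: 2, Transport: 3 },
};

// Build the reverse lookup (token -> label) once, per field.
const reverse = Object.fromEntries(
  Object.entries(options).map(([field, map]) => [
    field,
    Object.fromEntries(
      Object.entries(map).map(([label, token]) => [token, label])
    ),
  ])
);

// Hypothetical helper: comma-separated labels -> space-separated tokens.
function encodeField(field, value) {
  return value
    .split(",")
    .map((label) => {
      const token = options[field][label.trim()];
      if (token === undefined) {
        throw new Error(`Unknown ${field} option: ${label}`);
      }
      return token;
    })
    .join(" ");
}

// Hypothetical helper: space-separated tokens -> comma-separated labels.
function decodeField(field, value) {
  return value
    .split(" ")
    .map((token) => {
      const label = reverse[field][token];
      if (label === undefined) {
        throw new Error(`Unknown ${field} token: ${token}`);
      }
      return label;
    })
    .join(",");
}
```

Every write would go through `encodeField` and every read through `decodeField`, and the table has to stay in sync with whatever is already indexed.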

The first one requires less work, but is it slower, and does it require more disk space than the second one?


(Mark Walkom) #2

I'd think the second one would be slower because of the extra translation step?


(Fong Ng) #3

Yes, it requires some backend work to translate the numeric values back into strings for the API. So you think the second solution has no major benefits over the first one?


(Mark Walkom) #4

I don't think so.
I mean, you would save some disk space because your values are smaller, but I'd question the value of that given you now have to do that second translation. You trade one "cost" for another that is potentially bigger, because you now have to maintain that code and the translation table.
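
As a rough illustration of the first option, here is a minimal sketch of storing the strings directly and filtering on them. Several things in it are assumptions rather than anything stated in this thread: the multi-option fields are split into arrays before indexing, they are mapped as `keyword` (typeless mapping syntax, as in recent Elasticsearch versions), and the helper names `toIndexable` and `buildFilterQuery` are hypothetical. The code only builds plain request bodies and does not call a client.

```javascript
// Assumption: these are the fields that may hold several options.
const MULTI_VALUE_FIELDS = ["type", "industrialSector"];

// Hypothetical helper: split comma-separated option strings into arrays,
// so each option is indexed as its own exact value.
function toIndexable(event) {
  const doc = { ...event };
  for (const field of MULTI_VALUE_FIELDS) {
    if (typeof doc[field] === "string") {
      doc[field] = doc[field].split(",").map((s) => s.trim());
    }
  }
  return doc;
}

// Assumed index mapping: keyword fields are matched as exact,
// unanalyzed values. A keyword field can hold one value or an array.
const mapping = {
  mappings: {
    properties: {
      type: { type: "keyword" },
      industrialSector: { type: "keyword" },
      country: { type: "keyword" },
    },
  },
};

// Hypothetical helper: build a search body with one `terms` filter per
// field. A `terms` filter matches documents containing any listed value.
function buildFilterQuery(filters) {
  return {
    query: {
      bool: {
        filter: Object.entries(filters).map(([field, values]) => ({
          terms: { [field]: values },
        })),
      },
    },
  };
}

// Usage: prepare the example document and a filter on two fields.
const doc = toIndexable({
  id: "1",
  name: "Event A",
  type: "Training,Workshop,Meeting",
  industrialSector: "Energy,Transport",
  country: "China",
});
const body = buildFilterQuery({ type: ["Training"], country: ["China"] });
```

With this shape the stored values stay readable and no translation table is needed on either side of the API.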


(Fong Ng) #5

Thanks warkolm. Good suggestions. The first solution is the way to go.

