Searching and sorting dynamic fields


(Benno Si) #1

Hi,
I'm new to Elasticsearch.
At the moment I'm evaluating whether Elasticsearch fulfills our requirements for searching and sorting on quite dynamic (broad-schema) data. One requirement is that we need to index arbitrary keys and values. This means a user can define, for example, a field called "temperature" and set it to an int value, or a field called "description" and set it to a string value. The first approach I was thinking of was to save the data in the following form:
{
  "temperature/num": 29,
  "description/str": "something"
}

(The suffixes are there because in another document "description" could be an integer.)

The problem with this approach is that the mapping can grow very fast, which can become a real problem from what I have read.

Another approach would be the following:
{
  "attributes": [
    { "k": "temperature", "v/num": 29 },
    { "k": "description", "v/str": "something" }
  ]
}
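
To illustrate how documents of this shape could be produced, here is a minimal Python sketch. The function name `to_attributes` is hypothetical, and the "v/bool" slot is an assumption taken from the field list mentioned later in this thread. It is only a sketch of the idea, not a tested indexing pipeline.

```python
def to_attributes(fields):
    """Convert a flat dict of user-defined fields into the k/v attribute form.

    Hypothetical helper: the slot names follow the "v/num", "v/str", "v/bool"
    convention used in this thread.
    """
    attributes = []
    for key, value in fields.items():
        # bool is checked first because bool is a subclass of int in Python
        if isinstance(value, bool):
            slot = "v/bool"
        elif isinstance(value, (int, float)):
            slot = "v/num"
        elif isinstance(value, str):
            slot = "v/str"
        else:
            raise TypeError("unsupported value type for key %r" % key)
        attributes.append({"k": key, slot: value})
    return {"attributes": attributes}


doc = to_attributes({"temperature": 29, "description": "something"})
```

With this form the number of mapped fields stays fixed no matter how many different keys users define, because every key ends up as a value of "k" rather than as a field name.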

But a problem I see with this approach is that I can no longer sort by temperature, as far as I can see.

Do you have any advice on how I can index this kind of data without making the mapping explode, while still being able to sort on such fields?

Thanks for your help.
Benno


(Benno Si) #2

Ah cool, it seems like this should even work too, e.g. with something like this?

{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "attributes.v/num": {
        "order": "asc",
        "nested_path": "attributes",
        "nested_filter": {
          "term": {
            "attributes.k": "model"
          }
        }
      }
    }
  ]
}
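
The same request body could be built for any attribute key with a small helper. This is a sketch only: `sort_by_attribute` is a hypothetical name, and it uses the `nested_path` / `nested_filter` sort options exactly as written above. As far as I know, newer Elasticsearch versions express this with a `nested` object containing `path` and `filter` instead, so check this against the version you run.

```python
def sort_by_attribute(key, value_field="v/num", order="asc"):
    """Build a search body that sorts by the value of one nested attribute.

    Hypothetical helper; the field names ("attributes", "k", "v/num") are the
    ones used in this thread.
    """
    return {
        "query": {"match_all": {}},
        "sort": [
            {
                "attributes." + value_field: {
                    "order": order,
                    "nested_path": "attributes",
                    # only nested objects whose key matches contribute a sort value
                    "nested_filter": {"term": {"attributes.k": key}},
                }
            }
        ],
    }


body = sort_by_attribute("model")
```

The nested filter is what keeps values of other attributes from influencing the sort order.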


(Mark Walkom) #3

Be careful of this, you could run into a mapping explosion. You'd be better off normalising things if you can.


(Benno Si) #4

Thanks for your reply.
You are right about the first approach I mentioned, which could introduce a huge number of new mapping fields.

But my second approach should not have this problem, I guess, because there I save the data in an array of nested objects of the form:

{
  "attributes": [
    { "k": "temperature", "v/num": 29 },
    { "k": "description", "v/str": "something" }
  ]
}

So there shouldn't be too many mapping entries: only attributes and a nested element with the fields k, v/num, v/str, and v/bool.

Or do you see a problem here too?
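
As a rough sketch, the fixed mapping I have in mind could be written as the Python dict below. The concrete field types are my assumptions: `keyword` exists from Elasticsearch 5.0 on (older versions would use a not-analyzed `string`), and `double` for "v/num" is just one possible numeric type. The function name `attributes_mapping` is hypothetical, and the dict only covers the `properties` part, since the surrounding mapping structure differs between versions.

```python
def attributes_mapping():
    """Return the 'properties' fragment for the nested attributes field.

    Assumed field types; adjust to the Elasticsearch version in use.
    """
    return {
        "attributes": {
            "type": "nested",  # each k/v object is indexed as its own nested document
            "properties": {
                "k": {"type": "keyword"},       # exact-match attribute name
                "v/num": {"type": "double"},    # numeric values
                "v/str": {"type": "keyword"},   # string values, sortable as-is
                "v/bool": {"type": "boolean"},  # boolean values
            },
        }
    }


mapping = attributes_mapping()
```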

regards,
Benno

