Suppose I have objects such as house, person, toy, mouse, etc., and the following custom data types: int, double, text.
Different object properties hold different data types:
- price is a double
- length is a double
- email is a text
- number of floors is an int
What is the most appropriate way of saving this data into Elasticsearch and then retrieving statistics about the object properties? Details follow.
I came up with 2 possible design options:
Design option 1
// Each property is an entry in a list of type/value pairs.
$object1 = [
    "type" => "house",
    "properties" => [
        [
            "type" => "length",
            "value" => 10.12,
        ],
        [
            "type" => "floor_cnt",
            "value" => 5,
        ],
    ],
];
$object2 = [
    "type" => "person",
    "properties" => [
        [
            "type" => "length",
            "value" => 168.12,
        ],
        [
            "type" => "email",
            "value" => "foo@foo.bar",
        ],
    ],
];
Design option 2
// Each property is a key in a single map.
$object1 = [
    "type" => "house",
    "properties" => [
        "length" => 10.12,
        "floor_cnt" => 5,
    ],
];
$object2 = [
    "type" => "person",
    "properties" => [
        "length" => 168.12,
        "email" => "foo@foo.bar",
    ],
];
Index mappings
[
    'index' => 'my_index',
    'body' => [
        'mappings' => [
            'my_data' => [
                '_source' => [
                    'enabled' => true
                ],
                'properties' => [
                    "properties" => [ // name of column
                        "type" => "nested"
                    ]
                ]
            ]
        ]
    ]
]
Note: with design 2, the nested type is not needed.
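To make the difference concrete, here is a minimal sketch of what a mapping for design 2 might look like on ES 5.x. It is an assumption, not something taken from a working index: the `$params` variable name is hypothetical, the field types are guessed from the data types listed above, and `email` is mapped as `keyword` rather than `text` so that it can be counted with doc values.

```php
<?php
// Hypothetical ES 5.x mapping for design 2 (a sketch, not a tested setup).
// "properties" is a plain object field here, so no "nested" type is declared.
$params = [
    'index' => 'my_index',
    'body' => [
        'mappings' => [
            'my_data' => [
                '_source' => ['enabled' => true],
                'properties' => [
                    // Object type ("house", "person", ...), assumed to be an exact-match value.
                    'type' => ['type' => 'keyword'],
                    // The document field that happens to be named "properties".
                    'properties' => [
                        // Sub-fields of that object; types are assumptions.
                        'properties' => [
                            'length'    => ['type' => 'double'],
                            'floor_cnt' => ['type' => 'integer'],
                            'email'     => ['type' => 'keyword'],
                        ],
                    ],
                ],
            ],
        ],
    ],
];
```

The doubled `properties` key is intentional: the outer one is the document field, the inner one is the mapping keyword that lists its sub-fields.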
I cannot decide whether to use the first design, the second, or possibly some other option. Here is what I am trying to accomplish. The result I expect from the aggregation query is:
doc_count = 2
properties
    length
        count = 2
        min = 10.12
        max = 168.12
    floor_cnt
        count = 1
        min = 5
        max = 5
    email
        count = 1
That is, min and max values are calculated only for the int and double data types (or, alternatively, only for the properties length and floor_cnt). In addition, there is a total number of occurrences for each property.
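For illustration, here is a rough sketch of how such a request might look with design 2 if the property names were known in advance. It is untested and rests on assumptions: the `$params` name, the aggregation names and the field paths are hypothetical, and `email` is assumed to be mapped as `keyword`. It also needs one hard-coded aggregation per property, so it does not generalize to properties that are not known up front.

```php
<?php
// Hypothetical search request for design 2 (a sketch under stated assumptions).
$params = [
    'index' => 'my_index',
    'type'  => 'my_data',
    'body'  => [
        // No hits are needed, only the aggregation results.
        'size' => 0,
        'aggs' => [
            // Numeric properties: "stats" reports count, min, max, avg and sum.
            'length'    => ['stats' => ['field' => 'properties.length']],
            'floor_cnt' => ['stats' => ['field' => 'properties.floor_cnt']],
            // Non-numeric property: only the number of values is counted.
            'email'     => ['value_count' => ['field' => 'properties.email']],
        ],
    ],
];
```

The overall document count would come from the total hits of the response rather than from an aggregation.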
Currently I don't see an easy way of retrieving such data using Elasticsearch aggregations. Any help and advice is appreciated.
I am using ES 5.x.