Showing min and max results only for relevant type of data

pavlexander · March 31, 2017, 1:35pm

Suppose I have objects such as house, person, toy, mouse, etc.

And that there are the following custom data types:

int
double
text

Different objects' properties hold different data types:

price is a double
length is a double
email is a text
number of floors is an int

What is the most appropriate way of storing this data in Elasticsearch and then retrieving statistics about the object properties? More details follow.

I came up with 2 possible design options:

design option 1

$object1 = [
    "type" => "house",
    "properties" => [
        [
            "type" => "length",
            "value" => 10.12,
        ],[
            "type" => "floor_cnt",
            "value" => 5,
        ]
    ]
];

$object2 = [
    "type" => "person",
    "properties" => [
        [
            "type" => "length",
            "value" => 168.12,
        ],[
            "type" => "email",
            "value" => "foo@foo.bar",
        ]
    ]
];

design option 2

$object1 = [
    "type" => "house",
    "properties" => [
        "length" => 10.12,
        "floor_cnt" => 5
    ]
];

$object2 = [
    "type" => "person",
    "properties" => [
        "length" => 168.12,
        "email" => "foo@foo.bar"
    ]
];

Index mappings

$params = [
    'index' => 'my_index',
    'body' => [
        'mappings' => [
            'my_data' => [
                '_source' => [
                    'enabled' => true,
                ],
                'properties' => [
                    // the document field that is itself named "properties"
                    'properties' => [
                        'type' => 'nested',
                    ],
                ],
            ],
        ],
    ],
];

Note: with design option 2, the nested type is not needed.
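
For illustration, a non-nested mapping for design option 2 might look roughly like the sketch below. The concrete field types (`keyword`, `double`, `integer`) are assumptions derived from the data types listed earlier; they are not part of the original mapping and have not been tested against a cluster.

```php
<?php
// Sketch only: explicit mapping for design option 2 (ES 5.x), no nested type.
// All field types below are assumptions, chosen to match int/double/text.
$params = [
    'index' => 'my_index',
    'body' => [
        'mappings' => [
            'my_data' => [
                'properties' => [
                    // object kind, e.g. "house" or "person"
                    'type' => ['type' => 'keyword'],
                    // the document field named "properties", mapped as a plain object
                    'properties' => [
                        'properties' => [
                            'length'    => ['type' => 'double'],
                            'floor_cnt' => ['type' => 'integer'],
                            'email'     => ['type' => 'keyword'],
                        ],
                    ],
                ],
            ],
        ],
    ],
];
```

With this layout each property becomes its own field (`properties.length`, `properties.floor_cnt`, `properties.email`), so every new property name adds a field to the mapping.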

I cannot decide whether I should use the first design, the second, or possibly some other option. Here is what I am trying to accomplish. The result I expect from the aggregation query is:

doc_count = 2
properties
    length
        count = 2
        min = 10.12
        max = 168.12
    floor_cnt
        count = 1
        min = 5
        max = 5
    email
        count = 1

That is, min and max values are calculated only for the int and double data types (or, alternatively, only for the properties length and floor_cnt), and the total number of occurrences is reported for every property.
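
As a rough sketch, with design option 2 the result above could be approximated by one `stats` aggregation per numeric property and a `value_count` aggregation for each text property. This assumes the property names are known in advance and that `email` is indexed as a `keyword` field; with the default dynamic mapping in 5.x the field to use would be `properties.email.keyword` instead. The aggregation names are arbitrary.

```php
<?php
// Sketch only: search request body for design option 2.
// Assumes properties.email is a keyword field; aggregation names are arbitrary.
$params = [
    'index' => 'my_index',
    'type'  => 'my_data',
    'body'  => [
        'size' => 0, // only the aggregations are needed, not the hits
        'aggs' => [
            // stats returns count, min, max, avg and sum for a numeric field
            'length'    => ['stats' => ['field' => 'properties.length']],
            'floor_cnt' => ['stats' => ['field' => 'properties.floor_cnt']],
            // value_count only counts values, so it also suits text properties
            'email'     => ['value_count' => ['field' => 'properties.email']],
        ],
    ],
];
```

The overall document count is available as `hits.total` in the response. The drawback of this approach is that every property has to be listed explicitly in the request.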

Currently I don't see an easy way of retrieving such data using Elasticsearch aggregations. Any help and advice is appreciated.
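
For design option 1, one candidate is a `nested` aggregation that groups by property name and computes `stats` inside each group. The sketch below rests on two assumptions. First, it uses `properties.type.keyword`, the sub-field that the default dynamic mapping in 5.x creates for strings. Second, a single field cannot hold both numbers and text, so numeric values are assumed to live in a hypothetical `value_num` field that is not part of the original design.

```php
<?php
// Sketch only: search request body for design option 1 (nested "properties").
// "value_num" is a hypothetical numeric-only field; aggregation names are arbitrary.
$params = [
    'index' => 'my_index',
    'type'  => 'my_data',
    'body'  => [
        'size' => 0, // only the aggregations are needed, not the hits
        'aggs' => [
            'props' => [
                // step into the nested "properties" objects
                'nested' => ['path' => 'properties'],
                'aggs' => [
                    'by_name' => [
                        // one bucket per property name (length, floor_cnt, email, ...)
                        'terms' => ['field' => 'properties.type.keyword'],
                        'aggs' => [
                            'value_stats' => [
                                'stats' => ['field' => 'properties.value_num'],
                            ],
                        ],
                    ],
                ],
            ],
        ],
    ],
];
```

Each bucket's `doc_count` gives the number of occurrences of that property. For a property without numeric values, such as email, `value_stats` would report a count of 0 and null min and max. Note that `terms` returns only the top 10 buckets unless its `size` is raised.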

I am using ES 5.x.

