Hi, List
We are trying to use Elasticsearch for a data analytics use case where we
have around 7 million transaction records (growing by around 1 million
per month), and we are pulling aggregation results out of them.
In our case, we use facets heavily together with filters/queries. Our
documents are complex, with arrays and nested objects inside the arrays
that let us store certain 1-to-many relationships. We have done a few
performance tests and the results turned out to be very good. A document
looks like this:
{
  "id": 123,
  "transaction_time": "2011-01-01T01:01:01",
  "attributes": [
    { "attribute_id": 101, "attribute_value": "something1" },
    { "attribute_id": 102, "attribute_value": "something2" },
    { "attribute_id": 103, "attribute_value": "something3" },
    { "attribute_id": 104, "attribute_value": "something4" }
  ],
  "hierarchies": [
    { "role": "agent",   "name": "Bill" },
    { "role": "manager", "name": "Shelly" }
  ]
}
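(For what it's worth, our understanding is that faceting inside those
arrays requires them to be mapped with the nested type. A minimal mapping
sketch, with field types guessed from the sample document above:)

{
  "transaction": {
    "properties": {
      "id":               { "type": "long" },
      "transaction_time": { "type": "date" },
      "attributes": {
        "type": "nested",
        "properties": {
          "attribute_id":    { "type": "integer" },
          "attribute_value": { "type": "string", "index": "not_analyzed" }
        }
      },
      "hierarchies": {
        "type": "nested",
        "properties": {
          "role": { "type": "string", "index": "not_analyzed" },
          "name": { "type": "string", "index": "not_analyzed" }
        }
      }
    }
  }
}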
For documents like this, if we run a lot of facets on nested fields,
hierarchies[0].role for example, would that require more memory than
faceting on simple fields like transaction_time or id?
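To be concrete, the kind of request we have in mind looks roughly like
this (the facet name is made up, and the nested option is our reading of
the facet docs):

{
  "query": { "match_all": {} },
  "facets": {
    "roles": {
      "terms": { "field": "hierarchies.role" },
      "nested": "hierarchies"
    }
  }
}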
From the facet documentation, I am almost certain that it does, but are
there any recommendations, or could someone who has already done this
give us some advice?
For example, one discussion said that in order to retrieve the values,
ES has to open each doc and put the nested object values in memory. If
the data set has 10 million records, how much memory are we talking
about? 2 GB, 4 GB, 10 GB? 100 GB?
If it is around 10 GB, we can handle that, but if it grows to something
like 100 GB, it would be very hard for us to justify doing this kind of
calculation with ES. So we wanted people to share their experience so
that we don't run into trouble later.
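Our own back-of-envelope, assuming the field data cache loads one
ordinal per value per document plus each unique term once: a
low-cardinality nested field like hierarchies.role over 10 million docs
with ~2 entries each would be on the order of 20 million values x 4
bytes ≈ 80 MB, plus a handful of distinct role strings, whereas a
high-cardinality field like attribute_value would be dominated by the
total size of the unique strings themselves. If that model is roughly
right we are nowhere near 100 GB for fields like role, but we would
appreciate someone confirming or correcting it.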