Problems with parent-child relationship


(Anton Bogdanovich) #1

We have an ES v1.4.4 index with 1 replica. We store analytics data there and expect search results and aggregations to be consistent.

Unfortunately, we get inconsistent results from an aggregation query that uses has_child.
They become consistent (always either 0 or a non-zero result) with ?preference=1 or ?preference=2.
If I run separate queries for the contact and event types, the results show up consistently and correctly, so the problem seems to be limited to the parent-child relationship.

What is confusing is that the inconsistency randomly disappears or reappears after a cluster restart (all nodes restarted one by one).
We restart the cluster and the results become consistent (inserting new data works fine).
We restart again, and they are inconsistent again. While they are inconsistent, newly inserted data seems to lack the parent-child association and doesn't show up in results until the next restart.
This doesn't look reliable at all. We have this issue on 2 different staging clusters and on the production cluster (same elasticsearch.yml settings). Minimum master nodes is set to 3 out of 4.

Aggregation query that is used:
{
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "has_child": {
                "type": "event",
                "filter": {
                  "bool": {
                    "must": [
                      { "term": { "action": "click" } },
                      { "term": { "user_id": 10 } }
                    ]
                  }
                }
              }
            }
          ]
        }
      }
    }
  },
  "fields": [],
  "size": 0,
  "aggs": {
    "contact_ids": {
      "terms": {
        "field": "contact_id",
        "size": 1000
      }
    }
  }
}


(Diogo Pineda) #2

I have the same problem here. Any news on this?


(Diogo Pineda) #3

I found the problem. In my case I was forgetting the routing parameter. With a grandparent-parent-child hierarchy, you must pass the grandparent id as routing, not just the parent id.

See https://www.elastic.co/guide/en/elasticsearch/guide/current/grandparents.html
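
For readers who land here, a minimal sketch of what that fix could look like when building index requests. It only assembles the request path and query string and sends nothing. The helper name index_url, the index name "analytics", the ids, and the "account" grandparent are all hypothetical. Only the contact and event type names come from the first post, and whether that setup has a third level at all is not stated in the thread.

```python
from urllib.parse import urlencode, quote


def index_url(index, doc_type, doc_id, parent=None, routing=None):
    """Build the path and query string for an index request.

    Hypothetical helper for illustration only; it does not talk to a cluster.
    """
    path = "/" + "/".join(quote(str(part), safe="") for part in (index, doc_type, doc_id))
    params = {}
    if parent is not None:
        params["parent"] = parent      # links the document to its parent
    if routing is not None:
        params["routing"] = routing    # overrides the value used to pick the shard
    return path + ("?" + urlencode(params) if params else "")


# Middle level: the parent id doubles as the routing value, so the document
# is stored on the same shard as its parent (the assumed "account-3").
contact_url = index_url("analytics", "contact", "contact-7", parent="account-3")

# Bottom level: parent alone would route by "contact-7", which may map to a
# different shard. Passing the grandparent id as routing keeps all three
# levels on one shard.
event_url = index_url("analytics", "event", "e1", parent="contact-7", routing="account-3")

print(contact_url)
print(event_url)
```

The point of the sketch is that the routing value on the lowest level has to match whatever value was used to route its parent, which in a three-level hierarchy is the grandparent id.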

