Painless access to nested objects


(Jean Pierre) #1

Hi,

I'm facing the following situation :

My data has a structure similar to this

{
"site1" : {
"events": [ {"event1", "date1"}, {"event1","date1-2"},{"event2","date3"}}]
...
},
"site2" : {
"events": [ {"event2", "date4"}, {"event1","date5"},{"event3","date6}}]
...
},
...
}

What I want is :

  1. Count the number of sites where a specific event occurs during a given time range.
  2. Count for each site the numer of occurences during a specific time range.

I could get the first via a simple count query. For the second my idea was to create a scripted field. For each document, the painless script iterates through the events, checks if the event is the one we are looking for and if the date is the right range then increments a counter.

The issue I am facing is that painless does not give me any way to access the fields of the events nested object.

Thank you for your help!


(Zachary Tong) #2

I think you should be able to do this with a Nested Query + Nested Aggregation.

Something like:

{
   "query":{
      "nested":{
         "path":"events",
         "score_mode":"avg",
         "query":{
            "constant_score":{
               "filter":[
                  {
                     "range":{
                        "events.date":{
                           "gt":"<some time>"
                        }
                     }
                  }
               ]
            }
         }
      }
   },
   "aggs":{
      "sites":{
         "terms":{
            "field":"site"
         },
         "aggs":{
            "events":{
               "nested":{
                  "path":"events"
               },
               "aggs":{
                  "stats":{
                     "stats":{
                        "field":"events.event"
                     }
                  }
               }
            }
         }
      }
   }
}

The nested query filters the set of documents to only those that match your desired time range. Then you do a terms aggregation on "sites", which will give you a bucket-per-site. Then you use a nested aggregation to count up the details about the individual nested documents per-site. In this case I used a stats but you could do whatever you wanted, or omit it if you just want a count.

That should be considerably easier/faster than using a script in this case :slight_smile:


(system) #3

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.