I wonder if anyone can help. I have the following mapping:
curl -XPOST localhost:9200/test_index -d '{
  "mappings": {
    "test_type": {
      "dynamic": "strict",
      "properties": {
        "id": {
          "type": "string",
          "index": "not_analyzed"
        },
        "source_data": {
          "type": "nested",
          "properties": {
            "source": {"type": "string", "index": "not_analyzed"}
          }
        }
      }
    }
  }
}'
I then insert some sample documents:
curl -XPUT 'http://localhost:9200/test_index/test_type/1' -d '{
  "source_data": [{"source": "feed"}]
}'
curl -XPUT 'http://localhost:9200/test_index/test_type/2' -d '{
  "source_data": [{"source": "supermarket"}]
}'
curl -XPUT 'http://localhost:9200/test_index/test_type/3' -d '{
  "source_data": [{"source": "supermarket"}, {"source": "feed"}]
}'
So source_data is an array of objects, and each object currently has only one field: source.
(In reality there are many fields, but for the purposes of this post there is just one.)
If I filter for the articles whose source_data.source values include both supermarket and feed, it works
and correctly returns just the one article (article id '3'):
curl -XGET 'localhost:9200/test_index/_search' -d '
{
  "query": {
    "bool": {
      "filter": {
        "bool": {
          "must": [
            {
              "nested": {
                "path": "source_data",
                "query": {
                  "query_string": {
                    "query": "feed",
                    "default_field": "source_data.source"
                  }
                }
              }
            },
            {
              "nested": {
                "path": "source_data",
                "query": {
                  "query_string": {
                    "query": "supermarket",
                    "default_field": "source_data.source"
                  }
                }
              }
            }
          ]
        }
      }
    }
  }
}
'
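To make explicit why this filter returns only article 3, here is a minimal sketch in plain Python (not Elasticsearch itself). It assumes each nested clause is satisfied when any object in the array matches, and it treats query_string on a not_analyzed field as exact equality. The helper names nested_match and matches_both are made up for illustration.

```python
# Sample documents from this post, keyed by document id.
docs = {
    "1": [{"source": "feed"}],
    "2": [{"source": "supermarket"}],
    "3": [{"source": "supermarket"}, {"source": "feed"}],
}


def nested_match(source_data, value):
    """Model one nested clause: true if ANY object in the array has this source.

    Assumption: exact equality stands in for query_string on a not_analyzed field.
    """
    return any(obj["source"] == value for obj in source_data)


def matches_both(source_data):
    """Model bool/must over two nested clauses, checked per parent document."""
    return nested_match(source_data, "feed") and nested_match(source_data, "supermarket")


matching_ids = [doc_id for doc_id, source_data in docs.items() if matches_both(source_data)]
print(matching_ids)  # ['3']
```

The two clauses may be satisfied by different objects in the same array, which is why the AND works at the article level.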
However, I now want to run a filtered aggregation that counts the
source_data arrays whose 'source' values include 'feed' and/or 'supermarket':
curl -XGET 'localhost:9200/test_index/_search' -d '
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "source_data": {
      "aggs": {
        "filtered_agg": {
          "aggs": {
            "num_articles": {
              "terms": {
                "field": "source_data.source",
                "size": 0
              }
            }
          },
          "filters": {
            "filters": {
              "both_or": {
                "terms": {
                  "source_data.source": [
                    "feed",
                    "supermarket"
                  ]
                }
              },
              "both_and": {
                "query": {
                  "bool": {
                    "must": [
                      {
                        "query": {
                          "query_string": {
                            "query": "feed",
                            "default_field": "source_data.source"
                          }
                        }
                      },
                      {
                        "query": {
                          "query_string": {
                            "query": "supermarket",
                            "default_field": "source_data.source"
                          }
                        }
                      }
                    ]
                  }
                }
              }
            }
          }
        },
        "nested": {
          "path": "source_data"
        }
      }
    }
  },
  "size": 0
}
'
'feed OR supermarket' works fine.
However, I cannot get 'feed AND supermarket' to work, even though I'm using the same filter structure as in the search filter above.
Remember that what I'm trying to count is the number of instances where the ARRAY source_data is as follows: "source_data" = [{"source": "feed"}, {"source": "supermarket"}] (ordering does not matter).
Can anyone help?
I think (though I'm not completely sure) that the filtered agg is trying to find matches where a single source_data.source string contains both "feed" and "supermarket", and there aren't any.
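Here is a minimal sketch of that suspicion in plain Python (again, not Elasticsearch itself). It assumes that the filters under the nested aggregation are checked against each source_data object separately, and it treats the query_string clauses as exact equality on the not_analyzed field. The function names both_or and both_and just mirror the filter keys above.

```python
# Sample documents from this post, keyed by document id.
docs = {
    "1": [{"source": "feed"}],
    "2": [{"source": "supermarket"}],
    "3": [{"source": "supermarket"}, {"source": "feed"}],
}

# Assumption: inside the nested aggregation every object in every
# source_data array is evaluated on its own, detached from its siblings.
nested_objects = [obj for source_data in docs.values() for obj in source_data]


def both_or(obj):
    """Model the terms filter: the object's source is one of the listed values."""
    return obj["source"] in {"feed", "supermarket"}


def both_and(obj):
    """Model bool/must: the same single object must match both values."""
    # One object holds exactly one source value, so this can never be true.
    return obj["source"] == "feed" and obj["source"] == "supermarket"


or_count = sum(1 for obj in nested_objects if both_or(obj))
and_count = sum(1 for obj in nested_objects if both_and(obj))
print(or_count, and_count)  # 4 0
```

Under that assumption the 'and' bucket is always empty, because no individual object can carry two different source values, while the 'or' bucket counts objects (4) rather than articles (3).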