I have data that looks like this:
POST /test/testdoc
{
  "path": "/matt/matt-pics",
  "file": "dog.jpg",
  "mime_type": "image/jpeg",
  "META": [
    {
      "scheme_name": "Pseudo_EXIF_Scheme",
      "scheme_data": [
        {
          "value": "Nikon",
          "column_name": "CAMERA_MANUFACTURER"
        },
        {
          "value": "D80",
          "column_name": "CAMERA_MODEL"
        }
      ]
    }
  ]
}
along with similar documents that have a different scheme_name, column_name, etc.
I want to write terms (bucket) aggregations that show the number of top-level documents for (ideally) each combination of scheme name, column name and value.
More details at http://pastebin.com/WNRs1P6T
I've also tried flattening the structure a bit, i.e.
"META": [
  {
    "scheme_name": "Pseudo_MP3_Scheme",
    "column_name": "ARTIST",
    "value": "foo"
  }
] ...
I've just pulled an all-nighter trying to get this to work so apologies if this isn't clear enough. Would really appreciate any help! TIA.
UPDATE - finally figured out what was wrong!
I didn't want the contents of 'scheme_name' and 'column_name' to show up in search results, so in my mapping (the full one, not the abbreviated one in the pastebin) I had set "index": "no" for them. A field that isn't indexed gives the terms aggregation nothing to bucket on, so once I removed that setting everything started to work. I used this Stack Overflow question as a guide: http://stackoverflow.com/questions/24800545/how-to-aggregate-over-dynamic-fields-in-elasticsearch
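For anyone landing here later, this is a rough sketch of the kind of request body that approach leads to, not my exact query. It assumes the flattened layout above and that META is mapped as a "nested" type (the mapping isn't shown here, so that part is an assumption). The function name build_meta_aggregation and the aggregation names (meta, schemes, columns, values, top_level_docs) are made up for illustration. It is written as a Python dict so it can be dumped to JSON and sent with any client.

```python
import json


def build_meta_aggregation(size=10):
    """Build a search body that buckets by scheme, then column, then value.

    Assumes META is a nested field (an assumption, not confirmed by the post).
    The innermost reverse_nested aggregation steps back out of the nested
    objects so that its doc_count is the number of top-level documents.
    """
    return {
        "size": 0,  # only the aggregations are wanted, not the hits
        "aggs": {
            "meta": {
                "nested": {"path": "META"},
                "aggs": {
                    "schemes": {
                        "terms": {"field": "META.scheme_name", "size": size},
                        "aggs": {
                            "columns": {
                                "terms": {"field": "META.column_name", "size": size},
                                "aggs": {
                                    "values": {
                                        "terms": {"field": "META.value", "size": size},
                                        "aggs": {
                                            "top_level_docs": {"reverse_nested": {}}
                                        },
                                    }
                                },
                            }
                        },
                    }
                },
            }
        },
    }


if __name__ == "__main__":
    # Print the JSON that would be sent as the body of a search on /test/testdoc/_search
    print(json.dumps(build_meta_aggregation(), indent=2))
```

Each bucket under "values" then carries a "top_level_docs" count, which is the per-scheme, per-column, per-value document count I was after.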
Currently each word in the 'value' of a metadata item comes back as a separate bucket, which isn't what I want. That happens because the field is analyzed, and the fix is to also index a not-analyzed 'raw' version of the field at indexing time, as described here: http://stackoverflow.com/questions/24640117/elasticsearch-aggregation-returns-terms-in-key-but-not-the-complete-field-h
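A minimal sketch of what such a mapping fragment might look like, again as a Python dict. The helper name build_meta_mapping and its legacy flag are hypothetical. Mapping META as "nested" and the two name fields as not-analyzed are my assumptions, not something taken from the full mapping. The legacy branch uses the "string" / "not_analyzed" syntax of older Elasticsearch releases; newer releases use "text" with a "keyword" sub-field instead.

```python
import json


def build_meta_mapping(legacy=True):
    """Return a 'properties' fragment where META.value also has a raw sub-field.

    legacy=True  -> older syntax: "string" plus "index": "not_analyzed"
    legacy=False -> newer syntax: "text" plus a "keyword" sub-field
    """
    if legacy:
        analyzed = {"type": "string"}
        exact = {"type": "string", "index": "not_analyzed"}
    else:
        analyzed = {"type": "text"}
        exact = {"type": "keyword"}

    # The analyzed field stays available for full-text search, while the
    # "raw" sub-field keeps the whole value as a single term for aggregations.
    value_field = dict(analyzed)
    value_field["fields"] = {"raw": dict(exact)}

    return {
        "properties": {
            "META": {
                "type": "nested",  # assumption: META is a nested field
                "properties": {
                    "scheme_name": dict(exact),  # assumption: exact-match only
                    "column_name": dict(exact),  # assumption: exact-match only
                    "value": value_field,
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_meta_mapping(), indent=2))
```

With a mapping along these lines, the terms aggregation would point at "META.value.raw" rather than "META.value", so a value made of several words ends up in one bucket.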
Something else that was helpful along this journey (I originally used dynamic field names) was this article: https://www.elastic.co/blog/found-beginner-troubleshooting#keyvalue-woes
Thanks
Matt