Hi first time poster, probably won't be last =D.
I think Elasticsearch is cool, but I am stuck in trying to translate SQL to DSL and I was wondering if I could get some help.
Nutshell is the following (using random schema, but feature request is similar).
Assuming a table has the following columns:
table_i_am
- name
- age
- country
With the data
{ "name": "john_doe", "age": 25, "country": "moon" }
{ "name": "john_doe", "age": 25, "country": "sun" }
{ "name": "foo_bar", "age": 18, "country": "moon" }
{ "name": "foo_bar", "age": 28, "country": "sun" }
With the SQL ->
select temp.name, count(1) as counter from (select name, age from table_i_am where name in ( 'john_doe', 'foo_bar',...) group by name, age) as temp group by temp.name limit 10
so the derived table in from is to group by name, age for duplicate of name, age, country into name, age and then outer group by with select is to count on the name.
Now I tried out POST /_sql/translate for above query, but whereas the derived table was giving what I desired as a composite agg; the complete query wasn't. I figured it might be due to composite agg is not currently compatible with pipeline aggregations, so I tried to get the derived table DSL and fiddle on my own using scripting agg.
However I hit an issue bucket script aggregation only working on numeric and at this point I wasn't sure if I was doing something wrong or if it isn't supported so decided to reach out to the forum to get pointers.
"aggregations": {
"by_scripting_group_by": {
"terms": {
"script": "doc['name'].value + '_' + doc['age'].value"
}
}
}
"buckets" : [
{
"key" : "john_doe_25",
"doc_count" : 2
},
{
"key" : "foo_bar_18",
"doc_count" : 1
},
{
"key" : "foo_bar_28",
"doc_count" : 1
}
]
buckets_path must reference either a number value or a single value numeric metric aggregation, got: [StringTerms] at aggregation [by_scripting_group_by]
Thanks a bunch in advance!