I have an existing index, datalog.gl_sep20, and I want to transform it into a new index that is grouped by some fields and holds a sum aggregation on the "count" field. This is how I do it:
curl -X PUT "localhost:9200/_transform/transform_job?pretty" -H 'Content-Type: application/json' -d'
{
  "source": {
    "index": "datalog.gl_sep20"
  },
  "dest": {
    "index": "datalog.transformed_sep20"
  },
  "pivot": {
    "group_by": {
      "domain": { "terms": { "field": "domain" } },
      "os": { "terms": { "field": "os" } },
      "time": { "date_histogram": { "field": "time", "fixed_interval": "24h" } }
    },
    "aggregations": {
      "count": { "sum": { "field": "count" } }
    }
  }
}
'
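To make explicit what I expect this pivot to compute, here is a minimal Python sketch of the same grouping done in memory. The sample documents and the `pivot` helper are hypothetical and only illustrate the idea; this is not how Elasticsearch implements transforms. It assumes every document has all three group_by fields and that the 24h buckets line up with UTC days.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical sample documents; the real index holds many more.
docs = [
    {"domain": "a.com", "os": "linux", "time": "2020-09-01T03:00:00Z", "count": 5},
    {"domain": "a.com", "os": "linux", "time": "2020-09-01T21:00:00Z", "count": 7},
    {"domain": "b.com", "os": "mac", "time": "2020-09-02T10:00:00Z", "count": 2},
]


def pivot(documents):
    """Group by (domain, os, UTC day) and sum "count" within each group."""
    sums = defaultdict(int)
    for doc in documents:
        ts = datetime.strptime(doc["time"], "%Y-%m-%dT%H:%M:%SZ")
        day = ts.replace(tzinfo=timezone.utc).date().isoformat()
        sums[(doc["domain"], doc["os"], day)] += doc["count"]
    return dict(sums)


grouped = pivot(docs)
original_total = sum(doc["count"] for doc in docs)
transformed_total = sum(grouped.values())

# Under the assumptions above, both totals should match.
print(original_total, transformed_total)
```

In this sketch the two totals agree, which is the behaviour I expected from the transformed index as well.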
Then I start the transform job:
curl -X POST "localhost:9200/_transform/transform_job/_start?pretty"
The transform does not return any error and appears to run fine. However, when I do a sanity check on the total value of "count" in the original index and in the transformed index, the two are not equal.
E.g., the original index returns a total_count of 3731:
curl -X POST "localhost:9200/datalog.gl_sep20/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
  "aggs": {
    "total_count": { "sum": { "field": "count" } }
  }
}
'
while the transformed index returns a total_count of 2724:
curl -X POST "localhost:9200/datalog.transformed_sep20/_search?size=0&pretty" -H 'Content-Type: application/json' -d'
{
  "aggs": {
    "total_count": { "sum": { "field": "count" } }
  }
}
'
What might be causing this discrepancy in the total count? Shouldn't the transformed index have the same total as the original index, since it only does a sum aggregation on the "count" field?
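One check that might narrow this down is to sum "count" over only those source documents that lack a value for at least one of the group_by fields. This rests on an unconfirmed assumption: that the pivot skips documents missing a group_by field. The sketch below only builds the search body; `missing_key_query` and the `missing_key_count` aggregation name are hypothetical names chosen for illustration.

```python
import json


def missing_key_query(group_by_fields, sum_field="count"):
    """Build a search body that sums sum_field over documents
    missing at least one of the given group_by fields."""
    # One "field does not exist" clause per group_by field.
    should = [
        {"bool": {"must_not": {"exists": {"field": field}}}}
        for field in group_by_fields
    ]
    return {
        "size": 0,
        # A document matches if any one of the clauses matches.
        "query": {"bool": {"should": should, "minimum_should_match": 1}},
        "aggs": {"missing_key_count": {"sum": {"field": sum_field}}},
    }


body = missing_key_query(["domain", "os", "time"])
print(json.dumps(body, indent=2))
```

The printed JSON could be sent to localhost:9200/datalog.gl_sep20/_search in the same way as the searches above. If missing_key_count came out close to 3731 - 2724 = 1007, that would suggest the gap comes from documents with missing group_by values; if it is 0, the cause is probably elsewhere, for example a transform that had not finished when I ran the check.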