Hello everyone,
Please take a look at these dummy documents:
[
  {
    "project": "a",
    "run": 1,
    "tool_name": "tool_1",
    "tool_output": "abc.com"
  },
  {
    "project": "a",
    "run": 1,
    "tool_name": "tool_1",
    "tool_output": "xyz.com"
  },
  {
    "project": "a",
    "run": 2,
    "tool_name": "tool_1",
    "tool_output": "abc.com"
  },
  {
    "project": "a",
    "run": 2,
    "tool_name": "tool_1",
    "tool_output": "xyz.com"
  },
  {
    "project": "a",
    "run": 2,
    "tool_name": "tool_1",
    "tool_output": "new.com"
  }
]
I need to find the differences in tool output between these two runs, specifically the new tool outputs that appear in run 2 but not in run 1. I want to get a result like this:
{
  "new_tool_outputs_count": 1,
  "new_tool_outputs": ["new.com"]
}
In SQL terms, it is something like an outer join that keeps only the run 2 rows with no match in run 1.
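To make the expected semantics concrete, here is a minimal Python sketch of the same comparison done client-side on the dummy documents above. The function name `new_tool_outputs` and its parameters are made up for illustration. It only pins down what I mean by "new outputs" and is not something I expect to scale to millions of documents.

```python
# Dummy documents from above, one dict per Elasticsearch document.
docs = [
    {"project": "a", "run": 1, "tool_name": "tool_1", "tool_output": "abc.com"},
    {"project": "a", "run": 1, "tool_name": "tool_1", "tool_output": "xyz.com"},
    {"project": "a", "run": 2, "tool_name": "tool_1", "tool_output": "abc.com"},
    {"project": "a", "run": 2, "tool_name": "tool_1", "tool_output": "xyz.com"},
    {"project": "a", "run": 2, "tool_name": "tool_1", "tool_output": "new.com"},
]


def new_tool_outputs(documents, old_run, new_run):
    """Return the outputs that appear in new_run but not in old_run.

    Hypothetical helper: a plain set difference over the tool_output values.
    """
    old = {d["tool_output"] for d in documents if d["run"] == old_run}
    new = {d["tool_output"] for d in documents if d["run"] == new_run}
    added = sorted(new - old)  # sorted only to make the result deterministic
    return {"new_tool_outputs_count": len(added), "new_tool_outputs": added}


result = new_tool_outputs(docs, old_run=1, new_run=2)
print(result)
```

What I am looking for is a way to get the same result computed inside Elasticsearch, without pulling every document to the client.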
I probably need to use some kind of aggregation query, but I am not sure whether something like this is possible with Elasticsearch. Also, the number of documents to compare can reach the millions.