Count duplicated field value by doc

Hi all,
I have already aggregated and filtered my messages into Elasticsearch, and I now have a problem displaying the results on a Kibana bar chart.
This is the final filtered message:
{
    "features" => [
        [0] {
            "action" => "[Connexion]",
            "status" => "Passed"
        },
        [1] {
            "action" => "[Creation_Circuit]",
            "status" => "Passed"
        },
        [2] {
            "action" => "[Connexion]",
            "status" => "Failed"
        }
    ],
    "@version" => "1",
    "scenario" => "001_SeL_Scenario_Realisation_Circuit_Nominal",
    "@timestamp" => 2019-08-02T11:25:38.730Z
}
{
    "features" => [
        [0] {
            "action" => "[Connexion]",
            "status" => "Failed"
        }
    ],
    "@version" => "1",
    "scenario" => "002_SeL_Scenario_Realisation_Circuit_Depuis_Modele",
    "@timestamp" => 2019-08-02T11:25:44.769Z
}

The task_id is the scenario field, and I want to count the number of statuses (Passed/Failed) per scenario.
Here is what I did in Kibana, but the result is not correct: I get 1 Passed status for the first scenario (001_SeL_Scenario_Realisation_Circuit_Nominal), but I should get 2.

Can someone help me, please?
Thank you in advance.

Hi @chaymaa,

The way your data is formatted is making it a bit hard to read. Is this how one of your individual documents looks when you retrieve it from ES?

{
  "features": [
    { "action": "[Connexion]", "status": "Passed" },
    { "action": "[Connexion]", "status": "Passed" }
  ],
  "@version": "1",
  "scenario": "001_SeL_Scenario_Realisation_Circuit_Nominal",
  "@timestamp": "2019-08-02T11:25:38.730Z"
}

If so, it's going to be difficult to visualize it in Kibana with the existing structure because we don't currently have support for querying nested fields, though it's one of our most requested features.

My recommendation would be to structure your data so that each of the features is a separate document in Elasticsearch. This will make it much easier to make a visualization like you've described. For example, you could have a features index:

POST features/_doc
{
  "action": "[Connexion]",
  "status": "Passed",
  "@version": "1",
  "scenario": "001_SeL_Scenario_Realisation_Circuit_Nominal",
  "@timestamp": "2019-08-02T11:25:38.730Z"
}
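
One way to produce documents in that shape is to split each scenario document before indexing it. The following is only a minimal Python sketch of that idea, not an official client or pipeline: the helper name `split_features` is hypothetical, and it assumes every entry in `features` is a flat object like the ones in the original post.

```python
from typing import Any, Dict, List


def split_features(doc: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one flat document per entry in doc["features"].

    Hypothetical helper: assumes each feature is a flat dict
    (for example with "action" and "status" keys).
    """
    # Fields shared by every feature (scenario, @timestamp, @version, ...).
    shared = {key: value for key, value in doc.items() if key != "features"}
    # Merge each feature's own keys on top of the shared fields.
    return [{**shared, **feature} for feature in doc.get("features", [])]


# Sample data copied from the first scenario in the original post.
scenario_doc = {
    "features": [
        {"action": "[Connexion]", "status": "Passed"},
        {"action": "[Creation_Circuit]", "status": "Passed"},
        {"action": "[Connexion]", "status": "Failed"},
    ],
    "@version": "1",
    "scenario": "001_SeL_Scenario_Realisation_Circuit_Nominal",
    "@timestamp": "2019-08-02T11:25:38.730Z",
}

flat_docs = split_features(scenario_doc)
for flat in flat_docs:
    print(flat)
```

Each resulting dictionary could then be sent as its own document, for example with a `POST features/_doc` request like the one above.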

If each feature were split into a separate document, it would be easy to do a terms aggregation on the scenario field and then get accurate counts for each status.
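
To make the expected counts concrete, here is a rough Python sketch. `AGG_BODY` is a search request body with a terms aggregation on scenario and a nested terms aggregation on status; the `.keyword` sub-fields are an assumption about the index mapping and may need adjusting. The function `count_status_by_scenario` is a hypothetical local stand-in that computes the same counts on in-memory documents, using the sample data from the original post.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Search body for a terms aggregation per scenario, with status counts
# inside each scenario bucket. The ".keyword" field names are an
# assumption about the mapping, not something confirmed in this thread.
AGG_BODY = {
    "size": 0,
    "aggs": {
        "by_scenario": {
            "terms": {"field": "scenario.keyword"},
            "aggs": {"by_status": {"terms": {"field": "status.keyword"}}},
        }
    },
}


def count_status_by_scenario(docs: Iterable[Dict[str, str]]) -> Counter:
    """Count flat feature documents per (scenario, status) pair."""
    pairs: Iterable[Tuple[str, str]] = (
        (doc["scenario"], doc["status"]) for doc in docs
    )
    return Counter(pairs)


S1 = "001_SeL_Scenario_Realisation_Circuit_Nominal"
S2 = "002_SeL_Scenario_Realisation_Circuit_Depuis_Modele"

# One flat document per feature, as recommended above.
docs = [
    {"scenario": S1, "action": "[Connexion]", "status": "Passed"},
    {"scenario": S1, "action": "[Creation_Circuit]", "status": "Passed"},
    {"scenario": S1, "action": "[Connexion]", "status": "Failed"},
    {"scenario": S2, "action": "[Connexion]", "status": "Failed"},
]

counts = count_status_by_scenario(docs)
for (scenario, status), total in sorted(counts.items()):
    print(scenario, status, total)
```

With one document per feature, the first scenario yields 2 Passed and 1 Failed, which is the result the original question was looking for.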

Hi @lukeelmers,

Thanks for your feedback,

I spent two days looking for a way to query nested fields :)

This feature seems important to me too, and it would be nice if the ELK team plans to implement it in a future version.

Best regards.