Elasticsearch Saved Search with Group By

index_name: my_data-2020-12-01
ticket_number: T123 
ticket_status: OPEN 
ticket_updated_time: 2020-12-01 12:22:12   

index_name: my_data-2020-12-01 
ticket_number: T124 
ticket_status: OPEN 
ticket_updated_time: 2020-12-01 12:32:11   

index_name: my_data-2020-12-02 
ticket_number: T123 
ticket_status: INPROGRESS 
ticket_updated_time: 2020-12-02 12:33:12   

index_name: my_data-2020-12-02 
ticket_number: T125 
ticket_status: OPEN 
ticket_updated_time: 2020-12-02 14:11:45

I want to create a saved search that groups by the ticket_number field and returns one unique document per ticket with the latest ticket status (ticket_status). Is it possible?

You can use a top_hits aggregation:

GET my_data-*/_search
{
  "size": 0,
  "aggs": {
    "ticket_number": {
      "terms": {
        "field": "ticket_number.keyword",
        "size": 10
      },
      "aggs": {
        "latest": {
          "top_hits": {
            "sort": [
              {
                "ticket_updated_time": {
                  "order": "desc"
                }
              }
            ],
            "_source": {
              "includes": [ "ticket_updated_time", "ticket_number", "ticket_status" ]
            },
            "size": 1
          }
        }
      }
    }
  }
}
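
With the sample documents above, the response buckets come back roughly like this (trimmed to the relevant parts; the exact metadata fields vary by Elasticsearch version), one bucket per ticket_number with its single latest hit:

{
  "aggregations": {
    "ticket_number": {
      "buckets": [
        {
          "key": "T123",
          "doc_count": 2,
          "latest": {
            "hits": {
              "hits": [
                {
                  "_source": {
                    "ticket_updated_time": "2020-12-02 12:33:12",
                    "ticket_number": "T123",
                    "ticket_status": "INPROGRESS"
                  }
                }
              ]
            }
          }
        },
        ...
      ]
    }
  }
}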

How do I apply this in a saved search?

This is an aggregation.
If you want to apply the same logic in a search, you can use collapse; with a descending sort on ticket_updated_time, each ticket_number appears only once, with its latest document:

GET my_data-*/_search
{
  "query": {
    "match_all": {}
  },
  "collapse": {
    "field": "ticket_number.keyword"
  },
  "sort": [
    {
      "ticket_updated_time": {
        "order": "desc"
      }
    }
  ]
}

How can I create visualizations using this?

Check my first answer.
To create such a visualisation, use the Top Hits aggregation as the metric.

Yes, I tried. I tried to create a pie chart, but it seems Top Hits allows only numeric fields.

Any workaround?

That will not be possible with a pie chart :slight_smile:
I suggest having a separate index where you maintain the latest status and update date.
How is your data ingested into Elasticsearch?

Through a Kafka topic

That means you are using a Logstash pipeline to sync data into Elasticsearch?
If yes, then you can send the data into 2 Elasticsearch indexes:

  • One as you are doing, to keep track of each change on your tickets
  • A secondary index with document_id => "%{ticket_number}", so you overwrite the existing document with the latest update (see the sketch after this list)
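
As a sketch of the overwrite behaviour (the index name tickets_latest is only an assumption for the example), indexing with the ticket number as the document _id means each update replaces the previous version of that ticket:

PUT tickets_latest/_doc/T123
{
  "ticket_number": "T123",
  "ticket_status": "OPEN",
  "ticket_updated_time": "2020-12-01 12:22:12"
}

PUT tickets_latest/_doc/T123
{
  "ticket_number": "T123",
  "ticket_status": "INPROGRESS",
  "ticket_updated_time": "2020-12-02 12:33:12"
}

After both calls, tickets_latest holds a single document for T123, carrying the INPROGRESS status.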

Yes, that's exactly what we are doing, but the issue is: since we are creating indices daily, the document is not getting updated in the second index after its creation date.

Either it should update the existing document or create a new document in the new date's index, but neither is happening.

Oops, I see the point :slight_smile:
I would suggest using a transform, but it does not yet support the top_hits aggregation.

How big is your dataset, that you need a daily index for tickets?

The size of a daily index is between 1 GB and 1.5 GB.

Unfortunately, we don't have much control over the index creation policy (I mean the daily index creation) as we are working in a client environment.

Also, I tried to create indices based on the ticket created date, but we don't get the created time in every payload.

Thank you very much for your support, highly appreciated!

Hi @Kapila
Here is an example of how to use a scripted metric aggregation to get the top hits inside a transform; this will definitely help you with your use case:

PUT _transform/tickets_transform
{
  "source": {
    "index": [
      "my_data-*"
    ]
  },
  "pivot": {
    "group_by": {
      "ticket_number": {
        "terms": {
          "field": "ticket_number.keyword"
        }
      }
    },
    "aggregations": {
      "latest_doc": {
        "scripted_metric": {
          "init_script": "state.timestamp_latest = 0L; state.last_doc = ''",
          "map_script": """
            def current_date = doc['ticket_updated_time'].getValue().toInstant().toEpochMilli();
            if (current_date > state.timestamp_latest) {
              state.timestamp_latest = current_date;
              state.last_doc = new HashMap(params['_source']);
            }
          """,
          "combine_script": "return state",
          "reduce_script": """
            def last_doc = '';
            def timestamp_latest = 0L;
            for (s in states) {
              if (s.timestamp_latest > timestamp_latest) {
                timestamp_latest = s.timestamp_latest;
                last_doc = s.last_doc;
              }
            }
            return last_doc
          """
        }
      }
    }
  },
  "description": "Get the latest update per ticket_number",
  "dest": {
    "index": "tickets_transform"
  },
  "frequency": "5m",
  "sync": {
    "time": {
      "field": "ticket_updated_time",
      "delay": "60s"
    }
  }
}
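
To try it out (a minimal sketch using the names from the example above), start the transform and then query the destination index, which holds one document per ticket with its latest status:

POST _transform/tickets_transform/_start

GET tickets_transform/_search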
