When creating an index in Elasticsearch, is there a way to map the _id field to an incoming data field?

My team is using ELK Stack 7.3.2

I have tried many ways to show what we need in Kibana.

Let's say I have some JSON coming in from a Kafka topic, e.g.
{"processName":"SM1","processType":"serviceManager","status":"UP","update":"2h14m","hostname":"devvm20"}

Then I see a message that says:
{"processName":"SM1","processType":"serviceManager","status":"DOWN","update":"2h14m","hostname":"devvm20"}

Here is what I have tried to create the index in Elasticsearch:
curl -X PUT "nyuatalcasvm01.sscnydirect.local:9200/boris-index?pretty" -H 'Content-Type: application/json' -d'
{
  "mappings": {
    "properties": {
      "processName": { "type": "keyword", "index": false },
      "hostname": { "type": "keyword" },
      "processType": { "type": "text", "fielddata": true },
      "status": { "type": "text" },
      "update": { "type": "text" }
    }
  }
}
'
Is there a way for me to set _id from the processName field?

All I want is to get the latest record for each process from the messages sent via Kafka, not every record that comes in.

Basically the same thing as, let's say:
SELECT * FROM my_table GROUP BY (processName)

After that I would like to see if the process is UP or DOWN.

You should be able to do that using an ingest node pipeline.

Yes, this can be done with an ingest pipeline. The set processor below copies the value of processName into the document's _id:

PUT _ingest/pipeline/my_pipeline
{
    "processors": [
      {
        "set" : {
          "field" : "_id",
          "value" : "{{processName}}"
        }
      }
    ]
}
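
To make the effect of that processor concrete, here is a rough Python sketch of what it does to each incoming document. This is only a simulation under my own assumptions, not how Elasticsearch implements it: render_template and apply_set_processor are made-up helper names, the template handling only covers simple top-level {{field}} placeholders, and in a real index _id is document metadata rather than a key inside the source.

```python
import re


def render_template(template, doc):
    """Replace each {{field}} placeholder with the matching top-level value from doc."""
    return re.sub(
        r"\{\{\s*(\w+)\s*\}\}",
        lambda match: str(doc.get(match.group(1), "")),
        template,
    )


def apply_set_processor(doc, field, value_template):
    """Return a copy of doc with `field` set to the rendered template (hypothetical helper)."""
    result = dict(doc)
    result[field] = render_template(value_template, doc)
    return result


message = {
    "processName": "SM1",
    "processType": "serviceManager",
    "status": "DOWN",
    "update": "2h14m",
    "hostname": "devvm20",
}

# Same field and value as in the pipeline definition above.
indexed = apply_set_processor(message, "_id", "{{processName}}")
print(indexed["_id"])
```

Every message for SM1 ends up with the same _id, whatever its status.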


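Then index each document through that pipeline. Because every message for a given processName resolves to the same _id, a later message for SM1 replaces the earlier one, so the index keeps only the latest status per process:

POST foo/_doc?pipeline=my_pipeline&refresh=true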
{
  "processName": "SM1",
  "processType": "serviceManager",
  "status": "DOWN",
  "update": "2h14m",
  "hostname": "devvm20"
}
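
If it helps to see why this gives the "latest record per processName" behaviour asked about, here is a small Python sketch that models the index as a plain dict keyed by _id. It is an illustration only: index_document is a hypothetical function, the dict stands in for the index, and the version counter just mirrors the way _version goes up when a document with an existing _id is indexed again.

```python
def index_document(index, doc):
    """Store doc under its processName, replacing any earlier document with that id."""
    doc_id = doc["processName"]  # plays the role of the pipeline-assigned _id
    previous = index.get(doc_id)
    version = previous["_version"] + 1 if previous else 1
    index[doc_id] = {"_version": version, "_source": doc}
    return index[doc_id]


index = {}  # stand-in for the Elasticsearch index

up_message = {
    "processName": "SM1",
    "processType": "serviceManager",
    "status": "UP",
    "update": "2h14m",
    "hostname": "devvm20",
}
down_message = dict(up_message, status="DOWN")

index_document(index, up_message)
index_document(index, down_message)

# Only one entry per process remains, holding the most recent status.
for doc_id, entry in index.items():
    print(doc_id, entry["_source"]["status"], entry["_version"])
```

After both messages are indexed there is a single entry for SM1 with status DOWN, which is what a Kibana visualization on that index would then see.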
