What's the best way to drop a field while remote reindexing?

I need to pull data from one ES cluster into another, but I want to drop some fields along the way, for example the user or the IP address.

  • Can I do this with reindex?
  • With a combination of reindex and an ingest node?
  • With a Painless script?

I'm running Elastic Stack v5.3.2. This is my current reindex request:

curl -XPOST 'host_1:9200/_reindex?wait_for_completion=true' -d'{
  "source": {
    "remote": {
      "host": "http://host_2:9200"
    },
    "index": "logstash-2017.05.31",
    "query": {
      "match": {
        "type": "web-service"
      }
    }
  },
  "dest": {
    "index": "logstash-web-service-2017.05.31"
  }
}'

Thanks,
Rich


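You can do this by combining reindex with an ingest pipeline: define a pipeline with a remove processor and reference it in the "dest" section of the reindex request.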


Here's my solution.

Ingest Pipeline:
I created an ingest pipeline with ignore_failure set to true, since the field is not present in every log line and the remove processor would otherwise fail on documents that lack it.

PUT _ingest/pipeline/remove-logmsg
{
  "description" : "Remove logmsg pipeline",
  "processors" : [
    {
      "remove" : {
        "field": "logmsg",
        "ignore_failure" : true
      }
    }
  ]
}
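
Before running the reindex, it can be worth checking how the pipeline treats documents with and without the field. Here is a minimal sketch, assuming the ingest API's _simulate endpoint; the two sample documents and the build_simulate_body helper are hypothetical and only illustrative. The Python below just builds and prints the request body, it does not contact a cluster:

```python
import json


def build_simulate_body(sample_docs):
    """Wrap plain dicts in the {"docs": [{"_source": ...}]} shape used by _simulate."""
    return {"docs": [{"_source": doc} for doc in sample_docs]}


# Hypothetical samples: one document that has logmsg, one that does not.
samples = [
    {"type": "web-service", "logmsg": "GET /health 200"},
    {"type": "web-service"},
]

body = build_simulate_body(samples)

# Request line to use in the console, followed by the JSON body.
print("POST _ingest/pipeline/remove-logmsg/_simulate")
print(json.dumps(body, indent=2))
```

If the pipeline is set up as above, both simulated documents should come back without logmsg, and the second one should not report a failure.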

Remote Reindex with ingest pipeline:

curl -XPOST 'host_1:9200/_reindex?wait_for_completion=true' -d'{
  "source": {
    "remote": {
      "host": "http://host_2:9200"
    },
    "index": "logstash-2017.05.31",
    "query": {
      "match": {
        "type": "web-service"
      }
    }
  },
  "dest": {
    "index": "logstash-web-service-2017.05.31",
    "pipeline": "remove-logmsg"
  }
}'
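
The same approach should extend to dropping several fields, such as the user and IP address from the original question. Below is a rough sketch that generates both request bodies with one remove processor per field. The field names "user" and "ip", the pipeline name "remove-user-ip", and both helper functions are assumptions for illustration; adjust them to the real mapping. The script only prints JSON and does not send anything:

```python
import json


def build_remove_pipeline(fields, description="Remove unwanted fields"):
    """Build a pipeline body with one remove processor per field.

    ignore_failure lets documents that lack a field pass through unchanged.
    """
    return {
        "description": description,
        "processors": [
            {"remove": {"field": name, "ignore_failure": True}}
            for name in fields
        ],
    }


def build_remote_reindex(remote_host, source_index, dest_index, pipeline, query=None):
    """Build a _reindex body that reads from a remote cluster and applies a pipeline."""
    source = {"remote": {"host": remote_host}, "index": source_index}
    if query is not None:
        source["query"] = query
    return {"source": source, "dest": {"index": dest_index, "pipeline": pipeline}}


# Hypothetical field names; body for PUT _ingest/pipeline/remove-user-ip
pipeline_body = build_remove_pipeline(["user", "ip"])

# Body for POST _reindex on the destination cluster
reindex_body = build_remote_reindex(
    "http://host_2:9200",
    "logstash-2017.05.31",
    "logstash-web-service-2017.05.31",
    pipeline="remove-user-ip",
    query={"match": {"type": "web-service"}},
)

print(json.dumps(pipeline_body, indent=2))
print(json.dumps(reindex_body, indent=2))
```

Using one processor per field means a missing "user" does not stop "ip" from being removed, because each processor's failure is ignored on its own.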
