I have nested fields in an Elasticsearch document. I want to delete the nested objects whose 'created_at' timestamp is older than six months (created_at < Time.now - 6 months). The 'created_at' field is inside the nested objects.
This is what I am doing right now: fetch the ES docs that match the timestamp criterion (using "scroll", in batches of 100), delete the matching nested objects from each doc, and POST the updated document back into ES under the same id (bulk update).
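To make the current approach concrete, here is a minimal Python sketch of the matching and pruning steps. It assumes 'created_at' holds epoch seconds and approximates six months as 182 days; the function names are hypothetical and no ES client is involved, only the request body and the in-memory pruning.

```python
import time

# Assumption: "6 months" is approximated as 182 days, in seconds.
SIX_MONTHS_SECONDS = 182 * 24 * 60 * 60
NESTED_FIELDS = ("buy_leads", "pg_leads")


def cutoff_epoch(now=None):
    """Epoch-seconds cutoff; nested objects older than this are stale."""
    now = time.time() if now is None else now
    return int(now) - SIX_MONTHS_SECONDS


def stale_docs_query(cutoff, batch_size=100):
    """Search body matching docs with at least one stale nested object."""
    return {
        "size": batch_size,
        "query": {
            "bool": {
                "should": [
                    {
                        "nested": {
                            "path": path,
                            "query": {
                                "range": {f"{path}.created_at": {"lt": cutoff}}
                            },
                        }
                    }
                    for path in NESTED_FIELDS
                ],
                "minimum_should_match": 1,
            }
        },
    }


def prune(source, cutoff):
    """Copy of a hit's _source with stale nested objects removed.

    Assumes each nested field, when present, is a list of objects.
    """
    pruned = dict(source)
    for path in NESTED_FIELDS:
        if path in source:
            pruned[path] = [
                obj for obj in source[path] if obj["created_at"] >= cutoff
            ]
    return pruned
```

The pruned document is what currently gets sent back in the bulk request under the same id.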
I am using "scroll" because a regular search query is limited to 10,000 docs.
Can we do better than this?
For example, can we fetch only the nested fields in the first place (not the whole doc) and then perform partial updates?
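What I have in mind is roughly the following sketch: limit the search response to the nested fields with source filtering, then send a bulk "update" action that carries only those fields in a partial "doc". The helper names are hypothetical, and I am assuming the index (and, on older versions, the mapping type) is given in the bulk URL rather than in each action line.

```python
import json

NESTED_FIELDS = ["buy_leads", "pg_leads"]


def nested_only_search(cutoff, batch_size=100):
    """Search body that returns only the nested fields of matching docs."""
    return {
        "size": batch_size,
        "_source": NESTED_FIELDS,  # source filtering: other fields are not returned
        "query": {
            "bool": {
                "should": [
                    {
                        "nested": {
                            "path": path,
                            "query": {
                                "range": {f"{path}.created_at": {"lt": cutoff}}
                            },
                        }
                    }
                    for path in NESTED_FIELDS
                ],
                "minimum_should_match": 1,
            }
        },
    }


def partial_update_lines(doc_id, kept):
    """Two NDJSON lines for one bulk 'update' action.

    `kept` maps each nested field to the objects that should remain.
    A partial doc replaces arrays wholesale, so the stale objects are dropped.
    """
    action = {"update": {"_id": doc_id}}
    body = {"doc": kept}
    return [json.dumps(action), json.dumps(body)]
```

The open question is whether this actually saves anything compared with re-indexing the whole document.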
One more question: for docs in which no nested object is left (because all of them match the timestamp criterion), we need to delete the whole document. Can we do a bulk delete in that case?
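As a sketch of that last case, this is how I would expect to build the bulk request body for the documents that end up empty. The function names are hypothetical; the input is assumed to map each document id to the nested objects that survived pruning, and the index is assumed to be given in the bulk URL.

```python
import json

NESTED_FIELDS = ("buy_leads", "pg_leads")


def is_empty(kept):
    """True when no nested object is left in any nested field."""
    return not any(kept.get(field) for field in NESTED_FIELDS)


def bulk_delete_body(kept_by_id):
    """NDJSON body with one 'delete' action per document that is now empty.

    A delete action has no source line after it, and the bulk body
    must end with a newline.
    """
    lines = [
        json.dumps({"delete": {"_id": doc_id}})
        for doc_id, kept in kept_by_id.items()
        if is_empty(kept)
    ]
    return "".join(line + "\n" for line in lines)
```

Since "delete" and "update" actions can share one bulk body, the deletes could presumably go into the same request as the updates.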
Index mapping:
{
  "demand_supply_leads_20200218205905": {
    "mappings": {
      "leads": {
        "dynamic": "false",
        "properties": {
          "buy_leads": {
            "type": "nested",
            "properties": {
              "city_uuid": {
                "type": "keyword"
              },
              "created_at": {
                "type": "integer"
              }
            }
          },
          "pg_leads": {
            "type": "nested",
            "properties": {
              "city_uuid": {
                "type": "keyword"
              },
              "created_at": {
                "type": "integer"
              }
            }
          }
        }
      }
    }
  }
}