Okay, as you suggested, assume I have created a carid (where carid = <car_name>__modelnumber) to locate a single document uniquely.
I have read the Elasticsearch API docs, where we have two options to update data:
- Update API (updates a single record by id (carid))
- Update By Query API, but it is not feasible if we are updating each document with different data.
I observed that the Update API supports only a single document per request. But I want to update 1 million records through Spark, so my question is: how can I update at that scale, given that the Update API handles only one document at a time?
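For context, my understanding is that the Bulk API can batch many per-document updates in a single request, which is what I would hope to drive from Spark. Here is a minimal sketch in Python of building such update actions (the field names and the make_carid helper are just from my cars example; in practice the actions would be handed to elasticsearch.helpers.bulk):

```python
# Sketch: build Bulk-API "update" actions, one per document, keyed by carid.
# Assumes records shaped like {"carname": ..., "modelnumber": ..., "enginetype": ...}.

def make_carid(record):
    # carid = <car_name>__modelnumber, used as the document _id
    return f"{record['carname']}__{record['modelnumber']}"

def make_update_actions(records, index="carsdata"):
    """Yield one partial-update action per record.

    Each action updates a single document by _id, but the Bulk API
    accepts thousands of these in one request, so this scales far
    beyond calling the single-document Update API per record.
    """
    for record in records:
        yield {
            "_op_type": "update",        # partial update, not a full reindex
            "_index": index,
            "_id": make_carid(record),
            "doc": {"enginetype": record["enginetype"]},
            "doc_as_upsert": True,       # create the document if the id is missing
        }

records = [
    {"carname": "BMW4", "modelnumber": "M440i", "enginetype": "Petrol"},
    {"carname": "BMW4", "modelnumber": "430d", "enginetype": "Diesel"},
]
actions = list(make_update_actions(records))
# These actions would then be sent with elasticsearch.helpers.bulk(client, actions).
```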
I have explored one more way as well: delete part of the data (using Delete By Query) per our requirement and then append the new data. But Delete By Query is not performing as expected, and it does not delete all the matching documents when we have millions of records.
Here is the sample code I executed in Kibana:
POST /carsdata/_delete_by_query
{
  "query": {
    "bool": {
      "must": [
        { "match": { "carname": "BMW4" } },
        { "match": { "enginetype": "Petrol" } }
      ]
    }
  }
}
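I have also come across request parameters that are supposed to make large deletes more reliable: conflicts=proceed (skip version conflicts instead of aborting the whole operation), slices=auto (parallelize the delete), and wait_for_completion=false (run it as a background task). A variant of the request above using them:

```
POST /carsdata/_delete_by_query?conflicts=proceed&slices=auto&wait_for_completion=false
{
  "query": {
    "bool": {
      "must": [
        { "match": { "carname": "BMW4" } },
        { "match": { "enginetype": "Petrol" } }
      ]
    }
  }
}
```

I have not yet confirmed whether these parameters fix the partial-deletion issue at the millions-of-records scale.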
Could you please help me figure out how to delete data (possibly millions of documents) based on a condition without failures, and, if the delete works reliably, how to integrate it with the Spark Elasticsearch API?
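For the Spark side, what I am currently looking at is the elasticsearch-hadoop connector, which (as I understand it) issues bulk upserts when writing a DataFrame with settings along these lines; the option names are the connector's es.* keys, and the values are just my assumptions for this cars example:

```
# use my carid field as the document _id
es.mapping.id = carid
# update existing documents, create missing ones
es.write.operation = upsert
# documents per bulk request
es.batch.size.entries = 10000
# retry failed bulk batches
es.batch.write.retry.count = 3
```

If this is the right direction, I would appreciate confirmation of which settings matter for updating 1 million records.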