Sending request to one index, writing to multiple indices

Aditya_Teltia · July 11, 2023, 8:50am

I have an index named index1. I want to configure it so that any write or update request sent to index1 is written to both index1 and index2, while search requests still use only index1. Is this possible with existing settings in Elasticsearch 7.17.10, or would I have to implement it inside Elasticsearch myself? If so, how would I go about implementing it?

Christian_Dahlqvist · July 11, 2023, 8:51am

That is something I believe you need to implement outside of Elasticsearch.

Aditya_Teltia · July 11, 2023, 10:22am

Can you suggest a way to achieve this? I am having a hard time finding anything on how to implement it.

Christian_Dahlqvist · July 11, 2023, 10:26am

It is not possible within Elasticsearch, so you may need to create a proxy of some kind to intercept and duplicate requests.
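As a rough illustration of the duplication step such a proxy would perform, here is a minimal Python sketch. It rewrites a `_bulk` request body so that every action aimed at one index is repeated for a second index. The function name `duplicate_bulk_body` is hypothetical, and the sketch assumes each action line names its `_index` explicitly. Forwarding the rewritten body to the cluster and handling partial failures are left out.

```python
import json

# Bulk actions that are followed by a source/document line ("delete" is not).
ACTIONS_WITH_SOURCE = {"index", "create", "update"}


def duplicate_bulk_body(body: str, source_index: str, mirror_index: str) -> str:
    """Repeat every bulk action that targets source_index for mirror_index.

    Assumption: each action line carries an explicit "_index"; requests sent
    to /index1/_bulk without it would need the index taken from the URL.
    """
    lines = [line for line in body.splitlines() if line.strip()]
    out = []
    i = 0
    while i < len(lines):
        action_line = lines[i]
        op, meta = next(iter(json.loads(action_line).items()))
        payload = None
        if op in ACTIONS_WITH_SOURCE:
            payload = lines[i + 1]
            i += 1
        i += 1

        # Always keep the original action untouched.
        out.append(action_line)
        if payload is not None:
            out.append(payload)

        # Add a copy of the action for the mirror index.
        if meta.get("_index") == source_index:
            out.append(json.dumps({op: {**meta, "_index": mirror_index}}))
            if payload is not None:
                out.append(payload)
    # A bulk body must end with a newline.
    return "\n".join(out) + "\n"
```

Search requests would pass through the proxy unchanged, so they would still only hit index1.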

Aditya_Teltia · July 11, 2023, 10:37am

Is it possible to implement this inside Elasticsearch, maybe by creating a new pipeline processor, something like this:

{
    "processors": [
        {
            "set": {
                "field": "_index",
                "value": ["index1", "index2"]
            }
        }
    ]
}

Or would this result in errors?

Christian_Dahlqvist · July 11, 2023, 10:38am

No, I do not think you can do this within Elasticsearch.

Aditya_Teltia · July 11, 2023, 10:44am

May I know why Elasticsearch does not allow something like this? Would it lead to some kind of error condition?

Christian_Dahlqvist · July 11, 2023, 11:10am

I do not know why, but I suspect it could cause issues where a single index request is only partially successful, which as far as I know is not currently possible. I also guess it could have security implications.

Aditya_Teltia · July 11, 2023, 11:39am

But I am trying to write to an existing index, index1, and a new index, index2. I don't see how this could have security implications. I want to know whether I can add this feature to the Elasticsearch codebase on GitHub. I am not asking about an existing feature.

Christian_Dahlqvist · July 11, 2023, 11:41am

I do not know whether this would be possible at all or whether it would break something, so I will leave that for others.

btsinfo · July 11, 2023, 12:42pm

To my knowledge, no, there is no specific configuration that handles your use case. However, you can use a transform to write each entry from index1 to index2. Please note that this approach does not handle data updates. Alternatively, you can use Logstash to do this processing and manage additions and updates between the two indices based on the document_id. This approach is suitable for a moderate volume of source data but may not be efficient for a very large volume. It would help if you could share your specific business use case, so that a solution can be proposed more effectively.

Aditya_Teltia · July 11, 2023, 12:45pm

I only need this for a moderate volume of data. Only while index1 is going through snapshot and restore do I want all updates and additions to be written to both index1 and index2. How can I do this using Logstash? Or is there another way to do the same thing?

Can you describe this approach in a little more detail?

leandrojmp · July 11, 2023, 12:49pm

This is not possible by design. It would add a lot of complexity and could cause multiple issues, and I don't think a feature request to add it would be considered.

There are performance issues, management issues, security issues, and probably a lot more.

What you want to do can be easily done outside Elasticsearch, but you would need to change how you index your data.

If you use Logstash, you can have two Elasticsearch outputs, each pointing to one of the indices, and for search you could use an alias that points to only one of the indices.
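Logstash has its own pipeline syntax, so the Python below only mirrors the idea rather than showing a Logstash configuration. It is a minimal sketch of a client-side dual write, plus the request body for an alias that covers only one index. The helper names and the alias name `index1_search` are hypothetical. The bodies are meant for the `_bulk` and `_aliases` endpoints, and sending them is left out.

```python
import json
from typing import Dict, List


def dual_write_bulk(docs: Dict[str, dict], indices: List[str]) -> str:
    """Build a _bulk body that indexes every document into every listed index."""
    lines = []
    for doc_id, source in docs.items():
        for index in indices:
            # Same _id in each index, so later writes overwrite the same document.
            lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
            lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"


def search_alias_actions(alias: str, index: str) -> dict:
    """Build an _aliases body that points a search alias at a single index."""
    return {"actions": [{"add": {"index": index, "alias": alias}}]}


bulk_body = dual_write_bulk({"1": {"msg": "hello"}}, ["index1", "index2"])
alias_body = search_alias_actions("index1_search", "index1")
```

Searches would then go through `index1_search`, so they never see index2 even though both indices receive every write.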

Can you provide more context on why you would want to do that? It is not clear.

Aditya_Teltia · July 11, 2023, 12:57pm

I want to segregate the updates and additions that happen during the process, while still being able to query index1 with no data inconsistency.

btsinfo · July 11, 2023, 1:05pm

You can use a transform of type 'latest' to achieve the desired behavior, where each document in index1 is written to index2 after a configurable delay, for example 60 seconds. However, it is important to configure the unique keys and sort field correctly to meet your requirements.
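A minimal sketch of what such a transform definition could look like, written as a Python dict that would be sent as the body of `PUT _transform/<transform_id>`. The field names `doc_key` and `updated_at` are assumptions for illustration. The real unique key and timestamp field depend on the documents in index1.

```python
import json
from typing import List


def latest_transform_body(
    source_index: str,
    dest_index: str,
    unique_key: List[str],
    sort_field: str,
    delay: str = "60s",
) -> dict:
    """Build the body of a continuous 'latest' transform."""
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        # Keep, per unique key, the document with the highest sort_field value.
        "latest": {"unique_key": unique_key, "sort": sort_field},
        # Continuous mode: pick up new data based on a timestamp field,
        # waiting `delay` so that late-arriving documents are not missed.
        "sync": {"time": {"field": sort_field, "delay": delay}},
        # How often the transform checks the source index for changes.
        "frequency": "1m",
    }


# Hypothetical field names: replace with fields that exist in index1.
body = latest_transform_body("index1", "index2", ["doc_key"], "updated_at")
print(json.dumps(body, indent=2))
```

Because the transform runs on a schedule, index2 lags behind index1 by roughly the check interval plus the delay, so it is not a synchronous dual write.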

leandrojmp · July 11, 2023, 1:05pm

And how would writing to two indices help with that?

Assume that you are using Logstash to write data to both index_1 and index_2 and want to create a snapshot of index_1. Everything written after your snapshot request will be added to both index_1 and index_2, so no matter which index you query, it will return the same data.

Aditya_Teltia · July 11, 2023, 1:09pm

index_1 initially has a huge amount of data, which I am transferring to cluster2 using snapshot and restore. While that process is ongoing, I want the updates and writes segregated from the rest of the data.

leandrojmp · July 11, 2023, 1:30pm

If you are writing to two indices, like index_1 and index_2, any document added to index_1 will also be added to index_2, so you will have new writes in index_1 that are not present in the current snapshot.

The same goes for updates, unless you update just one of the indices, but again, this will make your data inconsistent between the indices.

As already explained in your other post, I don't think you can achieve what you want without any downtime.

Aditya_Teltia · July 11, 2023, 1:39pm

I am not writing to both indices initially. At first I write only to index_1. Then, after running snapshot and restore, I want to write to both index_1 and index_2, so that the updates and additions are segregated and stored in index_2 while I can still query data from index_1. This will not result in data inconsistency.

Christian_Dahlqvist · July 11, 2023, 1:45pm

If you are performing updates where the existing document is modified rather than overwritten, I do not believe this statement is true. Even if you perform updates by overwriting, I suspect there would be race conditions leading to inconsistencies, though they may be less likely.
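To make the partial-update concern concrete, here is a small in-memory simulation in Python. It uses plain dicts, not Elasticsearch itself, and the `partial_update` helper is hypothetical. It only models the general behavior of merging changes into an existing document, with an optional upsert when the document is missing.

```python
from typing import Dict

Index = Dict[str, dict]


def partial_update(index: Index, doc_id: str, changes: dict, upsert: bool = False) -> bool:
    """Merge `changes` into an existing document.

    Returns False when the document is missing and upsert is disabled,
    which stands in for a "document missing" failure.
    """
    if doc_id not in index:
        if not upsert:
            return False
        # With upsert, the partial document becomes the whole document.
        index[doc_id] = {}
    index[doc_id].update(changes)
    return True


# index_1 holds the full document; index_2 starts empty when dual writes begin.
index_1: Index = {"1": {"title": "a", "views": 1}}
index_2: Index = {}

ok_1 = partial_update(index_1, "1", {"views": 2})               # merged into the full doc
ok_2 = partial_update(index_2, "1", {"views": 2})               # fails: no doc to modify
ok_3 = partial_update(index_2, "1", {"views": 2}, upsert=True)  # stores only the changed field
```

After these calls, index_1 holds the full updated document while index_2 holds only the `views` field, so the two indices no longer agree on document "1".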
