Does Rally support benchmarking partial updates (doc_as_upsert)?


(Zixuan Liu) #1

Hi team,

I have a use case where I need to do partial updates to a document.

Example

POST /index/mapping/1/_update
{
  "doc" : {
    "name" : "A"
  },
  "doc_as_upsert" : true
}

POST /index/mapping/1/_update
{
  "doc" : {
    "age" : 23
  },
  "doc_as_upsert" : true
}

Expected output
{"id" : 1, "name": "A", "age": 23}

Can Rally support this? If not, do you have any recommendations on how to benchmark this scenario?

Thanks


(Daniel Mitterdorfer) #2

Hi @Zixuan_Liu,

Rally does not support the update API out of the box. However, you can write a so-called custom runner and use the Python client's update() method.
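A custom runner along these lines could be sketched as follows. This is a minimal sketch, not an official Rally operation: the operation name "upsert", the function name upsert_runner and the parameter keys "index", "id" and "doc" are assumptions made for illustration. It assumes a Rally version that supports async runners and a Python client whose update() accepts a body argument; older clients may also require a doc_type argument.

```python
# track.py (sketch): custom runner that issues one partial update per call.
# The parameter keys "index", "id" and "doc" are hypothetical; they must match
# whatever the parameter source of the operation provides.

async def upsert_runner(es, params):
    """Send a single update request with doc_as_upsert enabled."""
    await es.update(
        index=params["index"],
        id=params["id"],
        # Same payload as the REST example: merge the partial document,
        # or create the document if it does not exist yet.
        body={"doc": params["doc"], "doc_as_upsert": True},
    )
    # Report one operation to Rally for throughput calculation.
    return {"weight": 1, "unit": "ops"}


def register(registry):
    # "upsert" is the operation-type name that the track would refer to.
    registry.register_runner("upsert", upsert_runner, async_runner=True)
```

An operation in the track would then use "operation-type": "upsert" and be combined with a parameter source that supplies the three keys.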

That solves only part of the problem, though. You also need a way to specify the documents that you want to update. This is what so-called parameter sources are meant for. The right approach depends on how you provide the documents to upsert.

  • The simplest case would be to use just a dict or to generate the documents on the fly. Then you can write a parameter source similar to the one in the documentation.
  • Alternatively, you could read them from a file. However, this is more complicated, especially if you want to use it with multiple clients. For bulk indexing, Rally provides a fairly generic solution that lets an arbitrary number of clients read portions of a file, but it is by far the most complex parameter source. In your special case, however, you might get away with a less generic solution: you could create one file per client. If you want to go down that route, I think our bulk parameter source could still serve as a starting point.
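For the first option, a parameter source that generates partial documents on the fly might look roughly like this. It is a sketch under assumptions: the class name UpsertParamSource, the registration name "upsert-param-source", the "doc-count" setting and the default index name are all hypothetical, and the generated fields simply mirror the name/age example from the question. It follows the class shape described in the Rally documentation (a constructor taking the track and params, partition() and params()); older Rally versions expect a size() method rather than the infinite attribute.

```python
import copy


class UpsertParamSource:
    """Generates partial documents; each client gets a disjoint set of ids."""

    def __init__(self, track, params, **kwargs):
        # "index" and "doc-count" are hypothetical operation settings.
        self._index = params.get("index", "people")
        self._doc_count = int(params.get("doc-count", 1000))
        self._next_id = 0
        self._step = 1
        self._calls = 0
        # The source is finite: it ends once all ids have been handed out.
        self.infinite = False

    def partition(self, partition_index, total_partitions):
        # Client k handles ids k, k + n, k + 2n, ... so clients never collide.
        part = copy.copy(self)
        part._next_id = partition_index
        part._step = total_partitions
        return part

    def params(self):
        if self._next_id >= self._doc_count:
            # Signals that this client has no more work.
            raise StopIteration
        doc_id = self._next_id
        self._next_id += self._step
        # Alternate between the two partial documents from the question.
        if self._calls % 2 == 0:
            doc = {"name": f"user-{doc_id}"}
        else:
            doc = {"age": 18 + doc_id % 60}
        self._calls += 1
        return {"index": self._index, "id": str(doc_id), "doc": doc}


def register(registry):
    registry.register_param_source("upsert-param-source", UpsertParamSource)
```

The dict returned by params() is what a custom runner receives as its params argument, so the keys here must match the ones the runner reads. In a real track.py, both registrations would go into a single register() function.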

I hope that helps you to get started.

Daniel


(Zixuan Liu) #3

Thanks for the insights. I will start from there.

