Parallel Bulk from multiple source files

I'm trying to build a track with two bulk operations that run in parallel, each reading from a different source file. Both files use includes-action-and-meta-data, so the index/type is set in the data files themselves.
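To illustrate what I mean: with includes-action-and-meta-data set to true, every document line in the data file is preceded by a bulk action line, in the same layout the Elasticsearch bulk API expects. A minimal sketch of the first lines of documents-t1.json (the index name, type name and fields are made-up placeholders, not my real data):

```json
{"index": {"_index": "index-t1", "_type": "doc"}}
{"field": "first document"}
{"index": {"_index": "index-t1", "_type": "doc"}}
{"field": "second document"}
```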

Will this work the way I want it to, or is there some limitation, or did I misunderstand the wiki? Also, inside the bulk operation, is the corpora property a list of names or just a single name?

I want to use this track and corpora:

{
  "corpora": [
    {
      "name": "t1",
      "documents": [
        {
          "source-file": "documents-t1.json",
          "document-count": 10000,
          "includes-action-and-meta-data": true
        }
      ]
    },
    {
      "name": "t2",
      "documents": [
        {
          "source-file": "documents-t2.json",
          "document-count": 10000,
          "includes-action-and-meta-data": true
        }
      ]
    }
  ],
  "schedule": [
    {
      "parallel": {
        "tasks": [
          {
            "name": "bulk1",
            "operation-type": "bulk",
            "corpora": "t1",
            "bulk-size": 5000
          },
          {
            "name": "bulk2",
            "operation-type": "bulk",
            "corpora": "t2",
            "bulk-size": 5000
          }
        ]
      }
    }
  ]
}

Thank you very much,
David.

Hi David,

Honestly, I have never tried what you want to achieve here, but it should work fine with Rally. The only problem I can see is that you mixed task properties with operation properties. Instead of doing this:

{
  "name": "bulk1",
  "operation-type": "bulk",
  "corpora": "t1",
  "bulk-size": 5000
}

you need to do this (note that we define the operation parameters in a dedicated operation element):

{
  "name": "bulk1",
  "operation": {
    "operation-type": "bulk",
    "corpora": "t1",
    "bulk-size": 5000
  }
}

We also cover this in the docs, but we are aware that this structure is not easy to grasp.

If you want to specify, e.g., the number of clients, you need to do this at the task level (see the docs for other properties of a task):

{
  "name": "bulk1",
  "clients": 4,
  "operation": {
    "operation-type": "bulk",
    "corpora": "t1",
    "bulk-size": 5000
  }
}

This would run the specified operation with four clients instead of one.
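Applied to both of your tasks, the schedule might then look roughly like this. It is an untested sketch assembled from the snippets above; the corpora section stays exactly as you posted it and is left out here, and the corpora values are kept as single names, as in your original:

```json
{
  "schedule": [
    {
      "parallel": {
        "tasks": [
          {
            "name": "bulk1",
            "operation": {
              "operation-type": "bulk",
              "corpora": "t1",
              "bulk-size": 5000
            }
          },
          {
            "name": "bulk2",
            "operation": {
              "operation-type": "bulk",
              "corpora": "t2",
              "bulk-size": 5000
            }
          }
        ]
      }
    }
  ]
}
```

A task-level property such as clients would go next to name in each task, not inside the operation element.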

Daniel
