Multiple indices are indexed in sequence

Hi

I am using Rally's existing tracks for benchmarking. I noticed that nyc_taxis is the largest track, with 4.5 GB of compressed and 74.3 GB of uncompressed documents. I want to test with a larger data volume.

I have used the following trick to index the nyc_taxis document corpus ten times (note the index_count variable at the top):

{% set index_count = 10 %}
{
  "version": 2,
  "description": "Taxi rides in New York in 2015",
  "indices": [
  {% set comma = joiner() %}
  {% for item in range(index_count) %}
  {{ comma() }}
    {
      "name": "nyc_taxis-{{item}}",
      "body": "index.json",
      "types": [ "type" ],
      "auto-managed": false
    }
  {% endfor %}
  ],
  "corpora": [
    {
      "name": "nyc_taxis",
      "base-url": "http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/nyc_taxis",
      "documents": [
      {% set comma = joiner() %}
      {% for item in range(index_count) %}
      {{ comma() }}

I referred to the link below for the above trick.

My concern is that when I use the above trick, Elasticsearch ends up with 10 different indices, and they are indexed in sequence.

How can I make these indices be indexed in parallel, so that the CPU is used optimally?

All specified clients send bulk requests to Elasticsearch as fast as they can. In other words, you control the indexing load by varying the number of clients rather than the number of indices that you bulk-index into, unless I misunderstand what you're after. Hope that helps. :)
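As a rough illustration of where the client count is set, here is a minimal sketch of a track fragment. It assumes the track defines a bulk operation named "index" and a challenge named "append-no-conflicts"; both names, and the value 8, are placeholders for illustration rather than values taken from the track in this thread.

```json
{
  "challenges": [
    {
      "name": "append-no-conflicts",
      "schedule": [
        {
          "operation": "index",
          "clients": 8
        }
      ]
    }
  ]
}
```

Raising "clients" on the bulk task increases the number of concurrent bulk requests. Some tracks may also expose this value as a track parameter; whether yours does depends on the track version, so check its README before relying on that.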

Daniel
