3G/s reads with fusion


(None) #1

Just for fun and search! :slight_smile:

Testing with 1.6.0 and Linux

4 nodes, 1.5 billion documents, 7 indexes. One query with a simple "user id" filter and 20 aggregations using doc values.

Unofficial number from eyeballing iotop: about 3G/s reads per node. It takes about 20 seconds to calculate all 20 aggregations.

The cards are capable of 6G/s, so the nature of the query may be why it's not any faster.
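As a rough back-of-envelope check on those numbers, here is a small sketch. It assumes "G/s" means gigabytes per second and that every node sustains the eyeballed rate for the whole 20 seconds, which is almost certainly an overestimate; the function and variable names are just for illustration.

```python
def estimate_gb_read(rate_gb_per_s: float, seconds: float, nodes: int) -> float:
    """Upper-bound estimate of data read (in GB), assuming every node
    sustains the given read rate for the entire query duration."""
    return rate_gb_per_s * seconds * nodes


per_node_rate = 3.0   # GB/s per node, eyeballed from iotop (assumed gigabytes)
card_peak = 6.0       # GB/s the cards are said to be capable of
query_seconds = 20.0  # approximate time for the 20 aggregations
nodes = 4             # data nodes

total_gb = estimate_gb_read(per_node_rate, query_seconds, nodes)
utilisation = per_node_rate / card_peak

print(f"at most ~{total_gb:.0f} GB read per query, {utilisation:.0%} of card peak")
```

So under those assumptions a single query touches at most a couple of hundred gigabytes, and the cards are running at roughly half of their rated read speed.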

Also, Linux seems to be performing a lot better than Windows and mmapfs and all that swap business.


(Mark Walkom) #2

It'd be nice to have some snazzy graphs to go with this :wink:


(Ed) #3

Is this with default configs, or have you set it up with any tuning (JVM, masters/data, ...)?


(None) #4

What would you like to see, if I am allowed by work to provide it...


(Mark Walkom) #5

Some throughput would be nice! For both query times and IO.


(None) #6

This is all default settings.

Running 4 dedicated data nodes, 1 master (this is test environment).
Data nodes are each: 32 cores (16 hyperthreaded), 128GB RAM, Java 1.8_45, ES_HEAP=30G.
Did some minor tweaks to recovery settings but nothing else.

8 "monthly" indexes with 8 shards + 1 replica, with about 250,000,000 documents each. So total is now 1.8 billion documents.
18TB on disk including replicas.
Use doc values as much as possible.
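To put that layout in perspective, here is a quick sketch of the shard arithmetic. It assumes "8 shards + 1 replica" means 8 primary shards per index with one replica copy of each, that shards are spread evenly over the 4 data nodes, and that 1 TB = 1000 GB; the variable names are only illustrative.

```python
indexes = 8
primaries_per_index = 8          # assumed: "8 shards" = 8 primaries per index
replicas = 1                     # one replica copy of each primary
data_nodes = 4
docs_per_index = 250_000_000     # "about 250,000,000 documents each"
total_on_disk_gb = 18 * 1000     # 18TB including replicas, assuming 1 TB = 1000 GB

# Every primary exists (1 + replicas) times across the cluster.
shard_copies = indexes * primaries_per_index * (1 + replicas)
shards_per_node = shard_copies / data_nodes
gb_per_shard_copy = total_on_disk_gb / shard_copies
gb_per_node = total_on_disk_gb / data_nodes
docs_per_primary = docs_per_index / primaries_per_index

print(f"{shard_copies} shard copies, {shards_per_node:.0f} per node")
print(f"~{gb_per_shard_copy:.0f} GB per shard copy, ~{gb_per_node:.0f} GB per node")
print(f"~{docs_per_primary / 1e6:.2f} million documents per primary shard")
```

With 128GB RAM per node and a 30G heap, the data per node under these assumptions is far larger than what the page cache can hold, which would fit with the query being bound by disk reads.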

The query is executed as follows...

GET /my-index*/my-type/_search
{
  "size": 0,
  "query": {
    "filtered": {
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "my user": {
                  "value": "john.smith@gmail.com"
                }
              }
            }
          ]
        }
      }
    }
  },
  "aggs": {
  // 20 aggregations here...
  }
}
