Very slow ElasticSearch Index


(lekkie) #1

Hi guys,

We index a document at *** milli on ElasticSearch, which I think is too
slow, especially for the amount of resources we have set up. We would like to
index at a rate of 500 tps. Each document weighs between 20K and 30K.
How many index requests is it advisable to send at once (assuming we can afford
to send multiple HTTP index requests to the server at once)? I understand
bulk indexing is the preferred approach; for a 30K document, how many can be
bulked at once? How many HTTP bulk requests (supposing I am using a
multi-threaded HTTP client to make requests) is it advisable to make?

I would appreciate suggestions on how to index these documents as fast as
possible. We have two nodes set up; the config below is for one of the
two:

Shards: 5
Replica: 1

"nodes" : {
  "T5l5mvIdQsW3je7WmSPOcg" : {
    "name" : "SEARCH-01",
    "version" : "0.90.7",
    "attributes" : {
      "rack_id" : "prod",
      "max_local_storage_nodes" : "1"
    },
    "settings" : {
      "node.rack_id" : "prod",
      "action.disable_delete_all_indices" : "true",
      "cloud.node.auto_attributes" : "true",
      "indices.ttl.interval" : "90d",
      "node.max_local_storage_nodes" : "1",
      "bootstrap.mlockall" : "true",
      "index.mapper.dynamic" : "true",
      "cluster.routing.allocation.awareness.attributes" : "rack_id",
      "discovery.zen.minimum_master_nodes" : "3",
      "gateway.expected_nodes" : "1",
      "discovery.zen.ping.unicast.hosts" : "172.25.15.170,172.25.15.172,172.46.1.170,172.46.1.172",
      "discovery.zen.ping.multicast.enabled" : "false",
      "action.auto_create_index" : "true"
    },
    "os" : {
      "refresh_interval" : 1000,
      "available_processors" : 8,
      "cpu" : {
        "vendor" : "Intel",
        "model" : "Xeon",
        "mhz" : 2600,
        "total_cores" : 8,
        "total_sockets" : 2,
        "cores_per_socket" : 4,
        "cache_size" : "20kb",
        "cache_size_in_bytes" : 20480
      },
      "mem" : {
        "total" : "17.5gb",
        "total_in_bytes" : 18836545536
      },
      "swap" : {
        "total" : "5.8gb",
        "total_in_bytes" : 6274670592
      }
    },
    "process" : {
      "refresh_interval" : 1000,
      "id" : 3459,
      "max_file_descriptors" : 64000
    },
    "jvm" : {
      "pid" : 3459,
      "version" : "1.7.0_45",
      "vm_name" : "OpenJDK 64-Bit Server VM",
      "vm_version" : "24.45-b08",
      "vm_vendor" : "Oracle Corporation",
      "start_time" : 1386953353018,
      "mem" : {
        "heap_init" : "10.5gb",
        "heap_init_in_bytes" : 11301552128,
        "heap_max" : "10.4gb",
        "heap_max_in_bytes" : 11231821824,
        "non_heap_init" : "23.1mb",
        "non_heap_init_in_bytes" : 24313856,
        "non_heap_max" : "214mb",
        "non_heap_max_in_bytes" : 224395264,
        "direct_max" : "10.4gb",
        "direct_max_in_bytes" : 11231821824
      }
    },
    "thread_pool" : {
      "generic" : {
        "type" : "cached",
        "keep_alive" : "30s"
      },
      "index" : {
        "type" : "fixed",
        "min" : 8,
        "max" : 8,
        "queue_size" : "200"
      },
      "get" : {
        "type" : "fixed",
        "min" : 8,
        "max" : 8,
        "queue_size" : "1k"
      },
      "snapshot" : {
        "type" : "scaling",
        "min" : 1,
        "max" : 4,
        "keep_alive" : "5m"
      },
      "merge" : {
        "type" : "scaling",
        "min" : 1,
        "max" : 4,
        "keep_alive" : "5m"
      },
      "suggest" : {
        "type" : "fixed",
        "min" : 8,
        "max" : 8,
        "queue_size" : "1k"
      },
      "bulk" : {
        "type" : "fixed",
        "min" : 8,
        "max" : 8,
        "queue_size" : "50"
      },
      "optimize" : {
        "type" : "fixed",
        "min" : 1,
        "max" : 1
      },
      "warmer" : {
        "type" : "scaling",
        "min" : 1,
        "max" : 4,
        "keep_alive" : "5m"
      },
      "flush" : {
        "type" : "scaling",
        "min" : 1,
        "max" : 4,
        "keep_alive" : "5m"
      },
      "search" : {
        "type" : "fixed",
        "min" : 24,
        "max" : 24,
        "queue_size" : "1k"
      },
      "percolate" : {
        "type" : "fixed",
        "min" : 8,
        "max" : 8,
        "queue_size" : "1k"
      },
      "management" : {
        "type" : "scaling",
        "min" : 1,
        "max" : 5,
        "keep_alive" : "5m"
      },
      "refresh" : {
        "type" : "scaling",
        "min" : 1,
        "max" : 4,
        "keep_alive" : "5m"
      }
    },
    "network" : {
      "refresh_interval" : 5000
    },
    "http" : {
      "max_content_length" : "100mb",
      "max_content_length_in_bytes" : 104857600
    },
    "plugins" : [ ]
  }
}

The mapping is created dynamically because we create types daily, and it
looks like this:
{
  "consumers-20131216": {
    "properties": {
      "requestData": {
        "type": "string"
      },
      "requestTimestamp": {
        "type": "date",
        "format": "dateOptionalTime"
      },
      "responseData": {
        "type": "string"
      },
      "responseTimestamp": {
        "type": "date",
        "format": "dateOptionalTime"
      },
      "sequenceId": {
        "type": "long"
      },
      "service": {
        "type": "string"
      },
      "systemResponseCode": {
        "type": "string"
      },
      "systemResponseMessage": {
        "type": "string"
      },
      "transactionComponentTypeId": {
        "type": "long"
      },
      "transactionLogId": {
        "type": "long"
      },
      "user": {
        "type": "string"
      }
    }
  }
}

Regards.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/94df7d01-94d5-49f2-b817-f821e4910219%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Mark Walkom) #2

My first tip would be to drop OpenJDK and move to the Oracle JDK; you'll get a lot
better performance.
Bulk size depends a lot on your setup, document size, etc., but upwards of 5K
is generally towards the upper limit.
It might also be worth removing indices.ttl.interval and just using a
script to delete old indices, as TTL searches can use a fair bit of
resources.
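
A minimal sketch of the script-based cleanup suggested above, assuming indices are created per day with a `-YYYYMMDD` suffix in the style of the `consumers-20131216` name from the mapping (in the original post that name is a type, so per-day index naming is an assumption here). The helper only selects names older than a retention window; sending the actual delete requests to the cluster is left out. `expired_indices` and the 90-day default are hypothetical.

```python
from datetime import date, datetime, timedelta


def expired_indices(names, today, keep_days=90):
    """Return the names whose -YYYYMMDD suffix is older than keep_days before today."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in names:
        # Take whatever follows the last "-" as the candidate date suffix.
        suffix = name.rsplit("-", 1)[-1]
        try:
            day = datetime.strptime(suffix, "%Y%m%d").date()
        except ValueError:
            continue  # no parsable date suffix, so leave this name alone
        if day < cutoff:
            expired.append(name)
    return expired
```

Called with `["consumers-20130901", "consumers-20131216", "other"]` and `today=date(2013, 12, 17)`, only the September name falls outside the 90-day window.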

You also mentioned you have 2 nodes, but there are a lot more IPs listed in
the discovery hosts. Is that intentional? The same goes for minimum_master_nodes
being 3.
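
On the bulk sizing question, one way to experiment is to cap each bulk request by payload size and tune the cap against the cluster. The sketch below builds newline-delimited bulk bodies (an action line followed by a source line per document). The function name `bulk_bodies` and the 5 MB default cap are assumptions for illustration, not a recommendation from this thread.

```python
import json


def bulk_bodies(docs, index, doc_type, max_bytes=5 * 1024 * 1024):
    """Yield newline-delimited bulk bodies, each holding entries up to about max_bytes.

    docs is an iterable of (doc_id, source_dict) pairs. A single entry larger
    than max_bytes is still sent, alone in its own body.
    """
    lines, size = [], 0
    for doc_id, source in docs:
        action = json.dumps({"index": {"_index": index, "_type": doc_type, "_id": doc_id}})
        entry = action + "\n" + json.dumps(source) + "\n"
        entry_size = len(entry.encode("utf-8"))
        # Start a new body when adding this entry would exceed the cap.
        if lines and size + entry_size > max_bytes:
            yield "".join(lines)
            lines, size = [], 0
        lines.append(entry)
        size += entry_size
    if lines:
        yield "".join(lines)
```

Each yielded string can be posted as the body of one bulk request; measuring throughput at a few different caps is more reliable than picking a number up front.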

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

On 17 December 2013 18:55, lekkie omotayo lekkie.aydot@gmail.com wrote:



(lekkie) #3

Thanks for the insight.

> First tip would be to drop OpenJDK and move to Oracle, you'll get a
> lot better performance.

So I changed to the Oracle JDK and latency dropped from 4000 milliseconds to
around 2500 milliseconds.

> It might also be worth removing indices.ttl.interval and just using a
> script to delete old indices as TTL searches can use a fair bit of
> resources.

We also dropped indices.ttl.interval and latency dropped further to
1500 milliseconds.

> You also mentioned you have 2 nodes, but there are a lot more IPs
> listed in the discovery hosts, is that intentional? Same for minimum_master_nodes
> being 3.

Yes, the other 2 nodes are DR nodes. So we basically have 4 nodes, but 2 are
for disaster recovery. The discovery.zen.minimum_master_nodes value was
calculated as n/2 + 1, where n is 4.
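
The n/2 + 1 calculation above can be sketched as follows, assuming all four nodes are master-eligible (the thread does not say). It also shows a consequence worth checking: with a quorum of 3, the two nodes in one data center cannot elect a master on their own if the link to the other site drops. The function names are hypothetical.

```python
def quorum(master_eligible_nodes):
    """Majority of master-eligible nodes: n/2 + 1 using integer division."""
    return master_eligible_nodes // 2 + 1


def can_elect_master(visible_nodes, minimum_master_nodes):
    """A master can only be elected when enough master-eligible nodes see each other."""
    return visible_nodes >= minimum_master_nodes


required = quorum(4)                    # 4 // 2 + 1 = 3
print(required)                         # 3
print(can_elect_master(4, required))    # True: all four nodes reachable
print(can_elect_master(2, required))    # False: only one data center reachable
```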

One other thing to note: every request is an upsert.

What we have now is 1500 milliseconds per upsert, which is still very high.
We are aiming for sub-millisecond latency, or tens of milliseconds for a bulk
upload. Can this be achieved, or is it a pipe dream?
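
A back-of-the-envelope sketch relating the 500 tps target to the 1500 ms latency above, using Little's law (requests in flight = throughput × latency). The bulk figures in the second call (8 concurrent requests, 100 documents each, same latency) are illustrative assumptions, not measurements from this cluster.

```python
def required_concurrency(target_per_second, latency_seconds):
    """Little's law: requests in flight = arrival rate * time each request takes."""
    return target_per_second * latency_seconds


def docs_per_second(concurrent_requests, docs_per_request, latency_seconds):
    """Throughput when each of several concurrent requests carries a batch of documents."""
    return concurrent_requests * docs_per_request / latency_seconds


# Single-document upserts at 1.5 s each need 750 requests in flight to reach 500 tps.
print(required_concurrency(500, 1.5))   # 750.0

# Hypothetical bulk setup: 8 concurrent requests of 100 documents, 1.5 s per request.
print(docs_per_second(8, 100, 1.5))
```

The point of the arithmetic is that throughput and per-request latency are separate targets: batching and concurrency can raise documents per second even when a single round trip stays slow.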

On Tuesday, 17 December 2013 09:12:52 UTC+1, Mark Walkom wrote:



(Mark Walkom) #4

ES will only go as fast as the slowest node. With that in mind, are your
"DR" nodes the same capacity?

I also notice they are in different subnets, does that imply they are in
different datacenters?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

On 18 December 2013 00:44, lekkie omotayo lekkie.aydot@gmail.com wrote:



(lekkie) #5

Yes they all have the same capacity.

Yes, they are in different data centers (off-site).

On Tuesday, 17 December 2013 22:33:07 UTC+1, Mark Walkom wrote:

ES will only go as fast as the slowest node. With that in mind, are your
"DR" nodes the same capacity?

I also notice they are in different subnets, does that imply they are in
different datacenters?

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com <javascript:>
web: www.campaignmonitor.com

On 18 December 2013 00:44, lekkie omotayo <lekkie...@gmail.com<javascript:>

wrote:

Thanks for the insight.

First tip would be to drop OpenJDK and move to Oracle, you'll get a
lot better performance.

So I changed to Oracle JDK and latency dropped from 4000millisecond to
around 2500millisecond.

.....................

It might also be worth removing indices.ttl.interval and just using
a script to delete old indices as TTL searches can use a fair bit of
resources.

We also dropped indices.ttl.interval and it further dropped to
1500milliseconds.

You also mentioned you have 2 nodes, but there are a lot more IPs
listed in the discovery hosts, is that intentional? Same for
minimum_master_nodes being 3.
Yes, the other 2 nodes are DR nodes. So we basically have 4 nodes but 2
are for disaster recovery. And the discovery.zen.minimum_master_nodes
was calculated based on the n/2 + 1, where n was 4.

One other thing to note, every request is an upsert.

What we have now is 1500milliseconds per upsert. This is still very high.
We are looking at doing sub-zero millisecond or 10s of millisecond for bulk
upload. Can this be achieved or it is a pipe dream?

On Tuesday, 17 December 2013 09:12:52 UTC+1, Mark Walkom wrote:

First tip would be to drop OpenJDK and move to Oracle, you'll get a lot
better performance.
Bulk depends a lot on your setup and document size etc, but upwards of
5K is generally towards the upper limit.
It might also be worth removing indices.ttl.interval and just using a
script to delete old indices as TTL searches can use a fair bit of
resources.

You also mentioned you have 2 nodes, but there are a lot more IPs listed
in the discovery hosts, is that intentional? Same for minimum_master_nodes
being 3.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: ma...@campaignmonitor.com
web: www.campaignmonitor.com

On 17 December 2013 18:55, lekkie omotayo lekkie...@gmail.com wrote:

Hi guys,

We index a document at *** milli on ElasticSearch, which I think is too
slow especially for the amount of resources we have setup. We will like to
index at the rate of 500tps. Each document weighs between 20K and 30K.
How many indexes are advisable to be done at once (assuming we can
afford to send multiple http index request to the server at once)? I
understand bulk indexing is a preferred approach, for a 30K document how
much can be bulked at once? How many http bulk request (supposing I am
using a multi-threaded http client ot make requests) is advisable to make?

I will appreciate suggestions and how to index this document as fast as
possible. We have two nodes set up, the config below is for one out of the
two:

Shards: 5
Replica: 1

"nodes" : {
"T5l5mvIdQsW3je7WmSPOcg" : {
"name" : "SEARCH-01",
"version" : "0.90.7",
"attributes" : {
"rack_id" : "prod",
"max_local_storage_nodes" : "1"
},
"settings" : {
"node.rack_id" : "prod",
"action.disable_delete_all_indices" : "true",
"cloud.node.auto_attributes" : "true",
"indices.ttl.interval" : "90d",
"node.max_local_storage_nodes" : "1",
"bootstrap.mlockall" : "true",
"index.mapper.dynamic" : "true",
"cluster.routing.allocation.awareness.attributes" : "rack_id",
"discovery.zen.minimum_master_nodes" : "3",
"gateway.expected_nodes" : "1",
"discovery.zen.ping.unicast.hosts" :
"172.25.15.170,172.25.15.172,172.46.1.170,172.46.1.172",
"discovery.zen.ping.multicast.enabled" : "false",
"action.auto_create_index" : "true"
},
"os" : {
"refresh_interval" : 1000,
"available_processors" : 8,
"cpu" : {
"vendor" : "Intel",
"model" : "Xeon",
"mhz" : 2600,
"total_cores" : 8,
"total_sockets" : 2,
"cores_per_socket" : 4,
"cache_size" : "20kb",
"cache_size_in_bytes" : 20480
},
"mem" : {
"total" : "17.5gb",
"total_in_bytes" : 18836545536
},
"swap" : {
"total" : "5.8gb",
"total_in_bytes" : 6274670592
}
},
"process" : {
"refresh_interval" : 1000,
"id" : 3459,
"max_file_descriptors" : 64000
},
"jvm" : {
"pid" : 3459,
"version" : "1.7.0_45",
"vm_name" : "OpenJDK 64-Bit Server VM",
"vm_version" : "24.45-b08",
"vm_vendor" : "Oracle Corporation",
"start_time" : 1386953353018,
"mem" : {
"heap_init" : "10.5gb",
"heap_init_in_bytes" : 11301552128,
"heap_max" : "10.4gb",
"heap_max_in_bytes" : 11231821824,
"non_heap_init" : "23.1mb",
"non_heap_init_in_bytes" : 24313856,
"non_heap_max" : "214mb",
"non_heap_max_in_bytes" : 224395264,
"direct_max" : "10.4gb",
"direct_max_in_bytes" : 11231821824
}
},
"thread_pool" : {
"generic" : {
"type" : "cached",
"keep_alive" : "30s"
},
"index" : {
"type" : "fixed",
"min" : 8,
"max" : 8,
"queue_size" : "200"
},
"get" : {
"type" : "fixed",
"min" : 8,
"max" : 8,
"queue_size" : "1k"
},
"snapshot" : {
"type" : "scaling",
"min" : 1,
"max" : 4,
"keep_alive" : "5m"
},
"merge" : {
"type" : "scaling",
"min" : 1,
"max" : 4,
"keep_alive" : "5m"
},
"suggest" : {
"type" : "fixed",
"min" : 8,
"max" : 8,
"queue_size" : "1k"
},
"bulk" : {
"type" : "fixed",
"min" : 8,
"max" : 8,
"queue_size" : "50"
},
"optimize" : {
"type" : "fixed",
"min" : 1,
"max" : 1
},
"warmer" : {
"type" : "scaling",
"min" : 1,
"max" : 4,
"keep_alive" : "5m"
},
"flush" : {
"type" : "scaling",
"min" : 1,
"max" : 4,
"keep_alive" : "5m"
},
"search" : {
"type" : "fixed",
"min" : 24,
"max" : 24,
"queue_size" : "1k"
},
"percolate" : {
"type" : "fixed",
"min" : 8,
"max" : 8,
"queue_size" : "1k"
},
"management" : {
"type" : "scaling",
"min" : 1,
"max" : 5,
"keep_alive" : "5m"
},
"refresh" : {
"type" : "scaling",
"min" : 1,
"max" : 4,
"keep_alive" : "5m"
}
},
"network" : {
"refresh_interval" : 5000
},
"http" : {
"max_content_length" : "100mb",
"max_content_length_in_bytes" : 104857600
},
"plugins" : [ ]
}
}

The Mapping is dynamically created because we create types daily and it
looks like:
{
"consumers-20131216": {
"properties": {
"requestData": {
"type": "string"
},
"requestTimestamp": {
"type": "date",
"format": "dateOptionalTime"
},
"responseData": {
"type": "string"
},
"responseTimestamp": {
"type": "date",
"format": "dateOptionalTime"
},
"sequenceId": {
"type": "long"
},
"service": {
"type": "string"
},
"systemResponseCode": {
"type": "string"
},
"systemResponseMessage": {
"type": "string"
},
"transactionComponentTypeId": {
"type": "long"
},
"transactionLogId": {
"type": "long"
},
"user": {
"type": "string"
}
}
}
}

Regards.


(Mark Walkom) #6

The lag over that inter-DC link is probably causing your issues.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com

On 18 December 2013 18:12, lekkie omotayo lekkie.aydot@gmail.com wrote:

Yes, they all have the same capacity.

Yes, they are in different data centers (off-site).

On Tuesday, 17 December 2013 22:33:07 UTC+1, Mark Walkom wrote:

ES will only go as fast as the slowest node. With that in mind, are your
"DR" nodes the same capacity?

I also notice they are in different subnets; does that imply they are in
different datacenters?

On 18 December 2013 00:44, lekkie omotayo lekkie...@gmail.com wrote:

Thanks for the insight.

First tip would be to drop OpenJDK and move to Oracle, you'll get a
lot better performance.

So I changed to the Oracle JDK, and latency dropped from 4000 milliseconds to
around 2500 milliseconds.

.....................

It might also be worth removing indices.ttl.interval and just using
a script to delete old indices as TTL searches can use a fair bit of
resources.

We also dropped indices.ttl.interval, and latency dropped further to
1500 milliseconds.
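
The "script to delete old indices" suggestion quoted above could start from something like the following minimal Python sketch. It only works out which names are past a retention window; it assumes daily indices carry a -YYYYMMDD suffix in the same style as the consumers-20131216 type name, which the thread does not confirm. The function name and the 90-day window are illustrative only.

```python
from datetime import date, datetime, timedelta
from typing import Iterable, List


def expired_daily_names(names: Iterable[str], today: date, keep_days: int) -> List[str]:
    """Return the names whose -YYYYMMDD suffix is older than keep_days."""
    cutoff = today - timedelta(days=keep_days)
    expired = []
    for name in names:
        # Split on the last hyphen; names without one are skipped.
        _prefix, sep, suffix = name.rpartition("-")
        if not sep:
            continue
        try:
            day = datetime.strptime(suffix, "%Y%m%d").date()
        except ValueError:
            # Suffix is not a date, so this is not a daily name (assumption).
            continue
        if day < cutoff:
            expired.append(name)
    return expired


names = ["consumers-20131216", "consumers-20130901", "other"]
print(expired_daily_names(names, date(2013, 12, 18), keep_days=90))
```

Each name returned would then be removed with an HTTP DELETE request against that index, for example from a nightly cron job.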

You also mentioned you have 2 nodes, but there are a lot more IPs
listed in the discovery hosts, is that intentional? Same for
minimum_master_nodes being 3.

Yes, the other 2 nodes are DR nodes. So we basically have 4 nodes, but 2
are for disaster recovery. The discovery.zen.minimum_master_nodes value
was calculated as n/2 + 1, where n is 4.
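
As a quick check of the n/2 + 1 arithmetic, here is a minimal Python sketch. The function name is made up for illustration, and integer division is assumed, as is usual for this rule.

```python
def minimum_master_nodes(master_eligible_nodes: int) -> int:
    """Quorum size: a strict majority of the master-eligible nodes."""
    return master_eligible_nodes // 2 + 1


print(minimum_master_nodes(4))  # 4 // 2 + 1
```

With 4 nodes this gives 3. If the nodes are split two and two across the data centers, neither side can reach a quorum of 3 on its own when the link between them drops.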

One other thing to note, every request is an upsert.

What we have now is 1500 milliseconds per upsert. This is still very
high. We are looking at sub-millisecond, or tens of milliseconds,
for bulk upload. Can this be achieved, or is it a pipe dream?

On Tuesday, 17 December 2013 09:12:52 UTC+1, Mark Walkom wrote:

First tip would be to drop OpenJDK and move to Oracle, you'll get a lot
better performance.
Bulk depends a lot on your setup and document size etc, but upwards of
5K is generally towards the upper limit.
It might also be worth removing indices.ttl.interval and just using a
script to delete old indices as TTL searches can use a fair bit of
resources.
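
To make the bulk suggestion concrete, here is a minimal Python sketch that groups documents into newline-delimited bulk request bodies (one action line, then one source line per document). The index name "transactions" and both batch caps are assumptions for illustration, not recommendations; the type name comes from the mapping in the original post. It uses the plain index action, which replaces the whole document, as a stand-in for the upserts mentioned in the thread.

```python
import json
from typing import Dict, Iterable, Iterator, Tuple


def bulk_bodies(
    docs: Iterable[Tuple[str, Dict]],
    index: str,
    doc_type: str,
    max_docs: int = 500,                # assumed cap, tune by measurement
    max_bytes: int = 10 * 1024 * 1024,  # assumed cap, well under the 100mb HTTP limit
) -> Iterator[str]:
    """Yield bulk request bodies, each holding at most max_docs documents
    and (unless a single document is larger) at most max_bytes bytes."""
    lines, size, count = [], 0, 0
    for doc_id, source in docs:
        action = json.dumps({"index": {"_index": index, "_type": doc_type, "_id": doc_id}})
        pair = action + "\n" + json.dumps(source) + "\n"
        pair_size = len(pair.encode("utf-8"))
        # Flush the current batch before it would exceed either cap.
        if count and (count >= max_docs or size + pair_size > max_bytes):
            yield "".join(lines)
            lines, size, count = [], 0, 0
        lines.append(pair)
        size += pair_size
        count += 1
    if lines:
        yield "".join(lines)


docs = [(str(i), {"sequenceId": i, "service": "demo"}) for i in range(5)]
for body in bulk_bodies(docs, "transactions", "consumers-20131216", max_docs=2):
    print(body.count("\n") // 2, "documents in this batch")
```

Each yielded string would be sent as the body of one POST to the _bulk endpoint. The right batch size is best found by measuring throughput while increasing it gradually.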

You also mentioned you have 2 nodes, but there are a lot more IPs
listed in the discovery hosts, is that intentional? Same for
minimum_master_nodes being 3.


(lekkie) #7

Are there any protocols other than HTTP that I can send requests over?
Something faster than HTTP? Or do you mean the physical NIC? We run on a LAN.


(Mark Walkom) #8

Intra cluster comms are all handled over HTTP. What is the link between
your DCs like; 100M, 1G, 10G?
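
To put the link-speed question in context, here is a rough back-of-the-envelope sketch in Python using the figures from the original post (500 documents per second, 20K to 30K each). It assumes 1K = 1000 bytes, ignores protocol overhead and compression, and assumes every document crosses the inter-DC link once (for example to a replica), which may not match the actual shard layout. The function name is illustrative.

```python
def required_mbit_per_s(docs_per_s: int, doc_kilobytes: int) -> float:
    """Raw payload bandwidth in Mbit/s, assuming 1 KB = 1000 bytes."""
    bits_per_s = docs_per_s * doc_kilobytes * 1000 * 8
    return bits_per_s / 1_000_000


print(required_mbit_per_s(500, 20))  # 20K documents
print(required_mbit_per_s(500, 30))  # 30K documents
```

Under these assumptions the target rate needs about 80 to 120 Mbit/s of raw payload, so a 100M link would already be at or beyond its limit before any overhead.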

You could try using something like Logstash to replicate the indexes; that
way you can have two clusters, which should reduce the latency.

Regards,
Mark Walkom

Infrastructure Engineer
Campaign Monitor
email: markw@campaignmonitor.com
web: www.campaignmonitor.com


(system) #9