Testing distributed characteristic of Elasticsearch

Luke_Laird · December 14, 2014, 5:31pm

Hi guys,
Don't get me wrong: this is absolutely not another post about benchmarking
Elasticsearch.
First, I am pretty new to ES, so please be patient if I ask dumb questions. I
am running a test, for academic use only, to show that ES's distributed
design is an improvement over Lucene, which ES is built on. I
want to verify that with more than 1 node, a search query returns
faster. With 2 nodes (2 hard disks) we should get double the disk
bandwidth in theory (each ordinary disk peaks at ~50 MB/s, which is below the
~125 MB/s of 1 Gb Ethernet, so Ethernet is not a bottleneck).
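
The arithmetic behind that claim can be checked in a few lines of Python. This is only a rough sketch: it uses decimal units (1 Gb/s = 1000 Mb/s), ignores Ethernet and TCP overhead, and the function names are made up for illustration.

```python
# Figures taken from the post; everything else here is an illustrative assumption.
ETHERNET_GBIT_PER_S = 1   # direct 1 Gb Ethernet link
DISK_MB_PER_S = 50        # approximate peak throughput of one ordinary disk
NODES = 2                 # one disk per node


def link_mb_per_s(gbit_per_s):
    """Convert a link speed in Gbit/s to MB/s (decimal units, no protocol overhead)."""
    return gbit_per_s * 1000 / 8


def aggregate_disk_mb_per_s(nodes, per_disk_mb_per_s):
    """Best-case combined disk throughput if every node reads at peak speed."""
    return nodes * per_disk_mb_per_s


if __name__ == "__main__":
    link = link_mb_per_s(ETHERNET_GBIT_PER_S)
    disks = aggregate_disk_mb_per_s(NODES, DISK_MB_PER_S)
    print(f"link: {link} MB/s, disks combined: {disks} MB/s")
    print("network is the bottleneck" if disks > link else "disks are the bottleneck")
```

Under these assumptions even both disks together stay below the link speed, which is the point being made; real throughput on either side will be lower than these peak figures.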

I have 2 physical nodes (ordinary laptops) connected directly via their 1 Gb
Ethernet ports, with no router in between. My data is 20 GB (+ 20 GB replica) of
3 million records like this: http://pastebin.com/FDhfy6C3
(the data comes from http://www.mockaroo.com/67e33320)

My strategy is to run as many different search queries as possible and clear
the cache as I go. Something like:

curl -XPOST "http://192.168.57.103:9200/myjson/_cache/clear"

curl -XPOST "http://192.168.57.103:9200/myjson/_flush?force=true"

curl -XGET "http://192.168.57.103:9200/myjson/myjson/_search?pretty" -d '{
  "query" : {
    "bool" : {
      "should" : [
        { "match" : { "first_name" : "Clarence" }},
        { "match" : { "last_name" : "Fernandez" }},
        { "match" : { "country" : "uk" }},
        { "match" : { "amount" : "$9001.19" }},
        { "match" : { "password_hash" : "Th94hnXtaYtZ" }}
      ]
    }
  }
}'
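
A generator for query bodies like the one above might look roughly like this in Python. It is a minimal sketch, not a finished tool: the sample values in SAMPLE_VALUES are placeholders (real ones would be drawn from the indexed records), and the names build_query and generate_queries are hypothetical.

```python
import json
import random

# Placeholder values per field; in a real test these would come from the indexed data.
SAMPLE_VALUES = {
    "first_name": ["Clarence", "Alice", "Martha"],
    "last_name": ["Fernandez", "Nguyen", "Olsen"],
    "country": ["uk", "fr", "de"],
}


def build_query(rng, values=SAMPLE_VALUES):
    """Build one bool/should body with a randomly chosen match clause per field."""
    should = [
        {"match": {field: rng.choice(choices)}}
        for field, choices in values.items()
    ]
    return {"query": {"bool": {"should": should}}}


def generate_queries(n, seed=0):
    """Return n query bodies; a fixed seed makes the run repeatable."""
    rng = random.Random(seed)
    return [build_query(rng) for _ in range(n)]


if __name__ == "__main__":
    # Print one JSON body per line, ready to be passed to curl -d or an HTTP client.
    for body in generate_queries(3):
        print(json.dumps(body))
```

Using a fixed seed means the 1-node and 2-node runs can replay exactly the same queries, so the timings are comparable. Each search response also includes a "took" field (in milliseconds), which may be a more stable number to compare than wall-clock time measured on the client.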

I am writing a script to generate as many of these match clauses as possible, but I still want to ask: is what I am doing right?
Any comments or opinions are really appreciated.
Thanks.
