Many slow queries under high load after an hour


(Weiwei Wang) #1

I am load testing my program at high pressure, 1500 requests per second, and
each request performs one ES query (I use a match_all query plus a bunch of
filters, since filters can be cached). The index is 106.7kb with 200
documents. I start ES with:
bin/elasticsearch -Xms2g -Xmx2g -Des.max-open-files=true -Dbootstrap.mlockall=true

After an hour, ES queries become slow and the log shows lots of slow
queries. I paste some of them below and would appreciate your help:

[2011-12-15 13:49:22,050][WARN ][index.search.slowlog.query] [Nefarius] [dianxin][1] took[8.3s], took_millis[8329], search_type[QUERY_THEN_FETCH], total_shards[2], source[
{
  "from": 0,
  "size": 1,
  "query": { "match_all": {} },
  "filter": {
    "bool": {
      "must": { "term": { "pkgs": "recommendation.test.pkg.3" } },
      "must": { "term": { "lcs": "recommendation.test.lc.3" } },
      "must": { "term": { "nets": "1" } },
      "must": { "term": { "androidAPILevels": "9" } },
      "must": { "term": { "status": 1 } },
      "must": { "range": { "from": { "from": null, "to": 1323928153712, "include_lower": true, "include_upper": false } } },
      "must": { "range": { "to": { "from": 1323928153712, "to": null, "include_lower": false, "include_upper": true } } },
      "must": { "range": { "hMax": { "from": 800, "to": null, "include_lower": true, "include_upper": true } } },
      "must": { "range": { "hMin": { "from": null, "to": 800, "include_lower": true, "include_upper": true } } },
      "must": { "range": { "wMax": { "from": 480, "to": null, "include_lower": true, "include_upper": true } } },
      "must": { "range": { "wMin": { "from": null, "to": 480, "include_lower": true, "include_upper": true } } }
    }
  },
  "explain": false,
  "fields": "id"
}] extra_source[],
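For reference, the slow log serializes the bool filter with repeated "must" keys; the usual way to write the same filter is with the must clauses as a JSON array. This is a sketch of the equivalent request body reconstructed from the log above, not copied verbatim from it:

```json
{
  "from": 0,
  "size": 1,
  "query": { "match_all": {} },
  "filter": {
    "bool": {
      "must": [
        { "term": { "pkgs": "recommendation.test.pkg.3" } },
        { "term": { "lcs": "recommendation.test.lc.3" } },
        { "term": { "nets": "1" } },
        { "term": { "androidAPILevels": "9" } },
        { "term": { "status": 1 } },
        { "range": { "from": { "to": 1323928153712, "include_upper": false } } },
        { "range": { "to": { "from": 1323928153712, "include_lower": false } } },
        { "range": { "hMax": { "from": 800 } } },
        { "range": { "hMin": { "to": 800 } } },
        { "range": { "wMax": { "from": 480 } } },
        { "range": { "wMin": { "to": 480 } } }
      ]
    }
  },
  "fields": "id"
}
```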

The mapping is:
{
  "test": {
    "_all": {
      "enabled": false
    },
    "properties": {
      "id":               { "type": "string",  "index": "not_analyzed", "search_analyzer": "keyword", "store": "yes" },
      "pkgs":             { "type": "string",  "index": "not_analyzed", "search_analyzer": "keyword", "store": "no" },
      "lcs":              { "type": "string",  "index": "not_analyzed", "search_analyzer": "keyword", "store": "no" },
      "from":             { "type": "long",    "index": "not_analyzed", "store": "no" },
      "to":               { "type": "long",    "index": "not_analyzed", "store": "no" },
      "status":           { "type": "integer", "index": "not_analyzed", "store": "no" },
      "nets":             { "type": "integer", "index": "not_analyzed", "store": "no" },
      "androidAPILevels": { "type": "integer", "index": "not_analyzed", "store": "no" },
      "hMin":             { "type": "integer", "index": "not_analyzed", "store": "no" },
      "hMax":             { "type": "integer", "index": "not_analyzed", "store": "no" },
      "wMin":             { "type": "integer", "index": "not_analyzed", "store": "no" },
      "wMax":             { "type": "integer", "index": "not_analyzed", "store": "no" },
      "models":           { "type": "string",  "index": "analyzed", "index_analyzer": "standardAnalyzer", "search_analyzer": "standardAnalyzer", "store": "no" }
    }
  }
}


(Weiwei Wang) #2

Another problem: although there are only 200 documents and about 100k of
stored data, memory use under pressure is more than 2g (measured with top
on Linux). I don't know why so much memory is needed.

On Dec 15, 1:55 pm, Weiwei Wang ww.wang...@gmail.com wrote:



(Karussell) #3

You can try giving it a lot less memory (e.g. <50MB?) so the garbage
collector will be called more often.

Not sure why ES takes so much time after an hour. How many shards do you
have, and how many CPUs? Also try monitoring with ES-DESK or jvisualvm to
see the real RAM usage and CPU load while testing.

Peter.
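As a concrete way to watch GC behavior during the load test (assuming a Sun/Oracle JDK with jstat on the PATH; the PID shown is a placeholder):

```shell
# Sample heap occupancy and GC counts of the ES JVM once per second.
# Replace 12345 with the Elasticsearch PID (e.g. from `jps` or an es.pid file).
jstat -gcutil 12345 1000
```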

On 15 Dez., 06:59, Weiwei Wang ww.wang...@gmail.com wrote:



(Weiwei Wang) #4

Only one Elasticsearch instance, with one shard and 0 replicas. I tested
another time with 4g maximum memory; after one hour the problem occurred
again. I stopped JMeter, used top to monitor the ES process, and found
that memory usage stays at 4.2g while CPU load jumps from 20+% to 300%.
I was considering using ES in a high-load web service project; now I have
to consider using plain Lucene instead.

My ES config is:
{
  "cluster": {
    "name": "es-cluster"
  },
  "gateway": {
    "recover_after_nodes": 1,
    "recover_after_time": "30s",
    "expected_nodes": 2
  },
  "network": {
    "host": "0.0.0.0",
    "tcp": {
      "keep_alive": true,
      "send_buffer_size": "50m",
      "receive_buffer_size": "50m"
    }
  },
  "transport": {
    "tcp": {
      "port": "9350-9400",
      "keep_alive": true,
      "send_buffer_size": "20m",
      "receive_buffer_size": "20m",
      "connect_timeout": "5s"
    }
  },
  "http": {
    "port": "9250-9300"
  },
  "index": {
    "store": {
      "cache": {
        "memory": {
          "small_buffer_size": "32mb",
          "large_buffer_size": "64mb",
          "small_cache_size": "512mb",
          "large_cache_size": "1g"
        }
      }
    },
    "search": {
      "slowlog": {
        "threshold": {
          "query": {
            "warn": "1s",
            "info": "500ms",
            "debug": "200ms",
            "trace": "50ms"
          },
          "fetch": {
            "warn": "100ms",
            "info": "80ms",
            "debug": "50ms",
            "trace": "20ms"
          }
        }
      }
    },
    "number_of_shards": 2,
    "number_of_replicas": 1,
    "refresh_interval": "1s",
    "term_index_interval": 64,
    "analysis": {
      "analyzer": {
        "nGramAnalyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["standard", "lowercase", "englishSnowball", "nGramFilter"]
        },
        "standardAnalyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["standard", "lowercase", "englishSnowball"]
        }
      },
      "filter": {
        "nGramFilter": {
          "type": "nGram",
          "min_gram": 1,
          "max_gram": 64
        },
        "edgeNGramFilter": {
          "type": "edgeNGram",
          "min_gram": 1,
          "max_gram": 64,
          "side": "front"
        },
        "englishSnowball": {
          "type": "snowball",
          "language": "English"
        }
      }
    }
  }
}
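One way to sanity-check that the custom analyzers were actually registered is the _analyze API. This is a sketch, with the index name `dianxin` taken from the slow log and the port taken from the http range in the config; adjust both as needed:

```shell
# Ask the index to analyze a sample string with the custom analyzer;
# the response lists the tokens produced (lowercased, stemmed by the snowball filter).
curl -s 'http://localhost:9250/dianxin/_analyze?analyzer=standardAnalyzer&pretty' \
     -d 'Recommendation Test Models'
```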

On Dec 15, 4:23 pm, Karussell tableyourt...@googlemail.com wrote:



(Clinton Gormley) #5

On Thu, 2011-12-15 at 01:03 -0800, Weiwei Wang wrote:


Are you disabling swap, ie configuring bootstrap.mlockall, ulimit -l,
and ES_MIN/MAX_MEM correctly?

There are numerous emails on this list explaining how to do this, so a
quick search should find some.

clint
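For reference, a minimal sketch of the settings mentioned above (the heap size is illustrative; ES_MIN_MEM/ES_MAX_MEM are read by the bin/elasticsearch script):

```shell
# Allow the JVM to lock its heap in RAM; must run in the same shell
# that starts Elasticsearch, before launching it.
ulimit -l unlimited

# Pin min and max heap to the same size so the heap is allocated up front,
# then start ES with mlockall enabled so the locked heap cannot be swapped out.
export ES_MIN_MEM=2g
export ES_MAX_MEM=2g
bin/elasticsearch -Dbootstrap.mlockall=true
```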



(Weiwei Wang) #6

I start ES with:
bin/elasticsearch -Xms2g -Xmx2g -Des.max-open-files=true -Dbootstrap.mlockall=true -p es.pid

On Dec 15, 8:00 pm, Clinton Gormley cl...@traveljury.com wrote:



(Clinton Gormley) #7

On Thu, 2011-12-15 at 07:24 -0800, Weiwei Wang wrote:


You don't mention whether you are setting: ulimit -l unlimited

Without that, mlockall won't work.

clint



(Weiwei Wang) #8

[2011-12-15 17:05:54,090][INFO ][bootstrap] max_open_files[65510]

I think that's enough.

I've already used ES for another project with more than 2,620,000
documents (10g+), but under low pressure. However, when I reindex all
the documents, ES eats a lot of memory (5g+), so I set -Xms8g -Xmx8g
for that project. I suspect ES has a memory leak.
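To see what the heap is actually doing, rather than the resident size top reports, the node stats API can help. A sketch against the HTTP port from the config posted earlier (the endpoint path is the 0.x-era one; later versions moved it to /_nodes/stats):

```shell
# JVM heap and GC statistics as Elasticsearch itself reports them.
curl -s 'http://localhost:9250/_cluster/nodes/stats?pretty'
```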

On Dec 15, 11:42 pm, Clinton Gormley cl...@traveljury.com wrote:

On Thu, 2011-12-15 at 07:24 -0800, Weiwei Wang wrote:

i start es with bin/elasticsearch -Xms2g -Xmx2g -Des.max-open-
files=true -Dbootstrap.mlockall=true -p es.pid

you don't mention whether you are setting: ulimit -l unlimited

without that, mlockall won't work.

clint

On Dec 15, 8:00 pm, Clinton Gormley cl...@traveljury.com wrote:

On Thu, 2011-12-15 at 01:03 -0800, Weiwei Wang wrote:

only one elasticsearch instance with one shard, 0 replicas. I tested
another time with 4g maximum memory; after one hour the problem
occurred again. I stopped jmeter, used top to monitor the ES process, and
found the memory usage stays at 4.2g while CPU load jumps from 20+% to
300%. I was considering using ES in my program for a high-load web
service project; now I have to consider using pure Lucene.

Are you disabling swap, ie configuring bootstrap.mlockall, ulimit -l,
and ES_MIN/MAX_MEM correctly?

There are numerous emails explaining how to do this on this list, so a
quick search should find some.

clint

my es config is shown as:
{
  "cluster": {
    "name": "es-cluster"
  },
  "gateway": {
    "recover_after_nodes": 1,
    "recover_after_time": "30s",
    "expected_nodes": 2
  },
  "network": {
    "host": "0.0.0.0",
    "tcp": {
      "keep_alive": true,
      "send_buffer_size": "50m",
      "receive_buffer_size": "50m"
    }
  },
  "transport": {
    "tcp": {
      "port": "9350-9400",
      "keep_alive": true,
      "send_buffer_size": "20m",
      "receive_buffer_size": "20m",
      "connect_timeout": "5s"
    }
  },
  "http": {
    "port": "9250-9300"
  },
  "index": {
    "store": {
      "cache": {
        "memory": {
          "small_buffer_size": "32mb",
          "large_buffer_size": "64mb",
          "small_cache_size": "512mb",
          "large_cache_size": "1g"
        }
      }
    },
    "search": {
      "slowlog": {
        "threshold": {
          "query": {
            "warn": "1s",
            "info": "500ms",
            "debug": "200ms",
            "trace": "50ms"
          },
          "fetch": {
            "warn": "100ms",
            "info": "80ms",
            "debug": "50ms",
            "trace": "20ms"
          }
        }
      }
    },
    "number_of_shards": 2,
    "number_of_replicas": 1,
    "refresh_interval": "1s",
    "term_index_interval": 64,
    "analysis": {
      "analyzer": {
        "nGramAnalyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["standard", "lowercase", "englishSnowball", "nGramFilter"]
        },
        "standardAnalyzer": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["standard", "lowercase", "englishSnowball"]
        }
      },
      "filter": {
        "nGramFilter": {
          "type": "nGram",
          "min_gram": 1,
          "max_gram": 64
        },
        "edgeNGramFilter": {
          "type": "edgeNGram",
          "min_gram": 1,
          "max_gram": 64,
          "side": "front"
        },
        "englishSnowball": {
          "type": "snowball",
          "language": "English"
        }
      }
    }
  }
}

On Dec 15, 4:23 pm, Karussell tableyourt...@googlemail.com wrote:

You can try to give it a lot less memory (e.g. <50MB?) and then the
garbage collector will be called more often.

Not sure why ES takes so much time after an hour. How many shards
do you have, and how many CPUs? Also try to monitor with ES-DESK or
jvisualvm to see the real RAM usage and CPU load while testing.

Peter.

On 15 Dez., 06:59, Weiwei Wang ww.wang...@gmail.com wrote:

Another problem: though there are only 200 documents and 100k of data
storage, memory use under pressure is more than 2g (per top under
Linux). I don't know why so much memory is needed.

On Dec 15, 1:55 pm, Weiwei Wang ww.wang...@gmail.com wrote:

...


(Clinton Gormley) #9

On Thu, 2011-12-15 at 18:29 -0800, Weiwei Wang wrote:

[2011-12-15 17:05:54,090][INFO ][bootstrap ]
max_open_files[65510]

ulimit -l is to do with locking memory, not open files; open files is
governed by ulimit -n.

clint
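The two limits clint distinguishes can also be inspected programmatically; a minimal Python sketch (Unix only — RLIMIT_NOFILE and RLIMIT_MEMLOCK are the kernel counterparts of ulimit -n and ulimit -l):

```python
import resource

# ulimit -n -> RLIMIT_NOFILE: max open file descriptors. This is what
# the bootstrap log's max_open_files[65510] reflects.
# ulimit -l -> RLIMIT_MEMLOCK: max bytes lockable with mlock()/mlockall().
# This is the one bootstrap.mlockall needs; a small default (e.g. 32k)
# is far too little to pin a 2g heap.
def show(name, limit_id):
    soft, hard = resource.getrlimit(limit_id)
    fmt = lambda v: "unlimited" if v == resource.RLIM_INFINITY else str(v)
    print(f"{name}: soft={fmt(soft)} hard={fmt(hard)}")

show("open files (ulimit -n)", resource.RLIMIT_NOFILE)
show("locked memory (ulimit -l)", resource.RLIMIT_MEMLOCK)
```

A high open-files limit says nothing about whether memory locking will succeed; the two are set independently.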

...


(Weiwei Wang) #10

Sorry, I misunderstood.
I checked ulimit -l and it shows that the max locked memory is only
32k.

Today I switched from ES to a pure Lucene in-memory index (with
RAMDirectory); it is much faster and only 500m+ of memory is used.
Currently the system can serve 2000 requests per second.

I will set ulimit -l and test ES again. Thanks, clint.
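For intuition about why the switch helped, the in-memory approach is easy to sketch. This toy Python index (an illustration only, not Lucene's RAMDirectory) answers conjunctive term filters over a couple hundred documents with a few set intersections:

```python
from collections import defaultdict

# Toy in-memory inverted index: (field, value) -> set of doc ids.
# With ~200 docs the postings sets are tiny, so a conjunctive term
# filter is just a handful of set intersections.
class MemoryIndex:
    def __init__(self):
        self.postings = defaultdict(set)
        self.docs = {}

    def add(self, doc_id, doc):
        self.docs[doc_id] = doc
        for field, value in doc.items():
            self.postings[(field, value)].add(doc_id)

    def filter(self, **terms):
        ids = None
        for field, value in terms.items():
            hits = self.postings.get((field, value), set())
            ids = hits if ids is None else ids & hits
            if not ids:
                return []
        return sorted(ids)

idx = MemoryIndex()
idx.add(1, {"nets": "1", "status": 1, "androidAPILevels": "9"})
idx.add(2, {"nets": "1", "status": 0, "androidAPILevels": "9"})
print(idx.filter(nets="1", status=1))  # [1]
```

At this scale each filter touches tiny postings sets, so latency is dominated by request handling rather than the index itself.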

On Dec 16, 5:43 pm, Clinton Gormley cl...@traveljury.com wrote:

...


(Shay Banon) #11

First of all, since you allocated 2gb to ES (min/max), it will use it;
that's why you see it taking 2gb. Second, where do you run jmeter? Is it
on the same box as the elasticsearch instance? If you want, you can
dropbox / share the data directory of the already indexed data and the
jmx jmeter file, and I can have a look.

On Fri, Dec 16, 2011 at 2:53 PM, Weiwei Wang ww.wang.cs@gmail.com wrote:

...


(Weiwei Wang) #12

It's ok for ES to use all the memory, but what I observe is that memory
grows slowly over the course of the load test and finally reaches the
limit; after that the slow-query log entries appear. Yet I have only
200 documents and 100k+ of index storage. I switched from ES to a Lucene
in-memory index and the system runs under high load for a long
time (more than 2 hours) with 500m+ memory.

I have sent the data and jmeter script to your gmail, Shay.

thanks~

On Dec 17, 1:45 am, Shay Banon kim...@gmail.com wrote:

...


(system) #13