Elasticsearch 0.17.4 index uses a lot more disk than 0.16.5?


(Jamshid) #1

Hi, I upgraded from 0.16.5 to the current 0.17.4 so I could use
lowercase_expanded_terms in a URI request:

curl -i 'http://localhost:9200/_search?pretty=yes&default_operator=AND&lowercase_expanded_terms=false&q=context:mybucket%20name:Foo/*'
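
For anyone scripting this, the same query string can be assembled programmatically so the space in the query is percent-encoded consistently. This is only a minimal sketch using Python's standard library; the function name build_search_url and the default base URL are illustrative assumptions, not part of Elasticsearch. With lowercase_expanded_terms=false the wildcard term Foo/* keeps its case, which matters here because the "name" field uses the keyword analyzer.

```python
from urllib.parse import quote, urlencode


def build_search_url(query, base="http://localhost:9200/_search"):
    """Build a URI search request like the curl example above.

    build_search_url and its default base URL are hypothetical names
    chosen for this sketch.
    """
    params = {
        "pretty": "yes",
        "default_operator": "AND",
        # Keep wildcard/prefix terms in their original case.
        "lowercase_expanded_terms": "false",
        "q": query,
    }
    # quote_via=quote encodes spaces as %20 (not '+'); ':', '/' and '*'
    # are left as-is so the result matches the hand-written URL.
    return base + "?" + urlencode(params, safe=":/*", quote_via=quote)
```

Calling build_search_url("context:mybucket name:Foo/*") yields the same URL as the curl command, which can then be passed to curl or any HTTP client.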

I recreated my index:

curl -XPUT 'localhost:9200/myindex' -d '{
  "mappings": {
    "files": {
      "_source": { "enabled": true },
      "properties": {
        "name": { "type": "string", "analyzer": "keyword" },
        "context": { "type": "string", "analyzer": "keyword" }
      }
    }
  }
}'

and repopulated it with 200 million records. The index is now about
80% larger than the same index on 0.16.5. I see the increase both in
the disk usage of the "data" directory and in the node stats.

I don't see anything in the 0.17.x release notes about index size.
Is this expected, maybe because of the Lucene upgrade?

Thanks,
Jamshid


(Shay Banon) #2

This is probably caused by the transaction logs not being cleaned up properly.
The fix for that was incomplete in 0.17.4 and will be part of 0.17.5 (to be
released today).
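
One way to check whether leftover transaction logs account for the growth is to total the file sizes under directories named "translog" separately from those under "index". The sketch below assumes each shard directory under the node's data path contains "index" and "translog" subdirectories; that layout, the function name usage_by_kind, and the path you pass in are assumptions to verify against your own installation.

```python
import os
from collections import Counter


def usage_by_kind(data_path):
    """Sum file sizes under data_path, grouped by directory kind.

    A file counts as 'translog' or 'index' if any directory between
    data_path and the file has exactly that name (assumed layout);
    everything else is counted as 'other'.
    """
    totals = Counter()
    for root, _dirs, files in os.walk(data_path):
        parts = os.path.relpath(root, data_path).split(os.sep)
        if "translog" in parts:
            kind = "translog"
        elif "index" in parts:
            kind = "index"
        else:
            kind = "other"
        for name in files:
            totals[kind] += os.path.getsize(os.path.join(root, name))
    return dict(totals)
```

If the "translog" total is a large share of the overall size on 0.17.4, that would be consistent with the explanation above; if nearly all of the bytes are under "index", the cause is likely elsewhere.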

On Thu, Aug 11, 2011 at 6:48 PM, Jamshid jamshid69@gmail.com wrote:

