Indexing performance problems


(Thiago Souza) #1

Hello ppl,

I'm currently experiencing a performance problem when indexing. The system
is indexing ~30 tweets/sec, and after a period (10h-20h) of indexing,
ElasticSearch stops responding while consuming 100% of one CPU core.
My current setup is:

     2 GHz quad-core Xeon CPU
     6 GB RAM
     Currently ~15,000,000 documents indexed
     2-node elasticsearch setup:
           Xms256M and Xmx768M
           Default shard configuration
           FS gateway data dir and node work dir on separate physical
           disks (although the 2 nodes share the same disk for the work dir)

When ElasticSearch starts consuming 100% of one CPU core (that is 25% in
total, since it's a quad-core) I get the following in the log files:

     First, a series of OutOfMemoryError: Java heap space (from all
     sorts of stack trace points).
     Then tons of "Long GC collection occurred, took [15.1s], breached
     threshold [10s]" (with the first number varying from 10s to 60s).
     And finally all sorts of exceptions like "java.io.IOException: No
     commit point data" and
     org.elasticsearch.transport.SendRequestTransportException
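
To see how the symptoms above build up over time, the log can be scanned for the two messages quoted in the post. The following is a minimal sketch, not an elasticsearch tool: the function names are hypothetical, and the layout of everything around the quoted messages (timestamps, logger names) is an assumption, so the pattern only matches the message text itself.

```python
import re

# Pattern built from the message quoted in the post; anything before or
# after it on the log line is ignored because its layout is an assumption.
LONG_GC = re.compile(
    r"Long GC collection occurred, took \[(?P<took>[\d.]+)s\], "
    r"breached threshold \[(?P<threshold>[\d.]+)s\]"
)


def long_gc_pauses(lines):
    """Return the reported GC pause durations, in seconds, in log order."""
    pauses = []
    for line in lines:
        match = LONG_GC.search(line)
        if match:
            pauses.append(float(match.group("took")))
    return pauses


def count_heap_oom(lines):
    """Count log lines that mention a Java heap space OutOfMemoryError."""
    return sum("OutOfMemoryError: Java heap space" in line for line in lines)


if __name__ == "__main__":
    # Hypothetical sample lines; only the message text comes from the post.
    sample = [
        "java.lang.OutOfMemoryError: Java heap space",
        "Long GC collection occurred, took [15.1s], breached threshold [10s]",
        "Long GC collection occurred, took [42s], breached threshold [10s]",
    ]
    print(count_heap_oom(sample), long_gc_pauses(sample))
```

A rising count of heap errors followed by ever longer pauses would be consistent with the heap filling up rather than with a slow disk.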

The situation only returns to normal after I shut down the cluster, clean up
the work dir and wait for a full cluster recovery.

Any clue anyone?

Regards,
Thiago Souza


(Pablo Borges) #2

I've seen the same problem, which was solved by increasing -Xmx to 2G, but I
also got index corruption, which required reindexing.



(Thiago Souza) #3

Unfortunately, I cannot increase it to 2G.



(Shay Banon) #4

You might need to increase it. There is a limit to what can fit into memory.
For example, to provide fast search, terms are loaded into memory at an
interval (every Nth term), so a lot of terms means more memory is required.
Other features, such as sorting and faceting, require more memory as well.

The indexing speed is strange. Simple tweets should be indexed much faster.
How do you interact with elasticsearch?

Another point: having the "work" dir on a local disk makes more sense than
having it on a remote dir. Local will (usually) be much faster.

-shay.banon



(Thiago Souza) #5

Hi Shay,

   First of all, I forgot to mention that I'm using ES 0.10.

   The only query being run is for the 5 most-mentioned YouTube videos in
the last 2h. This query runs every 5 min.

   The work dir is on a local disk.

   I'll restart the indexing process without the YouTube report and see
if it lasts longer than 10-20h.

Regards,
Thiago Souza



(Shay Banon) #6

Great! Can you share the YouTube query? Also, can you move to 0.11?



(Thiago Souza) #7

Hi shay,

 Here is the YouTube query (the header.start to header.end range always
covers the 2h before the current time):
{
  "query" : {
    "bool" : {
      "must" : {
        "term" : { "links.expanded.domain" : "www.youtube.com" }
      },
      "must_not" : {
        "term" : { "links.expanded" : "http://www.youtube.com/watch?v=" }
      },
      "must" : {
        "range" : {
          "timestamp" : {
            "from" : "${header.start}",
            "to" : "${header.end}",
            "include_lower" : false,
            "include_upper" : true
          }
        }
      }
    }
  },
  "facets" : {
    "links" : {
      "terms" : {
        "field" : "links.expanded",
        "script" : "term.contains('watch')",
        "size" : 5
      }
    }
  }
}

      I'm planning to move to 0.11, but not yet. I'll try to do it ASAP.

Regards,
Thiago Souza



(Thiago Souza) #8

Shay,

  That query is buggy; I posted the older version. Here is the correct
one:
{
  "query" : {
    "bool" : {
      "must" : [{
        "term" : { "links.expanded.domain" : "www.youtube.com" }
      },{
        "range" : {
          "timestamp" : {
            "from" : "${header.start}",
            "to" : "${header.end}",
            "include_lower" : false,
            "include_upper" : true
          }
        }
      }],
      "must_not" : {
        "term" : { "links.expanded" : "http://www.youtube.com/watch?v=" }
      }
    }
  },
  "facets" : {
    "links" : {
      "terms" : {
        "field" : "links.expanded",
        "script" : "term.contains('watch')",
        "size" : 5
      }
    }
  }
}
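
The difference between the two versions is that the older one repeats the "must" key inside one JSON object, and many parsers silently keep only the last occurrence, whereas the corrected one puts both clauses in a "must" array. As a rough sketch of how the corrected body could be generated for each 5-minute run, the code below fills in the 2h window. The function name is hypothetical, and the ISO 8601 timestamp format is an assumption: the thread does not say what format ${header.start} and ${header.end} expand to.

```python
import json
from datetime import datetime, timedelta, timezone


def build_youtube_query(end, window=timedelta(hours=2)):
    """Build the corrected query body for the window that ends at `end`."""
    start = end - window
    return {
        "query": {
            "bool": {
                # Both required clauses live in one array, so neither
                # overwrites the other.
                "must": [
                    {"term": {"links.expanded.domain": "www.youtube.com"}},
                    {"range": {"timestamp": {
                        # ISO 8601 is an assumed format, see the lead-in.
                        "from": start.isoformat(),
                        "to": end.isoformat(),
                        "include_lower": False,
                        "include_upper": True,
                    }}},
                ],
                "must_not": {
                    "term": {"links.expanded": "http://www.youtube.com/watch?v="}
                },
            }
        },
        "facets": {
            "links": {
                "terms": {
                    "field": "links.expanded",
                    "script": "term.contains('watch')",
                    "size": 5,
                }
            }
        },
    }


if __name__ == "__main__":
    # Python's json module keeps only the last duplicate key, which shows
    # how one "must" clause can be lost in the older version.
    print(json.loads('{"must": 1, "must": 2}'))  # {'must': 2}
    print(json.dumps(build_youtube_query(datetime.now(timezone.utc)), indent=2))
```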



(Shay Banon) #9

Hey mate,

0.11 will help, but still, the mentioned query will cause all the links to
be loaded into memory (at least per segment in an index) and processed. 1 GB
might not be enough...
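
To make the memory cost concrete, here is a conceptual sketch of what the facet in the query asks for: count every link that contains 'watch' and keep the top 5. This is not how elasticsearch implements facets. The document shape (a dict with a "links" list of expanded URLs) and the once-per-document counting are assumptions made for illustration.

```python
from collections import Counter


def top_links(docs, size=5):
    """Return the `size` most frequent links containing 'watch'."""
    counts = Counter()
    for doc in docs:
        # set() counts a link at most once per document (an assumption).
        for link in set(doc.get("links", [])):
            if "watch" in link:  # mirrors the script term.contains('watch')
                counts[link] += 1
    return counts.most_common(size)


if __name__ == "__main__":
    # Hypothetical documents; only the URL prefix comes from the thread.
    docs = [
        {"links": ["http://www.youtube.com/watch?v=a"]},
        {"links": ["http://www.youtube.com/watch?v=a",
                   "http://www.youtube.com/user/x"]},
        {"links": ["http://www.youtube.com/watch?v=b"]},
    ]
    print(top_links(docs))
```

The counter holds one entry per distinct link, so its size grows with the number of distinct values rather than with the requested top 5.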

But the good news is that I have been thinking hard about exactly the
scenario you face: basically, running a "heavy" search, usually with some
heavy-lifting components (facets and so on), every once in a while (and
possibly indexing the result as another document, something like a time
series db). For this, loading the data into memory is not always desirable,
and either loading a stored field, or even parsing the source and fetching
the relevant data, is enough.

For that, the ability to access the _source in a script is already there.
In master I have already added a _fields option to load just stored fields.

The next step is to enhance the terms facet to allow a script to
provide the terms. Also, to provide richer options when it comes to scripts
(more lang support). And last, to allow providing complete custom code that
defines a facet.

0.12 will have the above; master already has some of it. 0.11 will at
least give you better cache management of facet fields, which I hope will
mean that you hit the wall much later (and that wall will be hit
eventually if you stay with the same number of nodes and the same amount
of memory and keep indexing new data).

-shay.banon



(Shay Banon) #10

Pushed support for "full" scripted term facet:
http://github.com/elasticsearch/elasticsearch/issues/issue/410.


