ES Index performance

Hi all,

We've deployed elasticsearch in our production and we're incredibly
happy with search performance. However, we're seeing occasional issues
where ES seems to return an older version of a record. In some cases
it can take up to half an hour before the proper (latest) version of a
record shows up. We have two nodes with 10 shards each with one
replica, and the index is about 30M records and 25 GB in size, so it's
not the smallest :)

This is pretty hard to reproduce, so it's relatively hard to test. But
I'd love to hear ideas.

thanks!

Does this happen with search requests, i.e. is that where you see the old
data? By default, elasticsearch refreshes an index every second so that newly
indexed docs (or deletes) become visible to search. Can you use the index
stats API to see if there was a bump in how long refreshes took (there are
refresh stats in there)?
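
A minimal sketch of that check in Python, for anyone who wants to script it. It assumes the stats are served at `GET /<index>/_stats` and that the refresh section carries `total` and `total_time_in_millis` fields; both the endpoint path and the field names are assumptions to verify against your version, and `fetch_index_stats` / `average_refresh_ms` are hypothetical helper names.

```python
import json
from urllib.request import urlopen


def fetch_index_stats(base_url, index):
    """Fetch the index stats document (assumed endpoint: GET /<index>/_stats)."""
    with urlopen(f"{base_url}/{index}/_stats") as response:
        return json.load(response)


def average_refresh_ms(refresh_stats):
    """Average duration of one refresh, in milliseconds.

    `refresh_stats` is the "refresh" section of the stats response; the
    field names `total` and `total_time_in_millis` are assumptions.
    """
    count = refresh_stats.get("total", 0)
    if count == 0:
        return 0.0
    return refresh_stats["total_time_in_millis"] / count


# Example with made-up numbers: 1200 refreshes taking 90 seconds in total.
sample = {"total": 1200, "total_time_in_millis": 90000}
print(average_refresh_ms(sample))  # 75.0
```

Sampling this periodically and comparing the deltas between two samples is one way to spot a bump, since the counters are cumulative.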

On Wed, Jan 18, 2012 at 8:15 AM, Oren Mazor oren.mazor@gmail.com wrote:

Hi all,

We've deployed elasticsearch in our production and we're incredibly
happy with search performance. However, we're seeing occasional issues
where ES seems to return an older version of a record. In some cases
it can take up to half an hour before the proper (latest) version of a
record shows up. We have two nodes with 10 shards each with one
replica, and the index is about 30M records and 25 GB in size, so it's
not the smallest :)

This is pretty hard to reproduce, so it's relatively hard to test. But
I'd love to hear ideas.

thanks!

yup. I can see an insertion request going into ES (but not the
response, now that I think of it), but running my query shows no
record is available for that item.

all of our records are virtually the same size (about 1kb), and the
most insertions we'd be seeing is 10-20 per second. occasionally that
might go up to 50.

how often does refresh happen by default, and how long does it take?

I'm wondering if 10 shards is not enough for the size of our index.

On Jan 18, 4:13 pm, Shay Banon kim...@gmail.com wrote:

Does this happen with search requests, i.e. is that where you see the old
data? By default, elasticsearch refreshes an index every second so that newly
indexed docs (or deletes) become visible to search. Can you use the index
stats API to see if there was a bump in how long refreshes took (there are
refresh stats in there)?

also, it's probably worth sharing my frontend's query:

    {
      "filter" : {
        "and" : [
          {
            "term" : {
              "SID" : $num
            }
          },
          {
            "query" : {
              "query_string" : {
                "default_operator" : "AND",
                "fields" : ["X", "Y"],
                "query" : "$QUERY"
              }
            }
          }
        ]
      },
      "sort" : [
        {
          "Y" : {
            "order" : "desc"
          }
        }
      ],
      "size" : 1
    }

I understand that there is no caching involved with the AND filter,
but sort is a different matter (Y is a date)

On Jan 20, 12:21 am, Oren Mazor oren.ma...@gmail.com wrote:

yup. I can see an insertion request going into ES (but not the
response, now that I think of it), but running my query shows no
record is available for that item.

all of our records are virtually the same size (about 1kb), and the
most insertions we'd be seeing is 10-20 per second. occasionally that
might go up to 50.

how often does refresh happen by default, and how long does it take?

I'm wondering if 10 shards is not enough for the size of our index.

It makes little sense to use query_string as a filter; I suggest you don't
do that. But even when using it as a filter, you should still see changes.
Can you verify it's not the query? I.e., just search for a recently added
document and see if you get it back?
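
One possible reading of that suggestion, sketched in Python: keep the `term` on `SID` as the filter and move `query_string` into the query part. This assumes a release that supports the `filtered` query (later releases express the same thing differently), and `build_search_body` is a hypothetical helper, not something from the thread.

```python
import json


def build_search_body(sid, query_text):
    """Build a search body with query_string as the query and term as the filter.

    Assumes the `filtered` query is available in the release being used.
    """
    return {
        "query": {
            "filtered": {
                "query": {
                    "query_string": {
                        "default_operator": "AND",
                        "fields": ["X", "Y"],
                        "query": query_text,
                    }
                },
                "filter": {"term": {"SID": sid}},
            }
        },
        "sort": [{"Y": {"order": "desc"}}],
        "size": 1,
    }


# Print the JSON body that would be sent to the search endpoint.
print(json.dumps(build_search_body(42, "foo bar"), indent=2))
```

Building the body as a dict and serializing it also avoids hand-substituting `$num` and `$QUERY` into a JSON string, which is easy to get wrong when the query text contains quotes.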

On Fri, Jan 20, 2012 at 8:07 AM, Oren Mazor oren.mazor@gmail.com wrote:

also, it's probably worth sharing my frontend's query:

    {
      "filter" : {
        "and" : [
          {
            "term" : {
              "SID" : $num
            }
          },
          {
            "query" : {
              "query_string" : {
                "default_operator" : "AND",
                "fields" : ["X", "Y"],
                "query" : "$QUERY"
              }
            }
          }
        ]
      },
      "sort" : [
        {
          "Y" : {
            "order" : "desc"
          }
        }
      ],
      "size" : 1
    }

I understand that there is no caching involved with the AND filter,
but sort is a different matter (Y is a date)

Yup. I've done direct queries for a document that should be there, and
even 30 minutes later, it is still not available.

based on the semi-regular pattern of these delays, I'm wondering if
there's some kind of memory or gc issue playing up?

we have two nodes with 16gb/32 on the first, and 10/24 on the second.

On Jan 20, 10:06 am, Shay Banon kim...@gmail.com wrote:

It makes little sense to use query_string as a filter; I suggest you don't
do that. But even when using it as a filter, you should still see changes.
Can you verify it's not the query? I.e., just search for a recently added
document and see if you get it back?

Hard to tell if it's GC; you can monitor it using bigdesk to see how memory
is behaving. Though you say you have a 30 minute "pause", which is strange.
Did you check the refresh stats? Also, when this happens, can you simply get
the relevant new / modified document by id?
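
The point of comparing a get by id with a search can be written down as a small decision table. This is only a sketch: it assumes that a get by id does not wait for a refresh while a search does (worth confirming for your version), and `diagnose` is a hypothetical helper.

```python
def diagnose(found_by_get, found_by_search):
    """Map the two lookup results to a rough next step.

    Assumption: get by id sees a document without waiting for a refresh,
    while search only sees it after one.
    """
    if found_by_get and found_by_search:
        return "visible everywhere: no delay for this document"
    if found_by_get:
        return "indexed but not searchable yet: look at refresh behaviour"
    if not found_by_search:
        return "not indexed at all: check the indexing request and its response"
    return "searchable but get failed: check the id and routing used for the get"


# Example: the document can be fetched by id but does not show up in search.
print(diagnose(found_by_get=True, found_by_search=False))
```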

On Fri, Jan 20, 2012 at 5:58 PM, Oren Mazor oren.mazor@gmail.com wrote:

Yup. I've done direct queries for a document that should be there, and
even 30 minutes later, it is still not available.

based on the semi-regular pattern of these delays, I'm wondering if
there's some kind of memory or gc issue playing up?

we have two nodes with 16gb/32 on the first, and 10/24 on the second.

Hi Shay,

just a follow up (because I hate it when there is no closure).

I modified my import script to use bulk imports, so instead of 10
insertions a second, I now end up doing one bulk insertion every ten
seconds. I had it up to a minute, but I think inserting 600-800
records in one bulk request was causing some problems, so I shortened
the interval.

so far I'm not seeing any serious delays in testing this week, but
tomorrow I'll do some bigger load testing with our big index. it seems
promising at the moment!
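
For readers who want to try the same change, here is a minimal sketch of assembling a bulk request body in Python. It assumes the newline-delimited bulk format (one action line followed by one source line per document, with a trailing newline) and an action line that includes `_type`, as releases of this period expect. The helper names and the index and type names in the example are hypothetical, and this is not the import script from the post.

```python
import json


def build_bulk_body(docs, index, doc_type):
    """Serialize (doc_id, source) pairs into a newline-delimited bulk body."""
    lines = []
    for doc_id, source in docs:
        # Action line; including `_type` is an assumption tied to older releases.
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    if not lines:
        return ""
    # The bulk format needs a newline after the last line as well.
    return "\n".join(lines) + "\n"


def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Example: 5 buffered records sent as bulk requests of at most 2 records each.
records = [(str(i), {"SID": i}) for i in range(5)]
for chunk in chunked(records, 2):
    print(build_bulk_body(chunk, "records", "record"), end="")
```

Capping the chunk size is one way to keep the flush interval short without letting a single request grow to the 600-800 records mentioned above.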

On Jan 20, 2:26 pm, Shay Banon kim...@gmail.com wrote:

Hard to tell if it's GC; you can monitor it using bigdesk to see how memory
is behaving. Though you say you have a 30 minute "pause", which is strange.
Did you check the refresh stats? Also, when this happens, can you simply get
the relevant new / modified document by id?

Great, thanks for the update!

On Wednesday, January 25, 2012 at 8:54 AM, Oren Mazor wrote:

Hi Shay,

just a follow up (because I hate it when there is no closure).

I modified my import script to use bulk imports, so instead of 10
insertions a second, I now end up doing one bulk insertion every ten
seconds. I had it up to a minute, but I think inserting 600-800
records in one bulk request was causing some problems, so I shortened
the interval.

so far I'm not seeing any serious delays in testing this week, but
tomorrow I'll do some bigger load testing with our big index. it seems
promising at the moment!

so, I'm starting to see these again under really heavy load (let's
say around 10k insertions a minute)

I'm still having some difficulty wrapping my head around the algorithm
in the bottom end. the refresh total_time is 17h and merges is 14.9h.
this seems pretty ambiguous. I'm guessing it's the total time spent
executing these actions rather than the time since, right?

are there some hardware settings I can change to make lucene go faster?
also, is there anything I can read to level up on understanding the
low level side of things? I'm going through the ES code to start with
and learning more there.

On Jan 25, 9:37 am, Shay Banon kim...@gmail.com wrote:

Great, thanks for the update!

On Saturday, February 4, 2012 at 3:18 AM, Oren Mazor wrote:

so, I'm starting to see these again under really heavy load (let's
say around 10k insertions a minute)

What's the behavior of elasticsearch in this case? Is memory usage OK? When you say 10k inserts per minute, is that using the bulk API? How many clients are indexing the data?

I'm still having some difficulty wrapping my head around the algorithm
in the bottom end. the refresh total_time is 17h and merges is 14.9h.
this seems pretty ambiguous. I'm guessing it's the total time spent
executing these actions rather than the time since, right?

Yes, that's the total time that was spent doing it.

are there some hardware settings I can change to make lucene go faster?
also, is there anything I can read to level up on understanding the
low level side of things? I'm going through the ES code to start with
and learning more there.

Hello Oren,

I'm having a similar problem: ES is nearly unresponsive while indexing
a large batch insert. In my case it's about 1300 to 1600 items per
batch insert, following a full index drop and create cycle. Can you
please tell me what the best batch size was, so that you didn't
encounter any delays on the system?

Best

KB

The optimal batch size really depends on what you index. Indexing 100 items of 1 MB each is very different from indexing 100 items of 1 KB each. It also depends on how many concurrent clients are issuing bulk requests.
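To illustrate the point about item size, here is a minimal sketch that cuts batches by serialized size instead of by item count. The function name `batch_by_bytes` and the byte budget are assumptions made for this example, not values recommended anywhere in this thread; the right budget has to be found by testing against your own cluster.

```python
import json
from typing import Iterable, Iterator, List


def batch_by_bytes(docs: Iterable[dict], max_bytes: int) -> Iterator[List[dict]]:
    """Group documents so that each batch's JSON size stays within max_bytes.

    A single document larger than max_bytes is still emitted, alone in its
    own batch, so nothing is silently dropped.
    """
    batch: List[dict] = []
    size = 0
    for doc in docs:
        doc_size = len(json.dumps(doc).encode("utf-8"))
        # Close the current batch if this document would push it over budget.
        if batch and size + doc_size > max_bytes:
            yield batch
            batch, size = [], 0
        batch.append(doc)
        size += doc_size
    if batch:
        yield batch
```

With records of about 1 KB each, a budget of a few hundred KB gives batches of a few hundred items, while the same budget applied to 1 MB documents gives one item per request. That is the difference being described above.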

hi Shay,

we use server density to keep track of ES and I'm not seeing any
spikes in resource use at all. I'm suspecting we're just pushing it
more than most people do?

I do use the bulk API to send perhaps 1000 insertions every 10 seconds
on average. In some cases these are new records, and sometimes they
are new versions of existing records. I also send some deletes
every minute, but those do not go through the bulk API.
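For reference, the bulk API accepts delete actions alongside index actions, so the deletes could travel in the same request. Below is a minimal sketch of building such a newline-delimited body. The helper name `build_bulk_body`, the index name `records`, and the field values are hypothetical and only for illustration; the metadata keys you need depend on your ES version.

```python
import json
from typing import Any, Dict, Iterable, Optional, Tuple

# (operation, action metadata, document source or None for deletes)
Action = Tuple[str, Dict[str, Any], Optional[Dict[str, Any]]]


def build_bulk_body(actions: Iterable[Action]) -> str:
    """Serialize actions into the newline-delimited format of the bulk API.

    Every action contributes a metadata line; all operations except
    "delete" also contribute a second line holding the document source.
    """
    lines = []
    for op, meta, source in actions:
        lines.append(json.dumps({op: meta}))
        if op != "delete":
            if source is None:
                raise ValueError(f"{op} action requires a document source")
            lines.append(json.dumps(source))
    # Each line, including the last one, must end with a newline.
    return "".join(line + "\n" for line in lines)


# Hypothetical usage: one new version of a record and one delete.
body = build_bulk_body([
    ("index", {"_index": "records", "_id": "1"}, {"SID": 7}),
    ("delete", {"_index": "records", "_id": "2"}, None),
])
```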

could you recommend a more low level primer on ES? I'd love to have a
more low level understanding of how/why things work. it'll make it
easier for me to tune my algorithms. I can go through the source, but
if there're some papers out there I could read, that'd be better :slight_smile:

thanks!
Oren

PS. I noticed there is now a mongodb river in development. I'm
wondering whether my efforts might be better spent helping it to
production status rather than trying to tune my own code..

Shay,

Is there a maximum total data size, or a maximum data size per node? Any
rules of thumb are welcome.

Going back to your question: do you see that issuing a Get (which is realtime) does not return the correct version of the data? It would be helpful to understand where the stalling is coming from. If a "get" does not return the version you expect, it means the document didn't get indexed, so you will need to look at the indexer code and see whether something is stalling on the bulk API execution.

The stalling can have many causes: slow IO, not enough resources on the machine (CPU/memory), overloading the machines in the cluster, GC, and so on.

Which JVM version are you using? Are you running on EC2? If so, which instances / OS version? How many shards do you have in the index?
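The get-versus-search reasoning above can be sketched as a small decision helper. This is only an illustration: `classify_stale_read` and its labels are hypothetical names, and it assumes the caller has already fetched the document's version once through a get by id and once through a search.

```python
from typing import Optional


def classify_stale_read(
    expected: int,
    get_version: Optional[int],
    search_version: Optional[int],
) -> str:
    """Say where a missing update is most likely stuck.

    expected       -- version the indexer believes it wrote
    get_version    -- version returned by a realtime get by id (None if absent)
    search_version -- version seen through a search (None if absent)
    """
    # Get is realtime, so a stale or missing result means the write itself
    # never completed: look at the indexer and the bulk call.
    if get_version is None or get_version < expected:
        return "not-indexed"
    # The document is indexed but not yet visible to search: look at
    # refresh behaviour instead.
    if search_version is None or search_version < expected:
        return "not-refreshed"
    return "visible"
```

A result of "not-indexed" points at the indexing path, while "not-refreshed" points at refresh timing.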

Issuing a Get does not return the correct version.

We're using OpenJDK (java version 1.6.0_18) on Debian, not on EC2. We
have 10 shards with 1 replica.

I suspect an IO issue, to be honest: if there were general indexing
performance problems, somebody would probably have seen them already, what
with "select isn't broken" :slight_smile:

That said, we do have a pretty high indexing rate, so maybe some issue
only shows up at scale?

Hi Oren,
Did you check the response of the bulk operation? Were all items
indexed successfully?
Also check your translog status, and try manually refreshing the index
to see whether you then get the current version.
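As a rough sketch of the first check: a bulk response carries one entry per action under "items", and a failed action carries an "error" field. The helper below collects those entries. The function name `failed_bulk_items` and the sample response are made up for illustration, and the exact fields in each item vary between ES versions.

```python
from typing import Any, Dict, List


def failed_bulk_items(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return one summary per bulk action that reported an error."""
    failures = []
    for item in response.get("items", []):
        # Each item is a one-key dict keyed by the action type
        # (for example "index" or "delete").
        for action, result in item.items():
            if "error" in result:
                failures.append({
                    "action": action,
                    "id": result.get("_id"),
                    "error": result["error"],
                })
    return failures


# Hypothetical response, parsed from the JSON the bulk call returned.
sample = {
    "items": [
        {"index": {"_id": "1", "_version": 2}},
        {"index": {"_id": "2", "error": "example failure message"}},
    ]
}
problems = failed_bulk_items(sample)
```

If `problems` is non-empty, those documents never made it into the index, which would also explain a get by id returning the old version.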

-----Original Message-----
From: Oren Mazor
Sent: Friday, February 10, 2012 2:25 AM
To: elasticsearch
Subject: Re: ES Index performance

Issuing a Get does not return the correct version.

we're using OpenJDK java version 1.6.0_18, not on EC2, with debian. we
have 1 replica and 10 shards.

I'm suspecting an IO issue, to be honest, given that if there were
indexing performance issues somebody may have seen them already, what
with "select isn't broken" :slight_smile:

that said, we do have a pretty high rate of indexes, so maybe at scale
some issue pops upÂ…?

On Feb 7, 2:00 pm, Shay Banon kim...@gmail.com wrote:

Going back to your question, do you see that issuing a Get (which is
realtime) does not return the correct version of the data? I would be
helpful to understand where the stalling is coming from. If a "get" does
not return your expect version of the data, it means that it didn't get
indexed, so you will need to look at the indexer code and see if maybe
something is stalling on the bulk API execution.

The stalling can be for many reasons, starting with slow IO, not enough
resources on the machine CPU/Mem, overloading the machines you have in the
cluster, GCÂ… .

Which JVM version are you using? Are you running on EC2? If so, which
instances / os version? How many shards do you have in the index?

On Tuesday, February 7, 2012 at 5:04 PM, Oren Mazor wrote:

hi Shay,

we use server density to keep track of ES and I'm not seeing any
spikes in resource use at all. I'm suspecting we're just pushing it
more than most people do?

I do use the bulk API to send perhaps 1000 insertions every 10 seconds
on average. in some cases these are new records, and sometimes they
are new versions of existing records. I also send some amount of
deletes every minute, but these are not really using the bulk api.

could you recommend a more low level primer on ES? I'd love to have a
more low level understanding of how/why things work. it'll make it
easier for me to tune my algorithms. I can go through the source, but
if there're some papers out there I could read, that'd be better :slight_smile:

thanks!
Oren

PS. I noticed there is now a mongodb river in development. I'm
wondering whether my efforts might be better spent helping it to
production status rather than trying to tune my own code..

On Feb 5, 4:31 pm, Shay Banon <kim...@gmail.com (http://gmail.com)>
wrote:

On Saturday, February 4, 2012 at 3:18 AM, Oren Mazor wrote:

so, I'm starting to see these again under really heavy load (let's
say around 10k insertions a minute)

What's the behavior of elasticsearch in this case? Is memory usage ok?
When you say 10k inserts per minute, is that using the bulk API? How
many clients are indexing the data?

I'm still having some difficulty wrapping my head around the algorithm
at the bottom end. the refresh total_time is 17h and merges is 14.9h.
this seems pretty ambiguous. I'm guessing it's the total time spent
executing these actions rather than the time since, right?

Yes, that's the total time that was spent doing it.
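
Because the figure is cumulative, it says more once divided by the number of operations. A rough sketch, assuming the `refresh` (or `merges`) section of the index stats has been parsed into a dict with `total` and `total_time_in_millis` fields; `average_millis` is a hypothetical helper name:

```python
def average_millis(stats_section):
    """Average time per operation, in milliseconds, from a cumulative stats section."""
    total = stats_section.get("total", 0)
    if total == 0:
        # No operations recorded yet, so there is nothing to average.
        return 0.0
    return stats_section["total_time_in_millis"] / total
```

Sampling this at intervals and comparing the deltas shows whether refreshes got slower during a stall, which a single cumulative total cannot.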

are there some hardware settings I can change to make lucene go faster?
also, is there anything I can read to level up on understanding the
low level side of things? I'm going through the ES code to start with
and learning more there.

On Jan 25, 9:37 am, Shay Banon <kim...@gmail.com> wrote:

Great, thanks for the update!

On Wednesday, January 25, 2012 at 8:54 AM, Oren Mazor wrote:

Hi Shay,

just a follow up (because I hate it when there is no closure).

I modified my import script to use bulk imports, so instead of 10
insertions a second, I now end up doing one bulk insertion every ten
seconds. I had it up to a minute, but I think inserting 600-800
records in one bulk request was causing some problems, so I shortened
the interval.

so far I'm not seeing any serious delays in testing this week, but
tomorrow I'll do some bigger load testing with our big index. it seems
promising at the moment!

On Jan 20, 2:26 pm, Shay Banon <kim...@gmail.com> wrote:

Hard to tell if it's GC. You can monitor it using bigdesk to see changes
and see how memory is behaving. Though you say you have a 30 minute
"pause", which is strange. Did you check the refresh stats? Also, when
this happens, can you simply get by id the relevant new / modified
document?

On Fri, Jan 20, 2012 at 5:58 PM, Oren Mazor <oren.ma...@gmail.com> wrote:

Yup. I've done direct queries for a document that should be there, and
even 30 minutes later, it is still not available.

based on the semi-regular pattern of these delays, I'm wondering if
there's some kind of memory or gc issue playing up?

we have two nodes with 16gb/32 on the first, and 10/24 on the second.

On Jan 20, 10:06 am, Shay Banon <kim...@gmail.com> wrote:

It makes little sense to use query_string as a filter; I suggest you
don't do that. But even when using it as a filter, you should still see
changes.

Can you verify it's not the query? i.e. just search for a document
recently added and see if you get it back?

On Fri, Jan 20, 2012 at 8:07 AM, Oren Mazor <oren.ma...@gmail.com> wrote:

also, it's probably worth sharing my frontend's query:

{
  "filter" : {
    "and" : [
      {
        "term": {
          "SID": $num
        }
      },
      {
        "query": {
          "query_string" : {
            "default_operator" : "AND",
            "fields": ["X","Y"],
            "query" : "$QUERY"
          }
        }
      }
    ]
  },
  "sort" : [
    {
      "Y" : {
        "order" : "desc"
      }
    }
  ],
  "size" : 1
}

I understand that there is no caching involved with the AND filter,
but sort is a different matter (Y is a date)
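
Following the suggestion that query_string does not belong in a filter, one possible restructuring keeps the exact-match `SID` term as the filter and moves the query_string into the query part. This is a sketch only: `build_search` is a hypothetical helper, and the `filtered` query used here is the form available in releases of that period (later releases replaced it with a `bool` query with a `filter` clause):

```python
def build_search(sid, user_query):
    """Build a search body: query_string as the query, the SID term as a filter."""
    return {
        "query": {
            "filtered": {
                # Free-text matching belongs in the query part.
                "query": {
                    "query_string": {
                        "default_operator": "AND",
                        "fields": ["X", "Y"],
                        "query": user_query,
                    }
                },
                # Exact-match restriction stays a filter.
                "filter": {"term": {"SID": sid}},
            }
        },
        "sort": [{"Y": {"order": "desc"}}],
        "size": 1,
    }
```

This does not by itself explain the stale results, but it removes the query shape as a variable when testing.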


Yup. the bulk operations are all okay, at least as far as the http
response is concerned.

I'm almost certain that my problem is just that we're hitting some
resource limit for the size of our index (40gb), but I can't figure out
where to find the blockage. I'm watching the stats on the cluster and
seeing nothing other than flat/healthy usage.

I am seeing higher than normal read/write activity over the past 24
hours (huge number of documents added)

On Feb 9, 9:15 pm, medcl2...@gmail.com wrote:

hi, Oren Mazor
did you check the response of the bulk operation? were all the items
successfully indexed?
also check your translog status.
also manually refresh the index and see if you can get the current version

-----Original Message-----
From: Oren Mazor
Sent: Friday, February 10, 2012 2:25 AM
To: elasticsearch
Subject: Re: ES Index performance

Issuing a Get does not return the correct version.

we're using OpenJDK java version 1.6.0_18, not on EC2, with debian. we
have 1 replica and 10 shards.

I'm suspecting an IO issue, to be honest, given that if there were
indexing performance issues somebody may have seen them already, what
with "select isn't broken" :slight_smile:

that said, we do have a pretty high rate of indexing, so maybe at scale
some issue pops up?


The bulk request returns a response per item saying whether it succeeded (and, if it failed, the failure itself), so you need to check the actual response body, not just the HTTP status. Also, can you try a newer Java version? The one you use is pretty old.
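
A minimal sketch of that per-item check, assuming the bulk response body has been parsed into a dict with an `items` list in which each entry is keyed by its action name and a failed entry carries an `error` field. `failed_items` is a hypothetical helper name:

```python
def failed_items(bulk_response):
    """Collect (action, id, error) for every item in a bulk response that failed."""
    failures = []
    for item in bulk_response.get("items", []):
        # Each item is a single-key dict such as {"index": {...}} or {"delete": {...}}.
        for action, result in item.items():
            if "error" in result:
                failures.append((action, result.get("_id"), result["error"]))
    return failures
```

An HTTP 200 with a non-empty result from this function means some documents were rejected even though the request as a whole succeeded, so the indexer should log or retry those items.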
