Documents missing after indexing and refreshing


(Wojciech Durczyński) #1

Hello Shay.

I noticed a problem with ES 0.17.
I have automatic tests that:

  1. create two local ES nodes with different cluster names
  2. create an index without mappings on one of the nodes
  3. add a lot of documents to it quickly (without refresh) - the mapping is
    generated automatically
  4. refresh this index
  5. create another index on the second node
  6. read all documents from the first index (scan) and index them into the
    second index (other cluster) - mappings are created automatically
  7. close the first node
  8. refresh the second index
  9. search for a document in the second index

This scenario is used by many of my tests, and it sometimes fails
non-deterministically (the document is not found). After a while the document
becomes searchable.

In ES 0.16, search was always successful after indexing and refreshing. In ES
0.17 it isn't.
Do you know what may be the source of this problem? Maybe reading from
replicas was introduced in 0.17, or caching works differently than before?

Regards
Wojciech Durczyński


(Shay Banon) #2

Hey,

Is there a chance you can extract it into a standalone test that I can run,
and gist it? If this happens, I think you should be able to simplify it to
just indexing data into a node; the indexing process does not care where the
documents came from.

-shay.banon

2011/8/26 Wojciech Durczyński wojciech.durczynski@comarch.com



(Wojciech Durczyński) #3

Here is my test example (in Scala, but using the Elasticsearch Java API):

ES 0.17.5 fails the assertion before the 100th loop iteration.
ES 0.16.2 doesn't fail the assertion.


(Wojciech Durczyński) #4

In ES 0.16.2 sometimes there is an exception:
org.elasticsearch.indices.IndexAlreadyExistsException: [testindex] Already exists
    at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService.validate(MetaDataCreateIndexService.java:401)
    at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService.access$100(MetaDataCreateIndexService.java:79)
    at org.elasticsearch.cluster.metadata.MetaDataCreateIndexService$1.execute(MetaDataCreateIndexService.java:126)
    at org.elasticsearch.cluster.service.InternalClusterService$2.run(InternalClusterService.java:180)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)


(Wojciech Durczyński) #5

I modified this test a little:


If a document can't be found by searching by name in ES 0.17.5, it can still
always be retrieved by id.
I get the message: "Record wasn't found after 10s but can be get by id"


(Wojciech Durczyński) #6

What do you think about this, Shay? Is it a bug, or am I doing something
wrong?


(Wojciech Durczyński) #7

Hello Shay. I improved my test case. This time it's in Java:
https://gist.github.com/1234552

The problem is that in ES 0.17.* (I checked versions 0.17.5 and 0.17.7) this
code throws an exception: "Record wasn't found after 10s but can be get by
id".
But this code works well in ES 0.16.2.
That's because in ES 0.16.2 putMapping returns once the mapping has been
successfully created, and in ES 0.17.* it returns earlier.
If the mapping isn't created before indexing, then the document is indexed
incorrectly and can't be found by a search operation.
If "sleep after put mapping" is uncommented, then this test case works
better in ES 0.17.*.
But then there are sometimes errors: "Record found after 100 ms". It means
that the mapping was successfully created before the object was indexed, but
the object isn't searchable yet even after refresh. Uncommenting "sleep after
refresh" solves this problem. Maybe it's because refresh returns before the
refresh operation finishes?

I'm sure that it's a regression and that execute().actionGet() should return
only after the operation has successfully finished. What do you think about it?
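
The check behind the two error messages quoted in this post can be sketched roughly as follows. This is a minimal, library-free sketch and not the code from the test case: `SearchPoller`, `pollUntilFound` and `Outcome` are hypothetical names, and the `BooleanSupplier` probe is only a stand-in for whatever search call the real test makes.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Elasticsearch or of the original test case.
class SearchPoller {

    /** What polling observed: whether the probe ever succeeded, and when. */
    static final class Outcome {
        final boolean found;
        final int attempts;
        final long elapsedMillis;

        Outcome(boolean found, int attempts, long elapsedMillis) {
            this.found = found;
            this.attempts = attempts;
            this.elapsedMillis = elapsedMillis;
        }
    }

    /**
     * Calls the probe until it returns true or the timeout elapses,
     * sleeping intervalMillis between attempts. The probe is always
     * tried at least once.
     */
    static Outcome pollUntilFound(BooleanSupplier probe, long timeoutMillis, long intervalMillis) {
        long start = System.nanoTime();
        int attempts = 0;
        while (true) {
            attempts++;
            if (probe.getAsBoolean()) {
                return new Outcome(true, attempts, elapsedMillis(start));
            }
            if (elapsedMillis(start) >= timeoutMillis) {
                return new Outcome(false, attempts, elapsedMillis(start));
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and report "not found".
                Thread.currentThread().interrupt();
                return new Outcome(false, attempts, elapsedMillis(start));
            }
        }
    }

    private static long elapsedMillis(long startNanos) {
        return (System.nanoTime() - startNanos) / 1_000_000L;
    }

    public static void main(String[] args) {
        // Stand-in for "search by name": succeeds only on the third call.
        int[] calls = {0};
        Outcome outcome = pollUntilFound(() -> ++calls[0] >= 3, 10_000, 100);

        if (!outcome.found) {
            System.out.println("Record wasn't found after 10s");
        } else if (outcome.attempts > 1) {
            System.out.println("Record found after " + outcome.elapsedMillis + " ms");
        } else {
            System.out.println("Record found immediately");
        }
    }
}
```

A first-attempt hit means refresh made the document visible before it returned. A later hit corresponds to the "Record found after 100 ms" case, and a timeout to the "wasn't found after 10s" case.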



(Shay Banon) #8

Hi,

Thanks for the recreation. I opened an issue:
https://github.com/elasticsearch/elasticsearch/issues/1355. The problem is
that put mapping on a single node will not wait for the mapping to be
applied, so a subsequent index request might sneak in.

Btw, if you want to fully protect against that, even in the distributed /
multi-client scenario, you can simply have the mappings set as part of the
create index API.
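
The race and the suggested workaround can be pictured with a toy model. The sketch below is not Elasticsearch code: `ToyIndex` and its methods are hypothetical, and it assumes a deliberately simplified node where a mapping update is acknowledged as soon as it is queued and only applied later, while mappings passed at creation time are in place before the index is used.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Toy model only: hypothetical names, no Elasticsearch classes involved.
class ToyIndex {
    private final Map<String, String> mappings = new HashMap<>();
    private final Queue<String[]> pendingUpdates = new ArrayDeque<>();

    /** Creates an index whose initial mappings are in place before first use. */
    ToyIndex(Map<String, String> initialMappings) {
        mappings.putAll(initialMappings);
    }

    /** Acknowledges immediately; the update is only queued (assumed behaviour). */
    void putMapping(String type, String mapping) {
        pendingUpdates.add(new String[] {type, mapping});
    }

    /** Applies queued updates, as the node would do some time later. */
    void applyPendingUpdates() {
        String[] update;
        while ((update = pendingUpdates.poll()) != null) {
            mappings.put(update[0], update[1]);
        }
    }

    /** Returns the mapping a document of this type is indexed with right now. */
    String index(String type) {
        // A missing mapping is generated automatically.
        return mappings.computeIfAbsent(type, t -> "dynamic");
    }

    public static void main(String[] args) {
        // Separate put mapping: the index request sneaks in before it is applied.
        ToyIndex racy = new ToyIndex(Collections.emptyMap());
        racy.putMapping("doc", "explicit");
        System.out.println(racy.index("doc")); // prints "dynamic"

        // Mapping supplied at creation time: nothing can sneak in.
        ToyIndex safe = new ToyIndex(Collections.singletonMap("doc", "explicit"));
        System.out.println(safe.index("doc")); // prints "explicit"
    }
}
```

In this model the first document is indexed with the automatically generated mapping even though the explicit one was already acknowledged, which is the ordering problem described in this post.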

-shay.banon

2011/9/22 Wojciech Durczyński wojciech.durczynski@comarch.com

Hello Shay. I improved my test case. This time it's in Java:
https://gist.github.com/1234552



(Wojciech Durczyński) #9

Thank you for your answer.
I modified this test case to create a two-node cluster.
And now the "Record wasn't found after 10s but can be get by id" error happens
much less often, but it still occurs.
The "Record found after 100 ms" error happens very often - the refresh
operation doesn't work well even in a two-node cluster.


(Shay Banon) #10

Can you try with the change I just committed and see if it fixes it? I ran your
test and have not encountered the problem. It's committed on both the 0.17
branch and master.



(Wojciech Durczyński) #11

I tried current master with your change.
PutMapping works like a charm - the mapping is now always available when a
document is indexed.
But there is still a problem with refresh.
After indexing and refreshing, sometimes the object isn't found (but it becomes
searchable after a while).
My test case now always fails with the "Record found after 100 ms" error after
5-200 loops.


(Wojciech Durczyński) #12

I did some more tests using the master revision:

scenario 1:
create index
putMapping
sleep
loop {
    index
    refresh
    search
}
works without problems. If the index is created well before the loop, then
refreshing works and the document is always searchable after refresh.

scenario 2:
loop {
    create index
    putMapping
    index
    refresh
    search
}
fails very quickly, usually at the 2nd loop. The mapping is created correctly,
but the indexed document isn't searchable after refresh if the index was
created recently. It becomes searchable after a while.

scenario 3:
loop {
    create index
    putMapping
    wait for yellow status
    index
    refresh
    search
}
doesn't fail as quickly as scenario 2, but it also fails. It means that even
waiting for yellow status doesn't assure us that the document will be
searchable after indexing and refreshing. It becomes searchable after a while.

I'd like to be sure that I'll find all documents which were indexed before the
search. How can I do this?


(Frederic) #13

Hi there. For what it's worth, same issue here, but with default index
and mapping creation at indexing time. Searching for items right after
indexing them and refreshing the index does find documents, but not
all of them. Sleeping 5 secs after populating ES does the job. (main
parts of the test at https://gist.github.com/1242136)



(Shay Banon) #14

Hey,

Ran the test again, but now with deleting the index so I can run it
overnight, and it managed to fail (after ~3000 iterations). I guess the
difference comes from the different HW we use, as it's a concurrency problem.
I pushed two fixes to master that fix the failures I managed to recreate
(https://github.com/elasticsearch/elasticsearch/issues/1369 and
https://github.com/elasticsearch/elasticsearch/issues/1370). They mainly
relate to replica shard recovery and applying refresh / mapping to it. Can
you give it a go?

-shay.banon

2011/9/23 Wojciech Durczyński wojciech.durczynski@comarch.com



(Wojciech Durczyński) #15

I tried current master and my tests passed. Your patch seems to be working.
Is it possible to include the changes from this thread (issues 1355, 1369 and
1370) in ES 0.17.8?
When do you plan to release it, and when do you plan to release 0.18.0?

Regards
Wojciech Durczyński


(Shay Banon) #16

The fixes are in the 0.17 branch. There is no concrete date for 0.18; 0.17.8
can probably be released in a week or so.

2011/9/28 Wojciech Durczyński wojciech.durczynski@comarch.com


