Yesterday I had to restart my cluster. Afterwards, my one index, "atlas20160713", was stuck in the INITIALIZING phase for hours. When I went to look at why it was taking so long, I saw that it had been in the translog phase for over an hour. This index was about 36GB when it was being recovered. **The part that I DON'T UNDERSTAND is that while this index was in the translog phase, NO INDEXING was happening on this index at all.** Why is that? Why would replication, which should be a background action, prevent indexing from occurring? Can someone help me understand why?
There have been bugs that slowed down the translog phase in the past ... are you using the latest ES (2.3.4)?
During the translog phase, recently indexed documents since the last flush are re-indexed ... if you have an unusually large translog that can make things slower. Have you changed any translog settings?
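To see which shards are currently in the translog phase, one option is the indices recovery API (`GET /<index>/_recovery`). Below is a minimal sketch that filters an already-parsed response. It assumes the response shape shown in the 1.x recovery API docs (a per-index `shards` list with `id`, `stage`, and a `translog.recovered` counter); the helper name and the sample values are made up for illustration.

```python
def shards_in_translog_stage(recovery_response):
    """Return (index, shard id, translog ops recovered) for shards replaying the translog.

    `recovery_response` is the parsed JSON body of GET /<index>/_recovery.
    The field names used here are assumed from the 1.x docs.
    """
    found = []
    for index_name, index_info in recovery_response.items():
        for shard in index_info.get("shards", []):
            if str(shard.get("stage", "")).upper() == "TRANSLOG":
                translog = shard.get("translog", {})
                found.append((index_name, shard.get("id"), translog.get("recovered")))
    return found


# Hypothetical sample response; the numbers are invented, not taken from this thread.
sample = {
    "atlas20160713": {
        "shards": [
            {"id": 0, "stage": "TRANSLOG", "translog": {"recovered": 1200}},
            {"id": 1, "stage": "DONE", "translog": {"recovered": 0}},
        ]
    }
}

for index_name, shard_id, recovered in shards_in_translog_stage(sample):
    print(index_name, shard_id, recovered)
```

Polling this every few seconds and watching whether the recovered count keeps growing gives a rough idea of how much translog is left to replay.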
Currently running ES 1.7
I have not changed any translog settings.
I had one of these recoveries this morning that prevented any new documents from being indexed for about 13 minutes. I checked the translog size and it was about 500MB. The part that boggles my mind is why new documents aren't being indexed. That includes documents that aren't going into the recovering index.
There are a couple of issues here that work together. This is how 1.x recovery works:
1. We prevent the translog from flushing, which means it will keep all operations since the start of the recovery.
2. We copy over a point-in-time snapshot of the Lucene index files. We throttle this, so it may take a long time.
3. We replay everything in the translog to the target replica, making sure that everything that was indexed since we took the Lucene index snapshot is replayed.
4. We stop indexing on the primary and replay everything that was indexed while we were doing step 3.
I suspect step 4 is the source of your troubles, correct?
In ES 2.x we became smarter and removed step 4 altogether. If this is your issue, it will be gone once you upgrade.
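The four 1.x recovery steps described above can be illustrated with a toy simulation. This is not Elasticsearch code: the `Primary` class, the `recover_1x` function, and the way writes are scheduled are all invented for illustration, and a write that arrives while indexing is stopped is simply recorded as delayed rather than actually waiting. The point is only to show why writes stall during the final replay.

```python
class Primary:
    """Toy primary shard: a list of documents plus a translog of recent operations."""

    def __init__(self, docs):
        self.docs = list(docs)
        self.translog = []
        self.flush_allowed = True
        self.indexing_blocked = False
        self.delayed = []  # writes that arrived while indexing was stopped

    def index(self, doc):
        if self.indexing_blocked:
            self.delayed.append(doc)
            return False
        self.docs.append(doc)
        self.translog.append(doc)
        return True


def recover_1x(primary, during_copy, during_replay, during_final):
    """Walk through the four steps; each list holds writes arriving in that step."""
    # Step 1: stop translog flushes so every operation from now on is retained.
    primary.flush_allowed = False
    start = len(primary.translog)

    # Step 2: copy a point-in-time snapshot of the index files (slow, throttled).
    replica = list(primary.docs)
    for doc in during_copy:
        primary.index(doc)  # indexing continues on the primary

    # Step 3: replay the translog as it stands now; new writes are still accepted.
    replay_end = len(primary.translog)
    for doc in during_replay:
        primary.index(doc)
    replica.extend(primary.translog[start:replay_end])

    # Step 4: stop indexing and replay what arrived during step 3.
    primary.indexing_blocked = True
    for doc in during_final:
        primary.index(doc)  # recorded as delayed, not indexed
    replica.extend(primary.translog[replay_end:])

    primary.indexing_blocked = False
    primary.flush_allowed = True
    return replica


primary = Primary(["a", "b"])
replica = recover_1x(primary, during_copy=["c"], during_replay=["d"], during_final=["e"])
print(replica)          # the replica has caught up with the primary
print(primary.delayed)  # writes that had to wait during step 4
```

In this sketch the length of the step 4 stall grows with the number of operations indexed during step 3, which matches the observation that a larger translog means a longer pause.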