ES index shows lots of deletes even though no delete operation is performed

Hi,
I have a heavily used index into which I am constantly indexing documents. A lot of the time, I am updating existing docs with new or changed field values.

When I run the stats API, I see that the index accumulates a bunch of deletes over time, even though the explicit delete API is never called. How is that possible? Is something happening under the covers?

Thanks

--

Updates in Lucene are essentially deletes + inserts.
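You can see it with the stats API. A minimal sketch in Python using the requests library; the node address, index name, type, and id below are placeholders, and the type-in-URL layout assumes an older Elasticsearch:

import requests

BASE = "http://localhost:9200"

# Index a document, then "update" it by indexing a new version under
# the same id. Lucene segments are immutable, so the old version is
# marked as deleted and the new version is appended.
requests.put(BASE + "/myindex/doc/1", json={"field": "original"})
requests.put(BASE + "/myindex/doc/1", json={"field": "updated"})

# Refresh so the stats reflect the flushed segments, then look at the
# deleted-docs counter.
requests.post(BASE + "/myindex/_refresh")
stats = requests.get(BASE + "/myindex/_stats").json()
print(stats["_all"]["primaries"]["docs"]["deleted"])

Note that merges purge deleted docs as a side effect, so on a tiny test index this counter can drop back to 0 almost immediately.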


--

I thought so too, but then I created a test index with a test document and updated it, and running stats on the test index didn't show any deletes. Hence I got confused.
Assuming that updates cause deletes, does it make sense to have a daily/weekly cron job that runs optimize with the expunge-deletes option, to keep the index lightweight all the time? Is that a recommended practice? If not, what are the recommendations in this regard?
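For reference, the call I have in mind is something like this (a sketch in Python with requests; the index name is a placeholder, and the endpoint is the pre-2.0 _optimize API, later renamed _forcemerge):

import requests

# Merge out only the segments that contain deleted documents, instead
# of doing a full optimize down to a single segment.
requests.post("http://localhost:9200/myindex/_optimize",
              params={"only_expunge_deletes": "true"})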

Thanks


--

The recommendation is to let Lucene's readers and segment merging do their own thing. Much work has been done to minimize the cost of a delete.

If your search traffic has a consistent period of low activity, you can
schedule an optimize for that time.
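Something along these lines, kicked off by cron in the quiet window. A sketch in Python with requests; the index name, schedule, and script path are placeholders:

import requests

# Meant to be run from cron during the low-traffic window, e.g. with a
# crontab entry like:
#   0 3 * * * /usr/bin/python /path/to/nightly_optimize.py
# A full optimize (max_num_segments=1) is very IO-intensive, so keep it
# to off-peak hours if you do it at all.
resp = requests.post("http://localhost:9200/myindex/_optimize",
                     params={"max_num_segments": "1"})
resp.raise_for_status()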

--
Ivan


--

I agree with Ivan; I'd let Lucene merge away the deleted documents when it's ready to. Optimizing isn't necessary and is very I/O-intensive.
