I have an application that needs to search an index periodically, picking up only the entries that are "new" since the previous run, and I am wondering what the best way of doing this is.
I assume that _id is serial, so simply saving the _id of the last document from each run and then searching for _id greater than my saved value on the next run should work. Correct?
I have timestamps (to one-second resolution), but timestamps may coincide, which is why I am looking at alternatives. The data comes from a network intrusion detection system, and we regularly see rates of 100 events per second or more.
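
To make the coinciding-timestamp concern concrete, here is a minimal Python sketch. It does not call any search API: the index is simulated as an in-memory list, and the `seq` field, the tuple layout, and the function names are all hypothetical, introduced only for illustration. It compares a cursor that stores just the last timestamp with one that stores the timestamp plus a tie-breaker, assuming such a tie-breaker field exists in the data.

```python
from typing import List, Tuple

# (timestamp_seconds, seq, payload). "seq" is a hypothetical tie-breaker
# field; nothing in the real data is known to provide it.
Event = Tuple[int, int, str]


def poll_by_timestamp(index: List[Event], last_ts: int) -> Tuple[List[Event], int]:
    """Return events strictly newer than last_ts, plus the new cursor."""
    new = [e for e in index if e[0] > last_ts]
    new_cursor = max((e[0] for e in new), default=last_ts)
    return new, new_cursor


def poll_by_composite(
    index: List[Event], cursor: Tuple[int, int]
) -> Tuple[List[Event], Tuple[int, int]]:
    """Return events whose (timestamp, seq) sorts after the cursor."""
    new = sorted(e for e in index if (e[0], e[1]) > cursor)
    new_cursor = (new[-1][0], new[-1][1]) if new else cursor
    return new, new_cursor


def demo() -> Tuple[List[Event], List[Event]]:
    # Two events share the same second when the first poll runs.
    index: List[Event] = [(100, 1, "a"), (100, 2, "b")]
    _, ts_cursor = poll_by_timestamp(index, 0)
    _, comp_cursor = poll_by_composite(index, (0, 0))

    # A third event lands in that same second after the first poll.
    index.append((100, 3, "c"))

    missed, _ = poll_by_timestamp(index, ts_cursor)
    found, _ = poll_by_composite(index, comp_cursor)
    return missed, found


if __name__ == "__main__":
    missed, found = demo()
    print("timestamp-only cursor returned:", missed)
    print("composite cursor returned:", found)
```

In this simulation the timestamp-only cursor skips the late event from the same second, while the composite cursor picks it up. Whether a suitable tie-breaker is available, and whether _id can play that role, is exactly the open question here.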