They describe a number of features that they've developed and made available on GitHub; in particular, they describe their implementation of a Bloom filter. I've noticed that ES incorporates a "SimpleBloomCache" as per this thread:
would the Greplin Bloom implementation be a worthwhile (or even possible) improvement to ES?
would any/all of the other features be worth porting to ES?
I'm normally reluctant to suggest large enhancements/features, but in this case it seems that having code already written and openly available might make it a reasonable notion. What are your thoughts?
They describe a number of features that they've developed and made available on GitHub; in particular, they describe their implementation of a Bloom filter. I've noticed that ES incorporates a "SimpleBloomCache" as per this thread:
would the Greplin Bloom implementation be a worthwhile (or even possible) improvement to ES?
The Bloom filter Greplin wrote is not really related to Lucene, though it can be used in certain places when working with Lucene. I had a quick look at the implementation; the ES one is different, inspired by Cassandra's (which uses Lucene's OpenBitSet for it :), round and round the open source goes). The ES implementation will be faster and use less memory, though I have not tested it; that is just from looking at the code.
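For readers unfamiliar with the structure being compared, here is a minimal sketch of a bit-set-backed Bloom filter in Java. It is neither the Greplin code nor the ES SimpleBloomCache: the class name `SimpleBloomSketch`, the seeded FNV-1a hash, and the double-hashing scheme are all assumptions chosen for illustration, and `java.util.BitSet` stands in for Lucene's OpenBitSet so the example has no dependencies.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Illustrative Bloom filter sketch; hypothetical, not the Greplin or ES implementation. */
public class SimpleBloomSketch {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    public SimpleBloomSketch(int numBits, int numHashes) {
        if (numBits <= 0 || numHashes <= 0) {
            throw new IllegalArgumentException("numBits and numHashes must be positive");
        }
        this.bits = new BitSet(numBits);
        this.numBits = numBits;
        this.numHashes = numHashes;
    }

    /** Seeded 32-bit FNV-1a hash (an assumption; real implementations often use MurmurHash). */
    private static int hash(byte[] data, int seed) {
        int h = 0x811C9DC5 ^ seed;
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= 0x01000193;
        }
        return h;
    }

    /** Derives the i-th bit position from two base hashes (double hashing). */
    private int index(byte[] data, int i) {
        int h1 = hash(data, 0);
        int h2 = hash(data, 0x9747b28c);
        return Math.floorMod(h1 + i * h2, numBits);
    }

    /** Sets the bit for every hash position of the key. */
    public void add(String key) {
        byte[] data = key.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < numHashes; i++) {
            bits.set(index(data, i));
        }
    }

    /** Returns false if the key was definitely never added; true means "possibly added". */
    public boolean mightContain(String key) {
        byte[] data = key.getBytes(StandardCharsets.UTF_8);
        for (int i = 0; i < numHashes; i++) {
            if (!bits.get(index(data, i))) {
                return false;
            }
        }
        return true;
    }
}
```

Speed and memory differences between implementations mostly come from the hash function and the bit-set backing; which of the two wins would need a benchmark, since neither was measured in this thread.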
would any/all of the other features be worth porting to ES?
The only thing that I see is the phrase query, which can be easily added as another query option (it simply wraps Lucene MultiPhraseQuery).
Actually, the more interesting work Greplin did is with interval fields (it's another project on GitHub). That would be a cool feature to have in ES, but I need to review it more before including it in ES.
I'm normally reluctant to suggest large enhancements/features, but in this case it seems that having code already written and openly available might make it a reasonable notion. What are your thoughts?
Don't be reluctant to suggest any type of feature, no matter how big or small; that's the first thought that jumps to mind.
would the Greplin Bloom implementation be a worthwhile (or even possible) improvement to ES?
... The ES implementation will be faster and use less memory, though I have not tested it; that is just from looking at the code.
Fair enough; I'm not surprised of course, but wanted to mention it so you were aware of it.
would any/all of the other features be worth porting to ES?
The only thing that I see is the phrase query, which can be easily added as another query option (it simply wraps Lucene MultiPhraseQuery).
That might be handy.
I'm normally reluctant to suggest large enhancements/features, but in this case it seems that having code already written and openly available might make it a reasonable notion. What are your thoughts?
Don't be reluctant to suggest any type of feature, no matter how big or small; that's the first thought that jumps to mind.