Compared to Solr (with SolrCloud), what are the advantages of Elasticsearch?


(Daniel Guo) #1

I have never used Apache Solr, and I'm trying Elasticsearch in my project.
The ES documentation is a little sparse, but I have to explain to my
supervisor why I chose ES over Solr.

As far as I know, Solr (with SolrCloud) also supports distributed
indexing, near-real-time updates and search, and automatic load
balancing, which are the main features of Elasticsearch.

What are the advantages of ES compared to Apache Solr? Could anybody give
me a tip, or some links to more information?
Thanks a lot.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/44aa3f8d-59cb-4500-9b81-694718019057%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(David Pilato) #2

I would say: play with both for some hours.
I really think you will get some answers by yourself!

I don't want to say more than this as I probably have a biased opinion :wink:

--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr



(Nik Everett) #3

About six months ago I spent a week porting a prototype from Solr Cloud to
Elasticsearch with the intent of evaluating Elasticsearch and either
throwing out the port or building off of it. By the third day or so I was
convinced I'd stick with Elasticsearch because:

  1. I was impressed with
    http://www.elasticsearch.org/contributing-to-elasticsearch/.
  2. The documentation is better.
  3. I liked the query DSL better than Solr's.
  4. There is some HTTP GET that you can hit in Solr that will delete the
    index (or a shard or something). That shook my faith in humanity a little.
    Especially when I pasted it into IRC and my coworker clicked it or moused
    over it or something.... GETs. Idempotent.
  5. I liked the phrase suggester.
  6. My ops team seemed to like it better.
  7. There was (and still is) a deb package.
  8. I liked the way Elasticsearch was tested. I admit I haven't actually
    looked into how Solr is tested.

Since then:

  1. I've enjoyed the process of landing changes in Elasticsearch much more
    than in Lucene. I assume Solr would be the same because it is in the same
    repository as Lucene. The GitHub process (pull requests, etc.) is better
    than JIRA/svn/patch files. I also think the Elasticsearch
    committers/repository collaborators are easier to work with than the Lucene
    folks.
  2. The phrase suggester needed some work to be as good as our
    (surprisingly advanced) home grown suggester. It is now that good.
  3. Elasticsearch has really improved the process of maintaining their
    documentation so I imagine it'll only get better.
  4. It seems to be working. We're using 0.90.7 at this point (see
    https://en.wikisource.org/wiki/Special:Version) to power the search on a
    couple hundred wikis without any trouble. Try it:
    https://en.wikisource.org/w/index.php?search=alias&title=Special%3ASearch

Nik



(Otis Gospodnetić) #4

Hi Daniel,

Here is an unbiased 6-part series on this very
topic: http://blog.sematext.com/2012/08/23/solr-vs-elasticsearch-part-1-overview/

Note that SolrCloud has improved a lot since then and ES also got a number
of new features.

Sometimes one's requirements and must-have features determine the selection.

Otis

Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/



(Daniel Guo) #5

Hi Nik:
Thanks for sharing your experience and opinion on the topic.
Could you please give me some advice on the bigger picture, such as the
distributed model, read-time indexing, search performance and so on?
Thanks so much.



(Daniel Guo) #6

David, nice to see you. Your opinion is very helpful to me, even though it
may be biased.
When I have more time, I'll take your advice and try both of them
myself.



(Daniel Guo) #7

Hi Otis:
Thank you for your advice. It's helpful, really.



(Nik Everett) #8

On Fri, Dec 27, 2013 at 4:30 AM, Daniel Guo daniel5hbs@gmail.com wrote:

distributed model

Automatic shard rebalancing works quite well. We're able to do rolling
restarts without losing any redundancy. It is useful to keep in mind that
some things, like scores and suggestions, come from data that is per shard
rather than across the whole index.
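
To make the per-shard point concrete, here is a minimal sketch. It assumes a classic Lucene-style IDF of 1 + ln(N / (df + 1)); the shard contents and function names are made up for illustration and are not Elasticsearch code. The same term gets a different weight on each shard, and neither matches the weight computed over the whole index.

```python
import math

def idf(doc_count, doc_freq):
    # Classic Lucene-style inverse document frequency (assumed formula).
    return 1.0 + math.log(doc_count / (doc_freq + 1.0))

def term_idf(docs, term):
    # Count the documents containing the term, then weigh it.
    doc_freq = sum(1 for doc in docs if term in doc)
    return idf(len(docs), doc_freq)

# Hypothetical shards: each document is just a set of terms.
shard_a = [{"alias", "wiki"}, {"alias"}, {"alias", "page"}, {"wiki"}]
shard_b = [{"wiki"}, {"page"}, {"page", "wiki"}, {"alias"}]

idf_shard_a = term_idf(shard_a, "alias")           # term is common on this shard
idf_shard_b = term_idf(shard_b, "alias")           # term is rare on this shard
idf_global = term_idf(shard_a + shard_b, "alias")  # statistics over both shards

print(idf_shard_a, idf_shard_b, idf_global)
```

With many documents spread evenly over the shards the difference usually becomes small. If it matters, the `dfs_query_then_fetch` search type asks the shards for their term statistics first, at the cost of an extra round trip.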

read-time indexing

I assume you mean real-time indexing. That works fine. Our problem is
actually getting the documents built and shipped off to Elasticsearch in a
timely manner, not Elasticsearch being able to ingest them. It is
important to make sure that you have a process for doing online schema
changes like
http://www.elasticsearch.org/blog/changing-mapping-with-zero-downtime/ .
Those processes can push Elasticsearch to its limit if you do them
multi-threaded/multi-process (shakes fist at PHP). Just don't use so many
threads that you crush Elasticsearch. You'll have to measure that. We
crushed three Elasticsearch nodes with 20 processes but your mileage will
vary.
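
A rough sketch of that kind of process, under stated assumptions: the documents are rebuilt into a new index by a capped pool of workers, and then an alias is switched from the old index to the new one. `send_batch` is a hypothetical stand-in for a real bulk request, and the names `wiki`, `wiki_v1` and `wiki_v2` are invented. The only Elasticsearch-specific piece is the shape of the aliases request body.

```python
from concurrent.futures import ThreadPoolExecutor

def chunked(items, size):
    """Yield consecutive batches of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def send_batch(batch):
    # Hypothetical stand-in for a bulk request to Elasticsearch.
    # Swap in a real client call; here it only reports the batch size.
    return len(batch)

def reindex(docs, batch_size=500, max_workers=4):
    # max_workers caps how many batches are in flight at the same time,
    # which is the knob to tune so the cluster is not overwhelmed.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return sum(pool.map(send_batch, chunked(docs, batch_size)))

def alias_swap_actions(alias, old_index, new_index):
    # Body for the aliases endpoint: both actions are applied atomically,
    # so searches against the alias never see a missing index.
    return {"actions": [
        {"remove": {"index": old_index, "alias": alias}},
        {"add": {"index": new_index, "alias": alias}},
    ]}

docs = [{"id": i} for i in range(1200)]
indexed = reindex(docs, batch_size=500, max_workers=2)
swap = alias_swap_actions("wiki", "wiki_v1", "wiki_v2")
print(indexed, swap)
```

The right values for `batch_size` and `max_workers` have to be measured against your own cluster; the defaults here are placeholders.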

search performance

So far everything is quite quick and we're happy that we can add more
replicas to increase performance. We're not sure yet if we'll do that. I
suggest setting up whatever kind of performance metrics gathering system
you have in house. Capturing those metrics is pretty simple as you can
just dig them out of the REST API. If you happen to use Ganglia, feel free
to use our script:
http://git.wikimedia.org/tree/operations%2Fpuppet.git/8509513c2ec7c0114554deac3dbb6aa177ce743a/modules%2Felasticsearch%2Ffiles%2Fganglia
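
For anyone not on Ganglia, a minimal sketch of digging metrics out of a stats response: flatten the nested JSON into dotted metric names that most monitoring systems accept. The `sample` document below is a hypothetical, heavily trimmed stand-in for a node stats response; the exact endpoint path and field names depend on the Elasticsearch version, so check them against your own cluster.

```python
def flatten(stats, prefix=""):
    """Flatten nested dicts into dotted metric names, keeping numeric leaves."""
    metrics = {}
    for key, value in stats.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            metrics.update(flatten(value, name))
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            metrics[name] = value
    return metrics

# Hypothetical sample shaped like part of a node stats document.
sample = {
    "name": "node-1",
    "indices": {
        "search": {"query_total": 1520, "query_time_in_millis": 9120},
        "indexing": {"index_total": 48000},
    },
}

metrics = flatten(sample)

# Counters are cumulative, so ratios (or deltas between polls) are more
# useful than the raw numbers.
avg_query_ms = (metrics["indices.search.query_time_in_millis"]
                / metrics["indices.search.query_total"])
print(metrics, avg_query_ms)
```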

and so on.

As I said before I like the Elasticsearch community. They are helpful.

Make sure to wait a week to ten days after each release to see if some
critical flaw is discovered. Elasticsearch is pretty well tested but every
other release seems to have had some trouble recently. I doubt this'll
happen every time but you may as well be safe.

For my use case automatic index creation and automatic field creation are
more trouble than they are worth. These may be worth turning off for you.
They are on by default because they work well for some significant portion
of users and they make playing around really easy.
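
A small sketch of what turning these off can look like, assuming the `action.auto_create_index` node setting and a mapping with `"dynamic": "strict"`. The mapping is written in the old style with a type name and `string` fields, as used around 0.90; newer versions have dropped mapping types and renamed the field types. The type name `page` and its fields are made up.

```python
import json

# Node-level setting (goes in elasticsearch.yml): do not create an index
# just because a document was sent to a name that does not exist yet.
node_settings = {"action.auto_create_index": False}

def strict_mapping(type_name, properties):
    # "dynamic": "strict" makes indexing fail when a document contains a
    # field that is not declared in the mapping, instead of adding it.
    return {type_name: {"dynamic": "strict", "properties": properties}}

# Hypothetical mapping for illustration only.
mapping = strict_mapping("page", {
    "title": {"type": "string"},
    "text": {"type": "string"},
})

body = json.dumps(mapping, sort_keys=True)
print(node_settings, body)
```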

Nik



(Daniel Guo) #9

Hi, Nik:
Thank you for sharing your practical experience. I'll remember and follow
your advice. Thanks again!



(Pierce Wetter) #10

We use both Solr and Elasticsearch at Chegg.

The search for www.chegg.com is powered by Solr, because that's done by the
search team, who are more hard-core search nerds, like XML instead of
JSON, etc. They have one master and a whole bunch of slaves, and rebuild
the master continuously.

I wanted to switch to Elasticsearch for the eReader team, because we're
constantly adding new eBooks to our catalog, so I needed something that
clustered. We had a bunch of endless meetings discussing it. Ops wanted a
zone-aware solution, which SolrCloud, since it's based on ZooKeeper,
couldn't do automatically. Plus realistically, only the search folks knew
how to deal with Solr. I could deal with ES with just my team's partial
attention.

Elasticsearch could do the zone-aware thing, so that's how I got Ops to
sign up. Plus they were already using Logstash. But really, it's because it's
much easier for me to administer, and the clustering part just works on
its own without needing ZooKeeper.
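
For readers wondering what "zone aware" means in practice: the cluster setting `cluster.routing.allocation.awareness.attributes` tells Elasticsearch to spread the copies of a shard across nodes with different values of a node attribute such as a zone. The toy sketch below only illustrates that idea with a round-robin over zones. It is not Elasticsearch's allocator, and the zone and node names are invented.

```python
from itertools import cycle

# The setting that switches awareness on; "zone" is the name of a custom
# node attribute that each node has to declare in its own configuration.
settings = {"cluster.routing.allocation.awareness.attributes": "zone"}

def place_copies(zones_to_nodes, copies):
    """Spread `copies` copies of one shard round-robin across zones."""
    placement = []
    used = {zone: 0 for zone in zones_to_nodes}
    zone_cycle = cycle(sorted(zones_to_nodes))
    for _ in range(copies):
        zone = next(zone_cycle)
        nodes = zones_to_nodes[zone]
        # Pick the next node inside the chosen zone.
        placement.append((zone, nodes[used[zone] % len(nodes)]))
        used[zone] += 1
    return placement

# Hypothetical cluster layout: two zones, three nodes.
nodes = {"us-east-1a": ["node-1", "node-2"], "us-east-1b": ["node-3"]}

# One primary plus one replica should land in different zones.
placement = place_copies(nodes, copies=2)
print(placement)
```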

Pierce


