Integration with NoSQL

Sergio_Bossa · October 6, 2010, 2:52pm

Hi Shay,

thanks for your response.
My thoughts below ...

Storing the actual source of course causes more overhead when indexing a
document (basically, when a lucene segment is written, more data needs to be
written to disk), and when segments are merged (again, more IO operations,
but very trivial ones). This is a very minimal cost compared to all the
other things that happen when indexing a document, so I am surprised by
your experience, Sergio... I think there might have been something else at
play here...

There might have been something else; I'm in no way an expert in Lucene
internals: I'm just reporting my experiences.

On the other hand, not storing the source document means several things.
First, you need to fetch the relevant data from your "other storage",
possibly a NoSQL store. Now, let's compare the cost of this: when executing a
search, the "fetching" phase is already executing on the relevant node, so
all that is added is accessing the shard storage (let's say fs) and fetching
it.

Agreed, reading everything from ES will certainly be faster, but I was
talking about write performance degradation.
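
To make the trade-off under discussion concrete, here is a minimal sketch in Python. Everything in it is a hypothetical in-memory stand-in (the `SearchIndex` class, the `fetch_hits` helper, and the plain dict used as the "other storage"); none of it is the Elasticsearch, Lucene, or any NoSQL client API. It only illustrates the shape of the two options: storing the source alongside the index writes more bytes at index time, while not storing it means every search needs a second lookup in the external store.

```python
import json
from typing import Dict, List, Set


class SearchIndex:
    """Toy inverted index; a hypothetical stand-in, not a real ES or Lucene API."""

    def __init__(self, store_source: bool) -> None:
        self.store_source = store_source
        self.postings: Dict[str, Set[str]] = {}
        self.sources: Dict[str, dict] = {}
        self.bytes_written = 0  # extra bytes written only because the source is stored

    def index(self, doc_id: str, doc: dict) -> None:
        # The terms are always indexed, whichever option is chosen.
        for term in doc["body"].lower().split():
            self.postings.setdefault(term, set()).add(doc_id)
        # Storing the source adds write work at index time.
        if self.store_source:
            self.sources[doc_id] = doc
            self.bytes_written += len(json.dumps(doc))

    def search(self, term: str) -> List[str]:
        # Returns matching document ids only.
        return sorted(self.postings.get(term.lower(), set()))


def fetch_hits(index: SearchIndex, external_store: Dict[str, dict], term: str) -> List[dict]:
    """Resolve search hits to full documents."""
    ids = index.search(term)
    if index.store_source:
        # The document is read from the index's own storage.
        return [index.sources[doc_id] for doc_id in ids]
    # Otherwise each hit costs a lookup in the "other storage".
    return [external_store[doc_id] for doc_id in ids]
```

With this sketch, both configurations return the same documents for a query; the difference is where the cost lands (`bytes_written` at index time versus the extra lookup at search time). Real write overhead depends on segment writes and merges, which this toy model does not attempt to reproduce.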

That's why, by default, elasticsearch does store the _source, and I think
that it's a good "out of the box" solution,

I'm not saying it is not a good "out of the box" solution: it indeed
is, so I agree with you there.

and this is what I would
recommend using most (if not almost all) of the time.

Here's where I disagree, but it's just my opinion.

Cheers,

Sergio B.

--
Sergio Bossa
http://www.linkedin.com/in/sergiob
