One index to rule them all?


(Brian O'Neill) #1

Looking for some advice…
(I'm sure this question has been asked before; feel free to point me to a URL.)

We are about to scale a system to accommodate thousands of data feeds. Each data feed contains millions of documents. We need to search across all of those feeds with arbitrarily complex queries. (We will need filtering, custom scoring, etc.)

What are the pros and cons of using a single index for all of those feeds?
Are there any constraints we may run into if we create one index per feed?
Any impact on query-time performance, storage requirements, etc.?

Or are we safe to implement either given that it all boils down to shards in the end?
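
To put rough numbers on the "it all boils down to shards" question, here is a back-of-the-envelope sketch of how many shard copies each layout asks the cluster to host. The feed count (2,000), the function name, and the 50-shard single index are assumptions for illustration only; 5 primaries and 1 replica per index were the Elasticsearch defaults at the time.

```python
def total_shard_copies(num_indices: int, primaries: int, replicas: int) -> int:
    """Count shard copies (primaries plus their replicas) across all indices."""
    return num_indices * primaries * (1 + replicas)


# Assumed numbers, not taken from the thread.
per_feed = total_shard_copies(num_indices=2000, primaries=5, replicas=1)
single = total_shard_copies(num_indices=1, primaries=50, replicas=1)

print(per_feed)  # 20000
print(single)    # 100
```

Each shard is a complete Lucene index with its own files and memory overhead, so the two layouts are not equivalent at rest even though a search across all feeds fans out to every shard in both.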

-brian


Brian O'Neill
Chief Architect
Health Market Science
The Science of Better Results
2700 Horizon Drive • King of Prussia, PA • 19406
M: 215.588.6024 • @boneill42 • healthmarketscience.com

This information transmitted in this email message is for the intended recipient only and may contain confidential and/or privileged material. If you received this email in error and are not the intended recipient, or the person responsible to deliver it to the intended recipient, please contact the sender at the email above and delete this email and any attachments and destroy any copies thereof. Any review, retransmission, dissemination, copying or other use of, or taking any action in reliance upon, this information by persons or entities other than the intended recipient is strictly prohibited.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Itamar Syn-Hershko) #2

Have you looked into sharding? Basically, that gives you the best of both
worlds (if you choose the sharding function wisely!).
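
A minimal sketch of what choosing the sharding function can mean in a single shared index: Elasticsearch picks a shard as hash(routing) mod number_of_primary_shards, and the routing value defaults to the document ID. The sketch below uses `zlib.crc32` as a stand-in hash (Elasticsearch uses its own hash internally), and the shard count, document IDs, and feed IDs are all made up for illustration.

```python
import zlib

NUM_SHARDS = 20  # assumed shard count for the shared index


def shard_for(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key to a shard number: hash(key) mod number of primaries."""
    # crc32 is only a stand-in for Elasticsearch's internal hash function.
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


# Hypothetical documents as (doc_id, feed_id) pairs.
docs = [("doc-1", "feed-a"), ("doc-2", "feed-a"), ("doc-3", "feed-b")]

# Default behaviour: route by document ID, so a feed's documents can scatter.
by_id = {doc_id: shard_for(doc_id, NUM_SHARDS) for doc_id, _ in docs}

# Custom routing: route by feed ID, so a feed's documents share one shard.
by_feed = {doc_id: shard_for(feed_id, NUM_SHARDS) for doc_id, feed_id in docs}

print(by_feed["doc-1"] == by_feed["doc-2"])  # True
```

With feed-based routing, a search that passes the same routing value only has to visit the one shard holding that feed, while a search without routing still covers every feed in the index.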

I would say go with separate indexes only if you plan to retire some
of them over time, or to do mass deletes of some sort. Otherwise I don't see
a reason for doing that.
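
To illustrate why retirement and mass deletes tip the balance, here is a rough sketch of the request each layout would need in order to drop one feed. It only builds the request descriptions and does not talk to a cluster. The index names (`feed-<id>`, `feeds`) and the `feed_id` field are hypothetical, and the delete-by-query endpoint shown is the one in current Elasticsearch releases; older releases exposed it differently.

```python
import json


def retire_feed_index_per_feed(feed_id: str):
    """One index per feed: retiring a feed drops the whole index."""
    return ("DELETE", f"/feed-{feed_id}", None)


def retire_feed_single_index(feed_id: str, index: str = "feeds"):
    """Single shared index: retiring a feed deletes its documents by query."""
    body = {"query": {"term": {"feed_id": feed_id}}}
    return ("POST", f"/{index}/_delete_by_query", body)


requests = (retire_feed_index_per_feed("42"), retire_feed_single_index("42"))
for method, path, body in requests:
    print(method, path, json.dumps(body) if body else "")
```

Dropping an index removes its files outright, whereas delete by query only marks documents as deleted and relies on later segment merges to reclaim the space.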

On Tue, Oct 15, 2013 at 11:23 PM, Brian O'Neill bone@alumni.brown.edu wrote:

Looking for some advice…
(i'm sure this is a question that has been asked before --- feel free to
point me to a url)

We are about to scale a system to accommodate thousands of data feeds.
Each data feed contains millions of documents. We need to search across
all of those feeds with arbitrarily complex queries. (We will need
filtering, custom scoring, etc.)

What are the pros and cons to using a single index for all of those feeds?
Are there any constraints we may run into if we create one index per feed?

Any impact on query time performance/storage requirements, etc?

Or are we safe to implement either given that it all boils down to shards
in the end?

-brian



