1 large index vs several smaller indexes


(Marcio Rodrigues) #1

Still new to ES, so I was wondering what would be the best approach.

Say you have, for example, a company with different branches throughout the
country whose documents will be stored centrally but need to be
searched separately.

Would it be better to create one index for all documents and use filters on
the queries, or to make a separate index for each branch?

By better I mean most efficient in terms of performance and resource usage.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/bf5107ce-4803-4fa1-89fe-a20cf678bb00%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Itamar Syn-Hershko) #2

Geo-locating can help with performance (having the UK branch use a UK
datacenter and a US branch a NA datacenter), and this is most easily achieved
by separating clusters. You can use a Tribe node to work on multiple
clusters if it ever comes to it.

Other than that, your question really depends on the amount of data and
the types of queries you make. It does sound like the docs are completely
unrelated, so different indexes (and even clusters) do make sense.

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/



(Marcio Rodrigues) #3

Sorry, just saw your reply now.

Thanks, I will go with multiple indexes.

On Tuesday, April 8, 2014 11:02:05 AM UTC+1, Itamar Syn-Hershko wrote:

Geo-locating can help with performance (having the UK branch use a UK
datacenter and a US branch a NA datacenter), and this is most easily achieved
by separating clusters. You can use a Tribe node to work on multiple
clusters if it ever comes to it.

Other than that, your question really depends on the amount of data and
the types of queries you make. It does sound like the docs are completely
unrelated, so different indexes (and even clusters) do make sense.



(Jilles van Gurp) #4

I would separate the performance issue from the logical structure of your
domain. You really need to think in terms of numbers of documents and
shards (and not indices).
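
To make the point about shards concrete, here is a rough back-of-the-envelope sketch in Python. The function name, the branch count of 20, and the shard settings are illustrative assumptions and not figures from this thread; 5 primaries and 1 replica were the out-of-the-box defaults in Elasticsearch releases of this era.

```python
def total_shards(num_indices: int, primaries_per_index: int, replicas: int) -> int:
    """Count every shard copy the cluster must host (primaries plus replicas)."""
    return num_indices * primaries_per_index * (1 + replicas)


# Hypothetical company with 20 branches (assumed number, not from the thread).
one_big_index = total_shards(num_indices=1, primaries_per_index=5, replicas=1)
index_per_branch = total_shards(num_indices=20, primaries_per_index=5, replicas=1)

print(one_big_index)     # 10
print(index_per_branch)  # 200
```

Each shard is a Lucene index with its own overhead, so the same documents spread over many default-sized indices can cost noticeably more than one larger index. Lowering the number of primaries per branch index would narrow that gap.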

You may want to look into using index aliases, which can take a filter.
That way you can have one index and several branch aliases with a filter.
Alternatively, you can have one index per branch and then a company alias
that includes all branch indexes. That way you can do company-wide
searches. It all depends on what you need.
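
As a minimal sketch of both alias layouts, the Python below only builds the JSON bodies one would send to the `_aliases` endpoint; it does not talk to a cluster. The index names, alias names, and the `branch` field are hypothetical, and the term filter assumes that field stores the branch name as a single exact value.

```python
import json


def filtered_branch_aliases(index, branches, field="branch"):
    """Option 1: one shared index, plus one filtered alias per branch."""
    return {
        "actions": [
            {
                "add": {
                    "index": index,
                    "alias": f"branch_{b}",
                    # Searches through this alias only see this branch's documents.
                    "filter": {"term": {field: b}},
                }
            }
            for b in branches
        ]
    }


def company_alias(branch_indices, alias="company"):
    """Option 2: one index per branch, plus one alias spanning all of them."""
    return {
        "actions": [{"add": {"index": idx, "alias": alias}} for idx in branch_indices]
    }


# Hypothetical names, for illustration only.
print(json.dumps(filtered_branch_aliases("documents", ["london", "leeds"]), indent=2))
print(json.dumps(company_alias(["docs_london", "docs_leeds"]), indent=2))
```

With the first layout a branch searches `branch_london` as if it were its own index; with the second, a search against `company` fans out to every branch index.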

Jilles

On Tuesday, April 8, 2014 11:58:17 AM UTC+2, Marcio Rodrigues wrote:

Still new to ES, so I was wondering what would be the best approach.

Say you have, for example, a company with different branches throughout the
country whose documents will be stored centrally but need to be
searched separately.

Would it be better to create one index for all documents and use filters
on the queries, or to make a separate index for each branch?

By better I mean most efficient in terms of performance and resource usage.


