Just starting out with ES, and I have a question about the purpose of
indexes.
In my use case, I will be searching millions of U.S. court cases. These
cases come from multiple courts--e.g., the U.S. Supreme Court, the First
Circuit Court of Appeals, Second Circuit... Iowa State Supreme Court...
etc. Each case will have the same fields, so mapping for each document will
not differ.
Sometimes my users will want to search across all courts. Other times,
they'll want to just search specific courts -- for example, only courts in
California, or only the Supreme Court.
I was originally thinking of having just one index, and making the court a
(faceted?) field. But I'm wondering if this is an appropriate use of
multiple indexes? If so, what do I gain by having more than one index? Is
it performance-related, or will it somehow be better for me as the
developer?
With multiple indices, you can set different sharding and replication
values for each individual index if the size varies greatly. I doubt your
volume of data differs by several orders of magnitude from court to court.
Another benefit is the ability to close a particular index if it is no
longer needed. Once again, not your use case.
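For illustration, here is a minimal sketch of what the one-index approach could look like when searching either all courts or only some of them. The field names `body` and `court`, the court codes, and the helper name are assumptions made up for this example, not anything from the thread. The request-body syntax also differs between Elasticsearch versions, so check it against the version you run.

```python
import json


def build_case_query(text, courts=None):
    """Build a search request body for a single index holding every court.

    `text` is matched against a hypothetical full-text field named `body`.
    `courts`, if given, restricts hits to those values of a hypothetical
    exact-match field named `court`. With no courts, all courts are searched.
    """
    bool_query = {"must": [{"match": {"body": text}}]}
    if courts:
        # Filter clauses narrow the result set without affecting scoring.
        bool_query["filter"] = [{"terms": {"court": list(courts)}}]
    return {"query": {"bool": bool_query}}


# Search across all courts: no filter clause at all.
all_courts = build_case_query("habeas corpus")

# Search only selected courts (the codes here are made up).
california_only = build_case_query("habeas corpus", courts=["cal", "calctapp"])

print(json.dumps(california_only, indent=2))
```

The same index serves both kinds of search; only the optional filter clause changes.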
Thanks very much, Ivan! I'll proceed with one index then.
--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
Also, something to keep in mind: if a particular court ever does grow by an
order of magnitude, you can use aliases to transparently route that court
to its own index, allowing you to fine-tune its shard allocation.
-Zach
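As a rough sketch of the alias idea, the helpers below build request bodies for the index aliases endpoint (`POST /_aliases`). The index names, alias names, the `court` field, and the helper names are all hypothetical. The first body defines a filtered alias so that a court looks like its own index while still living in the shared one. The second repoints that alias to a dedicated index once the court's data has been moved there.

```python
def filtered_alias_body(alias, index, court):
    """Alias that exposes one court's documents from the shared index."""
    return {
        "actions": [
            {
                "add": {
                    "index": index,
                    "alias": alias,
                    # Only documents whose `court` field matches are visible.
                    "filter": {"term": {"court": court}},
                }
            }
        ]
    }


def repoint_alias_body(alias, old_index, new_index):
    """Move an alias from the shared index to a dedicated index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Start out with everything in one index (names are made up).
start = filtered_alias_body("cases-scotus", "cases", "scotus")

# Later, if that court outgrows the shared index, switch the alias over.
later = repoint_alias_body("cases-scotus", "cases", "cases-scotus-v1")
```

Clients that search through the alias name would not need to change when the alias is repointed. Copying the documents into the new index is a separate step that this sketch does not cover.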
On Thursday, April 11, 2013 10:48:23 AM UTC-4, Jacob Heller wrote:
Thanks very much, Ivan! I'll proceed with one index then.
On Wed, Apr 10, 2013 at 5:15 PM, Ivan Brusic wrote:
I would use one index, IMHO.
Cheers,
Ivan
On Wed, Apr 10, 2013 at 9:10 AM, Jacob Heller wrote:
Hi all,
Just starting out with ES, and I have a question about the purpose of
indexes.
Thanks,
Jake