Ok, so the index aliasing feature is really cool and powerful. I was
wondering if there is any similar sort of functionality for types? The use
case is this - you have 100,000 news articles and 100,000 tweets indexed. I
want to change the mapping on the news articles (index a new field) and do a
reindex of just news articles into the new mapping. Is there a way to alias
these so that you could atomically cut from a query on the old type to a
query on the new type without having to rewrite the queries?
By the way Shay, amazing work. And not even just in code, though what is
there is really elegant, but with the whole project - the releases, the site
maintenance, responsiveness on the listserv, the whole bucket. Hats off.
I was thinking of using JMX monitoring to plot some data, like index
size, doc count, translog ops, etc. The problem I am having is that
only the index names are shown under "org.elasticsearch/indices",
not the aliases, so it's impossible to add an agent request for that
index, as the name changes from time to time (for example, when I do a
full reindex I increment the index version and swap the alias
later). Cache status would also be a nice JMX addition.
I think it must be somewhere on the roadmap. Meanwhile I can implement
an MBean myself using the status API and the like, but I would like to
know if you are planning to add it soon so I won't waste time on it
now.
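For the monitoring case above, one possible workaround is to resolve the alias to its current concrete index before asking for that index's stats. The sketch below is only an illustration: it assumes the alias listing comes back as a mapping of index name to an `aliases` object, and the index and alias names are made up.

```python
def indices_for_alias(aliases_response, alias):
    """Return the sorted concrete index names that currently carry `alias`."""
    return sorted(
        index
        for index, info in aliases_response.items()
        if alias in info.get("aliases", {})
    )


# Hand-written sample in the assumed response shape (not real cluster output).
sample = {
    "articles_v3": {"aliases": {"articles": {}}},
    "articles_v2": {"aliases": {}},
    "tweets_v1": {"aliases": {"tweets": {}}},
}

print(indices_for_alias(sample, "articles"))  # ['articles_v3']
```

A monitoring agent could run this lookup on each poll, so the plotted series follows the alias rather than a versioned index name.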
First of all, thanks! :) Regarding aliases for types, I think that if you
want to solve the scenario you mentioned, it might make sense to have the
two types as different indices. This will allow you to easily reindex only a
specific "type" (which is now an index). Deleting an index is also much more
lightweight than deleting a type (and all its docs), and you can
always search across indices.
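As a rough sketch of that suggestion: with each "type" in its own index, the cut-over becomes an ordinary index alias swap, sent as one request to the aliases endpoint. The names `news`, `news_v1`, `news_v2` and `tweets` are hypothetical, and the code only builds the request body and search path; it does not talk to a cluster.

```python
import json


def build_alias_swap(alias, old_index, new_index):
    """Return the body for an alias swap sent as a single request.

    Both actions travel together, so queries against the alias move
    from the old index to the new one in one step.
    """
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


def search_path(*names):
    """Build the URL path for a search across several indices or aliases."""
    return "/" + ",".join(names) + "/_search"


# Hypothetical names: queries use the alias "news"; "news_v1" and "news_v2"
# are the concrete indices behind it; "tweets" is a separate index.
body = build_alias_swap("news", "news_v1", "news_v2")
print(json.dumps(body, indent=2))     # body to POST to /_aliases
print(search_path("news", "tweets"))  # /news,tweets/_search
```

Queries keep using `news` before and after the swap, so they do not need to be rewritten when the reindex finishes.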
On Wed, Aug 24, 2011 at 11:30 PM, Will Ezell will@dotcms.com wrote:
Ok, so the index aliasing feature is really cool and powerful. [...]
Sorry, my example was not complete. One type per index was my first thought
too, but my issue is that there can be hundreds of types, which would mean
hundreds of indexes. Knowing a little about Lucene and OS resource
consumption, I was wary of this approach. Maybe we do the type alias idea
in code and just translate to ES, or should I not be scared of the one index
per type idea?
On Wed, Aug 24, 2011 at 6:53 PM, Shay Banon kimchy@gmail.com wrote:
First of all, thanks! :) [...]
An index per type will require more resources. You can try to reduce that
by using fewer shards per index, but even with 1 shard per index, on a single
node, that still means hundreds of shards on a node. It is still
doable, but you will need to test it (I know of people doing it).
If not, then I suggest doing it on the client side for now. It's tricky to
implement in elasticsearch right now.
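A minimal sketch of what "doing it on the client side" could look like: the application keeps its own map from a logical type name to the concrete type it should query, and flips the entry once the reindex is done. The class, the index name `content`, and the type names are all hypothetical, and the `/index/type/_search` path assumes a version that still has mapping types.

```python
import threading


class TypeAliases:
    """Client-side map from a logical type name to the concrete type to query."""

    def __init__(self):
        self._lock = threading.Lock()
        self._targets = {}

    def point(self, alias, concrete_type):
        """Point `alias` at `concrete_type`; later lookups see the new target."""
        with self._lock:
            self._targets[alias] = concrete_type

    def resolve(self, alias):
        """Return the concrete type for `alias`, or the name itself if unmapped."""
        with self._lock:
            return self._targets.get(alias, alias)

    def search_path(self, index, alias):
        """Build the search path for `alias` inside `index`."""
        return "/{}/{}/_search".format(index, self.resolve(alias))


aliases = TypeAliases()
aliases.point("news", "news_v1")
print(aliases.search_path("content", "news"))  # /content/news_v1/_search

# After reindexing only the news articles into a new type, flip the pointer.
aliases.point("news", "news_v2")
print(aliases.search_path("content", "news"))  # /content/news_v2/_search
```

The switch is only atomic within one process. With several clients, the mapping would have to live somewhere they all read from, which is part of why this is harder than an index alias.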
On Thu, Aug 25, 2011 at 1:50 PM, Will Ezell will@dotcms.com wrote:
Sorry, my example was not complete. [...]