Ok, I think I understand, based on some other things I found. Basically,
read the records out of HBase and form some type of document (JSON, etc.).
Once the document is created, post it to the Elasticsearch cluster via the
Java API.
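For what it's worth, here is a rough sketch of that flow as I understand it. It is not the native Java client: to keep it self-contained it swaps in plain HTTP (Java 11's built-in HttpClient) against the REST endpoint. The class name HBaseRowIndexer, the column names, and the index/type/id URL layout are all assumptions for illustration, and the row is a plain Map standing in for whatever the HBase scan returns.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: turns one HBase-style row into JSON and sends it to ES over HTTP.
public class HBaseRowIndexer {

    // Escape the characters that must be escaped inside a JSON string.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    // Build a flat JSON object from one row (column name -> value as text).
    static String toJson(Map<String, String> columns) {
        StringBuilder sb = new StringBuilder("{");
        boolean first = true;
        for (Map.Entry<String, String> e : columns.entrySet()) {
            if (!first) sb.append(',');
            sb.append('"').append(escape(e.getKey())).append("\":\"")
              .append(escape(e.getValue())).append('"');
            first = false;
        }
        return sb.append('}').toString();
    }

    // PUT the document to <baseUrl>/<index>/<type>/<id> and return the HTTP status.
    // Assumes the id (e.g. the HBase row key) is already URL-safe.
    static int index(String baseUrl, String index, String type, String id, String json)
            throws Exception {
        HttpRequest request = HttpRequest
            .newBuilder(URI.create(baseUrl + "/" + index + "/" + type + "/" + id))
            .header("Content-Type", "application/json")
            .PUT(HttpRequest.BodyPublishers.ofString(json))
            .build();
        return HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.discarding())
            .statusCode();
    }

    public static void main(String[] args) {
        // Stand-in for one row read from HBase; the column names are made up.
        Map<String, String> row = new LinkedHashMap<>();
        row.put("info:title", "indexing in Hadoop");
        row.put("info:lang", "en");
        System.out.println(toJson(row));
    }
}
```

With a node running, something like index("http://localhost:9200", "myindex", "doc", "row-1", toJson(row)) would send the document; the host, index and type names there are placeholders too. For a large table you would want to batch rows rather than send one request per row.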
On Sun, Nov 20, 2011 at 9:43 PM, elasticsearch@googlegroups.com wrote:
Today's Topic Summary
Group: http://groups.google.com/group/elasticsearch/topics
- Gateway Snapshot settings & unexpected behaviour (maybe bug?) [1 Update]
- Sort on a deeply nested field? [1 Update]
- Caveats of a large mapping [1 Update]
- Thrift transport hangs when requesting bogus URL? [1 Update]
- indexing in Hadoop [1 Update]
Gateway Snapshot settings & unexpected behaviour (maybe bug?)
http://groups.google.com/group/elasticsearch/t/99c4a93467ad5f4d
Paul Smith tallpsmith@gmail.com Nov 21 09:18AM +1100
gateway was simpler, it's not really needed with shared gateway (like
the fs shared one you use), and disabling the snapshot interval should
be enough for it.
Ok, thanks, that makes sense.
What you say is that when you set the snapshot interval to 0, a
snapshot still happens?
Yes, when set to 0, periodic/timed snapshots stop happening, but as
soon as we index something the snapshot happens (see the log gist). We
tried experimenting with setting the snapshot interval to a large
number, and that does work EXCEPT that when resetting the interval
value one has to wait for the larger interval value to complete before
the new setting takes effect. This is presumably because the thread
sleeps until that larger interval value and isn't woken up when the
configuration changes.
Should I write up a bug report that snapshot_interval=0 doesn't work?
Regarding the settings, the one returned for the get settings API are
only the ones explicitly set, it does not return settings with
"default" values.
Is there any way other than looking at the docs to then interpret what
a particular setting is configured to then?
Sort on a deeply nested field?
http://groups.google.com/group/elasticsearch/t/4369e5f38812899
Nick Hoffman nick@deadorange.com Nov 20 01:58PM -0800
The "properties" is a keyword that's used in my app and that we
decided to use in ES, too.
Any idea why the error in that gist is occurring? I've been wracking
my brain, but can't see anything wrong with the mapping, document, or
query.
Why does sorting fail here? · GitHub
Caveats of a large mapping
http://groups.google.com/group/elasticsearch/t/73919c6613703fba
Nick Hoffman nick@deadorange.com Nov 20 01:56PM -0800
Thanks for the clarification!
Thrift transport hangs when requesting bogus URL?
http://groups.google.com/group/elasticsearch/t/f16682529bf38e00
"Matthew A. Brown" mat.a.brown@gmail.com Nov 20 03:39PM -0500
I don't have it in front of me, but I think it was a GET request for
the URL "/bogus"
indexing in Hadoop
http://groups.google.com/group/elasticsearch/t/a6d66db4a8385012
Otis Gospodnetic otis.gospodnetic@gmail.com Nov 19 07:38PM -0800
Hello,
I think you should think about this a little differently. For
example, think about sending documents formed from data in HBase
directly to ES via its API instead of thinking how to index with
Lucene. When you do that, you'll learn answers to your questions as
you learn about using the ES API to index data.
In terms of reading data from HBase, you could start by looking at
HBase's Export MR job.
For indexing to ES:
Elasticsearch Platform — Find real-time answers at scale | Elastic
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Hadoop - HBase
Lucene ecosystem search :: http://search-lucene.com/