Hey guys,
Was wondering if you guys could help me -- I've got a single
elasticsearch node running 0.8 and am using the Java API to index a
bunch of documents. I've got around 40 separate collections of data
and spawn threads to index 10 at a time the first time my app starts
up with no indexes present. Sadly, during the middle of all this the
search server inevitably times out on one or more indexes, causing
corruption.
Here's a sample of what goes on in my logs:
2010-07-11 18:53:50,604 ERROR service.BaseSearchService$5 - [pool-1-
thread-8] Error indexing card kdtgxj
org.elasticsearch.action.PrimaryNotStartedActionException:
[board.user1][0] Timeout waiting for [1m]
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction
$AsyncShardOperationAction
$4.onTimeout(TransportShardReplicationOperationAction.java:311)
at org.elasticsearch.cluster.service.InternalClusterService
$1$1.run(InternalClusterService.java:87)
at java.util.concurrent.ThreadPoolExecutor
$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor
$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:637)
2010-07-11 18:53:50,605 INFO service.ElasticSearchService - [pool-1-
thread-8] Rebuilding index for vinntestboard.vinn01/In Process...
2010-07-11 18:53:50,604 ERROR service.BaseSearchService$5 - [pool-1-
thread-5] Error indexing card kwzjds
org.elasticsearch.action.PrimaryNotStartedActionException:
[board2.user2][1] Timeout waiting for [1m]
at
org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction
$AsyncShardOperationAction
$4.onTimeout(TransportShardReplicationOperationAction.java:311)
at org.elasticsearch.cluster.service.InternalClusterService
$1$1.run(InternalClusterService.java:87)
at java.util.concurrent.ThreadPoolExecutor
$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor
$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:637)
I'm also noticing that while searching multiple indexes at the same
time I get very inconsistent results. Trying the same terms comes
back with different documents every time. Am I doing something wrong,
or does elasticsearch really not support searching multiple indexes
simultaneously?
It's important to note that I've been working through converting an
existing application from using Lucene embedded into an app that can
switch back and forth. Most of the code I've written works great when
I use Lucene with ParallelMultiReaders, but when switching all that
out with the elasticsearch Java API, things go very badly.
Are there any hints folks can offer? Are there some things that
Lucene programmers should look out for when converting over to the
elasticsearch mindset?
Thanks,
Stephen Starkey.
P.S. Here's the code that does the index rebuild (minus the braindead
stuff that creates the maps of strings) .. I'm using the Strategy
Pattern here so I can use the same algorithm to index data with Lucene
and elasticsearch, so there are magic fingers calling the
makeRebuildJob method that shoves the Callable into a fixed thread
pool of size 10:
@Override
protected Callable makeRebuildJob(final String
absoluteName) {
return new Callable() {
public Boolean call() throws Exception {
newBoard(absoluteName);
int cardsIndexed = indexCards(absoluteName, new CardIndexer()
{
public void indexCard(Card card) throws Exception {
client.index(buildIndexRequest(card).create(true)).actionGet();
}
});
if (cardsIndexed > 0) {
client.admin().indices().optimize(optimizeRequest(absoluteName).maxNumSegments(5).refresh(true).
listenerThreaded(false)).actionGet();
}
return true;
}
};
}
private static IndexRequest buildIndexRequest(Card card) {
return
indexRequest(card.getColumn().getBoard().getAbsoluteName()).
type("card").
operationThreaded(false).
id(String.valueOf(card.getId())).
source(IndexUtil.cardToMap(card));
}