I have Elasticsearch installed on two Windows 2012 servers (2 cores each,
8GB of RAM each, 4GB ES_HEAP each) with mostly default settings, and I have
been trying to optimize it for bulk indexing.
I'm reading 500k rows at a time from a database, at about 3 minutes per
read, and then calling the bulk API with a batch size of 5k in a new thread
so I can index while making the next read. I'm running this process on one
of the Elasticsearch nodes, which also gets marked as master.
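
A minimal sketch of the batching step described above, in Python purely for illustration (the language of the actual indexer is not stated). The helper names `batches` and `bulk_body`, the index name `my_index`, and the `id` field on each row are hypothetical. Older Elasticsearch versions may also expect a `_type` in the action line, which is left out here.

```python
import json
from itertools import islice


def batches(rows, size=5000):
    """Yield successive lists of at most `size` rows from any iterable."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def bulk_body(chunk, index="my_index"):
    """Build the newline-delimited JSON body for one _bulk request."""
    lines = []
    for row in chunk:
        # Action line: index this document under an explicit _id, so that
        # re-sending the same row overwrites it instead of duplicating it.
        lines.append(json.dumps({"index": {"_index": index, "_id": row["id"]}}))
        # Source line: the document itself.
        lines.append(json.dumps(row))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"
```

Each 500k-row read would then go through `batches(rows)` and produce 100 bulk bodies of 5k documents each.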
I'm experiencing a whole slew of problems.
- Swap space is 5-6GB on each node - I'm not sure what this means on
Windows, but I've disabled the page file and that didn't help.
- Some data isn't getting indexed. I'll run it pointing to one index, and
the count doesn't match up with running it against a new index. Retrying
sometimes solves this.
- The data randomly disappears. At various times it claims I have
anywhere from 500k documents to 80M documents, even when both nodes are up.
- I have the index set to two shards. Both shards were initially placed on
the same node, but at random one of them moves to the other node.
- Nodes seem to get unbootstrapped fairly frequently, which results in
loss of data as well.
- ElasticHQ claims, again at various stages, that Elasticsearch has
deleted the missing documents. Sometimes these documents mysteriously get
undeleted and show up again in search.
- I've tried to stop these issues by refreshing the index, then stopping
the bulk indexer app, then restarting Elasticsearch on both nodes, which
has also resulted in gigabytes of data loss.
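
One way to narrow down the mismatched counts from the list above is to inspect each bulk response for rejected items rather than assuming the whole batch was indexed. This is a sketch only, in Python for illustration. It assumes the response is the parsed JSON from a `_bulk` call with an `items` list in which each entry carries a `status` and, on failure, an `error`. The function name `failed_items` is hypothetical.

```python
def failed_items(bulk_response):
    """Return (doc_id, status, error) for every item the bulk call rejected."""
    failures = []
    for item in bulk_response.get("items", []):
        # Each item has a single key naming the action
        # (index, create, update or delete).
        for result in item.values():
            status = result.get("status", 200)
            if status >= 300 or "error" in result:
                failures.append((result.get("_id"), status, result.get("error")))
    return failures
```

Any IDs returned here could be logged and retried, which would show whether the missing documents were ever accepted in the first place.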
What am I doing wrong here? Elasticsearch is completely unusable from my
perspective. These issues aren't even acceptable in a development
environment, let alone a production environment.