Hi,
I have a single-node ES (version 0.18.7) setup. Unfortunately, I didn't
change the default config much, so it has 5 shards, and now I have quite a
bit of production data stored on it (12GB). What we are seeing is reduced
throughput over time and search times sometimes as high as a few minutes. I'm
looking for some help on how to bring the situation under control, as we are
constantly indexing data and also serving realtime customer requests.
Questions:
1. Is it possible to reduce the number of shards from 5 to 2 somehow? Does
that work once the system is already in place?
2. I read somewhere that it could be due to thread pool pressure, but the
node stats (curl -XGET 'http://localhost:9200/_cluster/nodes/stats?pretty=true') is not giving
thread pool information. How do I go about identifying the root cause?
3. My throughput is around 300-400 index calls per second. How do I make it
higher?
4. If I were to optimize so that my gets and search calls are faster, is
that possible? It can be at the expense of slower index calls.
5. This is on a dual-core machine (EC2 m1.large instance) and I gave ES 4GB
of RAM. Has there been any benchmarking done on EC2 instances?
On Saturday, July 7, 2012 3:05:03 AM UTC+3, T Vinod Gupta wrote:
Hi,
I have a single-node ES (version 0.18.7) setup. Unfortunately, I didn't
change the default config much, so it has 5 shards, and now I have quite a
bit of production data stored on it (12GB). What we are seeing is reduced
throughput over time and search times sometimes as high as a few minutes. I'm
looking for some help on how to bring the situation under control, as we are
constantly indexing data and also serving realtime customer requests.
Questions:
1. Is it possible to reduce the number of shards from 5 to 2 somehow? Does
that work once the system is already in place?
2. I read somewhere that it could be due to thread pool pressure, but the
node stats (curl -XGET 'http://localhost:9200/_cluster/nodes/stats?pretty=true') is not giving
thread pool information. How do I go about identifying the root cause?
I would start by looking at BigDesk and at the logs.
3. My throughput is around 300-400 index calls per second. How do I make it
higher?
It depends a lot on what your data looks like, but increasing the
refresh_interval should always help.
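For example, something along these lines should bump it on a live index (just a rough
sketch; I haven't double-checked which settings are dynamic on 0.18.7, and "your_index"
is a placeholder for your actual index name):

curl -XPUT 'http://localhost:9200/your_index/_settings' -d '{
  "index": {
    "refresh_interval": "60s"
  }
}'

You can set it back to "1s" (the default) the same way once you need near-realtime
search again.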
4. If I were to optimize so that my gets and search calls are faster, is
that possible? It can be at the expense of slower index calls.
What do your data and searches look like?
If you find your storage slow, you might benefit from compressing your
source (there's a rough sketch of how after the upgrade notes below). I
would also try upgrading ES to a newer version. I find it faster,
although I don't have a clear benchmark to show that. Please note that
upgrading needs some care. Quote:
Upgrade Notes:
Upgrading from 0.18 requires issuing a full flush of all the indices
in the cluster (curl host:9200/_flush) before shutting down the cluster,
with no indexing operations happening after the flush.
The local gateway state structure has changed, a backup of the state
files is created when upgrading, they can then be used to downgrade back to
0.18. Don’t downgrade without using them.
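On the source compression point mentioned above, a hedged sketch of how it's typically
enabled in the mapping on this generation of ES ("your_index" and the type name "doc"
are placeholders; as far as I know it only applies to documents indexed or merged after
the change):

curl -XPUT 'http://localhost:9200/your_index/doc/_mapping' -d '{
  "doc": {
    "_source": {
      "compress": true
    }
  }
}'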
5. This is on a dual-core machine (EC2 m1.large instance) and I gave ES 4GB
of RAM. Has there been any benchmarking done on EC2 instances?
Thanks Radu.
I increased the refresh interval to 60 sec, but that didn't help. I see a bunch
of error messages in the elasticsearch.log file that look like the one below. Could
that be the reason for slow search? Now I see slowness even when there is
not much indexing happening. These messages occur two or three times a minute.
[2012-07-09 00:00:55,238][WARN ][index.merge.scheduler ] [] [facebook][3] failed to merge
java.io.IOException: Input/output error: NIOFSIndexInput(path="/media/ephemeral0/ES_data/elasticsearch/nodes/0/indices/facebook/3/index/_qicw.fdt")
        at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:180)
        at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:155)
        at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:110)
        at org.apache.lucene.store.DataOutput.copyBytes(DataOutput.java:123)
        at org.apache.lucene.index.FieldsWriter.addRawDocuments(FieldsWriter.java:216)
        at org.apache.lucene.index.SegmentMerger.copyFieldsWithDeletions(SegmentMerger.java:301)
        at org.apache.lucene.index.SegmentMerger.mergeFields(SegmentMerger.java:248)
        at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:108)
        at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4295)
        at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3940)
        at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:388)
        at org.apache.lucene.index.TrackingConcurrentMergeScheduler.doMerge(TrackingConcurrentMergeScheduler.java:88)
        at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:456)
Caused by: java.io.IOException: Input/output error
        at sun.nio.ch.FileDispatcher.pread0(Native Method)
        at sun.nio.ch.FileDispatcher.pread(FileDispatcher.java:49)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:248)
        at sun.nio.ch.IOUtil.read(IOUtil.java:224)
        at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:663)
        at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:162)
        ... 12 more
Also, I was not able to install BigDesk, due to the error below.
On Monday, July 9, 2012 5:19:06 AM UTC+3, T Vinod Gupta wrote:
Thanks Radu.
I increased the refresh interval to 60 sec, but that didn't help. I see a bunch
of error messages in the elasticsearch.log file that look like the one below. Could
that be the reason for slow search? Now I see slowness even when there is
not much indexing happening. These messages occur two or three times a minute.
I don't know what that error means, beyond what the text says (a read
error). And I don't know how that would impact performance. I mean, it must
have a performance impact, I just don't know how significant it is.
Also, I was not able to install BigDesk, due to the error below.
If the plugin install doesn't work, you can download BigDesk from GitHub,
then extract it and open index.html from the lukas-vlcek... directory.
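In case it helps, a rough sketch of that standalone route (the GitHub archive URL and
the name of the extracted directory are assumptions on my part):

# grab and unpack the BigDesk sources from GitHub
wget -O bigdesk.zip https://github.com/lukas-vlcek/bigdesk/zipball/master
unzip bigdesk.zip
# open index.html from the extracted directory (named something like
# lukas-vlcek-bigdesk-<commit>) in a browser and point it at http://<your-node>:9200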
I would really appreciate any help I can get here.
Thanks
Errors like the one below are outside of ES, but they should be interpreted as a
serious hint that the system is not able to read or write via NIO because of disk
errors, file system errors, being short on resources, etc. So I'd watch out for
messages from the system (syslog, disk damage, disk full, etc.).
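For example, a few basic checks (a hedged sketch; the mount point is taken from the
path in the stack trace, and the syslog location differs by distribution, e.g.
/var/log/messages on Amazon Linux):

# kernel messages usually show the underlying disk/IO errors
dmesg | grep -iE 'error|i/o'
# make sure the ephemeral volume holding the ES data is not full
df -h /media/ephemeral0
# look at recent syslog entries around the time of the failed merges
tail -n 200 /var/log/messages | grep -i error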
Best regards,
Jörg
Quote:
[...]
Caused by: java.io.IOException: Input/output error
        at sun.nio.ch.FileDispatcher.pread0(Native Method)
        at sun.nio.ch.FileDispatcher.pread(FileDispatcher.java:49)
        at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:248)
        at sun.nio.ch.IOUtil.read(IOUtil.java:224)
        at sun.nio.ch.FileChannelImpl.read(FileChannelImpl.java:663)
        at org.apache.lucene.store.NIOFSDirectory$NIOFSIndexInput.readInternal(NIOFSDirectory.java:162)