Put each shard on a different disk

I am trying to set up an ES server (one node only) with 4 shards. Is it
possible to put the 4 shards on 4 different disks? How?

Thanks,
Ming-

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Hello Ming,

The only way I can think of is to create the index, shut down
Elasticsearch, copy each shard's directory (e.g.
/var/lib/elasticsearch/elasticsearch/nodes/0/indices/test/$SHARD_NUMBER/)
to its own disk, and then mount each disk on the corresponding shard
directory. After restarting, Elasticsearch should use the 4 disks.

I've never tried it and have no idea whether it works, but I guess it's
worth trying :)
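
As a rough illustration of the pairing described above, here is a minimal Python sketch that only prints the mount commands it would run (a dry run, nothing is executed). The device names are hypothetical, the index directory is the example path from this message, and the shard data would still have to be copied onto each disk first, with Elasticsearch stopped. Like the idea itself, this is untested against a real node.

```python
from pathlib import PurePosixPath

# Example index directory taken from the message above; adjust for your node.
INDEX_DIR = PurePosixPath("/var/lib/elasticsearch/elasticsearch/nodes/0/indices/test")


def mount_plan(devices):
    """Pair shard N's directory with the N-th device in the list."""
    return [(dev, INDEX_DIR / str(shard)) for shard, dev in enumerate(devices)]


# Hypothetical device names, one per shard.
plan = mount_plan(["/dev/xvdb", "/dev/xvdc", "/dev/xvdd", "/dev/xvde"])
for dev, shard_dir in plan:
    # Dry run: print the command instead of executing it.
    print(f"mount {dev} {shard_dir}")
```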

Best regards,
Radu

http://sematext.com/ -- Elasticsearch -- Solr -- Lucene

You can specify multiple data directories, e.g. path.data=["path1", "path2"].

This will put shards on different disks if you configure it to point to
them. It will use the least used one.
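
To make the setting concrete, here is a small Python sketch that renders a path.data line for a list of mount points. The mount points are hypothetical, and the list syntax is assumed to match the form shown above; check it against the documentation for your Elasticsearch version before relying on it.

```python
import json


def path_data_setting(directories):
    """Render a path.data line listing one data directory per disk."""
    # json.dumps produces a double-quoted list, which is also a valid YAML flow sequence.
    return "path.data: " + json.dumps(list(directories))


# Hypothetical mount points, one per disk.
setting = path_data_setting(["/mnt/disk1", "/mnt/disk2", "/mnt/disk3", "/mnt/disk4"])
print(setting)
# path.data: ["/mnt/disk1", "/mnt/disk2", "/mnt/disk3", "/mnt/disk4"]
```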

simon

Looking deeper at what Simon suggested: that is how you can have multiple
data directories (see http://www.elasticsearch.org/guide/reference/setup/dir-layout/).
Files get distributed across those directories according to the
index.store.distributor setting, which can be set to least_used (selects
the directory with the most available space) or random. But the
distribution happens per file, not per shard (Lucene index), so it is not
possible to control where each whole shard is stored.
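
A toy Python simulation may help show why per-file distribution spreads a single shard over several disks. This is not Elasticsearch's actual code: it only mimics the least_used rule as described above (pick the directory with the most available space), and the free-space figures and file sizes are made up.

```python
def place_files(free_space, file_sizes):
    """Assign each file to the directory that currently has the most free space."""
    free = list(free_space)
    placement = []
    for size in file_sizes:
        # On ties, max() keeps the first directory it sees.
        target = max(range(len(free)), key=lambda i: free[i])
        free[target] -= size
        placement.append(target)
    return placement


# Four equally empty directories; one shard writes four files of equal size.
placement = place_files([100, 100, 100, 100], [10, 10, 10, 10])
print(placement)
# [0, 1, 2, 3]
```

Each file of the same shard lands in a different directory, so the shard as a whole ends up on all four disks rather than on one.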

It would be interesting to know more about the use case here. Could you
elaborate a bit on it? And one question: do you just want to put every
shard in a different directory, or would you also like to control where
each shard goes?

Hi Luca,

In this particular case, I intend to use ES as a kind of data archive for
a few TB of text, and I want to run keyword queries from time to time to
get related info. I intend to use 8 shards. The server I set up (Amazon
EC2) has 8 CPU cores and eight 1 TB disks. I'd like to put one shard on
each disk.

From what I have learned about ES, it seems it can distribute storage
across multiple disks, like RAID0, but it cannot control which shard goes
where. As a result, the data for each shard would be spread across all
eight disks. True?

Ming-

Hi Ming,
yes, you got it right: with multiple data directories, the distribution
happens on a per-file basis.

Instead of fiddling with shards, paths, reallocation, etc., it is much
easier to use mdadm to create a single RAID0 on EC2 and run Elasticsearch
with the default settings. As a bonus, with 8 disks reads can be up to 8x
faster in the ideal case.

For example, as described here:
http://www.gabrielweinberg.com/blog/2011/05/raid0-ephemeral-storage-on-aws-ec2.html
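
For orientation, here is a minimal Python sketch that prints the kind of commands such a setup involves, without running them. The device names, the array name /dev/md0, the ext4 filesystem and the mount point are all assumptions for illustration; follow the linked article for the real procedure on your instance.

```python
import shlex


def raid0_commands(devices, array="/dev/md0", mount_point="/var/lib/elasticsearch"):
    """Build (but do not run) the commands for one RAID0 array over all devices."""
    return [
        ["mdadm", "--create", array, "--level=0",
         f"--raid-devices={len(devices)}", *devices],
        ["mkfs.ext4", array],            # assumed filesystem choice
        ["mount", array, mount_point],   # assumed Elasticsearch data location
    ]


# Hypothetical names for the eight disks: /dev/xvdb through /dev/xvdi.
devices = [f"/dev/xvd{chr(ord('b') + i)}" for i in range(8)]
commands = raid0_commands(devices)
for cmd in commands:
    # Dry run: print each command so it can be reviewed before use.
    print(shlex.join(cmd))
```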

Jörg

Or this: http://aws.amazon.com/articles/Amazon-EC2/1074

Jörg
