Sizing RAM vs DISK


(Stanislas Polu) #1

Hi,

After stuffing a large amount of data into a pre-production cluster made of
7 shards, replication 1 on 3 machines, I have a better understanding of how
my index will grow.
What I am still missing is how much memory vs disk space I'm supposed to
provision for acceptable performance.

Is there any rule of thumb here?

Best,

-stan

--
Stanislas Polu
Mo: +33 6 83 71 90 04 | Tw: @spolu | http://teleportd.com | Realtime Photo
Search


(Stanislas Polu) #2

I'm using a disk-based index with a local gateway.

-stan


(Karussell) #3

What do you mean by disk size? You'll need as many GB of disk as your
documents take up :) so it depends on your document size and document
count.
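
As a rough illustration of the "document size times document count" point, here is a minimal sketch of a back-of-the-envelope disk estimate. The function name, the `overhead` factor, and the example numbers are assumptions made for illustration, not figures from this thread; the real on-disk size depends on mappings, stored fields, and merges, so it is better to measure it on a sample of your own data.

```python
def estimate_disk_bytes(doc_count: int, avg_doc_bytes: int,
                        replicas: int = 1, overhead: float = 1.0) -> int:
    """Rough disk estimate for an index, summed over all copies.

    overhead is a hypothetical multiplier for index structures relative
    to the raw document size; 1.0 means "same size as the raw documents".
    """
    copies = 1 + replicas  # one primary copy plus each replica copy
    return int(doc_count * avg_doc_bytes * overhead * copies)


if __name__ == "__main__":
    # Hypothetical numbers: 1 million documents of about 2 kB, replication 1.
    total = estimate_disk_bytes(1_000_000, 2_000, replicas=1)
    print(f"{total / 1e9:.1f} GB across the cluster")
```

Dividing the result by the number of machines gives a per-node figure, assuming shards are spread evenly.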

Regarding RAM, try to allocate only half of the system's total RAM, so
that the rest remains available for the OS cache.
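
The half-and-half rule above can be sketched as a small helper. The function name and the 16 GB example are assumptions for illustration; the 0.5 default simply encodes the rule of thumb from this post and is not a tuned value.

```python
def split_ram(total_ram_gb: float, heap_fraction: float = 0.5) -> tuple[float, float]:
    """Split system RAM into (heap_gb, os_cache_gb).

    heap_fraction defaults to 0.5, i.e. half for the search process and
    the other half left to the operating system's file cache.
    """
    if not 0.0 < heap_fraction < 1.0:
        raise ValueError("heap_fraction must be strictly between 0 and 1")
    heap_gb = total_ram_gb * heap_fraction
    return heap_gb, total_ram_gb - heap_gb


if __name__ == "__main__":
    # Hypothetical machine with 16 GB of RAM.
    heap, os_cache = split_ram(16)
    print(f"heap: {heap} GB, left for OS cache: {os_cache} GB")
```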

Peter.


(Stanislas Polu) #4

Agreed on the disk. Thanks for the clarification and the advice on the RAM.

  • Does the _status indices.xxx.index.size include everything?
  • Does more RAM mean better perf?

-stan


(Shay Banon) #5

Use node stats to see where memory is spent. Mainly look at the field data
cache (sort and facet related) and the JVM memory used. This will indicate
whether you are running low on memory. Use the bigdesk plugin to
visualize it.
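
For anyone who wants the same numbers without a plugin, here is a minimal sketch that summarizes an already-fetched node stats response. The field paths (`jvm.mem.heap_used_in_bytes`, `jvm.mem.heap_max_in_bytes`, `indices.fielddata.memory_size_in_bytes`) are assumed to follow the layout of recent Elasticsearch node stats responses; releases from the time of this thread used different paths, so check them against your version. The sample payload is made up for illustration.

```python
from typing import Any


def summarize_memory(node_stats: dict[str, Any]) -> dict[str, dict[str, float]]:
    """Report heap usage and field data size as percentages of max heap, per node."""
    summary: dict[str, dict[str, float]] = {}
    for node_id, node in node_stats.get("nodes", {}).items():
        mem = node.get("jvm", {}).get("mem", {})
        heap_used = mem.get("heap_used_in_bytes", 0)
        heap_max = mem.get("heap_max_in_bytes", 0)
        fielddata = (
            node.get("indices", {}).get("fielddata", {}).get("memory_size_in_bytes", 0)
        )
        # Fall back to the node id when the response carries no node name.
        summary[node.get("name", node_id)] = {
            "heap_used_pct": round(100 * heap_used / heap_max, 1) if heap_max else 0.0,
            "fielddata_pct_of_heap": round(100 * fielddata / heap_max, 1) if heap_max else 0.0,
        }
    return summary


if __name__ == "__main__":
    GIB = 1024 ** 3
    # Hypothetical response with a single node.
    sample = {
        "nodes": {
            "abc123": {
                "name": "node-1",
                "jvm": {"mem": {"heap_used_in_bytes": 6 * GIB,
                                "heap_max_in_bytes": 8 * GIB}},
                "indices": {"fielddata": {"memory_size_in_bytes": 2 * GIB}},
            }
        }
    }
    print(summarize_memory(sample))
```

A heap that stays close to its maximum, or a field data cache that takes a large share of it, is the kind of signal described above.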


(Stanislas Polu) #6

Thanks!

bigdesk is awesome.

Cheers,

-stan

