# Breaking Change: Major gateway refactoring, and improved throttling

(Shay Banon) #1

Hi,

I am going to push (pretty soon) a major rewrite of the gateway module
and improved throttling support. The gateway change is a breaking change,
meaning that the new version will not be able to recover from a 0.9 gateway.
I will provide an upgrade script for the file based gateway; the s3 based
gateway will require reindexing, though, potentially, that script can be
adjusted to support it.

Let me explain some of the changes. The first is throttling support. In
0.9, recoveries are throttled on a specific node in order to reduce
the load on that node. The throttling was done at the node level, after a
shard had been allocated to it. Maintaining the count of current recoveries
there is quite tricky because of the complexity of the recovery process.
This has now been refactored into a better place, which is the actual
allocation algorithm that runs and shuffles shards around.
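
To illustrate the idea of throttling inside the allocation step, here is a minimal sketch in Python (not elasticsearch's actual Java code). It assumes a simple per-node cap on concurrent recoveries; the function name `allocate` and the parameter `max_recoveries_per_node` are hypothetical and do not come from the post.

```python
from collections import Counter


def allocate(unassigned_shards, nodes, recovering, max_recoveries_per_node=1):
    """Assign shards to nodes, skipping nodes already at the recovery cap.

    recovering maps node -> number of recoveries currently in flight.
    Returns (assignments, still_unassigned).
    """
    recovering = Counter(recovering)  # copy, so the caller's data is untouched
    assignments = []
    still_unassigned = []
    for shard in unassigned_shards:
        # Only nodes below the cap (a hypothetical setting) are candidates.
        candidates = [n for n in nodes if recovering[n] < max_recoveries_per_node]
        if not candidates:
            # Throttled: leave the shard for a later allocation round.
            still_unassigned.append(shard)
            continue
        # Pick the least busy candidate; ties go to the first node listed.
        node = min(candidates, key=lambda n: recovering[n])
        assignments.append((shard, node))
        recovering[node] += 1
    return assignments, still_unassigned
```

Because the cap is checked before a shard is handed to a node, the node itself never has to count recoveries that are already under way.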

The more interesting change is the gateway. There were several problems
with how the gateway works today that were exposed by a user of
elasticsearch that stores 4TB of data (several indices, each with 10 shards
and 2 replicas, which sums it up to 12TB). This has uncovered some problems
with the current design, specifically how md5 checksums are computed (and
the time it takes to compute them on the local storage on ec2), as well as
other possibilities for gateway corruption under this load. Of course,
elasticsearch's aim is to be able to store much more data than that; we are
getting there...

In general, the new implementation works (in spirit) in the same manner
git works. Each snapshot is a commit point: files are stored in the gateway
under an auto generated name, and finally, a commit point is written with
the "directory", which maps each pseudo name to its physical name and
size. The new design allows for more resiliency when it comes to corruption.
It also allows for exciting future features like saving a commit point and
restoring from it, or automatically creating a commit point each day for the
last 5 days and being able to roll back to a specific commit point.
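
The commit point idea can be sketched roughly as follows, in Python and using an in-memory stand-in for the gateway. This is an illustration under stated assumptions, not the real implementation: the class names `Gateway` and `CommitPoint`, the `__<n>` naming of blobs, and the rule that a file with the same name and size is reused from the previous commit point are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class CommitPoint:
    version: int
    # The "directory": physical file name -> (blob name in the gateway, size)
    files: dict = field(default_factory=dict)


class Gateway:
    """In-memory stand-in for a blob store based gateway (hypothetical)."""

    def __init__(self):
        self.blobs = {}       # blob name -> bytes
        self.commits = []     # commit points, oldest first
        self._next = count()  # source of auto generated blob names

    def snapshot(self, shard_files):
        previous = self.commits[-1].files if self.commits else {}
        files = {}
        for physical_name, data in shard_files.items():
            entry = previous.get(physical_name)
            if entry is not None and entry[1] == len(data):
                # Assumption: index files are write-once, so same name and
                # size means the blob is already in the gateway.
                files[physical_name] = entry
            else:
                blob_name = f"__{next(self._next)}"
                self.blobs[blob_name] = data
                files[physical_name] = (blob_name, len(data))
        # The commit point is recorded last, after all blobs are stored.
        commit = CommitPoint(version=len(self.commits), files=files)
        self.commits.append(commit)
        return commit

    def restore(self, version):
        commit = self.commits[version]
        return {name: self.blobs[blob] for name, (blob, _size) in commit.files.items()}
```

Since every commit point carries its own full directory, restoring an older one only means reading the blobs it lists; newer blobs are simply ignored.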

The aim is to create a gateway storage that is going to be the final
version, and resilient to future changes. It takes some time to get there,
but once we are there, I can safely stand behind using elasticsearch as the
main storage as well as releasing v1.0 (I think elasticsearch has enough
features for 1.0, just the stability of the gateway is needed).

I would love for people to take this for a ride and check it out. The
next version, as a result of that, is going to be 0.10, and I will release it
in the following days.

-shay.banon

(talsalmona) #2

Very nice feature indeed.
Will this also allow for index level commit points?

Tal

(Shay Banon) #3

Yes, though a whole "workflow" level of APIs and support is still needed for
that. But the basics are there in the gateway.

-shay.banon

(Grant Rodgers) #4

I've been testing with the new stuff, seems cool. I saw this error
while indexing, not sure if it's useful or not but I thought I'd pass
it on:

    [16:55:30,719][WARN ][index.gateway ] [Ankhi] [versions][1] failed to snapshot (scheduled)
    org.elasticsearch.index.gateway.IndexShardGatewaySnapshotFailedException: [versions][1] For input string: "_1.swp"
        at org.elasticsearch.index.gateway.blobstore.BlobStoreIndexShardGateway.snapshot(BlobStoreIndexShardGateway.java:152)
        at org.elasticsearch.index.gateway.IndexShardGatewayService$2.snapshot(IndexShardGatewayService.java:232)
        at org.elasticsearch.index.gateway.IndexShardGatewayService$2.snapshot(IndexShardGatewayService.java:227)
        at org.elasticsearch.index.engine.robin.RobinEngine.snapshot(RobinEngine.java:426)
        at org.elasticsearch.index.shard.service.InternalIndexShard.snapshot(InternalIndexShard.java:372)
        at org.elasticsearch.index.gateway.IndexShardGatewayService.snapshot(IndexShardGatewayService.java:227)
        at org.elasticsearch.index.gateway.IndexShardGatewayService$SnapshotRunnable.run(IndexShardGatewayService.java:320)
        441)
    Caused by: java.lang.NumberFormatException: For input string: "_1.swp"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
        at java.lang.Long.parseLong(Long.java:410)
        at org.elasticsearch.index.gateway.blobstore.BlobStoreIndexShardGateway.findLatestFileNameGeneration(BlobStoreIndexShardGateway.java:809)
        at org.elasticsearch.index.gateway.blobstore.BlobStoreIndexShardGateway.doSnapshot(BlobStoreIndexShardGateway.java:169)
        at org.elasticsearch.index.gateway.blobstore.BlobStoreIndexShardGateway.snapshot(BlobStoreIndexShardGateway.java:142)
        ... 15 more

(Grant Rodgers) #5

Oh btw that error was with 80c7135

(Shay Banon) #6

Strange, not sure how this file ended up in the gateway; I can't see where
elasticsearch would write it. It only writes __xxx files (with no extension)
and commit- files. I will fix it to ignore files that don't conform to the
format, but we should try and understand where it's coming from...

-shay.banon
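
A fix along those lines might look like the sketch below. It is written in Python rather than the project's Java, and it assumes that data blobs are named `__` followed by a decimal generation number; the post only says the files look like `__xxx`, so the exact format, the function name, and the sample file names are assumptions.

```python
import re

# Assumption: a conforming data blob is "__" followed by decimal digits only.
BLOB_NAME = re.compile(r"__([0-9]+)")


def find_latest_generation(names):
    """Return the highest generation among conforming names, or -1 if none."""
    latest = -1
    for name in names:
        match = BLOB_NAME.fullmatch(name)
        if match is None:
            # Not written by the gateway (editor swap file, commit point,
            # anything else): skip it instead of failing the snapshot.
            continue
        latest = max(latest, int(match.group(1)))
    return latest
```

Matching the whole name up front means a stray file is never handed to the number parser, which is where the `NumberFormatException` in the trace above came from.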

(Grant Rodgers) #7

Oh you know it was probably a vi swap file. I was taking a look at one
of the commit logs, and it might have tried to snapshot while it was
open.

I think you have committed another change since 80c7135 that ignores
files elasticsearch didn't create. I'll build the latest head and try
viewing a commit snapshot again.

(Grant Rodgers) #8

I think it's fixed in head; I didn't see this error again when repeating
the test from my previous message.

(Shay Banon) #9

Great, thanks for validating that. I did not expect someone to open vi on
the gateway.

(Grant Rodgers) #10

I was very curious to see what the file format was!


(Shay Banon) #11

Indeed. All the __xxx files are binary files (either index files or
transaction log parts). The commit-N file is a JSON file that provides meta
information on the commit point.
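
As a rough illustration of the naming convention described in this thread (blob files named `__xxx` with no extension, commit points named `commit-N`, anything else ignored), here is a minimal Python sketch. The function name `classify_gateway_file` and the exact patterns are assumptions made for illustration; the character set of the generated names is not specified in this thread, and this is not elasticsearch's actual code.

```python
import re

# Assumed patterns, inferred from the thread: blobs look like "__xxx" with no
# dot/extension; commit points look like "commit-N". The real generated names
# may use a different alphabet (hypothetical).
BLOB_NAME = re.compile(r"^__[0-9a-zA-Z]+$")
COMMIT_NAME = re.compile(r"^commit-[0-9a-zA-Z]+$")


def classify_gateway_file(name: str) -> str:
    """Return 'blob', 'commit', or 'ignored' for a file name in the gateway."""
    if BLOB_NAME.match(name):
        return "blob"
    if COMMIT_NAME.match(name):
        return "commit"
    # Anything else (for example an editor swap file) is skipped.
    return "ignored"


if __name__ == "__main__":
    for name in ["__1a", "commit-3", ".commit-3.swp", "__1a.tmp"]:
        print(name, "->", classify_gateway_file(name))
```

Skipping unknown names, rather than failing on them, matches the fix discussed in the thread, where a stray editor swap file in the gateway directory caused an error.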


(ppearcy) #12

Hi Shay,
Just curious, is the conversion script for the file-based gateway
available? I haven't searched through Git, so maybe it is there?

Spent the weekend building up 20mil docs and would prefer not to
repeat this.

Thanks,
Paul


(Shay Banon) #13

Yes, here it is: http://gist.github.com/546494.

-shay.banon


(Kenneth Loafman) #14

Any hints for how to upgrade an S3 gateway?

...Ken


(Shay Banon) #15

You can check what I do in the script, it's pretty simple. The problem is
that S3 does not have a rename method, though maybe a copy command can be
used, but I'm not sure how long it takes for large files. You can replace the
file-based operations with S3 ones.

-shay.banon
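
The copy-based idea above can be sketched as follows. This is only an illustration under stated assumptions: `ObjectStore` is a hypothetical in-memory stand-in for a bucket, and `rename_via_copy` is a made-up helper name. A real migration would swap in the copy and delete calls of whichever S3 client is in use, and the copy time for large objects remains the open question raised above.

```python
class ObjectStore:
    """Hypothetical in-memory stand-in for a bucket without a rename call."""

    def __init__(self):
        self.objects = {}

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data

    def copy(self, src: str, dst: str) -> None:
        self.objects[dst] = self.objects[src]

    def delete(self, key: str) -> None:
        del self.objects[key]


def rename_via_copy(store: ObjectStore, src: str, dst: str) -> None:
    """Emulate rename: copy to the new key, then delete the old one."""
    if src == dst:
        return  # nothing to do, and deleting would lose the object
    store.copy(src, dst)
    # Delete only after the copy succeeded, so a failure leaves the source intact.
    store.delete(src)


if __name__ == "__main__":
    store = ObjectStore()
    store.put("old-name", b"index data")
    rename_via_copy(store, "old-name", "__0")
    print(sorted(store.objects))
```

Copying before deleting means an interrupted run leaves the old object in place, so the step can be retried.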

``````> > > > elasticsearch aim
> > > > > > > is
> > > > > > > > > to
> > > > > > > > > > be able to store much more data than that, we
are getting
> > > > there... .
>
> > > > > > > > > >    In general, the new implementation works (in
spirit) in
> > the
> > > > same
> > > > > > > > > manner
> > > > > > > > > > git works. Each snapshot is a commit point, that
stores
> > files
> > > > in the
> > > > > > > > > gateway
> > > > > > > > > > into an auto generated name, and finally, a
commit point is
> > > > written
> > > > > > > with
> > > > > > > > > the
> > > > > > > > > > "directory" which maps between this pseudo name
to physical
> > > > name, and
> > > > > > > the
> > > > > > > > > > size. The new design allows for more resiliency
when it
> > comes
> > > > to
> > > > > > > > > corruption.
> > > > > > > > > > It also allows for exciting future features like
saving a
> > > > commit
> > > > > > > point
> > > > > > > > > and
> > > > > > > > > > restoring from it, or automatically create a
commit point
> > each
> > > > day
> > > > > > > for
> > > > > > > > > the
> > > > > > > > > > last 5 days and be able to rollback to a
specific commit
> > point.
>
> > > > > > > > > >    The aim is to create a gateway storage that
is going to
> > be
> > > > the
> > > > > > > final
> > > > > > > > > > version, and resilient for future changes. It
takes some
> > time
> > > > to get
> > > > > > > > > there,
> > > > > > > > > > but once we are there, I can safely stand behind
using
> > > > elasticsearch
> > > > > > > as
> > > > > > > > > the
> > > > > > > > > > main storage as well as releasing v1.0 (I think
> > elasticsearch
> > > > has
> > > > > > > enough
> > > > > > > > > > features for 1.0, just the stability of the
gateway is
> > needed).
>
> > > > > > > > > >    I would love for people to take this for a
ride and
> > check it
> > > > out.
> > > > > > > The
> > > > > > > > > > next version, as a result of that is going to be
0.10, and
> > I
> > > > will
> > > > > > > release
> > > > > > > > > it
> > > > > > > > > > in the following days.
>
> > > > > > > > > > -shay.banon
``````

(Kenneth Loafman) #16

Is it possible to copy a gateway from S3 to local, run the conversion
locally, then copy it back to S3? The S3 rename issue could be solved
this way, with less total downtime.

...Ken

Shay Banon wrote:

You can check what I do in the script; it's pretty simple. The problem is
that S3 does not have a rename method, though maybe a copy command
can be used; I'm not sure how long that takes for large files. You can
replace the file-based operations with S3 ones.

-shay.banon
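
To illustrate the "no rename on S3" point: a rename can be emulated as a copy followed by a delete. The sketch below is only an illustration under stated assumptions. `ObjectStore`, `InMemoryStore` and `rename` are hypothetical names made up here; they are not part of elasticsearch or of the upgrade script. With boto3 the two steps would roughly correspond to `copy_object` and `delete_object`.

```python
from typing import Dict, Protocol


class ObjectStore(Protocol):
    """Hypothetical minimal interface; not an elasticsearch or AWS API."""

    def copy(self, src_key: str, dst_key: str) -> None: ...
    def delete(self, key: str) -> None: ...
    def exists(self, key: str) -> bool: ...


class InMemoryStore:
    """Stand-in for a bucket so the sketch runs without network access."""

    def __init__(self, objects: Dict[str, bytes]) -> None:
        self.objects = dict(objects)

    def copy(self, src_key: str, dst_key: str) -> None:
        self.objects[dst_key] = self.objects[src_key]

    def delete(self, key: str) -> None:
        del self.objects[key]

    def exists(self, key: str) -> bool:
        return key in self.objects


def rename(store: ObjectStore, src_key: str, dst_key: str) -> None:
    """Emulate a rename: copy first, verify, and only then delete the source."""
    if src_key == dst_key:
        return  # nothing to do, and deleting would lose the object
    store.copy(src_key, dst_key)
    if not store.exists(dst_key):
        raise RuntimeError(f"copy of {src_key!r} to {dst_key!r} did not land")
    store.delete(src_key)
```

Deleting only after the copy is confirmed means an interrupted run leaves a duplicate rather than a missing file. The cost is that every "rename" transfers the whole object, which is the concern about large files raised above.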


(Shay Banon) #17

Yea, sure.
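
A rough sketch of the round trip being asked about (download, convert locally, upload). Everything here is an assumption for illustration: the remote gateway is modelled as a plain mapping of key to bytes, `convert` stands for whatever the file-based upgrade script does to a directory, and nothing is writing to the gateway while this runs.

```python
import tempfile
from pathlib import Path
from typing import Callable, MutableMapping


def download(remote: MutableMapping[str, bytes], local_dir: Path) -> None:
    """Write every remote object to a file under local_dir."""
    for key, data in remote.items():
        path = local_dir / key
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)


def upload(local_dir: Path, remote: MutableMapping[str, bytes]) -> None:
    """Make the remote contents match whatever the conversion left on disk."""
    converted = {
        path.relative_to(local_dir).as_posix(): path.read_bytes()
        for path in local_dir.rglob("*")
        if path.is_file()
    }
    remote.update(converted)  # upload new and changed files first
    for key in list(remote):
        if key not in converted:
            del remote[key]  # then drop keys the conversion removed


def convert_round_trip(
    remote: MutableMapping[str, bytes],
    convert: Callable[[Path], None],
) -> None:
    """Copy the gateway to a temp dir, convert it there, and copy it back."""
    with tempfile.TemporaryDirectory() as tmp:
        local_dir = Path(tmp)
        download(remote, local_dir)
        convert(local_dir)  # local renames are cheap here
        upload(local_dir, remote)
```

The renames happen on the local filesystem, so the only S3 operations left are plain downloads, uploads and deletes.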

On Thu, Aug 26, 2010 at 3:30 PM, Kenneth Loafman
<kenneth.loafman@gmail.com> wrote:

Is it possible to copy a gateway from S3 to local, run the conversion
locally, then copy it back to S3? The S3 rename issue could be solved
this way, with less total downtime.

...Ken


(talsalmona) #18

Hi Shay,
Can you please explain or point me to the docs or example of how to
work with the commit point API and how the commit / rollback workflow
works?

Thanks,
Tal


(Shay Banon) #19

There isn't an API for that. That's what I meant when I said there is a whole
"orchestration" API that needs to be built on top of the new capabilities of
the gateway storage model. The good news is that the new storage model
allows for that, but there is still work left to actually expose it to the
user.

-shay.banon

On Sun, Aug 29, 2010 at 3:33 PM, Tal talsalmona@gmail.com wrote:

Hi Shay,
Can you please explain or point me to the docs or example of how to
work with the commit point API and how the commit / rollback workflow
works?

Thanks,
Tal

On Aug 26, 4:27 pm, Shay Banon shay.ba...@elasticsearch.com wrote:

Yea, sure.

On Thu, Aug 26, 2010 at 3:30 PM, Kenneth Loafman
kenneth.loaf...@gmail.com wrote:

Is it possible to copy a gateway from S3 to local, run the conversion
locally, then copy it back to S3? The S3 rename issue could be solved
this way, with less total downtime.

...Ken

Shay Banon wrote:

You can check what I do in the script, it's pretty simple. The problem is
that S3 does not have a rename method, though maybe a copy command
can be used, but I'm not sure how long it takes for large files. You can
replace the file based operations with S3 ones.

-shay.banon
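
The copy-based workaround mentioned above can be sketched roughly as follows. This is only an illustration, not the upgrade script: `rename_object`, the `gateway` bucket, and the key names are hypothetical, and `InMemoryS3` is a stand-in so the snippet runs without credentials. The two calls it makes, `copy_object` and `delete_object`, are named after the boto3 S3 client methods, so a real client could presumably be passed in instead. Note that a single S3 copy request is limited to 5 GB, so larger files would need a multipart copy.

```python
class InMemoryS3:
    """Tiny stand-in exposing only the two client calls used below."""

    def __init__(self):
        self.objects = {}  # (bucket, key) -> bytes

    def copy_object(self, Bucket, Key, CopySource):
        src = (CopySource["Bucket"], CopySource["Key"])
        self.objects[(Bucket, Key)] = self.objects[src]

    def delete_object(self, Bucket, Key):
        self.objects.pop((Bucket, Key), None)


def rename_object(client, bucket, old_key, new_key):
    """Emulate a rename: server-side copy, then delete the original."""
    if old_key == new_key:
        return
    client.copy_object(
        Bucket=bucket,
        Key=new_key,
        CopySource={"Bucket": bucket, "Key": old_key},
    )
    # Reached only if the copy call returned without raising.
    client.delete_object(Bucket=bucket, Key=old_key)


# Hypothetical bucket and key names, for illustration only.
s3 = InMemoryS3()
s3.objects[("gateway", "old-name")] = b"segment bytes"
rename_object(s3, "gateway", "old-name", "__0")
```

Deleting only after the copy has returned means a failure part way through leaves the original object in place, so the operation can simply be retried.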

On Thu, Aug 26, 2010 at 2:22 AM, Kenneth Loafman
kenneth.loaf...@gmail.com wrote:

Any hints for how to upgrade an S3 gateway?

...Ken

Shay Banon wrote:

Yes, here it is: http://gist.github.com/546494.

-shay.banon

On Wed, Aug 25, 2010 at 8:46 PM, Paul ppea...@gmail.com wrote:

Hi Shay,
Just curious, is the conversion script for file based gateway
available? I haven't searched through GIT, so maybe it is there?
Spent the weekend building up 20mil docs and would prefer not to
repeat this.

Thanks,
Paul

On Aug 24, 4:34 pm, Shay Banon shay.ba...@elasticsearch.com wrote:

Indeed. All the __xxx files are binary files (either index files or
transaction log parts). The commit-N is a json file that provides meta
information on the commit point.

On Wed, Aug 25, 2010 at 1:32 AM, Grant Rodgers gra...@gmail.com wrote:

I was very curious to see what the file format was!

On Aug 24, 3:05 pm, Shay Banon shay.ba...@elasticsearch.com wrote:

Great, thanks for validating that. Did not expect someone to open vi on
the gateway ;)

On Wed, Aug 25, 2010 at 12:18 AM, Grant Rodgers gra...@gmail.com wrote:

I think it's fixed in head; I didn't see this error again when trying
the test below.

On Aug 24, 1:12 pm, Grant Rodgers gra...@gmail.com wrote:

Oh you know it was probably a vi swap file. I was taking a look at one
of the commit logs, and it might have tried to snapshot while it was
open.

I think you have committed another change since 80c7135 that ignores
files elasticsearch didn't create. I'll build the latest head and try
viewing a commit snapshot again.

On Aug 24, 1:00 pm, Shay Banon shay.ba...@elasticsearch.com wrote:

Strange, not sure how this file ended up in the gateway, I can't see where
elasticsearch would write it. It only writes __xxx files (no . something)
and commit- files. I will fix it to ignore files that don't conform to the
format, but we should try and understand where it's coming from...

-shay.banon

On Tue, Aug 24, 2010 at 10:25 PM, Grant Rodgers gra...@gmail.com wrote:

Oh btw that error was with 80c7135

On Aug 23, 12:22 pm, Shay Banon shay.ba...@elasticsearch.com wrote:

Yes, though there is a whole "workflow" level APIs and support for. But,
the basics are there in the gateway.

-shay.banon

On Mon, Aug 23, 2010 at 10:21 PM, Tal talsalm...@gmail.com wrote:

Very nice feature indeed.
Will this also allow for index level commit points?

Tal

...

(talsalmona) #20

I see. Looking forward to it.

Thanks,
Tal

On Aug 29, 3:54 pm, Shay Banon shay.ba...@elasticsearch.com wrote:

There isn't an API for that. That's what I meant when I said there is a whole
"orchestration" API that needs to be built on top of the new capabilities of
the gateway storage model. The good news is that the new storage model
allows for that, but there is still work left to actually expose it to the
user.

-shay.banon


...