Doc ID generation before communicating with ES


(vineeth mohan) #1

Hi,

I have a system which picks up a feed, applies XYZ data transformations and then
pushes the result to ES.
Each feed has a feed ID associated with it, and when the feed reaches ES
we receive a corresponding doc ID.
The feed ID and the doc ID are mapped somewhere so that we can trace how the feed was
picked up, transformed and pushed to storage.

I was wondering if there is some mechanism by which a doc ID can be generated
long before the feed reaches ES, so that I only have to maintain a single ID for
the entire feed.

So basically I should be able to generate a doc ID well before I push the
feed to Elasticsearch, and I should be able to guarantee that when the feed
reaches ES, that ID is free (as in, no document is present with that
ID).

Is this possible with ES?

Thanks
Vineeth


(vineeth mohan) #2

It's fine if ES can give me a set of IDs (say 1000 at a time) and guarantee
that these IDs never get auto-generated.

Thanks
Vineeth



(Shay Banon) #3

I am not sure I follow what you are trying to do...



(Clinton Gormley) #4

On Tue, 2011-11-01 at 18:35 +0530, Vineeth Mohan wrote:

It's fine if ES can give me a set of IDs (say 1000 at a time) and
guarantee that these IDs never get auto-generated.

Vineeth - I released a Perl module to do just this:

https://metacpan.org/module/ElasticSearchX::Sequence

I explain how it works here:

http://blogs.perl.org/users/clinton_gormley/2011/10/elasticsearchsequence---a-blazing-fast-ticket-server.html

It's simple to do, so you could translate the above into whatever
language you are using.

clint
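
For readers who don't use Perl, here is a rough Python sketch of the ticket-server idea. It is not a port of the module. As far as I understand the linked post, the approach relies on ES returning an incremented version number every time the same document is re-indexed, and on requesting many tickets in one go. To keep the example runnable without a cluster, the versioned document is replaced by a hypothetical in-memory stand-in (`InMemoryTicketDoc`); `SequenceClient`, `bump` and `block_size` are likewise names made up for illustration.

```python
import threading
from collections import deque


class InMemoryTicketDoc:
    """Hypothetical stand-in for a single versioned document.

    Every write bumps an integer version and hands it back, which is
    the only property the ticket-server approach needs.
    """

    def __init__(self):
        self._version = 0
        self._lock = threading.Lock()

    def bump(self, count=1):
        """Simulate `count` writes and return the version of each one."""
        with self._lock:
            start = self._version + 1
            self._version += count
            return list(range(start, self._version + 1))


class SequenceClient:
    """Hands out unique IDs, refilling a local buffer one block at a time."""

    def __init__(self, ticket_doc, block_size=1000):
        self._doc = ticket_doc
        self._block_size = block_size
        self._buffer = deque()

    def next_id(self):
        # Only go back to the ticket document when the local block runs out.
        if not self._buffer:
            self._buffer.extend(self._doc.bump(self._block_size))
        return self._buffer.popleft()


if __name__ == "__main__":
    doc = InMemoryTicketDoc()
    worker_a = SequenceClient(doc, block_size=1000)
    worker_b = SequenceClient(doc, block_size=1000)
    print([worker_a.next_id() for _ in range(3)])
    print([worker_b.next_id() for _ in range(3)])
```

Each worker reserves its own block, so an ID can be attached to a feed as soon as it is picked up and then reused as the doc ID when the feed is finally indexed. The trade-off is that IDs left in a block when a worker stops are never used, so the sequence can have gaps.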


(vineeth mohan) #5

Thanks Clinton, that was exactly what I was looking for.

Thanks
Vineeth

