Doc ID generation before communicating with ES

vineeth_mohan · November 1, 2011, 12:57pm

Hi,

I have a system that picks up a feed, applies some data transformations,
and then pushes the result to ES.
Each feed has a feed ID associated with it, and when the feed reaches ES
we receive a corresponding doc ID.
The feed ID and the doc ID are mapped somewhere so that we can trace how
the feed was picked up, transformed, and pushed to storage.

I was wondering whether there is some mechanism by which a doc ID can be
generated long before the feed reaches ES, so that I only have to maintain
a single ID for the entire feed.

So basically I should be able to generate a doc ID well before I push the
feed to Elasticsearch, and I should be able to guarantee that when the feed
reaches ES, that ID is free (that is, no document is present with that
ID).

Is this possible with ES?

Thanks
Vineeth
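
One way to read the requirement above is that the ID can be minted on the client and ES only has to refuse it if it is already taken. The sketch below assumes a random UUID is acceptable as the doc ID and that the document is sent with `op_type=create`, which makes ES reject the request with a conflict if a document with that ID already exists. The helper names `new_doc_id` and `build_create_request`, the index name, and the `_doc` path form (used by recent ES versions; older ones put a type name there) are illustrative assumptions, and the code only builds the request rather than sending it.

```python
import json
import uuid
from typing import Tuple
from urllib.parse import quote


def new_doc_id() -> str:
    """Mint an ID on the client, long before the feed reaches ES."""
    # A random UUID is collision-free for practical purposes.
    return uuid.uuid4().hex


def build_create_request(index: str, doc_id: str, doc: dict) -> Tuple[str, str, str]:
    """Return (method, path, body) for a create-only index request.

    op_type=create asks ES to fail with a conflict instead of
    overwriting when a document with this ID already exists.
    """
    path = "/{}/_doc/{}?op_type=create".format(
        quote(index, safe=""), quote(doc_id, safe="")
    )
    return "PUT", path, json.dumps(doc)


if __name__ == "__main__":
    feed_id = new_doc_id()  # the single ID that follows the feed around
    method, path, body = build_create_request(
        "feeds", feed_id, {"feed_id": feed_id, "payload": "transformed data"}
    )
    print(method, path)
```

With this approach the feed ID and the doc ID are the same value, so no separate mapping is needed.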

vineeth_mohan · November 1, 2011, 1:05pm

It's fine if ES can give me a set of IDs (say 1000 at a time) and guarantee
that these IDs never get auto-generated.

Thanks
Vineeth

kimchy · November 1, 2011, 6:12pm

I am not sure I follow what you are trying to do...

Clinton_Gormley · November 2, 2011, 8:54am

On Tue, 2011-11-01 at 18:35 +0530, Vineeth Mohan wrote:

It's fine if ES can give me a set of IDs (say 1000 at a time) and
guarantee that these IDs never get auto-generated.

Vineeth - I released a Perl module to do just this:

https://metacpan.org/module/ElasticSearchX::Sequence

I explain how it works here:

http://blogs.perl.org/users/clinton_gormley/2011/10/elasticsearchsequence---a-blazing-fast-ticket-server.html

It's simple to do, so you could translate the above into whatever
language you are using.

clint
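
For readers who want to translate the idea, here is a minimal sketch of a ticket server that hands out IDs in blocks, as requested earlier in the thread. It is not the Perl module's code. It assumes the counter is the `_version` that ES increments each time a document with the same ID is re-indexed, and it reserves one block of IDs per version bump, which is a variation chosen here to match the "1000 at a time" request. `IdBlockAllocator`, `next_version`, and `make_in_memory_counter` are hypothetical names, and the in-memory counter stands in for the ES round trip so the example runs without a cluster.

```python
from typing import Callable


class IdBlockAllocator:
    """Hands out unique integer IDs, reserving them one block at a time.

    `next_version` is a hypothetical callable. In a real setup it would
    re-index a fixed document in ES and return the `_version` from the
    response. Version v reserves the IDs
    (v - 1) * block_size + 1 through v * block_size.
    """

    def __init__(self, next_version: Callable[[], int], block_size: int = 1000) -> None:
        if block_size < 1:
            raise ValueError("block_size must be positive")
        self._next_version = next_version
        self._block_size = block_size
        self._current = 0  # last ID handed out
        self._limit = 0    # last ID of the block reserved so far

    def next_id(self) -> int:
        # Reserve a fresh block only when the current one is used up.
        if self._current >= self._limit:
            version = self._next_version()
            self._limit = version * self._block_size
            self._current = self._limit - self._block_size
        self._current += 1
        return self._current


def make_in_memory_counter() -> Callable[[], int]:
    """Stand-in for the ES call: returns 1, 2, 3, ... on successive calls."""
    state = {"version": 0}

    def bump() -> int:
        state["version"] += 1
        return state["version"]

    return bump


if __name__ == "__main__":
    shared = make_in_memory_counter()
    worker_a = IdBlockAllocator(shared, block_size=1000)
    worker_b = IdBlockAllocator(shared, block_size=1000)
    print(worker_a.next_id(), worker_b.next_id(), worker_a.next_id())
```

Because each worker only contacts the shared counter once per block, two workers never receive overlapping IDs, at the cost of gaps when a worker stops before its block is used up. The class is not thread-safe as written; a lock around `next_id` would be needed if several threads share one allocator.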

vineeth_mohan · November 3, 2011, 2:31am

Thanks Clinton, that was exactly what I was looking for.

Thanks
Vineeth
