COMB uuid or string equivalents to speed up inserts

Dennis · May 15, 2012, 2:00am

Has anyone ever looked at indexing speed (i.e. INSERTing, in database
parlance) using randomly generated UUIDs versus COMB UUIDs in Elasticsearch?

There are also string equivalents and base 64 equivalents. I may try a
simple bash script and see what happens.

Dennis · September 26, 2013, 9:45pm

OK, I finally got to this point in design and production.

I generate a COMB_GUID where the upper 32 bits are based on bits 33
through 1 of Unix time in milliseconds. So, there are 93 bits of randomness
every 2 milliseconds, and the rollover on the upper bits happens every 106
years.
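
A minimal Python sketch of this kind of id, under one possible reading of the description above: drop the lowest bit of the millisecond clock (giving 2 ms resolution), keep the next 32 bits as the prefix, and fill the rest with random bits. The name `comb_guid` and its `now_ms` parameter are made up for illustration. The sketch fills all 96 remaining bits randomly and sets no UUID version or variant bits, so it does not try to reproduce the exact bit counts or rollover period quoted above.

```python
import os
import time
import uuid


def comb_guid(now_ms=None):
    """Build a 128-bit COMB-style id: 32 time-derived bits on top, 96 random bits below.

    Hypothetical helper; the exact layout of the original implementation is not given.
    """
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    # Discard the lowest bit of the millisecond clock, then keep the next 32 bits.
    prefix = (now_ms >> 1) & 0xFFFFFFFF
    # 12 bytes from the OS random source = 96 random bits.
    random_bits = int.from_bytes(os.urandom(12), "big")
    return uuid.UUID(int=(prefix << 96) | random_bits)


if __name__ == "__main__":
    print(comb_guid())
```

Because the time-derived bits sit at the most significant end, ids generated in later 2 ms ticks compare greater than earlier ones, both as integers and as hex strings, until the prefix wraps around.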

When inserting into Postgres, the COMB_GUID keeps its speed advantage over
a fully random UUID.
The COMB_GUID is 2X faster.

In Elasticsearch, there is NO discernible difference between the two for
indexing. I'm still going to use COMB_GUIDs in case content goes to BTREE
indexes anywhere in the chain. If the content is fed in time order, or can
be presorted on the id field so that it IS time related and partially
sequential, inserts there will speed up.
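
A rough way to see the "partially sequential" property, as a toy model and not a benchmark: keep the keys in a sorted Python list and measure how far from the end each new key lands. A sorted list is only a stand-in for the key order of a BTREE index, and `comb_id` and `mean_distance_from_end` are hypothetical helpers, with `comb_id` assuming the same bit layout reading as the post (lowest clock bit dropped, next 32 bits on top, the rest random).

```python
import bisect
import os
import uuid


def comb_id(now_ms):
    """Integer COMB-style id: 32 time-derived bits on top, 96 random bits below (assumed layout)."""
    prefix = (now_ms >> 1) & 0xFFFFFFFF
    return (prefix << 96) | int.from_bytes(os.urandom(12), "big")


def mean_distance_from_end(keys):
    """Insert keys one by one into a sorted list.

    Returns the average number of slots between each insertion point and the end.
    """
    ordered = []
    total = 0
    for key in keys:
        pos = bisect.bisect_left(ordered, key)
        total += len(ordered) - pos
        ordered.insert(pos, key)
    return total / len(keys)


n = 2000
random_keys = [uuid.uuid4().int for _ in range(n)]
# One id per simulated 2 ms tick, starting from an arbitrary timestamp.
comb_keys = [comb_id(1_380_000_000_000 + 2 * i) for i in range(n)]

print("random UUIDs:", mean_distance_from_end(random_keys))
print("COMB ids:    ", mean_distance_from_end(comb_keys))
```

With one id per tick, every COMB id lands at the very end of the list, while random UUIDs land all over it. In a BTREE that roughly corresponds to always touching the rightmost page versus touching pages at random.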

Pretty interesting.

On Monday, May 14, 2012 7:00:42 PM UTC-7, Dennis wrote:

Has anyone ever looked at indexing speed (i.e. INSERTing, in database
parlance) using randomly generated UUIDs versus COMB UUIDs in Elasticsearch?

database - What are the performance improvement of Sequential Guid over standard Guid? - Stack Overflow

There are also string equivalents and base 64 equivalents. I may try a
simple bash script and see what happens.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

Dennis · September 27, 2013, 1:54am

Just to be sure that something wasn't 'endian' between the indexing of
Elasticsearch and Postgres, I flipped the 32 bits end for end and put them
on the other end of the string. Exact same result. The Lucene indexing for
the '_id' field is apparently quite different from the indexing in Postgres.
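
A sketch of one possible reading of that flipped variant, assuming "end for end" means reversing the bit order of the 32 time-derived bits and that they move to the least significant end of the id. Both `reverse_bits32` and `flipped_comb_id` are hypothetical names, and the original test may have reversed bytes or characters instead.

```python
import os


def reverse_bits32(value):
    """Return the 32-bit value with its bit order reversed (bit 0 becomes bit 31)."""
    result = 0
    for _ in range(32):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def flipped_comb_id(now_ms):
    """128-bit integer id: 96 random bits on top, bit-reversed time prefix at the bottom."""
    prefix = (now_ms >> 1) & 0xFFFFFFFF
    random_bits = int.from_bytes(os.urandom(12), "big")
    return (random_bits << 32) | reverse_bits32(prefix)
```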

Someday in the next 6 months, I'll do this same test on CouchDB and maybe
MongoDB.
