How to perform bulk unique inserts into an elasticsearch table using the .Net API?

bdb · November 4, 2013, 2:56pm

We are currently researching elasticsearch as a replacement for our current
system that uses TVPs (http://msdn.microsoft.com/en-us/library/bb675163(v=vs.110).aspx) on
SQL Server 2008 R2.

The system ensures that the records about to be inserted are unique in the
table (based on a composite key).

The two data fields to be checked are UserID and ContentTitle. Instead of
checking if both exist, a hash can be created and stored in a single field.
When a record is to be added, the hashed value of each incoming record can
be checked against the existing table's hash values.
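
To make the hashing idea concrete, here is a minimal sketch in Python (not the .NET API asked about). The function name `composite_key_hash`, the choice of SHA-1, and the sample records are assumptions made for illustration only. Each field is length-prefixed before hashing so that different field pairs cannot collapse into the same input string.

```python
import hashlib


def composite_key_hash(user_id, content_title):
    """Return a hex digest that stands in for the (UserID, ContentTitle) key."""
    # Length-prefix each part so ("ab", "c") and ("a", "bc") produce different input.
    parts = [str(user_id), str(content_title)]
    encoded = "".join(f"{len(part)}:{part}" for part in parts).encode("utf-8")
    return hashlib.sha1(encoded).hexdigest()


# Hashes of records that are already stored (hypothetical sample data).
existing_hashes = {composite_key_hash(42, "Intro to Search")}

# Incoming batch: the first record duplicates a stored one.
incoming = [(42, "Intro to Search"), (42, "Advanced Search")]

# Keep only the records whose hash has not been seen yet.
new_records = [
    record for record in incoming
    if composite_key_hash(*record) not in existing_hashes
]
```

Only the record whose hash is not already known survives the filter, which mirrors the check against the existing table's hash values.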

How would this be accomplished in elasticsearch using the .Net API?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

spinscale · November 14, 2013, 8:56am

Hey,

There are two possible solutions (among a couple of others):

1. If you want to create a hash, use that hash as the ID when indexing your
   data. The tuple of ID and type is unique in an index, so if you reindex a
   document with that hash, the data simply gets overwritten. Just make sure
   you have a good hash function so that you do not overwrite unrelated
   data. You could also configure the mapping of the index to use the
   content of the hash field as the ID. See the Elasticsearch mapping
   documentation.
2. Instead of creating a hash, it might be sufficient to simply concatenate
   userId and contentTitle into a single ID (123_456 for example) and use
   this ID when indexing. This would save a couple of CPU cycles because you
   don't need to hash, but it might not be unique, depending on the format
   of these IDs.
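
As a rough illustration of using a derived value as the document ID, the sketch below builds a request body for the Elasticsearch bulk endpoint in Python (again, not the .NET API). The index name `user_content`, the helper names, the SHA-256 choice, and the sample records are all hypothetical. With the `index` action a repeated ID overwrites the stored document; with the `create` action Elasticsearch rejects that item if the ID already exists. Older Elasticsearch versions may also expect a `_type` entry in the action line, which is omitted here.

```python
import hashlib
import json

INDEX_NAME = "user_content"  # hypothetical index name


def document_id(user_id, content_title):
    """Derive a deterministic ID from the composite key."""
    # JSON-encode the pair so the boundary between the two fields is unambiguous.
    raw = json.dumps([user_id, content_title], ensure_ascii=False)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def build_bulk_body(records, action="index"):
    """Build a newline-delimited JSON body: one action line, then one source line."""
    lines = []
    for record in records:
        doc_id = document_id(record["UserID"], record["ContentTitle"])
        lines.append(json.dumps({action: {"_index": INDEX_NAME, "_id": doc_id}}))
        lines.append(json.dumps(record))
    # The bulk endpoint expects the body to end with a newline.
    return "\n".join(lines) + "\n"


records = [
    {"UserID": 123, "ContentTitle": "First post"},
    {"UserID": 123, "ContentTitle": "First post"},  # same composite key as above
    {"UserID": 456, "ContentTitle": "First post"},
]

# "create" asks Elasticsearch to reject items whose ID already exists.
body = build_bulk_body(records, action="create")
```

The resulting string would be sent to the `_bulk` endpoint with any HTTP client. Because the first two records share a composite key, they map to the same ID, so at most one of them can be stored.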

Sorry, I cannot tell you anything about the .NET API, but I hope this gives
you a first hint.

--Alex

