Indexing and "_id" question


(dmcclure) #1

If you have an "_id" field in your source JSON and try to index it
without passing in an id of your own, as in prepareIndex("indexName",
"type"), you get an error complaining that the "_id" generated by
elastic does not match the "_id" in the source. The same happens if I pass
my own id to prepareIndex() to use as the index id and the source contains
an "_id" that does not match.

I know this is expected behavior, but I was curious whether there is a way
around it: for example, mapping "_id" to another field in the source,
ignoring an existing "_id" in the source, or using the "_id" already in the
source when no index id is passed to prepareIndex().
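
The behavior described above can be pictured with a small sketch. The class and method names are hypothetical, and this is not Elasticsearch's actual implementation; it only models the rule as described here, namely that a top-level "_id" in the source has to agree with the id the document is indexed under.

```java
import java.util.Map;
import java.util.Objects;

// Illustration only: models the rule described in this thread,
// not the real Elasticsearch code path.
final class IdConsistencyCheck {

    /**
     * Returns true when the source has no top-level "_id",
     * or carries one equal to the id used for indexing.
     */
    static boolean isConsistent(String indexId, Map<String, ?> source) {
        if (!source.containsKey("_id")) {
            return true; // nothing in the source to conflict with
        }
        return Objects.equals(indexId, source.get("_id"));
    }

    public static void main(String[] args) {
        Map<String, Object> source = Map.of("_id", "abc", "title", "hello");
        System.out.println(isConsistent("abc", source)); // ids agree
        System.out.println(isConsistent("xyz", source)); // ids differ
    }
}
```

Under this model, an "_id" that is a nested object (as with a serialized Mongo ObjectId) can never equal a string index id, which is consistent with the error reported here.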

Background:
We are using MongoDB, converting MongoDB objects into JSON and storing
them in elastic. Mongo also uses _id, and under the covers this is an
org.bson.types.ObjectId, which produces an _id field like this

   "_id":{"time":1285864090000,"new":true,"machine":-561909349,"inc":1450029131}

when converted to JSON. If this field appears at the top level of the
source, elastic will error out because of non-matching ids. Elastic can't
use this id as-is for the index id; I would need to convert it to a GUID,
which is not a problem. I'm just curious whether there is a way around it.

Thanks!
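
For the conversion mentioned above, one option is to flatten the serialized ObjectId into a single string that can serve as the index id. The sketch below assumes the key names shown in the JSON above ("time" in milliseconds, "machine", "inc"). The class name and the hex layout are illustrative assumptions and may not match what ObjectId.toString() produces in the Mongo driver.

```java
import java.util.Map;

// Sketch only: key names come from the JSON shown in this post;
// the 24-character hex layout is an assumption made for illustration.
final class ObjectIdFields {

    /** Builds a 24-character hex string from the serialized ObjectId fields. */
    static String toHexId(Map<String, ?> idFields) {
        long timeMillis = ((Number) idFields.get("time")).longValue();
        int machine = ((Number) idFields.get("machine")).intValue();
        int inc = ((Number) idFields.get("inc")).intValue();
        int timeSeconds = (int) (timeMillis / 1000L);
        // %08x prints negative ints in their unsigned two's-complement hex form.
        return String.format("%08x%08x%08x", timeSeconds, machine, inc);
    }

    public static void main(String[] args) {
        Map<String, Object> id = Map.of(
                "time", 1285864090000L,
                "new", true,
                "machine", -561909349,
                "inc", 1450029131);
        System.out.println(toHexId(id));
    }
}
```

If the original ObjectId instance is still at hand, calling its own toString() is probably the simpler route; the sketch is only meant for the case where just the JSON form is available.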


(Shay Banon) #2

Hey,

No, there isn't currently a way around this. It really simplifies things
when the _id, _type, and _source can't "change" their names. I actually
started (way back) with allowing that, but it made things too complicated to
be worth the effort. Keeping the names fixed also makes maintaining things
much simpler. I think you will have to handle it on your end, sorry.

-shay.banon
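
Handling it on the client side could look roughly like the sketch below: copy the document, move the top-level "_id" to a differently named field, and pass the id separately when indexing. The class name and the "mongo_id" field name are hypothetical and chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: "mongo_id" is a hypothetical field name, not an Elasticsearch convention.
final class SourceIdSplitter {

    /** Returns a copy of the document with the top-level "_id" moved to "mongo_id". */
    static Map<String, Object> renameId(Map<String, Object> doc) {
        Map<String, Object> source = new LinkedHashMap<>(doc); // leave the input untouched
        if (source.containsKey("_id")) {
            source.put("mongo_id", source.remove("_id"));
        }
        return source;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("_id", "4ca4ba9a");
        doc.put("title", "hello");
        System.out.println(renameId(doc));
    }
}
```

The returned map no longer contains "_id", so it should be usable as the source while a string form of the id is supplied as the third argument of prepareIndex(index, type, id).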

In reply to dmcclure's post of Thu, Sep 30, 2010 at 8:07 PM.


(dmcclure) #3

No worries at all. I figured that was the case but thought I'd ask just to
be sure. Thanks again!

In reply to Shay Banon's post of Sep 30, 2:13 pm.


(Mahendra M) #4

Hi Shay,

One quick question: how is this being handled in the CouchDB river plan?
CouchDB also has an '_id' field by default.

Regards,
Mahendra

--
Mahendra

http://twitter.com/mahendra


(Shay Banon) #5

It maps nicely, no? The same _id is used for both elasticsearch and couchdb.

In reply to Mahendra M's post of Fri, Oct 1, 2010 at 6:37 AM.


(Mahendra M) #6

Oh yes! /me is duh! :)

In reply to Shay Banon's post of Fri, Oct 1, 2010 at 3:40 PM.

