[Hadoop] Setting Document ID in Map Reduce Mapper


(Daniel Tardón) #1

Hi all,

I'm new to ES and I'm trying to set each document ID manually. I've
seen the es.mapping.id property in the documentation and I'm trying to
set it in the configuration part of the driver class, the same way I set the
index and type of the documents:

conf.set("es.resource", "logs/{event}");

conf.set("es.mapping.id", "id");

In the Mapper class I add a new key/value pair to the MapWritable object
for each map:

MapWritable doc = new MapWritable();

// node and timestamp are two String values that I have.
String id = node + "|" + timestamp;
doc.put(new Text("id"), new Text(id));

As a result I can't write to ES and I get exceptions with this message:
JsonParseException[Unexpected character ('"' (code 34))

If I comment out the es.mapping.id line and let ES assign the document IDs,
everything works fine.
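
For readers following along, here is a rough, library-free sketch of what es.mapping.id is meant to achieve: the value of the named field is taken from each document and used as the _id in the bulk "index" action line that precedes the document source. This is only an illustration and not es-hadoop's actual code; the class BulkIdSketch, its helper methods, and the sample values are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class BulkIdSketch {

    // Escape backslashes and double quotes only (enough for this illustration;
    // a real JSON writer also has to handle control characters).
    static String jsonEscape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Build the bulk "index" action line, taking _id from the field named idField.
    static String actionLine(Map<String, String> doc, String idField) {
        String id = doc.get(idField);
        if (id == null) {
            throw new IllegalArgumentException("document has no field '" + idField + "'");
        }
        return "{\"index\":{\"_id\":\"" + jsonEscape(id) + "\"}}";
    }

    // Serialize a flat map of strings as a one-line JSON object (the document source).
    static String sourceLine(Map<String, String> doc) {
        StringBuilder sb = new StringBuilder("{");
        String sep = "";
        for (Map.Entry<String, String> e : doc.entrySet()) {
            sb.append(sep)
              .append('"').append(jsonEscape(e.getKey())).append("\":\"")
              .append(jsonEscape(e.getValue())).append('"');
            sep = ",";
        }
        return sb.append('}').toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample document; "id" mirrors node + "|" + timestamp.
        Map<String, String> doc = new LinkedHashMap<>();
        doc.put("id", "node1|1401803541");
        doc.put("event", "login");

        System.out.println(actionLine(doc, "id"));
        System.out.println(sourceLine(doc));
    }
}
```

The point of the sketch is that the ID has to end up as a properly quoted JSON string in the action line; a malformed action line is one way a JSON parse error could surface on the ES side, though the thread does not confirm that this was the cause here.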

What could I do?

Thanks in advance

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/ae11fa62-582e-4c67-8819-cd8616243e8e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Juan Carlos Fernández) #2

I had the same issue and it was solved by using es-hadoop 2.0.1 instead of 2.0.0.
It looks like a fixed bug, but I couldn't find it reported anywhere, either as
an open or as a closed issue.
Regards



(Costin Leau) #3

Glad to hear the issue has been fixed - not sure how I missed this email before.
It was probably addressed by the change in the way the parameters are handled [1]

In the future, when encountering unexpected behaviour or a bug, please file an issue on GitHub as well to make sure it
doesn't get lost.

Thanks!

[1] https://github.com/elasticsearch/elasticsearch-hadoop/issues/223

--
Costin



(Daniel Tardón) #4

Indeed, updating to 2.0.1 solves the problem.

I looked for issues on GitHub related to specifying a document ID in
MapReduce, and the one I found [1] was closed without explanation, with only
a link to the online documentation.

So I wrote here because I thought it wasn't a bug; I thought it was a
misconfiguration in my code.

Thanks, Juan Carlos, for the solution.

[1] https://github.com/elasticsearch/elasticsearch-hadoop/issues/163



(Costin Leau) #5

Actually the issue you refer to was not regarding a bug but rather a feature that the user was not aware of - hence the
closing comment pointing to the documentation.

In general, when in doubt, feel free to open an issue, even if it's a duplicate.

Glad to hear the problem has been sorted out.

Cheers!


--
Costin


