Elasticsearch with JSONML


(posejavier) #1

Hi,
I have a CouchDB database with JSON documents that were built by transforming
XML documents into JSON using JSONML.
When I use Elasticsearch on this database, it gives me an error.

Has anyone had the same problem?

Many thanks in advance,


(David Pilato) #2

No, but you could post a gist with your JSON files, your Elasticsearch config, and your logs.

David :wink:
@dadoonet

On 16 March 2012 at 10:07, jp50045 posejavier@hotmail.com wrote:

Hi,
I have a couchdb database with JSON documents built by transforming
XML documents into JSON using JSONML.
When I use elasticsearch in this database it gives me an error.

Did anyone have the same problems?

Many thanks in advance,


(posejavier) #3

Hi,

I am developing a project based on CouchDB and elasticsearch.

I have transformed around 500,000 XML documents into JSON using JSONML to store them in the CouchDB database.
When I use Elasticsearch on these documents, it gives me this error:

org.elasticsearch.index.mapper.MapperParsingException: object mapping [streams] trying to serialize a value with no field associated with it, current value [4ecb8c99a2144a03dc000081]
    at org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:573)
    at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:443)
    at org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:577)
    at org.elasticsearch.index.mapper.object.ObjectMapper.serializeArray(ObjectMapper.java:565)
    at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:435)
    at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:465)
    at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:414)
    at org.elasticsearch.index.shard.service.InternalIndexShard.prepareCreate(InternalIndexShard.java:285)

Looking at the code and some of your posts, I realized that this error happens when a field changes from a value to an object.
For example, in my case I have the following JSON document:

{
  "_id": "26700cfb3089832b2d99d3c3ff00d368",
  "_rev": "5-46f71a61512d7edca18fe4fc6cccff8f",
  "tagName": "change-docut",
  "childNodes": [
    {
      "tagName": "bibliographic-data",
      "childNodes": [
        {
          "data-format": "z76",
          "tagName": "publication-ref",
          "childNodes": [
            {
              "tagName": "document-id",
              "childNodes": [
                {
                  "tagName": "doc-number",
                  "childNodes": [
                    "AR048470"
                  ]
                }
              ],
              "lang": "es"
            }
          ]
        },
        {
          "tagName": "classification",
          "childNodes": [
            {
              "tagName": "edition",
              "childNodes": [
                7
              ]
            }
          ]
        }
      ]
    }
  ]
}

...the error, as I understand it, appears because the object in the second "childNodes" field does not have the field "lang": "es", so Elasticsearch fails when it tries to serialize it because it finds a null.
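
One way to check the value-versus-object idea on a document like this is to walk it and list every field path that holds objects in some places and plain values in others. The following is only a rough Python sketch: the helper names (collect_kinds, mixed_paths) and the trimmed-down sample are made up for illustration, and it assumes that arrays do not add a level to the field path.

```python
from collections import defaultdict


def collect_kinds(node, path="", kinds=None):
    """Record, per field path, whether objects and/or plain values appear there."""
    if kinds is None:
        kinds = defaultdict(set)
    if isinstance(node, dict):
        for key, value in node.items():
            child_path = f"{path}.{key}" if path else key
            # Treat a list as "several values of the same field" (assumption).
            items = value if isinstance(value, list) else [value]
            for item in items:
                kinds[child_path].add("object" if isinstance(item, dict) else "value")
                collect_kinds(item, child_path, kinds)
    return kinds


def mixed_paths(doc):
    """Return the field paths that hold both objects and plain values."""
    return sorted(p for p, seen in collect_kinds(doc).items() if len(seen) == 2)


# Hypothetical sample, shortened from the document in this post.
sample = {
    "tagName": "bibliographic-data",
    "childNodes": [
        {
            "tagName": "document-id",
            "childNodes": [{"tagName": "doc-number", "childNodes": ["AR048470"]}],
        },
        {"tagName": "edition", "childNodes": [7]},
    ],
}

print(mixed_paths(sample))
```

In this sample the path childNodes.childNodes is reported, because it contains an object under "document-id" and the bare number 7 under "edition".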

My questions are:

Q1. Is there any way to avoid this error in Elasticsearch?
Q2. Should I use another XML-to-JSON converter in order to avoid the error?
Q3. Would it be possible to modify the code in "private void serializeValue" and "private void serializeObject" so that Elasticsearch does not raise this error?
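
For Q1 and Q2, one possible workaround that nobody in this thread has confirmed would be to post-process the converter output before indexing, so that every element of "childNodes" is an object. The sketch below assumes the exception is triggered by "childNodes" holding objects in some places and bare values in others. The function name wrap_scalars and the "text" key are hypothetical, and values are converted to strings so that the wrapped field always has one type.

```python
def wrap_scalars(node, text_key="text"):
    """Return a copy of node where bare values inside "childNodes" become objects."""
    if not isinstance(node, dict):
        return node
    out = {}
    for key, value in node.items():
        if key == "childNodes" and isinstance(value, list):
            out[key] = [
                # Objects are processed recursively; bare values are wrapped.
                wrap_scalars(item, text_key)
                if isinstance(item, dict)
                else {text_key: str(item)}
                for item in value
            ]
        else:
            # Attributes such as "tagName" or "lang" are copied unchanged.
            out[key] = value
    return out


# Hypothetical usage on one node of the document from this post.
edition = {"tagName": "edition", "childNodes": [7]}
print(wrap_scalars(edition))
```

The original dictionary is left untouched, so the unmodified documents can stay in CouchDB while only the normalized copies are sent for indexing.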

MANY THANKS in advance for your help!!!

