I am trying to index some tweets and learn more about ES. It seems to be
a very nice product! But I am struggling with one problem. I created an index
and defined a mapping to index only 3 fields (screen_name, geo and
created_at). Also, the index stores the source.
Tweets have an identifier (id) that is stored using 64 bits. But when
the document is inserted in ES, this field is stored with a rounded
value in the _source field. For example, the value 174204927055368192
becomes 174204927055368200.
I tried to find a setting in the documentation to prevent this, but I
couldn't find it.
I am using pyes over HTTP and the latest version of ES. I also tried the
Thrift plugin (1.0.0), without success.
Long values are signed 64-bit integers. Provide the id as a string (which is what Twitter uses for ids; they moved away from numeric values a long time ago). By the way, the value gets rounded on your end when you construct the JSON: the _source that is stored is the bytes of the JSON you provided, so when you ask for it, you get it back exactly as it was provided.
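To illustrate the point, here is a minimal sketch using only Python's standard json module. It is not pyes code, and it does not show where the rounding happens in your particular setup. Python's own json module keeps integers exact, so the sketch simulates a parser that stores every number as a 64-bit double (as JavaScript does) by passing parse_int=float. The screen_name value is a made-up placeholder.

```python
import json

tweet_id = 174204927055368192  # the id from the question

# Simulate a JSON parser that holds every number as a 64-bit double
# by asking json.loads to parse integers as floats.
as_double = json.loads(json.dumps({"id": tweet_id}), parse_int=float)["id"]
print(repr(as_double))  # 1.742049270553682e+17, i.e. 174204927055368200

# Sending the id as a string keeps every digit through the round trip.
# "example_user" is a placeholder value, not taken from the thread.
doc = json.dumps({"id": str(tweet_id), "screen_name": "example_user"})
restored = int(json.loads(doc)["id"])
print(restored == tweet_id)  # True
```

A double only has about 16 significant decimal digits, so printing it yields 174204927055368200, which matches the value reported in the question. Keeping the id as a string avoids the issue regardless of which client or viewer touches the JSON.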
On Wednesday, February 29, 2012 at 4:27 PM, Walter dos Santos Filho wrote: