Wrong results for _source field with Elasticsearch


I have an index with a basic analyzer. The data that I store is schemaless.
In the mapping, I tell ES to store the document (_source is enabled).

I add documents to ES through an MR job. Data held in a LinkedHashMapWritable is
loaded into ES.
The number of key/values varies from 15 to 40 per document, and the
key/values are not the same for all documents. I assumed ES's
schemaless document handling would take care of that.
However, the _source data returned by a get or a search contains wrong data: lots
of key/values that are not part of the document appear, and a few key/values of the
document are missing.
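
To make the mismatch concrete, here is a minimal Python sketch of the check I am describing: take the key/values that were sent for one document and compare them with the _source that comes back from a get. The function name diff_source and the sample field names are made up for illustration and are not taken from the real index; in practice, "returned" would be the _source object from the get response.

```python
def diff_source(sent, returned):
    """Compare the key/values that were indexed with the _source that came back."""
    sent_keys, returned_keys = set(sent), set(returned)
    return {
        # keys that were sent but are absent from _source
        "missing": sorted(sent_keys - returned_keys),
        # keys present in _source that were never sent for this document
        "unexpected": sorted(returned_keys - sent_keys),
        # keys present on both sides whose values differ
        "changed": sorted(
            k for k in sent_keys & returned_keys if sent[k] != returned[k]
        ),
    }


# Hypothetical sample data, only to show the shape of the report.
sent = {"user_id": "42", "page": "home", "country": "US"}
returned = {"user_id": "42", "page": "checkout", "browser": "firefox"}

report = diff_source(sent, returned)
print(report)
```

Running this over a handful of document ids should show whether the unexpected keys look like they come from other documents in the same job.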

Number of documents: only about 100k.

Below is the schema.

curl -XPUT 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/'

curl -XPOST

curl -XPUT 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/_settings' -d '{
  "index": {
    "analysis": {
      "analyzer": {
        "default": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": ["trim", "lowercase"]
        }
      }
    }
  }
}'

curl -XPOST 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/_open'

curl -XDELETE 'qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/dedup_out/'

curl -XPUT
-d '
"dedup_out" : {
"_ttl" : { "enabled" : true, "default" : "7d" },
"_source" : {"enabled" : true},
"_all" : {"enabled" : false},
"norms": {"enabled" : false},

Let me know if you need any more details.

