Wrong Results for _source field with Elasticsearch


(Shobana Neelakantan) #1

Hi

I have an index with a basic analyzer. The data that I store is schemaless.
In the mapping I enable _source so that the document is stored.

I add documents to ES through an MR job; the data held in a
LinkedHashMapWritable is loaded into ES.
The number of key/values varies from 15 to 40 per document, and the
key/values are not the same for all documents. I assumed ES's
schemaless document handling would take care of that.
However, the _source returned by get or search contains wrong data: many
key/values that are not part of the document appear, and a few key/values
that belong to the document are missing.
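
To make the symptom concrete, here is a minimal sketch (Python, not my actual MR code) of the comparison I am describing. The field names and values are made up for illustration; `sent` stands for the key/values of one input record and `returned` for the _source that a get returns for the same id.

```python
def diff_source(sent, returned):
    """Compare the key/values sent to ES with the _source that came back."""
    # Keys that were in the input record but are absent from _source.
    missing = {k: sent[k] for k in sent if k not in returned}
    # Keys that show up in _source but were never part of the input record.
    extra = {k: returned[k] for k in returned if k not in sent}
    # Keys present on both sides whose values differ.
    changed = {k: (sent[k], returned[k])
               for k in sent if k in returned and sent[k] != returned[k]}
    return missing, extra, changed


# Hypothetical example values, not real data from the index.
sent = {"sessionId": "abc", "page": "home", "country": "US"}
returned = {"sessionId": "abc", "page": "home", "browser": "ff", "os": "linux"}

missing, extra, changed = diff_source(sent, returned)
print("missing:", sorted(missing))
print("extra:", sorted(extra))
print("changed:", sorted(changed))
```

In my case both the "missing" and the "extra" sets are non-empty for many documents.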

Number of documents -> about 100k only.

Below are the index settings and the mapping.

curl -XPUT 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/'

curl -XPOST 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/_close'

curl -XPUT 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/_settings' -d '
{
  "index": {
    "analysis": {
      "analyzer": {
        "default": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": ["trim", "lowercase"]
        }
      }
    }
  }
}'

curl -XPOST 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/_open'

curl -XDELETE 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/dedup_out/'

curl -XPUT 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/dedup_out/_mapping' -d '
{
  "dedup_out" : {
    "_ttl" : { "enabled" : true, "default" : "7d" },
    "_source" : { "enabled" : true },
    "_all" : { "enabled" : false },
    "norms" : { "enabled" : false },
    "ignore_above" : 128
  }
}'

Let me know if you need any more details.

Thanks
Shobana


