Hi
I have an index with a basic analyzer. The data that I store is schemaless.
The mapping only says to store the document (_source is enabled).
I add documents to ES through an MR job. The data is held in a
LinkedHashMapWritable and loaded into ES.
The number of key/values varies from 15 to 40 per document. Also, the
key/values are not the same for all the documents. But I thought ES
schemaless document handling manages that.
However, the _source data returned by get or search contains wrong data: lots
of key/values that are not part of the document, and a few key/values of the
document are missing.
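
To make the mismatch concrete, a check along these lines can be run against a single document. This is only a minimal sketch in Python: the helper diff_source and the sample field names are hypothetical, and fetching the real _source from ES is left out.

```python
def diff_source(expected, returned):
    """Compare the document that was sent with the _source that came back.

    Both arguments are plain dicts of key/values. Returns the keys that
    are extra, missing, or present in both with a different value.
    """
    expected_keys = set(expected)
    returned_keys = set(returned)
    return {
        "extra": sorted(returned_keys - expected_keys),
        "missing": sorted(expected_keys - returned_keys),
        "changed": sorted(
            k for k in expected_keys & returned_keys
            if expected[k] != returned[k]
        ),
    }


# Hypothetical sample data, not taken from the real index.
sent = {"page": "home", "country": "US", "browser": "firefox"}
got = {"page": "home", "country": "DE", "session": "abc", "os": "linux"}

report = diff_source(sent, got)
print(report)
```

Keys listed under "extra" would be the key/values that do not belong to the document, and keys under "missing" the ones that were dropped.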
The index holds only about 100k documents.
Below are the index settings and the mapping.
curl -XPUT 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/'
curl -XPOST 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/_close'
curl -XPUT 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/_settings' -d '
{
  "index": {
    "analysis": {
      "analyzer": {
        "default": {
          "type": "custom",
          "tokenizer": "keyword",
          "filter": ["trim", "lowercase"]
        }
      }
    }
  }
}'
curl -XPOST 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/_open'
curl -XDELETE 'qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/dedup_out/'
curl -XPUT 'http://qalt-es01-1-346019.slc01.dev.ebayc3.com:9200/fpti/dedup_out/_mapping' -d '
{
  "dedup_out": {
    "_ttl": { "enabled": true, "default": "7d" },
    "_source": { "enabled": true },
    "_all": { "enabled": false },
    "norms": { "enabled": false },
    "ignore_above": 128
  }
}'
Let me know if you need any more details.
Thanks
Shobana