On Dec 9, 2013, at 11:53 AM, joergprante@gmail.com wrote:
I'm not sure I understand. Elasticsearch is returning usage errors to the API so the client can react appropriately. Only failures are in the log; they are often decoupled from the client action and not yet possible to expose to the caller. Some usage errors or misconceptions are not detectable, and Elasticsearch happily continues.
I am not saying that the approach is wrong -- simply that it could be easier. In case it isn't clear: I think that Elasticsearch will reduce the cost of doing complicated things with data, allowing one to focus on the problem instead of the mechanics. To me, Elasticsearch is up there with events like Java in the 90s and Node more recently (Hadoop, Mahout, BigTable, ... are too much work to keep running). I hope it keeps going.
What do you mean by debugging a mapping?
Suppose I have a document that has many different data conversion issues with time. (As time goes on, there will be other types with similar issues.)
It would be a time saver if the POST could return the error results.
What happens when someone sends 1 billion documents and wants to see what failed -- a special regex to find them in the log? I used to process 500,000,000 log entries a day with Oracle, and while it had issues, I could easily find every one that failed, using the same tools I used to add them. Log files are out of band. Just a comment.
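To be fair, the plain bulk API does seem to return per-item status in band, so this is really about the river path. A minimal sketch, assuming a direct bulk POST and that date_time_recorded is already mapped as a date:

curl -XPOST 'localhost:9200/test/b/_bulk' --data-binary '
{ "index" : { "_id" : "1" } }
{ "date_time_recorded" : "not-a-date" }
{ "index" : { "_id" : "2" } }
{ "date_time_recorded" : "2013-12-09T12:30:06.301Z" }
' | ./jq .

Each element of the "items" array in the response should carry either the stored _id or an "error" string, so failures could be collected with the same tools that sent the data.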
It is a broader issue about the general approach to using a technology. Often products developed by developers are easy for developers to use. I was trying to suggest an improvement over the status quo. Can I make this work the way it is? Yes -- but there is a much better way.
Maybe you can post an example of your error or failure in the mapping so we can get a clearer picture of what you have encountered? I know that the first steps to create a custom mapping are quite hard.
Thanks for that offer. I can't seem to figure it out.
I am using a river that uses follow from CouchDB to automatically index. There are so many exceptions from my date-time processing that the indexing actually slows down, due to exception stack processing I guess. I have four different time values in a nested JSON object:
{
    "consumerPrintToken" : "aaaaa",
    "bbbb" : [
        {
            "ccc" : "2013-12-09T07:30:10.488-0500",
            ......
        },
        {
            "ccc" : "2013-12-09T07:30:10.488-0500"
        }
    ],
    "unix_timestamp_created" : 1386592206,            (seconds since 1970)
    "date_time_recorded" : "2013-12-09T12:30:06.301Z",
    "token_time_stamp" : 1386592142134                (milliseconds since 1970)
}
So I need to create a mapping for the time values above and then start the CouchDB river. I am cautious, since I spent the last few days undoing a Couchbase/CouchDB interaction where Couchbase changed a template for all documents added, and I didn't know enough to undo it, so I had to start over.
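For the four time values, my first attempt would be an explicit mapping along these lines -- an untested sketch, assuming that a bare number in a date field is read as milliseconds since the epoch, which is why token_time_stamp can be a date while unix_timestamp_created stays a long until it can be multiplied by 1000:

curl -XPUT 'localhost:9200/test/b/_mapping' -d '{
    "b" : {
        "properties" : {
            "date_time_recorded" : { "type" : "date", "format" : "date_optional_time" },
            "token_time_stamp" : { "type" : "date" },
            "unix_timestamp_created" : { "type" : "long" },
            "bbbb" : {
                "properties" : {
                    "ccc" : { "type" : "date", "format" : "date_optional_time" }
                }
            }
        }
    }
}' | ./jq .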
I see that the couchdb river has a script option, so I could transform the value, but it is not clear how you get at all the nested time values:
{
    "type" : "couchdb",
    "couchdb" : {
        "script" : "ctx.doc.field1 = 'value1'"
    }
}
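For the seconds-to-milliseconds conversion, the same script hook might be able to rewrite the value before it is indexed -- a sketch, assuming the script really sees the document as ctx.doc and that arithmetic works there (I have not worked out how to reach into each element of the nested bbbb array):

{
    "type" : "couchdb",
    "couchdb" : {
        "script" : "ctx.doc.unix_timestamp_created = ctx.doc.unix_timestamp_created * 1000"
    }
}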
I am trying something like this in the mapping below, but I don't think I can do the division without using the script. I don't recall whether Java's DateFormat has support for milliseconds or microseconds; sometimes microseconds are useful at high data rates so the time can still be the key.
curl $CURL_OPTIONS -XPOST localhost:9200/test -d '{
    "settings" : {
        "number_of_shards" : 1
    },
    "mappings" : {
        "b" : {
            "_source" : {
                "enabled" : true
            },
            "dynamic_date_formats" : ["yyyy-MM-dd", "dd-MM-yyyy", "date_optional_time"],
            "properties" : {
                "cookie" : {
                    "type" : "string", "index" : "not_analyzed"
                }
            }
        }
    }
}' | ./jq .
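Once the index exists, I would check what actually got applied, and then POST one hand-built document straight at it, since a direct POST should return any mapping parse error in the response instead of burying it in the river's log:

curl -XGET 'localhost:9200/test/_mapping' | ./jq .
curl -XPOST 'localhost:9200/test/b' -d '{
    "date_time_recorded" : "2013-12-09T12:30:06.301Z",
    "token_time_stamp" : 1386592142134
}' | ./jq .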