null_value and unmapped fields: do they affect the index and sorting?

A couple questions:

Primary problem: sorting on some_subdoc.status gets a NumberFormatException.

We have a logs use case with a number of types that use different mappings, and many different sub-objects/sub-documents have a "status" field. When a user runs a search that sorts on, e.g., response.status, the search fails with the NumberFormatException ("is this really an INT?").

I have added "null_value": 0 to all the various status fields and re-pushed the mapping. When I search for 0 in that field, the record retrieved still shows the original null in the xxxx.status field.

Question 1: Does the sort work off the index, or off the value in _source? (And if the latter, do I have to change the null value before it gets indexed at all, e.g. in Logstash?)

Also we get things like "x": {"y": {"status":null}} in some logs; the "x" object is set to non-dynamic.

Question 2: Is there any chance that x.y.status could be messing up the sort?

Thanks for whatever you good folks can tell me.

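Sorting works on fielddata, which is loaded from doc values if you enabled them, or built from the index otherwise. It does not read _source, and null_value only changes what gets indexed, which is why _source still shows the original null.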

I suspect your NumberFormatException occurs because you have a field in your index that is mapped as a string on one type and as an integer on another type. Unfortunately, Elasticsearch doesn't support this, and we will enforce consistent mappings as of 2.0. The only options are to reindex each type into a different index, or to assign field names in such a way that each name has the same mapping on all types.
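
One way to check for that kind of conflict is to compare the mappings of all types in an index, field by field. The following is only a rough sketch: it assumes you have already fetched the per-type mappings (for example from the get-mapping API) into a plain dict of the form {type_name: {"properties": {...}}}. The type names and the sample data are made up for illustration.

```python
from collections import defaultdict


def flatten_properties(properties, prefix=""):
    """Yield (dotted_path, field_type) for every leaf field in a mapping."""
    for name, spec in properties.items():
        path = prefix + name
        if "properties" in spec:
            # Object field: recurse into its sub-fields.
            yield from flatten_properties(spec["properties"], path + ".")
        else:
            yield path, spec.get("type", "object")


def find_conflicts(index_mappings):
    """Return {field_path: {doc_type: field_type}} for inconsistently mapped fields."""
    seen = defaultdict(dict)
    for doc_type, mapping in index_mappings.items():
        for path, field_type in flatten_properties(mapping.get("properties", {})):
            seen[path][doc_type] = field_type
    # Keep only the paths that have more than one distinct field type.
    return {p: t for p, t in seen.items() if len(set(t.values())) > 1}


# Hypothetical mappings for three types in one index (not from the real cluster).
sample_mappings = {
    "access_log": {"properties": {"response": {"properties": {
        "status": {"type": "integer", "null_value": 0}}}}},
    "app_log": {"properties": {"response": {"properties": {
        "status": {"type": "string"}}}}},
    "audit_log": {"properties": {"user": {"type": "string"}}},
}

conflicts = find_conflicts(sample_mappings)
for path, by_type in conflicts.items():
    print(path, by_type)
```

Any path this reports is a candidate for the rename-or-reindex fix described above.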

Yes, I suspect that too.
Would those nulls in the _source cause this, or must there be a string somewhere?

I'm tearing my hair out trying to find the value in a G of docs per day. Guess I'll write a script to scan/scroll a day's index.
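
A scan like that might look roughly like the sketch below. It assumes the hits come from a scan/scroll over the day's index (for instance via the `helpers.scan` generator in the elasticsearch-py client) and only shows the per-document check. The function names and the sample hits are hypothetical. Note that it can only find offending values in documents that still exist.

```python
import re

INTEGER_PATTERN = re.compile(r"-?\d+")


def get_path(doc, dotted_path):
    """Walk a dotted path through nested dicts; return None if a step is missing."""
    current = doc
    for key in dotted_path.split("."):
        if not isinstance(current, dict):
            return None
        current = current.get(key)
    return current


def is_integer_like(value):
    """True for null, ints, integral floats, and strings made only of digits."""
    if value is None:
        return True
    if isinstance(value, bool):
        return False
    if isinstance(value, int):
        return True
    if isinstance(value, float):
        return value.is_integer()
    if isinstance(value, str):
        return INTEGER_PATTERN.fullmatch(value) is not None
    return False


def find_offenders(hits, field):
    """Yield (_id, value) for hits whose _source value at `field` is not integer-like."""
    for hit in hits:
        value = get_path(hit.get("_source", {}), field)
        if not is_integer_like(value):
            yield hit.get("_id"), value


# Hypothetical hits, shaped like the documents a scan/scroll returns.
sample_hits = [
    {"_id": "1", "_source": {"response": {"status": 200}}},
    {"_id": "2", "_source": {"response": {"status": None}}},
    {"_id": "3", "_source": {"response": {"status": "OK"}}},
    {"_id": "4", "_source": {"x": {"y": {"status": None}}}},
    {"_id": "5", "_source": {"response": {"status": "404"}}},
]

offenders = list(find_offenders(sample_hits, "response.status"))
print(offenders)
```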

I don't think null could be the issue. However, note that if you ever had a document with a string value in that field, the issue might still exist, even if that document has since been removed.