Return ip field from inner_hit in dot-decimal format


(Benjamin Gathmann) #1

I have posted an issue on GitHub:


where I describe the problem of retrieving a field of type "ip" from an inner_hit in the correct format.

@jpountz wrote that this is not a bug, but I have found no way of solving this problem.
I have now tried to use source filtering on the inner_hit instead:
"inner_hits" : {"_source":"network.http.packet.dstip"}
but this returns nothing if I set
_source=False
in the query itself.
Alternatively, this:
_source=".network.http.packet.*"
also returns nothing, and this:
_source=".network.http.packet.dstip"
returns all entries in this branch (so not just the inner_hits), which is frustrating.

Is there a method I have overlooked?


(Benjamin Gathmann) #2

Is my use case too special or is there simply no solution? :wink:


(Benjamin Gathmann) #3

So 22 people have read this question, but there are no replies :frowning:
Should I open a new issue on GitHub then?
By the way, I also don't understand why, in my GitHub example (see above), I get e.g. 386205091 as a match for 184.28.188.99 (which has a binary value of 10111000000111001011110001100011 and a decimal equivalent of 3088890979).
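
For reference, the dot-decimal to integer arithmetic quoted above can be checked with a few lines of Python. This is only a minimal sketch using the standard library; the function name `ip_to_int` is made up for illustration, and it says nothing about how ES stores or returns the value internally.

```python
import ipaddress


def ip_to_int(dotted: str) -> int:
    """Convert a dot-decimal IPv4 string to its unsigned 32-bit integer value."""
    octets = [int(part) for part in dotted.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a valid IPv4 address: {dotted!r}")
    value = 0
    for octet in octets:
        # Shift the accumulated value left by one byte and append the next octet.
        value = (value << 8) | octet
    return value


ip = "184.28.188.99"
as_int = ip_to_int(ip)
print(as_int)                  # 3088890979
print(format(as_int, "032b"))  # 10111000000111001011110001100011
# Cross-check against the standard library.
assert as_int == int(ipaddress.IPv4Address(ip))
```

Both printed values agree with the numbers given for 184.28.188.99 in the post, so the expected integer is indeed 3088890979 and not 386205091.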


(Benjamin Gathmann) #4

I have done further tests in the meantime and improved my syntax skills ;-), and now at least I only get back correct matches. Right now the only solution I can see is to convert each IP address to dot-decimal format with a custom function, which is an awkward workaround (see the wonderful example for "PHP example to convert decimal to IP" here: http://stackoverflow.com/questions/12130464/ip-address-conversion-to-decimal-and-vice-versa).
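
The client-side workaround described above might look roughly like this in Python. It is a sketch only: `int_to_ip` is a hypothetical helper name, and it assumes the value returned by ES is the plain unsigned 32-bit representation of an IPv4 address.

```python
import ipaddress


def int_to_ip(value: int) -> str:
    """Convert an unsigned 32-bit integer to a dot-decimal IPv4 string."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError(f"out of range for IPv4: {value}")
    # Extract the four octets, most significant byte first.
    octets = [(value >> shift) & 0xFF for shift in (24, 16, 8, 0)]
    return ".".join(str(o) for o in octets)


print(int_to_ip(3088890979))  # 184.28.188.99
# The standard library gives the same result without a custom function.
assert int_to_ip(3088890979) == str(ipaddress.IPv4Address(3088890979))
```

In practice `str(ipaddress.IPv4Address(value))` alone is enough; the hand-written version is shown only to make the byte arithmetic explicit.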
This is really bad and should be fixed in ES.


(Benjamin Gathmann) #5

I have opened an issue on GitHub:


but the ES team closed it like the one before (see first post), so I suppose they think there is some way to solve it with the existing DSL.
Who can give advice?


(Benjamin Gathmann) #6

This issue actually takes me a step further to this question:
How is source filtering supposed to work for inner_hits?
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-inner-hits.html says that it is supported, but I have not seen any example, and I could not get it to work so far (see my explanation in my very first post above).

I have also noticed that the problem with the formatting does not only affect IP addresses, but also e.g. analyzed strings. If I request these using fielddata_fields, I receive a list of all the tokens that my original string was split up into, but not the original string.


(David Pilato) #7

What does this "_source": "network.http.packet.dstip" give?


(Benjamin Gathmann) #8

The _source element for my inner_hits remains empty whatever I try.
Oddly enough, I can return the item through the document _source, but then it returns all entries.
So basically _source filtering seems to fail for inner_hits.


(David Pilato) #9

Can you reproduce it with a full SENSE example? As a gist, it would be fantastic!

David Pilato- Developer | Evangelist
@dadoonet|@elasticfr


(Benjamin Gathmann) #10

Hi David and anyone interested in this issue: I have found the SOLUTION :smile:
In the _source part of the inner_hits definition, you must list the fields using the RELATIVE path to the nested node.
I.e. if the nested path is:
network.http.packet
and the field you are looking for is:
network.http.packet.dstip
then you need to write
"inner_hits" : {"_source":"dstip"}
while for fielddata_fields, it must be:
"inner_hits" : {"fielddata_fields":"network.http.packet.dstip"}
What an Überraschung! :wink:
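
Put together, a search body using both forms could be assembled like this in Python. This is a hedged sketch, not a tested request: the `term` filter, the helper name `build_search_body`, and the choice to request both `_source` and `fielddata_fields` in one inner_hits block are assumptions for illustration, and the behaviour is the one observed in this thread, so it may differ in other ES versions.

```python
import json

NESTED_PATH = "network.http.packet"  # nested path from this thread


def build_search_body(ip: str) -> dict:
    """Build a nested query whose inner_hits return only the dstip field."""
    full_field = f"{NESTED_PATH}.dstip"
    return {
        "query": {
            "nested": {
                "path": NESTED_PATH,
                # Hypothetical filter; substitute whatever nested query you use.
                "query": {"term": {full_field: ip}},
                "inner_hits": {
                    # _source paths are RELATIVE to the nested path ...
                    "_source": "dstip",
                    # ... while fielddata_fields take the FULL path.
                    "fielddata_fields": full_field,
                },
            }
        }
    }


print(json.dumps(build_search_body("184.28.188.99"), indent=2))
```

The resulting dict can be passed as the request body to whichever client you use; sending it is left out here because the client setup is not part of this thread.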


(Benjamin Gathmann) #11

I added this information as a PR to the ES reference, see https://github.com/elastic/elasticsearch/pull/16571

