Here is one way to parse the domain name from a URL.

For example, if the URL is "https://pixabay.com/images/search/planet/", what I need is the domain "pixabay.com".
We can use a scripted field to solve this problem with the help of this documentation.

Attention: never forget to use def, int, or String when defining a new variable.

First, we use indexOf('/') to find where "pixabay.com/images/search/planet/" begins. This method returns the position of the first '/' in the string, which is 6 in this URL (counting from 0).
That first '/' is part of "//", so the substring we want starts two characters later, at index 8 (the 'p'), and runs to the end of the string. So we use path.substring(fSlashIndex+2).
Second, to extract "pixabay.com" from "pixabay.com/images/search/planet/", we call indexOf('/') again on the new string and then use path_new.substring(0,lSlashIndex). The start position is 0 and the end position lSlashIndex is exclusive, so the '/' itself is left out.

def path = doc['url.keyword'].value;
if (path != null) {
    int fSlashIndex = path.indexOf('/');
    if (fSlashIndex > 0) {
        def path_new = path.substring(fSlashIndex+2);
        if (path_new != null) {
            int lSlashIndex = path_new.indexOf('/');
            if (lSlashIndex > 0) {
                return path_new.substring(0,lSlashIndex);
            } else {
                return path_new;
            }
        }
    }
}
return "";
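
To check the index arithmetic outside Kibana, here is a minimal plain-Java sketch of the same steps. It is not a Painless script: the class name DomainSketch and the method extractDomain are made up for illustration, and the length guard is an extra safety check that the scripted field above does not have.

```java
// Plain-Java sketch of the scripted field's index arithmetic (not Painless itself).
class DomainSketch {
    // Hypothetical helper: skip past "//", then cut at the next '/'.
    static String extractDomain(String url) {
        if (url == null) {
            return "";
        }
        int fSlashIndex = url.indexOf('/');            // first '/' of "//"
        // Extra guard (not in the original script): avoid substring() past the end.
        if (fSlashIndex <= 0 || fSlashIndex + 2 > url.length()) {
            return "";
        }
        String rest = url.substring(fSlashIndex + 2);  // drops "https://"
        int lSlashIndex = rest.indexOf('/');           // first '/' after the domain
        return lSlashIndex > 0 ? rest.substring(0, lSlashIndex) : rest;
    }

    public static void main(String[] args) {
        String url = "https://pixabay.com/images/search/planet/";
        System.out.println(url.indexOf('/'));    // 6
        System.out.println(extractDomain(url));  // pixabay.com
    }
}
```

A URL with no path, such as "https://pixabay.com", still yields "pixabay.com" here, because the rest of the string is returned when no further '/' is found.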

Thanks for sharing this!

Scripted fields can be expensive if you have a lot of them, so we always recommend extracting this during the ingestion process instead.
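
As a rough illustration of doing the extraction before indexing, the sketch below assumes the documents are prepared by Java application code, which the thread does not say. It uses the standard java.net.URI class; the class name IngestHostSketch and the method hostOf are hypothetical, and the result would be stored in a separate field of your choosing.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Sketch only: compute the domain once, before the document is indexed.
class IngestHostSketch {
    // Hypothetical helper: returns the host part of a URL, or "" if there is none.
    static String hostOf(String url) {
        if (url == null) {
            return "";
        }
        try {
            String host = new URI(url).getHost();  // null when the URL has no authority
            return host == null ? "" : host;
        } catch (URISyntaxException e) {
            return "";                              // malformed input
        }
    }

    public static void main(String[] args) {
        System.out.println(hostOf("https://pixabay.com/images/search/planet/"));  // pixabay.com
    }
}
```

Storing the value at index time means it is computed once per document rather than every time the field is displayed.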

Yeah, I do have to wait a "long" time to see the result of each field.

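Here is another, simpler script, written with the help of this doc on how to use the indexOf() method.
def path = doc['url.keyword'].value;
if (path != null) {
    // n: first '/' of "//"; m: next '/' found at or after the start of the domain.
    int n = path.indexOf('/');
    int m = path.indexOf('/', n+2);
    if (m > 0) {
        return path.substring(n+2,m);
    }
}
// Note: a URL with no '/' after the domain (e.g. "https://pixabay.com") also ends up here.
return "";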
