I am trying to extract Bitcoin addresses from EML bodies that are indexed in Elasticsearch.
I am trying to use a scripted field for this, with the following script:
if (!doc["emlBody.keyword"].empty) {
    def m = /(: )([1,3]{1}[a-zA-Z0-9]{25,34})/.matcher(doc["emlBody.keyword"].value);
    if (m.matches()) { return m.group(2) }
    else { return "no match" }
}
else { return "NULL" }
However, after checking my data, it seems that I rarely enter the first if branch and just get "NULL" results, when I should get at least "no match" or the correct Bitcoin address.
I am a bit lost and not certain why this doesn't work.
You can try switching to this check for an empty email body: if (doc["emlBody.keyword"].value != null). I've found it works best.
Yes, I tried that, but no luck.
If I use the "emlBody" field after setting fielddata=true, I get more expected results: "no match" instead of "NULL". So I am entering the if statement, but my regex doesn't seem to pick anything up.
Hi camay123,
Could you share the format of the emlBody field so we can check your regex expression?
Cheers
The field has a Type and Format of String.
I believe this regex would be more appropriate for a Bitcoin address:
(: )([13][a-km-zA-HJ-NP-Z1-9]{25,34})
Both should work, but neither does. Here is part of a sample email:
Hello,
<br><br>
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Praesent sit amet imperdiet elit. Donec sapien orci, rutrum id odio quis, dictum facilisis velit.
<br><br>
Aliquam auctor pulvinar sapien. Ut ut mi fermentum, tempor ex sed, ultricies tellus: 1K8TqsB2C1iY8qdGqhnHfgen3uE8GBU7c8
<br><br>
Aliquam nunc purus, porta non rutrum id, luctus consequat tortor. Maecenas sollicitudin mi vel nisi ultricies, eget blandit mauris feugiat. Fusce et porttitor sem.
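One thing that may be worth checking against the sample above: Painless regexes use Java's Pattern and Matcher, where matches() succeeds only if the entire input matches the pattern, while find() looks for the pattern anywhere in the input. A minimal plain-Java sketch of that difference, run outside Elasticsearch and assuming the body looks like the sample (the class and method names are hypothetical, chosen for illustration):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; it only illustrates matches() vs. find().
class BitcoinAddressRegex {
    // Pattern from the thread: ": " followed by a Base58 string starting with 1 or 3.
    static final Pattern ADDRESS =
        Pattern.compile("(: )([13][a-km-zA-HJ-NP-Z1-9]{25,34})");

    // matches() is true only when the whole input is exactly the pattern.
    static boolean wholeInputMatches(String body) {
        return ADDRESS.matcher(body).matches();
    }

    // find() scans the input and reports the first occurrence of the pattern.
    static String findAddress(String body) {
        Matcher m = ADDRESS.matcher(body);
        return m.find() ? m.group(2) : "no match";
    }

    public static void main(String[] args) {
        // Shortened excerpt of the sample email body posted above.
        String body = "tempor ex sed, ultricies tellus: 1K8TqsB2C1iY8qdGqhnHfgen3uE8GBU7c8\n"
            + "<br><br>\nAliquam nunc purus";
        System.out.println(wholeInputMatches(body)); // false
        System.out.println(findAddress(body));       // 1K8TqsB2C1iY8qdGqhnHfgen3uE8GBU7c8
    }
}
```

If the same holds in the scripted field, replacing m.matches() with m.find() might be enough. Separately, [1,3] in the first pattern also accepts a literal comma as the first character; [13] is the tighter form.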
I forgot to ask: did you enable regex in your elasticsearch.yml file?
script.painless.regex.enabled: true
Yes, I did add this line to my yml file and then restarted ES.
Is there a way to confirm this is active?