How can I parse the following in Logstash?

    1st candidate: 9892143415 is the ph. no.
    2nd candidate: my phone no. is 8976635405
    3rd candidate: hii all 8976547545 my phone no.

I have 100 candidates like these, and each has written the phone number in a different position. How can I extract only the phone numbers without writing 100 parsers?

You could match a group of ten digits bounded by whitespace or by the start or end of the line:

grok { match => { "message" => "(^|\s)(?<phone>\d{10})($|\s)" } }
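To check the expression against the three sample lines outside Logstash, here is a minimal Python sketch. It is only an approximation of what grok does: Python's `re` module spells named groups `(?P<phone>...)` where grok uses `(?<phone>...)`, and the helper name `extract_phone` is made up for this illustration.

```python
import re

# Same expression as the grok pattern above, with Python's named-group syntax.
PHONE_RE = re.compile(r"(^|\s)(?P<phone>\d{10})($|\s)")

samples = [
    "1st candidate: 9892143415 is the ph. no.",
    "2nd candidate: my phone no. is 8976635405",
    "3rd candidate: hii all 8976547545 my phone no.",
]


def extract_phone(line):
    """Return the first standalone 10-digit run in line, or None."""
    match = PHONE_RE.search(line)
    return match.group("phone") if match else None


phones = [extract_phone(s) for s in samples]
print(phones)
```

Because the digits must be bounded by whitespace or a line boundary, a longer digit run (for example a 12-digit ID) does not yield a match.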

@Vishnu_mk you can try this grok pattern:
%{GREEDYDATA:candidate_name}:%{GREEDYDATA:data1}%{NUMBER:phone_number}%{GREEDYDATA:data2}

Thanks Badger

Hi,
I am creating an ingest pipeline, because the candidate details are in .docx format and I have used fscrawler to send them to Elasticsearch. They have been indexed successfully, but now I am trying to parse the details using a pipeline:

PUT _ingest/pipeline/demo3
{
  "description" : "fscrawler demo3",
  "processors" : [
    {
      "grok": {
        "field": "content",
        "patterns": ["(^|\s)(?<phone>\d{10})($|\s)"]
      }
    }
  ]
}

When I write this pipeline in Dev Tools, it reports "Bad string" in the pattern part.

It's giving me: Unrecognized character escape 's'

It works now after escaping the backslashes, since the pattern sits inside a JSON string. I changed:

(^|\s)(?<phone>\d{10})($|\s)

to
(^|\\s)(?<phone>\\d{10})($|\\s)
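The reason the doubled backslashes are needed can be reproduced with any strict JSON parser. The sketch below uses Python's `json` module as a stand-in, so the error wording differs from the one Elasticsearch prints; the variable names and the `try_parse` helper are made up for this illustration.

```python
import json

# Single backslashes: \s and \d are not valid JSON escape sequences.
bad_body = r'{"patterns": ["(^|\s)(?<phone>\d{10})($|\s)"]}'
# Doubled backslashes: each \\ decodes to one literal backslash.
good_body = r'{"patterns": ["(^|\\s)(?<phone>\\d{10})($|\\s)"]}'


def try_parse(body):
    """Parse a JSON body, returning the object or a short rejection note."""
    try:
        return json.loads(body)
    except json.JSONDecodeError as err:
        return f"rejected: {err.msg}"


bad_result = try_parse(bad_body)
good_result = try_parse(good_body)
print(bad_result)
print(good_result["patterns"][0])
```

After JSON decoding, the second body yields the original expression with single backslashes, which is what the grok processor then compiles.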
