I have many fields (~800) in my CSV file. Most are integers, while a few are strings.
I do have a mapping file where I list the types, but it is not feasible to list every field and its type since the file keeps getting updated. If I miss a field, it gets indexed as a string. Is there any way to treat every field as an integer by default, except for the ones I specify as strings?
That way I would only need to specify the few string fields in my mapping file.
Hi, any response to this seemingly simple query?
Why not send numbers instead of strings in the first place?
If I'm not mistaken, that indeed becomes the question.
This would mean converting each and every field to an integer in the Logstash filter. I have ~800 fields, of which 8-10 are strings and the rest are all integers. I don't want to have to define those 790-odd fields as integers.
Hope that clears it up. Thanks.
Whatever script you want to apply to detect which fields are numeric or not, I'd recommend running it in Logstash.
You can also use an ingest pipeline to convert all the fields to numbers and then, if a conversion fails, catch the error and do nothing, leaving that field as a string.
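An untested sketch of that idea, using a `script` processor with Painless to attempt the conversion on every string field and swallow failures (the pipeline name is just an example):

```json
PUT _ingest/pipeline/ints-by-default
{
  "processors": [
    {
      "script": {
        "description": "Try to convert every string field to an integer; leave it as a string on failure",
        "source": """
          for (key in ctx.keySet().toArray()) {
            if (key.startsWith("_")) continue;   // skip metadata fields like _index
            def v = ctx[key];
            if (v instanceof String) {
              try {
                ctx[key] = Integer.parseInt(v);
              } catch (NumberFormatException e) {
                // not numeric, keep as string
              }
            }
          }
        """
      }
    }
  ]
}
```

You'd then point the Elasticsearch output (or the index settings) at this pipeline so it runs on every document before indexing.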
That might work.
Thanks for your responses.
Currently I am using a CSV filter to read the data in logstash.
Are you suggesting to use some ingest pipeline AFTER this step?
The CSV filter gives me an option to convert field types, but I don't want to do that for every field.
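For context, this is the csv filter `convert` option I'm trying to avoid scaling to ~790 fields (field names are made up):

```
filter {
  csv {
    separator => ","
    columns   => ["field1", "field2", "field3"]   # ... ~800 columns
    convert   => {
      "field1" => "integer"
      "field2" => "integer"
      # ... one entry per numeric field, which is exactly what I want to avoid
    }
  }
}
```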
Then maybe a ruby script as a filter in Logstash?
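Something along these lines is what I have in mind: an untested ruby filter sketch that walks every field on the event, tries to parse it as an integer, and leaves it alone if parsing fails:

```
filter {
  ruby {
    code => '
      event.to_hash.each do |field, value|
        next if field.start_with?("@")     # skip @timestamp, @version
        next unless value.is_a?(String)
        begin
          event.set(field, Integer(value, 10))
        rescue ArgumentError
          # not an integer, keep it as a string
        end
      end
    '
  }
}
```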
I'm moving your question to #logstash as experts there might have better ideas...
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.