How do I search for non-ASCII characters?

TimWard · December 21, 2018, 1:37pm

I've got a load of log data in Elasticsearch. Most of it is in ASCII, but very occasionally there's a non-ASCII character ... which breaks a downstream naive Python application.

What query can I use to find the documents that contain non-ASCII characters? Yes, I'll need to fix the downstream code anyway, but it would be interesting to know what is generating non-ASCII log documents. For example, all I know about one document is that it contains a U+FFFD character. How do I search for that?
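One possible approach, sketched here rather than confirmed in this thread: build a `regexp` query whose pattern matches any term containing a character outside U+0000 to U+007F, and separately check strings on the Python side to see which code points are present. The field name `message.keyword` is a hypothetical placeholder, and the sketch assumes that field is mapped as an unanalyzed `keyword` field; on an analyzed `text` field the pattern is applied to individual tokens, and the analyzer may already have dropped characters such as U+FFFD. Regexp queries with a leading wildcard can also be slow on large indices.

```python
import json


def non_ascii_chars(text):
    """Return (index, character, code point label) for each non-ASCII character."""
    return [
        (i, ch, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if ord(ch) > 0x7F
    ]


def regexp_query_body(field):
    """Build a search body matching terms that contain a non-ASCII character.

    Assumption: `field` is a keyword field, so the whole value is one term.
    The regexp query must match the entire term, hence the surrounding `.*`.
    """
    pattern = ".*[^\u0000-\u007f].*"
    return {"query": {"regexp": {field: {"value": pattern}}}}


if __name__ == "__main__":
    # "message.keyword" is a hypothetical field name used for illustration.
    body = regexp_query_body("message.keyword")
    # json.dumps escapes the range endpoints, so the body can be pasted safely.
    print(json.dumps(body))
    print(non_ascii_chars("login failed for user \ufffd"))
```

The printed JSON can be sent as the body of a `_search` request. To look for U+FFFD specifically, the negated class in the pattern could be replaced with that single character, under the same assumption about the field mapping.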

