Problem with accents

Hello everyone,

I am trying to load data from a CSV file into Elasticsearch, but accented characters such as "é" are not imported correctly; for example, I get \xE0 instead of "à". Is there a way to fix this?

Here is my config. The accents appear in the column "Libellé évènement":

###############################################################################################################

input
{
file
{
path => "C:/Users/BEKRISO/KIBANA7.0.1/INPUT/9r_piste_audit.csv"
start_position => "beginning"
sincedb_path => "C:/Users/BEKRISO/KIBANA7.0.1/sincedb"
}
}

############################################################################################################################

filter
{
csv
{
separator => ","

	columns => ["Date et heure","Utilisateur","Code","Libelle evenement","Code retour","Application","Code site","Objet Start","Usage cache","Valeur avant modif","Valeur apres modif"]
}

	
mutate{

	convert => { 
		
		"Date et heure" => "string"
		"Utilisateur" => "string" 
		"Code" => "integer" 
		"Libellé évènement" => "string" 
		"Code retour" => "integer" 
		"Application" => "string" 
		"Code site" => "integer" 
		"Objet Start" => "string" 
		"Usage cache" => "string" 						
		"Valeur avant modif" => "string" 
		"Valeur après modif" => "string"	
	
	}
	
	#Gestion des accents
	rename => { "Libelle evenement" => "Libellé évènement"  
				"Valeur apres modif" => "Valeur après modif" }
					

}

date {  match => [ "Date et heure", "dd/MM/YY HH:mm" ] }

}

##############################################################################################################################

output
{
elasticsearch
{
hosts => "cas0000658713:9200"
index => "monbeaunode_1"
}

stdout { codec => rubydebug }

}
##############################################"

You need to make sure that your content is encoded in UTF-8 and not in another encoding.

Thank you for your response. I tried the following config but still have the same problem:

###############################################################################################################

input
{
file
{
path => "C:/Users/BEKRISO/KIBANA7.0.1/INPUT/9r_piste_audit.csv"
start_position => "beginning"
sincedb_path => "C:/Users/BEKRISO/KIBANA7.0.1/sincedb"
}
stdin
{
codec => plain { charset=>"UTF-8" }
}
}

############################################################################################################################

filter
{
csv
{
separator => ","

	columns => ["Date et heure","Utilisateur","Code","Libelle evenement","Code retour","Application","Code site","Objet Start","Usage cache","Valeur avant modif","Valeur apres modif"]
}

	
mutate{

	convert => { 
		
		"Date et heure" => "string"
		"Utilisateur" => "string" 
		"Code" => "integer" 
		"Libellé évènement" => "string" 
		"Code retour" => "integer" 
		"Application" => "string" 
		"Code site" => "integer" 
		"Objet Start" => "string" 
		"Usage cache" => "string" 						
		"Valeur avant modif" => "string" 
		"Valeur après modif" => "string"	
	
	}
	
	#Gestion des accents
	rename => { "Libelle evenement" => "Libellé évènement"  
				"Valeur apres modif" => "Valeur après modif" }
					

}

date {  match => [ "Date et heure", "dd/MM/YY HH:mm" ] }

}

##############################################################################################################################

output
{
elasticsearch
{
hosts => "cas0000658713:9200"
index => "monbeaunode_1"
}

stdout { codec => rubydebug }

}

Please format your code, logs, or configuration files using the </> icon as explained in this guide, and not the citation button. It will make your post more readable.

Or use markdown style like:

```
CODE
```

There's a live preview panel for exactly this reason.

Lots of people read these forums, and many of them will simply skip over a post that is difficult to read, because it's just too large an investment of their time to try and follow a wall of badly formatted text.
If your goal is to get an answer to your questions, it's in your interest to make it as easy to read and understand as possible.
Please update your post.

I meant that the CSV file itself should be in UTF-8. If it is not, change the codec's charset to the encoding the file actually uses.

My CSV file is in UTF-8 and I also set the codec charset to UTF-8, but I still get the same error.
Here is my config:

input
{
	file
	{
		path => "C:/Users/BEKRISO/KIBANA7.0.1/INPUT/9r_piste_audit.csv"
		start_position => "beginning"
		sincedb_path => "C:/Users/BEKRISO/KIBANA7.0.1/sincedb"
		codec => plain { charset => "UTF-8" }
	}
}


filter
{
	csv
	{
		separator => ","
		
		columns => ["Date et heure","Utilisateur","Code","Libelle evenement","Code retour","Application","Code site","Objet Start","Usage cache","Valeur avant modif","Valeur apres modif"]
	}

		
	mutate{
	
		convert => { 
			
			"Date et heure" => "string"
			"Utilisateur" => "string" 
			"Code" => "integer" 
			"Libellé évènement" => "string" 
			"Code retour" => "integer" 
			"Application" => "string" 
			"Code site" => "integer" 
			"Objet Start" => "string" 
			"Usage cache" => "string" 						
			"Valeur avant modif" => "string" 
			"Valeur après modif" => "string"	
		
		}
		
		#Gestion des accents
		rename => { "Libelle evenement" => "Libellé évènement"  
					"Valeur apres modif" => "Valeur après modif" }
						
	
	}
	
	date {  match => [ "Date et heure", "dd/MM/YY HH:mm" ] }

				
}

output
{
	elasticsearch
	{
		hosts => "cas0000658713:9200"
		index => "monbeaunode_1"
	}
	
stdout { codec => rubydebug }

}

Could you share your CSV file somewhere?

Here are some lines from my CSV file:

   Date et heure,Utilisateur,Code,Libellé évènement,Code retour,Application,Code site,Objet Start,Usage cache,Valeur avant modif ,Valeur après modif
    17/06/2019 07:37,1,Appel à la passerelle par une application cliente,00,9R,0990,TA-NM252-0,NON,,
    17/06/2019 06:00,1,Appel à la passerelle par une application cliente,00,TM,0990,TA-KVNIVITG-0,OUI,,
    17/06/2019 03:00,1,Appel à la passerelle par une application cliente,00,KV_WKW,0990,TA-KVNIVITG-0,NON,,
    16/06/2019 06:00,1,Appel à la passerelle par une application cliente,00,TM,0990,TA-KVNIVITG-0,NON,,

Sure but can you share the source file (C:/Users/BEKRISO/KIBANA7.0.1/INPUT/9r_piste_audit.csv) that you are using? Not a copy and paste.

I can't share it here; the only file extensions allowed for upload are jpeg, jpg, png, and gif.

Try here: https://filebin.ca/

https://filebin.ca/4l1iz64QbIS7/9r_piste_audit.csv

Here is the content when you read it with UTF-8 encoding:

Date et heure,Utilisateur,Code,Libell? ?v?nement,Code retour,Application,Code site,Objet Start,Usage cache,Valeur avant modif ,Valeur apr?s modif
17/06/2019 07:37,1,Appel ? la passerelle par une application cliente,00,9R,0990,TA-NM252-0,NON,,
17/06/2019 06:00,1,Appel ? la passerelle par une application cliente,00,TM,0990,TA-KVNIVITG-0,OUI,,

So your file is not encoded in UTF-8.
I read it as ISO-8859-1 and saved it as UTF-8. It's now this file: https://filebin.ca/4l1w3cBnDMd3/9r_piste_audit.csv
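
If re-saving the file is not practical, the other option mentioned earlier in the thread is to tell Logstash which encoding the file really uses. A minimal sketch, assuming the source file stays in ISO-8859-1 (as the test above suggests) and reusing the paths from the original config; only the `charset` value differs from the last config posted:

```
input
{
	file
	{
		path => "C:/Users/BEKRISO/KIBANA7.0.1/INPUT/9r_piste_audit.csv"
		start_position => "beginning"
		sincedb_path => "C:/Users/BEKRISO/KIBANA7.0.1/sincedb"
		# Assumption: the file is ISO-8859-1. Declaring the real source encoding
		# lets the plain codec convert each line to UTF-8 when it is read.
		codec => plain { charset => "ISO-8859-1" }
	}
}
```

Setting `charset => "UTF-8"` on a file that is actually ISO-8859-1 does not convert anything, which would explain why the earlier attempt changed nothing. Note also that the sincedb file records how far the input has already been read, so lines ingested before the change will probably not be re-read unless that sincedb file is removed.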
