anveshdd
(anvesh)
October 24, 2018, 2:24pm
1
Hi,
I am working with Elasticsearch 6.3.x and using an external web crawler to crawl the data. When I check the index, the content contains a lot of newline characters. Are there any suggestions for avoiding them?
\tAbout\n\tSchools\n\tProspective Students\n\tGalleries\n\tNews\n\tEvents\n\tFaculty & Staff\n\tContact\n\n\n \n\n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n \n\n \n\n \n \n \n\n \n\n \n\n \n\n \n\n\n \n\n \n\n \n\n \n\n\n \n \n \n About\n \n \n Schools\n \n \n Prospective Students\n \n \n Galleries\n \n \n News\n \n \n Events\n
dadoonet
(David Pilato)
October 24, 2018, 3:15pm
2
Why not solve that in your crawler directly?
anveshdd
(anvesh)
October 24, 2018, 3:40pm
3
I tried in my crawler, but I have limited resources there. That's why I posted here, to ask whether there is any solution on the Elasticsearch side.
xavierfacq
(Xavier Facq)
October 24, 2018, 4:17pm
4
Hi,
You can look at the char filters you can add to your index analysis settings. In my project I have a filter like this one for special chars:
Doc: https://www.elastic.co/guide/en/elasticsearch/reference/6.4/analysis-custom-analyzer.html#_configuration_8
"char_filter" : {
  "custom-quotes" : {
    "type" : "mapping",
    "mappings" : [
      "\\u0091=>\\u0027",
      "\\u0092=>\\u0027",
      "\\u2018=>\\u0027",
      "\\u2019=>\\u0027",
      "\\u201B=>\\u0027",
      "\\uFF07=>\\u0027"
    ]
  }
}
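Adapted to the newline problem in the original question, the same idea could look roughly like the sketch below. It is not taken from my project: it swaps the `mapping` char filter for a `pattern_replace` char filter, and the names `collapse_whitespace` and `crawl_text` are made up for illustration. It is meant as the body of an index-creation request and assumes the `standard` tokenizer suits the crawled text.

```json
{
  "settings": {
    "analysis": {
      "char_filter": {
        "collapse_whitespace": {
          "type": "pattern_replace",
          "pattern": "\\s+",
          "replacement": " "
        }
      },
      "analyzer": {
        "crawl_text": {
          "type": "custom",
          "char_filter": ["collapse_whitespace"],
          "tokenizer": "standard",
          "filter": ["lowercase"]
        }
      }
    }
  }
}
```

The analyzer still has to be assigned to the text field in the mapping. Note that char filters only change what gets analyzed and indexed for search; the stored `_source` keeps the original newlines and tabs. If the goal is to clean the stored text itself, an ingest pipeline with a `gsub` processor is probably the better place to look.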
bye,
Xavier
system
(system)
Closed
November 21, 2018, 4:17pm
5
This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.