TikaException[XML parse error]; nested: SAXParseException while parsing html

Hi,

I am getting the following exception when I try a PUT from Kibana against Elasticsearch 6.6.0 with ingest-attachment-6.6.0:

{
"error": {
"root_cause": [
{
"type": "exception",
"reason": "java.lang.IllegalArgumentException: ElasticsearchParseException[Error parsing document in field [resumeB64]]; nested: TikaException[XML parse error]; nested: SAXParseException[The markup in the document following the root element must be well-formed.];",
"header": { "processor_type": "attachment" }
}
],
"type": "exception",
"reason": "java.lang.IllegalArgumentException: ElasticsearchParseException[Error parsing document in field [resumeB64]]; nested: TikaException[XML parse error]; nested: SAXParseException[The markup in the document following the root element must be well-formed.];",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "ElasticsearchParseException[Error parsing document in field [resumeB64]]; nested: TikaException[XML parse error]; nested: SAXParseException[The markup in the document following the root element must be well-formed.];",
"caused_by": {
"type": "parse_exception",
"reason": "Error parsing document in field [resumeB64]",
"caused_by": {
"type": "tika_exception",
"reason": "XML parse error",
"caused_by": { "type": "s_a_x_parse_exception", "reason": "The markup in the document following the root element must be well-formed." }
}
}
},
"header": { "processor_type": "attachment" }
},
"status": 500
}
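
The most specific message in a response like this sits at the bottom of the nested "caused_by" chain (here, the s_a_x_parse_exception). As a minimal sketch, assuming the response body has already been parsed into nested Map objects by whatever JSON library is in use, the chain could be walked as below. The class and method names are hypothetical and are not part of any Elasticsearch client API; the map built in main is a hand-written stand-in for the inner part of the error above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RootCauseFinder {

    /** Follows nested "caused_by" objects and returns the innermost one. */
    static Map<String, Object> innermostCause(Map<String, Object> error) {
        Map<String, Object> current = error;
        // Descend for as long as the current level has a "caused_by" object.
        while (current.get("caused_by") instanceof Map) {
            @SuppressWarnings("unchecked")
            Map<String, Object> next = (Map<String, Object>) current.get("caused_by");
            current = next;
        }
        return current;
    }

    public static void main(String[] args) {
        // Hand-built stand-in for part of the parsed error (assumption:
        // a real caller would get this structure from a JSON parser).
        Map<String, Object> sax = new LinkedHashMap<>();
        sax.put("type", "s_a_x_parse_exception");
        sax.put("reason", "The markup in the document following the root element must be well-formed.");

        Map<String, Object> tika = new LinkedHashMap<>();
        tika.put("type", "tika_exception");
        tika.put("reason", "XML parse error");
        tika.put("caused_by", sax);

        Map<String, Object> parse = new LinkedHashMap<>();
        parse.put("type", "parse_exception");
        parse.put("reason", "Error parsing document in field [resumeB64]");
        parse.put("caused_by", tika);

        Map<String, Object> root = innermostCause(parse);
        System.out.println(root.get("type") + ": " + root.get("reason"));
    }
}
```

Logging only the innermost type and reason keeps the output short enough to paste into a post without hitting the content length limit.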

The following is the exception from the REST client:

"extendedStackTrace": "org.elasticsearch.ElasticsearchStatusException: Elasticsearch exception [type=exception, reason=java.lang.IllegalArgumentException: ElasticsearchParseException[Error parsing document in field [resumeB64]]; nested: TikaException[XML parse error]; nested: SAXParseException[The markup in the document following the root element must be well-formed.]; ]\n\tat org.elasticsearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:177) ~[elasticsearch-6.5.0.jar!/:6.5.0]\n\tat org.elasticsearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:1793) ~[elasticsearch-rest-high-level-client-6.5.0.jar!/:6.5.0]\n\tat org.elasticsearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:1769) ~[elasticsearch-rest-high-level-client-6.5.0.jar!/:6.5.0]\n\tat org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1606) ~[elasticsearch-rest-high-level-client-6.5.0.jar!/:6.5.0]\n\tat org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1563) ~[elasticsearch-rest-high-level-client-6.5.0.jar!/:6.5.0]\n\tat org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1546) ~[elasticsearch-rest-high-level-client-6.5.0.jar!/:6.5.0]\n\tat org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1512) ~[elasticsearch-rest-high-level-client-6.5.0.jar!/:6.5.0]\n\tat org.elasticsearch.client.RestHighLevelClient.index(RestHighLevelClient.java:858) ~[elasticsearch-rest-high-level-client-6.5.0.jar!/:6.5.0]\n\tat com.zeyo.ats.data.es.repo.ApplicantRepository.save(ApplicantRepository.java:69) ~[classes!/:2018.4.1.0]\n\tat com.zeyo.ats.data.es.service.ApplicantServiceImpl.save(ApplicantServiceImpl.java:119) ~[classes!/:2018.4.1.0]\n\tat com.zeyo.ats.data.es.service.ExternalCandidateSave.saveExternalCandidate(ExternalCandidateSave.java:118) ~[classes!/:2018.4.1.0]\n\tat 
com.zeyo.ats.data.es.service.ExternalCandidateSave.saveExternalCandidateData(ExternalCandidateSave.java:48) ~[classes!/:2018.4.1.0]\n\tat com.zeyo.ats.data.es.service.ApplicantServiceImpl.save(ApplicantServiceImpl.java:102) ~[classes!/:2018.4.1.0]\n\tat com.zeyo.ats.controller.ApplicantController.createExternalApplicant(ApplicantController.java:345) ~[classes!/:2018.4.1.0]\n\tat com.zeyo.ats.controller.ApplicantController.externalApplicationFromMonster(ApplicantController.java:455) [classes!/:2018.4.1.0]\n\tat com.zeyo.ats.controller.ApplicantController$$FastClassBySpringCGLIB$$3023edfa.invoke(<generated>) [classes!/:2018.4.1.0]\n\tat org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204) [spring-core-5.0.7.RELEASE.jar!/:5.0.7.RELEASE]\n\tat org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:746) [spring-aop-5.0.8.RELEASE.jar!/:5.0.8.RELEASE]\n\tat org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:163) [spring-aop-5.0.8.RELEASE.jar!/:5.0.8.RELEASE]\n\tat com.zeyo.ats.springboot.TpPerformanceMonitorInterceptor.invokeUnderTrace(TpPerformanceMonitorInterceptor.java:25) [classes!/:2018.4.1.0]\n\tat org.springframework.aop.interceptor.AbstractTraceInterceptor.invoke(AbstractTraceInterceptor.java:130) [spring-aop-5.0.8.RELEASE.jar!/:5.0.8.RELEASE]\n\tat org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185) [spring-aop-5.0.8.RELEASE.jar!/:5.0.8.RELEASE]\n\tat org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92) [spring-aop-5.0.8.RELEASE.jar!/:5.0.8.RELEASE]\n\tat org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:185) [spring-aop-5.0.8.RELEASE.jar!/:5.0.8.RELEASE]\n\tat org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:688) 
[spring-aop-5.0.8.RELEASE.jar!/:5.0.8.RELEASE]\n\tat com.zeyo.ats.controller.ApplicantController$$EnhancerBySpringCGLIB$$eb811588.externalApplicationFromMonster(<generated>) [classes!/:2018.4.1.0]\n\tat sun.reflect.GeneratedMethodAccessor1315.invoke(Unknown Source) ~[?:?]\n\tat sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_151]\n\tat java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_151]\n\tat org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:209)
..........

Can anyone please help me resolve this issue? Because of the content length limit, I am unable to share the base64 string here. I can provide it by email if that helps in debugging the issue.

Please don't post the same topic more than once; it makes it harder for people to assist you.

Let's continue the discussion here: RuntimeException while parsing doc file