[Elasticsearch exception [type=mapper_parsing_exception, reason=failed to parse]]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=Malformed content, found extra data after parsing: FIELD_NAME]];
16:10:16,190 DEBUG [f.p.e.c.f.c.v.ElasticsearchClientV7] Error caught for [ips-internal-doc-index]/[_doc]/[8f531bfbb22847e4c87c31a17a6284]: ElasticsearchException[Elasticsearch exception [type=mapper_parsing_exception, reason=failed to parse]]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=Malformed content, found extra data after parsing: FIELD_NAME]];
16:10:16,191 WARN  [f.p.e.c.f.c.v.ElasticsearchClientV7] Got [3] failures of [4] requests
The Elasticsearch log shows:
"Caused by: java.lang.IllegalArgumentException: Malformed content, found extra data after parsing: FIELD_NAME",
"at org.elasticsearch.index.mapper.DocumentParser.validateEnd(DocumentParser.java:146) ~[elasticsearch-7.3.0.jar:7.3.0]",
"at org.elasticsearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:72) ~[elasticsearch-7.3.0.jar:7.3.0]",
"... 34 more"] }
The same configuration worked fine for me with the same documents in FSCrawler 2.6, and I am using an all-default Elasticsearch configuration.
Please see if you can access the sample document. In fact, almost all documents are failing with the same error, even though FSCrawler can parse them, extract metadata, and create the JSON.
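In case it helps, here is one way to look at what the existing index actually contains versus what FSCrawler 2.7 expects (the index name is taken from the log above, a stale 2.6-era mapping is only my guess at the cause, and this assumes the default localhost:9200 endpoint):

  # Inspect the mapping of the index that FSCrawler 2.6 originally created
  curl -X GET "localhost:9200/ips-internal-doc-index/_mapping?pretty"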
@dadoonet It worked for me after I deleted the existing index and recreated it, as per the discussion in https://github.com/dadoonet/fscrawler/issues/755. I will come back if I hit any more issues; for now, the issue seems resolved.
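For anyone hitting the same error, a minimal sketch of that fix, assuming the index name from the log above, the default localhost:9200 endpoint, and a hypothetical FSCrawler job name:

  # Delete the index that was created by the older FSCrawler version
  curl -X DELETE "localhost:9200/ips-internal-doc-index"

  # Start the FSCrawler job again; it recreates the index with the current
  # mapping if the index is missing. The job name "my_job" is hypothetical.
  bin/fscrawler my_job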