The Elasticsearch File System Crawler team is pleased to announce the fscrawler-2.4 release!
FS Crawler offers a simple way to index local files into Elasticsearch.
Changes in this version include:
New features:
- Add support for more metadata out of the box Issue: 423. Thanks to dadoonet.
- Add support for language setting for tesseract Issue: 414. Thanks to dadoonet.
- Documentation about user mappings is wrong Issue: 408. Thanks to ploncker.
- No _settings_doc.json Issue: 405. Thanks to madergaser.
- Make logger externally configurable Issue: 394. Thanks to dadoonet.
Fixed Bugs:
- Fix warning messages when extracting rating field Issue: 425. Thanks to dadoonet.
- Filename is garbled when indexing a file whose name contains Chinese characters Issue: 420. Thanks to dadoonet.
- Set Tika RESOURCE_NAME_KEY before extraction Issue: 413. Thanks to dadoonet.
- pdf_ocr should disable OCR entirely Issue: 410. Thanks to dadoonet.
- Support text files with non-UTF-8 encodings (Shift-JIS) Issue: 400. Thanks to 710255930500.
Changes:
- Increase the limit of fields to 2000 instead of 1000 Issue: 421. Thanks to dadoonet.
- Update to Elasticsearch High Level REST Client 6.0.0-beta1 Issue: 417. Thanks to dadoonet.
- Update to Elasticsearch 5.6.0 Issue: 416. Thanks to dadoonet.
- Explosion in number of fields Issue: 415. Thanks to Ramon-zaro.
- Update to Tika 1.16 Issue: 412. Thanks to dadoonet.
- Update to Elasticsearch High Level REST Client 5.6.0 Issue: 402. Thanks to dadoonet.
For a manual installation, you can download fscrawler-2.4 here:
https://repo1.maven.org/maven2/fr/pilato/elasticsearch/crawler/fscrawler/2.4/
Have fun!
-Elasticsearch File System Crawler team