[ANNOUNCEMENT] - Elasticsearch File System Crawler 2.4 released


(David Pilato) #1

The Elasticsearch File System Crawler team is pleased to announce the fscrawler-2.4 release!

FS Crawler offers a simple way to index local files into Elasticsearch.

Changes in this version include:

New features:

  • Add support for more metadata out of the box Issue: 423. Thanks to dadoonet.
  • Add support for language setting for tesseract Issue: 414. Thanks to dadoonet.
  • Documentation about user mappings is wrong Issue: 408. Thanks to ploncker.
  • No _settings_doc.json Issue: 405. Thanks to madergaser.
  • Make logger externally configurable Issue: 394. Thanks to dadoonet.

Fixed Bugs:

  • Fix warning messages when extracting rating field Issue: 425. Thanks to dadoonet.
  • Filename is garbled when indexing a file whose name contains Chinese characters Issue: 420. Thanks to dadoonet.
  • Set Tika RESOURCE_NAME_KEY before extraction Issue: 413. Thanks to dadoonet.
  • pdf_ocr should disable OCR entirely Issue: 410. Thanks to dadoonet.
  • Can FSCrawler support text files with a non-UTF-8 encoding (Shift-JIS)? Issue: 400. Thanks to 710255930500.

Changes:

  • Increase the limit of fields to 2000 instead of 1000 Issue: 421. Thanks to dadoonet.
  • Update to Elasticsearch High Level REST Client 6.0.0-beta1 Issue: 417. Thanks to dadoonet.
  • Update to Elasticsearch 5.6.0 Issue: 416. Thanks to dadoonet.
  • Explosion in number of fields Issue: 415. Thanks to Ramon-zaro.
  • Update to Tika 1.16 Issue: 412. Thanks to dadoonet.
  • Update to Elasticsearch High Level REST Client 5.6.0 Issue: 402. Thanks to dadoonet.
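
For context on the field limit change (Issue: 421): Elasticsearch caps the number of fields in an index through the index.mapping.total_fields.limit setting, which defaults to 1000. The sketch below assumes that this is the setting the change refers to. It shows how the same limit could be raised by hand on an existing index. The index name "docs", the node URL and both helper functions are hypothetical and are not part of FSCrawler.

```python
import json
import urllib.request


def total_fields_limit_body(limit: int) -> dict:
    """Build an index settings body that sets the mapping field limit."""
    return {"index.mapping.total_fields.limit": limit}


def put_index_settings(base_url: str, index: str, body: dict) -> int:
    """Send the body to PUT <base_url>/<index>/_settings and return the HTTP status."""
    request = urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return response.status


# Build the settings body for the new limit of 2000 fields.
body = total_fields_limit_body(2000)
print(json.dumps(body))

# To apply it to a running node (hypothetical index name and URL):
# put_index_settings("http://localhost:9200", "docs", body)
```

The network call is left commented out so the snippet runs without a live cluster. Raising the limit only hides the symptom if documents keep producing new fields, so it may be worth checking which metadata fields are actually needed.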

For a manual installation, you can download fscrawler-2.4 here:
https://repo1.maven.org/maven2/fr/pilato/elasticsearch/crawler/fscrawler/2.4/

Have fun!
-Elasticsearch File System Crawler team