[ANNOUNCEMENT] - Elasticsearch File System Crawler 2.2 released


(David Pilato) #1

The Elasticsearch File System Crawler team is pleased to announce the fscrawler-2.2 release!

FS Crawler offers a simple way to index local files into Elasticsearch.

Changes in this version include:

New features:
o Missing documentation for some local FS settings Issue: 287. Thanks to shadiakiki1986.
o Reorganize the documentation Issue: 281. Thanks to dadoonet.
o Add link to repo with Dockerfile usage of fscrawler Issue: 278. Thanks to shadiakiki1986.
o Documentation for loop moved under --loop instead of under --rest Issue: 277. Thanks to shadiakiki1986.
o Add fielddata for path fields Issue: 273. Thanks to dadoonet.
o Add new option to restart FS crawler Issue: 267. Thanks to dadoonet.
o Add a REST Layer Issue: 261. Thanks to dadoonet.
o Add tests for issue 221 Issue: 252. Thanks to dadoonet.
o Add new add_as_inner_object option to add metadata to json/xml documents Issue: 241. Thanks to babadofar.
o Add full support for elastic cloud (HTTPS) Issue: 236. Thanks to dadoonet.
o Test password protected files Issue: 229. Thanks to dadoonet.
o Add support for run only once Issue: 227. Thanks to dadoonet.
o Documentation request: update_rate Issue: 225. Thanks to Muffinman.
o Add OCR integration documentation Issue: 224. Thanks to Jdecaudin.
o Automatically deploy SNAPSHOT Issue: 214. Thanks to dadoonet.
o Add support for update mapping Issue: 205. Thanks to dadoonet.
o Add a test when mixing JSON and other files when json support is on Issue: 202. Thanks to dadoonet.
o Add field file.extension to the output Issue: 201. Thanks to bigtoerag.
o Multi-language support for indexed documents Issue: 162. Thanks to sgoeschl.
o New option: do not index folders Issue: 155. Thanks to nasreekar.
o Support for Basic Authentication - Shield/X-Pack Issue: 144. Thanks to Rilton.
o Add documentation about indexing on HDFS Issue: 63. Thanks to unixengineer.

Fixed Bugs:
o Files must be removed on delete of folder Issue: 298. Thanks to dadoonet.
o Don't swallow IOException when starting FSCrawler Issue: 295. Thanks to dadoonet.
o Not ingesting files with tilde ~ character in their name Issue: 291. Thanks to trorbyte.
o Fix default values Issue: 286. Thanks to dadoonet.
o NPE when using --rest option without Rest settings Issue: 279. Thanks to dadoonet.
o Content is null when creating the simplest job Issue: 276. Thanks to shadiakiki1986.
o Do not call System.exit() when shutting down FsCrawler Issue: 275. Thanks to dadoonet.
o Remove trailing / character in virtual path Issue: 274. Thanks to dadoonet.
o Fix NPE when ignoring dirs Issue: 270. Thanks to dadoonet.
o Fix Invalid UTF-8 error (wrong encoding) Issue: 269. Thanks to rgrativol.
o fscrawler doesn't exit after last run when --loop is specified Issue: 266. Thanks to jberkenbilt.
o Bulk Response must contain the original error message Issue: 258. Thanks to dadoonet.
o SSH does not detect modified files Issue: 257. Thanks to babadofar.
o Fix testProtectedDocument221 test Issue: 256. Thanks to dadoonet.
o fscrawler does not recover when it lost communication with elasticsearch Issue: 255. Thanks to twindragons1987.
o Fix metadata date extraction Issue: 253. Thanks to dadoonet.
o Timezone on _status.json file is UTC based instead of local Timezone Issue: 245. Thanks to christopherjm.
o Elasticsearch Client must use search size if set Issue: 240. Thanks to babadofar.
o Detect deletions correctly when directories have more than 10 items in them Issue: 239. Thanks to babadofar.
o Fscrawler does not delete documents from index on json support Issue: 237. Thanks to FrodeRennemo.
o Prevent customised mappings from being overwritten Issue: 231. Thanks to edjeavons.
o Too many open files Issue: 228. Thanks to ernestoarbitrio.
o Fix recursive scan when using include/exclude filters Issue: 222. Thanks to vakopian.
o fields should be replaced by stored_fields from elasticsearch 5.0.0 Issue: 209. Thanks to dadoonet.
o FSCrawler can't create a new job from scratch Issue: 208. Thanks to dadoonet.
o Index certain file types in folders recursively Issue: 206. Thanks to DatanoiseTV.
o Out of memory errors: potential memory leak Issue: 196. Thanks to Muffinman.
o Stabilize tests Issue: 191. Thanks to dadoonet.
o Starting fscrawler after removing an index and mapping results in unexpected behaviour Issue: 137. Thanks to danyill.
o Resolve issue around file indexing when moving files/folders Issue: 136. Thanks to danyill.

Changes:
o Update jcommander to 1.60 Issue: 309. Thanks to dadoonet.
o Update maven-changes-plugin to 2.12.1 Issue: 308. Thanks to dadoonet.
o Update maven-assembly-plugin to 3.0.0 Issue: 307. Thanks to dadoonet.
o Update maven-dependency-plugin to 3.0.0 Issue: 305. Thanks to dadoonet.
o Update maven-compiler-plugin to 3.6.1 Issue: 304. Thanks to dadoonet.
o Update to Jackson 2.8.6 Issue: 302. Thanks to dadoonet.
o Update to Jersey 2.25.1 Issue: 301. Thanks to dadoonet.
o Update to Log4J 2.8 Issue: 300. Thanks to dadoonet.
o Update to elasticsearch 5.2.0 Issue: 296. Thanks to dadoonet.
o Support filename_as_id for any type of file Issue: 293. Thanks to dadoonet.
o Define a default exclusion list with files starting with ~ Issue: 292. Thanks to dadoonet.
o Update tests to use elasticsearch 2.4.4 and 1.7.6 Issue: 288. Thanks to dadoonet.
o Update to elasticsearch 5.1.2 Issue: 285. Thanks to dadoonet.
o Add better traces in Elasticsearch client Issue: 280. Thanks to dadoonet.
o Use path analyzer for directory fields Issue: 272. Thanks to dadoonet.
o Update to elasticsearch 5.1.1 / Lucene 6.3.0 / Log4J 2.7 Issue: 249. Thanks to dadoonet.
o Update to Tika 1.14 Issue: 248. Thanks to dadoonet.
o Update to elasticsearch 5.0.0 Issue: 243. Thanks to dadoonet.
o Add filename in logs when an error occurs Issue: 221. Thanks to lm-edi.
o Update to Jsch 0.1.53 Issue: 218. Thanks to dadoonet.
o Update to Log4J 2.6.2 Issue: 217. Thanks to dadoonet.
o Update to Jackson 2.8.1 Issue: 216. Thanks to dadoonet.
o Update maven plugins Issue: 215. Thanks to dadoonet.
o Update to elasticsearch 5.0.0-alpha5 Issue: 213. Thanks to dadoonet.
o Move settings validation to its own class Issue: 212. Thanks to dadoonet.
o Do not try to determine "group" on Windows platform Issue: 211. Thanks to dadoonet.
o Don't store password in setting files Issue: 207. Thanks to dadoonet.
o Replace internal elasticsearch REST client with official REST client Issue: 203. Thanks to dadoonet.
o Replace Tika Deprecated properties Issue: 177. Thanks to dadoonet.
o Replace internal elasticsearch REST client with official REST client Issue: 172. Thanks to dadoonet.

For a manual installation, you can download fscrawler-2.2 here:
https://repo1.maven.org/maven2/fr/pilato/elasticsearch/crawler/fscrawler/2.2/

Have fun!
-Elasticsearch File System Crawler team

