Hi everyone!
I'm struggling with Elasticsearch in combination with Apache Nutch.
I want to use Apache Nutch to crawl my websites and index their content into Elasticsearch.
Unfortunately, the index isn't created, even though Nutch reports that everything finished successfully (step 3 of the tutorial).
I'm using Apache Nutch 1.17 together with Elasticsearch and Kibana 7.10 on CentOS 7.
This is the tutorial I'm following (except for the installation and configuration of ES and Kibana):
I'm wondering why the console output below shows "localhost" as the host, even though I've configured the IP address.
Hopefully somebody can help me out.
Thanks!
Here is the final console output.
ElasticIndexWriter:
┌───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
│host │Comma-separated list of hostnames │localhost│
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
│port │The port to connect to elastic server. │9200 │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
│index │Default index to send documents to. │nutch │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
│username │Username for auth credentials │elastic │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
│password │Password for auth credentials │ │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
│max.bulk.docs │Maximum size of the bulk in number of documents. │250 │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
│max.bulk.size │Maximum size of the bulk in bytes. │2500500 │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
│exponential.backoff.millis │Initial delay for the BulkProcessor exponential backoff policy. │100 │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
│exponential.backoff.retries│Number of times the BulkProcessor exponential backoff policy should retry bulk│10 │
│ │operations. │ │
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
│bulk.close.timeout │Number of seconds allowed for the BulkProcessor to complete its last operation. │600 │
└───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Indexer: number of documents indexed, deleted, or skipped:
Indexer: 9 indexed (add/update)
Indexer: finished at 2020-11-12 11:16:34, elapsed: 00:00:02
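
To check whether the "nutch" index really exists, independently of the Nutch summary above, Elasticsearch can be asked directly for its list of indices. The following is only a minimal sketch: the function names are made up for illustration, the address in the usage comment is a placeholder for the configured IP, and it assumes the cluster answers without authentication (if security is enabled for the "elastic" user, the request would also need credentials).

```python
import json
import urllib.request


def index_names(cat_indices_json):
    """Extract index names from the JSON body of GET /_cat/indices?format=json."""
    return [entry["index"] for entry in json.loads(cat_indices_json)]


def fetch_index_names(base_url):
    """Ask Elasticsearch which indices exist (hypothetical helper, no auth handled)."""
    url = base_url.rstrip("/") + "/_cat/indices?format=json"
    with urllib.request.urlopen(url) as resp:
        return index_names(resp.read().decode("utf-8"))


# Usage (placeholder address, replace with the IP configured for Nutch):
#   names = fetch_index_names("http://192.0.2.10:9200")
#   print("nutch" in names, names)
```

If "nutch" shows up when querying localhost but not when querying the configured IP (or the other way round), that would suggest Nutch wrote to a different host than intended.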