Can a Beat read from the ES database and is message queuing built into the Beats architecture?

From what I have read so far, a Beat executes cron-like jobs based on file-based configuration parameters.

  1. Is there a way for a Beat to read from the Elasticsearch database to determine what to do (e.g. which URL to crawl)?

  2. Are the jobs performed by a Beat queued in some sort of message queue or spooler (as I have seen in the architecture diagram for Filebeat), or is that something that has to be introduced separately?

Beats are not externally driven (cron-job-like) job processors.

Every Beat is a specialised application (a shipper) used to collect events. Some Beats may internally decide to schedule recurring tasks, but this is an internal implementation detail. For example, Filebeat tails your files and reads new lines the moment they become available, but uses threads with timeouts to scan the filesystem for new files; Packetbeat analyses live network traffic; Metricbeat executes internal periodic tasks; and so on.
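To make the "configuration-driven, not cron-driven" point concrete, here is a minimal filebeat.yml sketch. This assumes a reasonably recent Filebeat (older releases called the section filebeat.prospectors), and the log path is just a placeholder. Filebeat keeps matching files open and tails them continuously; scan_frequency only controls how often it looks for new files matching the glob:

```yaml
# Minimal filebeat.yml sketch: tail files continuously, driven entirely
# by this static configuration; no external scheduler involved.
filebeat.inputs:
  - type: log
    paths:
      - /var/log/app/*.log    # placeholder path
    scan_frequency: 10s       # how often to look for NEW files (default 10s)

output.elasticsearch:
  hosts: ["localhost:9200"]
```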

In reference to Filebeat: if we needed to read in data from, say, an SFTP server every so often, would this be a good candidate?

Then, if I understand this correctly, we could send it to Logstash, where we could do any ETL, and then on to ES?

Filebeat does not support FTP. Trying to tail files via FTP doesn't sound like a fun task to me. Normally, users install Filebeat on the edge machines themselves. You can send to Logstash or directly to Elasticsearch (e.g. using an Ingest Node).
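Both routes are just output settings in filebeat.yml. A minimal sketch of the two options (host names, the port, and the pipeline name are placeholders, and only one output may be enabled at a time, so the second is commented out):

```yaml
# Option A: ship to Logstash for heavier ETL before Elasticsearch.
output.logstash:
  hosts: ["logstash-host:5044"]    # placeholder host; 5044 is the usual Beats port

# Option B: ship straight to Elasticsearch and let an Ingest Node
# pipeline do lightweight transforms on the way in.
#output.elasticsearch:
#  hosts: ["es-host:9200"]         # placeholder host
#  pipeline: my_ingest_pipeline    # name of a pre-created ingest pipeline
```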
