We are looking into ways to get some data from Salesforce into Elasticsearch, mainly to help correlate customer support and sales events with our various log data.
However, I am having trouble determining how often the salesforce input runs its SOQL query.
The documentation says:
This input plugin will stop after all the results of the query are processed and will
need to be re-run to fetch new results. It does not utilize the streaming API.
So how exactly do you re-run the input to fetch new results? Restarting Logstash? We were thinking of having it run frequently (every 1-5 minutes) so that we could reference the data in close to real time. The Logstash machines currently ingest a few thousand events per second from other sources so restarting isn't really an option.
While it seems reasonable to have cron functionality built into the salesforce input, that doesn't appear to be available right now. What you could do is run a separate Logstash instance via cron whose only input is the salesforce one. Its filters and outputs could either mirror what's in your current long-running instance or you could set things up so that the Salesforce instance passes events to the long-running one via a lumberjack or tcp output/input pair.
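A rough sketch of the second option (cron-run instance forwarding over tcp), not a tested setup. The hostname, port, file paths, environment variable names, the Salesforce object, fields and filter are all placeholders I made up for illustration; check the option names against the plugin docs for your version.

```
# salesforce.conf: used only by the short-lived instance that cron starts, e.g.
#   */5 * * * * /usr/share/logstash/bin/logstash -f /etc/logstash/salesforce.conf --path.data /var/lib/logstash-salesforce
# (install path, schedule and data directory are assumptions; a separate
#  --path.data lets it run next to the long-running instance on the same host)
input {
  salesforce {
    client_id       => "${SFDC_CLIENT_ID}"        # credentials read from the environment (placeholder names)
    client_secret   => "${SFDC_CLIENT_SECRET}"
    username        => "${SFDC_USERNAME}"
    password        => "${SFDC_PASSWORD}"
    security_token  => "${SFDC_SECURITY_TOKEN}"
    sfdc_object_name => "Case"                     # example object
    sfdc_fields      => ["Id", "Status", "Subject", "LastModifiedDate"]
    sfdc_filters     => "LastModifiedDate = TODAY" # example WHERE clause to limit what is re-fetched
  }
}

output {
  # Hand events to the long-running instance instead of duplicating its filters/outputs
  tcp {
    host  => "logstash.internal.example"           # placeholder hostname
    port  => 5055                                  # placeholder port
    codec => json_lines
  }
}

# Added to the long-running instance's existing pipeline:
input {
  tcp {
    port  => 5055
    codec => json_lines
  }
}
```

Two things to keep in mind: each run re-reads whatever the filter matches, so you would probably want to set the Elasticsearch document id from the Salesforce Id to avoid duplicates, and Logstash startup is not instant, so a 1 minute interval may be too tight for a cron-started instance.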
If a solution is still needed, please have a look at skyformation.com.
We have implemented a connector that monitors Salesforce events and sends them to Elasticsearch.