We are in the process of migrating our self-hosted ELK stack to the hosted Elastic services in Azure.
Currently our servers run filebeat, configured to push their logs to Kafka/zookeeper, which then pushes to logstash and finally ES.
Along the way there are parsers that modify the logs for whatever format the group using them wants.
I have not seen an option for Kafka or Logstash on the hosted platform. The only thing I can see that's close is 'ingest node', but there doesn't seem to be much information about it, or about whether it would replace logstash.
Server filebeat.yml simply contains the output.kafka entry with the host/topic entries.
I know I have to replace that with output.elasticsearch, but I'm trying to figure out how to get the parsers in place as well as the correct index patterns, etc.
I'm sure it's something pretty simple I just don't see or get, hopefully somebody can point me in the right direction.
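For reference, a minimal sketch of what the replacement output section in filebeat.yml could look like, assuming the parsing moves into an Elasticsearch ingest pipeline. The deployment ID, credentials, pipeline name and index name are all placeholders made up for illustration, not values from this setup.

```yaml
# filebeat.yml (sketch): replaces the existing output.kafka section.
# Every value below is a placeholder, not taken from the original setup.

# Cloud ID and credentials come from the Elastic Cloud deployment page.
cloud.id: "my-deployment:PLACEHOLDER_CLOUD_ID"
cloud.auth: "filebeat_writer:PLACEHOLDER_PASSWORD"

output.elasticsearch:
  # Hypothetical ingest pipeline that does the parsing logstash did before.
  pipeline: "team-logs-parse"
  # Hypothetical custom index name; omit this to keep the filebeat default.
  index: "team-logs-%{[agent.version]}-%{+yyyy.MM.dd}"

# A custom index name also needs a matching template name and pattern.
setup.template.name: "team-logs"
setup.template.pattern: "team-logs-*"
```

Depending on the filebeat version, index lifecycle management settings can take precedence over a custom index name, so it is worth checking the docs for the version in use before relying on the index line.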
I don't think it is simple. Nothing in filebeat provides the large-scale buffering that kafka can (unless you count the log lines it has not yet read from disk). Some functionality from logstash has been pushed back into filebeat, and some has been pushed forward into ingest pipelines.
That still leaves gaps where the filebeat-to-elasticsearch model cannot do everything that logstash can, although for some use cases that doesn't matter.
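As a rough illustration of the "pushed back into filebeat" part, light parsing can be done with filebeat processors. This is only a sketch: the log line format, field names and team name are hypothetical.

```yaml
# filebeat.yml (sketch): light parsing done in filebeat itself.
# Field names and the tokenizer pattern are hypothetical.
processors:
  # Split a line such as "2024-01-01T00:00:00Z INFO something happened".
  - dissect:
      tokenizer: "%{timestamp} %{level} %{msg}"
      field: "message"
      target_prefix: "app"
  # Tag events so each group can filter on its own data.
  - add_fields:
      target: "team"
      fields:
        name: "payments"
  # Drop noisy events before they leave the server.
  - drop_event:
      when:
        equals:
          app.level: "DEBUG"
```

Anything heavier, such as grok-style pattern matching, would sit in an ingest pipeline on the Elasticsearch side. Neither piece adds the buffering that kafka gives you.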
So... what would the solution be? I have yet to see an option for Kafka or Logstash in ES cloud, unless it's there and I just don't realize it, or it's hidden somewhere.
It only gives the options of Elasticsearch nodes, Kibana, APM, Enterprise Search, master instance, machine learning instance and coordinating/ingest instance.
The last one is the only thing I can think of that would be similar or close to logstash and/or kafka but can't find anything referencing them specifically.
On top of that, I have yet to figure out how to get a list of users/groups/permissions from our on-prem installation to copy over, but I'm much less worried about that right now. I need to figure out how to properly get the data in place, as our on-prem data center is shutting down in a few days.
Well... I'm not even sure how that would work. Going from filebeat on our servers to our own logstash/kafka and then using them to push to the hosted ES?
Seems kind of a waste of money/effort to have it partially hosted at this point.
What is your source of data, and what are you sending to your on-premises Elasticsearch? What kind of data is your filebeat collecting?
The only change from your current setup would be your Logstash output: instead of sending data to your on-premises Elasticsearch cluster, it would send data to the hosted Elasticsearch cluster on Elastic Cloud. This is basically what Elastic Cloud is, a managed Elasticsearch service. How the data gets ingested depends entirely on the client.
It really depends on each case. For a lot of people, the cost in time and people to install, configure and manage an Elasticsearch cluster would be too high, and Elastic Cloud is an option. You need to take that into consideration before moving from an on-premises self-managed cluster to Elastic Cloud.
In my case moving to Elastic Cloud would be way more expensive than to keep and manage our on-premises licensed cluster.