Hi,
I am having issues setting up Flume with Elasticsearch, and have followed a few examples available on various blogs.
I am able to save the data into HDFS using Flume's HDFS sink. Now I want to send the same data to Elasticsearch using an Elasticsearch sink.
Unfortunately, it does not seem to work: no documents are created in the Elasticsearch index.
The data I want to send looks like this:
{'id': '26', 'value': '8'}
{'id': '27', 'value': '16'}
{'id': '28', 'value': '21'}
{'id': '29', 'value': '10'}
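Once the agent is running, I push these events into the netcat source line by line, roughly like this (assuming nc is available; the host and port match the source config below):

echo "{'id': '26', 'value': '8'}" | nc 127.0.0.5 5005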
I have created an Elasticsearch index with this mapping:
curl -XPUT 'localhost:9200/riz_index?pretty' -H 'Content-Type: application/json' -d'
{
  "mappings" : {
    "default" : {
      "properties" : {
        "id" : { "type": "integer" },
        "value" : { "type": "integer" }
      }
    }
  }
}
'
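To rule out the index itself, I double-checked that the mapping exists using the standard mapping endpoint:

curl -XGET 'localhost:9200/riz_index/_mapping?pretty'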
The Flume conf file:
# Name the components on this agent
a1.sources = r1
a1.sinks = k1
a1.channels = c1
# Describe/configure the source
a1.sources.r1.type = netcat
a1.sources.r1.bind = 127.0.0.5
a1.sources.r1.port = 5005
# Describe the Elasticsearch sink
a1.sinks.k1.type = elasticsearch
a1.sinks.k1.hostNames = localhost:9200,localhost:9300
a1.sinks.k1.indexName = riz_index
a1.sinks.k1.indexType = item
a1.sinks.k1.clusterName = elasticsearch
a1.sinks.k1.batchSize = 500
a1.sinks.k1.ttl = 5d
a1.sinks.k1.serializer = org.apache.flume.sink.elasticsearch.v12.ElasticSearchLogStashEventSerializer
# Use a channel which buffers events in memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100
# Bind the source and sink to the channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
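I start the agent with the standard flume-ng launcher (flume-es.conf is just my file name):

flume-ng agent --conf conf --conf-file conf/flume-es.conf --name a1 -Dflume.root.logger=INFO,console

After sending the events above, I check for documents with:

curl -XGET 'localhost:9200/riz_index/_search?pretty'

and it comes back with no hits.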
Is there anything that I am missing?
Thanks