Logstash configuration issue


#1

Hello,
I am new to Logstash and Elasticsearch, and I would like to index .pdf and .doc files in Elasticsearch via Logstash.
I configured Logstash with the multiline codec so that each file arrives in Elasticsearch as a single message. Below is my configuration file:

input {
  file {
    path => "D:/BaseCV/*"
    codec => multiline {
      # Grok pattern names are valid! :)
      pattern => ""
      what => "previous"
    }
  }
}

output {
  stdout {
    codec => "rubydebug"
  }
  elasticsearch {
    hosts => "localhost"
    index => "cvindex"
    document_type => "file"
  }
}

When Logstash starts, the first file I add arrives in Elasticsearch as a single message, but the following files are split across several messages. I would like a one-to-one correspondence: 1 file = 1 message.
Is this possible? What should I change in my setup to solve the problem?
Thank you for your feedback.


(Christian Dahlqvist) #2

Logstash does not support indexing PDF and Word documents. Instead, look at the mapper attachment plugin for Elasticsearch.
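
As a rough illustration of what the mapper attachment plugin expects, here is a minimal Python sketch that builds the two JSON bodies involved: a mapping that declares a field of type "attachment", and a document that carries the whole file base64-encoded in that field. The type name "file" is taken from the configuration above; the field name "content" and both helper functions are hypothetical names chosen for this example. Sending the bodies to Elasticsearch is left out on purpose.

```python
import base64
import json


def attachment_mapping(doc_type="file", field="content"):
    """Build a mapping body that declares `field` as an attachment.

    `doc_type` and `field` are illustrative defaults, not required names.
    """
    return {doc_type: {"properties": {field: {"type": "attachment"}}}}


def attachment_document(path, field="content"):
    """Read the whole file at `path` and return one document body.

    The plugin expects the raw file bytes as a base64 string, so one
    file always becomes exactly one document.
    """
    with open(path, "rb") as handle:
        encoded = base64.b64encode(handle.read()).decode("ascii")
    return {field: encoded}


if __name__ == "__main__":
    # Print the mapping body so it can be inspected before use.
    print(json.dumps(attachment_mapping(), indent=2))
```

Because the file is read and encoded in one piece, there is no line-by-line splitting involved, which is the main difference from reading the file through a Logstash file input.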


#3

Thanks for your answer.
The mapper attachment plugin is already installed on my Elasticsearch 1.7.2, and I have managed to index documents in Elasticsearch, but not always as a single message.

