How to import data into Elasticsearch?

Hi all,
I want to know how to import data into Elasticsearch. I have the following scenarios:
1. The data size is small, e.g. the import rate is about 10 MB/s;
2. The data size is very big, e.g. 100 TB or 1 PB per day.

In my opinion, the data source matters a lot.
When the data is already stored in a database such as MySQL, Oracle, or MongoDB, we should use a sync method,
such as logstash-input-jdbc or mongo-connector to sync the data.
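For illustration, a minimal logstash-input-jdbc pipeline might look like the sketch below; the driver path, connection string, credentials, and SQL statement are all placeholders, not a working setup:

input {
  jdbc {
    # All values below are hypothetical placeholders; adjust for your database
    jdbc_driver_library => "/path/to/mysql-connector-java.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "user"
    jdbc_password => "password"
    statement => "SELECT * FROM my_table"
    schedule => "* * * * *"    # poll once a minute
  }
}
output {
  elasticsearch { hosts => ["localhost:9200"] }
}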

When the data is not stored in a database, it is simply crawled from websites with a web crawler.
If the data format is JSON, I think we can use Logstash to import the data into Elasticsearch.
But when the data is not JSON, how do I import it into Elasticsearch? I don't know.

The above is just my opinion; it may not be correct.
If you have a detailed demo, please share the link.
I want to know the correct method; any help would be sincerely appreciated.
Thank you very much!


IMO it depends on the data you have.
If you have JSON files or XML files on your disk (structured data), you can use the FSCrawler project.
If you have unstructured files on disk (PDF, OOo...), you can use FSCrawler as well.

You can also write a shell script (e.g. using find) that cats the content of each file to Logstash, and then define the pipeline you want in Logstash.
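For example (a sketch; the directory, file pattern, and grok pattern are assumptions), you could run something like

[root@laoyang bin]# find /var/log/myapp -name '*.log' -exec cat {} + | ./logstash -f stdin-pipeline.conf

with a stdin-pipeline.conf (hypothetical name) along these lines:

input {
  stdin {}
}
filter {
  grok {
    # Assumed pattern; replace with one that matches your log format
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
}
output {
  elasticsearch { hosts => ["localhost:9200"] }
}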

What kind of files do you have?

Hi, @dadoonet:
Thanks for your reply.
My files are *.log files at the moment, and I followed the configuration in this document:
https://www.elastic.co/guide/en/logstash/current/advanced-pipeline.html

But when I run it, the following error appears:
[root@laoyang bin]# ./logstash -f ./logstash_conf/first-pipeline.conf
Settings: Default pipeline workers: 16
Connection refused {:class=>"Manticore::SocketException", :level=>:error}
Pipeline main started

So, what is the correct configuration for the input and output?

My current configuration is:
[root@laoyang bin]# ls ./logstash_conf/
first-pipeline.conf  logstash-tutorial-dataset.log  shakespare.json
[root@laoyang bin]# cat ./logstash_conf/first-pipeline.conf
input {
  file {
    path => "/opt/logstash/bin/logstash_conf/logstash-tutorial-dataset.log"
    start_position => "beginning"
  }
}
filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  geoip {
    source => "clientip"
  }
}
output {
  elasticsearch {}
  stdout {}
}

I moved your question to #logstash

Connection refused {:class=>"Manticore::SocketException", :level=>:error}

Is Elasticsearch running on localhost?
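"Connection refused" usually means nothing is listening at the address the elasticsearch output connects to (localhost:9200 by default). A sketch of an output section with the host set explicitly; the host and port are assumptions and must match where your node actually listens:

output {
  elasticsearch {
    hosts => ["localhost:9200"]   # assumed host:port of your Elasticsearch node
  }
  stdout { codec => rubydebug }   # print parsed events for debugging
}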

Hi @magnusbaeck,
Thanks for your reply. My Elasticsearch is running on localhost, and my hostname is "laoyang".

[root@laoyang lib]# hostname
laoyang

In elasticsearch.yml:
cluster.name: my-application

OK, thanks.

My Elasticsearch is now running on localhost, and my hostname is "laoyang"

And you're still getting "connection refused" in Logstash?
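If so, it's worth checking which address and port Elasticsearch actually binds to in elasticsearch.yml; the values below are examples, not your actual settings:

network.host: 127.0.0.1   # address Elasticsearch binds to
http.port: 9200           # port that Logstash's elasticsearch output connects to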