I have a folder containing 20 GB of data, organised into 26 subfolders sorted city-wise. Each of these subfolders contains many more subfolders of CSV and JSON files (the data stored in the files has fields that are partly different and partly similar) that I want to upload in bulk to Elasticsearch. Apart from this, I also want to be able to specify the index name and mapping during the upload. I have the following queries/requests:
Is this possible?
Can anyone please help out with a detailed explanation of how this can be done? I'm a beginner in this field.
You can tell Filebeat to look for particular files in its input and put those in specific input sections, e.g. one for *.csv and one for *.json. On each input you can also tag the events.
Then, when you send them to Elasticsearch, you can tell the output to filter on specific events, so the CSV or JSON ones will go to the index you specify in each output section; you can also set the mapping there.
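Here's a rough sketch of what that could look like in filebeat.yml. The paths, tags, and index names below are placeholders I made up for illustration, and the mapping itself normally comes from the index template (setup.template.*) rather than the output block:

filebeat.inputs:
  - type: log
    paths:
      - "E:/data/**/*.csv"          # recurse through the city subfolders for CSV files
    tags: ["csv"]
  - type: log
    paths:
      - "E:/data/**/*.json"         # same for JSON files
    tags: ["json"]
    json.keys_under_root: true      # decode each JSON line into top-level fields

output.elasticsearch:
  hosts: ["https://<your-deployment>.es.io:9243"]   # or use cloud.id / cloud.auth instead
  indices:
    - index: "citydata-csv"         # events tagged "csv" land here
      when.contains:
        tags: "csv"
    - index: "citydata-json"        # events tagged "json" land here
      when.contains:
        tags: "json"

setup.template.name: "citydata"     # the index template that carries your mapping
setup.template.pattern: "citydata-*"
setup.ilm.enabled: false            # in 7.x ILM otherwise overrides the custom index names above

With something like this in place, running filebeat setup once (to load the template) and then filebeat -e should start shipping the files.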
Can you please provide an example of exactly how this can be done, along with step-by-step instructions? I have referred to multiple pieces of documentation on this, but I find the instructions a bit confusing.
Sorry for that, but I'm new to Elasticsearch and took this up as my first project.
How about you share what you have tried and what's not working, and we can offer suggestions? That way we can point out any mistakes and make it easier to learn.
Alright, that makes sense. To start with, I'm facing issues connecting Filebeat to Elasticsearch and Kibana using cloud.auth (I'm unable to find it in the deployment overview's Security section).
E:\elastic\filebeats\Elastic\Beats\filebeat>filebeat -e -E cloud.id="DETAILS -E cloud.auth="DETAILS"
Usage:
filebeat [flags]
filebeat [command]
Available Commands:
export Export current config or index template
generate Generate Filebeat modules, filesets and fields.yml
help Help about any command
keystore Manage secrets keystore
modules Manage configured modules
run Run filebeat
setup Setup index template, dashboards and ML jobs
test Test config
version Show current version info
Flags:
-E, --E setting=value Configuration overwrite
-M, --M setting=value Module configuration overwrite
-N, --N Disable actual publishing for testing
-c, --c string Configuration file, relative to path.config (default "filebeat.yml")
--cpuprofile string Write cpu profile to file
-d, --d string Enable certain debug selectors
-e, --e Log to stderr and disable syslog/file output
--environment environmentVar set environment being ran in (default default)
-h, --help help for filebeat
--httpprof string Start pprof http server
--memprofile string Write memory profile to this file
--modules string List of enabled modules (comma separated)
--once Run filebeat only once until all harvesters reach EOF
--path.config string Configuration path (default "")
--path.data string Data path (default "")
--path.home string Home path (default "")
--path.logs string Logs path (default "")
--strict.perms Strict permission checking on config files (default true)
-v, --v Log at INFO level
Use "filebeat [command] --help" for more information about a command.
This is the response I'm getting after I reset the password in the Security section of the deployment that I created on Elastic Cloud.
I also made changes in the filebeat.yml file directly; hope that works.
Additionally, I'm unable to figure out exactly how to tell Filebeat to look for particular files in its input.
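For anyone following along, the relevant cloud settings in filebeat.yml would look something like this (placeholders, not my actual values; the Cloud ID is shown on the deployment overview page and the password is the one reset under Security):

cloud.id: "<deployment-name>:<base64 string from the deployment overview>"
cloud.auth: "elastic:<password reset in the Security section>"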
Since this is your first Elastic project, I'd also recommend taking a look at uploading the files through Kibana. By default it can handle CSV and JSON files up to 100 MB in size. It will let you get the data in easily and get more familiar with mappings etc. without too much overhead. Once you are more comfortable with that, it might help with configuring Filebeat as well.
You can upload files through Kibana by going to Integrations and then searching for "Upload".
The only question I have regarding this is that I have a really large number of files. Via Kibana, I'd have to upload every file individually and create an index for each of those files. In this case, is there any way I can upload multiple files under a common index during the import process?