Hi Team,
I need help with the below two points while uploading a CSV file through the Kibana dashboard.
- How to upload a CSV file larger than 100 MB through the Kibana dashboard.
- How to upload multiple CSV files to the same index.
Thanks,
Debasis
That's not possible through the Kibana UI, but you can use various methods to get the data indexed into Elasticsearch. For example, you can point Filebeat at the CSV files:
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /path/to/your/csv/files/*.csv

output.elasticsearch:
  hosts: ["http://your-elasticsearch-host:9200"]
  index: "your_index_name"
You can also use the Elasticsearch bulk API to index the documents, and maybe a simple Python script to do the processing for you:
import csv

from elasticsearch import Elasticsearch
from elasticsearch.helpers import bulk

es = Elasticsearch(['http://localhost:9200'])  # Replace with your Elasticsearch URL

# Read your CSV file and create a dictionary for each row
# (the first line of the file is used as the field names).
with open('your_file.csv', newline='') as f:
    data = [{"_index": "your_index_name", "_source": row} for row in csv.DictReader(f)]

# Index the data into Elasticsearch
bulk(es, data)
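The bulk helper sends the documents to Elasticsearch in chunks (500 actions per request by default), so this approach also works for CSV files well over 100 MB.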
Thanks @ppisljar for your response to my first question. Could you please help me with the second requirement? For example, I am uploading text.csv to the index "sample" and next want to upload one more CSV file to the same index "sample". Is there any way to achieve this?
Thanks,
Debasis
Hi @ppisljar,
Do you have any reference blog that explains how Filebeat can be configured to upload CSV files to a particular index?
Thanks,
Debasis
Here is something I found on the web: Load CSV data to ElasticSearch using FileBeat
Thanks @ppisljar. I have created the ingest pipeline, but where can I check whether it succeeded or failed? I could not see the data in the corresponding index. Is there any way to check this?
Thanks,
Debasis
logs should be in /var/log/filebeat/filebeat
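If that folder does not exist, the logs may be going to the systemd journal instead (newer Filebeat packages log there by default when started via systemd); assuming the service unit is named filebeat, you can check with:
journalctl -u filebeat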
Below is my unit file /usr/lib/systemd/system/filebeat.service. When I check under /var/log there is no filebeat folder as mentioned above. Is there any way to check whether the ingest pipeline is working or in a failed state?
UMask=0027
Environment="GODEBUG='madvdontneed=1'"
Environment="BEAT_LOG_OPTS="
Environment="BEAT_CONFIG_OPTS=-c /etc/filebeat/filebeat.yml"
Environment="BEAT_PATH_OPTS=--path.home /usr/share/filebeat --path.config /etc/filebeat --path.data /var/lib/filebeat --path.logs /var/log/filebeat"
ExecStart=/usr/share/filebeat/bin/filebeat --environment systemd $BEAT_LOG_OPTS $BEAT_CONFIG_OPTS $BEAT_PATH_OPTS
Restart=always
Thanks,
Debasis
Can you confirm Filebeat is running?
Try following this tutorial to get it running: Filebeat quick start: installation and configuration | Filebeat Reference [8.10] | Elastic
Hi
Yes, my filebeat is up and running.
[root@cb-1 ~]# systemctl status filebeat
● filebeat.service - Filebeat sends log files to Logstash or directly to Elasticsearch.
Loaded: loaded (/usr/lib/systemd/system/filebeat.service; disabled; vendor preset: disabled)
Active: active (running) since Mon 2023-09-18 16:26:43 IST; 2 days ago
Docs: https://www.elastic.co/beats/filebeat
Main PID: 14193 (filebeat)
Thanks,
Debasis
Hi @ppisljar,
I followed the same doc to install Filebeat and it is up and running, as per the command below. Is there any way to validate whether the ingest pipeline is working properly or not?
systemctl status filebeat
Thanks,
Debasis
You could use GET /_nodes/stats?metric=ingest&filter_path=nodes.*.ingest.pipelines to get statistics about your ingest pipeline and see the failed count. You can run it from Dev Tools in Kibana.
Is there any way to validate the ingest pipeline if it is working properly or not?
You can use the simulate pipeline API to test an ingest pipeline.
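For example, a minimal sketch you can run from Dev Tools (the pipeline name and the sample message below are placeholders, replace them with your own):
POST _ingest/pipeline/your_pipeline_name/_simulate
{
  "docs": [
    { "_source": { "message": "value1,value2,value3" } }
  ]
}
The response shows how each document looks after the pipeline runs, including any processor errors.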
Hi @kcreddy, thanks for your response. Now I can see the ingest pipeline details from Dev Tools as below, which means the data was loaded into my indices. But when I search the data under the Discover tool, nothing is showing for the sales index, so am I missing anything here?
> "parse_sales_data": {
> "count": 44462,
> "time_in_millis": 269,
> "current": 0,
> "failed": 1,
> "processors": [
> {
> "csv": {
> "type": "csv",
> "stats": {
> "count": 44462,
> "time_in_millis": 180,
> "current": 0,
> "failed": 0
> }
> }
> },
Thanks,
Debasis
Hi @kcreddy,
In addition to the above issue, I just want to inform you that I followed the below link to create the sales pipeline (testing the theoretical part, since I am new to the Elasticsearch world) before doing the actual data load, which is in CSV format.
Thanks,
Debasis
Can you provide both the ingest pipeline and the Filebeat configuration, along with a couple of CSV rows?
Can you query your sales index from Dev Tools and check if you can find documents there?
GET sales/_search
{
  "query": {
    "match_all": {}
  }
}
If so, maybe the Data View you created is wrong and is not pointing to the index where the data was ingested. In that case documents might have been ingested into the sales index, but your Data View doesn't include the sales index.
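One quick way to confirm that the index exists and how many documents it holds (run from Dev Tools; this assumes the index is literally named sales):
GET _cat/indices/sales?v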
Also, you seem to have 1 failure in the pipeline. You could have an on_failure clause inside your pipeline to add an error.message field to understand why the failure occurred. More info here.
"on_failure": [
{
"set": {
"description": "Record error information",
"field": "error.message",
"value": "Processor {{ _ingest.on_failure_processor_type }} with tag {{ _ingest.on_failure_processor_tag }} in pipeline {{ _ingest.on_failure_pipeline }} failed with message {{ _ingest.on_failure_message }}"
}
}
]
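Documents that go through a pipeline-level on_failure handler are still indexed (with the error.message field set), so afterwards you could find them with a query along these lines (assuming the sales index):
GET sales/_search
{
  "query": {
    "exists": { "field": "error.message" }
  }
}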
Hi @kcreddy,
Please find the ingest pipeline details below.
PUT _ingest/pipeline/parse_sales_data
{
  "processors": [
    {
      "csv": {
        "description": "Parse sales data from scanner",
        "field": "message",
        "target_fields": ["sr","date","customer_id","transaction_id","sku_category","sku","quantity","sales_amount"],
        "separator": ",",
        "ignore_missing": true,
        "trim": true
      }
    },
    {
      "remove": {
        "field": ["sr"]
      }
    }
  ]
}
Below are some records from scanner-data.csv file
,Date,Customer_ID,Transaction_ID,SKU_Category,SKU,Quantity,Sales_Amount
1,02/01/2016,2547,1,X52,0EM7L,1,3.13
2,02/01/2016,822,2,2ML,68BRQ,1,5.46
3,02/01/2016,3686,3,0H2,CZUZX,1,6.35
4,02/01/2016,3719,4,0H2,549KK,1,5.59
I made the below changes in filebeat.yml
# ============================== Filebeat inputs ===============================
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /cbdata/elasticsearch/scanner-data.csv   # path to your CSV file
  exclude_lines: [^""]   # header line
  index: sales
  pipeline: parse_sales_data

# ---------------------------- Elasticsearch Output ----------------------------
output.elasticsearch:
  # Array of hosts to connect to.
  hosts: ["https://xx.xx.xx.xx:9200","https://xx.xx.xx.xx:9200","https://xx.xx.xx.xx:9200"]

  # Protocol - either `http` (default) or `https`.
  protocol: "https"

  # Authentication credentials - either API key or username/password.
  #api_key: "id:api_key"
  username: "elastic"
  password: "elastic"
  ssl:
    enabled: true
    certificate_authorities: ["/etc/filebeat/certs/cert.pem"]
Hi @kcreddy,
As I mentioned earlier, I followed the steps mentioned in the below link.
Thanks,
Debasis
Hey, I was going through the tutorial and was able to ingest without any problem. The data you presented above is different from the data in the tutorial; for example, the date format is different. There might be WARN or ERROR messages in your Filebeat logs indicating a failure to index the documents because they were parsed in the wrong format.
If there are other errors, I would check inside the filebeat logs.
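If the date format turns out to be the problem, one option (just a sketch, not the tutorial's exact pipeline) is to add a date processor to your ingest pipeline that matches the dd/MM/yyyy values shown above, assuming the CSV column is mapped into the date field as in your pipeline:
{
  "date": {
    "field": "date",
    "formats": ["dd/MM/yyyy"],
    "target_field": "@timestamp"
  }
}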