Elasticsearch index doesn't show all CSV data at once

Hi everyone,

Background info:
I am trying to read CSV files' data and transfer it to an Elasticsearch index.

Issue:
The Elasticsearch index, which I called "rosindex", shows only one CSV path. Refreshing the page at https://***.found.io:9243/rosindex/_search?pretty shows different JSON content for different CSV paths, but I don't want it this way. I want all CSV files to show up at once in the index.

Is there any way I can show all the CSV data at once?

Logstash.conf file:

input {
    file {
        type => ".csv"
        start_position => "beginning"
        path => "/home/kourosh/Documents/metrics/files/*.csv"
    }
}

filter {
 ....
}

output {
    elasticsearch {
        hosts => ["https://***.found.io:9243"]
        user => "..."
        password => "..."
        index => "rosindex"
        workers => 1
    }
    stdout {
        codec => rubydebug
    }
}

Elasticsearch index:

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "rosindex",
        "_type" : "_doc",
        "_id" : "wF2dSGsB9yYBmDbPHzh1",
        "_score" : 1.0,
        "_source" : {
          "data" : 7.40000009537,
          "path" : "/home/kourosh/Documents/metrics/files/_slash_diagnostics_slash_cpu_monitor_slash_cpu.csv",
          "message" : "1557444210630278341,7.40000009537",
          "@timestamp" : "2019-06-11T22:16:25.004Z",
          "host" : "itecoation-Lenovo-ideapad-330S-15IKB",
          "type" : ".csv",
          "rosbagTimestamp" : 1557444210630278341,
          "@version" : "1"
        }
      },
      {
        "_index" : "rosindex",
        "_type" : "_doc",
        "_id" : "wV2dSGsB9yYBmDbPHzh1",
        "_score" : 1.0,
        "_source" : {
          "data" : 6.5,
          "path" : "/home/kourosh/Documents/metrics/files/_slash_diagnostics_slash_cpu_monitor_slash_cpu.csv",
          "message" : "1557444211714413452,6.5",
          "@timestamp" : "2019-06-11T22:16:25.004Z",
          "host" : "itecoation-Lenovo-ideapad-330S-15IKB",
          "type" : ".csv",
          "rosbagTimestamp" : 1557444211714413452,
          "@version" : "1"
        }
      },
      {
        "_index" : "rosindex",
        "_type" : "_doc",
        "_id" : "wl2dSGsB9yYBmDbPHzh1",
        "_score" : 1.0,
        "_source" : {
          "data" : 8.30000019073,
          "path" : "/home/kourosh/Documents/metrics/files/_slash_diagnostics_slash_cpu_monitor_slash_cpu.csv",
          "message" : "1557444212803535300,8.30000019073",
          "@timestamp" : "2019-06-11T22:16:25.004Z",
          "host" : "itecoation-Lenovo-ideapad-330S-15IKB",
          "type" : ".csv",
          "rosbagTimestamp" : 1557444212803535300,
          "@version" : "1"
        }
      },
      {
        "_index" : "rosindex",
        "_type" : "_doc",
        "_id" : "w12dSGsB9yYBmDbPHzh1",
        "_score" : 1.0,
        "_source" : {
          "data" : 6.40000009537,
          "path" : "/home/kourosh/Documents/metrics/files/_slash_diagnostics_slash_cpu_monitor_slash_cpu.csv",
          "message" : "1557444213892452573,6.40000009537",
          "@timestamp" : "2019-06-11T22:16:25.004Z",
          "host" : "itecoation-Lenovo-ideapad-330S-15IKB",
          "type" : ".csv",
          "rosbagTimestamp" : 1557444213892452573,
          "@version" : "1"
        }
      },
      {
        "_index" : "rosindex",
        "_type" : "_doc",
        "_id" : "xF2dSGsB9yYBmDbPHzh1",
        "_score" : 1.0,
        "_source" : {
          "data" : 9.19999980927,
          "path" : "/home/kourosh/Documents/metrics/files/_slash_diagnostics_slash_cpu_monitor_slash_cpu.csv",
          "message" : "1557444214981876244,9.19999980927",
          "@timestamp" : "2019-06-11T22:16:25.005Z",
          "host" : "itecoation-Lenovo-ideapad-330S-15IKB",
          "type" : ".csv",
          "rosbagTimestamp" : 1557444214981876244,
          "@version" : "1"
        }
      },
      {
        "_index" : "rosindex",
        "_type" : "_doc",
        "_id" : "xV2dSGsB9yYBmDbPHzh1",
        "_score" : 1.0,
        "_source" : {
          "data" : 6.40000009537,
          "path" : "/home/kourosh/Documents/metrics/files/_slash_diagnostics_slash_cpu_monitor_slash_cpu.csv",
          "message" : "1557444216066947192,6.40000009537",
          "@timestamp" : "2019-06-11T22:16:25.005Z",
          "host" : "itecoation-Lenovo-ideapad-330S-15IKB",
          "type" : ".csv",
          "rosbagTimestamp" : 1557444216066947192,
          "@version" : "1"
        }
      },
      {
        "_index" : "rosindex",
        "_type" : "_doc",
        "_id" : "xl2dSGsB9yYBmDbPHzh1",
        "_score" : 1.0,
        "_source" : {
          "data" : 9.10000038147,
          "path" : "/home/kourosh/Documents/metrics/files/_slash_diagnostics_slash_cpu_monitor_slash_cpu.csv",
          "message" : "1557444217166283953,9.10000038147",
          "@timestamp" : "2019-06-11T22:16:25.005Z",
          "host" : "itecoation-Lenovo-ideapad-330S-15IKB",
          "type" : ".csv",
          "rosbagTimestamp" : 1557444217166283953,
          "@version" : "1"
        }
      },
      {
        "_index" : "rosindex",
        "_type" : "_doc",
        "_id" : "x12dSGsB9yYBmDbPHzh1",
        "_score" : 1.0,
        "_source" : {
          "data" : 6.5,
          "path" : "/home/kourosh/Documents/metrics/files/_slash_diagnostics_slash_cpu_monitor_slash_cpu.csv",
          "message" : "1557444218247984032,6.5",
          "@timestamp" : "2019-06-11T22:16:25.005Z",
          "host" : "itecoation-Lenovo-ideapad-330S-15IKB",
          "type" : ".csv",
          "rosbagTimestamp" : 1557444218247984032,
          "@version" : "1"
        }
      }
    ]
  }
}

Not sure what this is:

type => ".csv"

Anyway it looks like the message field is correctly filled with the CSV content here.
What else do you expect?

@dadoonet, ".csv" is telling the input that the files are of CSV type. Also, ".../*.csv" picks up all the files ending in .csv inside the directory given in the input path. Here is a picture of the files inside that directory:

[image: listing of the files inside that directory]

All those files are CSV. If you look back at the Elasticsearch index output I posted before, you only see one CSV path, and not the rest of them.

"path" : "/home/kourosh/Documents/metrics/files/_slash_diagnostics_slash_cpu_monitor_slash_cpu.csv"

I was expecting the Elasticsearch index to show data from all the different CSVs, not just one CSV.

Nope. I think it just adds a new field to every document. At least, I don't think it's documented.

If you look back at the Elasticsearch index output I posted before, you only see one CSV path, and not the rest of them.

It's probably because Elasticsearch returns only the first 10 hits by default, not everything. But the total number of hits ("value" : 10000 with "relation" : "gte") shows that you indexed at least 10000 documents.
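
One way to confirm that several CSV files really were indexed, without paging through the hits, is to ask for a count of documents per path. The following is only a sketch in Python: it builds the request body for a terms aggregation and reads the buckets out of a response. The field name "path.keyword" is an assumption (it is what the default dynamic mapping usually produces for a string field such as "path"), and the helper names and the aggregation name "csv_paths" are made up for illustration.

```python
import json


def build_path_summary_query(path_field="path.keyword", max_paths=100):
    """Build a search body that counts documents per source file.

    `path_field` is an assumption: it must be a keyword-typed field
    holding the file path in your mapping.
    """
    return {
        "size": 0,  # skip the individual hits; only the aggregation is needed
        "aggs": {
            "csv_paths": {  # hypothetical aggregation name
                "terms": {"field": path_field, "size": max_paths}
            }
        },
    }


def paths_from_response(response):
    """Map each path found in the aggregation buckets to its document count."""
    buckets = response["aggregations"]["csv_paths"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


if __name__ == "__main__":
    # Print the body so it can be sent to /rosindex/_search.
    print(json.dumps(build_path_summary_query(), indent=2))
```

If the response lists more than one path, the other CSV files are in the index and simply did not appear among the first 10 hits.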


@dadoonet, yes, that's right, but I want Elasticsearch to show all those documents from the different CSV paths at once. Is that possible?

You can extract the whole result set with the scroll API. But why would you do that? What is the use case?
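
For reference, a scroll loop can look roughly like the Python sketch below. It is not tied to any client library: `send(method, path, body)` is a hypothetical callable that you would implement to perform the HTTP request against your cluster and return the parsed JSON. The page size and the "1m" keep-alive are arbitrary choices for illustration.

```python
def scroll_all(send, index, page_size=1000, keep_alive="1m"):
    """Yield every hit in `index` by following the scroll cursor.

    `send(method, path, body)` is a hypothetical helper supplied by the
    caller; it must perform the request and return the decoded JSON.
    """
    # The first request opens the scroll context and returns the first page.
    page = send(
        "POST",
        f"/{index}/_search?scroll={keep_alive}",
        {"size": page_size, "query": {"match_all": {}}},
    )
    scroll_id = page.get("_scroll_id")
    try:
        # An empty page means the result set is exhausted.
        while page["hits"]["hits"]:
            yield from page["hits"]["hits"]
            page = send(
                "POST",
                "/_search/scroll",
                {"scroll": keep_alive, "scroll_id": scroll_id},
            )
            scroll_id = page.get("_scroll_id", scroll_id)
    finally:
        # Release the scroll context on the server when done.
        if scroll_id:
            send("DELETE", "/_search/scroll", {"scroll_id": scroll_id})
```

Each yielded hit has the same shape as the entries under "hits" in the output above, so the "path" of every document is available under "_source".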

I forgot about the Kibana Discover page. It lets you view all the documents in a given time range.

I saw the different CSV files there, so everything is all good. Thanks for the help, @dadoonet! :slightly_smiling_face: