Trying to upload a CSV to an index fails

I am trying to load a CSV file into an index associated with an ILM policy. The first time the pipeline was executed it was successful, but after I deleted and recreated the index, the load has not worked again.

This is the pipeline:

input {
  stdin {
  }
  file {
    path => "C:/Desarrollos/pruebas/file.csv"
    start_position => "beginning"
  }
}
filter {
  csv {
    columns => [ "ID","CREATION_DATE","END_POINT","TRANSACTION_DETAIL","TRANSACTION_ID","TRANSACTION_STAGE_DESCRIPTOR","USER_ID","STACKTRACE","PARAMETERS" ]
  }
}


output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    ilm_rollover_alias => "pac_demo"
    ilm_pattern => "000001"
    ilm_policy => "pac_demo_policy"
  }

}

and this is the lifecycle policy:

{
  "pac_demo_policy": {
    "version": 1,
    "modified_date": "2022-09-22T14:05:24.841Z",
    "policy": {
      "phases": {
        "hot": {
          "min_age": "0ms",
          "actions": {
            "rollover": {
              "max_age": "1d"
            }
          }
        },
        "delete": {
          "min_age": "7d",
          "actions": {
            "delete": {
              "delete_searchable_snapshot": true
            }
          }
        }
      }
    },
    "in_use_by": {
      "indices": [
        "pac_demo-000001"
      ],
      "data_streams": [],
      "composable_templates": [
        "pac_demo"
      ]
    }
  }
}

Here's the logstash-plain.log:

[2022-09-22T10:29:42,535][INFO ][logstash.runner          ] Log4j configuration path used is: C:\elk\logstash-8.3.2\config\log4j2.properties
[2022-09-22T10:29:42,540][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"8.3.2", "jruby.version"=>"jruby 9.2.20.1 (2.5.8) 2021-11-30 2a2962fbd1 OpenJDK 64-Bit Server VM 11.0.15+10 on 11.0.15+10 +indy +jit [mswin32-x86_64]"}
[2022-09-22T10:29:42,541][INFO ][logstash.runner          ] JVM bootstrap flags: [-Xms1g, -Xmx1g, -XX:+UseConcMarkSweepGC, -XX:CMSInitiatingOccupancyFraction=75, -XX:+UseCMSInitiatingOccupancyOnly, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djruby.compile.invokedynamic=true, -Djruby.jit.threshold=0, -XX:+HeapDumpOnOutOfMemoryError, -Djava.security.egd=file:/dev/urandom, -Dlog4j2.isThreadContextMapInheritable=true, -Djruby.regexp.interruptible=true, -Djdk.io.File.enableADS=true, --add-opens=java.base/java.security=ALL-UNNAMED, --add-opens=java.base/java.io=ALL-UNNAMED, --add-opens=java.base/java.nio.channels=ALL-UNNAMED, --add-opens=java.base/sun.nio.ch=ALL-UNNAMED, --add-opens=java.management/sun.management=ALL-UNNAMED]
[2022-09-22T10:29:42,619][WARN ][logstash.config.source.multilocal] Ignoring the 'pipelines.yml' file because modules or command line options are specified
[2022-09-22T10:29:45,686][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600, :ssl_enabled=>false}
[2022-09-22T10:29:46,000][INFO ][org.reflections.Reflections] Reflections took 65 ms to scan 1 urls, producing 124 keys and 408 values 
[2022-09-22T10:29:47,580][INFO ][logstash.javapipeline    ] Pipeline `main` is configured with `pipeline.ecs_compatibility: v8` setting. All plugins in this pipeline will default to `ecs_compatibility => v8` unless explicitly configured otherwise.
[2022-09-22T10:29:47,642][INFO ][logstash.outputs.elasticsearch][main] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//localhost:9200"]}
[2022-09-22T10:29:47,932][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://localhost:9200/]}}
[2022-09-22T10:29:48,108][WARN ][logstash.outputs.elasticsearch][main] Restored connection to ES instance {:url=>"http://localhost:9200/"}
[2022-09-22T10:29:48,119][INFO ][logstash.outputs.elasticsearch][main] Elasticsearch version determined (8.3.2) {:es_version=>8}
[2022-09-22T10:29:48,119][WARN ][logstash.outputs.elasticsearch][main] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>8}
[2022-09-22T10:29:48,164][INFO ][logstash.outputs.elasticsearch][main] Config is not compliant with data streams. `data_stream => auto` resolved to `false`
[2022-09-22T10:29:48,164][INFO ][logstash.outputs.elasticsearch][main] Config is not compliant with data streams. `data_stream => auto` resolved to `false`
[2022-09-22T10:29:48,167][WARN ][logstash.outputs.elasticsearch][main] Elasticsearch Output configured with `ecs_compatibility => v8`, which resolved to an UNRELEASED preview of version 8.0.0 of the Elastic Common Schema. Once ECS v8 and an updated release of this plugin are publicly available, you will need to update this plugin to resolve this warning.
[2022-09-22T10:29:48,178][INFO ][logstash.filters.csv     ][main] ECS compatibility is enabled but `target` option was not specified. This may cause fields to be set at the top-level of the event where they are likely to clash with the Elastic Common Schema. It is recommended to set the `target` option to avoid potential schema conflicts (if your data is ECS compliant or non-conflicting, feel free to ignore this message)
[2022-09-22T10:29:48,216][INFO ][logstash.outputs.elasticsearch][main] Using a default mapping template {:es_version=>8, :ecs_compatibility=>:v8}
[2022-09-22T10:29:48,266][INFO ][logstash.javapipeline    ][main] Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>8, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>1000, "pipeline.sources"=>["C:/elk/logstash-8.3.2/config/pac_logstash.conf"], :thread=>"#<Thread:0x68c9fc2c run>"}
[2022-09-22T10:29:48,819][INFO ][logstash.javapipeline    ][main] Pipeline Java execution initialization time {"seconds"=>0.55}
[2022-09-22T10:29:48,861][INFO ][logstash.inputs.file     ][main] No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"C:/elk/logstash-8.3.2/data/plugins/inputs/file/.sincedb_7f8e991b87c13cb19a994e7b7f986ea8", :path=>["C:/Desarrollos/pruebas/file.csv"]}
[2022-09-22T10:29:48,921][INFO ][logstash.javapipeline    ][main] Pipeline started {"pipeline.id"=>"main"}
[2022-09-22T10:29:48,970][INFO ][filewatch.observingtail  ][main][cb932d46483b05b369735f31a9b54ac424ca051a82b81ced08d60c7c1a08ddf4] START, creating Discoverer, Watch with file and sincedb collections
[2022-09-22T10:29:48,996][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
[2022-09-22T10:33:27,552][WARN ][logstash.runner          ] SIGINT received. Shutting down.
[2022-09-22T10:33:27,585][INFO ][filewatch.observingtail  ] QUIT - closing all files and shutting down.
[2022-09-22T10:33:27,969][INFO ][logstash.javapipeline    ][main] Pipeline terminated {"pipeline.id"=>"main"}
[2022-09-22T10:33:28,610][INFO ][logstash.pipelinesregistry] Removed pipeline from registry successfully {:pipeline_id=>:main}
[2022-09-22T10:33:28,644][INFO ][logstash.runner          ] Logstash shut down.

Hi @alter1

That is because Logstash keeps track of what it has already loaded, and that state is stored in the sincedb file.

So if you want to reload a file, you need to do one of two things.

1 - Clean up the sincedb by removing the file. Its path is printed in your logstash-plain.log, and you will need to delete it after each run:

C:/elk/logstash-8.3.2/data/plugins/inputs/file/.sincedb_7f8e991b87c13cb19a994e7b7f986ea8
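A minimal sketch of what this does (using a stand-in path, since the real sincedb above should only be deleted while Logstash is stopped): once the stored offset is gone, the file input starts reading from the beginning again on the next run.

```shell
# Demo with a stand-in path; the real sincedb is the Windows path shown above.
SINCEDB="./sincedb_demo"
printf '12345 0 0 678\n' > "$SINCEDB"   # pretend Logstash recorded a read offset
rm -f "$SINCEDB"                        # delete it before the next run
test ! -e "$SINCEDB" && echo "sincedb removed"
```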

2 - Set the following in the input section; Logstash will then not keep track of file positions, and the file will be reloaded each time you start Logstash. This is a good approach during a dev/test cycle:

sincedb_path => "NUL"
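Putting it together, the file input for dev/test could look like this (a sketch based on your config above; "NUL" is the Windows null device, on Linux/macOS you would use "/dev/null" instead):

```
input {
  file {
    path => "C:/Desarrollos/pruebas/file.csv"
    start_position => "beginning"
    sincedb_path => "NUL"   # Windows null device; use "/dev/null" on Linux/macOS
  }
}
```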
