Unable to send JSON to Elasticsearch

Hi,

I'm having issues sending a JSON file to Elasticsearch. Logstash can connect to Elasticsearch and there are no errors in the logs. I think it may have to do with the JSON file or its format (or do I need to use a filter?).

    input {
      file {
        path => "/var/log/logstash/data/data*.json"
      }
    }

    output {
      elasticsearch {
        index => "%{[ecs-file]}"
        hosts => ["192.168.23.135"]
      }
    }

I would need some extra information to troubleshoot this.

I see you would like to create an index named after the content of the field ecs-file, but you never extract that field from the input events.

I am also tempted to say you see no error because the data is actually getting indexed, just into an index literally named %{[ecs-file]}.

If that is the case (GET _cat/indices?v to list the indices), I think you should:

  1. Specify the codec of the input lines (see the sketch below)
  2. Delete the "wrongly" created index %{[ecs-file]}, which can be removed with DELETE %25%7B%5Becs-file%5D%7D (the URL-encoded form of the name)
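
For point 1, a minimal sketch (assuming each line of your data*.json files is a self-contained JSON object) could look like this:

input {
  file {
    path => "/var/log/logstash/data/data*.json"
    codec => json   # parse each line as JSON instead of treating it as plain text
  }
}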

If that is not the case, then I would suggest running Logstash in debug mode (log.level: debug in logstash.yml), searching for any errors, and sharing the versions of Logstash and Elasticsearch.

Also, if Logstash has already parsed the files, the sincedb (documentation) has marked the input file(s) as read, and you need to remove it in order to parse the files again.
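
For quick tests, you can also avoid persisting the read positions altogether, so the files are re-read on every restart (testing only, not for production):

input {
  file {
    path => "/var/log/logstash/data/data*.json"
    start_position => "beginning"  # read new files from the top instead of tailing
    sincedb_path => "/dev/null"    # forget read positions between restarts
  }
}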

Hi Luca,

Thanks for the quick reply.

I created an index as follows:

PUT _template/ecs-file
{
    "order" : 0,
    "index_patterns" : [
      "ecs-file-*"
    ],
    "settings" : {
      "index" : {
        "lifecycle" : {
          "name" : "delete_after_7days_3primaries",
          "rollover_alias" : "ecs-file-*"
        }
      }
    },
    "mappings": { },
    "aliases" : { }
}
PUT ecs-file-000001
{
  "aliases": {
    "ecs-file": {
      "is_write_index": true
    }
  }
} 

Cat indices:

GET _cat/indices?v

Output:

green open ecs-file-000001

I have restarted Logstash and this is the logging I get:

[2020-04-12T03:28:01,100][INFO ][logstash.javapipeline    ][testfile] Pipeline started {"pipeline.id"=>"testfile"}
[2020-04-12T03:28:01,281][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:aliexpress], :non_running_pipelines=>[]}
[2020-04-12T03:28:01,523][INFO ][filewatch.observingtail  ][testfile] START, creating Discoverer, Watch with file and sincedb collections
[2020-04-12T03:28:02,318][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
[2020-04-12T03:30:01,069][WARN ][logstash.runner          ] SIGTERM received. Shutting down.
[2020-04-12T03:30:01,250][INFO ][filewatch.observingtail  ] QUIT - closing all files and shutting down.
[2020-04-12T03:30:01,667][INFO ][logstash.javapipeline    ] Pipeline terminated {"pipeline.id"=>"testfile"}
[2020-04-12T03:30:02,602][INFO ][logstash.runner          ] Logstash shut down.
[2020-04-12T03:30:42,664][INFO ][logstash.runner          ] Starting Logstash {"logstash.version"=>"7.4.2"}
[2020-04-12T03:30:46,651][INFO ][org.reflections.Reflections] Reflections took 149 ms to scan 1 urls, producing 20 keys and 40 values
[2020-04-12T03:30:49,188][INFO ][logstash.outputs.elasticsearch][testfile] Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[http://192.168.23.135:9200/]}}
[2020-04-12T03:30:49,802][WARN ][logstash.outputs.elasticsearch][testfile] Restored connection to ES instance {:url=>"http://192.168.23.135:9200/"}
[2020-04-12T03:30:49,930][INFO ][logstash.outputs.elasticsearch][testfile] ES Output version determined {:es_version=>7}
[2020-04-12T03:30:49,948][WARN ][logstash.outputs.elasticsearch][testfile] Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>7}
[2020-04-12T03:30:50,017][INFO ][logstash.outputs.elasticsearch][testfile] New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["//192.168.23.135"]}
[2020-04-12T03:30:50,263][WARN ][org.logstash.instrument.metrics.gauge.LazyDelegatingGauge][testfile] A gauge metric of an unknown type (org.jruby.RubyArray) has been created for key: cluster_uuids. This may result in invalid serialization.  It is recommended to log an issue to the responsible developer/development team.
[2020-04-12T03:30:50,319][INFO ][logstash.javapipeline    ][testfile] Starting pipeline {:pipeline_id=>"testfile", "pipeline.workers"=>1, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>125, :thread=>"#<Thread:0x31c7a556 run>"}
[2020-04-12T03:30:50,324][INFO ][logstash.outputs.elasticsearch][testfile] Using default mapping template
[2020-04-12T03:30:50,538][INFO ][logstash.outputs.elasticsearch][testfile] Attempting to install template {:manage_template=>{"index_patterns"=>"logstash-*", "version"=>60001, "settings"=>{"index.refresh_interval"=>"5s", "number_of_shards"=>1}, "mappings"=>{"dynamic_templates"=>[{"message_field"=>{"path_match"=>"message", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false}}}, {"string_fields"=>{"match"=>"*", "match_mapping_type"=>"string", "mapping"=>{"type"=>"text", "norms"=>false, "fields"=>{"keyword"=>{"type"=>"keyword", "ignore_above"=>256}}}}}], "properties"=>{"@timestamp"=>{"type"=>"date"}, "@version"=>{"type"=>"keyword"}, "geoip"=>{"dynamic"=>true, "properties"=>{"ip"=>{"type"=>"ip"}, "location"=>{"type"=>"geo_point"}, "latitude"=>{"type"=>"half_float"}, "longitude"=>{"type"=>"half_float"}}}}}}}
[2020-04-12T03:30:51,417][INFO ][logstash.inputs.file     ][testfile] No sincedb_path set, generating one based on the "path" setting {:sincedb_path=>"/var/lib/logstash/plugins/inputs/file/.sincedb_9ad389108014868079fae20d6a174d21", :path=>["/var/log/logstash/data/data*.json"]}
[2020-04-12T03:30:51,544][INFO ][logstash.javapipeline    ][testfile] Pipeline started {"pipeline.id"=>"testfile"}
[2020-04-12T03:30:51,847][INFO ][filewatch.observingtail  ][testfile] START, creating Discoverer, Watch with file and sincedb collections
[2020-04-12T03:30:51,876][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:aliexpress], :non_running_pipelines=>[]}
[2020-04-12T03:30:52,954][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}

When I remove the index, it automatically creates the default logstash index. It looks like Logstash is not sending the JSON file.

What I have done:

  • Checked if the paths are correct
  • Tried example JSON data
  • Changed the .rb file to .conf
  • Checked if the permissions were set correctly

Here is the debug logging:

  [2020-04-12T07:14:08,583][DEBUG][logstash.javapipeline    ] Pipeline started successfully {:pipeline_id=>"file", :thread=>"#<Thread:0x5999d9de run>"}
  [2020-04-12T07:14:08,655][DEBUG][org.logstash.execution.PeriodicFlush][file] Pushing flush onto pipeline.
  [2020-04-12T07:14:08,788][INFO ][filewatch.observingtail  ][file] START, creating Discoverer, Watch with file and sincedb collections
  [2020-04-12T07:14:08,848][INFO ][logstash.agent           ] Pipelines running {:count=>1, :running_pipelines=>[:file], :non_running_pipelines=>[]}
  [2020-04-12T07:14:09,135][DEBUG][logstash.agent           ] Starting puma
  [2020-04-12T07:14:09,158][DEBUG][logstash.agent           ] Trying to start WebServer {:port=>9600}
  [2020-04-12T07:14:09,307][DEBUG][logstash.api.service     ] [api-service] start
  [2020-04-12T07:14:07,548][DEBUG][org.logstash.config.ir.CompiledPipeline][file] Compiled output
   P[output-elasticsearch{"hosts"=>["192.168.23.135"]}|[str]pipeline:9:3:```
  elasticsearch {
  #    index => "%{[ecs-file]}"
      hosts => ["192.168.23.135"]
    }
  ```]
   into
   org.logstash.config.ir.compiler.ComputeStepSyntaxElement@c96a6eaa
  [2020-04-12T07:14:09,910][INFO ][logstash.agent           ] Successfully started Logstash API endpoint {:port=>9600}
  [2020-04-12T07:14:11,226][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
  [2020-04-12T07:14:11,227][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
  [2020-04-12T07:14:12,096][DEBUG][logstash.config.source.multilocal] Reading pipeline configurations from YAML {:location=>"/etc/logstash/pipelines.yml"}
  [2020-04-12T07:14:12,176][DEBUG][logstash.config.source.local.configpathloader] Skipping the following files while reading config since they don't match the specified glob pattern {:files=>[]}
  [2020-04-12T07:14:12,179][DEBUG][logstash.config.source.local.configpathloader] Reading config file {:config_file=>"/etc/logstash/conf.d/file.rb"}
  [2020-04-12T07:14:12,222][DEBUG][logstash.agent           ] Converging pipelines state {:actions_count=>0}
  [2020-04-12T07:14:13,595][DEBUG][org.logstash.execution.PeriodicFlush][file] Pushing flush onto pipeline.
  [2020-04-12T07:14:15,085][DEBUG][logstash.config.source.multilocal] Reading pipeline configurations from YAML {:location=>"/etc/logstash/pipelines.yml"}
  [2020-04-12T07:14:15,101][DEBUG][logstash.config.source.local.configpathloader] Skipping the following files while reading config since they don't match the specified glob pattern {:files=>[]}
  [2020-04-12T07:14:15,102][DEBUG][logstash.config.source.local.configpathloader] Reading config file {:config_file=>"/etc/logstash/conf.d/file.rb"}
  [2020-04-12T07:14:15,131][DEBUG][logstash.agent           ] Converging pipelines state {:actions_count=>0}
  [2020-04-12T07:14:16,255][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
  [2020-04-12T07:14:16,257][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
  [2020-04-12T07:14:18,086][DEBUG][logstash.config.source.multilocal] Reading pipeline configurations from YAML {:location=>"/etc/logstash/pipelines.yml"}
  [2020-04-12T07:14:18,110][DEBUG][logstash.config.source.local.configpathloader] Skipping the following files while reading config since they don't match the specified glob pattern {:files=>[]}
  [2020-04-12T07:14:18,111][DEBUG][logstash.config.source.local.configpathloader] Reading config file {:config_file=>"/etc/logstash/conf.d/file.rb"}
  [2020-04-12T07:14:18,120][DEBUG][logstash.agent           ] Converging pipelines state {:actions_count=>0}
  [2020-04-12T07:14:18,595][DEBUG][org.logstash.execution.PeriodicFlush][file] Pushing flush onto pipeline.
  [2020-04-12T07:14:21,085][DEBUG][logstash.config.source.multilocal] Reading pipeline configurations from YAML {:location=>"/etc/logstash/pipelines.yml"}
  [2020-04-12T07:14:21,113][DEBUG][logstash.config.source.local.configpathloader] Skipping the following files while reading config since they don't match the specified glob pattern {:files=>[]}
  [2020-04-12T07:14:21,143][DEBUG][logstash.config.source.local.configpathloader] Reading config file {:config_file=>"/etc/logstash/conf.d/file.rb"}
  [2020-04-12T07:14:21,147][DEBUG][logstash.agent           ] Converging pipelines state {:actions_count=>0}
  [2020-04-12T07:14:21,267][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
  [2020-04-12T07:14:21,268][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
  [2020-04-12T07:14:23,601][DEBUG][org.logstash.execution.PeriodicFlush][file] Pushing flush onto pipeline.
  [2020-04-12T07:14:24,089][DEBUG][logstash.config.source.multilocal] Reading pipeline configurations from YAML {:location=>"/etc/logstash/pipelines.yml"}
  [2020-04-12T07:14:24,109][DEBUG][logstash.config.source.local.configpathloader] Skipping the following files while reading config since they don't match the specified glob pattern {:files=>[]}
  [2020-04-12T07:14:24,110][DEBUG][logstash.config.source.local.configpathloader] Reading config file {:config_file=>"/etc/logstash/conf.d/file.rb"}
  [2020-04-12T07:14:24,143][DEBUG][logstash.agent           ] Converging pipelines state {:actions_count=>0}
  [2020-04-12T07:14:26,296][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
  [2020-04-12T07:14:26,297][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
  [2020-04-12T07:14:27,084][DEBUG][logstash.config.source.multilocal] Reading pipeline configurations from YAML {:location=>"/etc/logstash/pipelines.yml"}
  [2020-04-12T07:14:27,092][DEBUG][logstash.config.source.local.configpathloader] Skipping the following files while reading config since they don't match the specified glob pattern {:files=>[]}
  [2020-04-12T07:14:27,093][DEBUG][logstash.config.source.local.configpathloader] Reading config file {:config_file=>"/etc/logstash/conf.d/file.rb"}
  [2020-04-12T07:14:27,100][DEBUG][logstash.agent           ] Converging pipelines state {:actions_count=>0}
  [2020-04-12T07:14:28,595][DEBUG][org.logstash.execution.PeriodicFlush][file] Pushing flush onto pipeline.
  [2020-04-12T07:14:30,084][DEBUG][logstash.config.source.multilocal] Reading pipeline configurations from YAML {:location=>"/etc/logstash/pipelines.yml"}
  [2020-04-12T07:14:30,100][DEBUG][logstash.config.source.local.configpathloader] Skipping the following files while reading config since they don't match the specified glob pattern {:files=>[]}
  [2020-04-12T07:14:30,101][DEBUG][logstash.config.source.local.configpathloader] Reading config file {:config_file=>"/etc/logstash/conf.d/file.rb"}
  [2020-04-12T07:14:30,129][DEBUG][logstash.agent           ] Converging pipelines state {:actions_count=>0}
  [2020-04-12T07:14:31,328][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
  [2020-04-12T07:14:31,329][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
  [2020-04-12T07:14:33,082][DEBUG][logstash.config.source.multilocal] Reading pipeline configurations from YAML {:location=>"/etc/logstash/pipelines.yml"}
  [2020-04-12T07:14:33,092][DEBUG][logstash.config.source.local.configpathloader] Skipping the following files while reading config since they don't match the specified glob pattern {:files=>[]}
  [2020-04-12T07:14:33,093][DEBUG][logstash.config.source.local.configpathloader] Reading config file {:config_file=>"/etc/logstash/conf.d/file.rb"}
  [2020-04-12T07:14:33,117][DEBUG][logstash.agent           ] Converging pipelines state {:actions_count=>0}
  [2020-04-12T07:14:33,595][DEBUG][org.logstash.execution.PeriodicFlush][file] Pushing flush onto pipeline.
  [2020-04-12T07:14:36,084][DEBUG][logstash.config.source.multilocal] Reading pipeline configurations from YAML {:location=>"/etc/logstash/pipelines.yml"}
  [2020-04-12T07:14:36,091][DEBUG][logstash.config.source.local.configpathloader] Skipping the following files while reading config since they don't match the specified glob pattern {:files=>[]}
  [2020-04-12T07:14:36,093][DEBUG][logstash.config.source.local.configpathloader] Reading config file {:config_file=>"/etc/logstash/conf.d/file.rb"}
  [2020-04-12T07:14:36,102][DEBUG][logstash.agent           ] Converging pipelines state {:actions_count=>0}
  [2020-04-12T07:14:36,354][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
  [2020-04-12T07:14:36,374][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
  [2020-04-12T07:14:38,596][DEBUG][org.logstash.execution.PeriodicFlush][file] Pushing flush onto pipeline.
  [2020-04-12T07:14:39,084][DEBUG][logstash.config.source.multilocal] Reading pipeline configurations from YAML {:location=>"/etc/logstash/pipelines.yml"}
  [2020-04-12T07:14:39,088][DEBUG][logstash.config.source.local.configpathloader] Skipping the following files while reading config since they don't match the specified glob pattern {:files=>[]}
  [2020-04-12T07:14:39,090][DEBUG][logstash.config.source.local.configpathloader] Reading config file {:config_file=>"/etc/logstash/conf.d/file.rb"}
  [2020-04-12T07:14:39,124][DEBUG][logstash.agent           ] Converging pipelines state {:actions_count=>0}
  [2020-04-12T07:14:41,385][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ParNew"}
  [2020-04-12T07:14:41,386][DEBUG][logstash.instrument.periodicpoller.jvm] collector name {:name=>"ConcurrentMarkSweep"}
  [2020-04-12T07:14:42,084][DEBUG][logstash.config.source.multilocal] Reading pipeline configurations from YAML {:location=>"/etc/logstash/pipelines.yml"}
  [2020-04-12T07:14:42,095][DEBUG][logstash.config.source.local.configpathloader] Skipping the following files while reading config since they don't match the specified glob pattern {:files=>[]}
  [2020-04-12T07:14:42,097][DEBUG][logstash.config.source.local.configpathloader] Reading config file {:config_file=>"/etc/logstash/conf.d/file.rb"}
  [2020-04-12T07:14:42,102][DEBUG][logstash.agent           ] Converging pipelines state {:actions_count=>0}
  [2020-04-12T07:14:43,595][DEBUG][org.logstash.execution.PeriodicFlush][file] Pushing flush onto pipeline.
  [2020-04-12T07:14:45,087][DEBUG][logstash.config.source.multilocal] Reading pipeline configurations from YAML {:location=>"/etc/logstash/pipelines.yml"}
  [2020-04-12T07:14:45,108][DEBUG][logstash.config.source.local.configpathloader] Skipping the following files while reading config since they don't match the specified glob pattern {:files=>[]}
  [2020-04-12T07:14:45,110][DEBUG][logstash.config.source.local.configpathloader] Reading config file {:config_file=>"/etc/logstash/conf.d/file.rb"}
  [2020-04-12T07:14:45,115][DEBUG][logstash.agent           ] Converging pipelines state {:actions_count=>0}

  
  

Hello @Magnuss

Ok, got it. You want to use ILM, and there are a few things to take care of.

Unfortunately it is not possible to create a rollover_alias the way you've done it.
You need to give the rollover_alias an explicit name; wildcards are not allowed.

E.g. if you have 5 expected values (e.g. ecs-file-one, ecs-file-two, ecs-file-three...), you have to define 5 different index templates (which can all use the same ILM policy).

If you can accept writing all the data into a single index, you have 2 choices:

  1. Rely on Logstash ILM integration
  2. Handle ILM manually (as you've done)

Let's review the steps you've done for (2).

Let's restart from scratch and delete the ecs-file-000001 index (the alias will go away with it).
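
For example, from the Dev Tools console:

DELETE ecs-file-000001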

The rollover_alias must be ecs-file.

PUT _template/ecs-file
{
    "order" : 0,
    "index_patterns" : [
      "ecs-file-*"
    ],
    "settings" : {
      "index" : {
        "lifecycle" : {
          "name" : "delete_after_7days_3primaries",
          "rollover_alias" : "ecs-file"
        }
      }
    },
    "mappings": { },
    "aliases" : { }
}

Then we can bootstrap the index:

PUT ecs-file-000001
{
  "aliases": {
    "ecs-file": {
      "is_write_index": true
    }
  }
} 

Logstash must be configured with ilm_enabled set to false (doc).

output {
  elasticsearch {
    ilm_enabled => false
    index => "ecs-file"
    hosts => ["192.168.23.135"]
  }
}

Unfortunately, due to the design of the ILM rollover alias creation API, Logstash cannot create the rollover_alias on the fly.
If you decide to use the ILM integration with Logstash, the rollover_alias gets created at Logstash startup and cannot be dynamic (based on a variable).

If you want to use approach (1), it is only feasible if you know in advance all the possible values of the ecs-file variable, and you have to manually bootstrap all the indices.
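
For completeness, a minimal sketch of approach (1) using the plugin's ILM options (attaching the ILM policy you already created; ilm_pattern is left at its default) would be:

output {
  elasticsearch {
    hosts => ["192.168.23.135"]
    ilm_enabled => true
    ilm_rollover_alias => "ecs-file"              # must be a fixed name, no %{...} variables
    ilm_policy => "delete_after_7days_3primaries" # existing ILM policy to attach
  }
}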


In the test you've done with debug enabled, index => "%{[ecs-file]}" is commented out.

To improve this in the future, we're introducing the concept of data streams.

@Luca_Belluccini,

Thanks for helping out. The real problem is that Logstash is not sending the JSON data from the file to Elasticsearch.

The ILM information I provided was for troubleshooting purposes.

For now I'm looking for a solution to get the JSON data into Elasticsearch from JSON files.

In my previous answers I provided some information on how to solve some errors made while setting up the index template and bootstrapping the rollover index, including current limitations with ILM rollover aliases.

  • What is the current content of the pipeline?
  • Do you wish to send data to one index or to multiple indices? Do you want to use ILM rollover or not?
  • Can you share one or two lines of the JSON files you want to ingest?

@Luca_Belluccini

JSON file -> Logstash -> Elasticsearch
I would like it in one index. ILM rollover is not important.
See the sample below:

[{
  "id": 4000828499889,
  "link": "https://www.aliexpress.com/item/4000828499889.html?algo_pvid=d6d52e6d-b14b-4967-ade3-9f13a4f660cc&algo_expid=d6d52e6d-b14b-4967-ade3-9f13a4f660cc-17&btsid=0ab6f83115866312995255394e208c&ws_ab_test=searchweb0_0,searchweb201602_,searchweb201603_",
  "title": "Disney Pixar Cars 2  38 Style Metallic Finish Silver Chrome Lightning McQueen 1:55 Diecast Metal Toy Car Kids Gift",
  "tradeAmount": "2 orders",
  "averageStar": "0.0",
  "descriptionURL": "https://aeproductsourcesite.alicdn.com/product/description/pc/v2/en_US/desc.htm?productId=4000828499889&key=H35f9b08890e7490b80dd4898f6cb5aa2W.zip&token=f28c2bf43fe46bd68ddeb6cda974149a",
  "store": {
    "followingNumber": 65907,
    "establishedAt": "Apr 8, 2019",
    "positiveNum": 8021,
    "positiveRate": "97.5%",
    "name": "patrol dogs Store",
    "id": 4988381,
    "url": "https://www.aliexpress.com/store/4988381",
    "topRatedSeller": true
  },
  "specs": [
    {
      "Brand Name": "Disney"
    },
    {
      "Material": "Metal"
    },
    {
      "Age Range": "> 3 years old"
    },
    {
      "Barcode": "No"
    },
    {
      "Ship/Naval Vessel": "Other"
    },
    {
      "Features": "Diecast"
    },
    {
      "Certification": "3C"
    },
    {
      "3C": "Certificate"
    },
    {
      "Certificate Number": "3587945615"
    },
    {
      "Model Number": "3587944565"
    },
    {
      "Scale": "1:55"
    },
    {
      "Warning": "Cars 3"
    },
    {
      "Type": "Car"
    },
    {
      "Sizz": "As shown"
    },
    {
      "colour": "As shown"
    },
    {
      "Style": "As shown"
    },
    {
      "Features": "Entertainment Mini Education"
    },
    {
      "disney": "disney pixar cars"
    },
    {
      "cars disney": "cars 2 disney"
    },
    {
      "cars disney pixar": "cars 3"
    },
    {
      "disney cars": "mcqueen"
    }
  ],
  "categories": [
    "All Categories",
    "Toys & Hobbies",
    "Diecasts & Toy Vehicles"
  ],

input {
  file {
    path => "/var/log/logstash/*.json"
    codec => json
  }
}

output {
  elasticsearch {
    hosts => ["192.168.23.135"]
  }
}

Hello @Magnuss

Given your snippet:

[{
  "id": 4000828499889,
  "link": "https://www.aliexpress.com/item/4000828499889.html?algo_pvid=d6d52e6d-b14b-4967-ade3-9f13a4f660cc&algo_expid=d6d52e6d-b14b-4967-ade3-9f13a4f660cc-17&btsid=0ab6f83115866312995255394e208c&ws_ab_test=searchweb0_0,searchweb201602_,searchweb201603_",
  "title": "Disney Pixar Cars 2  38 Style Metallic Finish Silver Chrome Lightning McQueen 1:55 Diecast Metal Toy Car Kids Gift",
  "tradeAmount": "2 orders",
  "averageStar": "0.0",

There are 2 problems:

  • the JSON file contains an ARRAY of JSON objects (it starts with a [)
  • unless you pretty-printed the content just for this post, the JSON objects span multiple lines

If we can assume that every file starts with [{ and ends with }] (followed by a newline), you can use the following pipeline.

input {
  file {
    path => "..."
    start_position => "beginning"
    # Join every line that does NOT begin with "]" or "}]" onto the next one,
    # so the whole [ ... ] array accumulates into a single event.
    codec => multiline {
      pattern => "^}?\]"
      negate => true
      what => next
    }
  }
}
filter {
  # Parse the buffered array into the "json" field...
  json {
    source => "message"
    target => "json"
    remove_field => "message"
  }
  # ...and emit one event per array element.
  split {
    field => "json"
  }
}
output {
  stdout { codec => rubydebug }
}

In any case this is highly inefficient, as Logstash will buffer all the lines of the file before sending them through the filters and splitting the events.
The best option would be a file containing JSON lines: one object per line, without the leading [ and trailing ].



@Luca_Belluccini

I tried both configs but Logstash is still not sending the data to Elasticsearch.

My pipeline logs to standard output, so it is normal that you do not see data in Elasticsearch.

Please read my comments on the input file format and why it is a problem.
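
Once the rubydebug output shows correctly parsed events, the stdout block can be swapped for the elasticsearch output shown earlier, for example:

output {
  elasticsearch {
    hosts => ["192.168.23.135"]
    ilm_enabled => false
    index => "ecs-file"
  }
}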

Thanks, got some output now.

[2020-04-12T18:04:19,067][WARN ][logstash.filters.split   ][aliexpress] Only String and Array types are splittable. field:json is of type = NilClass
[2020-04-12T18:04:19,068][WARN ][logstash.filters.split   ][aliexpress] Only String and Array types are splittable. field:json is of type = NilClass
[2020-04-12T18:04:19,071][WARN ][logstash.filters.split   ][aliexpress] Only String and Array types are splittable. field:json is of type = NilClass

[2020-04-12T18:04:19,065][WARN ][logstash.filters.json    ][aliexpress] Error parsing json {:source=>"message", :raw=>"  \"averageStar\": \"1.0\",\n  \"descriptionURL\": \"https://aeproductsourcesite.alicdn.com/product/description/pc/v2/en_US/desc.htm?productId=32990160874&key=Hf1288693353941869b6ac80ae82c7582C.zip&token=90df0df89292942612944aa9ea155dd1\",\n  \"store\": {\n    \"followingNumber\": 135,\n    \"establishedAt\": \"Oct 18, 2017\",\n    \"positiveNum\": 1049,\n    \"positiveRate\": \"94.7%\",\n    \"name\": \"Q Channel World Store\",\n    \"id\": 3224136,\n    \"url\": \"https://www.aliexpress.com/store/3224136\",\n    \"topRatedSeller\": false\n  },\n  \"specs\": [\n    {\n      \"Brand Name\": \"Lesion\"\n    },\n    {\n      \"Warning\": \"None\"\n    },\n    {\n      \"Age Range\": \"5-7 Years\"\n    },\n    {\n      \"Age Range\": \"Grownups\"\n    },\n    {\n      \"Age Range\": \"14 Years & up\"\n    },\n    {\n      \"Age Range\": \"8~13 Years\"\n    },\n    {\n      \"Theme\": \"Sports\"\n    },\n    {\n      \"Model Number\": \"Mini Portable Inflatable Tube Pump\"\n    }\n  ],\n  \"categories\": [\n    \"All Categories\",\n    \"Toys & Hobbies\",\n    \"Pools & Water Fun\",\n    \"Pool Rafts & Inflatable Ride-ons\"\n  ],\n  \"wishedCount\": 7,\n  \"quantity\": 973,\n  \"photos\": [\n    \"https://ae01.alicdn.com/kf/HTB1wMWtM9zqK1RjSZPxq6A4tVXaz/2-Styles-Mini-Portable-Inflatable-Tube-Inflatable-Accessorial-Tool-For-Summer-Water-Game-Toys-Balloon-Toys.jpg\",\n    \"https://ae01.alicdn.com/kf/HTB1BX1vMYPpK1RjSZFFq6y5PpXaZ/2-Styles-Mini-Portable-Inflatable-Tube-Inflatable-Accessorial-Tool-For-Summer-Water-Game-Toys-Balloon-Toys.jpg\",\n    \"https://ae01.alicdn.com/kf/HTB17xS2M4naK1RjSZFtq6zC2VXa8/2-Styles-Mini-Portable-Inflatable-Tube-Inflatable-Accessorial-Tool-For-Summer-Water-Game-Toys-Balloon-Toys.jpg\",\n    \"https://ae01.alicdn.com/kf/HTB1GyuAM3HqK1RjSZFPq6AwapXao/2-Styles-Mini-Portable-Inflatable-Tube-Inflatable-Accessorial-Tool-For-Summer-Water-Game-Toys-Balloon-Toys.jpg\",\n    \"https://ae01.alicdn.com/kf/HTB1jkuyM7voK1RjSZPfq6xPKFXad/2-Styles-Mini-Portable-Inflatable-Tube-Inflatable-Accessorial-Tool-For-Summer-Water-Game-Toys-Balloon-Toys.jpg\",\n    \"https://ae01.alicdn.com/kf/HTB1nkutM9zqK1RjSZPxq6A4tVXac/2-Styles-Mini-Portable-Inflatable-Tube-Inflatable-Accessorial-Tool-For-Summer-Water-Game-Toys-Balloon-Toys.jpg\"\n  ],\n  \"skuOptions\": [\n    {\n      \"name\": \"Color\",\n      \"values\": [\n        \"1 pcs random color\",\n        \"1 pcs random color\"\n      ]\n    }\n  ],\n  \"prices\": [\n    {\n      \"price\": \"US $3.79\",\n      \"attributes\": [\n        \"Light Green\"\n      ]\n    },\n    {\n      \"price\": \"US $1.26\",\n      \"attributes\": [\n        \"Light Grey\"\n      ]\n    }\n  ],\n  \"companyId\": 240741874,\n  \"memberId\": 231367345\n}]", :exception=>#<LogStash::Json::ParserError: Unexpected character (':' (code 58)): expected a valid value (number, String, array, object, 'true', 'false' or 'null')

Ok, the problem is related to the input file format.
As I said a few messages ago, the JSON file you're trying to parse contains a list of JSON objects.

Logstash usually works with files containing one JSON object per line.

E.g.

{ "fieldone": 1, "anotherfield": "http://..." ... }
{ "fieldone": 2, "anotherfield": "http://..." ... }
{ "fieldone": 3, "anotherfield": "http://..." ... }
{ "fieldone": 4, "anotherfield": "http://..." ... }

While your file seems to be as follows:

[{ "fieldone": 1,
  "anotherfield": "http://..." ... 
},
{
  "fieldone": 2,
  "anotherfield": "http://..." ... 
},
{
  "fieldone": 3,
  "anotherfield": "http://..." ... 
},
{ 
  "fieldone": 4,
  "anotherfield": "http://..." ... }]

The file can be converted to the correct format using Python, for example:

import json

# Read the whole JSON array, then write it back as JSON lines
# (one object per line), which Logstash can consume with codec => json.
with open('infile.json') as input_file:
    data = json.load(input_file)

with open('outfile.json', 'w') as out_file:
    for e in data:
        json.dump(e, out_file)
        out_file.write('\n')
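
You would run this once per input file and then point the Logstash file input at the converted outfile.json.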

Or otherwise, we need to find a way to detect the last ] which terminates the JSON list.


Hi @Luca_Belluccini

I managed to get a better JSON format. Now I'm wondering what the conf file should look like.

Below is a sample:

{"web-scraper-order":"1586794563-1","web-scraper-start-url":"https:\/\/www.test.com\/nl\/l\/puzzels\/N\/10560\/?page=216","title":"Wie zit er in de boom puzzel","price":"","image-src":"","description":"","voorraad":""}
{"web-scraper-order":"1586794563-10","web-scraper-start-url":"https:\/\/www.test.com\/nl\/l\/puzzels\/N\/10560\/?page=216","title":"Eurographics puzzel Ford F-series evolution 1000 stukjes","price":"","image-src":"","description":"","voorraad":""}
{"web-scraper-order":"1586794563-100","web-scraper-start-url":"https:\/\/www.test.com\/nl\/l\/puzzels\/N\/10560\/?page=216","title":"","price":"","image-src":"","description":"","voorraad":"Op voorraad"}
{"web-scraper-order":"1586794563-101","web-scraper-start-url":"https:\/\/www.test.com\/nl\/l\/puzzels\/N\/10560\/?page=216","title":"","price":"","image-src":"","description":"","voorraad":"Op voorraad"}
{"web-scraper-order":"1586794563-102","web-scraper-start-url":"https:\/\/www.test.com\/nl\/l\/puzzels\/N\/10560\/?page=216","title":"","price":"","image-src":"","description":"","voorraad":"Op voorraad"}
{"web-scraper-order":"1586794563-103","web-scraper-start-url":"https:\/\/www.test.com\/nl\/l\/puzzels\/N\/10560\/?page=216","title":"","price":"","image-src":"","description":"","voorraad":"Op voorraad"}
{"web-scraper-order":"1586794563-11","web-scraper-start-url":"https:\/\/www.test.com\/nl\/l\/puzzels\/N\/10560\/?page=216","title":"Puzzel slak Jou\u00e9co","price":"","image-src":"","description":"","voorraad":""}
{"web-scraper-order":"1586794563-12","web-scraper-start-url":"https:\/\/www.test.com\/nl\/l\/puzzels\/N\/10560\/?page=216","title":"Beach Summer Cottage Sunsout 1000","price":"","image-src":"","description":"","voorraad":""}
{"web-scraper-order":"1586794563-13","web-scraper-start-url":"https:\/\/www.test.com\/nl\/l\/puzzels\/N\/10560\/?page=216","title":"Puzzel Maxi Memopuzzel - Dieren - 40 stukjes","price":"","image-src":"","description":"","voorraad":""}

This is what I'm currently trying:

input {
  file {
    path => "/var/log/logstash/*.json"
    start_position => "beginning"

    codec => multiline {
      pattern => '^}$'
      negate => true
      what => next
    }
  }
}

filter {
  json {
    source => "message"
    target => "json"
  }
}


output {
  stdout { codec => rubydebug }
}
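
And this variant, which relies on the file now containing one JSON object per line and sends it straight to Elasticsearch: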

input {
  file {
    path => "/var/log/logstash/*.json"
    start_position => "beginning"
    codec => json
  }
}

output {
  elasticsearch {
    hosts => ["192.168.23.135"]
    ilm_enabled => false
    index => "ecs-file"
  }
}


It is working, thanks for helping out! :smiley:


This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.