A Logstash to BigQuery transmission problem

Hello!
I encountered an error when I used Logstash to transfer Kafka data to Google BigQuery:

[ERROR][logstash.outputs.googlebigquery][main][30729410306cb4d98635a59cbf171cea3b769fd734b29b17e925785d4edac5ba] Error uploading data. {:exception=>com.google.cloud.bigquery.BigQueryException: Request payload size exceeds the limit: 10485760 bytes.}

This is my config:

output {
  if [database] == "t8891" {
    google_bigquery {
      project_id          => "newcar8891"
      dataset             => "logstash"
      csv_schema          => "original:STRING,timestamp:TIMESTAMP,id:STRING,topic:STRING,consumer_group:STRING,partition:STRING,offset:STRING,key:STRING,database:STRING"
      json_key_file       => "/home/shurui/bin/newcar8891-maxwell.json"
      error_directory     => "/opt/module/bqerror"
      table_prefix        => "logstash_t8891"
      date_pattern        => "%Y-%m"
      batch_size          => 6000
      flush_interval_secs => 10
      batch_size_bytes    => 6000000
    }
  }
}

It's not related to Logstash.

Check the BigQuery quotas documentation:

HTTP request size limit: 10 MB. Exceeding this value causes invalid errors.

Internally the request is translated from HTTP JSON into an internal data structure. The translated data structure has its own enforced size limit. It's hard to predict the size of the resulting internal data structure, but if you keep your HTTP requests to 10 MB or less, the chance of hitting the internal limit is low.

Hello!

batch_size => 6000
flush_interval_secs => 10
batch_size_bytes => 6000000

Can I adjust these three parameters to fix this error?

Your batch_size_bytes is already under 10 MB, but batch_size probably takes precedence, and a batch of 6000 events can easily exceed 10 MB.

Try the default batch_size of 128, then increase it gradually until it stops working.

The documentation recommends around 500.
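For example, a first attempt could look like the sketch below (csv_schema and error_directory omitted for brevity; everything else is taken from the config above). The batch_size of 500 follows the documented recommendation and batch_size_bytes is kept well under the 10 MB request limit; both numbers are suggestions to tune, not tested values.

google_bigquery {
  project_id          => "newcar8891"
  dataset             => "logstash"
  json_key_file       => "/home/shurui/bin/newcar8891-maxwell.json"
  table_prefix        => "logstash_t8891"
  date_pattern        => "%Y-%m"
  flush_interval_secs => 10
  batch_size          => 500      # recommended starting point; fall back to the default 128 if this still trips the limit
  batch_size_bytes    => 5000000  # ~5 MB, well below the 10 MB HTTP request limit
}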


Thanks!
My current practice is to transfer the tables that may exceed the limit separately, so that the other tables are not affected.
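For reference, one way to do that is a separate conditional output per source, so a failure on the large one does not block the others. This is only a sketch, assuming the oversized data can be identified by the [database] field; the "t8891_big" value and the matching table_prefix are made-up placeholders.

output {
  if [database] == "t8891_big" {                 # "t8891_big" is a placeholder for the oversized source
    google_bigquery {
      project_id       => "newcar8891"
      dataset          => "logstash"
      json_key_file    => "/home/shurui/bin/newcar8891-maxwell.json"
      table_prefix     => "logstash_t8891_big"   # placeholder prefix, keeps these tables separate
      date_pattern     => "%Y-%m"
      batch_size       => 128                    # conservative default for the large table
      batch_size_bytes => 5000000
    }
  }
}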
