CSV filter plugin mapping problem

Can somebody help me understand this error?

[2020-01-28T17:29:58,361][DEBUG][o.e.a.b.TransportShardBulkAction] [AMIENS] [llmlogs-2020.01-000001][0] failed to execute bulk item (index) index {[llmlogs-2020.01-000001][_doc][9kr87G8BJZYuLet29GXZ], source[n/a, actual length: [10.9kb], max length: 2kb]}
org.elasticsearch.index.mapper.MapperParsingException: failed to parse field [FName] of type [float] in document with id '9kr87G8BJZYuLet29GXZ'. Preview of field's value: 'FName'

I'm using automatic column header detection with the csv plugin. Most columns are being detected just fine, but every now and then the csv plugin seems to detect the first line of the csv as data, not as column headers.

Can you check the mapping of the field FName in the llmlogs-2020.01-000001 index? It is probably mapped as float, but at least one document carries the string FName as the value of that field, like "FName":"FName".

Hope this helps.
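For anyone following along, one way to check the mapping is the get-field-mapping API (GET /<index>/_mapping/field/<field>). Below is a minimal Python sketch using only the standard library. The helper names are made up for illustration, the host is assumed to be a local node on port 9200, and the response layout assumed in extract_field_type is the typeless 7.x one, so adjust it if your version answers differently.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def field_mapping_url(host, index, field):
    """Build the URL for the get-field-mapping API."""
    return (
        f"{host.rstrip('/')}/{quote(index, safe='')}"
        f"/_mapping/field/{quote(field, safe='')}"
    )


def extract_field_type(response, index, field):
    """Return the mapped type of `field`, or None if it is not mapped.

    ASSUMPTION: typeless (7.x style) response layout:
    {index: {"mappings": {field: {"mapping": {leaf_name: {"type": ...}}}}}}
    """
    entry = response.get(index, {}).get("mappings", {}).get(field)
    if not entry:
        return None
    # The inner "mapping" object is keyed by the last part of the field path.
    leaf = field.split(".")[-1]
    return entry["mapping"][leaf].get("type")


def fetch_field_type(host, index, field):
    """Ask a running node for the field's type (needs network access)."""
    with urlopen(field_mapping_url(host, index, field)) as resp:
        return extract_field_type(json.load(resp), index, field)


# Example call against a local node (host is an assumption):
# fetch_field_type("http://localhost:9200", "llmlogs-2020.01-000001", "FName")
```

If this reports float for FName, then any event in which the header line slipped through as data will be rejected with exactly the MapperParsingException shown above.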

Well that's what the error seems to indicate. But no document has FName as a value anywhere in my data. I've done many tests with deleting the index and reingesting my CSVs and looking at each in detail. What really seems to be the problem is that some of the CSV files have empty cells in some of the columns, so you get something that looks like this:

Timestamp,Map,Untagged,ProgramSize,GenericPlatformMallocCrash,FName,Stats,EnginePreInit,FileSystem,ThreadStack,UObject,CsvProfiler,Localization,ConfigSystem,InitUObject,OOMBackupPool,RHIMisc,Meshes,RenderTargets,Textures,Shaders,RenderingThread,UI,AssetRegistry,AsyncLoading,StaticMesh,PhysX,Materials,Audio,Animation,EngineInit,AudioMixer,AudioPrecache,EngineMisc,LoadMapMisc,TaskGraphMiscTasks,GC,Particles,AudioDecompress,AudioFullDecompress,AudioRealtimePrecache,PSO,FMallocUnused,TrackedTotal,Total,Untracked,WorkingSetSize,PagefileUsed,StreamingManager,Networking,                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
2020-01-29T15:01:22.121Z,Outpost_P,8.07,0.00,3.82,8.88,3.67,17.19,13.32,6.44,126.63,0.00,4.17,1.73,1186.43,32.00,4.18,69.26,225.37,236.06,28.25,0.92,1.92,21.28,11.09,0.83,2.30,0.73,31.10,31.98,0.12,12.73,0.05,0.03,2.18,0.58,0.00,0.02,0.00,159.73,7.03,0.00,97.23,2357.34,6882.60,4658.29,2068.58,2565.27,
2020-01-29T15:01:52.129Z,Outpost_P,9.48,0.00,3.82,8.88,4.92,16.95,14.38,6.45,135.14,0.00,4.17,1.73,1186.43,32.00,4.19,84.03,367.23,305.11,27.47,4.03,19.73,21.28,7.01,0.83,2.30,0.74,31.97,31.97,0.12,13.05,0.05,8.56,2.01,6.93,0.02,0.02,0.00,159.73,7.03,0.05,43.10,2573.00,7019.62,4490.92,2129.38,2879.29,0.07,
2020-01-29T15:02:22.132Z,Station_Lobby,10.29,0.00,3.82,9.13,5.03,16.93,12.26,6.45,120.96,0.00,4.17,1.73,1186.43,32.00,4.27,135.80,440.53,591.12,33.19,5.18,10.73,21.28,8.10,15.57,14.51,0.95,35.01,32.07,0.12,16.50,0.05,21.19,7.18,13.16,0.02,0.18,0.00,182.14,8.25,0.14,47.64,3055.59,7408.83,4401.08,2279.37,3448.17,0.49,1.01,
2020-01-29T15:02:52.139Z,Station_Lobby,10.10,0.00,3.82,9.13,5.03,16.93,12.26,6.45,120.83,0.00,4.17,1.73,1186.43,32.00,4.27,132.23,381.67,590.21,33.18,5.40,10.43,21.28,7.98,15.57,14.52,0.95,34.97,32.07,0.09,17.65,0.05,21.15,7.18,13.35,0.03,0.18,0.00,186.42,8.31,0.17,38.67,2988.37,7289.51,4339.95,2253.25,3358.62,0.49,0.99,

As you can see, the last few columns don't have values in the first few rows. I believe the error is related to this. I've tried the parameters below set to true and false in many combinations, and none of them seems to fix the problem. I'm starting to think the CSV plugin might have a bug.

   autogenerate_column_names => false
   autodetect_column_names => true
   skip_empty_columns => true
   skip_empty_rows => true
   skip_header => true
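To test the empty-trailing-cells theory independently of Logstash, the files can be scanned directly. The following is a rough Python sketch, not anything the plugin does itself: check_rows and trim are hypothetical helper names, and it assumes the first line of each file is the header. It reports rows that have fewer (or more) cells than the header after trailing empty cells are dropped, and any later line that repeats the header.

```python
import csv
import io


def trim(row):
    """Strip whitespace from each cell and drop empty trailing cells."""
    cells = [cell.strip() for cell in row]
    while cells and cells[-1] == "":
        cells.pop()
    return cells


def check_rows(text):
    """Return (header, problems) for CSV text whose first line is the header.

    problems is a list of (line_number, description) tuples.
    """
    reader = csv.reader(io.StringIO(text))
    header = trim(next(reader))
    problems = []
    for row in reader:
        cells = trim(row)
        if not cells:
            continue  # blank or whitespace-only line
        if cells == header:
            problems.append((reader.line_num, "header repeated as data"))
        elif len(cells) != len(header):
            problems.append(
                (reader.line_num, f"{len(cells)} of {len(header)} columns")
            )
    return header, problems


# Usage with a file on disk (the file name is a placeholder):
# with open("some_llm_log.csv", newline="") as f:
#     header, problems = check_rows(f.read())
```

Rows flagged as short are the ones with empty trailing cells. A "header repeated as data" hit would point at the header line itself rather than at the empty cells.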

How do you get those CSVs into Elasticsearch documents?

Using the CSV filter plugin with Logstash and Filebeat. Here is my Logstash config:

input {
  beats {
    port => 5044
  }
}
filter {
  csv {
    id => "LLM"
    autogenerate_column_names => false
    autodetect_column_names => true
    skip_empty_columns => true
    skip_empty_rows => true
    skip_header => true
  }
}
output {
  stdout { codec => rubydebug }
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "llmlogs-2020.01-000001"
  }
}
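A possible explanation for a header ending up as data with this config, sketched in Python. This is a simplified model and an assumption about how header autodetection behaves, not the plugin's actual code: it supposes that a single filter instance takes the first line it ever sees as the column names and afterwards drops only lines that match those names exactly. Under that assumption, a header line from a second file that differs in any way (an extra column in this made-up example) is indexed as a normal event. As far as I know, the csv filter documentation also notes that autodetect_column_names and skip_header only work with pipeline workers set to 1, so that is worth checking too.

```python
def autodetect_filter(lines, skip_header=True):
    """Toy model of header autodetection on a single shared stream.

    ASSUMPTION: the first line seen becomes the header for every later
    line, and only lines identical to it are skipped as headers.
    """
    columns = None
    events = []
    for line in lines:
        values = line.split(",")
        if columns is None:
            columns = values  # first line ever seen becomes the header
            continue
        if skip_header and values == columns:
            continue  # exact repeat of the remembered header is dropped
        # Anything else is treated as data, including a different header.
        events.append(dict(zip(columns, values)))
    return events


# Two hypothetical files arriving over the same Beats input, one after the other.
stream = [
    "Timestamp,Map,FName",           # header of file 1
    "t1,Outpost_P,8.88",
    "Timestamp,Map,FName,Stats",     # header of file 2 has one more column
    "t2,Station_Lobby,9.13,5.03",
]

events = autodetect_filter(stream)
# events[1] is {"Timestamp": "Timestamp", "Map": "Map", "FName": "FName"},
# which a float mapping for FName would reject.
```

If the rubydebug output on stdout shows an event whose values equal the column names, that would be consistent with this model.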

Is that issue also showing in stdout, i.e. does the rubydebug output contain a document with such a value? If so, I'd like to move this thread over to the Logstash forum, but I want to be sure first.

Yes, it's showing in stdout too. Feel free to move it.
