Hi there,
I have seen this issue reported before, and it still seems to apply.
I am using this version via PySpark:
from pyspark.sql import SparkSession

ss = (
    SparkSession.builder.appName("ES")
    .config("spark.driver.memory", "8g")
    .config(
        "spark.jars.packages",
        "org.elasticsearch:elasticsearch-spark-30_2.12:7.13.0",
    )
    .getOrCreate()
)
printSchema() works:
root
|-- @timestamp: timestamp (nullable = true)
|-- AuthenticationPackage: string (nullable = true)
|-- Destination: string (nullable = true)
|-- DomainName: string (nullable = true)
|-- EventID: long (nullable = true)
|-- FailureReason: string (nullable = true)
|-- LogHost: string (nullable = true)
|-- LogonID: string (nullable = true)
|-- LogonType: long (nullable = true)
|-- LogonTypeDescription: string (nullable = true)
|-- ParentProcessID: string (nullable = true)
|-- ParentProcessName: string (nullable = true)
|-- ProcessID: string (nullable = true)
|-- ProcessName: string (nullable = true)
|-- ServiceName: string (nullable = true)
|-- Source: string (nullable = true)
|-- Status: string (nullable = true)
|-- SubjectDomainName: string (nullable = true)
|-- SubjectLogonID: string (nullable = true)
|-- SubjectUserName: string (nullable = true)
|-- Time: long (nullable = true)
|-- UserName: string (nullable = true)
|-- event: struct (nullable = true)
| |-- category: string (nullable = true)
|-- host: struct (nullable = true)
| |-- name: string (nullable = true)
|-- message: string (nullable = true)
|-- user: struct (nullable = true)
| |-- name: string (nullable = true)
However, any operation such as show() triggers the "position not found" error.
The mappings are correct:
{'mappings': {'unified-host': {'properties': {'@timestamp': {'type': 'date'},
'AuthenticationPackage': {'type': 'keyword'},
'Destination': {'type': 'text',
'fields': {'keyword': {'type': 'keyword', 'ignore_above': 256}}},
'DomainName': {'type': 'text',
'fields': {'keyword': {'type': 'keyword', 'ignore_above': 256}}},
'EventID': {'type': 'long'},
'FailureReason': {'type': 'text',
'fields': {'keyword': {'type': 'keyword', 'ignore_above': 256}}},
'LogHost': {'type': 'text',
'fields': {'keyword': {'type': 'keyword', 'ignore_above': 256}}},
'LogonID': {'type': 'text',
'fields': {'keyword': {'type': 'keyword', 'ignore_above': 256}}},
'LogonType': {'type': 'long'},
'LogonTypeDescription': {'type': 'text',
'fields': {'keyword': {'type': 'keyword', 'ignore_above': 256}}},
'ParentProcessID': {'type': 'text',
'fields': {'keyword': {'type': 'keyword', 'ignore_above': 256}}},
'ParentProcessName': {'type': 'text',
'fields': {'keyword': {'type': 'keyword', 'ignore_above': 256}}},
'ProcessID': {'type': 'text',
'fields': {'keyword': {'type': 'keyword', 'ignore_above': 256}}},
'ProcessName': {'type': 'text',
'fields': {'keyword': {'type': 'keyword', 'ignore_above': 256}}},
'ServiceName': {'type': 'text',
'fields': {'keyword': {'type': 'keyword', 'ignore_above': 256}}},
'Source': {'type': 'text',
'fields': {'keyword': {'type': 'keyword', 'ignore_above': 256}}},
'Status': {'type': 'text',
'fields': {'keyword': {'type': 'keyword', 'ignore_above': 256}}},
'SubjectDomainName': {'type': 'text',
'fields': {'keyword': {'type': 'keyword', 'ignore_above': 256}}},
'SubjectLogonID': {'type': 'text',
'fields': {'keyword': {'type': 'keyword', 'ignore_above': 256}}},
'SubjectUserName': {'type': 'text',
'fields': {'keyword': {'type': 'keyword', 'ignore_above': 256}}},
'Time': {'type': 'long'},
'UserName': {'type': 'text',
'fields': {'keyword': {'type': 'keyword', 'ignore_above': 256}}},
'event': {'properties': {'category': {'type': 'text',
'fields': {'keyword': {'type': 'keyword', 'ignore_above': 256}}}}},
'host': {'properties': {'name': {'type': 'text',
'fields': {'keyword': {'type': 'keyword', 'ignore_above': 256}}}}},
'message': {'type': 'keyword'},
'user': {'properties': {'name': {'type': 'text',
'fields': {'keyword': {'type': 'keyword', 'ignore_above': 256}}}}}}}}}}
Is this still related to the dot notation?
How can I work around it? Is there a way to exclude specific fields?
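To make the question concrete, here is a rough sketch of the kind of exclusion I have in mind. It assumes the connector's documented `es.read.field.exclude` and `es.read.field.as.array.include` settings are the right knobs for this error, which I have not confirmed. The host `localhost:9200`, the index name `my-index`, the choice of fields, and the helper names `es_read_options` and `read_index` are all placeholders of mine, not something taken from the setup above.

```python
def es_read_options(nodes, exclude=(), as_array=()):
    """Build a dict of connector read options (hypothetical helper).

    exclude  -> fields to drop on read via es.read.field.exclude
    as_array -> fields to declare as arrays via es.read.field.as.array.include
    """
    opts = {"es.nodes": nodes}
    if exclude:
        opts["es.read.field.exclude"] = ",".join(exclude)
    if as_array:
        opts["es.read.field.as.array.include"] = ",".join(as_array)
    return opts


def read_index(spark, index, options):
    """Load an index as a DataFrame using the given connector options."""
    return (
        spark.read.format("org.elasticsearch.spark.sql")
        .options(**options)
        .load(index)
    )


# Placeholder values: the host and the excluded fields are assumptions.
options = es_read_options("localhost:9200", exclude=["event", "host", "user"])

# With the session from above (needs a reachable cluster, so left commented out):
# df = read_index(ss, "my-index", options)
# df.show()
```

If dropping the nested `event`, `host`, and `user` objects makes show() work, that would at least narrow down which fields are involved.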