Using a custom analyzer / tokenizer to break down a string into subfields

Hi,

I receive a CSV report of PC specifications that I'm importing into ES with Logstash.

One of the fields in the report comes in the following format:

C: (Used '14.51'GB of '80.01'GB , '18.13'%), D: (Used '42.42'GB of '385.75'GB , '11'%)

The number of entries is dynamic, depending on the number of drives in the user's PC.

Instead of just storing this string, I would like to be able to store it in the following format:

"disks": [
	{
		"drivename": "C",
		"used": "14.51",
		"size": "80.01",
		"remaining": "65.5" #needs to be calculated
		"percentage": "18.13"
	},
	{
		"drivename": "D",
		"used": "42.42",
		"size": "385.75",
		"remaining": "343.33" #needs to be calculated
		"percentage": "11"
	}
]

I think it can be done by defining the disks field with a custom analyzer that uses a pattern tokenizer to break the string down, but the examples I find in the documentation are too simple for me to make much headway. Is there a more complex example that splits a string into multiple fields that I can refer to?

If anyone thinks that there's a better way to do this, I'm all ears too.

Thank you!
Wong

An analyzer only affects how the text is tokenized for search; it won't restructure the stored document. As you are using Logstash to do the import, the simplest way is to do the conversion into the desired format there:

https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
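
Since the number of drives varies per machine, a single grok pattern is awkward to write for this field. As a rough sketch (assuming the raw string has already been extracted from the CSV into a field I'm calling diskinfo here, which is a made-up name), a ruby filter can split it into an array of objects and calculate the remaining space at the same time:

filter {
  ruby {
    code => "
      raw = event.get('diskinfo')  # assumed field holding the raw drive string
      if raw
        # Match each 'X: (Used 'a'GB of 'b'GB , 'c'%)' group in the string
        disks = raw.scan(/([A-Z]): \(Used '([\d.]+)'GB of '([\d.]+)'GB , '([\d.]+)'%\)/).map do |name, used, size, pct|
          {
            'drivename'  => name,
            'used'       => used.to_f,
            'size'       => size.to_f,
            'remaining'  => (size.to_f - used.to_f).round(2),  # calculated here
            'percentage' => pct.to_f
          }
        end
        event.set('disks', disks)
      end
    "
  }
}

With your sample input this should produce a disks array with two objects, e.g. remaining = 80.01 - 14.51 = 65.5 for C: and 385.75 - 42.42 = 343.33 for D:. If you want to query the per-drive objects independently, you'd also need to map disks as a nested type in Elasticsearch, since arrays of objects are otherwise flattened at index time.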