Store XML files on S3 and send specific XML tags to Elasticsearch for indexing

Hi folks

Can I use Logstash to send only some tags from an XML file to Elasticsearch and store the physical XML file on S3?

                    |----> Elasticsearch
                    |
XML --> Logstash -->|
                    |
                    |----> S3

From the XML file, I need to send only some specific tags to the Elasticsearch index.

To S3, I need to send the complete XML file. I cannot transform the XML file to JSON due to a government rule; I need to store the file in its original XML format, without changes.

The XML files are in a directory on my local disk.

I need to store these files on S3 so that, when requested, a user can search in Elasticsearch and then download the physical file from S3 through a webpage that I will create.

Anyone with a similar situation?
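For context, a minimal sketch of the input section I am testing with to read whole XML documents from the local directory (the path is a placeholder, and it assumes every file starts with an XML declaration):

input {
  file {
    path => "/data/xml/*.xml"
    # join all lines of one document into a single event, so the complete
    # XML ends up in the message field
    codec => multiline {
      pattern => "^<\?xml"
      negate => true
      what => "previous"
      auto_flush_interval => 1
    }
  }
}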


The output section of my Logstash conf file:

output {

  # testing with a local file
  stdout {
    codec => rubydebug
  }

  elasticsearch {
    hosts => "MY ElasticSearch Host"
    user => "elastictem"
    password => "My Password"
    index => "xmlTest"
  }

  s3 {
    access_key_id => "My Access Key"
    secret_access_key => "My Secret Access Key"
    region => "us-east-1"
    bucket => "Testxml1"
    time_file => 1
  }
}

@costamarcelo
Maybe check out the clone filter.

If you clone incoming docs and use add_tag to differentiate them (e.g. `add_tag => "original"`), then you can have conditional logic in your pipeline to do different things with each: the original goes to S3 and the other is stripped of unnecessary bits and output to Elasticsearch.
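Roughly like this, for example (an untested sketch: the XPath expressions, field names, and index name are placeholders, and it assumes add_tag on the clone filter is applied to the cloned copies):

filter {
  # make one copy of each event; the copy is typed "xml_copy" and tagged "original"
  clone {
    clones => ["xml_copy"]
    add_tag => ["original"]
  }

  # only the untagged event gets parsed and stripped for Elasticsearch
  if "original" not in [tags] {
    xml {
      source => "message"
      store_xml => false
      # extract just the tags you want indexed (placeholder XPath expressions)
      xpath => {
        "/root/invoiceId/text()" => "invoice_id"
        "/root/status/text()" => "status"
      }
    }
    # drop the raw XML from the copy headed to Elasticsearch
    mutate {
      remove_field => ["message"]
    }
  }
}

output {
  if "original" in [tags] {
    # the tagged copy still holds the untouched XML in the message field;
    # the plain codec writes that field as-is instead of a JSON event
    s3 {
      access_key_id => "My Access Key"
      secret_access_key => "My Secret Access Key"
      region => "us-east-1"
      bucket => "Testxml1"
      codec => plain { format => "%{message}" }
      time_file => 1
    }
  } else {
    elasticsearch {
      hosts => "MY ElasticSearch Host"
      user => "elastictem"
      password => "My Password"
      index => "xmltest"   # index names must be lowercase
    }
  }
}

The copy tagged "original" never passes through the xml and mutate filters, so nothing in it is converted to JSON; only the untagged event is reduced to the fields you want searchable.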
