XML or JSON into Elasticsearch

Hi everybody,

I think we are doing something wrong here, and I really need your help because we are stuck.
We have several systems that documents (XML/JSON) pass through. We want to pass these documents on to ELK so we can search through the content and do analysis per customer/system. Nobody in the group has ELK experience, so what we have done so far is use filebeat -> logstash -> elasticsearch -> kibana. But all the information we need for further processing ends up in the message field of the Filebeat event, and what we want is structured data. How do we get this? Can it be done via Filebeat, or can we pass the documents directly into Elasticsearch via an API?

Best regards
Kim

Hi @kim_frederiksen

First, what version of the ELK Stack are you on?

Second, Elasticsearch has an extensive REST (JSON) API, so yes, you can write JSON documents directly to Elasticsearch if you like. You are not required to use Filebeat and/or Logstash.
The Document APIs are the way to do that; we can chat about that if that is really what you want to do.
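
As a rough sketch of what writing a document directly could look like, the Python below builds (but does not send) a request for the index-document endpoint, `POST /<index>/_doc`. The host `http://localhost:9200`, the index name `my-documents`, the sample document and the helper name `build_index_request` are all assumptions for illustration; authentication and TLS are left out.

```python
import json
import urllib.request


def build_index_request(base_url, index, doc):
    """Build a POST request for the index-document endpoint (<index>/_doc)."""
    url = f"{base_url.rstrip('/')}/{index}/_doc"
    body = json.dumps(doc).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical host, index and document, for illustration only.
request = build_index_request(
    "http://localhost:9200",
    "my-documents",
    {"system": "system-a", "message": "<order><id>1</id></order>"},
)

# To actually send it (needs a reachable cluster):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it makes it easy to inspect the URL and body before pointing it at a real cluster.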

Third, as you say your team has little experience with Elasticsearch, there are some other really important concepts such as mappings (think schema) and index templates. There is lots of free training on the site, plus you are in the right place for help.

Now, to be clear: XML data will need to be parsed to JSON to be more useful in Elasticsearch, but no worries, there are several tools and methods to help with that. XML stored as-is in a JSON field is fine, and I suspect that is where you are, but that is not super useful and probably not what you want.
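
To make "parsed to JSON" concrete, here is a minimal sketch that uses only the Python standard library. It is one possible approach, not the method any Elastic tool uses: the helper names `element_to_dict` and `xml_to_json` and the sample XML are made up for illustration, and the sketch ignores attributes, namespaces and mixed content.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Convert an element into plain dicts, lists and strings."""
    children = list(element)
    if not children:
        # Leaf element: keep its text, trimmed of surrounding whitespace.
        return (element.text or "").strip()
    result = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            # A repeated tag is collected into a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def xml_to_json(xml_text):
    """Parse an XML string and return it as a JSON string."""
    root = ET.fromstring(xml_text)
    return json.dumps({root.tag: element_to_dict(root)})


# Hypothetical input document.
sample = "<order><id>42</id><item>pen</item><item>ink</item></order>"
print(xml_to_json(sample))
# prints {"order": {"id": "42", "item": ["pen", "ink"]}}
```

Every leaf comes out as a string here, so numeric or date fields would still need converting (or a suitable mapping) before they are useful for range queries.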

It sounds like you sort of have your data ingesting, but I would suggest that unless there is a specific reason, you don't need Logstash.

Filebeat -> Elasticsearch should work just fine. While you are debugging, I would take out Logstash unless you absolutely need it.

So now tell us about your data; that is how we are going to be able to help.
As we always say, if you can show us some of the input data and what you are seeing in Kibana (as text, not a screenshot), that is of most help.

So show us the source and the results.

Can you show us a sample of what ends up in the message field?

Is your XML single-line / condensed, or expanded / multi-line / "pretty"?

We typically ask to see your configuration files for Filebeat as well.

If you are saying that you still end up with XML in the message field, perhaps take a look at this.

Let us know how we can help. The more you share, the better we can help.

Hi Stephen,

Thanks for the input. We use version 7.16.2. We have played around a bit and created an index and mapping for what we think could work.

The index looks as follows:

{
  "scanglobaldata" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "frompartner" : {
          "type" : "text"
        },
        "message" : {
          "type" : "text"
        },
        "processname" : {
          "type" : "text"
        },
        "processstate" : {
          "type" : "text"
        },
        "system" : {
          "type" : "text"
        },
        "timestamp" : {
          "type" : "date"
        },
        "todirectory" : {
          "type" : "text"
        },
        "topartner" : {
          "type" : "text"
        }
      }
    },
    "settings" : {
      "index" : {
        "routing" : {
          "allocation" : {
            "include" : {
              "_tier_preference" : "data_content"
            }
          }
        },
        "number_of_shards" : "1",
        "provided_name" : "scanglobaldata",
        "creation_date" : "1661945340471",
        "number_of_replicas" : "1",
        "uuid" : "P53uLjq3QOmILqde8gMH9Q",
        "version" : {
          "created" : "7160299"
        }
      }
    }
  }
}

What we want is to track all data coming in from different integration platforms and be able to filter it (by system, frompartner, topartner and so on). Together with this data we would like to keep a record of what has been transmitted to us. This can be anything from CSV and XML (pretty or not) to JSON. We need to store this in the message field so we can go back and see what was sent to us. How do we do that?

Best regards
Kim

Just add a field like message with type text and store the original content in that.

It will be searchable that way as well.

        "message" : {
          "type" : "text"
        }

Beyond that, I am not sure what you are asking. Just set that field to your original content; that is a very common pattern.
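
A small sketch of that pattern in Python, assuming the field names from the mapping posted above (`system`, `frompartner`, `topartner`, `timestamp`, `message`). The helper name `build_record` and all sample values are hypothetical. The raw payload goes into `message` untouched, whatever its format, and the filterable fields sit next to it.

```python
import json
from datetime import datetime, timezone


def build_record(raw_content, system, frompartner, topartner, timestamp=None):
    """Combine filterable metadata with the unmodified original payload."""
    if timestamp is None:
        timestamp = datetime.now(timezone.utc)
    return {
        "system": system,
        "frompartner": frompartner,
        "topartner": topartner,
        # ISO 8601 string for the "date" field.
        "timestamp": timestamp.isoformat(),
        # Original CSV/XML/JSON text, stored as-is.
        "message": raw_content,
    }


# Hypothetical CSV payload and partner names.
record = build_record(
    "id;item\n42;pen\n",
    system="system-a",
    frompartner="partner-x",
    topartner="partner-y",
    timestamp=datetime(2022, 8, 31, 12, 0, tzinfo=timezone.utc),
)
print(json.dumps(record))
```

Because JSON serialization escapes newlines and quotes, multi-line ("pretty") XML or CSV survives the round trip unchanged and can be read back exactly as it was received.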

BTW, please format your code in the future using the </> button.
