Error on my json


(ridwan syarifudin) #1

This is the request I am using to simulate the pipeline:
</>
POST _ingest/pipeline/_simulate
{
"pipeline": {
"description": "parsing master produk indexing",
"processors": [
{
"grok": {
"field": "message",
"patterns": [
"""%{NUMBER:id_product};"%{DATA:code_product}";"%{DATA:name_product}";"%{DATA:satuan_product}";"%{DATA:merek_vehicle}";"%{DATA:jenis_vehicle}";"%{DATA:merek_product}";"%{DATA:part_number}";%{NUMBER:weight};"%{DATA:unit_weight}""""
]
}
},
{
"remove": {
"field": "message"
}
}
]
},

"docs": [
{
"_index": "master_product_info",
"_type": "message",
"_id": "AVvJZVQEBr2flFKzrrkr",
"_score": 1,
"_source": {
"message": """42305;"FX4PER000501I";"PER DPN F-50 DH-0005-01 48110-87624-01 MITS";"PCS";"DAIHATSU";"";"INDOSPRING";"";;"""""
}
}
]
}
</>

Somehow I get an error like this:
</>
{
"error": {
"root_cause": [
{
"type": "parse_exception",
"reason": "Failed to parse content to map"
}
],
"type": "parse_exception",
"reason": "Failed to parse content to map",
"caused_by": {
"type": "json_parse_exception",
"reason": "Unexpected character ('"' (code 34)): was expecting comma to separate Array entries\n at [Source: org.elasticsearch.transport.netty4.ByteBufStreamInput@421e7b4e; line: 9, column: 254]"
}
},
"status": 400
}
</>

How do I fix this? Please help.


(Gabriel Tessier) #2

Hi, I tried this on version 6.5 (I don't know which version you are running), and the request below does not return an error.

POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "parsing master produk indexing",
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": [
            """%{NUMBER:id_product};"%{DATA:code_product}";"%{DATA:name_product}";"%{DATA:satuan_product}";"%{DATA:merek_vehicle}";"%{DATA:jenis_vehicle}";"%{DATA:merek_product}";"%{DATA:part_number}";%{NUMBER:weight};"%{DATA:unit_weight}"
            """
          ]
        }
      },
      {
        "remove": {
          "field": "message"
        }
      }
    ]
  },
  "docs": [
    {
      "_index": "master_product_info",
      "_type": "message",
      "_id": "AVvJZVQEBr2flFKzrrkr",
      "_score": 1,
      "_source": {
        "message": """   42305;"FX4PER000501I";"PER DPN F-50 DH-0005-01 48110-87624-01 MITS";"PCS";"DAIHATSU";"";"INDOSPRING";"";0;""
        """
      }
    }
  ]
}

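I added a line break before the closing """ in both the pattern and the message; otherwise Kibana Dev Tools returns an error. It looks like you can't have more than three double quotes in a row on the same line without Kibana raising a syntax error, and both of your strings end with a " immediately followed by the closing """.
I also added 0 for the weight, because %{NUMBER:weight} needs at least one digit and your sample had nothing between the two semicolons.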
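
Another way around the quoting problem is to skip triple quotes and send ordinary JSON, where each embedded double quote is escaped as \". The sketch below is only an illustration and not part of the original answer: it uses Python's standard json module to produce the escaped body, the helper name build_simulate_body is made up for this example, and the doc metadata is trimmed to _index, _id and _source for brevity.

```python
import json

# Grok pattern from the thread. Python single quotes are used on the outside,
# so the embedded double quotes need no escaping at this level.
PATTERN = (
    '%{NUMBER:id_product};"%{DATA:code_product}";"%{DATA:name_product}";'
    '"%{DATA:satuan_product}";"%{DATA:merek_vehicle}";"%{DATA:jenis_vehicle}";'
    '"%{DATA:merek_product}";"%{DATA:part_number}";%{NUMBER:weight};'
    '"%{DATA:unit_weight}"'
)

# Sample line from the thread, with 0 filled in for the weight.
MESSAGE = (
    '42305;"FX4PER000501I";"PER DPN F-50 DH-0005-01 48110-87624-01 MITS";'
    '"PCS";"DAIHATSU";"";"INDOSPRING";"";0;""'
)


def build_simulate_body(pattern: str, message: str) -> str:
    """Return the _simulate request body as plain JSON text (hypothetical helper)."""
    body = {
        "pipeline": {
            "description": "parsing master produk indexing",
            "processors": [
                {"grok": {"field": "message", "patterns": [pattern]}},
                {"remove": {"field": "message"}},
            ],
        },
        "docs": [
            {
                "_index": "master_product_info",
                "_id": "AVvJZVQEBr2flFKzrrkr",
                "_source": {"message": message},
            }
        ],
    }
    # json.dumps escapes every embedded double quote as \" automatically.
    return json.dumps(body, indent=2)


body_json = build_simulate_body(PATTERN, MESSAGE)
print(body_json)
```

The printed text contains no triple quotes, so it should be usable as the body of POST _ingest/pipeline/_simulate from Dev Tools or from any HTTP client.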

Can you check with your version?

EDIT: I forgot to include the returned result:

{
  "docs" : [
    {
      "doc" : {
        "_index" : "master_product_info",
        "_type" : "message",
        "_id" : "AVvJZVQEBr2flFKzrrkr",
        "_source" : {
          "name_product" : "PER DPN F-50 DH-0005-01 48110-87624-01 MITS",
          "jenis_vehicle" : "",
          "satuan_product" : "PCS",
          "weight" : "0",
          "id_product" : "42305",
          "merek_vehicle" : "DAIHATSU",
          "code_product" : "FX4PER000501I",
          "merek_product" : "INDOSPRING",
          "part_number" : "",
          "unit_weight" : ""
        },
        "_ingest" : {
          "timestamp" : "2019-04-15T03:20:58.893Z"
        }
      }
    }
  ]
}

(ridwan syarifudin) #3

Thank you so much for your help, sir. Let me try your answer.


(ridwan syarifudin) #4

Sir, can I ask you one more question?

PUT _template/product_template
{
  "index_patterns": ["master_product*"],
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "message": {
      "properties": {
        "id_product": {
          "type": "keyword"
        },
        "code_product": {
          "type": "keyword"
        },
        "satuan_product": {
          "type": "text"
        },
        "merek_vehicle": {
          "type": "text"
        },
        "jenis_vehicle": {
          "type": "text"
        },
        "merek_product": {
          "type": "text"
        },
        "part_number": {
          "type": "text"
        },
        "weight": {
          "type": "float"
        },
        "unit_weight": {
          "type": "text"
        }
      }
    }
  }
}

Do you think the mapping in this index template is correct?


(Gabriel Tessier) #5

According to your example document, "id_product" is an integer, so it is better to use the integer type (or another numeric type such as long, depending on the size of the values you expect) instead of keyword.

More about numeric datatypes in the docs:
https://www.elastic.co/guide/en/elasticsearch/reference/current/number.html
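
As a rough illustration of "depends on the size you expect", the sketch below picks the smallest Elasticsearch whole-number type whose signed range covers a given range of values. The helper name smallest_integer_type is made up for this example, and the range 0 to 42305 is only taken from the single sample document in this thread, so the real range of id_product is an assumption.

```python
# Bit widths of Elasticsearch's signed whole-number field types.
INTEGER_TYPES = [
    ("byte", 8),
    ("short", 16),
    ("integer", 32),
    ("long", 64),
]


def smallest_integer_type(min_value: int, max_value: int) -> str:
    """Return the narrowest type whose signed range holds both bounds."""
    for name, bits in INTEGER_TYPES:
        low = -(2 ** (bits - 1))
        high = 2 ** (bits - 1) - 1
        if low <= min_value and max_value <= high:
            return name
    raise ValueError("value range does not fit in a 64-bit long")


# Assumed range: ids start at 0 and the sample id 42305 is the largest seen.
id_mapping = {"id_product": {"type": smallest_integer_type(0, 42305)}}
print(id_mapping)
```

In practice it is usually safer to leave headroom for growth than to pick the tightest type that fits today's data.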

Also, for the text fields I think it's better to use multi-fields:

"merek_vehicle": {
  "type": "text",
  "fields": {
    "raw": {
      "type": "keyword",
      "ignore_above": 256
    }
  }
}....

This lets you use the field in aggregations (through merek_vehicle.raw), which a plain text field does not support by default.

More about multi-fields in the doc: https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html
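
To make the multi-field suggestion concrete, here is a small sketch that applies the same text plus raw keyword layout to every text field from the template above and builds a terms aggregation on merek_vehicle.raw. The helper name text_with_raw and the aggregation name by_vehicle_brand are invented for this example; the field names come from the template in this thread.

```python
import json


def text_with_raw(ignore_above: int = 256) -> dict:
    """Mapping for a text field with a keyword sub-field called "raw"."""
    return {
        "type": "text",
        "fields": {
            "raw": {"type": "keyword", "ignore_above": ignore_above},
        },
    }


# Fields mapped as "text" in the template posted above.
TEXT_FIELDS = [
    "satuan_product",
    "merek_vehicle",
    "jenis_vehicle",
    "merek_product",
    "part_number",
    "unit_weight",
]

properties = {name: text_with_raw() for name in TEXT_FIELDS}

# Search body that counts documents per vehicle brand using the keyword
# sub-field; "size": 0 skips the hits and returns only the buckets.
terms_agg = {
    "size": 0,
    "aggs": {
        "by_vehicle_brand": {"terms": {"field": "merek_vehicle.raw"}},
    },
}

print(json.dumps(properties, indent=2))
print(json.dumps(terms_agg, indent=2))
```

The first printed object could replace the text entries under "properties" in the template, and the second could be sent as the body of a search against master_product* once documents are indexed.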

Hope it helps.


(ridwan syarifudin) #6

Thank you so much, sir.