Is it possible to have fields of different types while using multi_field?


(apanimesh061) #1

I am trying to index plain text as well as text from an attachment.
This is the mapping that I use:

PUT /my-index/doc_set/_mapping
{
  "properties": {
    "doc_id": {
      "type": "string"
    },
    "text": {
      "type": "multi_field",
      "fields": {
        "text": {
          "type": "string",
          "store": true,
          "index": "analyzed"
        },
        "pdf": {
          "type": "attachment",
          "fields": {
            "pdf": {
              "type": "string",
              "store": true,
              "index": "analyzed"
            }
          }
        }
      }
    }
  }
}

How can I index text and text.pdf? text should accept plain text, and text.pdf should accept Base64-encoded content. I tried indexing the following documents:

POST /my-index/doc_set/15
{
  "doc_id": "15",
  "text": "simple text for doc"
}

POST /my-index/doc_set/23
{
  "doc_id": "23",
  "text.pdf": "simplest text for doc"
}

The first document should not produce any error, but I get this:

{
   "error": "MapperParsingException[failed to parse]; nested: JsonParseException[Illegal white space character (code 0x20) as character #3 of 4-char base64 unit: can only used between units\n at [Source: [B@66b0738e; line: 3, column: 32]]; ",
   "status": 400
}

The second one is indexed without errors, but a new field text.pdf is created.

How should I go about it?


(Nik Everett) #2

In general it sure is. See the example here: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-core-types.html#token_count

I don't know about your specific error though. Looks like the string type is trying to parse the pdf contents as a string maybe?


(apanimesh061) #3

Actually it should not, as the attachment type accepts a Base64-encoded version of a string. So text.pdf should throw an error, and text should be accepted without any error.
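
For reference, here is a minimal Python sketch of why a plain string is rejected by a strict Base64 parser, and what an encoded value looks like. It only uses the standard library; the helper name is_valid_base64 is made up for illustration, and nothing here says which field of the mapping the value ends up in. In "simple text for doc" the space is the third character of the second 4-character unit ("le t"), which matches the position reported in the error message.

```python
import base64


def is_valid_base64(value: str) -> bool:
    """Return True if value decodes as strict Base64 (whitespace is rejected)."""
    try:
        # validate=True makes the decoder fail on characters outside the
        # Base64 alphabet instead of silently skipping them.
        base64.b64decode(value, validate=True)
        return True
    except ValueError:
        # binascii.Error is a subclass of ValueError.
        return False


plain = "simple text for doc"

# Encode the text so it could be sent to a field that expects Base64 content.
encoded = base64.b64encode(plain.encode("utf-8")).decode("ascii")

print(is_valid_base64(plain))    # the raw sentence contains spaces
print(is_valid_base64(encoded))  # the encoded form uses only Base64 characters
print(encoded)
```

So whichever field is backed by the attachment type would need the encoded value, not the raw sentence.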

I added dynamic: strict to the mapping and noticed that it raised this error:

{
   "error": "StrictDynamicMappingException[mapping set to strict, dynamic introduction of [text.pdf] within [doc_set] is not allowed]",
   "status": 400
}

Am I accessing the pdf and text fields the wrong way?

