I am trying to index plain text as well as text extracted from an attachment.
This is the mapping I use:
PUT /my-index/doc_set/_mapping
{
  "properties": {
    "doc_id": {
      "type": "string"
    },
    "text": {
      "type": "multi_field",
      "fields": {
        "text": {
          "type": "string",
          "store": true,
          "index": "analyzed"
        },
        "pdf": {
          "type": "attachment",
          "fields": {
            "pdf": {
              "type": "string",
              "store": true,
              "index": "analyzed"
            }
          }
        }
      }
    }
  }
}
How can I index both text and text.pdf? text should accept plain text, and text.pdf should accept Base64-encoded content. I tried indexing the following documents:
POST /my-index/doc_set/15
{
  "doc_id": "15",
  "text": "simple text for doc"
}
POST /my-index/doc_set/23
{
  "doc_id": "23",
  "text.pdf": "simplest text for doc"
}
I expected the first document to be indexed without any error, but I get this:
{
  "error": "MapperParsingException[failed to parse]; nested: JsonParseException[Illegal white space character (code 0x20) as character #3 of 4-char base64 unit: can only used between units\n at [Source: [B@66b0738e; line: 3, column: 32]]; ",
  "status": 400
}
The second document is indexed without errors, but a new field named text.pdf is created.
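For reference, my understanding is that the attachment type expects Base64-encoded content, which would explain why a plain string with spaces fails to parse. Below is a minimal Python sketch of how I would encode the payload before sending it. The helper name encode_attachment is my own and not part of any Elasticsearch API, and which field name the encoded value should be sent under is exactly the part I am unsure about.

```python
import base64
import json


def encode_attachment(raw: bytes) -> str:
    """Return the Base64 text form of raw bytes (hypothetical helper)."""
    return base64.b64encode(raw).decode("ascii")


# Assumption: the encoded value is what the attachment sub-field should receive.
encoded = encode_attachment(b"simplest text for doc")

# The key "text.pdf" mirrors my second request above; it may not be the right one.
body = json.dumps({"doc_id": "23", "text.pdf": encoded})
print(body)
```

The encoded value contains no whitespace, so it should at least avoid the "Illegal white space character" error shown above.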
How should I go about it?