Attachment content has been truncated

Hi,

When I index an attachment that is an MS Word document containing double quotes,
the content after the double quotes is truncated after indexing.

For example, this is the attachment:

Test attachment, "test document", after this, it has more words.

After indexing, the content does not include "after this, it has more words."

I'm using Mapper Attachment Plugin V2.3.2

Thank you in advance,

Carlos

By default, there is a limit on the amount of extracted content.
IIRC it's 100000 characters.
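
If the limit turns out to matter, it can usually be raised per document. Here is a minimal Python sketch that only builds the request body and does not send anything. It assumes the plugin version in use accepts `_content` and `_indexed_chars` keys inside the attachment field, that -1 means "no limit", and that the attachment field is called `file` (a hypothetical name; use whatever your mapping defines).

```python
import base64
import json


def build_attachment_body(raw_bytes, indexed_chars=-1, field="file"):
    """Build the JSON body for indexing one attachment.

    Assumptions (not confirmed in this thread): the attachment field is
    named `file`, and the plugin honours a per-document `_indexed_chars`
    key where -1 means "no limit".
    """
    # Base64 output is plain ASCII, so quotes in the file cannot break the JSON.
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    doc = {field: {"_content": encoded, "_indexed_chars": indexed_chars}}
    return json.dumps(doc)


if __name__ == "__main__":
    sample = b'Test attachment, "test document", after this, it has more words.'
    print(build_attachment_body(sample))
```

The resulting string would be the body of the PUT request for the document.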

Try copying the text out of MS Word and pasting it into a plain-text editor such as Notepad. Save the file and try your import again.

I never use MS Word except to do word processing.
Word may not be your issue; however, I have found nothing but trouble using text from MS Word documents, as Word uses RTF and injects a bunch of hidden stuff that causes issues when parsing data.

Just a thought.
MS

My test document has fewer than 100 characters.

Can you provide a full script which reproduces your issue?

I use the example from https://gist.github.com/karmi/5594127.
The only change I made is to test.doc: I added double quotes to it.

For example (before change):
Test
RTF document.

Lorem
ipsum dolor.

(After change):
Test
RTF document.
"test document"
Lorem
ipsum dolor.
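
One way to narrow this down is to check whether the text after the double quotes survives the encoding step before the request is ever sent. Below is a rough Python sketch, assuming the file is base64-encoded and embedded in a JSON body as in the gist. `survives_roundtrip` is a hypothetical helper, and plain text bytes stand in for the real test.doc.

```python
import base64
import json


def survives_roundtrip(raw_bytes):
    """Return True if the bytes come back unchanged after base64 + JSON."""
    encoded = base64.b64encode(raw_bytes).decode("ascii")
    # This string stands in for the request body that would be sent.
    body = json.dumps({"file": encoded})
    decoded = base64.b64decode(json.loads(body)["file"])
    return decoded == raw_bytes


if __name__ == "__main__":
    # Stand-in for the modified test.doc (assumption: real file is binary).
    text = b'Test\nRTF document.\n"test document"\nLorem\nipsum dolor.'
    print(survives_roundtrip(text))
```

If this holds for the real file's bytes, the quotes are not being lost in encoding. The truncation would then more likely come from how the request is assembled (shell quoting, for instance) or from text extraction.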