I am just starting to use Elasticsearch. We have some documents in EDI
format and XML format that we need to index. For example, the EDI format
looks something like this:
0120TRANA 770034661 PREPARER'S
AGENTE20080522010080014302AV901005 TZ #
0120TRANB 7700346616220 GREENWICH DR SAN DIEGO CA
92122 8585258010 #
0120ACK 5618383330100800143020001000000000000C0004
200805220090100500838801 1 NJ#
0120ACKR 561838333 01FORM 1040 00001000000100100504
#
0120****RECAP
000000000001010080014302000000000000000001000000000000000000000001000000
#
and everyone knows XML.
Is there a best practice for how these should be indexed in terms of
the JSON format, index fields, or settings? Or should I just put the
whole document into a single field named, say, "document"?
Does this plugin index the text in the document as separate terms, or
does it just store the file as an attachment? For example, if I wanted to
get all XML documents that contain the value "xyz", would that be possible?
On Tue, Jan 10, 2012 at 10:37 PM, David Pilato david@pilato.fr wrote:
For the same use case, I use the mapper-attachment plugin.
I send the XML file as an attachment.
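To illustrate what "send the XML file as an attachment" could look like, here is a minimal sketch, not David's actual code. As far as I know, the mapper-attachment plugin expects the file content base64-encoded in a field mapped with type "attachment", and it extracts the text (via Apache Tika) so it can be searched. The type name "doc", the field name "file", and the sample XML are made up for illustration.

```python
import base64
import json

# Hypothetical mapping: a type "doc" with a field "file" of type "attachment".
# The names "doc" and "file" are assumptions, not taken from the thread.
mapping = {"doc": {"properties": {"file": {"type": "attachment"}}}}


def build_attachment_doc(xml_text, field="file"):
    """Return the JSON body for a document whose field holds base64 content."""
    encoded = base64.b64encode(xml_text.encode("utf-8")).decode("ascii")
    return json.dumps({field: encoded})


# Made-up sample document containing the value "xyz".
sample_xml = "<order><id>xyz</id></order>"
body = build_attachment_doc(sample_xml)

print(json.dumps(mapping))  # body for the put-mapping request
print(body)                 # body for the index request
```

The two printed JSON bodies would then be sent to the put-mapping and index endpoints of your cluster. A normal query on the "file" field for "xyz" should then match, assuming the plugin is installed and extracts the element text as expected.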