Is storing documents in Elasticsearch a recommended approach?
What if we extract keywords from each document and index those keywords instead of indexing the document itself? The assumption is that the file path is stored in the index along with the keywords, and the physical file does not live in Elasticsearch.
Yes. If you don't need to index the whole document, and the keywords you have are the only text you want to be able to search, then that's probably the right approach.
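To make the idea concrete, here is a minimal sketch of what one such index entry could look like. The field names `file_path` and `keywords`, the helper `build_keyword_document`, and the example path are all made up for illustration; nothing here calls Elasticsearch, it only builds the JSON body you would send to it.

```python
import json


def build_keyword_document(file_path, keywords):
    """Build a JSON-serializable body for one file: its keywords plus its location.

    Field names are hypothetical; use whatever your mapping defines.
    """
    seen = set()
    cleaned = []
    for keyword in keywords:
        # Normalize so "Revenue" and "revenue " count as the same keyword.
        keyword = keyword.strip().lower()
        if keyword and keyword not in seen:
            seen.add(keyword)
            cleaned.append(keyword)
    # The physical file stays outside Elasticsearch; only its path is stored.
    return {"file_path": file_path, "keywords": cleaned}


# Hypothetical example file and keywords.
doc = build_keyword_document(
    "/repository/reports/q1.pdf", ["Revenue", "forecast ", "revenue"]
)
body = json.dumps(doc)
```

The resulting `body` is what you would pass as the document source when indexing, so a search hit gives you back the path to fetch the real file from.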
Here are some more details on my requirement. Can you please suggest which option is better?
I need to index the whole document.
The files I need to store may be gigabytes in size.
The allowed file types will be pdf, docx, Excel, csv, ppt, pptx, images (png, jpeg, tiff, svg, etc.), audio files (with search on the audio content), video files, and so on.
Based on the above requirement, which option would you suggest I go with?
I am planning to add an FTP server to hold the files, and to save each file's repository location in Elasticsearch along with the crawler data (the indexed keywords).
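A rough sketch of how the lookup side of that plan might work, assuming the crawler output is indexed in a field called `content` and the FTP location in a field called `repository_url` (both names, the helpers, and the sample response are hypothetical). It only builds a standard `match` query body and reads the locations back out of a search response, without contacting a cluster.

```python
def build_search_body(text, size=10):
    """Build a search request body that matches on the crawler-extracted text
    and returns only the stored file location for each hit."""
    return {
        "size": size,
        "_source": ["repository_url"],  # hypothetical field holding the FTP location
        "query": {"match": {"content": text}},  # hypothetical field holding crawler text
    }


def repository_urls(response):
    """Collect the file locations from a search response shaped like
    {"hits": {"hits": [{"_source": {...}}, ...]}}."""
    hits = response.get("hits", {}).get("hits", [])
    return [hit["_source"]["repository_url"] for hit in hits]


search_body = build_search_body("quarterly revenue", size=5)

# Hand-written sample response, only to show the shape being read.
sample_response = {
    "hits": {
        "hits": [
            {"_source": {"repository_url": "ftp://files.example.com/reports/q1.pdf"}},
            {"_source": {"repository_url": "ftp://files.example.com/reports/q2.pdf"}},
        ]
    }
}
urls = repository_urls(sample_response)
```

With this split, Elasticsearch only holds the searchable text and a pointer, and the application downloads the matching files from the FTP server after the search.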