Hello,
I'm hoping someone can help a relative newbie with the best way to design a search index for a project I'm working on. I have fewer than 1,000 PDF documents arranged in a parent-child hierarchy. Specifically:
Organization -> Document -> Section of Document
Here, 'Section of Document' is an individual PDF. For example, an organization could have 10 "documents", each consisting of 50 sections, with each section mapping to one PDF that is indexed.
My search queries need to return full-text matches (with various filters, e.g. to return only matches from specific organizations), grouped by Organization and then by Document. That is, if I were rendering the results, they would look like this:
Organization #1
    Document #1
        Matching Section #1
        Matching Section #2
    Document #2
        ...
Organization #2
    Document #1
        Matching Section #1
        ...
So it's important that all the documents for a given organization are displayed together as a single group, and likewise that all the matching sections of a given document are displayed together under that document.
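To make the grouping concrete, here is a small Python sketch of the result shape I'm after, done client-side over a flat list of section hits. The field names (organization, document, section) and the sample hits are made up for illustration, and this is not meant as the way Elasticsearch should do it, only as a description of the desired output.

```python
from collections import OrderedDict


def group_hits(hits):
    """Group flat section hits into organization -> document -> [sections]."""
    grouped = OrderedDict()
    for hit in hits:
        # One bucket per organization, created on first sight.
        docs = grouped.setdefault(hit["organization"], OrderedDict())
        # One bucket per document inside that organization.
        docs.setdefault(hit["document"], []).append(hit["section"])
    return grouped


def render(grouped):
    """Render the grouped hits as an indented plain-text outline."""
    lines = []
    for org, docs in grouped.items():
        lines.append(org)
        for doc, sections in docs.items():
            lines.append("    " + doc)
            for section in sections:
                lines.append("        " + section)
    return "\n".join(lines)


# Hypothetical hits, deliberately interleaved across organizations.
hits = [
    {"organization": "Organization #1", "document": "Document #1", "section": "Matching Section #1"},
    {"organization": "Organization #2", "document": "Document #1", "section": "Matching Section #1"},
    {"organization": "Organization #1", "document": "Document #1", "section": "Matching Section #2"},
    {"organization": "Organization #1", "document": "Document #2", "section": "Matching Section #1"},
]

print(render(group_hits(hits)))
```

Even when the hits arrive interleaved, every section ends up under its own document, and every document under its own organization.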
I'm getting a little lost in the documentation, between field collapsing, entity-centric indexing, and aggregations. I'd really appreciate it if someone could explain the best approach here, ideally with an example of the index structure and query structure. This is for ES 5.
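For reference, one candidate I've pieced together from the aggregations docs is a terms aggregation per organization, a nested terms aggregation per document, and a top_hits sub-aggregation for the matching sections. Below is a rough Python sketch that only builds the request body. It assumes one indexed item per section PDF with hypothetical fields organization and document (mapped as keyword), section, and content (the extracted PDF text). The helper name build_grouped_search and the bucket sizes are my own placeholders, and I don't know whether this is the right approach.

```python
import json


def build_grouped_search(text, organizations=None, max_sections=10):
    """Build a search body that groups matches by organization, then document.

    All field names here are assumptions, not an existing mapping.
    """
    filters = []
    if organizations:
        # Restrict matches to the given organizations (exact keyword values).
        filters.append({"terms": {"organization": organizations}})

    return {
        # No flat hit list; the grouped results come from the aggregations.
        "size": 0,
        "query": {
            "bool": {
                "must": [{"match": {"content": text}}],
                "filter": filters,
            }
        },
        "aggs": {
            "by_organization": {
                "terms": {"field": "organization", "size": 100},
                "aggs": {
                    "by_document": {
                        "terms": {"field": "document", "size": 100},
                        "aggs": {
                            # Best-scoring sections within each document bucket.
                            "matching_sections": {
                                "top_hits": {
                                    "size": max_sections,
                                    "_source": {"includes": ["section"]},
                                }
                            }
                        },
                    }
                },
            }
        },
    }


body = build_grouped_search("annual budget", organizations=["Organization #1"])
print(json.dumps(body, indent=2))
```

The printed JSON is what I would send to the _search endpoint. By default terms buckets are ordered by document count rather than relevance, which may or may not matter for my case.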
Regards,
John