The flow seems entirely reasonable. You may run into trouble with exact vs. partial matches to the search request, though.
E.g. if someone searches for "foo bar baz" in the description field, that will match any document containing "foo", "bar", or "baz", since you are using a match query and a match query ORs the analyzed terms by default. You may have multiple matches to deal with.
Instead, you may want to use a term query, since that performs an exact match. The query text is not analyzed, so it works best against a keyword (not analyzed) field. It also means that case, spacing, and special characters all matter. You could also try phrase searches (match_phrase), but those have edge cases too.
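To make the match vs. term difference concrete, here is a toy sketch in plain Python. It does not talk to Elasticsearch at all: `analyze` is only a rough stand-in for what the standard analyzer does (lowercase, split into word tokens), and the helper names `match_hits` and `term_hits` are made up for illustration.

```python
import re


def analyze(text):
    """Rough approximation of an analyzer: lowercase, then split into word tokens."""
    return re.findall(r"\w+", text.lower())


def match_hits(query, doc_text):
    """match-style behaviour: a hit if ANY analyzed query token occurs in the doc (OR)."""
    return bool(set(analyze(query)) & set(analyze(doc_text)))


def term_hits(query, stored_value):
    """term-style behaviour on an un-analyzed field: the raw strings must be identical."""
    return query == stored_value


if __name__ == "__main__":
    print(match_hits("foo bar baz", "something about baz"))  # partial overlap counts
    print(term_hits("foo bar baz", "Foo bar baz"))           # case difference breaks it
```

The real analyzers handle far more than this (Unicode, stop words, stemming, depending on configuration), so treat it only as a mental model for why one query type over-matches and the other under-matches.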
Basically, search is a bit more nuanced than just a database table lookup, since you need to deal with partial matches.
Perhaps show the results to the user and ask whether one of the matching docs is what they were about to enter? If not, they can add it to the index. That saves you the trouble of trying to determine how well partial matches match the input.
Thank you so much for your answer! I am still at the beginning of the whole process, so I will take note of your hints and work with them. The problem right now is that the search part doesn't notice if a document already exists.
E.g., I used the word "test" for title and description, plus type1/thema1, and added the document to the index. I got the message that the document had been added, along with its ID. When I used exactly the same input and pressed add again, I got the same response.
I made a few changes to the code as I am trying to figure it out.
EDIT: I managed to make it work! I will try to improve the search query as you suggested. Any further suggestions are much appreciated.
No particular advice, just play around with analyzers and get a feel for how they tokenize/transform the text. You may want to do a combination of exact match (term queries), phrase matching (match_phrase or phrase queries) and partial matching (match) to suit your needs.
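As a rough sketch of what that combination could look like, the function below builds a bool query body as a plain Python dict, with the three query types as should clauses. The field names (`description`, and a `description.keyword` sub-field for the exact match) and the boost values are assumptions, not something taken from your mapping, so adjust them to whatever your index actually has.

```python
def build_duplicate_query(text, field="description", keyword_field="description.keyword"):
    """Build a query body that scores exact > phrase > partial matches.

    `field` and `keyword_field` are hypothetical names; the boosts are arbitrary.
    """
    return {
        "query": {
            "bool": {
                "should": [
                    # Exact match against the un-analyzed keyword sub-field.
                    {"term": {keyword_field: {"value": text, "boost": 3.0}}},
                    # All terms present, in the same order.
                    {"match_phrase": {field: {"query": text, "boost": 2.0}}},
                    # Any analyzed term present (partial match).
                    {"match": {field: {"query": text}}},
                ],
                "minimum_should_match": 1,
            }
        }
    }


if __name__ == "__main__":
    import json
    print(json.dumps(build_duplicate_query("foo bar baz"), indent=2))
```

The resulting dict can be sent as the body of a `_search` request. Exact duplicates should then float to the top of the results, with looser matches below them for the user to review.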
You could also structure your document IDs so that they are deterministic, and use simple GETs instead of searches to fetch the documents by ID. May or may not be a possibility for your system.
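One way to get deterministic IDs is to hash the fields that define "the same document". This is just a minimal sketch: the field names (`title`, `description`, `type`, `thema`) are guessed from the example earlier in the thread, and the normalization (lowercase, collapsed whitespace) is an assumption about what should count as a duplicate.

```python
import hashlib
import json


def deterministic_id(doc, fields=("title", "description", "type", "thema")):
    """Derive a stable document ID from the chosen fields.

    Values are lowercased and whitespace-collapsed first, so trivially
    different inputs map to the same ID. `fields` is a hypothetical default.
    """
    normalized = [" ".join(str(doc.get(name, "")).lower().split()) for name in fields]
    # JSON-encode the list so field boundaries stay unambiguous before hashing.
    payload = json.dumps(normalized, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    doc = {"title": "test", "description": "test", "type": "type1", "thema": "thema1"}
    print(deterministic_id(doc))
```

With an ID like this, a GET by ID tells you whether the document already exists, and indexing through the `_create` endpoint should be rejected with a version conflict if it does, so no search is needed for the exact-duplicate case.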