I need to filter documents against a dictionary, so that a document is not added to the index if it contains a word from that dictionary (for example, to keep out prohibited documents, obscene language, etc.).
How can I do that?
If this is not possible, maybe you would consider this idea?
I'd consider this logic part of the application, although you might be able to use the percolator for that: register queries that match your forbidden words, index the document, and check with the percolator whether any query matches. Whether this solution is really feasible depends a bit on the number of forbidden words, though.
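To illustrate the "part of the application" option, here is a minimal sketch in plain Python that drops documents before they are ever sent to the index. The word list, the `body` field name, and the helper names `contains_forbidden` and `filter_documents` are all hypothetical, and the tokenization is a simple word split rather than whatever analyzer your index uses.

```python
import re

# Hypothetical dictionary of forbidden words; entries are assumed lowercase.
FORBIDDEN_WORDS = {"forbidden", "banned"}

# Simple word tokenizer (an assumption; your index analyzer may split differently).
_TOKEN = re.compile(r"\w+")


def contains_forbidden(text, forbidden=FORBIDDEN_WORDS):
    """Return True if any whole word in text is in the forbidden set."""
    tokens = {token.lower() for token in _TOKEN.findall(text)}
    return not tokens.isdisjoint(forbidden)


def filter_documents(docs, field="body", forbidden=FORBIDDEN_WORDS):
    """Yield only the documents whose field contains no forbidden word."""
    for doc in docs:
        if not contains_forbidden(doc.get(field, ""), forbidden):
            yield doc


if __name__ == "__main__":
    docs = [
        {"id": 1, "body": "A perfectly ordinary document."},
        {"id": 2, "body": "This one has BANNED content."},
    ]
    # Only the documents that pass the filter would be handed to the indexer.
    for doc in filter_documents(docs):
        print(doc["id"])
```

A set lookup like this stays cheap even for a large dictionary, but it only catches exact whole-word matches; stemming, misspellings, or phrases would need extra handling.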