I need help creating regex rules for the crawler, because my regex rules are not working as expected.
I don't want to crawl anything in the /wp-content/uploads folders (images, media, etc.).
I tried to set regex to: DISALLOW - REGEX - .*/wp-content/uploads.*
Also tried regex to ignore all Image extensions I have: /\.(gif|jpe?g|tiff?|png|webp|bmp)$/i
Neither worked; everything is still being processed. How can I achieve this?
Take a look at the crawl rules documentation. Specifically, these regexes must follow Ruby regex syntax. When I added your rules to https://rubular.com/, it immediately identified some syntax issues.
For your image regex specifically, also note this block from the docs:
The rule matches when the path pattern matches the beginning of the path (which always begins with /).
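To see why that matters for the image rule, here is a minimal Ruby sketch. It assumes the crawler behaves as if the pattern were anchored at the start of the path, which is my reading of the docs quote above, not the crawler's actual code. The helper `matches_from_start?` and the sample paths are hypothetical. Note also that the `/.../i` wrapper from your image rule is Ruby literal syntax rather than part of the pattern itself, so the sketch uses the bare pattern.

```ruby
# Hypothetical helper: emulates a rule that has to match starting at the
# first character of the path. This is an assumption based on the docs quote.
def matches_from_start?(pattern, path)
  Regexp.new("\\A(?:#{pattern})").match?(path)
end

# Image rule without the /.../i wrapper. It can only match if the path
# itself starts with ".gif", ".jpg", and so on, which never happens.
IMAGE_UNANCHORED = '\.(gif|jpe?g|tiff?|png|webp|bmp)$'

# Same rule with a leading .* so it can start matching at the leading "/".
IMAGE_ANCHORED = '.*\.(gif|jpe?g|tiff?|png|webp|bmp)$'

# Uploads rule written against the start of the path.
UPLOADS = '/wp-content/uploads/.*'

IMAGE_PATH = '/wp-content/uploads/2021/05/photo.jpg' # made-up sample path
PAGE_PATH  = '/blog/hello-world'                     # made-up sample path

puts matches_from_start?(IMAGE_UNANCHORED, IMAGE_PATH) # => false
puts matches_from_start?(IMAGE_ANCHORED, IMAGE_PATH)   # => true
puts matches_from_start?(UPLOADS, IMAGE_PATH)          # => true
puts matches_from_start?(UPLOADS, PAGE_PATH)           # => false
```

If the anchoring assumption holds, the image rule needs the leading `.*`, and the uploads rule can simply begin with `/wp-content/uploads/`. It is still worth pasting the bare patterns into Rubular to confirm they parse.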