Yes, a gsub operation like that is the right approach. Finding regular expressions for email addresses and HTTP URLs shouldn't be more than a web search away.
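For illustration, here is a minimal sketch of that kind of substitution. It assumes Python, where re.sub plays the role of gsub. The two patterns, the scrub function name, and the <EMAIL> and <URL> placeholder tokens are all made up for this example and are not taken from the original question. The patterns are deliberately simple and will miss unusual addresses and URLs, so treat them as a starting point rather than a complete solution.

```python
import re

# Deliberately simple, illustrative patterns (assumptions for this sketch);
# real-world address and URL grammars are more permissive than these.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL_RE = re.compile(r"https?://[^\s<>\"']+")


def scrub(text, email_token="<EMAIL>", url_token="<URL>"):
    """Replace every HTTP(S) URL and email address in text with a placeholder."""
    # Substitute URLs first so that a user@host part inside a URL is
    # consumed as part of the URL rather than matched as an email address.
    text = URL_RE.sub(url_token, text)
    return EMAIL_RE.sub(email_token, text)


print(scrub("Mail bob@example.com or see http://example.com/docs for details."))
# prints: Mail <EMAIL> or see <URL> for details.
```

Note that the URL pattern stops only at whitespace, quotes, or angle brackets, so punctuation directly after a URL (such as a sentence-ending period) would be swallowed into the match. Tighten the pattern if that matters for your data.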