text
The triangle is blue.
The square is red.
There are two red circles.
The triangle is green.
I want to partition them by the color mentioned in the text field; that is, I want to add a color column, like so:
text                          color
The triangle is blue.         blue
The square is red.            red
There are two red circles.    red
The triangle is green.        green
The color is determined by running a regular expression. Assume for simplicity's sake that exactly one color is mentioned in every text field. I want to perform a transform with color as the pivot field.
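To make the intended logic concrete, here is a minimal Python sketch of the two steps: derive the color with a regular expression, then group by it. This is only an illustration of the desired result, not Elasticsearch transform syntax. The color list (red, green, blue) and the helper names `extract_color` and `pivot_by_color` are assumptions made up for the example.

```python
import re
from collections import defaultdict

# Assumed color vocabulary; the post only shows red, green, and blue.
COLOR_RE = re.compile(r"\b(red|green|blue)\b", re.IGNORECASE)


def extract_color(text):
    """Return the first color mentioned in text (lowercased), or None."""
    match = COLOR_RE.search(text)
    return match.group(1).lower() if match else None


def pivot_by_color(texts):
    """Group the texts under the color each one mentions."""
    groups = defaultdict(list)
    for text in texts:
        groups[extract_color(text)].append(text)
    return dict(groups)


texts = [
    "The triangle is blue.",
    "The square is red.",
    "There are two red circles.",
    "The triangle is green.",
]

# Derived "color" column, one value per text.
colors = [extract_color(t) for t in texts]

# Texts partitioned by the derived color, i.e. color used as the pivot key.
groups = pivot_by_color(texts)
```

Here `colors` holds the new column and `groups` maps each color to the texts that mention it, which is the partition I am after.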