The right approach depends on several things, including whether the content carries any markup (e.g. <h1>) that indicates the structural importance of different sections of the text. Without markup, you are probably looking at more statistical methods based on the words used in the text. One option might be to use a MoreLikeThisQuery to match a document against itself and then run a Highlighter over the content; the highlighted fragments would give you a form of summary.
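To make the "statistical methods" idea a bit more concrete, here is a rough sketch in plain Java that does not use Lucene at all. It scores each sentence by the average document-wide frequency of its words and keeps the top few. The class name FrequencySummarizer, the tiny stop-word list and the naive sentence splitter are all assumptions made for illustration, not anything Lucene provides.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper (not a Lucene class): extractive summary by word frequency.
class FrequencySummarizer {

    // Tiny illustrative stop-word list; a real one would be much larger.
    private static final Set<String> STOP_WORDS = Set.of(
            "a", "an", "and", "are", "as", "in", "is", "it", "of", "on", "the", "to");

    // Lower-case the text, split on anything that is not a letter or digit,
    // and drop empty tokens and stop words.
    static List<String> tokenize(String text) {
        List<String> words = new ArrayList<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!w.isEmpty() && !STOP_WORDS.contains(w)) {
                words.add(w);
            }
        }
        return words;
    }

    // Return up to maxSentences sentences, in their original order.
    static List<String> summarize(String text, int maxSentences) {
        List<String> summary = new ArrayList<>();
        if (text == null || text.trim().isEmpty() || maxSentences <= 0) {
            return summary;
        }

        // Naive sentence split: whitespace that follows '.', '!' or '?'.
        String[] sentences = text.trim().split("(?<=[.!?])\\s+");

        // Count how often each word occurs in the whole document.
        Map<String, Integer> freq = new HashMap<>();
        for (String w : tokenize(text)) {
            freq.merge(w, 1, Integer::sum);
        }

        // Score a sentence by the mean frequency of its words, so that
        // long sentences are not favoured just for being long.
        double[] scores = new double[sentences.length];
        List<Integer> order = new ArrayList<>();
        for (int s = 0; s < sentences.length; s++) {
            List<String> words = tokenize(sentences[s]);
            double total = 0;
            for (String w : words) {
                total += freq.getOrDefault(w, 0);
            }
            scores[s] = words.isEmpty() ? 0 : total / words.size();
            order.add(s);
        }

        // Highest score first; List.sort is stable, so ties keep document order.
        order.sort(Comparator.comparingDouble((Integer idx) -> -scores[idx]));
        List<Integer> picked =
                new ArrayList<>(order.subList(0, Math.min(maxSentences, order.size())));
        picked.sort(Comparator.naturalOrder());

        for (int idx : picked) {
            summary.add(sentences[idx]);
        }
        return summary;
    }
}
```

Calling FrequencySummarizer.summarize(text, 2) would return the two highest-scoring sentences in document order. If you do have markup, one possible refinement is to give extra weight to words that also appear in headings before scoring.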