A Spark-Based Open Source Framework for Large-Scale Parallel Processing of Rich Text Documents | IEEE Conference Publication | IEEE Xplore