Manual, full-text literature curation underlies much of the annotation found in biological databases, but finding relevant papers and the specific evidence that supports an annotation can be very time-consuming. To reduce the cost and tedium of manual curation, we have re-implemented the Textpresso text mining system from scratch to allow direct annotation of full text and integration of that annotation into diverse curation systems. The new system, Textpresso Central, preserves the key features of the original by supporting sophisticated queries that combine keywords and semantically related categories, and it enhances the user experience by presenting search results in the context of the full text. The resulting full-text annotations can readily be integrated into any user-defined curation platform and can also be used to develop training sets that further improve text mining performance. The current implementation of Textpresso Central includes articles from the PMC Open Archive and uses the Unstructured Information Management Architecture (UIMA) framework as well as Lucene indexing. To maximize the utility of the system, Textpresso Central also supports processing and displaying results from third-party text mining and natural language processing algorithms.
© 2000-2018 Textpresso, California Institute of Technology. Site build date: May 24 2018, 03:28:02. Textpressocentral v1.0.0 (White Snowberry)