FP, false positives, i.e. the number of gene mentions that are incorrectly identified, including gene mentions with an incorrect database link and non-gene mentions. FN, false negatives, i.e. the number of missed genes. Further information about the IAT task is available at tasks biocreative iii iat

Systems description

Team 65 ODIN URL, odin

The ODIN system is being developed within the scope of the OntoGene project, as a collaboration between the OntoGene group at the University of Zurich and the NITAS/TMS group of Novartis Pharma AG. The purpose of the system is to allow a human annotator/curator to leverage the results of a text mining system in order to enhance the speed and effectiveness of the annotation process.
Methods, The OntoGene system takes as input a document in plain text or in a supported XML-based format and processes it with a custom NLP pipeline, which includes named entity recognition and relation extraction. Currently supported entities include proteins, genes, experimental methods, cell lines, and species. Entities detected in the input document are disambiguated with respect to a reference database. Since ODIN was primarily intended as a document inspector for annotation purposes, its retrieval function is only an experimental addition and does not rank the results.

Interface, The annotated documents are handed back to the ODIN interface, which offers multiple display modalities as well as various selection and modification options. The curator can view the whole document with in-line annotations highlighted, or can browse the extracted entities and be pointed back to the mentions within the document.
All entity annotations are editable. Different entity views are supported, with sorting capabilities according to different criteria. Selective display of text units containing entities of interest is supported. Rapid disambiguation can be achieved through manual organism selection. Additionally, extensive logging functionalities are provided, which may be integrated in the document itself for document revision purposes. More details on ODIN are available in additional file 1.

Team 68 GeneView URL,

GeneView is a tool for gene-centric searching, ranking, and visualization of scientific full-text articles.

Methods, GeneView initially performs a series of preprocessing steps on each corpus that should be indexed: Full-text articles are parsed and indexed using Lucene.
Gene names are identified and normalized to Entrez Gene IDs using the BioCreative III version of GNAT. This version of GNAT has been improved to deal more efficiently with full texts and allows for a more general species-specific disambiguation of gene names. In addition, single nucleotide polymorphisms are identified using MutationFinder. All recognized entities are added to the Lucene index, together with the section type they were found in and their entity type.
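
The indexing step described above can be illustrated with a minimal sketch. It does not use Lucene and is not GeneView's implementation; it is a toy in-memory stand-in that only shows the idea of storing each recognized entity together with its entity type and the section type it was found in, so that searches can later be restricted by section. The names `EntityHit` and `EntityIndex`, the field names, and all example values are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityHit:
    """One recognized entity mention (all field names are assumptions)."""
    doc_id: str       # hypothetical article identifier
    entity_id: str    # e.g. a normalized gene ID or a mutation string
    entity_type: str  # e.g. "gene" or "snp"
    section: str      # section type the mention was found in


class EntityIndex:
    """Toy in-memory stand-in for the entity fields of a search index."""

    def __init__(self):
        # Maps (entity_type, entity_id) to the list of hits for that entity.
        self._by_entity = defaultdict(list)

    def add(self, hit):
        self._by_entity[(hit.entity_type, hit.entity_id)].append(hit)

    def search(self, entity_type, entity_id, section=None):
        """Return document IDs mentioning the entity, optionally per section."""
        hits = self._by_entity.get((entity_type, entity_id), [])
        if section is not None:
            hits = [h for h in hits if h.section == section]
        # Deduplicate document IDs while keeping first-seen order.
        doc_ids = []
        for h in hits:
            if h.doc_id not in doc_ids:
                doc_ids.append(h.doc_id)
        return doc_ids


# Illustrative data only; IDs and sections are made up for the example.
index = EntityIndex()
index.add(EntityHit("doc1", "7157", "gene", "abstract"))
index.add(EntityHit("doc1", "7157", "gene", "results"))
index.add(EntityHit("doc2", "7157", "gene", "methods"))
index.add(EntityHit("doc2", "A123T", "snp", "results"))
```

With this layout, `index.search("gene", "7157", section="results")` restricts the lookup to mentions found in results sections, which is the kind of query that storing the section type alongside each entity makes possible.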