How Can Publishers Contribute?
For all automated information-extraction methods, access to the literature is crucial. Electronic access has already had a huge impact, but the structure and organisation of manuscripts could also be improved. For example, semantic tags could be integrated into the text. The markup would not appear on web pages or in print, but it would help software to handle the semantic aspects of the document. Tags that mark protein names, for example, would allow retrieval software to find documents about proteins even when the protein names look like common English words, such as “you” or “and”, which retrieval engines currently often ignore. Explicit tags would also allow text mining methods, for example those looking for protein–protein interactions, to apply the correct semantic interpretation. Text mining systems that are already available, such as Whatizit, can insert semantic tags during submission; the author then has to verify these tags.
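To make the idea concrete, the following minimal sketch shows how inline tags might let retrieval software recover protein names that a stopword filter would otherwise discard. The `<protein>` tag name, the example sentence, the stopword list, and the helper functions are all assumptions made for illustration; they do not reflect the markup produced by Whatizit or required by any publisher.

```python
import re
import xml.etree.ElementTree as ET

# Invented example sentence; the <protein> tag name is a hypothetical choice.
TAGGED = (
    "<p>We show that <protein>and</protein> interacts with "
    "<protein>you</protein> in the embryo and in the adult.</p>"
)

# A tiny, illustrative stopword list of the kind retrieval engines use.
STOPWORDS = {"and", "you", "the", "in", "with", "that", "we"}


def visible_text(xml_text):
    """Return the text a reader would see, with all markup removed."""
    return "".join(ET.fromstring(xml_text).itertext())


def plain_index(text):
    """Index words after stopword filtering, so common words are dropped."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def protein_index(xml_text):
    """Index only the terms explicitly tagged as protein names."""
    root = ET.fromstring(xml_text)
    return {el.text.strip().lower() for el in root.iter("protein") if el.text}


if __name__ == "__main__":
    print(visible_text(TAGGED))
    print(sorted(plain_index(visible_text(TAGGED))))
    print(sorted(protein_index(TAGGED)))
```

In this sketch the rendered sentence is unchanged by the markup, the plain index loses both protein names because they coincide with stopwords, and the tag-aware index retains them.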