July 27, 2005

UK Announces National Centre for Text Mining

29th April, 2004. Imagine a future in which databases are populated with accurate, valid, exhaustive, rapidly updated data where users find what they want all the time; where drug discovery costs and development time are slashed and animal experimentation is reduced through early identification of unpromising paths; where new insights are gained through integration and exploitation of experimental results, databases, and scientific knowledge; where product development archives and patents yield new directions for R&D; and where searching yields facts rather than documents to read. This is the potential of text mining.

The JISC, BBSRC and EPSRC announced today funding of some £1m to establish a National Centre for Text Mining. The remit of the Centre, the first publicly funded centre in the world, is to contribute to the associated national and international research agenda, to establish a service for the wider academic community, and to make connections with industry.

Text mining attempts to discover new, previously unknown information by applying techniques from natural language processing, data mining, and information retrieval:

  • To identify and gather relevant textual sources
  • To analyse these to extract facts involving key entities and their properties
  • To combine the extracted facts to form new facts or to gain valuable insights.
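
The three steps above can be sketched in a few lines of Python. This is only a toy illustration, not the Centre's tooling: the sentences are invented, the function names (gather, extract, combine) are hypothetical, and a single regular expression stands in for real natural language processing.

```python
import re
from itertools import product

# Invented example sentences for illustration only; they are not real findings.
DOCUMENTS = [
    "Compound A inhibits protein B.",
    "Protein B regulates pathway C.",
    "The weather in Manchester was mild.",
    "Compound D inhibits protein E.",
]

# Assumed pattern: "<subject> <verb> <object>." for a small fixed verb list.
FACT_PATTERN = re.compile(
    r"^(?P<subj>[\w ]+?) (?P<verb>inhibits|regulates) (?P<obj>[\w ]+)\.$",
    re.IGNORECASE,
)


def gather(documents):
    """Step 1: keep only documents that mention a relation of interest."""
    return [d for d in documents if re.search(r"\b(inhibits|regulates)\b", d)]


def extract(documents):
    """Step 2: turn each matching sentence into a (subject, verb, object) fact."""
    facts = []
    for doc in documents:
        m = FACT_PATTERN.match(doc)
        if m:
            facts.append((m["subj"].lower(), m["verb"].lower(), m["obj"].lower()))
    return facts


def combine(facts):
    """Step 3: chain facts sharing an entity; a->b and b->c suggest a link a->c."""
    candidates = []
    for (a, v1, b), (b2, v2, c) in product(facts, repeat=2):
        if b == b2 and a != c:
            candidates.append((a, c, f"{v1} {b}, which {v2} {c}"))
    return candidates


if __name__ == "__main__":
    for source, target, evidence in combine(extract(gather(DOCUMENTS))):
        print(f"Candidate link: {source} -> {target} ({evidence})")
```

The candidate link is never stated in any single sentence; it appears only when two extracted facts are joined on a shared entity, which is the sense in which text mining aims at previously unknown information.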

Text mining finds applications in many diverse areas of wide interest such as drug discovery and predictive toxicology, protein interaction, competitive intelligence, protection of the citizen, identification of new product possibilities, detection of links between lifestyle and states of health, and many more.

Led by UMIST, the National Centre for Text Mining will be run by an internationally leading consortium. The consortium has four UK partner institutions: UMIST, the Victoria University of Manchester, the University of Liverpool, and the University of Salford. These core partners are joined by international partners: the University of California Berkeley, the University of Geneva, the San Diego Supercomputing Centre, and the University of Tokyo, with the European Bioinformatics Institute having a presence on the Technical Directorate. It is anticipated that the Centre will take part in the related emerging networks of excellence.

The Centre will initially focus on biological and biomedical science. This area of science has the largest user community and the fastest growing literature, and is the area where most applications research in text mining is being undertaken. At the same time, the tools developed by the Centre will be of interest and relevance to the wider academic community. A major challenge for the Centre will be to handle very large volumes of text, and the intermediate data produced during processing, efficiently and robustly.

The Centre will be housed in the under-construction £34M Manchester Interdisciplinary Biocentre to facilitate interaction between text mining researchers and bio-domain users. As a measure of its commitment to the Centre, the consortium is itself investing some £800K, including the establishment of a new Chair in Text Mining and the full-time secondment of staff. Further, the North-West Development Agency, the National Centre for e-Social Science, the Consortium for Post-Genome Science, and e-Science Northwest have been most supportive of the initiative.

Professor John Garside, Principal and Vice-Chancellor of UMIST, said: “I’m delighted that UMIST and the new University of Manchester have the opportunity with this new centre to make a leading contribution to the critical task of deriving meaning from text. The consortium represents expertise in all the component areas of text mining, with an impressive array of international partners.”

Professor Julian Crampton, University of Liverpool Pro-Vice Chancellor, said: “Our work within the UK Centre for Text Mining builds on current work in developing systems which will let leading researchers in the biosector discover hitherto unknown information. The possibilities for such data mining from large text collections are virtually untapped and we are pleased to play a leading part in developing state of the art tools and techniques which will increase the rate and scope of discovery in biomedical science.”

Professor Ross King, of the University of Wales, and member of the JISC Committee for the Support of Research (JCSR), said: “The setting up of the UK Centre in Text Mining is a very exciting development. The amount of scientific literature is growing so fast that there is an urgent need for novel computer based tools to help scientists keep up. The success of Google has shown how useful text retrieval programs can be. The tools developed will be applicable to academics in all subject areas, including social science and arts and humanities, for example in analysing ancient texts found by archaeologists.”

Professor John Keane (Computation, UMIST; Proposal Coordinator, and Interim Co-Director): “The Centre will play a leading role both nationally and internationally in developing the research agenda in text mining, promulgating associated best practice, and developing service provision. All those involved look forward to the challenges and opportunities that lie ahead.”

Dr Sophia Ananiadou (Computing, Science and Engineering, Salford; Interim Co-Director): “The Centre will address the increasing needs of the bioscience community to gather and structure scientific knowledge from texts. The synergy of text mining and bioscience will be beneficial to scientists from both communities.”

For further information:

John Keane (Co-Director): 0161 200 3347

Sophia Ananiadou (Co-Director): 0161 295 0480

John McNaught (Technical Directorate, contact point): 0161 200 3098

Posted by mgk at July 27, 2005 11:04 AM