Text mining is a relatively new research area at the intersection of natural-language processing, machine learning, data mining, and information retrieval. By appropriately integrating techniques from each of these disciplines, useful new methods for discovering knowledge from large text corpora can be developed. In particular, the growing interaction between computational linguistics and machine learning is critical to the development of effective text-mining systems. Traditional data mining assumes that the information to be “mined” is already in the form of a relational database. Unfortunately, for many applications, electronic information is available only as free natural-language documents rather than structured databases. Information Extraction (IE) technologies are the starting point for the analysis of text: they identify relationships, match patterns, and extract structured information from unstructured text.

(Mooney and Nahm, 2005) Text Mining with Information Extraction. Multilingualism and Electronic Language Management.
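
To make the step from free text to a relational form concrete, here is a minimal sketch in Python. It is an illustration only and not the method of the cited work: a single hand-written regular expression stands in for an IE component, and the pattern, the function name `extract_hires`, the field names, and the sample sentences are all hypothetical.

```python
import re

# Hypothetical extraction pattern: "<Person> joined <Organization> as <title>".
# A real IE system would typically rely on learned extractors, not one regex.
HIRE_PATTERN = re.compile(
    r"(?P<person>[A-Z][a-z]+(?: [A-Z][a-z]+)+) joined "
    r"(?P<organization>[A-Z][A-Za-z]+(?: [A-Z][A-Za-z]+)*) as (?:an? |the )?"
    r"(?P<title>[a-z]+(?: [a-z]+)*)"
)


def extract_hires(text):
    """Return one dict per match; each dict is one row of a relational table."""
    return [match.groupdict() for match in HIRE_PATTERN.finditer(text)]


# Made-up sample documents, used only to exercise the pattern.
documents = [
    "Last spring, Ada Lovelace joined Analytical Engines as a software engineer.",
    "The weather was mild. Alan Turing joined Bletchley Park as a cryptanalyst.",
    "No facts here.",
]

# Flatten the per-document matches into a single table of rows.
rows = [row for doc in documents for row in extract_hires(doc)]

for row in rows:
    print(row)
```

Each resulting row has the fixed fields `person`, `organization`, and `title`, so the collection can be loaded into a database table and handed to conventional data-mining tools. Documents that contain no matching statement simply contribute no rows.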

