Text Classification Systems Essay
Statistical text clustering
Statistical text clustering developed greatly in the 1990s, when it emerged as a subtask of Information Retrieval (IR) applications (Joachims and Sebastiani, 2002). The hallmark of that development has been the dramatic improvement in the effectiveness of text clustering systems. The last two decades have witnessed an unprecedented revolution in the development of mechanized solutions for organizing the vast quantity of unstructured digital documents and in the provision of powerful tools for turning this unstructured repository into a structured one (Sebastiani, 2006).
Recent years have seen a rapid increase in the amount of stored data in all fields of knowledge, driven by the continuous improvement of methods for storing data digitally. In response to this growing overflow of information, which has made it difficult for many search engines to meet people’s needs, various computer-based clustering and classification methods have been developed. IR researchers and internet users have raised concerns about the poor match between queries and the results generated by search engines. IR researchers have worked in consequence