The Cluster Engine is a data mining software module that can be integrated into a larger Business Intelligence (BI) or Knowledge Management (KM) system. The Engine's function is to categorize and sort unstructured information by identifying concepts within it and then organizing related concepts into cluster groups. Once these groups are formed, they can be saved as a taxonomy, which can be used for ongoing analysis to help identify trends and developments in the information.
For example, consider a customer satisfaction survey that asks patrons an open-ended question about their shopping experience. Such a survey can generate several hundred or even thousands of responses, and these responses are, by nature, unstructured information. In this case, the Cluster Engine can be used to automatically categorize and sort the responses into cluster groups. Once these groups are formed, various metrics, such as the "hot" or "most popular" customer issues, can be quickly identified. The cluster groups can also be saved as a taxonomy, so that when the survey is conducted again in the future, the Engine output can be compared against previously saved runs. This comparison can help identify trends and developments in customer satisfaction.
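The survey scenario above can be illustrated with a minimal Python sketch. It is not the Cluster Engine's actual algorithm or API: the `CONCEPTS` lexicon, the function names, and the sample responses are all hypothetical, and simple keyword matching stands in for the Engine's concept identification. The sketch only shows the shape of the workflow: group responses, find the "hot" issue, and compare cluster sizes against a saved run.

```python
from collections import Counter

# Hypothetical concept lexicon: concept label -> trigger words.
# A real engine would identify concepts itself; this mapping is an assumption.
CONCEPTS = {
    "checkout wait": {"line", "queue", "wait", "checkout"},
    "staff": {"staff", "employee", "cashier", "rude", "friendly"},
    "pricing": {"price", "prices", "expensive", "cheap"},
}


def assign_clusters(responses):
    """Map each concept label to the responses that mention it."""
    clusters = {label: [] for label in CONCEPTS}
    for text in responses:
        # Lowercase each word and strip trailing punctuation before matching.
        words = {w.strip(".,!?").lower() for w in text.split()}
        for label, triggers in CONCEPTS.items():
            if words & triggers:
                clusters[label].append(text)
    return clusters


def cluster_sizes(clusters):
    """Count how many responses fall into each cluster group."""
    return Counter({label: len(members) for label, members in clusters.items()})


def compare_runs(previous, current):
    """Return the change in size of each cluster between two runs."""
    labels = set(previous) | set(current)
    return {label: current.get(label, 0) - previous.get(label, 0) for label in labels}


# Cluster sizes from an earlier, saved survey run (made-up numbers).
previous_sizes = Counter({"checkout wait": 1, "staff": 2, "pricing": 1})

# Responses from the current survey run (made-up text).
responses = [
    "The checkout line was far too long.",
    "Friendly staff, but I had to wait at the checkout.",
    "Prices are too expensive.",
    "Long queue again!",
]

sizes = cluster_sizes(assign_clusters(responses))
hot_issue, hot_count = sizes.most_common(1)[0]  # largest cluster group
trend = compare_runs(previous_sizes, sizes)     # growth or decline per cluster
```

In this toy data the largest group is "checkout wait", and `trend` shows that it grew relative to the saved run while "staff" shrank, which is the kind of development the comparison is meant to surface.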
Input to the Cluster Engine can come from a variety of sources, including traditional databases, XML, web site content, web logs (blogs), Usenet posts, and RSS feeds. By default, the Engine produces XML output of its taxonomy results, but this output format can be configured as needed. In addition to the unstructured information, each input record can include metadata or keyword information. These additional inputs become attributes, which the Engine can use to further filter or segment the resultant cluster groups.
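As a rough illustration of records that carry attributes and of XML taxonomy output, consider the following Python sketch. The `InputRecord` class, the helper functions, and the `taxonomy`, `cluster`, and `record` element names are assumptions made for this example; the Engine's real input and output schemas are not specified here.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class InputRecord:
    """Hypothetical input record: unstructured text plus attributes."""
    text: str
    attributes: dict = field(default_factory=dict)  # string keys and values


def filter_by_attribute(records, name, value):
    """Keep only the records whose attribute `name` equals `value`."""
    return [r for r in records if r.attributes.get(name) == value]


def taxonomy_to_xml(clusters):
    """Serialize {cluster label: [records]} as an XML string (assumed schema)."""
    root = ET.Element("taxonomy")
    for label, records in clusters.items():
        cluster_el = ET.SubElement(
            root, "cluster", {"label": label, "size": str(len(records))}
        )
        for record in records:
            # Record attributes are written as XML attributes on the element.
            rec_el = ET.SubElement(cluster_el, "record", record.attributes)
            rec_el.text = record.text
    return ET.tostring(root, encoding="unicode")


records = [
    InputRecord("Long wait at checkout.", {"store": "north", "source": "survey"}),
    InputRecord("Checkout queue was slow.", {"store": "south", "source": "blog"}),
]

# Segment one cluster group by the "store" attribute, then emit XML.
north = filter_by_attribute(records, "store", "north")
xml_out = taxonomy_to_xml({"checkout wait": north})
```

Keeping attributes as plain string pairs lets the same record be filtered before clustering or segmented afterward, and lets the attributes pass straight through into the XML output.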