I’ll be giving a talk this Wednesday titled “Knowledge Discovery with Iterative Denoising”.
Abstract:
Proper data analysis and knowledge extraction are beset by multiple difficulties when datasets are large and high-dimensional. From a performance perspective, searching a high-dimensional space can be prohibitively expensive. Complex datasets also often contain local relationships of interest, which global searches may miss. While some progress has been made in addressing the curse of dimensionality, traditional data mining algorithms largely take a static approach to the data mining process: they simply tabulate the outputs of a particular algorithm for a given input, leaving the user to start over with new inputs if another run is desired. This static approach prevents the user from interacting with the data mining algorithm as well as with the data.
To address these issues, we discuss our methodology, called Iterative Denoising, which is a statistical pattern recognition framework for analyzing complex datasets. A key observation behind our methodology is that users may want to interact with visualized representations of their data. We not only provide users with lower-dimensional representations that highlight (possibly) desired structures in the data, but also allow the user to interact with the data through an explicit interaction step. For example, the user may wish to change the displayed geometric relationships between objects, say to reflect some metadata intelligence the user has received that is not reflected in the original data. We illustrate these contributions with examples from the analysis of text data.
Logistics:
12PM, October 8, 2008, Computer Science Department, Virginia Commonwealth University