Sunday, 15 July 2018, 13:30 – 17:30
Central Washington University, USA
The goal of this tutorial is to present new reversible visualization and knowledge discovery methods that make knowledge discovery and predictive modeling more effective and rigorous. Specifically, we focus on the learning tasks of classification and clustering of n-D data, using lossless visual representations of n-D data as graphs.
Interactive Visual Data Mining and Knowledge Discovery is a way of enhancing both analytical and visualization methods for discovering hidden patterns in multidimensional data. The fundamental challenge for visual discovery in multidimensional data is that we cannot see n-D data with the naked eye and therefore need visual analytics tools (“n-D glasses”). This challenge starts at 4-D. Multidimensional data are often visualized with non-reversible, lossy dimension reduction methods such as Principal Component Analysis (PCA). While these methods are very useful, they can eliminate information that is critical for knowledge discovery in n-D data before the discovery of n-D patterns even begins. Expanding the class of reversible, lossless visualization methods is therefore important. Hybrid methods that combine such reversible methods with non-reversible visualization methods open wide new opportunities for knowledge discovery in n-D data. These lossless displays are important because they make it possible:
(1) to restore all attributes of each n-D data point from these graphs,
(2) to leverage the unique power of human vision to compare hundreds of their features in parallel, and
(3) to speed up the selection of an appropriate n-D data classification model.
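The difference between a reversible and a non-reversible display can be illustrated with a small numerical sketch. The example below is not taken from the tutorial material: it uses parallel coordinates as one familiar lossless display and a plain SVD-based PCA projection to 2-D as the lossy counterpart. The function names (`to_polyline`, `from_polyline`, `pca_round_trip`) and the random test data are hypothetical and chosen only for illustration.

```python
import numpy as np

def to_polyline(point):
    """Encode an n-D point as a parallel-coordinates polyline.

    Vertex i of the polyline is (i, x_i): the axis index and the value on it.
    """
    point = np.asarray(point, dtype=float)
    return np.column_stack([np.arange(point.size), point])

def from_polyline(vertices):
    """Recover the n-D point by reading the value stored at each vertex."""
    return np.asarray(vertices)[:, 1].copy()

def pca_round_trip(data, k=2):
    """Project data onto its top-k principal components and map back to n-D."""
    mean = data.mean(axis=0)
    centered = data - mean
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:k]
    scores = centered @ components.T      # the k-D picture that would be plotted
    return scores @ components + mean     # best possible n-D reconstruction

# Illustrative data only: 50 random points in 6-D.
rng = np.random.default_rng(0)
data = rng.normal(size=(50, 6))

restored = np.array([from_polyline(to_polyline(p)) for p in data])
polyline_error = np.abs(restored - data).max()
pca_error = np.abs(pca_round_trip(data, k=2) - data).max()

print("max error after polyline round trip:", polyline_error)
print("max error after 2-D PCA round trip: ", pca_error)
```

Under these assumptions the polyline round trip restores every attribute exactly, while the 2-D PCA round trip leaves a nonzero reconstruction error, which is the information a lossy display discards before any pattern discovery takes place.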
The presenter will review and compare reversible and non-reversible visualization methods such as General Line Coordinates, PCA, Multidimensional Scaling, manifold methods, and others. The presenter will use relevant material from his published books, a forthcoming Springer book on Visual Knowledge Discovery, and his recent tutorial at the IEEE Joint Neural Networks Conference 2017.
The target audience of this tutorial includes HCI graduate students, scientists, and practitioners. Graduate students and researchers will become familiar with these new developments and with new opportunities to enhance their own research inspired by these methods. Practitioners will benefit from opportunities to apply these methods to real-world tasks.
Dr. Boris Kovalerchuk is a professor of Computer Science at Central Washington University, USA. He is co-author of two books, on Data Mining (Kluwer, 2000) and on Visual and Spatial Analysis (Springer, 2005), author of a forthcoming Springer book on Visual Knowledge Discovery, and author of over 170 other publications, including a chapter in the Data Mining Handbook. His research interests are in visual analytics, data mining, machine learning, uncertainty modeling, relationships between probability theory and fuzzy logic, data fusion, and image and signal processing. He has been a principal investigator on several research projects in these areas supported by US Government agencies. Dr. Kovalerchuk has served as a senior visiting scientist at the US Air Force Research Laboratory and as a member of several expert panels at international conferences and of panels organized by US Government bodies.