Synchronization-based scalable subspace clustering of high-dimensional data

Abstract

How can the challenges of the "curse of dimensionality" and scalability in clustering be addressed simultaneously? In this paper, we propose arbitrarily oriented synchronized clusters (ORSC), a novel, effective, and efficient method for subspace clustering inspired by synchronization. Synchronization is a basic phenomenon prevalent in nature, capable of controlling even highly complex processes such as opinion formation in a group. This control is achieved by simple operations based on interactions between objects. Relying on a weighted interaction model and iterative dynamic clustering, ORSC (a) naturally detects correlation clusters in arbitrarily oriented subspaces, including arbitrarily shaped nonlinear correlation clusters, and (b) is robust against noise and outliers. In contrast to previous methods, ORSC is (c) easy to parameterize, since there is no need to specify the subspace dimensionality or other difficult parameters; instead, all interesting subspaces are detected fully automatically. Finally, (d) ORSC outperforms most comparison methods in terms of runtime efficiency and is highly scalable to large and high-dimensional data sets. Extensive experiments demonstrate the effectiveness and efficiency of our approach.
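To illustrate the general idea of clustering by synchronization, the following is a minimal sketch, not ORSC itself. It assumes a plain, unweighted Kuramoto-style interaction in which every point repeatedly moves toward the points within a radius `eps`; the abstract does not specify ORSC's weighted interaction model or its subspace detection, so neither is reproduced here. The function names, the parameters `eps`, `tol`, and `merge_tol`, and the toy data are all hypothetical.

```python
import math


def synchronize(points, eps, max_iter=50, tol=1e-6):
    """Move each point toward its eps-neighbours until positions stop changing.

    Assumption: unweighted Kuramoto-style update, used only for illustration.
    """
    pts = [list(p) for p in points]
    for _ in range(max_iter):
        new_pts = []
        shift = 0.0
        for p in pts:
            # Neighbours within Euclidean distance eps; the point itself is
            # included and contributes sin(0) = 0 to the interaction sum.
            nbrs = [q for q in pts if math.dist(p, q) <= eps]
            moved = [
                p[d] + sum(math.sin(q[d] - p[d]) for q in nbrs) / len(nbrs)
                for d in range(len(p))
            ]
            shift = max(shift, math.dist(p, moved))
            new_pts.append(moved)
        pts = new_pts  # synchronous update: all points move at once
        if shift < tol:
            break
    return pts


def label_clusters(pts, merge_tol=1e-3):
    """Assign the same label to points that ended up at nearly the same place."""
    reps, labels = [], []
    for p in pts:
        for k, r in enumerate(reps):
            if math.dist(p, r) <= merge_tol:
                labels.append(k)
                break
        else:
            reps.append(p)
            labels.append(len(reps) - 1)
    return labels


# Toy data (hypothetical): two well-separated groups of three points each.
data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (2.0, 2.0), (2.1, 2.0), (2.0, 2.1)]
labels = label_clusters(synchronize(data, eps=0.5))
print(labels)
```

In this sketch, points that interact collapse onto a common location, so clusters can be read off as groups of coinciding points and the number of clusters does not have to be given in advance. Points with no neighbours other than themselves never move, which hints at how isolated noise points could be told apart.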

Authors
  • Junming Shao
  • Xinzuo Wang
  • Qinli Yang
  • Claudia Plant
  • Christian Böhm
Reference
Category
Journal Paper
Institute
Data Mining
Journal or Publication Title
Knowledge and Information Systems
ISSN
0219-1377
Page range
pp. 1-29
Date
2016
Official URL
http://dx.doi.org/10.1007/s10115-016-1013-1
Contact
Fakultät für Informatik
Universität Wien

Währinger Straße 29
1090 Wien