# Computing Clusters of Correlation Connected Objects

@inproceedings{Bhm2004ComputingCO, title={Computing Clusters of Correlation Connected objects}, author={C. B{\"o}hm and Karin Murthy and Peer Kr{\"o}ger and Arthur Zimek}, booktitle={SIGMOD '04}, year={2004} }

The detection of correlations between different features in a set of feature vectors is a very important data mining task because correlation indicates a dependency between the features or some association of cause and effect between them. This association can be arbitrarily complex, i.e. one or more features might depend on a combination of several other features. Well-known methods like principal component analysis (PCA) can perfectly find correlations which are global, linear…
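The abstract's remark that PCA finds global, linear correlations can be illustrated with a minimal sketch. It is not the paper's method: the helper name `pca_explained_variance`, the synthetic data, and the noise level are all assumptions chosen for illustration. When one feature is almost a linear function of another, nearly all variance falls on the first principal component.

```python
import numpy as np

def pca_explained_variance(X):
    """Return each principal component's share of total variance, largest first.

    Hypothetical helper for illustration; X has shape (n_samples, n_features).
    """
    Xc = X - X.mean(axis=0)                  # center the data
    cov = np.cov(Xc, rowvar=False)           # feature covariance matrix
    eigvals = np.linalg.eigvalsh(cov)[::-1]  # eigenvalues, descending
    return eigvals / eigvals.sum()

# Synthetic data (assumed): the second feature is roughly 2 * first feature.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
X = np.column_stack([x, 2.0 * x + 0.05 * rng.normal(size=500)])

ratios = pca_explained_variance(X)
print(ratios)  # the first share dominates, signalling a global linear dependency
```

A near-zero share for the last component indicates a direction along which the data barely varies, which is how a linear dependency between features shows up in the eigenvalues.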

#### 182 Citations

Mining Hierarchies of Correlation Clusters

- Computer Science
- 18th International Conference on Scientific and Statistical Database Management (SSDBM'06)
- 2006

The algorithm HiCO (hierarchical correlation ordering), the first hierarchical approach to correlation clustering, is proposed; it determines the cluster hierarchy and visualizes it using correlation diagrams.

On Exploring Complex Relationships of Correlation Clusters

- Computer Science
- 19th International Conference on Scientific and Statistical Database Management (SSDBM 2007)
- 2007

The algorithm ERiC is proposed, which finds more information than state-of-the-art correlation clustering methods, outperforms existing competitors in terms of efficiency, and visualizes the result by means of a graph-based representation.

Mining Subspace Correlations

- Computer Science
- 2007 IEEE Symposium on Computational Intelligence and Data Mining
- 2007

A formal stochastic line cluster model is presented and its connection to correlation is established, along with an algorithm that uses feature selection to search for line clusters embedded in subspaces of the data.

Linear Manifold Correlation Clustering

- 2007

The detection of correlations is a data mining task of increasing importance due to new areas of application such as DNA microarray analysis, collaborative filtering, and text mining. In these case…

Deriving quantitative models for correlation clusters

- Mathematics, Computer Science
- KDD '06
- 2006

This paper introduces a general method that can extract quantitative information on the linear dependencies within a correlation clustering and shows how these quantitative models can be used to predict the probability distribution that an object is created by these models.

Subspace clustering for complex data

- Computer Science
- BTW
- 2012

This work introduces novel methods for effective subspace clustering on various types of complex data (vector data, imperfect data, and graph data) and proposes models whose solutions contain only non-redundant and, thus, valuable clusters.

CURLER: finding and visualizing nonlinear correlation clusters

- Computer Science
- SIGMOD '05
- 2005

An algorithm for finding and visualizing nonlinear correlation clusters in the subspace of high-dimensional databases using a novel concept called co-sharing level which captures both spatial proximity and cluster orientation when judging similarity between clusters.

New Techniques for Clustering Complex Objects

- Mathematics
- 2004

The tremendous amount of data produced nowadays in various application domains such as molecular biology or geography can only be fully exploited by efficient and effective data mining tools. One of…

New techniques for clustering complex objects

- Computer Science
- 2004

This thesis presents original extensions and enhancements of the density-based clustering notion to cope with high-dimensional data and proposes an algorithm called SUBCLU (density-connected Subspace Clustering) that extends DBSCAN (Density-Based Spatial Clustering of Applications with Noise) to the problem of subspace clustering.

INCONCO: interpretable clustering of numerical and categorical objects

- Computer Science
- KDD
- 2011

The proposed algorithm, INCONCO, successfully finds clusters in mixed type data sets, identifies the relevant attribute dependencies, and explains them using linear models and case-by-case analysis, and outperforms existing approaches in effectiveness, as the extensive experimental evaluation demonstrates.

#### References

SHOWING 1-10 OF 27 REFERENCES

Clustering by pattern similarity in large data sets

- Computer Science
- SIGMOD '02
- 2002

This paper introduces an effective algorithm to detect clusters of genes that are essential in revealing significant connections in gene regulatory networks, and performs tests on several real and synthetic data sets to show its effectiveness.

OP-cluster: clustering by tendency in high dimensional space

- Mathematics, Computer Science
- Third IEEE International Conference on Data Mining
- 2003

A flexible yet powerful clustering model, OP-cluster (Order Preserving Cluster), is proposed; it is essential in revealing significant gene regulatory networks, and its effectiveness and efficiency in detecting coregulated patterns are demonstrated.

δ-clusters: capturing subspace correlation in a large data set

- Computer Science
- Proceedings 18th International Conference on Data Engineering
- 2002

The δ-cluster model takes the bicluster model as a special case, in which the FLOC algorithm far outperforms the bicluster algorithm; FLOC is devised to efficiently produce near-optimal clustering results.

Entropy-based subspace clustering for mining numerical data

- Computer Science
- KDD '99
- 1999

This work considers a database with numerical attributes, in which each transaction is viewed as a multi-dimensional vector, and identifies new meaningful criteria of high density and correlation of dimensions for goodness of clustering in subspaces.

OPTICS: ordering points to identify the clustering structure

- Computer Science
- SIGMOD '99
- 1999

A new algorithm is introduced for the purpose of cluster analysis which does not produce a clustering of a data set explicitly, but instead creates an augmented ordering of the database representing its density-based clustering structure.

Fast algorithms for projected clustering

- Computer Science
- SIGMOD '99
- 1999

An algorithmic framework for solving the projected clustering problem, in which the subsets of dimensions selected are specific to the clusters themselves, is developed and tested.

Using the fractal dimension to cluster datasets

- Computer Science
- KDD '00
- 2000

A new clustering algorithm, based on the fractal properties of the data sets, which is capable of recognizing clusters of arbitrary shape and which places points incrementally in the cluster for which the change in the fractal dimension after adding the point is the least.

Finding generalized projected clusters in high dimensional spaces

- SIGMOD '00
- 2000

High dimensional data has always been a challenge for clustering algorithms because of the inherent sparsity of the points. Recent research results indicate that in high dimensional data, even the…

Finding Generalized Projected Clusters In High Dimensional Spaces

- Computer Science
- SIGMOD Conference
- 2000

Very general techniques for projected clustering are discussed which are able to construct clusters in arbitrarily aligned subspaces of lower dimensionality, an approach that is substantially more general and realistic than currently available techniques.

Automatic subspace clustering of high dimensional data for data mining applications

- Computer Science
- SIGMOD '98
- 1998

CLIQUE is presented, a clustering algorithm that satisfies the requirements of data mining applications, including the ability to find clusters embedded in subspaces of high dimensional data, scalability, end-user comprehensibility of the results, non-presumption of any canonical data distribution, and insensitivity to the order of input records.