A New Approach to Finding and Testing Clusters

David W. Scott
Rice University

Finding clusters in multivariate data is a fundamental problem underlying many investigations. The most successful algorithm is hierarchical clustering, which forms clusters by merging points that lie close together. A more sophisticated algorithm is k-means, which iteratively reassigns points to the closest cluster center. Perhaps the most sophisticated algorithm is mixture modeling, in which the entire dataset is fit by a weighted combination of multivariate Normal distributions. In practice, these methods can be fooled. The most difficult problem is determining the correct number of clusters. The second problem is that all of these methods work best when the clusters have the same shape (spherical, for example) and the same numerosity (number of points). We examine new research that aims to handle the difficult yet practical case: multivariate data, clusters of differing numerosity, differing shapes, and an unknown number of clusters. Our solution relies upon novel fitting technology and interactive graphical visualization.
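To make the failure modes above concrete, the following is a minimal sketch, not the new approach described in this abstract. It assumes scikit-learn's KMeans and GaussianMixture estimators; the synthetic cluster parameters and the candidate range of k are illustrative choices only. It builds two clusters of very different size and shape, runs k-means with the number of clusters fixed in advance, and fits Gaussian mixtures whose number of components is suggested by BIC.

```python
# Illustrative sketch only (not the author's method): unequal cluster
# sizes and non-spherical shapes, compared under k-means and a Gaussian
# mixture with BIC-based selection of the number of components.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Two clusters with different numerosity and different shapes:
# a large elongated cluster and a small spherical one (assumed parameters).
big = rng.multivariate_normal([0, 0], [[4.0, 1.5], [1.5, 1.0]], size=500)
small = rng.multivariate_normal([6, 3], [[0.3, 0.0], [0.0, 0.3]], size=50)
X = np.vstack([big, small])

# k-means must be told the number of clusters; its preference for
# equal-size, spherical groups can split the large cluster instead
# of isolating the small one.
km_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# Mixture modeling: fit several candidate k and let BIC suggest how
# many Normal components the data support.
bics = {}
for k in range(1, 6):
    gm = GaussianMixture(n_components=k, covariance_type="full",
                         n_init=5, random_state=0).fit(X)
    bics[k] = gm.bic(X)
best_k = min(bics, key=bics.get)
gm_labels = GaussianMixture(n_components=best_k, covariance_type="full",
                            n_init=5, random_state=0).fit(X).predict(X)

print("BIC by k:", {k: round(v, 1) for k, v in bics.items()})
print("chosen k:", best_k)
print("k-means cluster sizes:", np.bincount(km_labels))
print("mixture cluster sizes:", np.bincount(gm_labels))
```

Comparing the reported cluster sizes shows how sensitive the two methods are to unequal numerosity and shape, which is the setting the new research targets.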