Bin Yu
UC Berkeley

Spectral clustering and the high-dimensional Stochastic Block Model

In recent years, network analysis has become the focus of much research in many fields, including biology, communication studies, economics, information science, organizational studies, and social psychology. Communities, or clusters of highly connected actors, form an essential feature in the structure of several empirical networks. Spectral clustering is a popular and computationally feasible method for discovering these communities. The Stochastic Block Model is a social network model with well-defined communities. This talk will give conditions under which spectral clustering correctly estimates the community membership of nearly all nodes. These asymptotic results are the first clustering results that allow the number of clusters in the model to grow with the number of nodes, hence the name high-dimensional. If time allows, I will present ongoing work on directed spectral clustering for networks whose edges are directed, using the Enron data as an example.
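
For readers unfamiliar with the setup, the following is a minimal sketch of the pipeline the abstract refers to: draw an undirected graph from a Stochastic Block Model, then recover the communities by spectral clustering. It is an illustration under stated assumptions and not necessarily the exact procedure analyzed in the talk. The function names (sample_sbm, kmeans, spectral_clustering), the use of the matrix D^{-1/2} A D^{-1/2}, the choice of the k eigenvectors with largest absolute eigenvalues, and the farthest-point k-means initialization are all choices made here for illustration.

```python
import numpy as np


def sample_sbm(sizes, p_in, p_out, rng):
    """Draw an undirected Stochastic Block Model graph (illustrative helper).

    Nodes in the same block connect with probability p_in, nodes in
    different blocks with probability p_out. No self-loops.
    """
    labels = np.repeat(np.arange(len(sizes)), sizes)
    same_block = labels[:, None] == labels[None, :]
    probs = np.where(same_block, p_in, p_out)
    # Sample the strict upper triangle, then mirror it to get a symmetric matrix.
    upper = np.triu(rng.random(probs.shape) < probs, k=1)
    adj = (upper | upper.T).astype(float)
    return adj, labels


def kmeans(points, k, n_iter=50):
    """Plain Lloyd iterations with a deterministic farthest-point start."""
    centers = [points[0]]
    for _ in range(1, k):
        dist = np.min([np.sum((points - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = d2.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):  # keep the old center if a cluster empties
                centers[j] = points[assign == j].mean(axis=0)
    return assign


def spectral_clustering(adj, k):
    """Cluster nodes using the leading eigenvectors of D^{-1/2} A D^{-1/2}."""
    deg = adj.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1.0))  # guard isolated nodes
    norm_adj = inv_sqrt[:, None] * adj * inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(norm_adj)
    # Assumption: keep the k eigenvectors with largest absolute eigenvalue.
    top = np.argsort(np.abs(vals))[-k:]
    return kmeans(vecs[:, top], k)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    adj, labels = sample_sbm([40, 40], p_in=0.5, p_out=0.05, rng=rng)
    pred = spectral_clustering(adj, k=2)
    # Cluster labels are only identified up to permutation.
    acc = max(np.mean(pred == labels), np.mean(pred != labels))
    print(f"fraction of nodes correctly clustered: {acc:.2f}")
```

The two-block example is deliberately easy (dense within blocks, sparse between them). The high-dimensional regime described in the abstract corresponds to letting the number of entries in sizes grow along with the total number of nodes.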