Topics in Graph Clustering

Yali Wan

Advisor: Marina Meila

In this talk, we present model-free methods for community recovery and evaluation on stability of graph properties, in which we do not make any assumptions on data generation progress. The study of social networks has aroused lots of attention lately. One of the major interest is on finding the guarantees for the recovery of community structure and evaluating the stability of the robustness of graph properties. Current work has been focused on model-based community recovery. Particular interests are in Stochastic Block Model (SBM) and its extension, Degree Corrected Stochastic Block Model (DC-SBM). Despite its fast advances, the results are limited by the assumption that the observed data come from the model.

We firstly provide a bootstrap method for the evaluation of robustness of graph properties in social networks, including clustering, weighted cut, number of connected components, etc. We describe graph properties using influence functions and breakdown points inspired by literature in robust statistics. The bootstrapping of networks is done by varying the weights on the nodes, in which way the topology of the graph is preserved.

Second, we describe a model-free framework to provide theoretical guarantees for the results of model-based clustering algorithms. The framework is as follows: if $M(G, C)$ fits the data $G$ well, then we shall prove that any other clustering $C_0$ of $G$ that also fits $G$ well will be a small perturbation of $C$. If this holds, then $C$ with model parameters $M(G, C)$ can be said to capture the data structure in a meaningful way. We instantiate this framework by obtaining model-free guarantees for SBM models.

Start Time


Building Map