Abstract. How can we succinctly describe a millionnode graph with a few simple sentences? How can we measure the `importance' of a set of discovered subgraphs in a large graph? These are exactly the problems we focus on. Our main ideas are to construct a `vocabulary' of subgraphtypes that often occur in real graphs (e.g., stars, cliques, chains), and from a set of subgraphs, find the most succinct description of a graph in terms of this vocabulary. We measure success in a wellfounded way by means of the Minimum Description Length (MDL) principle: a subgraph is included in the summary if it decreases the total description length of the graph.
Our contributions are threefold: (a) formulation: we provide a principled encoding scheme to choose vocabulary subgraphs; (b) algorithm: we develop VoG, an efficient method to minimize the description cost, and (c) applicability: we report experimental results on multimillionedge real graphs, including Flickr and the Notre Dame web graph.
Summarizing and Understanding Large Graphs. Statistical Analysis and Data Mining vol.8(3), pp 183202, Wiley, 2015. 

VoG: Summarizing and Understanding Large Graphs. In: Proceedings of the SIAM International Conference on Data Mining (SDM), pp 9199, SIAM, 2014. 