Center for Applied Mathematical Sciences Visit USC
USC Donrsife Homepage
Monday
17
Sep 12
3:30 PM - 4:30 PM KAP 414
Learning Social Similarities from Friendship Patterns
David Kempe Computer Science, USC

What can a social network tell us about the underlying latent "social structure", the way in which individuals are similar or dissimilar?
Much of social network analysis is, implicitly or explicitly, predicated on the assumption that individuals tend to be more similar to their friends than to strangers. Having explicit access to similarity information instead of merely the noisy signal provided by the presence or absence of edges could improve analysis significantly.
We study the following natural question: Given a social network - reflecting the underlying social distances between its nodes - how accurately can we reconstruct the social structure?

It is tempting to model similarities and dissimilarities as distances, so that the social structure is a metric space. However, observed social networks are usually multiplex, in the sense that different edges originate from similarity in one or more among a number of different categories, such as geographical proximity, professional interactions, kinship, or hobbies.
Since close proximity in even one category makes the presence of edges much more likely, an observed social network is more accurately modeled as a union of separate networks. In general, it is a priori not known which network a given edge comes from.
While generative models based on a single metric space have been analyzed previously, a union of several networks individually generated from metrics is structurally very different from (and more complex than) networks generated from just one metric.

We begin to address this reconstruction problem formally. The latent "social structure" consists of several metric spaces. Each metric space gives rise to a "distance-based random graph," in which edges are created according to a distribution that depends on the underlying metric space and makes long-range edges less likely than short ones. For a concrete model, we consider Kleinberg's small-world model and some variations thereof. The observed social network is the union of these graphs. All edges are unlabeled, in the sense that the existence of an edge does not reveal which random graph it comes from.
Our main result is a near-linear time algorithm which reconstructs from this unlabeled union each of the individual metrics with provably low distortion.

(Joint work with Ittai Abraham Shiri Chechik, and Aleksandrs Slivkins.)

Previous colloquium: Wave in random media and applications Next colloquium: Self-similarity of random trees and applications