Description
Random Projections have been widely used to generate embeddings for various large-graph tasks because of their computational efficiency in estimating relevance between vertices. Most applications are justified through the Johnson-Lindenstrauss Lemma. We take a step further and investigate how well dot product and cosine similarity are preserved by Random Projections. Our analysis provides new theoretical results, identifies pathological cases, and tests them with numerical experiments. We find that, for nodes of very low or very high degree, the method produces especially unreliable dot-product estimates, regardless of whether the adjacency matrix or the transition matrix (its normalized version) is used. With respect to the noise introduced by Random Projections, we show that cosine similarity is approximated with markedly higher precision. This work builds on extensive experiments by the Graph Intelligence Sciences team at Microsoft to compute relevance between entities (emails, documents, people, events, ...) in Office 365. Joint work with Cassiano Becker and Jennifer Neville.
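As a rough illustration of the setup (a minimal sketch, not the talk's actual experiments; the graph size, edge density, and projection dimension below are arbitrary choices), the snippet projects the rows of a random adjacency matrix with a Gaussian Johnson-Lindenstrauss map and compares how well dot products and cosine similarities survive the projection:

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary toy graph: n nodes, ~5% edge density, projected to d dimensions.
n, d = 500, 64
A = (rng.random((n, n)) < 0.05).astype(float)
A = np.triu(A, 1)
A = A + A.T                          # symmetric adjacency, no self-loops

# JL-style projection: R has i.i.d. N(0, 1/d) entries, so
# E[(A R)(A R)^T] = A A^T, i.e. row dot products are preserved in expectation.
R = rng.normal(0.0, 1.0 / np.sqrt(d), size=(n, d))
E = A @ R                            # d-dimensional node embeddings

dot_true = A @ A.T
dot_est = E @ E.T

def cosine(M):
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    norms[norms == 0] = 1.0          # guard against isolated nodes
    U = M / norms
    return U @ U.T

cos_true = cosine(A)
cos_est = cosine(E)

# Compare mean absolute errors over distinct node pairs. In line with the
# abstract, dot-product estimates are noisier (and degrade further for
# extreme degrees), while cosine similarity is approximated more tightly.
mask = ~np.eye(n, dtype=bool)
print("dot product mean abs. error:", np.abs(dot_est - dot_true)[mask].mean())
print("cosine      mean abs. error:", np.abs(cos_est - cos_true)[mask].mean())
```

Grouping the per-pair errors by node degree would then expose the pathological behavior the abstract mentions for very low- and very high-degree nodes.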