Google engEDU
27 min 23 sec - Aug 3, 2006
Google TechTalks August 3, 2006
Dan Frankowski is both a computer science researcher and a practitioner in software and algorithms ... development. He got his master's degree in computer science from the University of Minnesota in 1993, then spent a year in Budapest on a Fulbright grant studying mathematics. From 1997 to 2003 he was an algorithms guy at Net Perceptions. Since 2003 he has been a research fellow with the GroupLens research group at the University of Minnesota, which is best known for recommenders but now studies online community more broadly.
ABSTRACT In today's data-rich networked world, people express many aspects of their lives online. It is common to segregate different aspects in different places: you might write opinionated rants about movies in your blog under a pseudonym while participating, under your real name, in a forum or web site for scholarly discussion of medical ethics. However, it may be possible to link these separate identities, because the movies, journal articles, or authors you mention are from a sparse relation space whose properties (e.g., many items that only a few users relate to) allow re-identification. This talk examines this general problem in a specific setting: re-identification of users from a public web movie forum in a private movie ratings dataset.