Latanya Sweeney


Latanya Arvette Sweeney is a Professor of the Practice of Government and Technology at Harvard University, the Director of the Data Privacy Lab in the Institute for Quantitative Social Science at Harvard, and the Faculty Dean in Currier House at Harvard. She formerly served as the Chief Technologist of the Federal Trade Commission, a position she held from January 2014 until December 2014. She has made several contributions to privacy technology. Her best-known academic work is on the theory of k-anonymity, and she is credited with the observation that "87% of the U.S. population is uniquely identified by date of birth, gender, postal code."
Sweeney develops technology to assess and solve societal problems and teaches others how to use the same technology. She has made several discoveries related to identifiability and privacy technologies. Her work has received awards from numerous organizations, including the American Psychiatric Association, the American Medical Informatics Association, and the Blue Cross Blue Shield Association. Her work was praised in the TAPAC Report that reviewed the Total Information Awareness Project of DARPA. She has testified before the Privacy and Integrity Advisory Committee of the Department of Homeland Security and before the European Union Commission. Sweeney was previously a Distinguished Career Professor of Computer Science, Technology and Policy in the School of Computer Science at Carnegie Mellon University. In 2001, she became the first African American woman to earn a PhD in computer science from the Massachusetts Institute of Technology. She completed her undergraduate degree in computer science at Harvard University.

Education

Sweeney attended Dana Hall Schools in Wellesley, Massachusetts, where she received her high school diploma in 1977 and delivered the valedictory at the graduation ceremony.
She pursued undergraduate studies at the Massachusetts Institute of Technology (MIT), where she focused on electrical engineering and computer science.
She then studied computer science at Harvard Extension School, where she received an ALB degree, cum laude. Her undergraduate research thesis was titled “A Coin Toss: the Dialectical Odds Aren't Always 50/50”. She received honors grades in all courses and completed graduate courses in computer science, mathematics, physics, educational psychology, and philosophy. She also delivered the graduation speech.
Sweeney returned to MIT for her master's degree, receiving an S.M. in Electrical Engineering and Computer Science in 1997 with a GPA of 4.9/5.0. Her master's thesis, “Sprees, a Finite-State Orthographic Learning System that Recognizes and Generates Phonologically Similar Spellings”, was a finalist in MasterWorks. She continued at MIT for her Ph.D. in computer science, completing her thesis, “Computational Disclosure Control: Theory and Practice”, and receiving the degree in 2001.

Career and research

In 2001, Sweeney founded the Data Privacy Lab at Carnegie Mellon University and became its director. In 2004, she founded the Journal of Privacy Technology, becoming its editor-in-chief in 2006. She was a member of the Program Committee for Modeling Decisions for Artificial Intelligence in 2005.
In her PhD dissertation at MIT, Sweeney examines computational methods for disseminating anonymous data without revealing any identifying, or potentially identifying, information. She proposes novel approaches for secure data disclosure, defining and describing the null-map, k-map and wrong-map models of protection. She then critiques and compares four computational systems on their capacity to protect private information: her Scrub System, her Datafly II System, Statistics Netherlands’ μ-Argus System, and her k-Similar algorithm, which she concludes is the most effective at minimizing privacy risks. Prior to her dissertation, Sweeney had already published numerous times on topics pertaining to healthcare data security, and had completed a master's thesis at MIT and an ALB thesis at Harvard. She remains a prominent data security researcher and continues to work to advance the field.

Most recent publication

In a study of 110 mobile apps across 9 categories, Sweeney found that many mobile apps transmit sensitive personal data to third-party domains, particularly names, locations, and email addresses. According to the study, Android apps send potentially sensitive personal data to an average of 3.1 third-party domains, while iOS apps connect to an average of 2.6 third-party domains. Besides raising awareness of current privacy-leakage risks, the study prompts consideration of possible future data-sharing permission systems on mobile phones.
In 2016, L. Sweeney, M. Bar-Sinai, and M. Crosas presented "Data Tags, Data Handling Policy Spaces and the Tags Language" at the IEEE Security and Privacy Workshops 2016 in San Jose, CA. The paper introduces the Tags programming language and toolset, which use questionnaires to suggest data handling policies appropriate to the level of security a dataset requires. The Tags language and tools simplify the development of security policies by recommending policies that meet the legal requirements, such as HIPAA, that apply to the dataset.

Early publication and challenges

In 1997, Sweeney conducted her first re-identification experiment, in which she successfully linked then-Massachusetts governor William Weld to his medical records using publicly accessible records. Her results had a significant impact on privacy-centered policymaking, including the health privacy legislation HIPAA; however, publication of the experiment was rejected twenty times. The several re-identification experiments she conducted afterward met serious publication challenges as well. In fact, a court ruling in Southern Illinoisan v. Department of Public Health barred her from publishing and sharing her methods for a successful re-identification experiment. Fear of publicly exposing a serious issue with no known solution fueled the majority of the backlash against publication of her works and similar re-identification experiments for over a decade. Unless an experiment concluded that no risk existed or that the issue could be resolved through reasonable technological advancement, publication was largely denied.
In one of her articles, Sweeney discusses a research project in which she matched identities to personal health records through a number of methods, which she explains in depth, including the use of public health records from hospitals and newspaper stories. Towards the end of the article, she describes the different approaches she used to analyze and match the data, relying either on computer programs or on human effort. She concludes that new and improved methods of data sharing are necessary.

Data Privacy Lab

Since 2011, Sweeney's Data Privacy Lab has been conducting research on data privacy at Harvard. It aims to provide a cross-disciplinary perspective on privacy in the process of disseminating data. The lab is sponsored by government, corporate, and nonprofit organizations, and it partners with the Berkman Klein Center, the Institute for Quantitative Social Science, the Center for Research on Computation and Society, and the Program on Informational Sciences. One of the lab's missions is to foster conversation about data in technology and about policies for protecting personal data.
The Data Privacy Lab is working on 102 different projects regarding data privacy, including the Genomic Privacy Project, the Discrimination in Online Ad Delivery Project, the Privacy-Enhanced Linking Project, and the Identifiability Project. The Genomic Privacy Project examines the privacy of genetic code and the use of genetic code to identify individuals. The Discrimination in Online Ad Delivery Project examines the possibility of discrimination in the ads that appear when searching for a particular individual, since some searches may yield ads that are discriminatory toward racial minorities. The Privacy-Enhanced Linking Project attempts to create algorithms that automatically protect privacy in the process of linking, which is the chain of searches that can be traced. The Identifiability Project examines how individuals can be identified through the use of public census data. Sweeney argues that individuals can be identified from population census data through the combination of ZIP code, gender, and date of birth.
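The identifiability argument above, and the notion of k-anonymity mentioned in the lead, can be illustrated with a minimal Python sketch. The records below are invented toy data, and the helper names (group_sizes, unique_fraction, k_of, generalize) are hypothetical. The generalization step (truncating the ZIP code and keeping only the birth year) is one common way to enlarge groups, shown under these assumptions and not as Sweeney's own algorithm.

```python
from collections import Counter

# Hypothetical toy records: (ZIP code, gender, date of birth). Not real data.
RECORDS = [
    ("15213", "F", "1961-07-04"),
    ("15217", "F", "1961-02-10"),
    ("15213", "M", "1950-09-21"),
    ("15217", "M", "1950-11-02"),
    ("15217", "M", "1972-03-14"),
    ("15217", "M", "1972-03-14"),
]


def group_sizes(records):
    """Count how many records share each quasi-identifier combination."""
    return Counter(records)


def unique_fraction(records):
    """Fraction of records whose combination appears exactly once."""
    sizes = group_sizes(records)
    unique = sum(1 for record in records if sizes[record] == 1)
    return unique / len(records)


def k_of(records):
    """Largest k for which the records are k-anonymous (smallest group size)."""
    return min(group_sizes(records).values())


def generalize(record):
    """Coarsen a record: keep a 3-digit ZIP prefix and the birth year only."""
    zip_code, gender, dob = record
    return (zip_code[:3] + "**", gender, dob[:4])


if __name__ == "__main__":
    coarse = [generalize(record) for record in RECORDS]
    print("raw:         unique fraction", unique_fraction(RECORDS), "k =", k_of(RECORDS))
    print("generalized: unique fraction", unique_fraction(coarse), "k =", k_of(coarse))
```

In this toy data, four of the six raw records are unique on the three fields, so k is 1. After generalization every record shares its values with one other record, so the table is 2-anonymous, at the cost of less precise data.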

Recognition

In 2017, Forbes named Sweeney one of the most influential women in technology and artificial intelligence because her research showed that online advertising discriminates against people whose names are typically associated with the black community.