Algebraic-datatype taint tracking, with applications to understanding Android identifier leaks

Document Type

Conference Proceeding

Publication Date

8-20-2021

Abstract

Current taint analyses track flow from sources to sinks, and report the results simply as source → sink pairs, or flows. This is imprecise and ineffective in many real-world scenarios; examples include taint sources that are mutually exclusive, or flows that combine sources (e.g., IMEI and MAC Address are concatenated, hashed, and leaked vs. IMEI and MAC Address hashed separately and leaked separately). These shortcomings are particularly acute in the context of Android, where sensitive identifiers can be combined, processed, and then leaked in complicated ways. To address these issues, we introduce a novel, algebraic-datatype taint analysis that generates rich yet concise taint signatures involving AND, XOR, and hashing, akin to algebraic (product and sum) types. We implemented our approach as a static analysis for Android that derives app leak signatures: an algebraic representation of how, and where, hardware/software identifiers are manipulated before being exfiltrated to the network. We perform six empirical studies of algebraic-datatype taint tracking on 1,000 top apps from Google Play and their embedded libraries, including: discerning between "raw" and hashed flows, which eliminates a source of imprecision in current analyses; finding apps and libraries that go against Google Play's guidelines by (ab)using hardware identifiers; showing that third-party code, rather than app code, is the predominant source of leaks; exposing potential de-anonymization practices; and quantifying how apps have become more privacy-friendly over the past two years.
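
To make the idea of a taint signature concrete, the following is a minimal Python sketch of how AND, XOR, and hashing could be modeled as an algebraic datatype. It is an illustration only, not the paper's implementation: the class names (Source, And, Xor, Hash) and the helpers (render, sources, leaks_raw) are hypothetical and chosen for this example. It shows that two leaks with identical source sets can still have different signatures.

```python
from dataclasses import dataclass
from typing import FrozenSet, Union


# All names below are hypothetical; they are not taken from the paper's tool.
@dataclass(frozen=True)
class Source:
    """A raw taint source, e.g. a hardware identifier."""
    name: str


@dataclass(frozen=True)
class And:
    """Product-like case: both operands are combined (e.g. concatenated)."""
    left: "Sig"
    right: "Sig"


@dataclass(frozen=True)
class Xor:
    """Sum-like case: exactly one operand flows (mutually exclusive sources)."""
    left: "Sig"
    right: "Sig"


@dataclass(frozen=True)
class Hash:
    """The operand is hashed before it continues toward the sink."""
    inner: "Sig"


Sig = Union[Source, And, Xor, Hash]


def render(sig: Sig) -> str:
    """Print a signature in prefix notation."""
    if isinstance(sig, Source):
        return sig.name
    if isinstance(sig, Hash):
        return f"hash({render(sig.inner)})"
    if isinstance(sig, And):
        return f"AND({render(sig.left)}, {render(sig.right)})"
    if isinstance(sig, Xor):
        return f"XOR({render(sig.left)}, {render(sig.right)})"
    raise TypeError(f"not a signature: {sig!r}")


def sources(sig: Sig) -> FrozenSet[str]:
    """Flatten a signature to the plain set of sources it mentions."""
    if isinstance(sig, Source):
        return frozenset({sig.name})
    if isinstance(sig, Hash):
        return sources(sig.inner)
    return sources(sig.left) | sources(sig.right)


def leaks_raw(sig: Sig) -> bool:
    """True if some source may reach the sink without passing through a hash."""
    if isinstance(sig, Source):
        return True
    if isinstance(sig, Hash):
        return False
    return leaks_raw(sig.left) or leaks_raw(sig.right)


imei, mac = Source("IMEI"), Source("MAC")

# Case 1: identifiers concatenated, then hashed, then leaked as one value.
combined = Hash(And(imei, mac))

# Case 2: each identifier hashed and leaked on its own (two signatures).
separate = [Hash(imei), Hash(mac)]

print(render(combined))
print([render(s) for s in separate])
```

Both cases flatten to the same source set, {IMEI, MAC}, so a plain source → sink report could not tell them apart, while the structured signatures differ. The leaks_raw helper hints at how a "raw" versus hashed distinction could be read off such a structure under these assumptions.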

Identifier

85116251941 (Scopus)

ISBN

9781450385626

Publication Title

ESEC/FSE 2021: Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering

External Full Text Location

https://doi.org/10.1145/3468264.3468550

First Page

70

Last Page

82

Grant

CNS-1617584

Fund Ref

National Science Foundation
