Practical Static Analysis for Privacy Bugs

Posted on 03 August 2025.

Privacy bugs are deeply problematic for software users (“once it’s out there you can’t take it back”), legally significant (due to laws like the GDPR), and difficult for programmers to find and to keep out. Static program analysis would therefore appear to be very helpful here.

Unfortunately, making an effective tool runs into several problems:

  • Rules need to be expressed in a way that is auditable by legal and policy experts, which means they cannot be too close to the level of code.

  • Policies need to then be mapped to the actual code.

  • The analysis needs to be as precise as possible.

  • The analysis needs to also be as quick as possible—ideally, fast enough to integrate into interactive development.

Our new system, Paralegal, checks off all these boxes:

  • It supports policies expressed in a first-order logic, but written in a stylized English form. For instance, here is a policy stating that there is a way for all personal data to be deleted:
    Somewhere:
    1. For each "user data" type marked user_data:
    A. There is a "source" that produces "user data" where:
      a. There is a "deleter" marked deletes where:
        i) "source" goes to "deleter"
    
  • It introduces markers as a key abstraction for mapping the policy onto the program. Several of the terms above — e.g., user_data — are designated in a lightweight way in the source program:
    #[paralegal::marker(user_data)]
    
  • It deeply leverages Rust’s type system to obtain useful summaries of function behavior without having to traverse their bodies. In particular, this avoids the pain of writing mock versions of functions, which are time-consuming and error-prone, without sacrificing correctness or precision. It also has some additional optimizations, such as adaptive approximation.

As a result, Paralegal is able to efficiently and effectively analyze several third-party, real-world codebases.

For more details, see our paper!