Browser Fingerprinting: How to Protect Machine Learning Models and Data with Differential Privacy?

Authors

  • Katharina Dietz, University of Würzburg
  • Michael Mühlhauser, University of Bamberg
  • Michael Seufert, University of Würzburg
  • Nicholas Gray, University of Würzburg
  • Tobias Hoßfeld, University of Würzburg
  • Dominik Herrmann, University of Bamberg

DOI:

https://doi.org/10.14279/tuj.eceasst.80.1179

Abstract

As modern communication networks grow increasingly complex, manually maintaining an overview of deployed software and hardware is challenging. Mechanisms such as fingerprinting are used to automatically extract information from ongoing network traffic and map it to a specific device or application, e.g., a browser. Active approaches directly interfere with the traffic and pose security risks, or are simply infeasible. Therefore, passive approaches are employed, which only monitor traffic but require a well-designed feature set since less information is available. However, even these passive approaches pose privacy risks: identifying the browser from encrypted traffic may lead to data leakage, e.g., of users' browser histories. We propose a passive browser fingerprinting method based on explainable features and evaluate two privacy protection mechanisms, namely differentially private classifiers and differentially private data generation. With a differentially private Random Decision Forest, we achieve an accuracy of 0.877. If we train a non-private Random Forest on differentially private synthetic data, we reach an accuracy of up to 0.887, showing a reasonable trade-off between utility and privacy.
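The first of the two mechanisms, a differentially private classifier, can be illustrated with a short sketch. The snippet below is not the authors' implementation: it assumes IBM's diffprivlib library, placeholder feature data, and an illustrative privacy budget (epsilon = 1.0), and it contrasts the private model with a non-private baseline. The second mechanism, differentially private synthetic data generation, is not shown here.

```python
# Minimal sketch of a differentially private random forest classifier.
# NOT the paper's implementation: library choice (diffprivlib), data,
# epsilon, and all hyperparameters are illustrative assumptions.
import numpy as np
from diffprivlib.models import RandomForestClassifier as DPRandomForest
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(2000, 10))   # placeholder traffic features
y = rng.integers(0, 4, size=2000)            # placeholder browser labels

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Differentially private classifier with an illustrative privacy budget.
# diffprivlib may warn that feature bounds/classes should be given up front
# to avoid leaking information about the training data.
dp_clf = DPRandomForest(n_estimators=20, epsilon=1.0)
dp_clf.fit(X_train, y_train)
print("DP accuracy:", accuracy_score(y_test, dp_clf.predict(X_test)))

# Non-private baseline, useful for judging the utility/privacy trade-off.
clf = RandomForestClassifier(n_estimators=20, random_state=0)
clf.fit(X_train, y_train)
print("Non-private accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```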

Published

2021-09-08

How to Cite

[1] K. Dietz, M. Mühlhauser, M. Seufert, N. Gray, T. Hoßfeld, and D. Herrmann, “Browser Fingerprinting: How to Protect Machine Learning Models and Data with Differential Privacy?”, Electronic Communications of the EASST (eceasst), vol. 80, Sep. 2021.