Talk: Ownership protection of data and machine learning models
Ownership protection of data and machine learning models – Watermarking and Fingerprinting
Outsourcing data storage and complex Machine Learning (ML) models to cloud services has grown considerably in recent years, because it lowers the cost of producing, maintaining, and processing data. However, data is usually expensive to collect or create in terms of time, money, or expert effort, and it can also be sensitive, as with data in the medical domain. Furthermore, training ML models usually requires vast amounts of data and computational resources.
Because of this, data and ML models are considered valuable assets, and sharing them exposes them to potential intellectual property theft. Watermarking and fingerprinting are approaches for protecting ownership of various types of digital property, including those relevant to the ML process: the data itself and the ML models. By embedding a mark into a digital object, these methods let owners share the object in its full form while still being able to claim ownership and/or trace recipients. Multiple methods have been proposed for watermarking and fingerprinting data and ML models.
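To make the idea of embedding a mark concrete, here is a minimal sketch of a keyed least-significant-bit watermark on integer data. It is a toy illustration and not one of the methods covered in the talk; the function names, the secret key, and the choice to mark roughly one value in four are all assumptions made for this example.

```python
import hashlib
import hmac


def _keyed_choice(secret_key, index):
    """Derive from the key whether position `index` is marked and with which bit."""
    digest = hmac.new(secret_key, str(index).encode(), hashlib.sha256).digest()
    selected = digest[0] % 4 == 0  # assumption: mark about one value in four
    bit = digest[1] & 1
    return selected, bit


def embed(values, secret_key):
    """Return a copy of `values` with key-dependent least significant bits."""
    marked = []
    for i, v in enumerate(values):
        selected, bit = _keyed_choice(secret_key, i)
        marked.append((v & ~1) | bit if selected else v)
    return marked


def match_rate(values, secret_key):
    """Fraction of key-selected positions whose last bit matches the expected mark."""
    hits = total = 0
    for i, v in enumerate(values):
        selected, bit = _keyed_choice(secret_key, i)
        if selected:
            total += 1
            hits += int((v & 1) == bit)
    return hits / total if total else 0.0


data = list(range(1000, 2000))            # stand-in for a numeric data column
marked = embed(data, b"owner-key")        # hypothetical owner key
owner_rate = match_rate(marked, b"owner-key")
other_rate = match_rate(marked, b"some-other-key")
```

With the owner's key every selected position matches, whereas an unrelated key should match only about half the time, and no value changes by more than one unit. Using a different key per recipient would turn the same idea into a simple fingerprint for tracing who received which copy. A scheme this simple is easy to destroy, for example by rounding or by reordering the values.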
One of the most important requirements for such techniques is robustness, i.e., the marks should not be easily altered or removed by malicious attacks or by benign modification of the digital object. Second, the marks should be as imperceptible as possible, both to make attacks less likely to succeed and to preserve the utility of the digital objects. In this talk you will learn about state-of-the-art methods for watermarking and fingerprinting data and ML models, their vulnerabilities, and the challenges of protecting ownership of digital content in the ML process.