“Sensitivity” in Differential Privacy

Original article was published by shaistha fathima on Becoming Human: Artificial Intelligence Magazine

Query “Sensitivity” types and effects on Differential Privacy Mechanism

Summary: Sometimes knowing the basics is important! This is the fourth blog post of the “Differential Privacy Basics Series”, covering the sensitivity of a query and its effects on Differential Privacy. Thank you Jacob Merryman for the amazing graphic used in this blog post. For more posts like these on Differential Privacy, follow Shaistha Fathima on Twitter.

Differential Privacy Basics Series

We have seen that Differential Privacy (DP) adds statistical noise to the data for better privacy. But how much noise should we add? That depends on the following four factors:

  • Sensitivity of the query.
  • Desired epsilon (ε).
  • Desired delta (δ).
  • Type of noise to be added (most commonly Gaussian or Laplacian noise, of which Laplacian noise is the easier to use).

In this blog post we will focus on SENSITIVITY. To learn about epsilon and delta, please look at (Part-3) Differential Privacy Definition of this series.


Sensitivity

Sensitivity parametrizes how much noise perturbation is required in the DP mechanism. To determine the sensitivity, the maximum possible change in the result needs to be calculated.

Generally, sensitivity refers to the impact a change in the underlying dataset can have on the result of the query. Let xA, xB be any pair of datasets from the universe of all possible datasets X, differing in at most one element. Then the sensitivity of a query f is calculated with the following equation:

Δf = max ‖f(xA) − f(xB)‖1

where ‖.‖1 is the L1-norm distance between datasets differing in at most one element.

The sensitivity of a query can be stated both in the global and local context:

  • Global sensitivity considers all possible datasets differing in at most one element.
  • Local sensitivity, in contrast, considers changes to one specific dataset, again differing in at most one element.

Global Sensitivity

Global sensitivity refers to the maximum difference in output that a query function can produce when one change is made to any dataset.

Global sensitivity determines the magnitude of the noise needed in order to meet the ε-DP requirements.

Given a query function f operating on a dataset D, the global sensitivity is the maximum result difference over all pairs of datasets (D1, D2) with at most one different entry:

GS(f) = max(D1,D2) ‖f(D1) − f(D2)‖1

where ‖.‖1 is the L1-norm distance between datasets differing in at most one element, and the maximum is taken over f(D1) − f(D2) for all such datasets D1, D2.
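As a small sketch (a hypothetical example, not from the original post), the definition can be checked empirically for a count query: removing any single row changes the count by exactly 1, so GS(count) = 1.

```python
# Sketch: empirically check that a count query has global sensitivity 1.
# Removing any single row from a dataset changes the count by exactly 1.

def count_query(dataset):
    return len(dataset)

dataset = [23, 45, 31, 62, 18]  # hypothetical ages

# Neighboring datasets: remove each row in turn and compare outputs
diffs = [abs(count_query(dataset) - count_query(dataset[:i] + dataset[i + 1:]))
         for i in range(len(dataset))]

print(max(diffs))  # 1 -> GS(count) = 1
```

The same loop structure works for any query function, though for most queries a loop over one dataset's neighbors only estimates the local, not the global, sensitivity.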

Note: Global sensitivity is the maximum difference in output over all possible datasets, and is therefore dependent only on the query, not on any particular dataset.

This has wide consequences for the utility of certain queries. Consider a simple query that takes the sum of a dataset in which every entry can, in theory, contribute an arbitrarily large amount. Because there is no upper bound on any single entry, the largest possible change in the output is unbounded, and thus the global sensitivity of the sum query is infinite.


However, infinite global sensitivity does not rule out generating accurate answers to sum queries. Bounds can be introduced into the query mechanism that effectively prevent the dataset from contributing values greater than a predetermined threshold.

If the data is modified so that no value can exceed the threshold, the global sensitivity is no longer infinite and depends only on the query function and the threshold. Thus, it can be said that global sensitivity is the minimum sensitivity needed for a query to cover all possible datasets.
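The thresholding idea above can be sketched in code (a hypothetical example; the clipping bound B is an assumption, not from the original post). Clipping every value to at most B bounds the sum query's global sensitivity to B, no matter what any single entry contains:

```python
# Sketch: an unbounded sum query has infinite global sensitivity, because
# one row can contribute an arbitrarily large amount. Clipping each value
# to a threshold B bounds the sensitivity of the sum to B.

B = 100  # hypothetical clipping threshold

def clipped_sum(dataset, bound):
    # Clip each value into [0, bound] before summing
    return sum(min(max(x, 0), bound) for x in dataset)

dataset = [40, 250, 75, 10]

# Adding one row can change the clipped sum by at most B,
# so GS(clipped_sum) = B = 100, regardless of the data.
print(clipped_sum(dataset, B))             # 225 (250 is clipped to 100)
print(clipped_sum(dataset + [10**9], B))   # 325 (the huge row adds only 100)
```

The cost of clipping is bias: clipped values under-report the true sum, so the threshold trades sensitivity (and hence noise) against accuracy.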

When releasing data, queries with low sensitivity, such as count or bounded sum, work well with global sensitivity. Take the count query as an example: it has GS(f) = 1, which is typically much smaller than the true answer. However, for queries like median or average, the global sensitivity is much higher.

Another way of looking at it is:

GS(f) = max{x, x′ : d(x, x′) ≤ 1} |f(x) − f(x′)|

Here, d(x, x′) represents the distance between two datasets x and x′, and we say that two datasets are neighbors if their distance is 1 or less. How this distance is defined has a huge effect on the definition of privacy we obtain.

  • The distance between two datasets should be equal to 1 (i.e. the datasets are neighbors) if they differ in the data of exactly one individual.
  • This idea is easy to formalize in some contexts (e.g. in the US Census, each individual submits a single response containing their data) but extremely challenging in others (e.g. location trajectories, social networks, and time-series data).

Formally, the distance is encoded as the symmetric difference between the two datasets:

d(x, x′) = |x ⊖ x′|

i.e. the number of rows that appear in one dataset but not in the other.

This particular definition has several interesting and important implications, for example:

  • If x’ is constructed from x by adding one row, then d(x, x’) = 1
  • If x’ is constructed from x by removing one row, then d(x, x’) = 1
  • If x’ is constructed from x by modifying one row, then d(x, x’) = 2

In other words, adding or removing a row results in a neighboring dataset at distance 1, whereas modifying a row results in a dataset at a distance 2.
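The three cases above can be sketched directly (a hypothetical example, treating each dataset as a multiset of rows):

```python
# Sketch: symmetric-difference distance between two datasets,
# modeled as multisets of rows.
from collections import Counter

def distance(x, x2):
    a, b = Counter(x), Counter(x2)
    # (a - b) keeps rows only in x; (b - a) keeps rows only in x2
    return sum(((a - b) + (b - a)).values())

x = ["alice", "bob", "carol"]

print(distance(x, x + ["dave"]))              # add one row    -> 1
print(distance(x, ["alice", "bob"]))          # remove one row -> 1
print(distance(x, ["alice", "bob", "dave"]))  # modify one row -> 2
```

The modify case scores 2 because, under the symmetric difference, changing "carol" to "dave" is counted as one removal plus one addition.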

This particular definition of distance results in what is typically called unbounded differential privacy. Many other definitions are possible, including one called bounded differential privacy in which modifying a single row in a dataset does result in a neighboring dataset.

The definition of global sensitivity says that for any two neighboring datasets x and x′, the difference between f(x) and f(x′) is at most GS(f). This measure of sensitivity is called “global” because it is independent of the actual dataset being queried; it holds for any choice of neighboring x and x′.

Other Important Concepts

L1 and L2 Norms

The L1 norm of a vector V of length k is defined as ‖V‖1 = |V1| + … + |Vk| (i.e. the sum of the absolute values of the vector’s elements).

Example: In 2-dimensional space, the L1 norm of the difference between two vectors yields the “Manhattan distance” between them.

The L2 norm of a vector V of length k is defined as ‖V‖2 = √(V1² + … + Vk²) (i.e. the square root of the sum of the squares of the elements).

Example: In 2-dimensional space, this is the “Euclidean distance,” and it is always less than or equal to the L1 distance.
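The two definitions can be sketched in a few lines (a hypothetical example with the vector (3, −4)):

```python
# Sketch: L1 and L2 norms of a vector, matching the definitions above.
import math

def l1_norm(v):
    # Sum of the absolute values of the elements
    return sum(abs(x) for x in v)

def l2_norm(v):
    # Square root of the sum of the squared elements
    return math.sqrt(sum(x * x for x in v))

v = [3, -4]
print(l1_norm(v))  # 7   (the "Manhattan distance" from the origin)
print(l2_norm(v))  # 5.0 (the Euclidean distance, always <= the L1 norm)
```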

L1 VS L2 Sensitivities

The L1 sensitivity of a vector-valued function is equal to the sum of the element-wise sensitivities. For example, if we define a vector-valued function f that returns a length-k vector of 1-sensitive results, then the L1 sensitivity of f is k.

By the same reasoning, a vector-valued function f returning a length-k vector of 1-sensitive results has an L2 sensitivity of √k. For long vectors, the L2 sensitivity is therefore much lower than the L1 sensitivity. For some applications, like machine learning algorithms (which sometimes return vectors with thousands of elements), this difference is significant.
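A quick numeric sketch of the gap (the vector length k is a hypothetical value, chosen to mimic an ML gradient):

```python
# Sketch: for a vector of k queries that are each 1-sensitive,
# the L1 sensitivity is k while the L2 sensitivity is sqrt(k).
import math

k = 10000  # hypothetical vector length, e.g. a gradient in ML

l1_sensitivity = k * 1           # sum of the element-wise sensitivities
l2_sensitivity = math.sqrt(k)    # sqrt of the sum of squared sensitivities

print(l1_sensitivity)  # 10000
print(l2_sensitivity)  # 100.0
```

This 100x gap is why mechanisms calibrated to L2 sensitivity (such as Gaussian noise) are often preferred for high-dimensional outputs.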

Local Sensitivity

Local sensitivity attempts to calculate the sensitivity for a local dataset, where the possible changes are bounded by that specific dataset and not by the universe of all datasets.

Given a query function f operating on a dataset D1, the local sensitivity is the maximum difference that one change in D1 can produce:

LS(f, D1) = max(D2) ‖f(D1) − f(D2)‖1

where D1 is the known dataset and D2 is any dataset with at most one different element (relative to D1).

For queries such as count or range, the local sensitivity is identical to the global sensitivity.
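For a query like the median, however, local sensitivity genuinely depends on the dataset. A sketch (hypothetical data; the candidate values used to approximate "adding one row" are an assumption, since a true maximum would range over all possible rows):

```python
# Sketch: local sensitivity of the median on one specific dataset,
# approximated by trying every single-row removal and a range of
# single-row additions.
import statistics

def local_sensitivity_median(ds, candidates):
    base = statistics.median(ds)
    diffs = []
    # Neighbors formed by removing one row
    for i in range(len(ds)):
        diffs.append(abs(base - statistics.median(ds[:i] + ds[i + 1:])))
    # Neighbors formed by adding one row (approximated over candidates)
    for c in candidates:
        diffs.append(abs(base - statistics.median(ds + [c])))
    return max(diffs)

ds = [1, 5, 8, 20, 100]
print(local_sensitivity_median(ds, candidates=range(0, 200)))  # 6.0
```

On this dataset the answer depends heavily on the gap around the middle values; a differently shaped dataset would give a different local sensitivity, which is exactly the dataset-dependence discussed below.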

Note: Replacing global sensitivity with local sensitivity has been shown to be problematic, and additional limitations are needed in order to satisfy the conditions of ε-Differential Privacy. The magnitude of the noise itself can leak information: since every differentially private algorithm must add noise at least as large as the local sensitivity, the amount of noise reveals information about the dataset to adversaries. Local sensitivity is the minimum sensitivity needed for a query to cover one specific dataset.

You may also check this — A Case Study on Differential Privacy — Sensitivity (pg-19)

Similarly, another way of looking at it is:

Global sensitivity considers any two neighboring datasets, but since we’re going to run our differentially private mechanisms on an actual dataset — shouldn’t we consider neighbors of that dataset?

The intuition behind local sensitivity is to fix one of the two datasets to be the actual dataset being queried, and to consider all of its neighbors.

Local sensitivity is a function of both the query (f) and the actual dataset (x). Unlike in the case of global sensitivity, we can’t talk about the local sensitivity of a function without also considering the dataset at which that local sensitivity occurs.

Local sensitivity allows us to place finite bounds on the sensitivity of some functions whose global sensitivity is difficult to bound. This is because local sensitivity is defined in terms of the actual dataset (for example, its size), information that is not available under global sensitivity.

But the issue with local sensitivity is that, because it itself depends on the dataset, an analyst who knows the local sensitivity of a query at a particular dataset may be able to infer information about that dataset. It is therefore not possible to use local sensitivity directly to achieve differential privacy.

Moreover, keeping the local sensitivity secret from the analyst doesn’t help either. It’s possible to determine the scale of the noise from just a few query answers, and the analyst can use this value to infer the local sensitivity. Differential privacy is designed to protect the output of f(x) — not of the sensitivity measure used in its definition.

To solve this, approaches such as propose-test-release and smooth sensitivity have been proposed for safely using local sensitivity. These are beyond the scope of this blog post, but if you are interested to know more, check these notes shared by Professor Joseph Near.

Now, you might be wondering: why not just go with global sensitivity? Why spend time and effort on local sensitivity and on creating better, safer approaches for it?

Global sensitivity considers any two neighboring datasets; it is the minimum sensitivity needed for a query to cover all possible datasets. But since we will run our differentially private mechanism on an actual dataset, we may want to consider the neighbors of that dataset. Local sensitivity, then, is the minimum sensitivity needed for a query to cover one specific dataset.

That’s it folks! In the next blog post we will look deeper into the noise-adding mechanisms for Differential Privacy.

Till then, you may also check out my other beginner-friendly series:


