anonymity

k-anonymity privacy metric.

Functions

find_not_k_anonymous_qids

Find equivalence classes that violate k-anonymity.

get_equivalence_classes

Get all equivalence classes.

get_k_anonymity

Get the level of k-anonymity of the given data.

is_k_anonymous

Check whether the data satisfies k-anonymity.

find_not_k_anonymous_qids(data: DataFrame | ndarray, k: int = 2, qids_idx: list = [])

Find equivalence classes that violate k-anonymity.

Parameters:
  • data (DataFrame or ndarray) – The data to inspect.

  • k (int, default 2) – The privacy parameter k.

  • qids_idx (list, optional) – The column indices of the QID attributes. If not provided, consider all columns as QID attributes.

Returns:

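list[{qid, count}] – A list of dictionaries {qid, count}, one per equivalence class that contains fewer than k records. Here qid is the combination of QID values that defines the class and count is the number of records sharing it.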
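The check amounts to grouping records by their QID values and keeping the groups with fewer than k records. Below is a minimal pandas sketch of that idea. The helper name `sketch_violations` and the sample columns are hypothetical, and this is not the library's implementation, whose exact return values may differ.

```python
from collections import Counter

import pandas as pd


def sketch_violations(data, k=2, qids_idx=None):
    """Hypothetical helper: QID combinations shared by fewer than k records."""
    df = pd.DataFrame(data)
    # An empty or missing qids_idx means every column is treated as a QID.
    idx = list(qids_idx) if qids_idx else list(range(df.shape[1]))
    counts = Counter(df.iloc[:, idx].itertuples(index=False, name=None))
    return [{"qid": qid, "count": n} for qid, n in counts.items() if n < k]


# Illustrative data: columns 0 and 1 are the QIDs, column 2 is sensitive.
df = pd.DataFrame({
    "zip": ["130**", "130**", "148**", "148**", "148**"],
    "age": ["20-29", "20-29", "30-39", "30-39", "40-49"],
    "diagnosis": ["flu", "cold", "flu", "asthma", "flu"],
})
violations = sketch_violations(df, k=2, qids_idx=[0, 1])
print(violations)  # [{'qid': ('148**', '40-49'), 'count': 1}]
```

Only the record with the QID combination ('148**', '40-49') is unique, so it is the single violation for k = 2.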

get_equivalence_classes(data: DataFrame | ndarray, qids_idx: list = [])

Get all equivalence classes.

Parameters:
  • data (DataFrame or ndarray) – The data to inspect.

  • qids_idx (list, optional) – The column indices of the QID attributes. If not provided, consider all columns as QID attributes.

Returns:

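list[{qid, count}] – A list of dictionaries {qid, count}, one per equivalence class. Here qid is the combination of QID values that defines the class and count is the number of records sharing it.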
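An equivalence class is the set of records that share the same values on all QID columns. The sketch below counts those classes for an ndarray input. The helper name `sketch_equivalence_classes` and the sample values are hypothetical, and the code illustrates the concept rather than reproducing the library's implementation.

```python
from collections import Counter

import numpy as np
import pandas as pd


def sketch_equivalence_classes(data, qids_idx=None):
    """Hypothetical helper: count records per distinct QID combination."""
    df = pd.DataFrame(data)
    # An empty or missing qids_idx means every column is treated as a QID.
    idx = list(qids_idx) if qids_idx else list(range(df.shape[1]))
    counts = Counter(df.iloc[:, idx].itertuples(index=False, name=None))
    return [{"qid": qid, "count": n} for qid, n in counts.items()]


# Illustrative data: columns 0 and 1 are the QIDs, column 2 is sensitive.
data = np.array([
    ["130**", "20-29", "flu"],
    ["130**", "20-29", "cold"],
    ["148**", "30-39", "flu"],
    ["148**", "30-39", "asthma"],
    ["148**", "40-49", "flu"],
])
classes = sketch_equivalence_classes(data, qids_idx=[0, 1])
for entry in classes:
    print(entry["qid"], entry["count"])
```

With QID columns 0 and 1, the five records fall into three classes of sizes 2, 2, and 1.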

get_k_anonymity(data: DataFrame | ndarray, qids_idx: list = [])

Get the level of k-anonymity of the given data.

Parameters:
  • data (DataFrame or ndarray) – The data to inspect.

  • qids_idx (list, optional) – The column indices of the QID attributes. If not provided, consider all columns as QID attributes.

Returns:

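int – The level of k-anonymity, i.e. the largest k that the data satisfies. This equals the size of the smallest equivalence class.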
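Because a dataset is k-anonymous exactly when every equivalence class holds at least k records, its level is the size of the smallest class. A minimal sketch using `groupby`, assuming a DataFrame input; `sketch_k_level` and the sample data are hypothetical and not taken from the library.

```python
import pandas as pd


def sketch_k_level(df, qids_idx=None):
    """Hypothetical helper: size of the smallest equivalence class."""
    # An empty or missing qids_idx means every column is treated as a QID.
    cols = [df.columns[i] for i in qids_idx] if qids_idx else list(df.columns)
    # dropna=False keeps records with missing QID values in their own class.
    return int(df.groupby(cols, dropna=False).size().min())


df = pd.DataFrame({
    "zip": ["130**", "130**", "148**", "148**", "148**"],
    "age": ["20-29", "20-29", "30-39", "30-39", "40-49"],
    "diagnosis": ["flu", "cold", "flu", "asthma", "flu"],
})
print(sketch_k_level(df, qids_idx=[0]))     # 2: zip alone gives classes of 2 and 3
print(sketch_k_level(df, qids_idx=[0, 1]))  # 1: one (zip, age) pair is unique
```

Adding a QID column can only split classes further, so the level never increases as more columns are treated as QIDs.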

is_k_anonymous(data: DataFrame | ndarray, k: int = 2, qids_idx: list = [])

Check whether the data satisfies k-anonymity.

Parameters:
  • data (DataFrame or ndarray) – The data to inspect.

  • k (int, default 2) – The privacy parameter k.

  • qids_idx (list, optional) – The column indices of the QID attributes. If not provided, consider all columns as QID attributes.

Returns:

bool – Whether or not the data satisfies k-anonymity.
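The test reduces to checking that no equivalence class has fewer than k records. The following sketch shows one way to express that with pandas, assuming a DataFrame input; `sketch_is_k_anonymous` and the sample data are hypothetical, and the library's own implementation may differ.

```python
import pandas as pd


def sketch_is_k_anonymous(df, k=2, qids_idx=None):
    """Hypothetical helper: True if every equivalence class has >= k records."""
    # An empty or missing qids_idx means every column is treated as a QID.
    cols = [df.columns[i] for i in qids_idx] if qids_idx else list(df.columns)
    # dropna=False keeps records with missing QID values in their own class.
    sizes = df.groupby(cols, dropna=False).size()
    return bool((sizes >= k).all())


df = pd.DataFrame({
    "zip": ["130**", "130**", "148**", "148**", "148**"],
    "age": ["20-29", "20-29", "30-39", "30-39", "40-49"],
    "diagnosis": ["flu", "cold", "flu", "asthma", "flu"],
})
print(sketch_is_k_anonymous(df, k=2, qids_idx=[0]))     # True
print(sketch_is_k_anonymous(df, k=2, qids_idx=[0, 1]))  # False
```

The same answer follows from comparing the smallest class size against k directly.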