CAVG
- class CAVG
Bases: object
(Normalized) Average Equivalence Class Size (\(C_{AVG}\)).
\(C_{AVG}\) estimates the trade-off between information loss and privacy protection based on the average size of the equivalence classes. A value closer to 1 indicates a more balanced trade-off. \(C_{AVG}\) assumes that a minimal-loss k-anonymization algorithm produces equivalence classes that are all of size exactly k. In other words, an equivalence class larger than k reflects over-anonymization that caused unnecessary information loss.
\[C_{AVG} = \frac{|D|}{|EQs| * k}\]
where \(|D|\) is the size of the data and \(|EQs|\) is the number of equivalence classes.
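The formula above can be checked by hand with a small sketch. The helper name `c_avg` is hypothetical and is not part of this class; it only evaluates the expression \(\frac{|D|}{|EQs| * k}\) for given counts.

```python
def c_avg(num_records: int, num_classes: int, k: int) -> float:
    """Evaluate |D| / (|EQs| * k) for the given counts (illustrative helper)."""
    if num_classes <= 0 or k <= 0:
        raise ValueError("num_classes and k must be positive")
    return num_records / (num_classes * k)


# 12 records in 4 classes of size 3, with k = 3: every class has exactly k records.
ideal = c_avg(num_records=12, num_classes=4, k=3)

# The same 12 records in 2 classes of size 6, with k = 3: classes are twice as large as needed.
coarse = c_avg(num_records=12, num_classes=2, k=3)

print(ideal, coarse)
```

Under these assumed inputs, the first case yields 1.0 (all classes of size k) and the second yields 2.0, reflecting classes that are larger than k requires.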
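Methods
calculate(data, qids_idx, k) – Calculate CAVG score from the data.
calculate_best_effort(org_data[, k]) – Calculate the best-effort CAVG.
calculate_from_equivalence_classes(equivalence_classes, k) – Calculate CAVG from equivalence classes.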
- static calculate(data: DataFrame | ndarray, qids_idx: list, k: int)
Calculate CAVG score from the data.
- Parameters:
data (DataFrame or ndarray) – The data to inspect.
qids_idx (list) – The column indices of the QID attributes.
k (int) – The privacy parameter k.
- Returns:
float – The calculated CAVG.
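As a rough illustration of what computing CAVG from raw data involves, the sketch below groups records by their QID values and applies the formula. It is not the library's implementation: the function name `c_avg_from_rows` is hypothetical, and it works on plain lists of rows rather than a DataFrame or ndarray so that it stays self-contained.

```python
from collections import Counter


def c_avg_from_rows(rows, qids_idx, k):
    """Group rows by their QID values and return |D| / (|EQs| * k)."""
    # Rows that share the same tuple of QID values form one equivalence class.
    class_sizes = Counter(tuple(row[i] for i in qids_idx) for row in rows)
    if not class_sizes:
        raise ValueError("data must contain at least one record")
    return len(rows) / (len(class_sizes) * k)


# Made-up, already generalized records: columns 0 and 1 are treated as the QIDs.
rows = [
    ["20-29", "130**", "flu"],
    ["20-29", "130**", "cold"],
    ["30-39", "148**", "flu"],
    ["30-39", "148**", "asthma"],
    ["30-39", "148**", "cold"],
]

score = c_avg_from_rows(rows, qids_idx=[0, 1], k=2)
print(score)
```

Here the five records fall into two equivalence classes (sizes 2 and 3), so with k = 2 the score is 5 / (2 * 2) = 1.25.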
- static calculate_best_effort(org_data: DataFrame, k: int = 1)
Calculate the best-effort CAVG.
The best achievable CAVG occurs when the data records are distributed as evenly as possible into \(int(\frac{|D|}{k})\) equivalence classes, which is the largest number of classes that can each hold at least k records.
\[C_{AVG}\_best = \frac{|D|}{int(\frac{|D|}{k}) * k}\]
where \(|D|\) is the size of the data.
- Parameters:
org_data (DataFrame) – The original data.
k (int, default 1) – The privacy parameter k.
- Returns:
float – The calculated best-effort CAVG.
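The best-effort formula can be sketched directly from the record count. The helper name `best_effort_c_avg` is hypothetical, and the sketch assumes that \(int(\cdot)\) in the formula means truncation, which for positive integers is the same as floor division.

```python
def best_effort_c_avg(num_records: int, k: int = 1) -> float:
    """Evaluate |D| / (int(|D| / k) * k) (illustrative helper)."""
    if k <= 0:
        raise ValueError("k must be positive")
    # Largest number of classes that can each hold at least k records.
    max_classes = num_records // k
    if max_classes == 0:
        raise ValueError("need at least k records to form one equivalence class")
    return num_records / (max_classes * k)


divisible = best_effort_c_avg(12, k=3)      # 12 records split into 4 classes of exactly 3
not_divisible = best_effort_c_avg(10, k=3)  # at most 3 classes, one must hold 4 records

print(divisible, not_divisible)
```

When \(|D|\) is a multiple of k the best-effort value is exactly 1; otherwise it is slightly above 1 (10/9 in the second case) because the leftover records must join existing classes.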
- static calculate_from_equivalence_classes(equivalence_classes: list, k: int)
Calculate CAVG from equivalence classes.
- Parameters:
equivalence_classes (list[{qid, count}]) – A list of dictionaries, where each dictionary contains a ‘count’ key representing the size of an equivalence class.
k (int) – The privacy parameter k.
- Returns:
float – The calculated CAVG.
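When the equivalence classes are already known, the same score follows from their sizes alone. The sketch below is an assumption-laden illustration rather than the library's code: the function name `c_avg_from_classes` is hypothetical, and the `'qid'` values shown are made up, since only the `'count'` key is described above.

```python
def c_avg_from_classes(equivalence_classes, k):
    """Sum the 'count' of each class to get |D|, then apply |D| / (|EQs| * k)."""
    if not equivalence_classes:
        raise ValueError("at least one equivalence class is required")
    total_records = sum(ec["count"] for ec in equivalence_classes)
    return total_records / (len(equivalence_classes) * k)


# Hypothetical input: each dictionary describes one equivalence class.
classes = [
    {"qid": ("20-29", "130**"), "count": 2},
    {"qid": ("30-39", "148**"), "count": 3},
]

score_from_classes = c_avg_from_classes(classes, k=2)
print(score_from_classes)
```

With two classes covering five records and k = 2, the result is 5 / (2 * 2) = 1.25.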
See also
k_anonymization.evaluation.anonymity.get_equivalence_classes – Get all equivalence classes.