Quickstart
Installation
Install via pip:
pip install syndat
Basic Usage
import pandas as pd
from syndat.metrics import (
jensen_shannon_distance,
normalized_correlation_difference,
discriminator_auc
)
real = pd.DataFrame({
'feature1': [1, 2, 3, 4, 5],
'feature2': ['A', 'B', 'A', 'B', 'C']
})
synthetic = pd.DataFrame({
'feature1': [1, 2, 2, 3, 3],
'feature2': ['A', 'B', 'A', 'C', 'C']
})
print(jensen_shannon_distance(real, synthetic))
>> {'feature1': 0.4990215421876156, 'feature2': 0.22141025172133794}
print(normalized_correlation_difference(real, synthetic))
>> 0.24571345029108108
print(discriminator_auc(real, synthetic))
>> 0.6
# JSD score is being aggregated over all features
distribution_similarity_score = syndat.scores.distribution(real, synthetic)
discrimination_score = syndat.scores.discrimination(real, synthetic)
correlation_score = syndat.scores.correlation(real, synthetic)
# plot *all* feature distribution and store image files
syndat.visualization.plot_distributions(real, synthetic, store_destination="results/plots")
syndat.visualization.plot_correlations(real, synthetic, store_destination="results/plots")
# plot and display specific feature distribution plot
syndat.visualization.plot_numerical_feature("feature_xy", real, synthetic)
syndat.visualization.plot_numerical_feature("feature_xy", real, synthetic)
# plot a shap plot of differentiating feature for real and synthetic data
syndat.visualization.plot_shap_discrimination(real, synthetic)
# postprocess synthetic data
synthetic_post = syndat.postprocessing.assert_minmax(real, synthetic)
synthetic_post = syndat.postprocessing.normalize_float_precision(real, synthetic)
Modules
syndat.metrics: Metrics for the evaluation of synthetic data fidelity.syndat.scores: Scoring functions that normalize metrics for easier comparison.syndat.visualization: Visualize feature distributions correlation and SHAP analysis.syndat.postprocessing: Optional cleaning and formatting helpers.
Examples
See the examples/ folder on GitHub for end-to-end demos: https://github.com/SCAI-BIO/syndat/tree/main/examples