syndat.visualization
Visualize feature distributions, correlations, and SHAP analysis.
- syndat.visualization.plot_categorical_feature(feature, real_data, synthetic_data, y_scale='auto')
Plots count plots for a categorical feature from both real and synthetic datasets.
- Parameters:
feature (str) – The feature to be plotted.
real_data (DataFrame) – The real data.
synthetic_data (DataFrame) – The synthetic data.
y_scale (Literal['auto', 'absolute', 'relative']) – Categorical y-axis scale mode:
  - “auto”: uses relative frequencies when the real and synthetic sample sizes differ by at least 1%, otherwise absolute counts.
  - “absolute”: always uses absolute counts.
  - “relative”: always uses relative frequencies (%).
- Return type:
None
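The “auto” rule for y_scale can be sketched in plain Python. This is only an illustration of the documented behaviour, not syndat's implementation: the helper name resolve_y_scale is hypothetical, and measuring the size difference relative to the larger of the two samples is an assumption, since the documentation does not say what the 1% is relative to.

```python
from typing import Literal

YScale = Literal["auto", "absolute", "relative"]


def resolve_y_scale(y_scale: YScale, n_real: int, n_synthetic: int,
                    threshold: float = 0.01) -> str:
    """Hypothetical helper: pick the effective y-axis mode for a count plot."""
    if y_scale not in ("auto", "absolute", "relative"):
        raise ValueError(f"unknown y_scale: {y_scale!r}")
    if y_scale != "auto":
        # Explicit modes are used unchanged.
        return y_scale
    larger = max(n_real, n_synthetic)
    if larger == 0:
        # Nothing to compare; fall back to absolute counts.
        return "absolute"
    # Assumption: the difference is measured relative to the larger sample.
    relative_difference = abs(n_real - n_synthetic) / larger
    return "relative" if relative_difference >= threshold else "absolute"


print(resolve_y_scale("auto", 1000, 1000))
print(resolve_y_scale("auto", 1000, 900))
```

Under these assumptions, equally sized samples keep absolute counts, while a 10% size difference switches the plot to relative frequencies so that the bars remain comparable.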
- syndat.visualization.plot_correlations(real, synthetic, store_destination)
Plots correlation matrices for real and synthetic features in form of heatmaps.
- Parameters:
real (DataFrame) – The real data.
synthetic (DataFrame) – The synthetic data.
store_destination (str) – Path to the folder where the results should be stored.
- Return type:
None
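To make concrete what a pair of correlation heatmaps compares, the sketch below computes a correlation matrix for a small real and a small synthetic table using only the standard library. It assumes the Pearson coefficient, which the documentation does not specify; the helper names pearson and correlation_matrix and the toy data are hypothetical and not part of syndat.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences.

    Raises ZeroDivisionError if either sequence has zero variance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Map each pair of column names to its Pearson correlation."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}


# Toy data: the synthetic table inverts the age/bmi relationship.
real = {"age": [30, 40, 50, 60], "bmi": [22.0, 24.0, 26.0, 28.0]}
synthetic = {"age": [30, 40, 50, 60], "bmi": [28.0, 26.0, 24.0, 22.0]}

real_corr = correlation_matrix(real)
synthetic_corr = correlation_matrix(synthetic)
print(real_corr["age"]["bmi"], synthetic_corr["age"]["bmi"])
```

Here the marginal distributions of both columns are identical in the two tables, yet the correlation flips sign. This is the kind of discrepancy that side-by-side heatmaps make visible and per-feature distribution plots do not.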
- syndat.visualization.plot_distributions(real, synthetic, store_destination, categorical_y_scale='auto')
Plots violin plots (numeric features) or bar charts (categorical features) together with their summary statistics.
- Parameters:
real (DataFrame) – The real data.
synthetic (DataFrame) – The synthetic data.
store_destination (str) – Path to the folder where the results should be stored.
categorical_y_scale (Literal['auto', 'absolute', 'relative']) – Categorical y-axis scale mode:
  - “auto”: uses relative frequencies when the real and synthetic sample sizes differ by at least 1%, otherwise absolute counts.
  - “absolute”: always uses absolute counts.
  - “relative”: always uses relative frequencies (%).
- Return type:
None
- syndat.visualization.plot_numerical_feature(feature, real_data, synthetic_data)
Plots violin plots for a numerical feature from both real and synthetic datasets and displays their summary statistics.
- Parameters:
feature (str) – The feature to be plotted.
real_data (DataFrame) – The real data.
synthetic_data (DataFrame) – The synthetic data.
- Return type:
None
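As a rough illustration of the summary statistics that can accompany a violin plot, the sketch below compares one numerical feature across a real and a synthetic sample. Which statistics syndat actually displays is not stated here, so the choice of mean, median, and sample standard deviation is an assumption, and the summarize helper and the toy values are hypothetical.

```python
import statistics


def summarize(values):
    """Return a few summary statistics for a numeric sample (assumed set)."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "std": statistics.stdev(values),  # sample standard deviation
    }


# Toy values for a single numerical feature.
real_age = [30, 40, 50, 60]
synthetic_age = [32, 41, 49, 62]

real_summary = summarize(real_age)
synthetic_summary = summarize(synthetic_age)

for name in ("mean", "median", "std"):
    print(f"{name}: real={real_summary[name]:.2f} "
          f"synthetic={synthetic_summary[name]:.2f}")
```

Reading the numbers alongside the plot helps separate a genuine shift in location or spread from differences that only reflect how the violin shapes are smoothed.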
- syndat.visualization.plot_shap_discrimination(real, synthetic, save_path=None)
Generates a SHAP summary plot to illustrate the discrimination between real and synthetic datasets using a Random Forest classifier.
- Parameters:
real (DataFrame) – The real data.
synthetic (DataFrame) – The synthetic data.
save_path (str) – Path to the file where the resulting plot should be saved. If None, the plot will not be saved.
- Return type:
None
- Returns:
None