syndat.rct.visualization_clinical_trials
- syndat.rct.visualization_clinical_trials.assign_visit_absolute(dat_vpc, Visits, geometric=False)
Assigns each time point in dat_vpc to a visit bin defined by Visits, using linear or geometric spacing.
- Parameters:
dat_vpc (Union[ndarray, Series]) – Array or Series of time points to bin.
Visits (Union[ndarray, Series]) – Array or Series of reference visit times.
geometric (bool) – If True, uses geometric (log-space) binning.
- Return type:
ndarray
- Returns:
Array of assigned visit values.
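The description above does not spell out how bin boundaries are derived from Visits. The following is a minimal sketch of one plausible reading, assuming each time point is assigned to the nearest reference visit, with bin edges at the arithmetic midpoints between consecutive visits (linear) or at their geometric means (log-space). The helper name assign_to_nearest_visit is hypothetical and is not part of syndat; plain Python lists are used instead of ndarray or Series to keep the sketch dependency-free.

```python
import bisect
import math


def assign_to_nearest_visit(times, visits, geometric=False):
    """Hypothetical stand-in: map each time point to one reference visit.

    Bin edges are midpoints between consecutive visits: arithmetic means
    when geometric is False, geometric means (log-space midpoints) when True.
    """
    visits = sorted(visits)
    if geometric:
        # Geometric midpoints only make sense for strictly positive visits.
        edges = [math.sqrt(a * b) for a, b in zip(visits, visits[1:])]
    else:
        edges = [(a + b) / 2 for a, b in zip(visits, visits[1:])]
    # bisect_right counts the edges that are <= t, which is the bin index.
    return [visits[bisect.bisect_right(edges, t)] for t in times]


times = [0.5, 2.9, 3.1, 7.0, 11.0]
visits = [1, 4, 12]
linear = assign_to_nearest_visit(times, visits)
geometric = assign_to_nearest_visit(times, visits, geometric=True)
```

Under these assumptions, the time point 7.0 falls in the visit-4 bin with linear spacing (edge at 8) but in the visit-12 bin with geometric spacing (edge near 6.93), which is the practical difference the geometric flag is meant to control.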
- syndat.rct.visualization_clinical_trials.bar_categorical(plt_dt, var_name, type_, strat_vars=None)
Generates a bar chart for a categorical variable comparing observed vs reconstructed distributions.
- Parameters:
plt_dt (DataFrame) – DataFrame with one categorical variable to plot.
var_name (str) – Name of the variable to use as title.
type_ – “Percentage” or “Subjects” to define the bar heights.
strat_vars (Optional[List[str]]) – Optional list of variables to use for faceting.
x_label – Label for the x-axis (default: “Real Data”).
y_label – Label for the y-axis (default: “Synthetic Data”).
- Return type:
ggplot
- Returns:
ggplot object.
- syndat.rct.visualization_clinical_trials.bar_categorical_list(rp0, dt, mode='Reconstructed', type_='Percentage', dt_cs=None, dt_cs_label=['Counterfactual'], strat_vars=None, static=False, real_label='Real Data', syn_label='Synthetic Data', save_path=None, width=8, height=6, dpi=300, as_png=False)
Generates and optionally saves bar plots for all categorical variables listed in rp0.
- Parameters:
rp0 (dict) – Dictionary with a key ‘long_cat’ containing a list of categorical variable names.
dt (DataFrame) – DataFrame with the columns ‘REPI’, ‘TYPE’, ‘Variable’, ‘DV’, ‘SUBJID’, ‘TIME’ (if static = False) and optionally others.
mode (Optional[str]) – String, usually “Reconstructed”, used for filtering TYPE.
type_ – “Percentage” or “Subjects” to define the bar heights.
dt_cs (Optional[DataFrame]) – Optional list of counterfactual DataFrames.
dt_cs_label (Optional[List[str]]) – Optional list of labels for the counterfactual data.
strat_vars (Optional[List[str]]) – Optional list of variables to use for faceting.
static (Optional[bool]) – If True, metrics for static variables will be calculated.
real_label (Optional[str]) – Label for the real data (default: “Real Data”).
syn_label (Optional[str]) – Label for the synthetic data (default: “Synthetic Data”).
save_path (Optional[str]) – Optional path to folder where plots should be saved. If not provided, plots are shown.
width (Optional[int]) – Width of the saved plot in inches (used only if save_path is provided).
height (Optional[int]) – Height of the saved plot in inches (used only if save_path is provided).
dpi (Optional[int]) – Resolution (dots per inch) of the saved plot (used only if save_path is provided).
as_png (Optional[bool]) – Set to True to save the plot as a PNG.
- Return type:
Dict[str, ggplot]
- Returns:
Dictionary of ggplot objects keyed by variable name.
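To make the expected input shape and the two bar-height options concrete, here is a minimal sketch in plain Python. It assumes that “Subjects” means the number of distinct SUBJID values per category and that “Percentage” is that count divided by the total for the same TYPE; the reference does not confirm either definition. The helpers make_rows and bar_heights, the variable name SEX, and all data values are hypothetical.

```python
from collections import defaultdict


def make_rows(type_label, values):
    """Build hypothetical long-format rows using the documented column names."""
    return [
        {"REPI": 1, "TYPE": type_label, "Variable": "SEX",
         "DV": dv, "SUBJID": i, "TIME": 0}
        for i, dv in enumerate(values, start=1)
    ]


def bar_heights(rows, variable, type_="Percentage"):
    """Assumed bar heights per (TYPE, category) for one categorical variable."""
    subjects = defaultdict(set)
    for r in rows:
        if r["Variable"] == variable:
            subjects[(r["TYPE"], r["DV"])].add(r["SUBJID"])
    counts = {key: len(ids) for key, ids in subjects.items()}
    if type_ == "Subjects":
        return counts
    # Percentage: normalise each count by the total within the same TYPE.
    totals = defaultdict(int)
    for (typ, _), n in counts.items():
        totals[typ] += n
    return {key: 100 * n / totals[key[0]] for key, n in counts.items()}


rows = (make_rows("Observed", ["F", "M", "F", "F"])
        + make_rows("Reconstructed", ["F", "M", "M", "F"]))
rp0 = {"long_cat": ["SEX"]}  # same key layout as the rp0 argument above

percentages = {v: bar_heights(rows, v) for v in rp0["long_cat"]}
subject_counts = {v: bar_heights(rows, v, "Subjects") for v in rp0["long_cat"]}
```

With this toy data the observed split is 75% / 25% and the reconstructed split is 50% / 50%, which is the kind of side-by-side comparison the bar plots are meant to show.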
- syndat.rct.visualization_clinical_trials.gof_binary_list(rp0, dt, mode='Reconstructed', strat_vars=None, static=False, x_label='Real Data', y_label='Synthetic Data', save_path=None, width=8, height=6, dpi=300, as_png=False)
Creates goodness-of-fit (calibration) plots for binary variables by comparing the proportion of observed vs. reconstructed outcomes over time (in %).
- Parameters:
rp0 (dict) – Dictionary with a key ‘long_bin’ containing a list of binary variable names.
dt (DataFrame) – pd.DataFrame with at least columns ‘REPI’, ‘TYPE’, ‘Variable’, ‘DV’, ‘TIME’ (if static = False), and optionally stratification variables.
mode (Optional[str]) – String, usually “Reconstructed”, used for filtering TYPE.
strat_vars (Optional[List[str]]) – Optional list of column names for stratified (faceted) plots.
static (Optional[bool]) – If True, metrics for static variables will be calculated.
x_label (Optional[str]) – Label for the x-axis (default: “Real Data”).
y_label (Optional[str]) – Label for the y-axis (default: “Synthetic Data”).
save_path (Optional[str]) – Optional path to a folder. If provided, saves each plot as a PNG. If not provided, plots will be shown interactively.
width (Optional[int]) – Width of the saved plot in inches (used only if save_path is provided).
height (Optional[int]) – Height of the saved plot in inches (used only if save_path is provided).
dpi (Optional[int]) – Resolution (dots per inch) of the saved plot (used only if save_path is provided).
as_png (Optional[bool]) – Set to True to save the plot as a PNG.
- Return type:
Dict[str, ggplot]
- Returns:
Dictionary mapping each variable name to its ggplot object.
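The quantity behind such a calibration plot can be sketched as follows: for each time point, compute the percentage of records with DV equal to 1 separately for the observed and the reconstructed data, then pair the two. This is an illustrative sketch, not the library's implementation; the helper names, the tuple layout (TYPE, Variable, TIME, DV), the variable name RESP, and the data are all hypothetical.

```python
from collections import defaultdict


def percent_positive(records, variable):
    """Percentage of DV == 1 per (TYPE, TIME) for one binary variable."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for typ, var, time, dv in records:
        if var == variable:
            totals[(typ, time)] += 1
            hits[(typ, time)] += dv
    return {key: 100 * hits[key] / totals[key] for key in totals}


def calibration_pairs(records, variable, mode="Reconstructed"):
    """Return (TIME, observed %, reconstructed %) triples, sorted by TIME."""
    pct = percent_positive(records, variable)
    times = sorted(t for typ, t in pct if typ == "Observed")
    return [
        (t, pct[("Observed", t)], pct[(mode, t)])
        for t in times
        if (mode, t) in pct
    ]


records = [
    ("Observed", "RESP", 0, 0), ("Observed", "RESP", 0, 1),
    ("Observed", "RESP", 6, 1), ("Observed", "RESP", 6, 1),
    ("Reconstructed", "RESP", 0, 0), ("Reconstructed", "RESP", 0, 0),
    ("Reconstructed", "RESP", 6, 1), ("Reconstructed", "RESP", 6, 0),
]
pairs = calibration_pairs(records, "RESP")
```

Each triple corresponds to one point in the plot: the observed percentage on one axis and the reconstructed percentage on the other. Points close to the identity line indicate that the synthetic data reproduce the observed response rate at that time.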
- syndat.rct.visualization_clinical_trials.gof_continuous(plt_dt, var_name, strat_vars=None, log_trans=False, x_label='Real Data', y_label='Synthetic Data')
Generates a goodness-of-fit (GOF) plot for continuous variables using observed vs. reconstructed values.
Produces scatter plots with a smoothing line and identity line. Optionally applies log-transformation and stratification by specified variables.
- Parameters:
plt_dt (DataFrame) – A pandas DataFrame containing columns ‘Observed’ and ‘Reconstructed’, and optionally stratification variables.
var_name (str) – Name of the variable to display in the plot title.
strat_vars (Optional[List[str]]) – Optional list of column names to stratify the plot using facet wrap.
log_trans (bool) – Whether to apply a log10 transformation to the axes.
x_label (str) – Label for the x-axis (default: “Real Data”).
y_label (str) – Label for the y-axis (default: “Synthetic Data”).
- Return type:
ggplot
- Returns:
A ggplot object representing the GOF plot.
- syndat.rct.visualization_clinical_trials.gof_continuous_list(rp0, dt, mode='Reconstructed', strat_vars=None, static=False, log_trans=False, x_label='Real Data', y_label='Synthetic Data', save_path=None, width=8, height=6, dpi=300, as_png=False)
Creates a dictionary of goodness-of-fit (GOF) plots for a list of continuous variables. Saves or displays each plot depending on whether a path is provided.
- Parameters:
rp0 (dict) – Dictionary with a key ‘long_cont’ containing a list of continuous variable names.
dt (DataFrame) – pd.DataFrame with columns including ‘REPI’, ‘TYPE’, ‘Variable’, ‘DV’, ‘SUBJID’, ‘TIME’ (if static = False), and optionally stratification variables.
mode (Optional[str]) – String, usually “Reconstructed”, used for filtering TYPE.
strat_vars (Optional[List[str]]) – Optional list of column names to stratify each plot (faceted visualization).
static (Optional[bool]) – If True, metrics for static variables will be calculated.
log_trans (Optional[bool]) – If True, applies log10 transformation to both axes in the plots.
x_label (Optional[str]) – Label for the x-axis (default: “Real Data”).
y_label (Optional[str]) – Label for the y-axis (default: “Synthetic Data”).
save_path (Optional[str]) – Optional path to a folder. If provided, saves each plot as a PNG file. If not provided, plots will be shown interactively.
width (Optional[int]) – Width of the saved plot in inches (used only if save_path is provided).
height (Optional[int]) – Height of the saved plot in inches (used only if save_path is provided).
dpi (Optional[int]) – Resolution (dots per inch) of the saved plot (used only if save_path is provided).
as_png (Optional[bool]) – Set to True to save the plot as a PNG.
- Return type:
Dict[str, ggplot]
- Returns:
A dictionary where keys are variable names and values are ggplot GOF plots.
- syndat.rct.visualization_clinical_trials.percentage_cat_traj_time_list(rp0, dt, mode='Reconstructed', dt_cs=None, dt_cs_label=['Counterfactual'], strat_vars=None, real_label='Real Data', syn_label='Synthetic Data', time_unit='Months', save_path=None, width=8, height=6, dpi=300, as_png=False)
Creates trajectory plots of the percentage of subjects who achieved the outcome value 1 (e.g., responders).
- Parameters:
rp0 (dict) – Dictionary with a key ‘long_bin’ containing a list of binary variable names.
dt (DataFrame) – pd.DataFrame with at least columns ‘REPI’, ‘TYPE’, ‘Variable’, ‘DV’, ‘TIME’, and optionally stratification variables.
mode (Optional[str]) – String, usually “Reconstructed”, used for filtering TYPE.
dt_cs (Optional[DataFrame]) – Optional list of counterfactual DataFrames.
dt_cs_label (Optional[List[str]]) – Optional list of labels for the counterfactual data.
strat_vars (Optional[List[str]]) – Optional list of column names for stratified (faceted) plots.
time_unit (Optional[str]) – A string representing the unit of time to display on the x-axis label (e.g., “Months”, “Days”, “Hours”).
real_label (Optional[str]) – Label for the real data (default: “Real Data”).
syn_label (Optional[str]) – Label for the synthetic data (default: “Synthetic Data”).
save_path (Optional[str]) – Optional path to a folder. If provided, saves each plot as a PNG. If not provided, plots will be shown interactively.
width (Optional[int]) – Width of the saved plot in inches (used only if save_path is provided).
height (Optional[int]) – Height of the saved plot in inches (used only if save_path is provided).
dpi (Optional[int]) – Resolution (dots per inch) of the saved plot (used only if save_path is provided).
as_png (Optional[bool]) – Set to True to save the plot as a PNG.
- Return type:
Dict[str, ggplot]
- Returns:
Dictionary mapping each variable name to its ggplot object.
- syndat.rct.visualization_clinical_trials.raincloud_continuous_list(rp0, dt, mode='Reconstructed', static=False, strat_vars=None, dt_cs=None, dt_cs_label=['Counterfactual'], real_label='Real Data', syn_label='Synthetic Data', save_path=None, width=8, height=6, dpi=300, as_png=False)
Generates and optionally saves raincloud plots comparing observed vs. reconstructed values of continuous variables.
- Parameters:
rp0 (dict) – Dictionary with the key ‘long_cont’ or ‘static_cont’ containing a list of continuous variable names.
dt (DataFrame) – DataFrame with the columns ‘REPI’, ‘TYPE’, ‘Variable’, ‘DV’, ‘SUBJID’, ‘TIME’ and optionally others.
mode (Optional[str]) – String, usually “Reconstructed”, used for filtering TYPE.
static (Optional[bool]) – If True, plots for static variables will be obtained.
strat_vars (Optional[List[str]]) – Optional list of variables to use for faceting.
dt_cs (Optional[DataFrame]) – Optional list of counterfactual DataFrames.
dt_cs_label (Optional[List[str]]) – Optional list of labels for the counterfactual data.
real_label (Optional[str]) – Label for the real data (default: “Real Data”).
syn_label (Optional[str]) – Label for the synthetic data (default: “Synthetic Data”).
save_path (Optional[str]) – Optional path to folder where plots should be saved. If not provided, plots are shown.
width (Optional[int]) – Width of the saved plot in inches (used only if save_path is provided).
height (Optional[int]) – Height of the saved plot in inches (used only if save_path is provided).
dpi (Optional[int]) – Resolution (dots per inch) of the saved plot (used only if save_path is provided).
as_png (Optional[bool]) – Set to True to save the plot as a PNG.
- Return type:
Dict[str, ggplot]
- Returns:
Dictionary of ggplot objects keyed by variable name.
- syndat.rct.visualization_clinical_trials.raincloud_plot(plt_dt, var_name, strat_vars=None, real_label='Real Data', syn_label='Synthetic Data', dt_cs_label=[])
Generates a raincloud plot (violin + boxplot + jitter) comparing Observed vs Reconstructed data.
- Parameters:
plt_dt (DataFrame) – DataFrame with columns ‘TYPE’, ‘DV’ and optional stratification variables.
var_name (str) – Name of the variable to use as the plot title.
strat_vars (Optional[List[str]]) – Optional list of variables to use for faceting.
real_label (Optional[str]) – Label for the real data (default: “Real Data”).
syn_label (Optional[str]) – Label for the synthetic data (default: “Synthetic Data”).
dt_cs_label (Optional[List[str]]) – Labels for the counterfactual data (default: []).
- Return type:
ggplot
- Returns:
ggplot object.
- syndat.rct.visualization_clinical_trials.trajectory_plot(plt_dt, var_name, strat_vars=None, time_unit='Months', achievement_plot=False)
Creates a ribbon plot of the median and 5th-95th percentiles of a continuous variable over time.
- Parameters:
plt_dt (DataFrame) – DataFrame with summary statistics (‘med’, ‘p5’, ‘p95’) by Visit and TYPE.
var_name (str) – Name of the variable to use as the plot title.
strat_vars (Optional[List[str]]) – Optional list of variables to use for faceting.
time_unit (Optional[str]) – A string representing the unit of time to display on the x-axis label (e.g., “Months”, “Days”, “Hours”).
achievement_plot (Optional[bool]) – If True, plot percentage of subjects achieving the outcome over time; if False, plot median with ribbons.
- Return type:
ggplot
- Returns:
ggplot object.
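The summary table expected in plt_dt can be illustrated with a short sketch that computes the median and the 5th and 95th percentiles per TYPE and Visit. The column names ‘med’, ‘p5’, ‘p95’, ‘Visit’ and ‘TYPE’ come from the parameter description above; everything else is an assumption, including the helper names, the toy data, and the choice of linear interpolation between order statistics for the percentiles (the library may use a different percentile definition).

```python
import statistics
from collections import defaultdict


def percentile(sorted_vals, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_vals) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (pos - lo)


def summarize(records):
    """Collapse (TYPE, Visit, DV) records into one summary row per group."""
    groups = defaultdict(list)
    for typ, visit, dv in records:
        groups[(typ, visit)].append(dv)
    summary = []
    for (typ, visit), vals in sorted(groups.items()):
        vals.sort()
        summary.append({
            "TYPE": typ,
            "Visit": visit,
            "med": statistics.median(vals),
            "p5": percentile(vals, 5),
            "p95": percentile(vals, 95),
        })
    return summary


# Hypothetical data: 21 observed values (1..21) and 21 shifted synthetic ones.
records = [("Observed", 0, float(v)) for v in range(1, 22)]
records += [("Reconstructed", 0, float(v) + 1) for v in range(1, 22)]
summary = summarize(records)
```

Each row of summary supplies one point of the median line and one vertical slice of the ribbon for its TYPE, so a table with one row per TYPE and Visit is enough to draw the full plot.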
- syndat.rct.visualization_clinical_trials.trajectory_plot_list(rp0, dt, mode='Reconstructed', bins=None, dt_cs=None, dt_cs_label=['Counterfactual'], strat_vars=None, real_label='Real Data', syn_label='Synthetic Data', time_unit='Months', save_path=None, width=8, height=6, dpi=300, as_png=False)
Generates and optionally saves ribbon plots for continuous variables across visits.
- Parameters:
rp0 (dict) – Dictionary with key ‘long_cont’ containing a list of variable names to plot.
dt (DataFrame) – Main DataFrame containing data for “Observed” and mode.
mode (Optional[str]) – String, usually “Reconstructed”, used for filtering TYPE.
bins (Optional[ndarray]) – Optional array of visit cutoffs. If None, uses unique TIME values in dt.
dt_cs (Optional[DataFrame]) – Optional list of counterfactual DataFrames.
dt_cs_label (Optional[List[str]]) – Optional list of labels for the counterfactual data.
strat_vars (Optional[List[str]]) – Optional list of stratification variables for faceting.
time_unit (Optional[str]) – A string representing the unit of time to display on the x-axis label (e.g., “Months”, “Days”, “Hours”).
real_label (Optional[str]) – Label for the real data (default: “Real Data”).
syn_label (Optional[str]) – Label for the synthetic data (default: “Synthetic Data”).
save_path (Optional[str]) – Optional path to save plots. If None, plots are printed to console.
width (Optional[int]) – Width of the saved plot in inches (used only if save_path is provided).
height (Optional[int]) – Height of the saved plot in inches (used only if save_path is provided).
dpi (Optional[int]) – Resolution (dots per inch) of the saved plot (used only if save_path is provided).
as_png (Optional[bool]) – Set to True to save the plot as a PNG.
- Return type:
Dict[str, ggplot]
- Returns:
Dictionary of ggplot objects keyed by variable name.