collection_univariate_summary
- collection_univariate_summary(data: DataFrame, column: str, fig_height: int = 6, fig_width: int = 12, fontsize: int = 15, color_palette: Optional[str] = None, top_entries: int = 10, sort_collections: bool = False, remove_duplicates: bool = False, interactive: bool = False) → Tuple[DataFrame, Figure]
Creates a univariate EDA summary for a provided collections column in a pandas DataFrame.
The provided column should have object dtype, with each value being a list, tuple, or set.
- Parameters
data – Dataset to perform EDA on
column – A string matching a column in the data
fig_height – Height of the plot in inches
fig_width – Width of the plot in inches
fontsize – Font size of axis and tick labels
color_palette – Seaborn color palette to use
top_entries – Max number of entries to show for countplots
sort_collections – Whether to sort collections and ignore original order
remove_duplicates – Whether to remove duplicate entries from collections
interactive – Whether to display figures and tables in a Jupyter notebook for interactive use
- Returns
Tuple containing the summary stats DataFrame and the matplotlib Figure drawn
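
To make the sort_collections, remove_duplicates, and top_entries options more concrete, here is a minimal standard-library sketch of the kind of preprocessing those descriptions suggest. It is not the function's actual implementation: normalize_collection and top_entry_counts are hypothetical helpers, and the plain Python list named tags is only a stand-in for a collections column.

```python
from collections import Counter
from typing import Hashable, Iterable, List, Tuple


def normalize_collection(
    values: Iterable[Hashable],
    sort_collections: bool = False,
    remove_duplicates: bool = False,
) -> Tuple[Hashable, ...]:
    """Hypothetical helper: return one collection as a tuple,
    optionally deduplicated and/or sorted."""
    items = list(values)
    if remove_duplicates:
        # dict.fromkeys keeps the first occurrence of each entry, in order
        items = list(dict.fromkeys(items))
    if sort_collections:
        # Sorting discards the original order of the entries
        items = sorted(items)
    return tuple(items)


def top_entry_counts(
    column: Iterable[Iterable[Hashable]],
    top_entries: int = 10,
    remove_duplicates: bool = False,
) -> List[Tuple[Hashable, int]]:
    """Hypothetical helper: count individual entries across all
    collections and keep at most `top_entries` of the most frequent."""
    counts: Counter = Counter()
    for values in column:
        counts.update(
            normalize_collection(values, remove_duplicates=remove_duplicates)
        )
    return counts.most_common(top_entries)


# Stand-in for an object column whose values are lists or tuples
tags = [["b", "a", "a"], ("a", "c"), ["b", "a"]]

print(normalize_collection(tags[0], sort_collections=True, remove_duplicates=True))
print(top_entry_counts(tags, top_entries=2))
print(top_entry_counts(tags, top_entries=2, remove_duplicates=True))
```

In this sketch, removing duplicates changes the count for "a" from 4 to 3, because the first collection contains it twice. Whether the real function counts entries in the same way is an assumption that should be checked against its source.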