collection_summary
- collection_summary(data: DataFrame, column: str, fig_height: int = 1000, fig_width: int = 1000, top_entries: int = 10, sort_collections: bool = False, remove_duplicates: bool = False, display_figure: bool = False) → Figure
Creates a univariate EDA summary for a collections column in a pandas DataFrame.
The column should have object dtype, with each entry being a list, tuple, or set.
- Parameters
data – Dataset to perform EDA on
column – A string matching a column in the data
fig_height – Height of the plot in inches
fig_width – Width of the plot in inches
top_entries – Maximum number of entries to show in count plots
sort_collections – Whether to sort collections and ignore original order
remove_duplicates – Whether to remove duplicate entries from collections
display_figure – Whether to display the figure in addition to returning it
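
The effect of sort_collections, remove_duplicates, and top_entries on the underlying data can be illustrated with a small pure-Python sketch. This is not the library's implementation: the helpers normalize and top_entry_counts are hypothetical names introduced here, and the sketch assumes that duplicates are dropped before sorting and that entries are counted across all rows.

```python
from collections import Counter


def normalize(collection, sort_collections=False, remove_duplicates=False):
    """Hypothetical helper: apply the two flags to a single collection."""
    items = list(collection)
    if remove_duplicates:
        # dict.fromkeys keeps the first occurrence of each entry, in order
        items = list(dict.fromkeys(items))
    if sort_collections:
        # Sorting discards the original order; entries must be comparable
        items = sorted(items)
    return tuple(items)


def top_entry_counts(collections, top_entries=10):
    """Hypothetical helper: count entries across all rows, keep the most common."""
    counts = Counter(entry for c in collections for entry in c)
    return counts.most_common(top_entries)


# Stand-in for the values of a collections column
rows = [["b", "a", "a"], ("a", "b"), ["c"]]

as_is = [normalize(r) for r in rows]
cleaned = [normalize(r, sort_collections=True, remove_duplicates=True) for r in rows]
top_two = top_entry_counts(rows, top_entries=2)

print(as_is)
print(cleaned)
print(top_two)
```

With both flags off, ["b", "a", "a"] and ("a", "b") count as different collections; with both on, they collapse to the same collection, which changes which collections appear as the most frequent.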