countplot

countplot(data: DataFrame, column: str, ax: Optional[Axes] = None, order: Union[str, List] = 'auto', max_levels: int = 30, flip_axis: Optional[bool] = None, label_rotation: Optional[int] = None, percent_axis: bool = True, label_counts: bool = True, label_fontsize: Optional[float] = None, include_missing: bool = False, percent_denominator: Optional[int] = None, add_other: bool = True, **kwargs) → Axes

Plots a bar plot of counts/percentages across the levels of a discrete data column in a pandas DataFrame.

Wraps seaborn’s countplot, adding annotations, a twin axis for percentages, and a few other argument controls useful for EDA.

Parameters
  • data – pandas DataFrame with data to be plotted

  • column – column in the dataframe to plot

  • ax – matplotlib axes to plot to. Defaults to current axis.

  • order

    Order in which to sort the levels of the variable for plotting:

    • 'auto': sorts ordinal variables by provided ordering, nominal variables by descending frequency, and numeric variables in sorted order.

    • 'descending': sorts in descending frequency.

    • 'ascending': sorts in ascending frequency.

    • 'sorted': sorts according to sorted order of the levels themselves.

    • 'random': produces a random order. Useful if there are too many levels for one plot.

    Or you can pass a list of level names in directly for your own custom order.

  • max_levels – Maximum number of levels to attempt to plot on a single plot. If exceeded, only max_levels - 1 levels will be plotted and the remainder will be grouped into an ‘Other’ category.

  • percent_axis – Whether to add a twin y axis with percentages

  • label_counts – Whether to add exact counts and percentages as text annotations on each bar in the plot.

  • label_fontsize – Size of the annotations text. Default tries to infer a reasonable size based on the figure size and number of levels.

  • flip_axis – Whether to flip the plot so that level labels are on the y axis. Useful for long level names or lots of levels. Default tries to infer based on the number of levels and the label_rotation value.

  • label_rotation – Amount to rotate level labels. Useful for long level names or lots of levels.

  • include_missing – Whether to include missing values as an additional level in the data to be plotted

  • kwargs – Additional keyword arguments passed through to [sns.barplot](https://seaborn.pydata.org/generated/seaborn.barplot.html)
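The order and max_levels behaviour described above can be illustrated with a minimal pure-Python sketch. This is not intedact's implementation: the helper names order_levels and collapse_levels are hypothetical, 'auto' is left out because it depends on the column's dtype, and the sketch assumes that the levels kept are the first max_levels - 1 in the chosen order.

```python
import random
from collections import Counter


def order_levels(values, order="descending", seed=None):
    """Return the distinct levels of `values` in the requested order.

    Hypothetical helper for illustration only. 'auto' is not handled
    because it depends on the dtype of the underlying column.
    """
    counts = Counter(values)
    if isinstance(order, list):
        # Custom order: keep the listed levels that actually occur.
        return [level for level in order if level in counts]
    if order == "descending":
        # most_common() sorts by count, ties stay in first-seen order.
        return [level for level, _ in counts.most_common()]
    if order == "ascending":
        # Stable sort by count, ties stay in first-seen order.
        return sorted(counts, key=counts.get)
    if order == "sorted":
        # Sort by the level values themselves.
        return sorted(counts)
    if order == "random":
        levels = list(counts)
        random.Random(seed).shuffle(levels)
        return levels
    raise ValueError(f"unsupported order: {order!r}")


def collapse_levels(values, levels, max_levels=30, other_label="Other"):
    """Return (level, count) pairs, grouping overflow levels together.

    Assumption: when there are more than `max_levels` levels, the first
    `max_levels - 1` levels in `levels` are kept and the rest are summed
    into a single `other_label` entry.
    """
    counts = Counter(values)
    if len(levels) <= max_levels:
        return [(level, counts[level]) for level in levels]
    kept = levels[: max_levels - 1]
    other_total = sum(counts[level] for level in levels[max_levels - 1:])
    return [(level, counts[level]) for level in kept] + [(other_label, other_total)]
```

For example, collapse_levels(data, order_levels(data, "descending"), max_levels=3) keeps the two most frequent levels of data and sums every remaining level into a single ‘Other’ entry.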

Returns

The axes the plot was drawn to.

Examples

[Figure: example countplot output (intedact-univariate_plots-countplot-1.png)]