Let’s pretend we have a function that takes a data frame and the name of one of its columns, and draws a bar plot of the category counts in that column.

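import pandas as pd

# Sample data with one categorical column (the name `df` is illustrative).
df = pd.DataFrame({'x': list('aababc')})

def plot_count_bars(data: pd.DataFrame, column: str):
    # rename_axis keeps the category column named 'index' on pandas 1.x and 2.x.
    cnt = data[column].value_counts().rename_axis('index').rename('counts').reset_index()
    return cnt.plot.bar(x='index', y='counts')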



Looks good! But what if we later add another function that draws a similar plot using dots connected by a line instead of bars?

def plot_count_line(data: pd.DataFrame, column: str):
    # Same counting step as in plot_count_bars; only the rendering differs.
    cnt = data[column].value_counts().rename_axis('index').rename('counts').reset_index()
    return cnt.plot.line(x='index', y='counts', marker='o')



It works as well, but notice that the counting code is now duplicated. It is better to keep the computational logic separate from plotting. For example, create a new function called counts that takes a data frame and a column name and returns the counts. Then modify the plotting functions so that they take the pre-computed values and only perform the rendering.

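def counts(data: pd.DataFrame, column: str) -> pd.DataFrame:
    # Computation only: one row per category with its count.
    # rename_axis keeps the category column named 'index' on pandas 1.x and 2.x.
    return data[column].value_counts().rename_axis('index').rename('counts').reset_index()

def plot_count_bars(cnt: pd.DataFrame):
    # Rendering only: expects the output of counts().
    return cnt.plot.bar(x='index', y='counts')

def plot_count_line(cnt: pd.DataFrame):
    # Rendering only: expects the output of counts().
    return cnt.plot.line(x='index', y='counts', marker='o')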
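
One payoff of this split is that the counting step can be checked on its own, without rendering anything. The snippet below is a minimal sketch of that idea, not part of the original code. The variable name `sample` is hypothetical. The `rename_axis('index')` call is an addition assumed here so that the output columns are named `index` and `counts` on both pandas 1.x and 2.x, where `value_counts` names its index differently.

```python
import pandas as pd


def counts(data: pd.DataFrame, column: str) -> pd.DataFrame:
    # Computation only: one row per category, most frequent first.
    # rename_axis('index') is an assumption added for this sketch so the
    # category column is called 'index' regardless of the pandas version.
    return (
        data[column]
        .value_counts()
        .rename_axis('index')
        .rename('counts')
        .reset_index()
    )


# Hypothetical sample data: 'a' appears 3 times, 'b' twice, 'c' once.
sample = pd.DataFrame({'x': list('aababc')})

# Compute the counts once...
cnt = counts(sample, 'x')

# ...and verify them directly, with no plotting backend involved.
assert list(cnt.columns) == ['index', 'counts']
assert cnt['index'].tolist() == ['a', 'b', 'c']
assert cnt['counts'].tolist() == [3, 2, 1]
```

The same `cnt` frame can then be passed to both `plot_count_bars(cnt)` and `plot_count_line(cnt)`, so the counting runs only once however many plots are drawn from it.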