A while ago I blogged about Project Jupyter. Over the last few days I have been working with it a lot, and I am still fascinated by its power.

Today I faced and solved two challenges that I would like to share here:

. plotting a pivot table

. changing legend entries

Assume we have the following dataframe:
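For readers who want to follow along, a small stand-in dataframe might look like the sketch below. The column names Org, Male employees and Female employees are the ones used in the pivot call in this post; the Department column and every value are invented purely for illustration.

```python
import pandas as pd

# Hypothetical sample data: the column names match the pivot call in this post,
# but the Department column and all numbers are made up.
df = pd.DataFrame({
    "Org": ["Sales", "Sales", "IT", "IT", "IT"],
    "Department": ["Inside Sales", "Field Sales", "Support", "Development", "Operations"],
    "Male employees": [12, 7, 9, 20, 5],
    "Female employees": [10, 9, 4, 11, 6],
})

print(df)
```

Each row stands for one department, so counting rows per Org gives the number of departments in that org.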

Creating a pivot is a piece of cake: just call the pandas pivot_table function on that dataframe:

Code:

import numpy as np
import pandas as pd

pivot = pd.pivot_table(
    df,
    index=["Org"],
    values=["Male employees", "Female employees"],
    aggfunc=[len, np.mean, np.min, np.max, np.sum],
)

This gets us

. the number of departments per org (= len of Female employees or len of Male employees)

. the sum of male and female employees per org (= sum of Female employees and sum of Male employees)

. as well as mean, min and max
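To make the shape of the result more concrete, here is a minimal sketch on invented data. It assumes string aggregation names ("count", "mean", ...) instead of len and the NumPy functions, simply to keep the resulting column names predictable across pandas versions; "count" matches len only as long as there are no missing values.

```python
import pandas as pd

# Invented data: "Sales" has two departments, "IT" has three.
df = pd.DataFrame({
    "Org": ["Sales", "Sales", "IT", "IT", "IT"],
    "Male employees": [12, 7, 9, 20, 5],
    "Female employees": [10, 9, 4, 11, 6],
})

# String aggregation names are an assumption of this sketch,
# not the exact call used in the post.
pivot = pd.pivot_table(
    df,
    index=["Org"],
    values=["Male employees", "Female employees"],
    aggfunc=["count", "mean", "min", "max", "sum"],
)

# The columns form a two-level index: (statistic, original column).
departments_per_org = pivot[("count", "Male employees")]
male_total_per_org = pivot[("sum", "Male employees")]

print(pivot.columns.tolist())
print(departments_per_org)
print(male_total_per_org)
```

Because every statistic is stored under its own top-level column label, a whole statistic can be dropped or selected by that label.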

How to plot?

We can simply save the pivot table as a new dataframe ‘pivot’ and call its plot method. Let’s say we want to plot the sum of male and female employees per org. First we drop the statistics we don’t need for the plot from the pivot table. Then we plot:

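Code:

import matplotlib.pyplot as plt

# 'amin'/'amax' are the names derived from np.min/np.max; depending on the
# pandas/NumPy version they may show up as 'min'/'max' instead.
pivot.drop(['len', 'mean', 'amin', 'amax'], axis=1).plot(kind="barh")
plt.show()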

The only problem here is that the legend entries of this plot look a bit cryptic. Here is some code to fix this:

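Code:

# Run in the same cell as the plot call so plt.gca() still refers to the bar chart.
ax = plt.gca()
handles, labels = ax.get_legend_handles_labels()
# Labels look like "(sum, Male employees)": keep the text after the last comma.
new_labels = [l.split(",")[-1][:-1].strip() for l in labels]
ax.legend(handles, new_labels)
plt.show()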
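As an alternative to parsing the label strings, the readable names can also be taken straight from the column tuples of the pivot table. The following is only a sketch under stated assumptions: the data is invented, and the pivot is built with just the "sum" aggregation so that nothing has to be dropped first.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Invented data for illustration.
df = pd.DataFrame({
    "Org": ["Sales", "Sales", "IT", "IT", "IT"],
    "Male employees": [12, 7, 9, 20, 5],
    "Female employees": [10, 9, 4, 11, 6],
})

pivot = pd.pivot_table(
    df,
    index=["Org"],
    values=["Male employees", "Female employees"],
    aggfunc=["sum"],
)

# DataFrame.plot returns the Axes object, so plt.gca() is not needed here.
ax = pivot.plot(kind="barh")

# Each column label is a tuple such as ("sum", "Male employees");
# its last element is the readable name.
new_labels = [col[-1] for col in pivot.columns]
handles, _ = ax.get_legend_handles_labels()
ax.legend(handles, new_labels)

legend_texts = [t.get_text() for t in ax.get_legend().get_texts()]
print(legend_texts)
```

In a notebook the figure is displayed at the end of the cell; in a plain script, call plt.show() afterwards.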

I have shared the entire notebook here.