With the release of Python support in Power BI, we on the Power BI team have come together to show you some of our favorite Python packages. Python is a great addition to the Power BI family: it lets you perform quick data transformations and plot cool data visualizations. You can even extend your Power BI reports further by bringing in sophisticated machine learning and AI.
5 Python packages to try out
Here is a list of some of our favorite Python packages you can use today inside Power BI. A demo Power BI file showcasing all the visual examples, along with their Python scripts, can be found here (the examples were built using Anaconda).
Seaborn – built on top of matplotlib, the default plotting library, seaborn extends it so that you can generate more complex plots quickly. Let’s take a look at an example from the bike rental demand dataset: if you wish to create a swarm plot of bike demand categorized by season, it takes just a few lines of code.
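As a rough sketch of what such a script could look like: inside a Power BI Python visual the selected fields arrive as a pandas DataFrame named `dataset`, so the example builds a small stand-in with that name. The column names `season` and `rentals` and all the values are made up for illustration and are not taken from the demo file.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Stand-in for the `dataset` DataFrame that Power BI hands to a Python visual.
# Column names and values are hypothetical.
dataset = pd.DataFrame({
    "season": ["spring", "spring", "spring", "summer", "summer", "summer",
               "fall", "fall", "fall", "winter", "winter", "winter"],
    "rentals": [16, 40, 32, 120, 145, 98, 160, 131, 174, 60, 85, 47],
})

# One point per row, grouped by season along the x-axis.
ax = sns.swarmplot(x="season", y="rentals", data=dataset)
ax.set_title("Bike rentals by season")

# A Power BI Python visual displays the figure once plt.show() is called.
plt.show()
```

In a real report you would drop the stand-in DataFrame and plot the `dataset` that Power BI provides.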
Altair – a declarative library for generating plots. Unlike other libraries, which require you to create the legend, axes and labels yourself, Altair aims to alleviate these pain points so that you can focus on the plot itself rather than specifying every single element of the chart. By default, the axes and legend are generated from the data you pass to the plotting function. Below is an example generated in PBI. Note: interactive visuals are not yet supported in PBI; if you would like interactive Python visuals, please do vote for this feature.
Scikit-Learn – a Python library for performing machine learning on your data. With Power BI, you can now use the many Python libraries to create your own machine learning models and easily operationalize them within your Power BI reports. One of the first things you may want to do is generate a matrix chart showing the pairwise correlations between your variables, along with a histogram of each one.
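One possible way to produce such a matrix chart, sketched here with pandas rather than scikit-learn itself, is `pandas.plotting.scatter_matrix`, which draws a scatter plot for every pair of columns and a histogram on the diagonal. The `dataset` stand-in, its column names and its values are assumptions made for the example.

```python
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import scatter_matrix

# Stand-in for the `dataset` DataFrame in a Power BI Python visual.
# Column names and values are hypothetical.
dataset = pd.DataFrame({
    "temp": [9.8, 14.2, 22.5, 28.1, 30.3, 18.7, 12.0, 25.4],
    "humidity": [81, 75, 60, 52, 48, 66, 78, 55],
    "rentals": [16, 40, 120, 145, 174, 85, 30, 133],
})

# Pairwise Pearson correlation coefficients as a table.
correlations = dataset.corr()

# Grid of scatter plots for every pair of columns, histograms on the diagonal.
axes = scatter_matrix(dataset, diagonal="hist", figsize=(6, 6))

plt.show()
```

Looking at the grid before fitting a model helps you spot strongly correlated inputs and skewed distributions early.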
FlashText – a performant library for searching and replacing words within a text column. Below is a diagram showing the relative performance of FlashText vs Regex. Finding keywords or replacing values can now be done in minutes rather than hours.
PyFlux & Pendulum – make date-time parsing headaches a thing of the past with Pendulum. Coupled with PyFlux, this package lets you produce some awesome time series analysis.
If you missed the link earlier, all the example visuals can be found here. Follow me at @mohaali45. Happy scripting!