If a static image is already a good way to understand and explain data, imagine an interactive chart! Instead of being limited to the view that was rendered when the image was generated, the user of an interactive graph can usually zoom in and out, select a specific region, pan around, hide or show a specific line, and so on.
The Plotly project can do everything mentioned above. It is an open-source library for interactive charts and maps for Python, R, Julia, JavaScript, ggplot2, F#, MATLAB®, and Dash. What sets this library apart is that all generated plots are interactive by default, in addition to being easy to use and master.
To do so, Plotly generates these graphs as HTML that can be opened in any browser or displayed in the output of a Jupyter notebook. This is a great advantage, because you can save the chart as an HTML file and share it with other people without relying on any other tool or interpreter.
As is common in other libraries, Plotly provides several modules, each designed for a specific goal or approach. For the Python library, at least, the main ones are the express and graph objects modules.
plotly.express is a high-level module built on top of the graph objects module. Compared with other well-known data visualization libraries, it most resembles Seaborn because of its integration with pandas DataFrame objects.
The next code shows an example of the structure of this module to create a simple chart.
plotly.graph_objects, on the other hand, is a low-level module with a tree-like data structure. It is comparable to Matplotlib (discussed in the previous blog post of this series), since it is more robust and more verbose.
The code below shows an example of the structure of this module to create a visualization.
Let’s check the main differences between the express and graph objects modules with a more complete example. In this example, we are going to create a box plot with the “tips” dataset that ships with the express module.
# Library
import plotly.express as px

# Data
df = px.data.tips()

# Plot
fig = px.box(
    df,
    x="day",
    y="total_bill",
    color="smoker",
    template='plotly_dark'
)

# Show
fig.show()
First, we import the library and load the necessary data into the df variable, a step that is going to be nearly identical in the graph objects version. To generate the figure, we use the px.box function to create a box plot. The first argument is the dataframe, and x, y and color are the names of the columns mapped to those properties. The template argument applies a pre-defined theme to the visualization. Lastly, we use the fig.show() method to show the result.
# Libraries
import plotly.express as px
import plotly.graph_objects as go

# Data
df = px.data.tips()

# Plot: one box trace per value of the "smoker" column
fig = go.Figure(
    data=[
        go.Box(
            name=i,
            x=df[df['smoker'] == i]['day'],
            y=df[df['smoker'] == i]['total_bill']
        )
        for i in df['smoker'].unique()
    ],
    layout=dict(
        xaxis_title='day',
        yaxis_title='total_bill',
        legend_title='smoker',
        boxmode='group',
        template='plotly_dark'
    )
)

# Show
fig.show()
First, we import the necessary libraries. Since we are going to use the dataset available in the express module, we import it as well and load the dataframe. The definition of the figure object is where we see the main difference between the versions. We start by creating a go.Figure object that has two main arguments. The data attribute receives the charts that are going to be present in the same figure, and the layout attribute changes the design of this figure. To the data argument, we can pass a list with the visualizations that we are interested in. To reproduce the result of the express version, we loop through the available values in the “smoker” column, use each value as the name of a chart, filter the dataframe by it, and select the appropriate columns for the x and y arguments. The layout attribute receives a dictionary with the values that we would like to change. Plotly Express configured this automatically, but here we need to set it manually. Pro tip: Plotly allows the use of the “magic underscore” notation, e.g., xaxis_title='day' is equal to xaxis=dict(title='day').
Both versions produce the same result, shown in the image below. The options menu in the top-right corner of the image allows the user to save a PNG of the chart, zoom in, zoom out, pan, and reset the axes. Also, if we hover the cursor over the box plot, we get more information about the chart.
In a nutshell, plotly.express is great for simpler visualizations and for exploring datasets. Although plotly.graph_objects can look scary at first glance, it is powerful for more complex and customized visualizations. And believe it or not, with time and experience you get used to the graph_objects syntax, and it will probably become your go-to method.
The Plotly development team also developed a framework called Dash to create dashboards and applications. With a basic knowledge of web development (mainly HTML syntax and structure), Python, and the Plotly library, anyone can create a web page that shows plots and serves many users at the same time. That topic will be discussed in a later post of this series.
I hope you enjoyed reading a bit more about data visualization and hope to see you in the next blog post.