{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "The goal is to build something similar to [Our World In Data](https://ourworldindata.org/grapher/share-of-adults-defined-as-obese?time=1975..2016&country=SAU)'s interactive map. This work was inspired by [Shivangi Patel's guide](https://towardsdatascience.com/a-complete-guide-to-an-interactive-geographical-map-using-python-f4c5197e23e0).\n", "\n", "We will work with the same dataset: prevalence of obesity (BMI ≥ 30) among adults, estimated by country and standardised by age.\n", "\n", "The data was obtained from the [Global Health Observatory data repository](http://apps.who.int/gho/data/node.main.A900A?lang=en) (World Health Organization): under `Download complete data set as` click on **more...**, then under **CSV** download `list containing text, codes and values`, or [click here](https://apps.who.int/gho/athena/data/data-verbose.csv?target=GHO/NCD_BMI_30A&profile=verbose&filter=AGEGROUP:*;COUNTRY:*;SEX:*)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, let's clean up and organise the data. We will use the 'pandas' library for this."
] }, { "cell_type": "code", "execution_count": 1, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:17.710454Z", "start_time": "2019-06-09T16:40:17.247155Z" } }, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:17.887712Z", "start_time": "2019-06-09T16:40:17.712411Z" }, "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "Index(['GHO (CODE)', 'GHO (DISPLAY)', 'GHO (URL)', 'PUBLISHSTATE (CODE)',\n", " 'PUBLISHSTATE (DISPLAY)', 'PUBLISHSTATE (URL)', 'YEAR (CODE)',\n", " 'YEAR (DISPLAY)', 'YEAR (URL)', 'REGION (CODE)', 'REGION (DISPLAY)',\n", " 'REGION (URL)', 'COUNTRY (CODE)', 'COUNTRY (DISPLAY)', 'COUNTRY (URL)',\n", " 'AGEGROUP (CODE)', 'AGEGROUP (DISPLAY)', 'AGEGROUP (URL)', 'SEX (CODE)',\n", " 'SEX (DISPLAY)', 'SEX (URL)', 'Display Value', 'Numeric', 'Low', 'High',\n", " 'StdErr', 'StdDev', 'Comments'],\n", " dtype='object')" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# read file\n", "data = pd.read_csv('data-verbose.csv')\n", "data.columns" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:17.946393Z", "start_time": "2019-06-09T16:40:17.890088Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   Year  Code  Country     Prevalence
0  1978  UZB   Uzbekistan  5.0
1  2003  BDI   Burundi     2.8
2  1999  CHN   China       2.2
3  1996  GHA   Ghana       4.4
4  1992  HND   Honduras    9.3
\n", "
" ], "text/plain": [ " Year Code Country Prevalence\n", "0 1978 UZB Uzbekistan 5.0\n", "1 2003 BDI Burundi 2.8\n", "2 1999 CHN China 2.2\n", "3 1996 GHA Ghana 4.4\n", "4 1992 HND Honduras 9.3" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# discard male only and female only data\n", "data = data.loc[data[\"SEX (DISPLAY)\"] == 'Both sexes']\n", "# only keep columns of interest\n", "data = data[['YEAR (CODE)','COUNTRY (CODE)','COUNTRY (DISPLAY)','Numeric']]\n", "data.reset_index(inplace=True, drop=True)\n", "data.rename(columns={\n", " 'YEAR (CODE)': 'Year',\n", " 'COUNTRY (CODE)': 'Code',\n", " 'COUNTRY (DISPLAY)': 'Country',\n", " 'Numeric': 'Prevalence'\n", "}, inplace=True)\n", "data.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since we'll be coloring each country according to the corresponding obesity prevalence, we need access to the shape of each country. This is done using the 'geopandas' package and data from [natural-earth-vector](https://github.com/nvkelso/natural-earth-vector/tree/master/110m_cultural). Download all the files named \"ne_110m_admin_0_countries.*\"" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:18.320879Z", "start_time": "2019-06-09T16:40:17.948984Z" } }, "outputs": [], "source": [ "import geopandas as gpd" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:18.447389Z", "start_time": "2019-06-09T16:40:18.327076Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
   Country                      Code  geometry
0  Fiji                         FJI   (POLYGON ((180 -16.06713266364245, 180 -16.555...
1  United Republic of Tanzania  TZA   POLYGON ((33.90371119710453 -0.950000000000000...
2  Western Sahara               SAH   POLYGON ((-8.665589565454809 27.65642588959236...
3  Canada                       CAN   (POLYGON ((-122.84 49.00000000000011, -122.974...
4  United States of America     USA   (POLYGON ((-122.84 49.00000000000011, -120 49....
\n", "
" ], "text/plain": [ " Country Code \\\n", "0 Fiji FJI \n", "1 United Republic of Tanzania TZA \n", "2 Western Sahara SAH \n", "3 Canada CAN \n", "4 United States of America USA \n", "\n", " geometry \n", "0 (POLYGON ((180 -16.06713266364245, 180 -16.555... \n", "1 POLYGON ((33.90371119710453 -0.950000000000000... \n", "2 POLYGON ((-8.665589565454809 27.65642588959236... \n", "3 (POLYGON ((-122.84 49.00000000000011, -122.974... \n", "4 (POLYGON ((-122.84 49.00000000000011, -120 49.... " ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# read shapes\n", "geo = gpd.read_file(\"ne_110m_admin_0_countries.shp\")[['ADMIN', 'ADM0_A3', 'geometry']]\n", "geo.columns = ['Country', 'Code', 'geometry']\n", "geo.head()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we display the map now, we will see that Antarctica takes a lot of space. Since we don't have data on it, let's drop it." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:18.453802Z", "start_time": "2019-06-09T16:40:18.449110Z" } }, "outputs": [], "source": [ "geo = geo.loc[~(geo['Country'] == 'Antarctica')]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If we look closely at the data, we are missing information on some countries." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:18.465472Z", "start_time": "2019-06-09T16:40:18.455600Z" } }, "outputs": [ { "data": { "text/plain": [ "array(['San Marino', 'Sudan', 'Monaco', 'South Sudan'], dtype=object)" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data[data[\"Prevalence\"].isna()][\"Country\"].unique()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the case of Sudan, it's more of a labelling problem because Sudan was split in 2 separate countries in 2011." 
] }, { "cell_type": "code", "execution_count": 8, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:18.481996Z", "start_time": "2019-06-09T16:40:18.467460Z" } }, "outputs": [ { "data": { "text/plain": [ "array(['Sudan (former)', 'Sudan', 'South Sudan'], dtype=object)" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "data[data[\"Country\"].str.contains(\"Sudan\")][\"Country\"].unique()" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:18.494472Z", "start_time": "2019-06-09T16:40:18.483931Z" } }, "outputs": [ { "data": { "text/plain": [ "array(['Sudan', 'South Sudan'], dtype=object)" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "geo[geo[\"Country\"].str.contains(\"Sudan\")][\"Country\"].unique()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In the current version of the dataset, only \"Sudan (former)\" contains data, but our version of the map only has the two independent states, not the former one. We will simply copy the data from \"Sudan (former)\" into both new countries and drop the former." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:18.723627Z", "start_time": "2019-06-09T16:40:18.496054Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
      Year  Code  Country      Prevalence
1632  2016  SSD   South Sudan  8.6
4848  2016  SDN   Sudan        8.6
\n", "
" ], "text/plain": [ " Year Code Country Prevalence\n", "1632 2016 SSD South Sudan 8.6\n", "4848 2016 SDN Sudan 8.6" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "for year in data[\"Year\"].unique():\n", " data.loc[(data[\"Country\"].isin([\"Sudan\",\"South Sudan\"])) & (data[\"Year\"] == year), \n", " \"Prevalence\"] = data[(data[\"Country\"] == \"Sudan (former)\") & (data[\"Year\"] == year)][\"Prevalence\"].values[0]\n", "data = data.loc[~(data['Country'] == 'Sudan (former)')]\n", "data[(data[\"Country\"].str.contains(\"Sudan\")) & (data[\"Year\"] == 2016)]" ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:18.768492Z", "start_time": "2019-06-09T16:40:18.725228Z" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
     Country      Code  geometry
176  South Sudan  SSD   POLYGON ((30.83385242171543 3.509171604222463,...
\n", "
" ], "text/plain": [ " Country Code geometry\n", "176 South Sudan SSD POLYGON ((30.83385242171543 3.509171604222463,..." ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Also, the 3-letter code for \"South Sudan\" is \"SSD\" and not \"SDS\" in the geographic data\n", "geo.loc[geo[\"Code\"]==\"SDS\", \"Code\"] = \"SSD\"\n", "geo[geo[\"Code\"]==\"SSD\"]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Preparing the plot" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's create the interactive plot. We will use the 'bokeh' and 'matplotlib' libraries for this." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:19.243145Z", "start_time": "2019-06-09T16:40:18.770241Z" } }, "outputs": [], "source": [ "from bokeh.io import save, show, output_file, output_notebook, reset_output, export_png\n", "from bokeh.plotting import figure\n", "from bokeh.models import (\n", " GeoJSONDataSource, ColumnDataSource, ColorBar, Slider, Spacer,\n", " HoverTool, TapTool, Panel, Tabs, Legend, Toggle, LegendItem,\n", ")\n", "from bokeh.palettes import brewer\n", "from bokeh.models.callbacks import CustomJS\n", "from bokeh.models.widgets import Div\n", "from bokeh.layouts import widgetbox, row, column\n", "from matplotlib import pyplot as plt\n", "from matplotlib.colors import rgb2hex" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first thing we need to do is to group our data in predefined bins. We will assign each bin to a color." 
] }, { "cell_type": "code", "execution_count": 13, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:19.295580Z", "start_time": "2019-06-09T16:40:19.244622Z" }, "code_folding": [], "scrolled": true }, "outputs": [], "source": [ "# Create bins to color each country\n", "bins = [0,2,5,10,15,20,25,30,100]\n", "# create stylish labels\n", "bin_labels = [f'≤{bins[1]}%'] + [f'{bins[i]}-{bins[i+1]}%' for i in range(1,len(bins)-2)] + [f'>{bins[-2]}%']\n", "# assign each row to a bin\n", "data['bin'] = pd.cut(\n", " data['Prevalence'], bins=bins, right=True, include_lowest=True, precision=0, labels=bin_labels,\n", ").astype(str)" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:19.319190Z", "start_time": "2019-06-09T16:40:19.297462Z" } }, "outputs": [ { "data": { "text/plain": [ "array(['Western Sahara', 'Falkland Islands', 'Greenland',\n", " 'French Southern and Antarctic Lands', 'Puerto Rico', 'Palestine',\n", " 'New Caledonia', 'Taiwan', 'Northern Cyprus', 'Somaliland',\n", " 'Kosovo'], dtype=object)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Merge the geographic data with obesity data\n", "df = geo.merge(data, on='Code', how='left')\n", "df = df.drop(columns=\"Country_y\").rename(columns={\"Country_x\":\"Country\"})\n", "df[df[\"Prevalence\"].isna()][\"Country\"].unique()" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:19.335127Z", "start_time": "2019-06-09T16:40:19.321603Z" } }, "outputs": [], "source": [ "# Add a 'No data' bin for countries without data on their obesity\n", "df.loc[df['Prevalence'].isna(), 'bin'] = 'No data'\n", "df.fillna('No data', inplace = True)" ] }, { "cell_type": "code", "execution_count": 16, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:19.353401Z", "start_time": "2019-06-09T16:40:19.337840Z" } }, "outputs": [], "source": [ "# Define a 
yellow to red color palette\n", "palette = brewer['YlOrRd'][len(bins)-1]\n", "# Reverse color order so that dark red corresponds to highest obesity\n", "palette = palette[::-1]\n", "\n", "# Assign obesity prevalence to a color\n", "def val_to_color(value, nan_color='#d9d9d9'):\n", "    if isinstance(value, str): return nan_color\n", "    for i in range(1,len(bins)):\n", "        if value <= bins[i]:\n", "            return palette[i-1]\n", "df['color'] = df['Prevalence'].apply(val_to_color)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Since Bokeh doesn't have an interactive colorbar, we will create one by plotting rectangles on a figure. This is a bit cumbersome because we need to define x coordinates and a width for each bin in our data, but I find the interactive colorbar to be very useful." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:19.374823Z", "start_time": "2019-06-09T16:40:19.355901Z" } }, "outputs": [], "source": [ "# assign x coordinates\n", "def bin_to_cbar_x(value):\n", "    if value == 'No data': return -2\n", "    for i,b in enumerate(bin_labels):\n", "        if value == b:\n", "            return 5*(i+1)\n", "df['cbar_x'] = df['bin'].apply(bin_to_cbar_x)\n", "# assign width\n", "df['cbar_w'] = df['Prevalence'].apply(lambda x: 5 if x == 'No data' else 4.7)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We will also add a second figure which displays the evolution of each country's obesity rate. We need to define another color palette for this."
] }, { "cell_type": "code", "execution_count": 18, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:19.395974Z", "start_time": "2019-06-09T16:40:19.376973Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "165 countries to plot\n" ] } ], "source": [ "# create color palette for the graph\n", "countries = sorted(df[df[\"bin\"] != \"No data\"][\"Country\"].unique())\n", "n_country = len(countries)\n", "print(\"%d countries to plot\" % n_country)\n", "cmap = plt.get_cmap('gist_ncar', n_country)\n", "country_palette = [rgb2hex(cmap(i)[:3]) for i in range(cmap.N)]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Plotting" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now all that is left to do is to create the different objects that bokeh will display. Let's start with the datasources. We will define which year to display on the map first, as well as which country. " ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:19.399879Z", "start_time": "2019-06-09T16:40:19.397319Z" } }, "outputs": [], "source": [ "# define the output file\n", "reset_output()\n", "output_file(\"obesity-trends.html\", title=\"Obesity trends\", mode=\"inline\")" ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:23.477821Z", "start_time": "2019-06-09T16:40:19.401138Z" } }, "outputs": [], "source": [ "# Input sources\n", "df.sort_values(by=[\"Country\",\"Year\"], inplace=True)\n", "# source that will contain all necessary data for the map\n", "geosource = GeoJSONDataSource(geojson=df.to_json())\n", "# source that contains the data that is actually shown on the map (for a given year)\n", "displayed_src = GeoJSONDataSource(geojson=df[df['Year'].isin(['No data', 1975])].to_json())\n", "# source that will be used for the graph (we don't need the countries shapes for this)\n", "country_source = 
ColumnDataSource(df[df['Country'] == \"France\"].drop(columns=[\"geometry\"]))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, we define the tools displayed with our map and graph." ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:23.483041Z", "start_time": "2019-06-09T16:40:23.478975Z" } }, "outputs": [], "source": [ "# Tools\n", "\n", "# slider to select the year\n", "slider = Slider(title='Year',start=1975, end=2016, step=1, value=1975)\n", "\n", "# hover tool for the map\n", "map_hover = HoverTool(tooltips=[ \n", "    ('Country','@Country (@Code)'),\n", "    ('Obesity rate (%)', '@Prevalence')\n", "])\n", "\n", "# hover tool for the graph\n", "graph_hover = HoverTool(tooltips=[ \n", "    ('Country','@Country (@Code)'),\n", "    ('Obesity rate (%)', '@Prevalence'),\n", "    ('Year', '@Year')\n", "])\n", "\n", "# button for the animation\n", "anim_button = Toggle(label=\"▶ Play\", button_type=\"success\", width=50, active=False)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now let's create the plot!"
] }, { "cell_type": "code", "execution_count": 22, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:23.512692Z", "start_time": "2019-06-09T16:40:23.484294Z" } }, "outputs": [], "source": [ "# create map figure\n", "p = figure(\n", " title = 'Share of adults who are obese in 1975', \n", " plot_height=550 , plot_width=1100, \n", " toolbar_location=\"right\", tools=\"tap,pan,wheel_zoom,box_zoom,save,reset\", toolbar_sticky=False,\n", " active_scroll=\"wheel_zoom\",\n", ")\n", "p.title.text_font_size = '16pt'\n", "p.xgrid.grid_line_color = None\n", "p.ygrid.grid_line_color = None\n", "p.axis.visible = False\n", "\n", "# Add hover tool\n", "p.add_tools(map_hover)\n", "\n", "# Add patches (countries) to the figure\n", "patches = p.patches(\n", " 'xs','ys', source=displayed_src, \n", " fill_color='color',\n", " line_color='black', line_width=0.25, fill_alpha=1, \n", " hover_fill_color='color',\n", ")\n", "# outline when we hover over a country\n", "patches.hover_glyph.line_color = '#3bdd9d'\n", "patches.hover_glyph.line_width = 3\n", "patches.nonselection_glyph = None" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:23.604836Z", "start_time": "2019-06-09T16:40:23.515127Z" } }, "outputs": [], "source": [ "# create the interactive colorbar\n", "p_bar = figure(\n", " title=None, plot_height=80 , plot_width=600, \n", " tools=\"tap\", toolbar_location=None\n", ")\n", "p_bar.xgrid.grid_line_color = None\n", "p_bar.ygrid.grid_line_color = None\n", "p_bar.outline_line_color = None\n", "p_bar.yaxis.visible = False\n", "\n", "# set the title and ticks of the colorbar\n", "p_bar.xaxis.axis_label = \"% Obesity (BMI ≥ 30)\"\n", "p_bar.xaxis.ticker = sorted(df['cbar_x'].unique())\n", "p_bar.xaxis.major_label_overrides = dict([(i[0],i[1]) for i in df.groupby(['cbar_x','bin']).describe().index])\n", "p_bar.xaxis.axis_label_text_font_size = \"12pt\"\n", "p_bar.xaxis.major_label_text_font_size = \"10pt\"\n", "\n", 
"# activate the hover but hide tooltips\n", "hover_bar = HoverTool(tooltips=None)\n", "p_bar.add_tools(hover_bar)\n", "\n", "# plot the rectangles for the colorbar\n", "cbar = p_bar.rect(x='cbar_x', y=0, width='cbar_w', height=1, \n", " color='color', source=displayed_src,\n", " hover_line_color='#3bdd9d', hover_fill_color='color')\n", "\n", "# outline when we hover over the colorbar legend\n", "cbar.hover_glyph.line_width = 4\n", "cbar.nonselection_glyph = None" ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:24.447675Z", "start_time": "2019-06-09T16:40:23.606643Z" } }, "outputs": [], "source": [ "# create the graph figure\n", "p_country = figure(\n", " title=\"Evolution of obesity\", plot_height=700 , plot_width=1100, \n", " tools=\"pan,wheel_zoom,save\", active_scroll=\"wheel_zoom\", toolbar_location=\"right\",\n", ")\n", "p_country.title.text_font_size = '14pt'\n", "p_country.xaxis.axis_label = \"Year\"\n", "p_country.yaxis.axis_label = \"Obesity rate (%)\"\n", "p_country.axis.major_label_text_font_size = \"12pt\"\n", "p_country.axis.axis_label_text_font_size = \"14pt\"\n", "\n", "# plot data on the figure\n", "line_plots = {}\n", "legend_items = {}\n", "for i, country in enumerate(countries):\n", " # get subset of data corresponding to a country\n", " country_source = ColumnDataSource(df[df['Country'] == country].drop(columns=[\"geometry\"]))\n", " # plot\n", " line = p_country.line(\"Year\", \"Prevalence\", legend=False, source=country_source, \n", " color=country_palette[i], line_width=2)\n", " circle = p_country.circle(\"Year\", \"Prevalence\", legend=False, source=country_source, \n", " line_color=\"darkgrey\", fill_color=country_palette[i], size=8)\n", " # used later in the interactive callbacks\n", " line_plots[country] = [line, circle]\n", " legend_items[country] = LegendItem(label=country, renderers=[line, circle])\n", " # only display France at first\n", " if country != \"France\":\n", " 
line.visible = False\n", " circle.visible = False\n", "\n", "default_legend = [\n", " (\"France\", line_plots[\"France\"]),\n", "]\n", "legend = Legend(items=default_legend, location=\"top_center\")\n", "legend.click_policy = \"hide\"\n", "p_country.add_layout(legend, 'right')\n", "\n", "# Add hover tool\n", "p_country.add_tools(graph_hover)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The interactivity will be done with JavaScript callbacks since they give much more liberty and we won't need to run a Bokeh server to display the map." ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:24.456683Z", "start_time": "2019-06-09T16:40:24.448931Z" }, "code_folding": [], "scrolled": false }, "outputs": [], "source": [ "# JS callbacks\n", "\n", "# Update the map on slider change\n", "slider_callback = CustomJS(args=dict(slider=slider, source=geosource, displayed_src=displayed_src), code=\"\"\"\n", " var year = slider.value;\n", " var show = [year, 'No data'];\n", " var data = {};\n", " columns = Object.keys(source.data);\n", " columns.forEach(function(key) {\n", " data[key] = [];\n", " });\n", " for (var i = 0; i < source.get_length(); i++){\n", " if (show.includes(source.data['Year'][i])){\n", " columns.forEach(function(key) {\n", " data[key].push(source.data[key][i])\n", " });\n", " }\n", " }\n", " displayed_src.data = data;\n", " displayed_src.change.emit();\n", "\"\"\")\n", "slider.js_on_change('value', slider_callback)\n", "\n", "# Update figure title from slider change\n", "callback_title = CustomJS(args=dict(slider=slider, figure=p), code=\"\"\"\n", " var year = slider.value;\n", " figure.title.text = 'Share of adults who are obese in ' + year;\n", "\"\"\")\n", "slider.js_on_change('value', callback_title)\n", "\n", "\n", "# Add callback on country click\n", "plot_callback = CustomJS(args=dict(\n", " csource=country_source, source=geosource, displayed_src=displayed_src, 
line_plots=line_plots, legend=legend, legend_items=legend_items), code=\"\"\"\n", " // only continue if a country was selected\n", " var ixs = displayed_src.selected.indices;\n", " if (ixs.length == 0) { return; }\n", " \n", " // init\n", " var data = {};\n", " var items = [];\n", " countries = [];\n", " columns = Object.keys(source.data);\n", " columns.forEach(function(key) {\n", " data[key] = [];\n", " });\n", " \n", " // hide all plots\n", " for (var country in line_plots) {\n", " var line = line_plots[country][0];\n", " var circle = line_plots[country][1];\n", " line.visible = false;\n", " circle.visible = false;\n", " }\n", " \n", " // loop over the selected countries\n", " ixs.forEach(function(ix) {\n", " // identify corresponding country\n", " country = displayed_src.data[\"Country\"][ix];\n", " countries.push(country);\n", " });\n", " // sort them in order\n", " countries.sort()\n", " // display the corresponding glyphs and legend\n", " countries.forEach(function(country) {\n", " line = line_plots[country][0];\n", " circle = line_plots[country][1];\n", " line.visible = true;\n", " circle.visible = true;\n", " items.push(legend_items[country]);\n", " \n", " for (var i = 0; i < source.get_length(); i++){\n", " if (source.data['Country'][i] == country) {\n", " columns.forEach(function(key) {\n", " data[key].push(source.data[key][i])\n", " });\n", " }\n", " }\n", " });\n", " legend.items = items;\n", " csource.data = data;\n", " csource.change.emit();\n", "\"\"\")\n", "displayed_src.selected.js_on_change('indices', plot_callback)\n", "\n", "# add animation\n", "update_interval = 500 # in ms\n", "anim_callback = CustomJS(args=dict(slider=slider, update_interval=update_interval), code=\"\"\"\n", " var button = cb_obj;\n", " if (button.active == true){\n", " button.label = \"◼ Stop\";\n", " button.button_type = \"danger\";\n", " mytimer = setInterval(update_year, update_interval); \n", " } else {\n", " button.label = \"▶ Play\";\n", " button.button_type = 
\"success\";\n", " clearInterval(mytimer);\n", " }\n", "\n", " function update_year() {\n", " year = slider.value;\n", " if (year < 2016) {\n", " slider.value += 1;\n", " } else {\n", " slider.value = 1975;\n", " }\n", " }\n", "\"\"\")\n", "anim_button.callback = anim_callback" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, we define the layout for all these elements. We will have 2 tabs, one for the map, and one for the chart." ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:24.475205Z", "start_time": "2019-06-09T16:40:24.458064Z" } }, "outputs": [], "source": [ "# arrange display with tabs\n", "tab_map = Panel(title=\"Map\",\n", " child=column(\n", " p, # map\n", " p_bar, # colorbar\n", " row(widgetbox(anim_button), Spacer(width=10), widgetbox(slider)) # animation button and slider\n", " ))\n", "tab_chart = Panel(title=\"Chart\", child=column(p_country))\n", "tabs = Tabs(tabs=[ tab_map, tab_chart ])" ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "ExecuteTime": { "end_time": "2019-06-09T16:40:26.784784Z", "start_time": "2019-06-09T16:40:24.476885Z" }, "scrolled": true }, "outputs": [], "source": [ "# save the document and display it !\n", "footer = Div(text=\"\"\"\n", "Data: World Health Organization - Global Health Observatory
\n", "Author: Cédric Bouysset\n", "\"\"\")\n", "layout = column(tabs, footer)\n", "show(layout)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.7" }, "latex_envs": { "LaTeX_envs_menu_present": true, "autocomplete": true, "bibliofile": "biblio.bib", "cite_by": "apalike", "current_citInitial": 1, "eqLabelWithNumbers": true, "eqNumInitial": 1, "hotkeys": { "equation": "Ctrl-E", "itemize": "Ctrl-I" }, "labels_anchors": false, "latex_user_defs": false, "report_style_numbering": false, "user_envs_cfg": false }, "toc": { "base_numbering": 1, "nav_menu": {}, "number_sections": true, "sideBar": true, "skip_h1_title": false, "title_cell": "Table of Contents", "title_sidebar": "Contents", "toc_cell": false, "toc_position": { "height": "calc(100% - 180px)", "left": "10px", "top": "150px", "width": "259.261px" }, "toc_section_display": "block", "toc_window_display": false } }, "nbformat": 4, "nbformat_minor": 2 }