Case study: Visualising uncertainty
What is the issue?
Data values are rarely 'certain': there is almost always some degree of uncertainty. For example, survey values are usually provided with Confidence Intervals showing the likely range of the data values. Even where data is not obtained from surveys - such as data on pupil exam results - there may be fluctuation over time due to changing populations (each year, a different set of pupils takes the exam).
When deciding whether values are different, the likely range of the data values should be assessed to judge whether differences over time or between areas are likely to be statistically significant. The Royal Statistical Society recommends that performance reporting should always include measures of uncertainty (although in practice this does not always occur!) [1].
How can visualisation help?
Visualisations can show uncertainty alongside data values, for example presenting the Confidence Intervals to indicate the range within which a survey result might lie. This visualisation of uncertainty can help with both analysing the data, and communicating results, for example:
- helping identify whether differences between areas, or over time, are likely to be significant;
- signposting that there is some uncertainty on what the 'true' value should be, and that the data value is not exact;
- highlighting which datasets are more reliable than others for decision-making (for example, more weight might be given in decision-making to those datasets with smaller confidence intervals - such as those based on bigger survey samples).
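The link between sample size and interval width can be illustrated with a short calculation. This is a minimal sketch, assuming a simple random sample and the normal approximation for a survey proportion; the function name `ci_half_width` and the 30% figure are invented for illustration and do not come from any dataset discussed here.

```python
import math

def ci_half_width(p, n, z=1.96):
    """Half-width of an approximate 95% confidence interval for a
    proportion p estimated from a simple random sample of size n
    (normal approximation; an assumption of this sketch)."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical survey result of 30% at three different sample sizes.
for n in (100, 400, 1600):
    hw = ci_half_width(0.30, n)
    print(f"n={n:5d}: 30% +/- {100 * hw:.1f} percentage points")
```

Under these assumptions, quadrupling the sample size halves the width of the interval, which is why results from larger surveys can usually be given more weight.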
One of the key recommendations from the US National Visualization and Analytics Center's study into how visualisation can support "advanced analytic insight" was to develop methods and principles for representing data quality, reliability, and certainty measures throughout the data transformation and analysis process [2].
Visualising Confidence Intervals
The most commonly used method for visualising uncertainty is to add Confidence Intervals to the chart, as in the figure below. The width of each Confidence Interval is shown by the length of the lines above and below the data value. Where the intervals overlap, differences are unlikely to be significant.
Source: Joint Strategic Needs Assessment, Surrey County Council
The figure shows mortality from alcohol-attributable conditions across England, Surrey and the Boroughs in Surrey. The size and overlap of the confidence intervals indicate that at Borough level, differences in levels of mortality from alcohol-attributable conditions between the Boroughs, or between men and women in each Borough, are unlikely to be statistically significant. However, differences between the genders are likely to be significant at county and national level (data for these larger areas is based on greater numbers of people, so confidence intervals are correspondingly smaller).
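The overlap check described above can be sketched in a few lines. This is an illustration only: the rates and standard errors below are invented (they are not the Surrey figures), the helper names are hypothetical, and the intervals assume a normal approximation at the 95% level.

```python
def confidence_interval(estimate, std_error, z=1.96):
    """Approximate 95% interval: estimate +/- z * standard error."""
    return (estimate - z * std_error, estimate + z * std_error)

def intervals_overlap(a, b):
    """True if intervals a and b, each a (lower, upper) pair, share any values."""
    return a[0] <= b[1] and b[0] <= a[1]

# Invented rates per 100,000 and standard errors, for illustration only.
borough_men = confidence_interval(30.0, 5.0)
borough_women = confidence_interval(18.0, 4.0)
national_men = confidence_interval(32.0, 0.4)
national_women = confidence_interval(15.0, 0.3)

print("Borough intervals overlap: ", intervals_overlap(borough_men, borough_women))
print("National intervals overlap:", intervals_overlap(national_men, national_women))
```

With these made-up numbers the borough intervals overlap while the national ones do not, mirroring the pattern in the figure. Overlap is a quick visual guide rather than a formal test: two intervals can overlap slightly while a test on the difference itself is still significant.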
Funnel plots are a powerful way of visualising uncertainty and confidence intervals, particularly where performance data is compared against targets [3]. Funnel plots are slightly harder to interpret than confidence intervals, so the primary audiences for funnel plots are more likely to be analysts and researchers than senior decision-makers (although, of course, some decision-makers may want to see this kind of analysis - see the guide on practical steps for good visualisation for examples of testing your visualisation with key audiences).
Source: David Spiegelhalter, Medical Research Council Biostatistics Unit [4]
The figure above shows re-admissions following stroke, for large acute or multi-service NHS Trusts in England. Key elements of the plot include:
- Data is plotted for number of admissions (x-axis) against percentage of patients readmitted (y-axis).
- Institutions are not ranked.
- The national target for readmission (7.5%) is shown by the horizontal black line. The confidence intervals around this target are shown by the dotted lines - the plot takes into account the increased variability of the smaller units (the Confidence Interval bands are much wider for cases with fewer admissions, i.e. those to the left of the plot).
- Those data points that lie inside the dotted lines are not significantly different from the national target. Those that lie outside (the 2 outliers at the top of the chart) show significantly worse performance than the national target.
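The funnel shape comes from limits that narrow as the number of admissions grows. The sketch below assumes 95% limits from the normal approximation to the binomial around the 7.5% target; the plot's actual limits may be calculated differently (for example with exact binomial methods or another confidence level), and the trust names and counts are invented.

```python
import math

def funnel_limits(target, n, z=1.96):
    """Approximate 95% limits around a target proportion for a unit
    with n cases (normal approximation; an assumption of this sketch)."""
    half = z * math.sqrt(target * (1 - target) / n)
    return (max(0.0, target - half), min(1.0, target + half))

def classify(readmitted, admissions, target=0.075):
    """Compare a unit's readmission rate with the limits for its size."""
    low, high = funnel_limits(target, admissions)
    rate = readmitted / admissions
    if rate > high:
        return "above upper limit"
    if rate < low:
        return "below lower limit"
    return "within limits"

# Hypothetical units: (patients readmitted, total admissions).
units = {"Trust A": (12, 100), "Trust B": (60, 500), "Trust C": (40, 600)}
for name, (readmitted, admissions) in units.items():
    low, high = funnel_limits(0.075, admissions)
    print(f"{name}: rate {readmitted / admissions:.1%}, "
          f"limits {low:.1%} to {high:.1%}, {classify(readmitted, admissions)}")
```

In this made-up data, Trust A and Trust B both have a 12% readmission rate, but only Trust B falls outside the limits: with 500 admissions the band is much narrower than with 100.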
David Spiegelhalter identifies key advantages in using funnel plots for comparing performance on outcomes [5]:
- There is no spurious ranking of institutions
- The eye is naturally drawn to important points that lie outside the funnels
- There is allowance for increased variability of the smaller units
- The axes are easily interpretable, so additional data points can be added by hand
- Repeated observations over time can be plotted on the funnel, and joined-up to show progress
- Easy to produce with standard spreadsheet programs [6]
How to create confidence interval and funnel plot visualisations
- Adding confidence intervals to Excel charts: Excel can add confidence intervals to standard bar charts from the menu using Chart Tools/ Format/ Analysis/ Error Bars. Alternatively, you can use the built-in Excel Stock chart type - see this guide from UCLA for more information: http://www.ph.ucla.edu/EPI/rapidsurveys/RScourse/chartconfinterval.pdf
- Creating funnel plots: The Eastern Regional Public Health Observatory (ERPHO) has set up an Excel template with instructions for how to create funnel plots at http://www.erpho.org.uk/viewResource.aspx?id=14838
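When confidence intervals are published as lower and upper bounds, spreadsheet error bars with custom amounts typically need the distances from the data value rather than the bounds themselves. The sketch below shows that conversion, assuming the data is prepared outside the spreadsheet; the column names, area labels, and figures are invented for illustration.

```python
import csv
import io

def error_bar_amounts(estimate, lower, upper):
    """Return (plus, minus): the distances from the estimate up to the
    upper bound and down to the lower bound."""
    return (upper - estimate, estimate - lower)

def to_csv(rows):
    """rows holds (label, estimate, lower, upper) tuples.
    Returns CSV text with the plus and minus error amounts added."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["area", "estimate", "plus", "minus"])
    for label, estimate, lower, upper in rows:
        plus, minus = error_bar_amounts(estimate, lower, upper)
        writer.writerow([label, estimate, round(plus, 2), round(minus, 2)])
    return buffer.getvalue()

# Invented estimates with lower and upper confidence bounds.
rows = [("Area A", 30.0, 20.0, 42.0), ("Area B", 18.0, 10.5, 26.0)]
print(to_csv(rows))
```

The plus and minus columns can differ for the same area, because confidence intervals are not always symmetric around the data value.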
1. Royal Statistical Society Working Party on Performance Monitoring in the Public Services (2003). Performance Indicators: The Good, Bad and Ugly. Available from http://www.rss.org.uk/main.asp?page=1222
3. E.g., see Spiegelhalter, D. (2004). Funnel plots for comparing institutional performance. Cambridge: MRC Biostatistics Unit. http://www.mrc-bsu.cam.ac.uk/BSUsite/AboutUs/People/davids/funpap.pdf (PDF)
4. Shown in ERPHO (2003). Quantifying performance: using performance indicators. http://www.erpho.org.uk/Download/Public/6990/1/INPHO%204%20Quantifying%20performance.pdf (PDF)
5. Spiegelhalter, D. (2004). Funnel plots for comparing institutional performance. Cambridge: MRC Biostatistics Unit. http://www.mrc-bsu.cam.ac.uk/BSUsite/AboutUs/People/davids/funpap.pdf (PDF)
6. See http://www.erpho.org.uk/viewResource.aspx?id=14838 for instructions and template to create in Excel.