Now that you visualize data in your own studio, what does your job look like?
As a freelancer, I have the freedom to use my time in three important ways: research for personal projects, lectures, and project development for clients. The main difference is that this kind of work calls for a more flexible approach to projects, because I need to understand each client's specific field, how their data is organized, and their questions, programs, and data culture. This means that each project is different and presents me with interesting challenges, topics, and methods.
How did you decide to focus on visualization full time?
I have been doing data visualization in one way or another for over ten years now. For most of my career, it has been a sort of work hobby that my colleagues noticed and appreciated. At the BBC, I got the opportunity to turn that “work hobby” into an official position. Last year, I decided to create The Synthesis Bureau so that I could devote myself fully to visualization.
I can’t imagine a more interesting profession in a more interesting period. Visualization combines several disciplines that were previously not considered to be connected: statistics and design, programming and the theory of art, journalism and mathematics. This means that I constantly have to learn something new, and I have opportunities to explore different methods and try new concepts and ideas.
Which problems are you faced with in the process of selecting data and visualizations?
It may be a cliché, but it’s frequently true that 80% of the effort in data visualization is devoted to data preparation. Data often comes in complicated formats and is not adapted for machine learning, which represents a technical challenge. I find conceptual challenges interesting as well. How was the data collected? Is it data we have already seen? Is there a new way to present this data? What is missing in the data and how do we show it? Is the data unveiling something unexpected? How much do I trust the data source? Is the data important for a topic that is currently relevant?
What are the issues that arise in the process of perceiving and reaching conclusions based on inadequate analyses and visualizations?
Data visualization seems like a simple process, but it is extremely easy to make mistakes. One of the classic examples is Edward Tufte’s argument that NASA engineers could have prevented the explosion of the space shuttle Challenger with better data visualization (this topic has been widely discussed within the community). In everyday visualization, it is important to fully comprehend the data, find the best way to present it and, most importantly, focus on how readers will interpret what is presented. It’s possible to make mistakes in all three steps. When comprehending the data, there may be flaws in the data collection methods that should be understood and explained to the readers. When presenting the data, one could, for example, use a graph with 3D effects that reduce the readability of the visualization. And finally, one should understand the users of the visualization. For example, a specific use of colors can be unclear for users who are color blind. There are cases when readers don’t understand the terminology, acronyms, or units of measurement, and we can also run into issues with user experience design: for example, users might not understand how to use the interactive components of the visualization.
It is important to say that all of us still have a lot to learn about data visualization. The Economist magazine, which has a long tradition of presenting data innovatively, published a list of examples where they think they made mistakes. In the last ten years, we’ve seen fantastic work in the academic sector that is trying to set visualization standards that are easy to implement. For example, it’s good to follow Multiple Views, which publishes the principles researched by the academic community, or the Data Visualisation Society, which started an excellent discussion on the standards in this profession.
How do we popularize open data use?
Keep in mind that the data visualization community loves a good challenge. Makeover Monday and Storytelling With Data Challenge are excellent examples of initiatives that use public data to tell interesting stories through data visualization. I personally find inspiration in projects that connect data experts with charity organizations facing particular challenges or questions. This gives open data use a very pragmatic dimension. There are several organizations dealing precisely with this topic. DATA4CHANGE and DataKind are great examples. Finally, open data that is accessible and easy to process will always be an interesting candidate for visualization. Having quality metadata, a good data structure, expert commentary, clear labels, and standardized tables is extremely important in developing projects.
Given that successful examples are the best motivator, can you share with us a few examples where data visualization made a difference?
A classic, and my favorite, example is the graph that Florence Nightingale made during the Crimean War. In order to convince policymakers to invest in better hospital conditions, she created a graph presenting the causes of death in the Crimean War in three categories: deaths on the battlefield, deaths that could have been prevented, and deaths from other causes. Looking at the graph, it becomes clear that deaths from preventable causes are the largest category, and something that the British policymakers could change. This was the moment that transformed the understanding of health and sanitary conditions.