Interpreting Data

Kaiser Fung is a statistician with more than a decade of experience in applying statistical methods to unlocking the relationship between advertising and customer behaviors. His blog, “Junk Charts,” pioneered the genre of critically examining data and graphics in the mass media. He is an adjunct professor at New York University, where he teaches practical statistics to professionals, and he holds statistics, business, and engineering degrees from Cambridge, Harvard, and Princeton Universities. Fung is also a fellow of the Royal Statistical Society. One of his books, Numbers Rule Your World: The Hidden Influence of Probabilities and Statistics on Everything You Do, was written in the popular tradition of eye-opening bestsellers like Freakonomics, The Tipping Point, and Super Crunchers.

Note: This is a transcription of a podcast. It has not gone through a professional editing process and may contain grammatical errors or incorrect formatting.

Transcription of the Podcast

Joe Dager:  Welcome, everyone. This is Joe Dager, the host of the Business 901 Podcast. With me today is Kaiser Fung. Kaiser is a professional statistician who holds degrees from Cambridge, Harvard, and Princeton Universities. He is a statistician for Sirius XM radio and applies his method to advertising and customer behavior. Kaiser, I’d like to welcome you to the podcast, and you’ve just written a new book that is called “Numbers Rule Your World.” Could you tell us why you wrote the book and what were you hoping for someone to get out of it?

Kaiser Fung:  Yes. Thanks for having me here, Joe. Well, the book “Numbers Rule Your World” is really a book about statistical thinking applied to real world problems. It takes a storytelling approach very similar to what you see in Freakonomics or in any of the Malcolm Gladwell books. There’s a little mathematics in there. The point that I’m trying to make at a very high level is that oftentimes in practice it is the communication of the statistical information that is the most important to influencing any decisions. The math itself is in the background and obviously you want to do it right, but once that’s done there’s a lot of stuff that has to happen in order to make an impact on the world.

I think that’s the direction that I would like to take the conversation in regards to the whole business analytics area. Now that we have found the great insight, what are we going to do with it?

Joe:  We go back to most people don’t really understand numbers?

Kaiser:  I think we all have some intuitive understanding of numbers. However, there are a lot of traps in terms of interpreting numbers. Unfortunately, I think schools don’t do a really good job in conveying that piece of information. So that’s actually another motivation for writing this sort of book. If you open up a statistics textbook or any sort of introduction to statistics, there’s a way in which the materials are organized, and it’s typically organized by technical methodology. You have a chapter on confidence intervals. You have a chapter on hypothesis testing. You have a chapter on regression techniques and so on, so forth. Those are not concepts; those are techniques. What is inside all these techniques are very important concepts that are needed to think about numbers.

My book is organized around five concepts. Things like thinking about variability as opposed to averages, figuring out how to form comparison groups, how do you compare like with like, big concepts like that. Typically, if you then read what I’ve written and go back to a textbook, it will be very challenging to find which parts of the textbook actually cover the concepts that I talk about, and yet those concepts are really what people should be thinking about when they think about numbers.

Joe:  I realize, after reading your book, it kind of reinforces a few things. You start reading a little bit differently. You start picking things out, “Well, heck, that guy’s talking about an individual, he’s not really talking about, for lack of a better word, the average person.”

Kaiser:  I think what happens is that when you are reading some of these other textbooks and so on, so forth, you’re kind of so immersed in the math and so immersed in the numbers that you can’t really see what you’re doing. Oftentimes it’s not the case that they don’t teach you anything about the concepts. It’s sort of sitting in the background. What I’m trying to do is to expose and make these things apparent and put them in the foreground as opposed to the background. So we’ve essentially flip-flopped the way that this is traditionally presented. So all of the methodologies and the math are kind of sitting in the background now, and I’m exposing the parts that are really important to get. Even if you don’t get the math, you could at least get how to look at the world.

Oftentimes you notice that these are things that you probably are doing instinctively. It should just reinforce, “Here’s the theoretical framework behind why you would look at things in a certain way.”

Joe:  I don’t know if you do a lot of this instinctively. I think numbers, statistics, and probabilities are somewhat counterintuitive for most of us. We want to accept the story about something and that story usually comes from an individual. I think you mentioned in one of the subheadings of your chapters that when you’re ready to get off the flight, the pilot comes on and says, “You just got done with the safest part of your journey. Now, go drive home.” That really brings a lot of truth to what you were saying.

Kaiser:  Yeah. I think you’re right in the sense that if you just go by intuition, intuition leads us to a lot of wrong places. But on the other hand, oftentimes the intuitive part is not completely wrong. If we only realize a couple of things, then we realize that the typical way of looking at things or framing the problem is what causes us to come to a wrong conclusion. So the whole fear of flying is often based on somebody telling you about specific plane crashes or looking at certain tables of, “Here are the top ten worst air disasters ever.” What happens is that somebody has framed the discussion and made us focus only on the 10 things that went wrong as opposed to the tens of millions of flights that happen every day that actually went without issues.

In that case, if you were to take as your basis of analysis only the 10 things that went wrong, and you apply the same statistical thinking only to those 10 things, then you would come to the conclusion that flying is very dangerous. But if you actually apply the same type of thinking to the larger set of all flights that actually went out there, then you would arrive at a different conclusion.

I guess it goes back to the point that the math is probably not the part where things go wrong. The formulas are the formulas; it’s how you apply them. Are you applying them to the right things and are you taking into account the specific context of application? A lot of those contextual factors are things that determine whether you end up with the correct conclusion or an incorrect conclusion.

Joe:  You spend a lot of time, I think, talking about variability. Or maybe not a lot of time, but you start out with that because that really is whether numbers work or not.

Kaiser:  Yes. I think somewhere in the book I have said that you could, in fact, define the field of statistics as the study of variability because what happens is that if there’s no variability, if everything is just like the average, there really is nothing to study because every data set would literally have the same number. The whole field of statistics is there because things don’t happen the same way. It is hard for people to accept. I oftentimes think we like certainty and it’s very difficult to accept, for instance, say in business you like your metrics to be stable, but then it gets difficult to accept the fact that if the whole company pursues the exact same strategy and you don’t change a thing from one month to the next month, the metrics are not going to look exactly the same. There’s some natural fluctuation based on things that you can’t control.

So oftentimes, in my work, you have to convince people that if you’re going to do something different, that something different has to be much better than what normally might happen. How do you determine if the difference that you are observing after you changed your ways is a real difference, or just a difference that might happen regardless of anything different that you are doing? So certainly, trying to get people to accept the fact that there is some background variability no matter what is a big challenge.
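
One way to make that question concrete is a permutation test. The sketch below is only an illustration and is not something described in the podcast: the function names, the monthly conversion-rate framing, and every number in it are hypothetical. It first simulates a metric under an unchanged strategy to show the natural fluctuation, then estimates how often a difference at least as large as the observed one would turn up if the change had no effect at all.

```python
import random
import statistics


def simulate_monthly_rates(true_rate, customers_per_month, months, seed=0):
    """Simulate a monthly conversion rate when nothing about the strategy changes.

    Every month uses the same underlying true_rate, so any month-to-month
    movement in the result is pure background variability.
    """
    rng = random.Random(seed)
    rates = []
    for _ in range(months):
        conversions = sum(
            1 for _ in range(customers_per_month) if rng.random() < true_rate
        )
        rates.append(conversions / customers_per_month)
    return rates


def permutation_p_value(before, after, n_shuffles=5000, seed=0):
    """Share of random relabelings whose mean difference is at least as large
    as the observed one. A small value suggests the observed difference is
    unlikely to be background variability alone."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(after) - statistics.mean(before))
    pooled = list(before) + list(after)
    n_before = len(before)
    at_least_as_large = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        diff = abs(
            statistics.mean(pooled[n_before:]) - statistics.mean(pooled[:n_before])
        )
        if diff >= observed:
            at_least_as_large += 1
    return at_least_as_large / n_shuffles


if __name__ == "__main__":
    # Hypothetical example: 12 months with an identical strategy still differ.
    print(simulate_monthly_rates(true_rate=0.10, customers_per_month=1000, months=12))

    # Hypothetical before/after metrics; is the change bigger than the noise?
    before = [0.100, 0.110, 0.090, 0.100, 0.105, 0.095]
    after = [0.104, 0.112, 0.097, 0.101, 0.109, 0.099]
    print(permutation_p_value(before, after))
```

The shuffle step stands in for "a difference that might happen regardless": if the labels "before" and "after" carry no information, shuffling them should produce differences as large as the observed one fairly often.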

Joe:  In sales and marketing we talk about forecasts, but most of us don’t understand what a forecast means because there is a lot of uncertainty in forecasting. Can you really manage uncertainty without a good grasp of numbers?

Kaiser:  I think that’s sort of the key. It’s one of the key ideas behind statistics, and when I teach my students at NYU, that is one of those big points I want to convey. Statistics is not there to eliminate or to cure variability. Variability is something that’s out there that cannot be cured. What statistics does is to create a way for us to measure the degree of variability. Because the whole point is that if you are unable to measure it, you don’t know how large or how small the variability is, and you cannot find ways to minimize it. So it is certainly true that the whole point is to try to figure out how big it is.

Being able to measure it is the first step to controlling that. Oftentimes all you are left with is to establish the range of expectations. But you cannot really get rid of the fact that things do change, whether you like it or not.
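
As a rough sketch of what "establishing the range of expectations" can look like, the snippet below measures the spread of a metric's history and flags values that fall outside it. This is an assumption-laden illustration, not a method from the podcast: the function names are hypothetical, the history values are made up, and the choice of two standard deviations around the mean is just one common convention.

```python
import statistics


def expected_range(history, k=2.0):
    """Return (low, high): the mean of the history plus or minus k sample
    standard deviations. k=2.0 is an assumed convention, not a rule."""
    center = statistics.mean(history)
    spread = statistics.stdev(history)
    return center - k * spread, center + k * spread


def is_unusual(value, history, k=2.0):
    """True if value falls outside the range expected from the history."""
    low, high = expected_range(history, k)
    return value < low or value > high


if __name__ == "__main__":
    # Hypothetical monthly metric recorded while nothing was changed.
    history = [100, 102, 98, 101, 99]
    print(expected_range(history))
    print(is_unusual(103, history), is_unusual(110, history))
```

The range does not remove the variability. It only says how much movement to expect before a new number deserves attention.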

Joe:  I think statistics allow you really to embrace uncertainty.

Kaiser:  Yeah, I think that’s a very big change. I started out in the engineering world, and you find two very different ways of thinking about the world. If you want to jump then to statistics, you really have to start from the point of view that pretty much anything that you are looking at is uncertain. For things that have small levels of uncertainty, it is not very harmful to look at averages. You won’t make the wrong decision just based on the averages. But then there are other things where the uncertainty is much higher, and you end up making wrong decisions if you don’t account for that.

Joe:  We have statistics for everything now. We are flooded with statistics; we are flooded with data. Marketing companies, the smart companies, are really embracing them and using them to their advantage. But where do you see the average company? Do they really understand the statistics that are available to them, and are they using them effectively?

Kaiser:  Yes, I think this whole field of what I would call business analytics is still kind of developing. I would say the average company is probably not using them in the best way possible, and I think as a field, we are still trying to find a way to become more influential in the boardroom. That’s one of the reasons behind some of the books that I write and the two blogs that I publish, one of which is Junk Charts, which is about visualizing data. The other one is just about how to communicate statistical information in sort of a written format.

All are about how you communicate your information in the most effective way to people who are not necessarily technical thinkers or quantitative thinkers. Because ultimately, we can’t really change anything in the world, and also in the business world, without the ability to influence people who do not necessarily think in a quantitative way.

Any important decisions are going to involve people who either think intuitively or who are not trained in any quantitative skills. I think in the past, and probably in the present too in certain settings, there are people who feel that, well, if they don’t know anything about math, we first need to teach them the math and then they will understand what we are talking about.

I think that’s just a completely wrong way to approach it. It is kind of like saying if you want to go do business with a local company in China, you should first teach them English, and then you speak to them in English. I think one of the big challenges is going to be how we speak to non-technical people in a language that works. That’s one of the keys, I think, to becoming more effective.

Joe:  So what languages do we speak to them? How do we do that?

Kaiser:  I think that’s kind of what I wanted to show in Numbers Rule Your World. Some of the reviewers actually noticed that despite that title, there are not a whole lot of numbers. There is certainly no statistical formula anywhere in the book. The point is that we really should speak to people in English, and in the language of business as opposed to the language of math and the language of statistics. We obviously need the language of math and the language of statistics to actually analyze the data and to come to conclusions. But once we know what the data is telling us and once we know what the insights are, we should think of the communication aspect as a whole other task that is separate from the task of doing math.

I am also hoping that there is a shift in emphasis, even in the books written about analytics, from books that focus on how you get insights to things like my book, which focuses on what the context behind decision making is and how these decisions are actually made.

If you think about books like Freakonomics or those by Malcolm Gladwell, they’ve done a great job for us in terms of expanding people’s vision and also getting the world excited about data-driven decision making.

But if you look at how they present their stories, typically they end at the point where, OK, this is how they actually looked at the data, and this is what they found. In practice, that’s just the beginning of the hardest part of the job, which is how you now get other people to embrace these insights. How do you now get these insights to influence either a public policy or a business decision?

Joe:  I think what’s kind of scary to a lot of people, and to the average guy that is maybe sitting in the meeting, is that someone that’s good with numbers, some statistician, can make the numbers say whatever they want them to say. Can’t they?

Kaiser:  Well, that’s yes. I would say yes and no.

Joe:  Well, I can’t say that literally.

Kaiser:  It is certainly true that statistics is not a black and white field, and that’s also something that separates statisticians from pure mathematicians. Although statistics is often considered part of mathematics, it is a very different field in the sense that we accept uncertainty as one of the prerequisites, one of those things that just sit there and cannot be changed about the field of statistics. The other thing that cannot be changed is there being incomplete information. We don’t need to do statistics if we have complete information about everything. Almost always the data that we are looking at is going to be incomplete in certain ways. We have to embrace that too. When you have incomplete information, there is just no way to know the definitive truth. So everything is some kind of a guess. There are better guesses, and there are worse guesses. There is definitely a difference between a good statistician and a bad statistician.

It is still the case that even if you are a good statistician, whatever you say is your best guess. It goes back to why I keep saying that what happens afterward is what’s important, because how do you tell if it’s a good statistician or a bad statistician? You judge whether the decisions that were made based on the statistician’s recommendations turned out to be effective or not effective.

But ultimately you can say whatever you want in coming up with the insights. If your insights are not actually giving you results, then you know that you cannot trust whoever is doing the numbers.

Joe:  I come from a field that, and in Lean especially more so than even Six Sigma a little bit, but in Lean … an old saying goes, if you’re not visual you’re not Lean. To portray numbers, a lot of it is about creating charts or something that’s very visual to someone to understand the numbers. Because nobody wants to see a list of numbers, that’s hard to digest. They want to see the result of the numbers and that’s usually typically done in a chart. Is that what Junk Charts is all about? Do you discuss how to make them and the pros and cons of charting?

Kaiser:  When I first started out, one of the things that I definitely wanted to avoid was creating a blog that just criticizes and does nothing constructive. I mean, you’ll notice the same vein in the Numbers Rule Your World book. I don’t really want to write a book about Lies, Damned Lies, and Statistics because there are lots of books out there about it. I also feel that it doesn’t help anyone do numbers better if all you are saying is what they have done wrong. What the stories in my book tell you about is how people are using numbers to actually make a difference in our world, and basic concepts that you can take with you and apply to the real world, as opposed to just criticizing.

Also, with this idea of Junk Charts, you can actually reconstitute something valuable out of something that may not be so useful. Most of the time I strive not only to talk about what’s good and what’s bad about a chart that I either find in my reading or that readers send to me, but also to come up with different ways of visualizing the information.

Now, it may not always be better. Some readers will criticize what I do as well. But the whole practice of visualization is that you just have to do a lot of trial and error. You need to visualize the data in multiple ways. Eventually you will find a way that really brings out the message of the data. I find that without creating multiple versions of a chart, it’s hard to actually know which one is most successful.

Joe:  When you looked at the charts, and of course on the blog you look at a lot of charts. Do you feel there’s a lot of room for improvement in that area?

Kaiser:  Unfortunately, bad habits are very difficult to correct. I think some of the things that I talk about have been talked about by people for decades. You still see the same problems showing up constantly. Take a type of chart that is used a lot: these bubble charts, which use circular areas to encode data. Unfortunately, it’s been proven time and time again in experiments that we just don’t have a good grasp of relative sizes of circles. Anybody who presents such a chart is going to end up conveying the wrong information about the relative comparisons that such charts are typically created to make.

You keep seeing those same charts out there, and it’s a pretty pervasive problem. I try to counter that by showing that if you had visualized it in a different format, not using a bubble chart, things would actually look a lot closer to reality.
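
Part of the difficulty with circles is plain geometry, which a short calculation can show. The sketch below is an illustration under stated assumptions rather than anything from the podcast or the Junk Charts blog: the function names are hypothetical and the values are made up. It compares the two ways a value can be mapped onto a bubble. If area encodes the value, a doubling of the value widens the circle by only about 1.41 times. If the radius encodes the value instead, the same doubling quadruples the area.

```python
import math


def radius_for_area(value, scale=1.0):
    """Radius of a circle whose AREA equals value * scale."""
    return math.sqrt(value * scale / math.pi)


def area_ratio_if_radius_scaled(value_a, value_b):
    """Ratio of circle areas when the RADIUS is made proportional to the
    value. Area grows with the square of the radius."""
    return (value_b / value_a) ** 2


if __name__ == "__main__":
    small, large = 100, 200  # hypothetical data values; large is twice small

    # Area encodes the value: how much wider does the larger bubble look?
    print(radius_for_area(large) / radius_for_area(small))

    # Radius encodes the value: how much more ink does the larger bubble get?
    print(area_ratio_if_radius_scaled(small, large))
```

Either way, the reader has to recover a ratio from a shape that does not show it directly, which a bar's length on a common baseline does.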

Something that I talk about constantly is what I call the self-sufficiency test. You can pretty much very quickly tell a bad chart by noticing that the designer of the chart printed all of the data onto the chart alongside the bars or bubbles or other graphical elements. What that tells you oftentimes is that the reason why all the data is printed on the chart is that if the data is not printed on the chart, the chart itself doesn’t work.

If you think about it, if you’re going to visualize information, the point of visualizing is to use graphical elements to convey the data. If you need to print all of your data onto the same chart that means that your chart just doesn’t work. I’ve encouraged people to look at charts in this way. Look at a chart, try to visualize it without having to use the data itself and see if it actually works for you. If it doesn’t then the person picked the wrong chart.

Joe:  There’s not really a method for determining the right and the wrong chart to use at the time. Your selection of charts really depends on what you’re trying to express to that particular group and that particular audience, right?

Kaiser:  Yes, I talk about that a lot. In most posts I put up there, I start with: What is the question, why are we even looking at this data? What is the question that we’re supposed to address with the data? That’s possibly the number one thing. If you don’t know what you want to say, then you cannot possibly create a good chart. I have something that I call the Trifecta Checkup on my blog, and that’s the very first question. One corner of the triangle is to try to figure out what it is that you want to answer with the data.

Then the second point, the second corner, is: Do you have the right data to address that question? If you’re showing me data but the data is only tangentially related to the question that you’re trying to address, that also guarantees a bad chart. It’s just not going to give me a convincing answer to the question that I would like addressed.

The third part is that you pick the right format of chart to convey the information that you have. Typically everyone is focused on that particular corner. What I think people have to do is think beyond that corner, as you said: What is it that we’re trying to address here anyway? Then the other part is: Do we actually have the right data?

That’s actually really important because I think that there’s too much analysis out there of what I would call convenient data sets. Data sets that you just happen to come across. I would much rather that the world starts with: What are the questions we would like to answer and then go out to find the right data to address those questions as opposed to the other way around, which is here’s the data, let’s figure out what’s interesting to look at because we have this data sitting around.

Joe:  What I see so many times is people selling those analytical packages, and they show all the nice charts. They show all the different things I’m going to get out of it. That tool is how we start interpreting our marketing data, and it’s like, “Time out!” It’s not what that tool is telling us, it’s all of those things within that tool, the things we need to know driven by our…

Kaiser:  Exactly. One of my recent posts on Junk Charts is about this interactive chart that was put up by one of the… I forgot who it was, one of these charitable organizations. It was supposed to visualize some survey results about people’s, I think, expectations of the economy in various parts of Asia. They surveyed different people in different industries. They created some kind of interactive thing where you can walk through different countries, different industries, and then it would show you data in some kind of a donut chart. The problem I have with that is exactly what you said: when they designed that navigation, that interactivity, they pretty much restricted how you can view the data.

Which means that in order for them to design this chart right, they need to first talk to the readers or talk to the people who are going to use the chart and understand what kind of questions they want to answer. My main question might be to understand the variability between different industries within a country. Somebody else might be much more interested in the variability between countries, and they may not care about the variability within the industry.

You have first to start with having a deep understanding of all the types of questions that people might want to answer, and you need to have a prioritization of all these questions. Most people would like to address these five questions. Then maybe there are these other three questions that are of secondary importance.

You then need to design the navigation and interaction with those things in mind. The main problem I have with the execution of that particular chart is that it presumes that people only look at things in a univariate way, but in reality, in business, most of the time people look at things in a multi-dimensional way. So, you kind of have to design the chart from that angle, which is kind of what I talk about in the book.

Joe:  Without a good statistical background it’s pretty tough isn’t it, to do that?

Kaiser:  Yes. That’s kind of the challenge that I’ve been talking about all along, the challenge of communicating the findings. I think that’s where the statistics community has not done enough yet. We are very obsessed with coming up with methodologies and coming up with new techniques that may be marginally better than the existing techniques. That stuff is great for academia and research, but not when you’re talking about practice and business analytics. It’s much more about how we even get some of these very basic concepts through to a much wider audience which does not come from the same background as the statistician. I totally acknowledge your point, and it’s very, very challenging.

When I wrote the book, there were parts of it that were very challenging too. I have a whole chapter that essentially conveys the concept of what a hypothesis test is, and I chose to develop it in a way that has none of the jargon and none of the mathematics that’s typically used. I hope that some of the readers will find it to be a successful way of conveying that concept. It’s not a simple concept to grasp. Where I think this has to go is we have to come up with creative ways of conveying that information.

Joe:  You did a nice job in the book because I sat down and read the book, and it wasn’t one I had to labor through. I could sit down and read it, and also it brought some smiles along the way of “That’s not how I remember it being portrayed to me on the news or anything.” Or when I was standing in line at Disney, you talked about how they handled the queue, a few things like that. I reflected back to them on the things they did to not make the wait so long or appear so long, how they effectively managed that. They started doing all that with numbers.

Kaiser:  I think going back to something that I said before, in that case as well as… Even in that same chapter I talk about the highway incident in Minnesota. In all those cases they clearly identified what the business problem is or what caused the issues, and then they literally had to spend a lot of sweat and energy and time to go and collect the data. The data just doesn’t come sitting there in a basket waiting for you to analyze. That’s really the tough part of a lot of businesses. If you are doing good work, you pretty much have to go through your own data. You have to go find the right pieces of information. Oftentimes I chose different ways to communicate the value of the work. The stuff that has been published would be published in technical journals or published for a more technical audience.

They describe their achievements the same way. But oftentimes there are different ways of describing the same thing that would, in my view, appeal more to a non-technical audience and yet capture the essence of what they are doing.

Joe:  Is there something that you would like to summarize or add to this conversation that I haven’t asked?

Kaiser:  I think the only thing would be that I would encourage people to come to my blogs. I’ve started a new blog since the book came out, “Numbers Rule Your World.” There’s a link to it from Junk Charts. But that’s where I… If you enjoy the chapters in the book, that’s where I continue to apply the big concepts I describe in the book to all kinds of current affairs and real world things that are happening out there. It’s really kind of an extension to everything that I was trying to do.

I hope that they would join in this conversation. As we discussed before, there’s plenty more work that needs to get done in terms of figuring out how we describe and communicate this sort of information effectively. I don’t have all the answers, and I hope that your listeners will join this conversation.

Joe:  I think you did a great job Kaiser because you started out recognizing the fact that it’s not everybody else’s job to learn the math. It’s your job to be able to present the math in a learnable fashion.

Kaiser:  Exactly. I think that eventually the academic community will come to this realization as well.

Joe:  It is so important today to do that because we are just flooded with data, and how we manipulate that data and how we use that data is really going to determine the companies that flourish, I think.

Kaiser:  Yes, absolutely. We have seen a lot of signs of that. It’s an area that is on the rise. I mean, lots of companies now realize that they need people who can read data accurately, and hopefully this will become more and more of the norm in the upcoming years.

Joe:  Well, I’d like to thank you very much, Kaiser. This podcast is available on the Business901 iTunes store and also available on the Business901 website.

Kaiser:  Thanks a lot, Joe.
