Hadooponomics: Why Data Science Isn’t IT and What’s Beyond Data Visualization (Podcast Transcript)

Hadooponomics Ep 12. Listen to the original podcast.

James Haight: Welcome back, everyone, you’re listening to the Hadooponomics podcast, and I’m your host, James Haight, as always. Welcome back here today, we have a fun episode for you guys. It’s not every day I get a chance to interview a fellow podcaster. I interview Ryan Goodman, host of the Analytics on Fire podcast. And Ryan does a lot of things in addition to Analytics on Fire. He’s also the co-founder of the Analytics on Fire community, which a lot of you might be familiar with, and he also runs his own analytics consultancy. So what I love about him is he’s not only a longtime practitioner, and a top-notch one at that, but he also has a chance to be this market observer and talk with some of the smartest people out there doing some really cool things with analytics and Big Data. And he gets a chance to mesh all of this together and he has a really awesome perspective that I really enjoy, and I think you guys will as well. In this episode we start with the evolution of going from business intelligence to Big Data analytics and what that means and what that doesn’t mean. And then we sort of talk about the inherent tension between IT and business that is so well documented. People always talk about there being a disconnect there, and we go beyond the fact that it exists and we start to talk about why it exists, and what you can do to solve that, and what you can do to make it better. And I think there’s a lot of very nice, practical advice to take back into your own lives and your own day jobs.

And then finally we transition the episode, we go from that into talking about, what is the next step of Big Data analytics. And what we focus on is the evolution of data visualization and real time analytics as it’s applied to Big Data and what that means for companies. So a really cool show for you, a lot of great stuff. As always, we have the show notes up on the website, bluehillresearch.com/hadooponomics. We’re gonna link back to the Analytics on Fire podcast, Ryan’s social media channels, and a whole bunch of other resources. And then other than that, I just wanna make an announcement, this is going to be our last episode of season two of the Hadooponomics series. We’ll be back on for season three, it’ll start in about one and a half months. So those of you who have followed us all the way through, thanks for sticking with us, and we’ll be back shortly. So just keep an eye out for when we start airing these episodes again, and we really, really appreciate you guys taking time out of your day to listen.

So with that, without any further ado, let’s go dive straight into Ryan’s interview. All right, everyone, welcome back to the Hadooponomics podcast. Today we have Ryan Goodman on the show. Ryan’s an awesome guest, he’s the Founder and CEO of CMaps Analytics, and he’s also the co-founder of the Analytics on Fire community, which I suspect a lot of our audience is already aware of and probably an avid listener. So without any further ado, Ryan, welcome to the show.

Ryan Goodman: Thank you so much for having me, happy to be here.

James: So Ryan, as I mentioned, this is a real treat because we don’t always get to have fellow podcasters on the show. And you sit in a really interesting perch and you’re very active in that online analytics community. And so what I was hoping to do is, just tell us a little bit more about yourself and who you are, and why our audience might already know you, and then we’ll dive in from there.

Ryan: Yeah, absolutely. So my day-to-day job is spent with CMaps Analytics, where I basically get to help customers build location analytics. So combining business data and geography to create location intelligence. So that’s what I do day-to-day, which certainly is applicable to Big Data analytics and the things we’ll talk about today. But a big part of what I do in my career, my journey into location analytics, has been in business intelligence, coming up through companies like BusinessObjects. And really, through that process, my sole focus has really been around what I call end-user BI: information, analysis, presenting data visualization, interactive analysis, and really focusing on that end-user experience. That has been my sole area of focus, and really has helped transition me into the world of analytics that we know today.

A big part of that, with Analytics on Fire is, I consider myself a customer advocate because we deal with the last mile, if you will, and that being BI. I’ve been fortunate enough, with my cohost for Analytics on Fire, Mico Yuk, to have worked with some of the largest customers in the world, gotten to know some of the top thought leaders in BI and analytics in the world. And what we wanted to do was try to find a way to bring that voice out into the ecosystem and community. So to your point, I consider myself very active in the online community, consuming and now helping broadcast a lot of these thought leaders through our Analytics on Fire podcast.

James: Excellent, and part of what I’m really excited about, and why I think our audience will have a lot to learn is, on the one hand you’re this practitioner who came up through the ranks building BI as the industry grew. And then on the other hand, too, you also sort of sit in this purview of getting a chance to be surrounded by really smart people who were coming up with a whole bunch of ideas all over the place. And I think the fusion of those two, I’ve listened to a few of your Analytics on Fire podcasts, and anyone listening here should totally check it out. The fusion of that perspective makes for some really interesting insight. And so can you just sort of expand a little bit more on the BI Brainz aspect of it, a little bit more about what you’re hoping to broadcast with that?

Ryan: Yeah, absolutely, and a big part of that is also my co-host. So my co-host for Analytics on Fire is Mico Yuk, and unfortunately she couldn’t join us today, I was hoping that she could. But she is the author of Data Visualization for Dummies, and she also co-created another community in the SAP BusinessObjects ecosystem called Everything Xcelsius. She is a global speaker, keynote speaker, and very well known and very vocal, very active. So in our intro she’s the fiery Caribbean and I’m the diplomatic maps guy, right? There’s a great dynamic.

James: What I want to do is to dive just straight in and really get your perspective on this idea of, hey, look, you grew up in sort of this BI-centric world and there’s no doubt that Big Data analytics has come on really strong as the amounts of data and the things that we can analyze have grown exponentially. But I want to get your perspective, because you’ve seen this whole transformation happen and you’ve been right in the thick of things as it’s gone on for over ten years, right? So I want to get, first off, your high level reaction of sort of what’s going on, tell us what you’re seeing happening in the world. And then I want to dig into some of the implications and then how it’s really affecting a lot of the companies that we talk to.

Ryan: Yeah, absolutely. And to your point, there’s a lot going on, there’s a lot of churn, there’s a lot of excitement, there’s a lot of innovation occurring in the Big Data space, the analytics space, and those two things obviously overlap quite a bit. But I think, from our perspective, and when I say us I bring Mico and me into this. Because ultimately, as Analytics on Fire, we’ve seen customers, system integrators, implementers, consumers, all struggle with the same exact problem. Whether you’re talking about Big Data or small data, whatever you want to call it, ultimately the challenge is getting the right information to the right person at the right time. And that’s such a simple way to summarize the challenges that every single one of your listeners and our customers and listeners have, but ultimately, I think we get hung up on the technology. We get hung up on putting definitions or labels to things. The way that we define Big Data in itself is confusing. And so I think that is where some of the disillusionment occurs as far as where we, as an industry, as customers, are going to extract value from Big Data.

James: Yeah, and so I think playing off of that point, one of the transitions we’ve witnessed is when BI first came to the table, of course it’s in the purview of IT, right? And when Big Data starts making the rounds that’s also in the purview of IT. But there’s been this incredible transition and this movement, I guess you could call it, to decentralization. And certainly, and we’ll talk about it a little bit later, data visualization software played a great big part in this. As we’ve had this idea of, well, now we can decentralize analytics and have it directly plugged into the business, we can be more agile, we can solve business problems directly. There’s always been this sort of inherent tension between business and IT, and I want to get your opinion of how is it changing things? Are these tensions expanding? What’s the right place for each of the parties involved?

Ryan: Oh yeah, yeah, I mean, the tension that you describe is real. And it really is, the organization in itself, its culture. We just did a podcast just focused on cultural differences and challenges organizations experience. But yeah, that disconnect or tension between IT and business has been around for a very long time and it has a lot to do with the roles that have been attributed to each organization, right? Everyone understands that historically, business, consumers of information, analysts, have always been thirsty for information, right? Not necessarily data, but information. Now the information workers, the people who are actually building analytics, building dashboards, building reports, have always been hungry for data. And IT traditionally has been the gatekeeper for that data. And what I’ll call legacy BI systems, legacy analytics systems, were fairly complex. For a very long time IT was the gatekeeper and, potentially, the order taker for getting data into the hands of the people who need it. Once again, returning back to the right information at the right time for the right users. IT was always kind of the gatekeeper for that.

And so what we’ve seen, to your point, and we’ll probably dive more into, is that the rise of what we call modern BI tools which allow less technical users to create and distribute information as visual analytics, or reports, or whatever you want to call it, doesn’t really matter. Some type of consumable information package, whether it be a dashboard or report. The availability of those tools, and the ease for using those tools, and the power for consuming lots of information and presenting visual analytics has made those abilities available outside of IT. And it’s all about results. The reality is that we’ve seen more organizations bring IT folks into the business. So to your point, the decentralized analytics, or BI team, where you have domain experts with technology experts living in the line of business as opposed to living in IT, and we’ve seen that model succeed. Now many of your guests who may be on the IT line of business may say, well, that raises issues with governance and quality, and same version of the truth, and those are also real issues that do occur in a decentralized approach. But at the end of the day what we’ve seen amongst our customers and the folks who speak and we interview with Analytics on Fire, is that there’s no denying the need to move faster, to get information created or transitioned from large volumes of data into creating consumable information. That need is there and it’s only accelerating. So we’re at a kind of a crossroads as an industry and we’ve seen, as you’ve said, a monumental shift. Specifically with the last Gartner Report for BI where what they now call a modern BI platform potentially does allow for non-IT driven initiatives to get information out into the business faster.

James: Absolutely, and what I want to bring up is sort of a point/counterpoint to each side of the argument, and hopefully we can come up with some sort of recommendation that actually helps drive people forward. But one of the examples we always like to use is when we talk about, well, are there dangers of just having a whole bunch of very decentralized analysts running in the business unit doing the analysis they need. And one of the things I always say is, well, when you have two or three analysts coming into a room together you want to make sure that they have a conversation and not an argument. What if one’s trying to do forecasting and they’re using recognized revenue as opposed to total bookings, and another analyst is using bookings or some other denomination of money that the company is bringing in? Without that shared definition, without that consistent business logic, you’re gonna do the same analysis and get a different answer, right? So to some extent IT needs to be able to have control, or someone needs to, right? And it’s traditionally been IT that at least makes sure that people are getting consistent answers.
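
The bookings-versus-recognized-revenue point can be made concrete with a small worked example. What follows is only a sketch under stated assumptions: the deal figures, the `Deal` type, and the `METRICS` registry are hypothetical illustrations, not anything described on the show. It shows two analysts getting different totals from the same deals, and one way a single governed definition could remove the disagreement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deal:
    name: str
    booked: float      # total contract value signed (hypothetical figures)
    recognized: float  # portion of that value recognized as revenue so far


# Hypothetical sample data, for illustration only.
DEALS = [
    Deal("A", 120_000.0, 30_000.0),
    Deal("B", 60_000.0, 60_000.0),
    Deal("C", 240_000.0, 20_000.0),
]


def total_bookings(deals):
    """Analyst 1's notion of 'revenue': everything that was signed."""
    return sum(d.booked for d in deals)


def recognized_revenue(deals):
    """Analyst 2's notion of 'revenue': only what has been recognized."""
    return sum(d.recognized for d in deals)


# One shared, governed definition: every analyst looks the metric up by
# name instead of re-implementing it. Which definition is "right" is a
# business decision; the point is that it is made once.
METRICS = {"revenue": recognized_revenue}


def metric(name, deals):
    return METRICS[name](deals)


if __name__ == "__main__":
    print("Analyst 1 (bookings):  ", total_bookings(DEALS))
    print("Analyst 2 (recognized):", recognized_revenue(DEALS))
    print("Shared definition:     ", metric("revenue", DEALS))
```

Whether that registry lives with IT, a center of excellence, or a semantic layer inside a BI tool is exactly the organizational question the conversation turns to next.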

Ryan: Absolutely. I mean, to some extent a lot of organizations that we talk to are kind of throwing out, well, not throwing out but they’re redefining what a center of excellence looks like, right? And some organizations maintain a centralized analytics center of excellence managed and owned by IT. We’re also seeing some organizations create brand new lines of business. So I just sat in on a fascinating presentation where we see what I call the rise of the chief analytics officer. So they actually created, more or less, a shared service that operates almost like finance, where there’s a cross pollination of IT professionals with dedicated analysts, where their role or their job is to go throughout the organization and solve challenging analytical problems, right? So they’re going in, more or less, as consultants for various lines of business. So they go into finance, they’ll go into operations, they’ll go into marketing, and they serve, basically, as that center of excellence and also the conduit to IT. So they’re very closely aligned and joined at the hip with IT, and that’s part of the mandate through the creation of this new organization. Now many companies may not have that ability, depending on size, depending on industry, depending on culture, but it’s very interesting to see that underway in very large organizations. I don’t know that I have permission to share the organization, but let’s just say it’s one of the largest, oldest auto manufacturing companies in the world. Because a lot of times I hear, well, that doesn’t scale, but we’re talking over a thousand members of this analytics team. It does scale.

James: Yeah, that’s incredible, and I’m really glad you brought that up. We actually did some pretty extensive research on how to build an analytics center of excellence. And I talked to a whole bunch of really smart people and a lot of customers and people who are implementing this, and one of the questions that came up is, where does a data scientist live? And it’s a really interesting question because, one, there are of course organizational and org chart questions to consider, like who reports to whom for efficiencies, etc. But there’s also the very human element that I think a lot of people overlook. Because we hear people like us talk about what you should do, and we say, okay, the obvious answer is let’s stick our data scientist directly into the line of business. But you have to take a step back and say, well, maybe not so fast, because if you have a data scientist, they’re a PhD who’s crunching numbers in the back room, who are they gonna talk to all day, right? Who are they gonna get feedback from, who are they gonna collaborate with on the latest tactics and techniques, things to use and the best practices for their software, if the people that they’re working for in the line of business don’t have an appreciation for what they’re doing, right? How are you gonna manage that person, how are you gonna give them a good performance review, and vice versa? So I think there’s also a very human element where you obviously want to be decentralized, and agile, and all those things, but you need some sort of cohesive function to allow people to sort of enjoy their work environment and actually blossom to the potential that they bring to the table. Not just be confined by a whole bunch of co-workers who don’t understand or appreciate what they’re actually doing.

Ryan: Absolutely, and it is interesting, I’m not sure if, during that analysis, you spoke to folks like Wayne Eckerson, but he has a very interesting concept of a federated approach. Where you’re taking folks from IT and you’re essentially injecting them into the business, right, and they become more or less the eyes and ears back to the BI competency or an analytics competency center, or Big Data competency center, doesn’t really matter. But you have this notion of cross pollination where you’re taking members from one part of the organization and injecting them into the other. It is a pretty interesting concept. But yeah, I think no matter what way you slice these problems it becomes a people problem, right? And it becomes a communication and organizational problem that many of the customers that we talk to are dealing with. So having the right team in place is absolutely critical, and more importantly, having the right leadership in place.

James: Yeah, and one of the things we found is, where do you situate your analytics center of excellence, your BI competency center, or whatever you wanna call it, whatever you’re trying to do? But we found the really effective ones tended to be in the office of the COO or CEO, or in some sort of strategic role, or chief analytics officer. But if it was only in the purview of IT they tended to be less successful, and it sounds like that’s consistent with what you also have been finding.

Ryan: Yeah, I mean, the ones that are successful, it’s an organizational decision to really invest in some type of competency center, right? And so in the BI world it’s all about your director of BI or director of analytics, and ultimately having the right groups, councils, committees in place. So you need structure. No matter what you do, there has to be the right structure. So as I mentioned, Wayne Eckerson, I mean, he was talking about having multiple committees, working group committee of champions so you’re engaging champions. You have a line of business manager, so you have somebody who you assign to every line of business who manages that relationship back at the BI CC or the analytics competency center. And so, I mean, you have to have the right people in place for this to succeed, especially in a larger organization. In a small organization, people wear multiple hats, but the kinds of customers, at least that we deal with, are much larger. And so the upside of doing this right is potentially millions, tens of millions of dollars to the bottom line. And more importantly, as we’ll talk about probably a little bit more, the cost of not getting it right could be substantially worse.

James: And it’s important to have an understanding of that, because it’s great to talk about a federated approach and say, we’re gonna have a liaison to this line of business and another liaison to that line of business, etc, etc, but when you start adding up the dollar amounts of highly intelligent, smart, hardworking people who are in high demand, and you need 15 of them to liaise between all of this and build an effective center of competency, that’s a lot of money. So that becomes worth it if you can prove the value, right? But it’s definitely not a small investment.

Ryan: Yeah, you nailed it right on the head. Except when you look at it right after a massive failed deployment, right? [laughs] One of my favorite quotes is, these large enterprises are dollar smart but penny stupid, right? They’ll spend, as you said in the beginning, millions of dollars on setting up a data lake initiative but not have any of the foundation, or not have the right competency center, or not have the right people in place, not have a plan after a massive failure. And that’s a big part of why we created Analytics on Fire, because we were tired of seeing so many failures. Failed deployments, failed initiatives for BI where these things were missing or where this information was not reaching the right levels. And so like I said, the cost of not having these things in place, or at least taking steps toward them, can be substantial.

James: Mm-hm, and it’s interesting too, you bring it up with data lakes and that sort of thing. So Hadoop’s been around for ten years as of a few months ago, but it’s still pretty new, pretty uncharted territory for a whole lot of people. With anything that’s as technical and as cutting edge as Hadoop and the whole entire ecosystem, and that changes so fast, that inherently becomes an IT project, right? You do the pilot project, you do the whatever. And I think we’re in for another, and have been in for another realm of learning that we need a federated approach. Or learning that we need to mix both IT and business together. Because it’s sort of IT’s taken it under its wing and they try and build pilot projects, and they don’t mesh with your actual business goals, etc, and it’s really painful. And I wonder if you see that too. It strikes me as something as, here we go again, this is the next cycle, and a lot of people have already experienced pain with this to that end.

Ryan: Yeah no, I mean, you’re absolutely right, and it’s interesting because, to your point, Hadoop and all the technologies and abstraction layers on top of Hadoop to ultimately help businesses extract value and create the kinds of, whether they’re applications or create the analytical process on top of Hadoop, ultimately to get the information out, right? What we’ve seen over the last, I can’t believe it’s been ten years, but creation of Spark, and now we’re seeing entire platforms and tools even being created on top of Spark for building output dashboards, reports, those kind of things. It’s getting easier, so it’s an innovation cycle. There’s the hype cycle and then there’s a cycle of innovation that has to occur where, to your point, IT, right now, if you wanna set up a Hadoop cluster, I mean, IT has to own that, right? You’re talking infrastructure, you’re talking new technology, there’s specialized expertise that needs to be acquired. But I think as we move forward, and even what we’re starting to see now, is that those technology investments, and as customers figure out what they want to do with those technology investments, which I’ll come back to in a minute, which is backwards, right? It’s becoming easier to provide access or extract value at different parts of the organization because of the folds and abstraction layers being created on top of Hadoop, right? And I by no means am endorsing anything, but, I mean, I saw a demo of DataMirror the other day and was blown away how someone who really doesn’t necessarily have to know Hadoop or have formal training can be able to build visuals and reports on the data lake. So it’s pretty impressive to see how far we’ve come, but there’s still a long way to go. 
And to back up on a point that I just made, I think one of the challenges that we’ve seen with Hadoop or any Big Data technology is that because it was looked at as a technology investment, we’ve seen a lot of organizations make this large investment without a solid business case, right? I’ve heard multiple stories, and certainly folks who have come to our podcast have shared stories of executives who read something in Wall Street Journal or have their vendor in their ear, they go and they make a significant investment assuming that the business value or the ROI would naturally come because you’re able to accelerate the analysis of more data, more information. They kinda buy into the hype and that’s where we’ve seen a lot of organizations, not necessarily extract the kind of value that they were thinking, just because they can access or keep all of their data and supposedly process and extract insights from it.

James: Sure, and I think it comes back to when we talked about this inherent struggle between IT and business and making sure things are aligned. If you can bridge the gap in understanding, which visualization is so amazing at doing, it’s much easier for people to comprehend charts and pictures, because the brain is so great at pattern recognition but not necessarily at picking up patterns when we’re just staring at columns and rows in a spreadsheet. Now when we can do this to something that would otherwise be 100% confined to just ones and zeros in a spreadsheet that no one except the most technical can view, Big Data, right, when we start talking about hundreds of millions of rows. When you’re able to visualize this and build that lightweight, interactive business application that you’re talking about on top of your Hadoop cluster, on top of whatever you’re storing that has more data than you could previously analyze, now you can actually close that gap. And I think that taking the logical extension of what Tableau and others did in sort of the BI and analytics realm, and then scaling that up to the Big Data world, is going to be the major catalyst for closing that understanding gap that has been the cause of tension between IT and business, and making sure that your initiatives actually lead to value.

Ryan: I do agree with that. I think one step beyond that, though, and personally I think where there are organizations beyond just a visual communication and distribution of information through visualization, I believe that the next gap for many organizations to cover is real time. I think a lot of organizations who do analytics and who do business intelligence are still looking behind. And I know for a fact that most organizations still struggle even doing that. And having witnessed even the latest and greatest Tableau success stories that I’ve seen, and I’ve had customers deploy Tableau, you’re still kind of in this mode where you’re distributing information but you’re leaving the consumers of that information with the task of digesting, consuming, and getting to some thought plateau where they take action. And I think that’s still a challenge.

And so one of the great things about these Big Data technologies is the potential to process information in real time. And I think that becomes a very interesting opportunity, especially if you’re in the mode of pushing information to business users. It goes back to the original statement, right information at the right time. I’ve seen a few customers kind of make that jump from delivering dashboards and reports to delivering answers to questions before. Rather than giving the user the information in a dashboard or report that they have to hunt and peck, you’re ultimately just pushing the information that they need for a given time. Very simple example of that is that we were working with a company where they have thousands of sales reps and they were pushing them dashboards, right? So every day I get my pipeline dashboard, it shows who I should be talking to. I mean, so that information is hidden in the dashboard, it’s kind of hidden in plain sight. But I get a pipeline report, I get who’s furthest along in the pipe, who I haven’t talked to in a few days. Who I sold last to, how my other people in my sales team are doing, so on and so forth. And so ultimately, the complaint was that they deployed these dashboards and it didn’t really have an impact. And then the feedback was, the sales folks weren’t using the dashboard, right? And IT comes back and they say, well, we gave you everything that you asked for. And the business came back and said, well, yeah, you gave us all this information in a nice pretty Tableau dashboard, but I just want to know what are my top five customers I need to call today based on where I physically am, based on a number of factors, right? And so ultimately, they went back, they used another technology, and ultimately what they were doing was bursting them on a daily basis just a list of the top five customers based on, it is black box analytics, yes, but ultimately that’s where they saw the impact. 
Where they just gave the information based on interacting and doing another round of development. But ultimately just pushing the right information to the right user had a significant impact at that point and beyond.
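
The "top five customers to call today" idea can be sketched in a few lines. This is a minimal illustration only: the episode does not say which technology or scoring model the company used, so the `Customer` fields, the weights in `score`, and the 14-day cap are all hypothetical assumptions. The sketch ranks customers by pipeline stage, time since last contact, and distance from the rep's current location, then keeps the five highest scores.

```python
import heapq
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Customer:
    name: str
    lat: float
    lon: float
    pipeline_stage: int      # hypothetical scale: 0 (new lead) to 4 (ready to close)
    days_since_contact: int


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def score(customer, rep_lat, rep_lon):
    """Hypothetical priority score: the weights are made up for illustration.

    Further along in the pipeline and longer without contact raise the
    score; distance from the rep lowers it.
    """
    d = distance_km(rep_lat, rep_lon, customer.lat, customer.lon)
    return (
        2.0 * customer.pipeline_stage
        + 0.5 * min(customer.days_since_contact, 14)  # cap so stale leads don't dominate
        - 0.1 * d
    )


def top_customers(customers, rep_lat, rep_lon, n=5):
    """Return the n highest-scoring customers, best first."""
    return heapq.nlargest(n, customers, key=lambda c: score(c, rep_lat, rep_lon))
```

A daily job could call `top_customers` once per rep and send just that short list, which is the shift from a dashboard the rep has to search to an answer delivered directly.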

James: Yeah, it’s really interesting to think about that. Because what you need to do is, we mentioned before, you have to understand your customer, right? And the customer in this case being the sales team, giving them a self service platform, maybe there’s a couple really curious sales analysts who are using it, sales reps who are using it who want to find out this and understand all these patterns. But what they really just wanted was contextualized insights delivered right to them so they can immediately take action.

So Ryan, as we sort of wrap up here, one of the things I love to do is, you shared a lot of great stuff with us, and as I mentioned, some of our audience probably is an avid fan of yours. But for those of us who want to go out, find out a little bit more about you, where are we going to dig in deeper and find out more?

Ryan: Yeah, the best place to start is analyticsonfire.com. So that’s probably the best place to start. And then certainly you can find me on Twitter, hopefully we can include my Twitter handle in the show notes.

James: Yes, absolutely, we’ll have the show notes and everything that you talk about here we’ll have a link back up to that. So we can absolutely do that.

Ryan: Cool. I’m still trying to master Twitter. If you message me or want to engage me I will reply very, very quickly, but I’m not a Twitter expert by any means. But I’m definitely active.

James: Excellent. Well Ryan, thank you so much for coming on the show. A lot of great stuff here. I really enjoyed the conversation and really appreciate you coming on.

Ryan: Yeah, appreciate it, thank you so much for having me.
