Andy Vidan is the CEO of Cambridge, Massachusetts-based DataOps startup Composable Analytics. He founded the company two years ago with MIT colleague Lars Fiedler. They now lead Composable—self-funded and self-sustaining, by the way—and are establishing a beachhead in the nascent DataOps space. I recently spoke with him about the genesis of his company, what it’s like to (maybe) work with the U.S. DoD, and the challenge of evangelizing DataOps to line-of-business stakeholders.
TOPH WHITMORE: Tell me about Composable Analytics.
ANDY VIDAN: Composable Analytics grew out of a project at MIT’s Lincoln Laboratory. Lincoln Lab is an MIT R&D center that provides advanced technology solutions to the U.S. Department of Defense and intelligence community. There, we saw the clear need for a unifying platform that can ingest all types of data and feed it to an intelligence analyst. An intelligence analyst within the Department of Defense is similar to a business analyst within the private sector. They’re sophisticated. They know their subject matter well, better than software developers may ever know their business. But they’re not always technical, and when they have to deal with different data sets from different systems, with different formats and different structures, they must rely on software engineers and use a variety of disjoint tools that further complicate their workflows.
Our approach was different: We wanted to develop a single ecosystem to bring in data from all sorts of sources, and present it to the user for self-service data discovery and analytics. For us, Big Data always meant all data. Aside from the massive amounts of data, which the community already knows how to handle, or even the high Big Data velocity and throughput, we focused on the variability that comes with all data: There’s always tabular data, and tabular data, and more tabular data, but we also have to think about image files, text documents, PDFs, sound files, and so on. We also wanted to make data accessible to an end user who knows the subject matter but is not a technical person.
TW: You and Lars Fiedler developed Composable while working at Lincoln Lab. How did Composable evolve from an MIT idea into a commercial solution?
AV: Lincoln Laboratory is a well-kept secret.
TW: With the defense department involved, it probably has to be!
AV: Yes. MIT Lincoln Laboratory is really one of the premier research labs in the US, very much like the old Bell Labs, or the Jet Propulsion Lab that NASA runs with Caltech. Composable Analytics was initially funded directly by the DoD. The nice thing about Lincoln Lab is that you have that user interaction. You aren’t just writing research papers; you are prototyping, building systems, and meeting with end users (in this case, intelligence analysts and operators) to really get down to requirements and build a system that they would eventually use.
TW: Does Composable Analytics still serve the Department of Defense?
AV: Yeah. So I can’t really answer the question.
TW: Good enough!
AV: Our main focus is private sector.
TW: Tell me more about the Composable Analytics technology. What value propositions do you offer to an enterprise IT leader?
AV: Three things: orchestration, automation, and analytics. To me, that really embodies what’s behind DataOps. Our platform, our ecosystem provides those three things for an enterprise and for users of data within that enterprise.
Let me walk you through a real use case: One of our financial sector customers wants to build effective customer profiles. One touch point is their call center. You might call in to request a change of address after a recent real-estate purchase. This is normally a short call: the call center agent would change the address and hang up the phone and everybody’s happy. But this is a situation where an organization can learn more about the customer. An enterprise can use that little tidbit of information that you just revealed about yourself in order to understand what other products and services you might be interested in. The fact that you purchased a home might mean you’re willing to purchase life insurance. You might mention you are having a baby. That might incite you to open an educational savings account with the company. What does this require? Being able to integrate with a Voice-over-IP system and orchestrate a data flow that takes the call-center recording, in real time, pushes it into a speech-to-text engine, takes the resulting unstructured text and uses various analytics and natural language processing techniques in order to determine intent, sentiment, and trigger words that can then be directly inserted back into a CRM. The call center agent can see that on your profile and talk to you about it during that call, or next time you call. That embodies orchestration, automation, plus analytics. Those are the types of complex all-data flow use cases we’re addressing.
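The call-center flow described above (recording, speech-to-text, text analytics, CRM update) can be pictured with a minimal Python sketch. This is only an illustration of the orchestration idea, not Composable's implementation: the speech-to-text step is a stub that decodes bytes, the "analytics" step is plain phrase matching rather than real NLP, the CRM is an in-memory dict, and every name here (TRIGGERS, CustomerProfile, process_call, and so on) is hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical trigger phrases mapped to products an agent might bring up.
TRIGGERS = {
    "bought a house": "life insurance",
    "new home": "life insurance",
    "having a baby": "educational savings account",
}


@dataclass
class CustomerProfile:
    """Stand-in for a CRM record (assumed shape, for illustration only)."""
    customer_id: str
    suggestions: list = field(default_factory=list)


def speech_to_text(recording: bytes) -> str:
    """Stub for a speech-to-text engine: the 'recording' is just UTF-8 text."""
    return recording.decode("utf-8")


def extract_triggers(transcript: str) -> list:
    """Very naive analytics step: substring matching on known trigger phrases."""
    text = transcript.lower()
    found = []
    for phrase, product in TRIGGERS.items():
        if phrase in text and product not in found:
            found.append(product)
    return found


def update_crm(crm: dict, customer_id: str, products: list) -> CustomerProfile:
    """Write the detected products back onto the customer's profile."""
    profile = crm.setdefault(customer_id, CustomerProfile(customer_id))
    for product in products:
        if product not in profile.suggestions:
            profile.suggestions.append(product)
    return profile


def process_call(crm: dict, customer_id: str, recording: bytes) -> CustomerProfile:
    """Orchestrate the three steps in order: transcribe, analyze, update."""
    transcript = speech_to_text(recording)
    products = extract_triggers(transcript)
    return update_crm(crm, customer_id, products)
```

Calling process_call with an empty dict, a customer ID, and a recording whose text mentions buying a house would leave a "life insurance" suggestion on that customer's profile, which is the kind of hint an agent could see during the call.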
TW: It sounds like a platform play. Are you essentially offering and delivering and serving pretty much the whole data value chain from ingestion through consumption?
AV: Yes, we are, and that’s where DataOps comes into play. There’s always raw data out there. At the end of the day your business users are getting value from applications: Excel or Dynamics or Power BI or Salesforce or NetSuite, whatever it is. But there’s a whole process that happens between the raw data and the high-level application, a process that encompasses orchestration, automation, and analytics. That’s our play. That’s where we live. That’s what we do well.
TW: I like to talk about the enterprise conflict between IT leadership and line-of-business stakeholders like my former marketer self. Toph-the-marketing-boy wants self-service everything—data immediacy without data-administration complexity. On the other side, IT leadership is tasked with ensuring auditability, lineage, governance, security. Which side of that customer equation do you target? IT side? Business influencer? Or both?
AV: Almost always the business side.
TW: Interesting. I confess that’s not what I expected!
AV: We typically find that the business side is willing to adopt new technologies so it can directly increase business value. Back to DataOps, we enable the business side to develop operational data science solutions, through reliable and robust continuous integration, while establishing, through the use of our tools, DataOps best practices. So, when the business side is ready to have IT leadership take ownership of its proven data implementations, we already have a layer of governance, security, and auditing around it, which makes the transition that much easier.
We talk about operationalizing data. In many cases, organizations have invested in PhD-level scientists to develop, implement, and validate data models. They do this by building what is normally a one-off analytic. It works beautifully, but at that point, the model has not provided any business value to the organization.
That one-off data model or data analytic must fit into a larger data workflow, one that the organization supports, and which works in conjunction with IT. It must integrate with production databases, query data, pull it into the analytic model, perform the computation, and push it back into other production databases, production CRMs, maybe into ERP systems. It’s that part—the data-workflow management—that is missing in today’s Big Data solutions. That’s where the Composable platform comes in. It allows you to connect the data sets, plug-and-play the analytics—that you either write or bring in from other open-source libraries—and be part of this broader operational process.
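As a rough illustration of the workflow just described (query a production database, run the analytic, push results to another system), here is a minimal Python sketch using the standard-library sqlite3 module. It is a toy under stated assumptions: the accounts and scores tables, the score function, and run_workflow are all made up for this example, and in-memory SQLite databases stand in for real production systems.

```python
import sqlite3


def score(balance: float) -> float:
    """Placeholder for a one-off analytic model: a made-up score capped at 1.0."""
    return min(balance / 10000.0, 1.0)


def run_workflow(source: sqlite3.Connection, target: sqlite3.Connection) -> int:
    """Query the source, apply the model, and push results into the target."""
    # 1. Query data from the (hypothetical) source table.
    rows = source.execute("SELECT customer_id, balance FROM accounts").fetchall()
    # 2. Perform the computation with the analytic model.
    results = [(customer_id, score(balance)) for customer_id, balance in rows]
    # 3. Push results into a downstream table, replacing any stale scores.
    target.execute(
        "CREATE TABLE IF NOT EXISTS scores (customer_id TEXT PRIMARY KEY, score REAL)"
    )
    target.executemany("INSERT OR REPLACE INTO scores VALUES (?, ?)", results)
    target.commit()
    return len(results)


def demo() -> list:
    """Wire up two in-memory databases and run the workflow once."""
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE accounts (customer_id TEXT, balance REAL)")
    source.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("a", 2500.0), ("b", 50000.0)]
    )
    target = sqlite3.connect(":memory:")
    run_workflow(source, target)
    return target.execute(
        "SELECT customer_id, score FROM scores ORDER BY customer_id"
    ).fetchall()
```

The model itself is a single function here; the surrounding query and write-back steps are the data-workflow management part that the answer above says is usually missing.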
TW: You’re preaching to the converted! Enterprises need to hear the DataOps gospel. But I think most face a challenge on both the data consumption and data management sides of the house: They must overcome conflicting objectives to collaborate. Do you find that it’s difficult to evangelize collaboration to these enterprise groups?
AV: No. It’s actually easy once we’re in. When enterprises use our platform as a framework for building these operational data flows, we typically have good engagement with IT leaders because they see things are developed correctly.
TW: What’s deployment like?
AV: The platform is a distributed web application developed as a cloud-native application. It can be deployed on the cloud, and scales well both horizontally and vertically. You can spin up an instance of Composable on AWS or Microsoft Azure, but the public cloud is not required. We can deploy Composable for an enterprise on-premises. Back to our Department of Defense legacy, one of our requirements was to be able to run not just on-premises, but on air-gapped networks, and we can do that. With some of our customers in insurance and finance, the data is sensitive, and we run on a cluster behind the corporate firewall, completely disconnected from the web.
TW: What’s Composable’s funding situation?
AV: We were lucky enough to leave MIT with a product and customers ready and waiting. From day one—the end of 2014—we’ve been completely client-funded.
TW: Will you look to subsidize growth with outside investment?
AV: Yes. I think 2017 is the year for us. We’re reaching a point where capital will help us scale out dramatically.
We’re a growing but small company, with the entire team being technical and focused on product development. As we grow, our focus will be to bring on forward-deployed engineers and customer success managers to help with deployment. This will help us approach a broader set of customers and work with them to develop a DataOps strategy based on a small-scale, short-term pilot that may last one or two months at most. After that, and after they see the value, they buy into Composable as a licensed delivery platform.
TW: Where is your customer base?
AV: All regions, but predominantly domestic. We have, for example, one large customer that is a global energy conglomerate with operations in South America and other parts of the world.
TW: I understand you’re producing an upcoming conference?
AV: Yes—the DataOps Summit conference series. The next event is in June here in our hometown in Boston. We’re focused on getting all the data professionals into the same room. That’s both the business side of the house and technical audiences, like software developers, data scientists, data engineers, IT operations, quality assurance engineers, and so on. More details online at dataopssummit.com.
Many enterprises have invested in data science, and developed some cool data applications, and now must figure out how to put them in an operational workflow to actually generate value! That’s what we’re trying to illustrate with this DataOps Summit series. We’ll bring in executives from the business side—financial services, insurance, oil and gas, cybersecurity, other verticals as well—and talk about what DataOps tools, techniques, best practices they can put together around data operations. But we’ll listen, too: The technology vendors in the room—Composable and others—can work with them on a DataOps vision that we can all build towards.
TW: Where does Composable Analytics go from here?
AV: First, democratizing data science. Enterprise business users should be able to work more and more like data scientists. Our current end users are typically sophisticated business users, but not necessarily technical. Ultimately, they know the business better than anyone else. We’re creating a framework to help these users develop their own analytical workflows. Composable has a visual designer that lets you create complex dataflows regardless of your technical level. That means a complex data pipeline can be created visually, just as you would draw out a workflow on a whiteboard! We have a machine-learning computational framework behind this that will accelerate the process for an analyst to build these workflows. As that analyst selects different modules to build up the data flow, the machine will recommend the next such module to come in. So, machine learning is accelerating the development of new machine-learning data flows. That’s pretty cool.
Second, there’s a lot of noise out there, and we’ve seen many organizations delay data-management solution adoption. Composable started as a self-service analytics platform, but over time has become a DataOps platform with orchestration, automation, and analytics aimed at getting people out of the rat’s nest of spreadsheets and thinking about modern data architectures. We see DataOps being this transformative notion of best practices that allow organizations to say “Okay, we can do this.” We know how to do software development. We know how to build production systems. Now, let’s bring that to the data world and start to think about production data platforms and operational data science.