Topics of Interest Archives: DataOps

Smart DataOps: How AI Can Enhance Enterprise Data-workflow Management

Artificial Intelligence (AI) in a data environment holds great promise. The vendor marketing message is seductive: “We use advanced technology to make things better for you.”

But it’s where the AI rubber meets the data process road that the value proposition can get muddled. In evaluating data software, integrated AI technology must be more than a marketing evaluation checkbox. Understanding how AI can impact DataOps value delivery must be a fundamental objective in the data-driven enterprise. Otherwise, the value-add part becomes meaningless.

In this report, we’ll assess AI’s potential to impact DataOps environments. (Here’s a hint: It’s about the process.) Enterprises that adopt a DataOps approach to collaborative data-workflow management establish a commitment to continuous improvement. AI can help enterprises establish efficient DataOps, but only if the value AI provides goes beyond predictive analytics (and simple marketing promises).

To read the rest of this report, please fill out the download form.


Posted in Research

In Praise of (Data) Transparency - Part #2

In my previous blog on data transparency, I posited my admittedly idealistic vision that, within reason, the more an enterprise fosters the free flow of data across its organization, the better. In this follow-up, I’ll look at some of the organizational blockers to data workflows, and how to get around them.

I’ll start with the basic underlying ideal: More data is better. If I work in marketing, I need to be able to see marketing data. And sales data. And financial data. And product management data. And…I could go on, but you get the point.

The problem, the challenge, really, is that in far too many organizations, that glorious cross-functional data just doesn’t flow across the enterprise, or I should say, over or through its silos, be they functional, architectural, or process-based. Perhaps it’s naive of me to ask, but why on earth does this obstinate hindrance to progress still persist?

Data blockages, whether institutional or human-created, lead to data-hoarding. (Know any data hoarders in your enterprise? Am I the only one who thinks “Data Hoarders” would make for a great reality show?)

Let’s look at some of the organizational contributors to data blockage. Do any of these data-hoarding characteristics hit close to home?

  • Provincialism: “It’s my data. I own it. Only I get to derive value from it. Plus, I may be able to use it against those who anger me.”
  • Trust (or more specifically, the lack thereof): “This data is proprietary, and must remain confidential. I don’t know who you hire over there in [other department that’s not mine], therefore, I cannot trust you with this information.”
  • Change is a threat: “We’ve always done it this way. We’ve never shared before, and we’re not about to change for your benefit.”
  • Incompatibility: “You’re the one who chose that marketing automation solution. It’s not my fault it doesn’t easily integrate with my CRM.”
  • Misplaced or missing incentives: “What benefit will I see if I share data with you? It will cost me time/money to share, and could even be a risk…one I’m not willing to take.”

The inefficient flow of information in an enterprise so often boils down to organizational dysfunction. How willing are you and your colleagues to work together to share data? Would you share your team’s data with someone in your enterprise you don’t like? Does sharing your team’s data with another group deliver tangible benefits to your team?

Defeating the data-hoarders requires a corporate commitment to the free flow of data over, under, and through the enterprise. That’s an organizational behavior and leadership challenge that should be addressed at the C-suite level.

Moving towards data transparency requires more than just progressive leadership. Effective data integration is a prerequisite. Technology helps, on both the data management and data consumption sides of the equation. For example, Informatica frames its data-management capabilities around its Enterprise Information Catalog, or EIC for short. The EIC is Informatica’s data catalog solution, a technology that leverages machine learning to catalog, classify, and map relationships between enterprise data assets. The end user (typically a data scientist or even a business user) can get at her or his data assets via a search interface. That new process delivers benefits: Discovery is convenient, access is accelerated, and perhaps most importantly, the data is trustworthy.

The data-workflow approach championed by Informatica and other data-integration and data-cataloging vendors works, and delivers all the tangible benefits the vendors’ respective marketing materials trumpet. But no technology by itself can overcome myopic, office-politics-driven data-hoarding. To reap the benefits of true enterprise data transparency, you’re going to have to come to agreement with your peers—even the ones who drive you crazy—on five simple words: “We’re all in this together.”

Posted in Blog

In Praise of (Data) Transparency - Part #1



In speaking with colleagues, enterprises, and data technology vendors, I often tell this cautionary tale: At a prior company, I led a marketing operations revamp. The effort included a comprehensive redesign, re-architecture, and rebuild of the corporate website, with particular focus on scaling to support the online sale of thousands of SKUs. Coupled with that ambitious agenda, my team and I worked to develop data-driven operations, using integrated marketing automation software to collect and analyze opportunities along a sales-funnel-mapped customer journey.

We succeeded in building our idealized technology solution (a month early and under budget, I might add rather egotistically), along with associated delineated business processes. But the value delivery—in insight, process automation, and strengthened customer relationships—stopped at the door to the marketing department. The sales team had little interest in and even less commitment to improving its own data discipline, so integration with an archaic CRM was out of the question. Worse, the finance team protected financial information to such an extent that seeing live revenue data—even when generated by the website we’d developed—required written (on paper, no less) permission.

I look back on this experience with some nostalgia but more than a little frustration. The result of our herculean development efforts? Effective data-driven marketing. Hampered by duplicated processes across divisions. And ultimately, the realization that corporate myopia limited us to no more than a functional silo of data-driven success. (Read what I really think of silo isolationism here.)

What would have made it more successful? Several things. C-suite-level buy-in in other departments. A shared commitment to creating a data-driven enterprise that extended beyond just the realm of marketing. And perhaps most essentially, a willingness to share data across the enterprise.

I shared this anecdote with a fellow analyst at a recent tradeshow. The analyst (one whom I respect and acknowledge is oh-so-much smarter than me) posited the argument that my call for open data transparency across the organization is unrealistic. The analyst’s practical view was that individual departments in an enterprise will remain protective of their data, and that it’s not reasonable to expect the finance lead to share data with the marketing lead, or customer service lead, or R&D lead, etc. My not-so-implicit take: Silos aren’t going away, and given that fact, we should build/deploy data technology solutions in and around them.

Neither of us is right or wrong here. I know I suffer from pie-in-the-sky idealism when it comes to eradicating enterprise silo culture. (“Never gonna happen,” said an F1000 high-tech consulting-client VP to me once when I proposed collaborating with another VP in another functional silo to achieve shared efficiencies.) And the other analyst’s point is a good one: You can aim for the sky, but if you’re going to get anything done, you’d better start work down here on earth.

I still cling to that free-movement-of-data-is-a-good-thing idealism, something that I actually encounter every now and then in the real world (though, admittedly, typically in smaller, newer, often-SaaS-based companies). In an upcoming blog post I’ll discuss the types of enterprise blockers to data transparency. But in the meantime, here’s my ask of those of you who have read this far: Is my idealism out of touch? Or is the free flow of data in an enterprise something for which we should still strive? Email me your data transparency/opacity success/horror stories at, or DM me at @TophW47.

Posted in Blog

The New Way of Work: How DataOps Transforms Enterprise Roles

What if…

…line-of-business leaders spent less time getting data and more time acting upon it?

…analysts spent less time munging data and more time analyzing?

…managers managed data-workflow outcomes as diligently as they managed business-workflow outcomes?

Enterprise digital transformation offers the promise of data-driven value delivery. But without a DataOps approach, data-progressive organizations run the risk of burdening data-driven initiatives with archaic processes, risk-averse culture, and out-of-date functional roles.

A successful enterprise digital transformation and, more importantly, an effective DataOps approach both require a new way of thinking about those traditional functional roles. Data consumption functions must evolve to accommodate more data and more insight. Data management roles must adapt to deliver and broker data, and manage comprehensive end-to-end data workflows through the organization. Finally, measurement must start with value, enabling data managers to work backwards through the enterprise to maximize that value delivery with the right data in the right place at the right time via a dynamic, continuously-improving process.

To read the rest of this report, please fill out the download form.


Posted in Research

Beyond Self-Service: How Machine Learning Drives Enterprise Data’s Third Wave

Enterprises undergoing digital transformations move through three phases of maturity: Commodity Storage, Self-service Everything, and Machine-learning Ubiquity. At each stage, enterprise data technology innovations have served end users seeking to get the most value out of their data.

Many enterprises have reached that second stage, using self-service data technologies to empower end users to access and consume data on their own. But the convenience of self-service data technology is self-limiting: As enterprise data grows, it gets harder for end users to find it, figure out what to do with it, and gain insight from it. And that’s a complex challenge only exacerbated by static, technology-reinforced, self-service processes.

An emerging third phase responds to that challenge, and helps enterprises move into a dynamic data operations environment characterized by smart workflows, self-optimizing data workflow orchestration, and an enterprise commitment to maximizing data-derived value. In this new world, enterprises leverage machine-learning technologies to craft DataOps models that learn with iteration, and scale with continuous improvement. Coupling that approach with embedded analytics can deliver insight at the point of its greatest potential impact: where data meets decision.

In this report, Blue Hill Research examines how digital transformations have evolved, and looks at how innovative enterprises are using machine-learning-enabled technology like GoodData to accelerate data flow, shorten communication spans, empower line-of-business stakeholders, and deliver greater bottom-line value (while overturning a few old-school business models in the process).

To read the rest of this report, please fill out the download form.


Posted in Research

No More Silos: How DataOps Technologies Overcome Enterprise Data Isolationism

Data—and the value derived from it—dictates success in the modern enterprise. Enterprises that exploit data to derive value recognize new revenue, see new efficiencies, and enjoy intangible benefits like strengthened customer relationships and greater marketing efficiency.

But organizational ennui, legacy system burdens, and change aversion conspire to bury enterprise data in metaphorical silos. The free flow of data is a mandate for success in the modern enterprise. When silos obstruct data-workflow efficiency, that modern enterprise cannot maximize data-derived value.

In this report, Blue Hill Research examines how enterprise leaders use DataOps approaches to break down silos, whether those silos are organizational, architectural, or process-driven. This report also introduces a migration framework for DataOps adoption.

To read the rest of this report, please fill out the download form.


Posted in Research

This Week in DataOps: Rain, the Real World, and Another Manifesto (the Good Kind)

As the saying goes, April showers bring May flowers, unless you live in British Columbia, where April showers bring May showers, and let’s face it, the joke doesn’t work as well with June flowers and pilgrims.

It’s been a big week in the DataOps world. First off, if you missed it (or even if you didn’t and want to listen to it again—thanks, Mom), check out the recording of the joint webinar I did last week with Information Builders’ marketing VP Jake Freivald, “DataOps in the Real World.” We talked collaborative data orchestration (long hashtag), DataOps in healthcare, and fast-talkers. Some fun things you’ll learn:

  • Information Builders’ latest Omni-Gen release includes a unique, tiered-functionality offering of three different toolsets: Integration, Data Quality, and MDM editions.
  • The Information Builders engagement with customer St. Luke’s University Health Network (a relationship I profiled here in an earlier DataOps research piece) was so successful that the two parties have collaborated to package the solution as a healthcare-vertical-targeted BI and analytics solution.
  • They can’t hear you if you knock your headset microphone away from your face.
  • No matter its relevance, “COMAECAL” is not a particularly marketable DataOps acronym. (Sing it with me, Collaborate! Orchestrate! Measure! Accelerate!…)

Qubole founders (and former Facebook infrastructure engineers, and Apache Hive co-developers) Ashish Thusoo and Joydeep Sen Sarma have just authored “Creating a Data-Driven Enterprise with DataOps.” The book, published by O’Reilly, evangelizes both DataOps corporate culture and platform. It also features case examples from the likes of eBay, Twitter, and Uber. Expect some promotion (!), presentations, and available copies at the upcoming Qubole-sponsored Data Platforms 2017 conference next month. (Check out my “Questioning Authority” DataOps interview with Qubole CEO Thusoo here.)

Also, in case you missed it, the big news last week was Infor’s acquisition of cloud BI and analytics developer Birst. The move is an interesting one, in part because it raises the profile of BI in an enterprise context: Infor offers ERP solutions, and now Birst BI tools will snap into that portfolio.

It’s still a work in progress, but if you’re committed to DataOps like the folks at DataKitchen, check out the draft DataOps Manifesto developed by a consortium of DataOps leaders. (I’m a big fan of DataOps manifestos.) It’s a call to action for the DataOps-faithful, and a series of (evolving) DataOps principles.

Finally, I’m looking forward to the upcoming Talend Connect and Informatica World events in California. Find me and let’s talk DataOps ‘til we’re blue in the face. (Just kidding. I’ll stop at flushed pink.)

Posted in Blog

This Week in DataOps: The Promotional Edition

Spring has sprung (finally, though only briefly here in Canada), which means it’s webinar and publishing season! And that makes for a busy month in the DataOps world.

Join me and Information Builders VP of Marketing Jake Freivald on Thursday, April 27, 2017 for our webinar on “DataOps in the Real World: How Innovators are Reinventing Their Business Models with Smarter Data Management.” I’ll be providing an overview of DataOps (what it is, how it works, and why it matters) and presenting an interesting healthcare case example. (So far, only two slides include pictures of my head.) I’m looking forward to an enlightening discussion! Registration details and more information available here.


Silos kill! Well, they at least hinder progress. Keep an eye out for my upcoming DataOps report “No More Silos: How DataOps Technologies Overcome Enterprise Data Isolationism.” (Tentative publication date = Friday, April 28, 2017.) The research looks at how data innovators leverage technologies from vendors like Informatica, Domo, Switchboard Software, Microsoft, Yellowfin, and GoodData to break down organizational, architectural, and process-based enterprise silos.

Here’s what the first page might just look like:

[Image: first page of the “No More Silos” report]

Posted in Blog

This Week in DataOps: The Tradeshow Edition

DataOps wasn’t the most deafening sound at Strata + Hadoop World San Jose this year, but as data-workflow orchestration models go, the DataOps music gets louder with each event. I’ve written before about Boston-based DataOps startup Composable Analytics. But several Strata startups are starting to get attention too.

Still-in-stealth-mode-but-let’s-get-a-Strata-booth-anyway San Francisco-based startup Nexla is pitching a combined DataOps + machine-learning message. The Nexla platform enables customers to connect, move, transform, secure, and (most significantly) monitor their data streams. Nexla’s mission is to get end users deriving value from data rather than spending time working to access it. (Check out Nexla’s new DataOps industry survey.)

DataKitchen is another DataOps four-year-overnight success. The startup out of Cambridge, Massachusetts also exhibited at Strata. DataKitchen users can create, manage, replicate, and share defined data workflows under the guise of “self-service data orchestration.” The DataKitchen guys—“Head Chef” Christopher Bergh and co-founder Gil Benghiat—wore chef’s outfits and handed out logo’ed wooden mixing spoons. (Because your data workflow is a “recipe.” Get it?)


DataOps in the wild — The Nexla and DataKitchen exhibition booths at Strata + Hadoop World San Jose.

Another DataOps-y theme at Strata: “Continuous Analytics.” In most common parlance, the buzzphrase suggests “BI on BI,” enabling data-workflow monitoring/management to tweak and improve, with the implied notion of consumable, always-on, probably-streaming, real-time BI. Israeli startup Iguazio preaches the continuous analytics message (as well as plenty of performance benchmarking) as part of its “Unified Data Platform” offering.

I got the chance to talk DataOps with IBM honchos Madhu Kochar and Pandit Prasad of the IBM Almaden Research Center. Kochar and Prasad are tasked with the small challenge of reinventing how enterprises derive value from their data with analytics. IBM’s recently announced Watson AI partnership with Salesforce Einstein is only the latest salvo in IBM’s efforts to deliver, manage, and shape AI in the enterprise.

Meanwhile, over in the data-prep world, the data wranglers over at Trifacta are working to “fix the data supply chain” with self-service, democratized data access. CEO Adam Wilson preached a message of business value—Trifacta’s platform shift aims to resonate with line-of-business stakeholders, and is music to the ears of a DataOps wonk like me. (And it echoes CTO Joe Hellerstein’s LOB-focused technical story from last fall.)

Many vendors are supplementing evangelism efforts with training outreach programs. DataRobot, for example, has introduced its own DataRobot University. The education initiative is intended both for enterprise training and for grassroots marketing, with pilot academic programs already in place at a major American university that you’ve heard of but that shall remain nameless, as well as the National University of Singapore and several others.

Another common theme: The curse of well-intentioned technology. Informatica’s Murthy Mathiprakasam identifies two potential (and related) data transformation pitfalls: cheap solutions for data lakes that can turn them into high-maintenance, inaccessible data swamps, and self-service solutions that can reinforce data-access bad habits, foster data silos, and limit process repeatability. (In his words, “The fragmented approach is literally creating the data swamp problem.”) Informatica’s approach: unified metadata management and machine-learning capabilities powering an integrated data lake solution. (As with so many fundamentals of data governance, the first challenge is doing the metadata-unifying. The second will be evangelizing it.)

I got the opportunity to meet with Talend customer Beachbody. Beachbody may be best known for producing the “P90” and “Insanity” exercise programs, and continues to certify its broad network of exercise professionals. What’s cool from a DataOps perspective: Beachbody uses Talend to provide transparency, auditability, and control via a visible data workflow from partner to CEO. More importantly, data delivery—at every stage of the data supply chain—is now real time. To get to that, Beachbody moved its information stores to AWS and—working with Talend—built a data lake in the cloud offering self-service capabilities. After a speedy deployment, Beachbody now enjoys faster processing and better job execution using fewer resources.

More Strata quick hits:

  • Qubole is publishing a DataOps e-book with O’Reilly. The case-study-focused piece includes use-case examples from the likes of Walmart.
  • Pentaho is committed to getting its machine-learning technology into common use in the data-driven enterprise. What’s cool (to me): the ML orchestration capabilities and Pentaho’s emphasis on a “test-and-tune” deployment model.
  • Attunity offers three products using two verbs and a noun. Its Replicate solution enables real-time data integration/migration, Compose delivers a data-warehouse automation layer, but it is Attunity’s Visibility product that tells the most interesting DataOps story: It provides “BI-on-BI” operations monitoring (focused on data lakes).
  • Check out Striim’s BI-on-BI approach to streaming analytics. It couples data integration with a DataOps-ish operations-monitoring perspective on data consumption. It’s a great way to scale consumption with data volume growth. (The two i’s stand for “Integration” and “Intelligence.” Ah.)
  • Along those same lines, anomaly-detection technology innovator Anodot has grown substantially in the last six months, and promises a new way to monitor line-of-business data. Look for new product, package, and service announcements from Anodot in the next few months.

Last week I attended Domo’s annual customer funfest Domopalooza in Salt Lake City. More on Domo’s announcements coming soon, but a quick summary:

  • Focus was noticeably humble (core product has improved dramatically from four years ago, when it wasn’t so great, admitted CEO Josh James in his first keynote) and business-value-focused. (James: “We don’t talk about optimizing queries. (Puke!) We talk about optimizing your business.”)
  • There was a definite scent of DataOps in the air. CSO Niall Browne presented on Domo data governance. The Domo data governance story emphasizes transparency with control, a message that will be welcomed in IT leadership circles.
  • Domo introduced a new OEMish model called “Domo Everywhere.” It allows partners to develop custom Domo solutions, with three tiers of licensing: white label, embed, and publish.
  • Some cool core enhancements include new alert capabilities, DataOps-oriented data-lineage tracking in Domo Analyzer, and Domo “Mr. Roboto” (yes, that’s what they’re calling it) AI functionality.
  • Domo also introduced its “Business-in-a-Box” package of pre-produced dashboard elements to accelerate enterprise deployment. (One cool dataviz UI element demoed at the show: Sample charts are pre-populated with applicable data, allowing end users to view data in the context of different chart designs.)

Finally, and not at all tradeshow-related, Australian BI leader Yellowfin has just announced its semi-annual upgrade to its namesake BI solution. Yellowfin version “7.3+” comes out in May. (The “+” might be Australian for “.1”.) The news is all about extensibility, with many, many new web connectors. But most interesting (to me at least) is its JSON connector capability that enables users to establish their own data workflows. (Next step, I hope: visual-mapping of that connectivity for top-down workflow orchestration.)

Posted in Blog

On DataOps, the DoD, and Operationalizing Data Science: Questioning Authority with Composable Analytics’ Andy Vidan

Andy Vidan is the CEO of Cambridge, Massachusetts-based DataOps startup Composable Analytics. He founded the company two years ago with MIT colleague Lars Fiedler. They now lead Composable (self-funded and self-sustaining, by the way) and are establishing a beachhead in the nascent DataOps space. I recently spoke with him about the genesis of his company, what it’s like to (maybe) work with the U.S. DoD, and the challenge of evangelizing DataOps to line-of-business stakeholders.

TOPH WHITMORE: Tell me about Composable Analytics.

ANDY VIDAN: Composable Analytics grew out of a project at MIT’s Lincoln Laboratory. Lincoln Lab is an MIT R&D center that provides advanced technology solutions to the U.S. Department of Defense and intelligence community. There, we saw the clear need for a unifying platform that can ingest all types of data and feed it to an intelligence analyst. An intelligence analyst within the Department of Defense is similar to a business analyst within the private sector. They’re sophisticated. They know their subject matter well, better than software developers may ever know their business. But they’re not always technical, and when they have to deal with different data sets from different systems, with different formats and different structures, they must rely on software engineers and use a variety of disjoint tools that further complicate their workflows.

Our approach was different: We wanted to develop a single ecosystem to bring in data from all sorts of sources, and present it to the user for self-service data discovery and analytics. For us, Big Data always meant all data. Aside from the massive amounts of data (which the community already knows how to handle) or even the high Big Data velocity and throughput, we focused on the variability that comes with all data: There’s always tabular data, and tabular data, and more tabular data, but we also have to think about image files, text documents, PDFs, sound files, and so on. We also wanted to make data accessible to an end user who knows the subject matter but is not a technical person.

TW: You and Lars Fiedler developed Composable while working at Lincoln Lab. How did Composable evolve from an MIT idea into a commercial solution?

AV: Lincoln Laboratory is a well-kept secret.

TW: With the defense department involved, it probably has to be!

AV: Yes. MIT Lincoln Laboratory is really one of the premier research labs in the US, very much like the old Bell Labs, or the Jet Propulsion Lab that NASA runs with Caltech. Composable Analytics was initially funded directly by the DoD. The nice thing about Lincoln Lab is that you have that user interaction. You aren’t just writing research papers; you are prototyping, building systems, and meeting with end users (in this case, intelligence analysts and operators) to really get down to requirements and get a system that they would eventually use.

TW: Does Composable Analytics still serve the Department of Defense?

AV: Yeah. So I can’t really answer the question.

TW: Good enough!

AV: Our main focus is private sector.

TW: Tell me more about the Composable Analytics technology. What value propositions do you offer to an enterprise IT leader?

AV: Three things: orchestration, automation, and analytics. To me, that really embodies what’s behind DataOps. Our platform, our ecosystem provides those three things for an enterprise and for users of data within that enterprise.

Let me walk you through a real use case: One of our financial sector customers wants to build effective customer profiles. One touch point is their call center. You might call in to request a change of address after a recent real-estate purchase. This is normally a short call: the call center agent would change the address and hang up the phone and everybody’s happy. But this is a situation where an organization can learn more about the customer. An enterprise can use that little tidbit of information that you just revealed about yourself in order to understand what other products and services you might be interested in. The fact that you purchased a home might mean you’re willing to purchase life insurance. You might mention you are having a baby. That might prompt you to open an educational savings account with the company. What does this require? Being able to integrate with a Voice-over-IP system and orchestrate a data flow that takes the call-center recording in real time, pushes it into a speech-to-text engine, takes the resulting unstructured text, and uses various analytics and natural-language-processing techniques to determine intent, sentiment, and trigger words that can then be inserted directly back into a CRM. The call center agent can see that on your profile and talk to you about it during that call, or next time you call. That embodies orchestration, automation, plus analytics. Those are the types of complex all-data flow use cases we’re addressing.
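(Editor’s aside: for readers who think in code, the call-center flow Vidan describes can be sketched roughly as below. This is a toy illustration under loud assumptions, not Composable’s implementation: the “recording” is plain UTF-8 text standing in for audio, `speech_to_text` is a stub rather than a real engine, the trigger phrases and product names in `TRIGGERS` are invented, and the “CRM” is just a Python dict.)

```python
from dataclasses import dataclass, field

# Hypothetical trigger phrases mapped to products; purely illustrative.
TRIGGERS = {
    "bought a house": "life insurance",
    "purchased a home": "life insurance",
    "having a baby": "educational savings account",
}


@dataclass
class CustomerProfile:
    customer_id: str
    suggestions: list = field(default_factory=list)


def speech_to_text(recording: bytes) -> str:
    """Stub for a speech-to-text engine: the 'recording' is already UTF-8 text."""
    return recording.decode("utf-8")


def extract_triggers(transcript: str) -> list:
    """Naive analytics step: return the product for each trigger phrase found."""
    text = transcript.lower()
    return [product for phrase, product in TRIGGERS.items() if phrase in text]


def update_crm(crm: dict, customer_id: str, products: list) -> CustomerProfile:
    """Write the findings back to the (in-memory, assumed) CRM without duplicates."""
    profile = crm.setdefault(customer_id, CustomerProfile(customer_id))
    for product in products:
        if product not in profile.suggestions:
            profile.suggestions.append(product)
    return profile


def process_call(crm: dict, customer_id: str, recording: bytes) -> CustomerProfile:
    """Orchestrate the stages: recording -> transcript -> triggers -> CRM."""
    transcript = speech_to_text(recording)
    products = extract_triggers(transcript)
    return update_crm(crm, customer_id, products)


crm = {}
call = b"Hi, I just purchased a home, and we are having a baby in June."
profile = process_call(crm, "c-1", call)
print(profile.suggestions)
```

Each stage is a plain function, so a real speech-to-text service or NLP model could be swapped in behind the same signature without touching the orchestration step.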

TW: It sounds like a platform play. Are you essentially offering and delivering and serving pretty much the whole data value chain from ingestion through consumption?

AV: Yes, we are, and that’s where DataOps comes into play. There’s always raw data out there. At the end of the day your business users are getting value from applications, Excel or Dynamics or Power BI or Salesforce or NetSuite, whatever it is. But there’s a whole process that happens in between the raw data getting to the high-level application, a process that encompasses orchestration, automation, and analytics. That’s our play. That’s where we live. That’s what we do well.

TW: I like to talk about the enterprise conflict between IT leadership and line-of-business stakeholders like my former marketer self. Toph-the-marketing-boy wants self-service everything—data immediacy without data-administration complexity. On the other side, IT leadership is tasked with ensuring auditability, lineage, governance, security. Which side of that customer equation do you target? IT side? Business influencer? Or both?

AV: Almost always the business side.

TW: Interesting. I confess that’s not what I expected!

AV: We typically find that the business side is willing to adopt new technologies so it can directly increase business value. Back to DataOps, we enable the business side to develop operational data science solutions, through reliable and robust continuous integration, while establishing, through the use of our tools, DataOps best practices. So, when the business side is ready to have IT leadership take ownership of its proven data implementations, we already have a layer of governance, security, and auditing around it, which makes the transition that much easier.

We talk about operationalizing data. In many cases, organizations have invested in PhD-level scientists to develop, implement, and validate data models. They do this by building what is normally a one-off analytic. It works beautifully, but at that point, the model has not provided any business value to the organization.

That one-off data model or data analytic must fit into a larger data workflow, one that the organization supports, and which works in conjunction with IT. It must integrate with production databases, query data, pull it into the analytic model, perform the computation, and push it back into other production databases, production CRMs, maybe into ERP systems. It’s that part—the data-workflow management—that is missing in today’s Big Data solutions. That’s where the Composable platform comes in. It allows you to connect the data sets, plug-and-play the analytics—that you either write or bring in from other open-source libraries—and be part of this broader operational process.

TW: You’re preaching to the converted! Enterprises need to hear the DataOps gospel. But I think most face a challenge on both the data consumption and data management sides of the house: They must overcome conflicting objectives to collaborate. Do you find that it’s difficult to evangelize collaboration to these enterprise groups?

AV: No. It’s actually easy once we’re in. When enterprises use our platform as a framework for building these operational data flows, we typically have good engagement with IT leaders because they see things are developed correctly.

TW: What’s deployment like?

AV: The platform is a distributed web application developed as a native cloud application. It can be deployed on the cloud, and scales well both horizontally and vertically. You can spin up an instance of Composable on AWS or Microsoft Azure, but the public cloud is not required. We can deploy Composable for an enterprise on-premises. Back to our Department of Defense legacy, one of our requirements was to be able to run not just on-premises, but on air-gapped networks, and we can do that. With some of our customers—within insurance and finance—the data is sensitive, and we run on a cluster behind the corporate firewall completely disconnected from the web.

TW: What’s Composable’s funding situation?

AV: We were lucky enough to leave MIT with a product and customers ready and waiting. From day one—the end of 2014—we’ve been completely client-funded.

TW: Will you look to subsidize growth with outside investment?

AV: Yes. I think 2017 is the year for us. We’re reaching a point where capital will help us scale out dramatically.

We’re a growing but small company, with the entire team being technical and focused on product development. As we grow, our focus will be to bring on forward-deployed engineers and customer success managers to help with deployment. This will help us approach a broader set of customers and work with them to develop a DataOps Strategy, based on a small-scale, short-term pilot, that may last one or two months at most. After that, and after they see the value, they buy into Composable as a licensed delivery platform.

TW: Where is your customer base?

AV: All regions, but predominantly domestic. We have, for example, one large customer that is a global energy conglomerate with operations in South America and other parts of the world.

TW: I understand you’re producing an upcoming conference?

AV: Yes—the DataOps Summit conference series. The next event is in June here in our hometown in Boston. We’re focused on getting all the data professionals into the same room. That’s both the business side of the house and technical audiences, like software developers, data scientists, data engineers, IT operations, quality assurance engineers, and so on. More details online at

Many enterprises have invested in data science, and developed some cool data applications, and now must figure out how to put them in an operational workflow to actually generate value! That’s what we’re trying to illustrate with this DataOps Summit series. We’ll bring in executives from the business side—financial services, insurance, oil and gas, cybersecurity, other verticals as well—and talk about what DataOps tools, techniques, best practices they can put together around data operations. But we’ll listen, too: The technology vendors in the room—Composable and others—can work with them on a DataOps vision that we can all build towards.

TW: Where does Composable Analytics go from here?

AV: First, democratizing data science. Enterprise business users should be able to work more and more like data scientists. Our current end users are typically sophisticated business users, but not necessarily technical. Ultimately, they know the business better than anyone else. We’re creating a framework to help these users develop their own analytical workflows. Composable has a visual designer that lets you create complex dataflows regardless of your technical level. That means a complex data pipeline can be created visually, just as you would draw out a workflow on a whiteboard! We have a machine-learning computational framework behind this that will accelerate the process for an analyst to build these workflows. As that analyst selects different modules to build up the data flow, the machine will recommend the next such module to come in. So, machine learning is accelerating the development of new machine-learning data flows. That’s pretty cool.

Second, there’s a lot of noise out there, and we’ve seen many organizations delay adopting data-management solutions. Composable started as a self-service analytics platform, but over time has become a DataOps platform with orchestration, automation, and analytics aimed at getting people out of the rat’s nest of spreadsheets and getting them to start thinking about modern data architectures. We see DataOps as a transformative set of best practices that allows organizations to say “Okay, we can do this.” We know how to do software development. We know how to build production systems. Now, let’s bring that to the data world and start to think about production data platforms and operational data science.
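The module recommendation AV describes in his first point, where the system suggests the next module as an analyst builds a data flow, can be sketched in a very simplified form. The Python example below is only an assumption about one way such a recommender could work: it counts which module most often follows another in previously built workflows. The module names and function names are hypothetical, and a simple frequency count stands in for whatever machine-learning framework Composable actually uses.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def build_transition_counts(workflows: List[List[str]]) -> Dict[str, Counter]:
    """Count how often each module is directly followed by another."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for flow in workflows:
        # Pair each module with its immediate successor in the flow.
        for current, following in zip(flow, flow[1:]):
            counts[current][following] += 1
    return counts


def recommend_next(counts: Dict[str, Counter], current_module: str,
                   top_n: int = 3) -> List[str]:
    """Return up to top_n modules that most often followed current_module."""
    successors = counts.get(current_module, Counter())
    return [module for module, _ in successors.most_common(top_n)]


# Hypothetical history of workflows built by analysts.
past_workflows = [
    ["sql_query", "speech_to_text", "sentiment", "crm_write"],
    ["sql_query", "sentiment", "crm_write"],
    ["sql_query", "sentiment", "email_alert"],
]

transition_counts = build_transition_counts(past_workflows)
suggestions = recommend_next(transition_counts, "sentiment")
```

With this history, an analyst who has just placed a `sentiment` module would be offered `crm_write` first, because it followed `sentiment` in two of the three past workflows. A module that has never been seen simply yields an empty list of suggestions.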

Posted in Governance, Risk Management, and Compliance, Blog, Operations
