Topics of Interest Archives: DataOps

The New Way of Work: How DataOps Transforms Enterprise Roles

What if…

…line-of-business leaders spent less time getting data and more time acting upon it?

…analysts spent less time munging data and more time analyzing?

…managers managed data-workflow outcomes as diligently as they managed business-workflow outcomes?

Enterprise digital transformation offers the promise of data-driven value delivery. But without a DataOps approach, data-progressive organizations run the risk of burdening data-driven initiatives with archaic processes, risk-averse culture, and out-of-date functional roles.

A successful enterprise digital transformation and, more importantly, an effective DataOps approach both require a new way of thinking about those traditional functional roles. Data consumption functions must evolve to accommodate more data and more insight. Data management roles must adapt to deliver and broker data, and manage comprehensive end-to-end data workflows through the organization. Finally, measurement must start with value, enabling data managers to work backwards through the enterprise to maximize that value delivery with the right data in the right place at the right time via a dynamic, continuously-improving process.

To read the rest of this report, please fill out the download form.


Posted in Research

Beyond Self-Service: How Machine Learning Drives Enterprise Data’s Third Wave

Enterprises undergoing digital transformations move through three phases of maturity: Commodity Storage, Self-service Everything, and Machine-learning Ubiquity. At each stage, enterprise data technology innovations have served end users seeking to get the most value out of their data.

Many enterprises have reached that second stage, using self-service data technologies to empower end users to access and consume data on their own. But the convenience of self-service data technology is self-limiting: As enterprise data grows, it becomes harder for end users to find it, figure out what to do with it, and gain insight from it. And that’s a complex challenge only exacerbated by static, technology-reinforced, self-service processes.

An emerging third phase responds to that challenge, and helps enterprises move into a dynamic data operations environment characterized by smart workflows, self-optimizing data workflow orchestration, and an enterprise commitment to maximizing data-derived value. In this new world, enterprises leverage machine-learning technologies to craft DataOps models that learn with iteration, and scale with continuous improvement. Coupling that approach with embedded analytics can deliver insight at the point of its greatest potential impact: where data meets decision.

In this report, Blue Hill Research examines how digital transformations have evolved, and looks at how innovative enterprises are using machine-learning-enabled technology like GoodData to accelerate data flow, shorten communication spans, empower line-of-business stakeholders, and deliver greater bottom-line value (while overturning a few old-school business models in the process).

To read the rest of this report, please fill out the download form.


Posted in Research

No More Silos: How DataOps Technologies Overcome Enterprise Data Isolationism

Data—and the value derived from it—dictates success in the modern enterprise. Enterprises that exploit data to derive value recognize new revenue, see new efficiencies, and enjoy intangible benefits like strengthened customer relationships and greater marketing efficiency.

But organizational ennui, legacy system burdens, and change aversion conspire to bury enterprise data in metaphorical silos. The free flow of data is a mandate for success in the modern enterprise. When silos obstruct data-workflow efficiency, that modern enterprise cannot maximize data-derived value.

In this report, Blue Hill Research examines how enterprise leaders use DataOps approaches to break down silos, whether those silos are organizational, architectural, or process-driven. This report also introduces a migration framework for DataOps adoption.

To read the rest of this report, please fill out the download form.


Posted in Research

This Week in DataOps: Rain, the Real World, and Another Manifesto (the Good Kind)

As the saying goes, April showers bring May flowers, unless you live in British Columbia, where April showers bring May showers, and let’s face it, the joke doesn’t work as well with June flowers and pilgrims.

It’s been a big week in the DataOps world. First off, if you missed it (or even if you didn’t and want to listen to it again—thanks, Mom), check out the recording of the joint webinar I did last week with Information Builders’ marketing VP Jake Freivald, “DataOps in the Real World.” We talked collaborative data orchestration (long hashtag), DataOps in healthcare, and fast-talkers. Some fun things you’ll learn:

  • Information Builders’ latest Omni-Gen release includes a unique, tiered-functionality offering of three different toolsets, including Integration, Data Quality, and MDM editions.
  • The Information Builders engagement with customer St. Luke’s University Health Network (a relationship I profiled here in an earlier DataOps research piece) was so successful that the two parties have collaborated to package the solution as a healthcare-vertical-targeted BI and analytics solution.
  • They can’t hear you if you knock your headset microphone away from your face.
  • No matter its relevance, “COMAECAL” is not a particularly marketable DataOps acronym. (Sing it with me, Collaborate! Orchestrate! Measure! Accelerate!…)

Qubole founders (and former Facebook infrastructure engineers, and Apache Hive co-developers) Ashish Thusoo and Joydeep Sen Sarma have just authored “Creating a Data-Driven Enterprise with DataOps.” The book, published by O’Reilly, evangelizes both DataOps corporate culture and platform. It also features case examples from the likes of eBay, Twitter, and Uber. Expect some promotion (!), presentations, and available copies at the upcoming Qubole-sponsored Data Platforms 2017 conference next month. (Check out my “Questioning Authority” DataOps interview with Qubole CEO Thusoo here.)

Also, in case you missed it, the big news last week was Infor’s acquisition of cloud BI and analytics developer Birst. The move is an interesting one, in part because it raises the profile of BI in an enterprise context: Infor offers ERP solutions, and now Birst BI tools will snap into that portfolio.

It’s still a work in progress but if you’re committed to DataOps like the folks at DataKitchen, check out the draft DataOps Manifesto developed by a consortium of DataOps leaders. (I’m a big fan of DataOps manifestos.) It’s a call to action for the DataOps-faithful, and a series of (evolving) DataOps principles.

Finally, I’m looking forward to the upcoming Talend Connect and Informatica World events in California. Find me and let’s talk DataOps ‘til we’re blue in the face. (Just kidding. I’ll stop at flushed pink.)

Posted in Blog

This Week in DataOps: The Promotional Edition

Spring has sprung (finally, though only briefly here in Canada), which means it’s webinar and publishing season! And that makes for a busy month in the DataOps world.

Join me and Information Builders VP of Marketing Jake Freivald on Thursday, April 27, 2017, for our webinar on “DataOps in the Real World: How Innovators are Reinventing Their Business Models with Smarter Data Management.” I’ll be providing an overview of DataOps (what it is, how it works, and why it matters) and presenting an interesting healthcare case example. (So far, only two slides include pictures of my head.) I’m looking forward to an enlightening discussion! Registration details and more information available here.


Silos kill! Well, they at least hinder progress. Keep an eye out for my upcoming DataOps report “No More Silos: How DataOps Technologies Overcome Enterprise Data Isolationism.” (Tentative publication date = Friday, April 28, 2017.) The research looks at how data innovators leverage technologies from vendors like Informatica, Domo, Switchboard Software, Microsoft, Yellowfin, and GoodData to break down organizational, architectural, and process-based enterprise silos.

Here’s what the first page might just look like:

[Image: first page of “No More Silos”]

Posted in Blog

This Week in DataOps: The Tradeshow Edition

DataOps wasn’t the most deafening sound at Strata + Hadoop World San Jose this year, but as data-workflow orchestration models go, the DataOps music gets louder with each event. I’ve written before about Boston-based DataOps startup Composable Analytics. But several Strata startups are starting to get attention too.

Still-in-stealth-mode-but-let’s-get-a-Strata-booth-anyway San Francisco-based startup Nexla is pitching a combined DataOps + machine-learning message. The Nexla platform enables customers to connect, move, transform, secure, and (most significantly) monitor their data streams. Nexla’s mission is to get end users deriving value from data rather than spending time working to access it. (Check out Nexla’s new DataOps industry survey.)

DataKitchen is another DataOps four-year-overnight success. The startup out of Cambridge, Massachusetts also exhibited at Strata. DataKitchen users can create, manage, replicate, and share defined data workflows under the guise of “self-service data orchestration.” The DataKitchen guys—“Head Chef” Christopher Bergh and co-founder Gil Benghiat—wore chef’s outfits and handed out logo’ed wooden mixing spoons. (Because your data workflow is a “recipe.” Get it?)


DataOps in the wild — The Nexla and DataKitchen exhibition booths at Strata + Hadoop World San Jose.

Another DataOps-y theme at Strata: “Continuous Analytics.” In most common parlance, the buzzphrase suggests “BI on BI,” enabling data-workflow monitoring/management to tweak and improve, with the implied notion of consumable, always-on, probably-streaming, real-time BI. Israeli startup Iguazio preaches the continuous analytics message (as well as plenty of performance benchmarking) as part of its “Unified Data Platform” offering.

I got the chance to talk DataOps with IBM honchos Madhu Kochar and Pandit Prasad of the IBM Almaden Research Center. Kochar and Prasad are tasked with the small challenge of reinventing how enterprises derive value from their data with analytics. IBM’s recently announced Watson AI partnership with Salesforce Einstein is only the latest salvo in IBM’s efforts to deliver, manage, and shape AI in the enterprise.

Meanwhile, over in the data-prep world, the data wranglers over at Trifacta are working to “fix the data supply chain” with self-service, democratized data access. CEO Adam Wilson preached a message of business value—Trifacta’s platform shift aims to resonate with line-of-business stakeholders, and is music to the ears of a DataOps wonk like me. (And it echoes CTO Joe Hellerstein’s LOB-focused technical story from last fall.)

Many vendors are supplementing evangelism efforts with training outreach programs. DataRobot, for example, has introduced its own DataRobot University. The education initiative is intended both for enterprise training and for grassroots marketing, with pilot academic programs already in place at a major American university you’ve heard of but that shall remain nameless, as well as the National University of Singapore and several others.

Another common theme: The curse of well-intentioned technology. Informatica’s Murthy Mathiprakasam identifies two potential (and related) data transformation pitfalls: cheap solutions for data lakes that can turn them into high-maintenance, inaccessible data swamps, and self-service solutions that can reinforce data-access bad habits, foster data silos, and limit process repeatability. (In his words, “The fragmented approach is literally creating the data swamp problem.”) Informatica’s approach: unified metadata management and machine-learning capabilities powering an integrated data lake solution. (As with so many fundamentals of data governance, the first challenge is doing the metadata-unifying. The second will be evangelizing it.)

I got the opportunity to meet with Talend customer Beachbody. Beachbody may be best known for producing the “P90” and “Insanity” exercise programs, and continues to certify its broad network of exercise professionals. What’s cool from a DataOps perspective: Beachbody uses Talend to provide transparency, auditability, and control via a visible data workflow from partner to CEO. More importantly, data delivery—at every stage of the data supply chain—is now real time. To get to that, Beachbody moved its information stores to AWS and—working with Talend—built a data lake in the cloud offering self-service capabilities. After a speedy deployment, Beachbody now enjoys faster processing and better job execution using fewer resources.

More Strata quick hits:

  • Qubole is publishing a DataOps e-book with O’Reilly. The case-study focused piece includes use-case examples from the likes of Walmart.
  • Pentaho is committed to getting its machine-learning technology into common use in the data-driven enterprise. What’s cool (to me): the ML orchestration capabilities and Pentaho’s emphasis on a “test-and-tune” deployment model.
  • Attunity offers three products using two verbs and a noun. Its Replicate solution enables real-time data integration/migration, Compose delivers a data-warehouse automation layer, but it is Attunity’s Visibility product that tells the most interesting DataOps story: It provides “BI-on-BI” operations monitoring (focused on data lakes).
  • Check out Striim’s BI-on-BI approach to streaming analytics. It couples data integration with a DataOps-ish operations-monitoring perspective on data consumption. It’s a great way to scale consumption with data volume growth. (The two i’s stand for “Integration” and “Intelligence.” Ah.)
  • Along those same lines, anomaly-detection technology innovator Anodot has grown substantially in the last six months, and promises a new way to monitor line-of-business data. Look for new product, package, and service announcements from Anodot in the next few months.

Last week I attended Domo’s annual customer funfest Domopalooza in Salt Lake City. More on Domo’s announcements coming soon, but a quick summary:

  • Focus was noticeably humble (core product has improved dramatically from four years ago, when it wasn’t so great, admitted CEO Josh James in his first keynote) and business-value-focused. (James: “We don’t talk about optimizing queries. (Puke!) We talk about optimizing your business.”)
  • There was a definite scent of DataOps in the air. CSO Niall Browne presented on Domo data governance. The Domo data governance story emphasizes transparency with control, a message that will be welcomed in IT leadership circles.
  • Domo introduced a new OEMish model called “Domo Everywhere.” It allows partners to develop custom Domo solutions, with three tiers of licensing: white label, embed, and publish.
  • Some cool core enhancements include new alert capabilities, DataOps-oriented data-lineage tracking in Domo Analyzer, and Domo “Mr. Roboto” (yes, that’s what they’re calling it) AI functionality.
  • Domo also introduced its “Business-in-a-Box” package of pre-produced dashboard elements to accelerate enterprise deployment. (One cool dataviz UI element demoed at the show: Sample charts are pre-populated with applicable data, allowing end users to view data in the context of different chart designs.)

Finally, and not at all tradeshow-related, Australian BI leader Yellowfin has just announced its semi-annual upgrade to its namesake BI solution. Yellowfin version “7.3+” comes out in May. (The “+” might be Australian for “.1”.) The news is all about extensibility, with many, many new web connectors. But most interesting (to me at least) is its JSON connector capability that enables users to establish their own data workflows. (Next step, I hope: visual-mapping of that connectivity for top-down workflow orchestration.)

Posted in Blog

On DataOps, the DoD, and Operationalizing Data Science: Questioning Authority with Composable Analytics’ Andy Vidan

Andy Vidan is the CEO of Cambridge, Massachusetts-based DataOps startup Composable Analytics. He founded the company two years ago with MIT colleague Lars Fiedler. They now lead Composable (self-funded and self-sustaining, by the way) and are establishing a beachhead in the nascent DataOps space. I recently spoke with him about the genesis of his company, what it’s like to (maybe) work with the U.S. DoD, and the challenge of evangelizing DataOps to line-of-business stakeholders.

TOPH WHITMORE: Tell me about Composable Analytics.

ANDY VIDAN: Composable Analytics grew out of a project at MIT’s Lincoln Laboratory. Lincoln Lab is an MIT R&D center that provides advanced technology solutions to the U.S. Department of Defense and intelligence community. There, we saw the clear need for a unifying platform that can ingest all types of data and feed it to an intelligence analyst. An intelligence analyst within the Department of Defense is similar to a business analyst within the private sector. They’re sophisticated. They know their subject matter well, better than software developers may ever know their business. But they’re not always technical, and when they have to deal with different data sets from different systems, with different formats and different structures, they must rely on software engineers and use a variety of disjoint tools that further complicate their workflows.

Our approach was different: We wanted to develop a single ecosystem to bring in data from all sorts of sources, and present it to the user for self-service data discovery and analytics. For us, Big Data always meant all data. Aside from the massive amounts of data (which the community already knows how to handle) or even the high Big Data velocity and throughput, we focused on the variability that comes with all data: There’s always tabular data, and tabular data, and more tabular data, but we also have to think about image files, text documents, PDFs, sound files, and so on. We also wanted to make data accessible to an end user who knows the subject matter but is not a technical person.

TW: You and Lars Fiedler developed Composable while working at Lincoln Lab. How did Composable evolve from an MIT idea into a commercial solution?

AV: Lincoln Laboratory is a well-kept secret.

TW: With the defense department involved, it probably has to be!

AV: Yes. MIT Lincoln Laboratory is really one of the premier research labs in the US, very much like the old Bell Labs, or the Jet Propulsion Lab that NASA runs with Caltech. Composable Analytics was initially funded directly by the DoD. The nice thing about Lincoln Lab is that you have that user interaction. You aren’t just writing research papers, you are prototyping, building systems, you are meeting with end users (in this case, intelligence analysts and operators) to be able to really get down to requirements and get a system that they would eventually use.

TW: Does Composable Analytics still serve the Department of Defense?

AV: Yeah. So I can’t really answer the question.

TW: Good enough!

AV: Our main focus is private sector.

TW: Tell me more about the Composable Analytics technology. What value propositions do you offer to an enterprise IT leader?

AV: Three things: orchestration, automation, and analytics. To me, that really embodies what’s behind DataOps. Our platform, our ecosystem provides those three things for an enterprise and for users of data within that enterprise.

Let me walk you through a real use case: One of our financial sector customers wants to build effective customer profiles. One touch point is their call center. You might call in to request a change of address after a recent real-estate purchase. This is normally a short call: the call center agent would change the address and hang up the phone and everybody’s happy. But this is a situation where an organization can learn more about the customer. An enterprise can use that little tidbit of information that you just revealed about yourself in order to understand what other products and services you might be interested in. The fact that you purchased a home might mean you’re willing to purchase life insurance. You might mention you are having a baby. That might incite you to open an educational savings account with the company. What does this require? Being able to integrate with a Voice-over-IP system and orchestrate a data flow that takes the call-center recording, in real time, pushes it into a speech-to-text engine, takes the resulting unstructured text and uses various analytics and natural language processing techniques in order to determine intent, sentiment, and trigger words that can then be directly inserted back into a CRM. The call center agent can see that on your profile and talk to you about it during that call, or next time you call. That embodies orchestration, automation, plus analytics. Those are the types of complex all-data flow use cases we’re addressing.
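The call-center flow described above (recording, speech-to-text, text analytics, CRM update) can be sketched as a chain of small functions. This is only an illustrative sketch, not Composable's implementation: the `TRIGGERS` table, `CustomerProfile`, and every function name here are hypothetical, the speech-to-text step is a stub that simply decodes text, and plain keyword matching stands in for the natural language processing step.

```python
from dataclasses import dataclass, field

# Hypothetical trigger phrases mapped to products an agent might suggest.
TRIGGERS = {
    "bought a house": "life insurance",
    "purchased a home": "life insurance",
    "having a baby": "educational savings account",
}


@dataclass
class CustomerProfile:
    """Stand-in for a CRM record (assumed shape, for illustration only)."""
    customer_id: str
    suggestions: list = field(default_factory=list)


def transcribe(recording: bytes) -> str:
    """Stub for a speech-to-text engine: the 'recording' is just UTF-8 text."""
    return recording.decode("utf-8")


def extract_triggers(transcript: str) -> list:
    """Naive keyword matching standing in for intent and trigger-word analysis."""
    text = transcript.lower()
    found = []
    for phrase, product in TRIGGERS.items():
        if phrase in text and product not in found:
            found.append(product)
    return found


def process_call(recording: bytes, profile: CustomerProfile) -> CustomerProfile:
    """Orchestrate the flow: recording -> transcript -> signals -> profile."""
    transcript = transcribe(recording)
    for product in extract_triggers(transcript):
        if product not in profile.suggestions:
            profile.suggestions.append(product)
    return profile


profile = process_call(
    b"Hi, I just purchased a home and we're having a baby, so I need a new address on file.",
    CustomerProfile(customer_id="c-001"),
)
print(profile.suggestions)
```

In a real deployment each stub would be replaced by a call to an actual service; the point of the sketch is only that the stages compose into one automated flow whose output lands on the customer's profile.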

TW: It sounds like a platform play. Are you essentially offering and delivering and serving pretty much the whole data value chain from ingestion through consumption?

AV: Yes, we are, and that’s where DataOps comes into play. There’s always raw data out there. At the end of the day your business users are getting value from applications, Excel or Dynamics or Power BI or Salesforce or NetSuite, whatever it is. But there’s a whole process that happens in between the raw data getting to the high-level application, a process that encompasses orchestration, automation, and analytics. That’s our play. That’s where we live. That’s what we do well.

TW: I like to talk about the enterprise conflict between IT leadership and line-of-business stakeholders like my former marketer self. Toph-the-marketing-boy wants self-service everything—data immediacy without data-administration complexity. On the other side, IT leadership is tasked with ensuring auditability, lineage, governance, security. Which side of that customer equation do you target? IT side? Business influencer? Or both?

AV: Almost always the business side.

TW: Interesting. I confess that’s not what I expected!

AV: We typically find that the business side is willing to adopt new technologies so it can directly increase business value. Back to DataOps, we enable the business side to develop operational data science solutions, through reliable and robust continuous integration, while establishing, through the use of our tools, DataOps best practices. So, when the business side is ready to have IT leadership take ownership of its proven data implementations, we already have a layer of governance, security, and auditing around it, which makes the transition that much easier.

We talk about operationalizing data. In many cases, organizations have invested in PhD-level scientists to develop, implement, and validate data models. They do this by building what is normally a one-off analytic. It works beautifully, but at that point, the model has not provided any business value to the organization.

That one-off data model or data analytic must fit into a larger data workflow, one that the organization supports, and which works in conjunction with IT. It must integrate with production databases, query data, pull it into the analytic model, perform the computation, and push it back into other production databases, production CRMs, maybe into ERP systems. It’s that part—the data-workflow management—that is missing in today’s Big Data solutions. That’s where the Composable platform comes in. It allows you to connect the data sets, plug-and-play the analytics—that you either write or bring in from other open-source libraries—and be part of this broader operational process.
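As a rough illustration of what "fitting a one-off analytic into a larger data workflow" can mean, the sketch below queries a source table, runs a placeholder model, and writes the results to a downstream table. Everything in it is an assumption made for the example: the table names, the `churn_score` formula, and the use of an in-memory SQLite database in place of production databases and CRMs.

```python
import sqlite3


def churn_score(monthly_calls: int, months_active: int) -> float:
    """Placeholder for a data scientist's one-off model (hypothetical formula)."""
    return round(min(1.0, monthly_calls / (months_active + 1)), 2)


def run_workflow(conn: sqlite3.Connection) -> int:
    """Query source rows, score each one, and push results downstream."""
    rows = conn.execute(
        "SELECT customer_id, monthly_calls, months_active FROM customers"
    ).fetchall()
    scored = [(cid, churn_score(calls, months)) for cid, calls, months in rows]
    conn.executemany(
        "INSERT OR REPLACE INTO crm_scores (customer_id, score) VALUES (?, ?)",
        scored,
    )
    conn.commit()
    return len(scored)


# In-memory database standing in for the production source and target systems.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "customer_id TEXT PRIMARY KEY, monthly_calls INTEGER, months_active INTEGER)"
)
conn.execute("CREATE TABLE crm_scores (customer_id TEXT PRIMARY KEY, score REAL)")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", [("a", 4, 3), ("b", 1, 9)])

written = run_workflow(conn)
results = conn.execute("SELECT * FROM crm_scores ORDER BY customer_id").fetchall()
print(written, results)
```

The model itself is one line; most of the code is the surrounding query, write-back, and scheduling surface, which is the part the interview identifies as missing from one-off analytics.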

TW: You’re preaching to the converted! Enterprises need to hear the DataOps gospel. But I think most face a challenge on both the data consumption and data management sides of the house: They must overcome conflicting objectives to collaborate. Do you find that it’s difficult to evangelize collaboration to these enterprise groups?

AV: No. It’s actually easy once we’re in. When enterprises use our platform as a framework for building these operational data flows, we typically have good engagement with IT leaders because they see things are developed correctly.

TW: What’s deployment like?

AV: The platform is a distributed web application developed as a native cloud application. It can be deployed on the cloud, and scales well both horizontally and vertically. You can spin up an instance of Composable on AWS or Microsoft Azure, but the public cloud is not required. We can deploy Composable for an enterprise on-premises. Back to our Department of Defense legacy, one of our requirements was to be able to run not just on-premises, but on air-gapped networks, and we can do that. With some of our customers—within insurance and finance—the data is sensitive, and we run on a cluster behind the corporate firewall completely disconnected from the web.

TW: What’s Composable’s funding situation?

AV: We were lucky enough to leave MIT with a product and customers ready and waiting. From day one—the end of 2014—we’ve been completely client-funded.

TW: Will you look to subsidize growth with outside investment?

AV: Yes. I think 2017 is the year for us. We’re reaching a point where capital will help us scale out dramatically.

We’re a growing but small company, with the entire team being technical and focused on product development. As we grow, our focus will be to bring on forward-deployed engineers and customer success managers to help with deployment. This will help us approach a broader set of customers and work with them to develop a DataOps Strategy, based on a small-scale, short-term pilot, that may last one or two months at most. After that, and after they see the value, they buy into Composable as a licensed delivery platform.

TW: Where is your customer base?

AV: All regions, but predominantly domestic. We have, for example, one large customer that is a global energy conglomerate with operations in South America and other parts of the world.

TW: I understand you’re producing an upcoming conference?

AV: Yes—the DataOps Summit conference series. The next event is in June here in our hometown in Boston. We’re focused on getting all the data professionals into the same room. That’s both the business side of the house and technical audiences, like software developers, data scientists, data engineers, IT operations, quality assurance engineers, and so on. More details online at

Many enterprises have invested in data science, and developed some cool data applications, and now must figure out how to put them in an operational workflow to actually generate value! That’s what we’re trying to illustrate with this DataOps Summit series. We’ll bring in executives from the business side—financial services, insurance, oil and gas, cybersecurity, other verticals as well—and talk about what DataOps tools, techniques, best practices they can put together around data operations. But we’ll listen, too: The technology vendors in the room—Composable and others—can work with them on a DataOps vision that we can all build towards.

TW: Where does Composable Analytics go from here?

AV: First, democratizing data science. Enterprise business users should be able to work more and more like data scientists. Our current end users are typically sophisticated business users, but not necessarily technical. Ultimately, they know the business better than anyone else. We’re creating a framework to help these users develop their own analytical workflows. Composable has a visual designer that lets you create complex dataflows regardless of your technical level. That means a complex data pipeline can be created visually, just as you would draw out a workflow on a whiteboard! We have a machine-learning computational framework behind this that will accelerate the process for an analyst to build these workflows. As that analyst selects different modules to build up the data flow, the machine will recommend the next such module to come in. So, machine learning is accelerating the development of new machine-learning data flows. That’s pretty cool.

Second, there’s a lot of noise out there, and we’ve seen many organizations delay data-management solution adoption. Composable started as a self-service analytics platform, but over time has become a DataOps platform with orchestration, automation, and analytics aimed at getting people out of the rat’s nest of spreadsheets, and to start thinking about modern data architectures. We see DataOps being this transformative notion of best practices that allow organizations to say “Okay, we can do this.” We know how to do software development. We know how to build production systems. Now, let’s bring that to the data world and start to think about production data platforms and operational data science.
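The "machine recommends the next module" idea mentioned above can be illustrated with a very simple technique: count which module most often follows the current one in previously built workflows. This is a toy frequency-based sketch under that assumption, not a description of Composable's machine-learning framework; the module names and the `train` and `recommend` helpers are hypothetical.

```python
from collections import Counter, defaultdict


def train(workflows):
    """Count which module follows which across previously built workflows."""
    follows = defaultdict(Counter)
    for flow in workflows:
        for current, nxt in zip(flow, flow[1:]):
            follows[current][nxt] += 1
    return follows


def recommend(follows, current, k=1):
    """Return up to k modules that most often followed `current`."""
    counts = follows.get(current, Counter())
    return [module for module, _ in counts.most_common(k)]


# Hypothetical history of workflows, each an ordered list of module names.
history = [
    ["csv_reader", "clean", "join", "chart"],
    ["csv_reader", "clean", "model"],
    ["sql_query", "clean", "join", "export"],
]

follows = train(history)
print(recommend(follows, "clean"))
```

Here "join" followed "clean" in two of the three workflows, so it is the top suggestion; a module that has never been seen simply yields no recommendation.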

Posted in Blog, Governance, Risk Management, and Compliance, Operations

This Week in DataOps: Manifestos, Shocking Steps, and the Rise of Data Governance


Welcome to the first edition of This Week in DataOps! (And before you ask, no, it probably won’t come out every week.) For a reference point, think of “This Week in Baseball,” only the highlights are about data-derived value maximization. (Yes, that’s the hashtag: #dataderivedvaluemaximization. Lot of competition for that trademark, I bet.)

In this roundup: Two DataOps companies step into the light, two upcoming DataOps events take the stage, and a big DataOps buy signals a big DataOps player’s commitment to data governance transparency.

In news from BHR HQ city Beantown, two new startups have taken up the mantra of DataOps. Composable Analytics, based across the Charles in Cambridge, grew out of a project at MIT’s Lincoln Laboratory. Cofounders Andy Vidan and Lars Fiedler started Composable back in 2014 with the aim of delivering orchestration, automation, and analytics, all within a DataOps context. Check out Andy’s lucid manifesto “Moving Forward with DataOps.” (I’m a big fan of DataOps manifestos, by the way.) Key takeaway: Real-time data flows, analytics delivered as a service, and composability are essential to DataOps success.

Another Boston-area firm is making news in the DataOps space. (New Cambridge, Massachusetts tourism slogan: Come for the craft beer. Stay for the data workflow management.) DataKitchen is the self-described “DataOps Company,” and delivers an algorithmic platform based on data “kitchens,” where enterprise data consumers create data “recipes” spanning data access, transformation, modeling, and visualization. And cofounders Christopher Bergh and Gil Benghiat will be speaking on “Seven Steps to High-velocity Data Analytics with DataOps” at this month’s Strata + Hadoop World event in San Jose. (Apparently some of the steps are “shocking!” More details on that not-at-all-clickbaity preso here.)

Speaking of upcoming events, two feature a DataOps agenda. In June, head to…yep, Cambridge, Massachusetts for the DataOps Summit, a two-day show produced by the nice folks at Composable Analytics. Day one will focus on DataOps business use cases, and day two will examine DataOps technical innovations. Speakers include Tamr CEO Andy Palmer, MIT Lincoln Lab researcher Vijay Gadepally, Unravel Data CTO Bala Venkatrao, IBM UrbanCode Deploy product manager Laurel Dickson-Bull, and chief technologist for PwC’s Global Data & Analytics practice Ritesh Ramesh. (Maybe don’t bring up the Oscars with Ritesh.)

And in late May, head to Phoenix for Data Platforms 2017. This year’s theme is “Engineering the Future with DataOps.” The show is sponsored by O’Reilly, Qubole, Amazon Web Services, and Oracle. Featured speakers include former Obama administration “Geek in Chief” R. David Edelman, Qubole CEO Ashish Thusoo, and Facebook engineering director Ravi Murthy.

And in case you missed it:

  • Informatica acquired UK-based data governance software developer Diaku. The Diaku data governance app snaps nicely into the broader Informatica portfolio. Plus, Informatica gets more tech talent and a somewhat greater foothold in Europe. The purchase signals a commitment by Informatica (and, arguably, by the broader data-management software space) to DataOps-y principles of orchestration, transparency, and workflow-based collaboration.
  • Tamr just patented its data unification model! As Tamr notes, the concept of data unification may not necessarily be particularly new, but Tamr’s “comprehensive approach for integrating a large number of data sources” coupled with its machine-learning algorithms is uniquely innovative enough to merit patent protection, at least in the judgment of the nice folks at the U.S. Patent and Trademark Office.

That’s it for now. See you next week in DataOps!

Posted in Analytics, Blog, Governance, Risk Management, and Compliance

DataOps, “Agile Growability,” and a Humble Dose of Humanity: 11 Things I Want to Hear at Strata + Hadoop World San Jose

Strata + Hadoop World San Jose is coming, and (trade show junkie that I am) I’m once again filled with anticipation. I look forward to new and exciting technologies on display, plenty of marketing hype, and of course, brightly-colored logo pens (especially the ones that double as flashlights or USB sticks). In addition to the sweet swag, here’s what I hope to see and hear in California…

Acknowledgement from data-technology vendors of the growing influence of business end users in purchase decisions. It’s no longer just about the IT leader! Selling technology for technology’s sake is not enough any more, and vendors who ignore business leadership audiences in their messaging do so at their own peril. I want to hear how cool new technologies will help not just IT leadership, but business users as well.

Context! I’m a Strata-holic. I want to see all the new features of all the new functional solutions. But I want to see those solutions demo’ed in the context of broader business and DataOps workflows.

Business value! Imagine, if you will, a solution message that starts with business value and works its way backwards…like say, a technology positioned as the business case for a DataOps approach. The new data-technology sale is less about the how and more about the why: delivering tangible, measurable enterprise business value. Why aren’t we all getting that yet? (Hat-tip to the GoodData social-media folks for this much better way of putting it.)

Speaking of business value, I’m eager to hear a compelling “cloud + data = goodness” message from Microsoft. I like where Microsoft is going with its Cortana Intelligence Suite, Azure Data Factory, and Power BI. (Full disclosure: I used to work there.) But I want more. Excluding a certain online bookseller located on the opposite side of Lake Washington, Microsoft is the only major enterprise data management solution provider that owns the cloud, so to speak. In this instance, at least from Microsoft’s selling perspective, cloud is more than a commoditized, off-premises storage option; it’s a strategic advantage…I think. And I want to hear about how that’s a potential advantage for me, expressed (empathetically!) in data-analytics value terms.

And speaking of coherent cloud messages, I’m still waiting for a good solution to the data-consumption bottleneck. How can data consumers digest data (think streaming) as fast as the architecture can scale to store it? (The answer is not hiring more interns to monitor reporting dashboards.) Toph-the-marketing-boy should be able to avoid missing stuff, test new data applications easily, and work with exponentially greater datasets than he currently can. (Sisense paid darn good lip service to this challenge last fall, and I’m looking forward to an update.)

And speaking of that already-here-no-longer-looming data-consumption bottleneck as an example, I’m particularly interested in companies with data technologies that work “here” applied to what’s going on over “there.” For instance, Anodot takes its anomaly-detection technology beyond the ops world and uses it to attack the data-consumption-as-data-volume-grows-exponentially challenge. And Rocana performance-monitoring software doubles nicely as an accountability and visibility solution for senior (read: non-technical) management.

Orchestration across the silos! Point solutions are good. Functional solutions are good. But when they don’t support cross-function and cross-organizational-silo transparency, their success is limited. Platform-level data orchestration is the next big thing, and not everyone is addressing it yet. Teradata’s “Unified Data Architecture” messaging is a good start. (Teradata marketing folks, please save me a logo pen.) So is Domo’s anti-silo evangelism.

The next layer of trust in data: data solutions that are smart enough to provide on-the-fly extensibility. Call it “agile growability,” call it “smart integration,” but what it really is is a data-management model that grows dynamically as it learns from its own operation. (Continuous improvement? Oh yeah. V2.0.)  A good DataOps workflow provides the best data journey at that moment. A great DataOps workflow is smart enough to improve itself over time. A business user should be able to not just trust in the data now, but trust that the next dataset will be even better. Who’s headed this way?

Actian, I’ve had a change of heart. Please…bring back the dancers.

Democratization that’s meaningful. TIBCO, I’m looking your way: Tell me more about “self-service integration for all” (and why it’s better than the alternatives). And DataRobot, your advanced analytics are stellar, but what’s the true business impact of my becoming a “citizen data scientist”?

Finally, a human request: Our industry has been built upon, and thrives because of, the contributions of immigrants. I speak as one (to Canada) when I ask: How can we support our tech workers impacted by possible U.S. immigration restrictions? Some initial options for Big Data companies: sign amicus briefs, petition for more H-1B visas, and hug your employees. And if it comes to it, consider opening satellite development offices in other countries. (Canadian technology firms may not wait for you.)


Posted in Blog, Research
