Topics of Interest Archives: DataKitchen

This Week in DataOps: Rain, the Real World, and Another Manifesto (the Good Kind)

TWIDO logoAs the saying goes, April showers bring May flowers, unless you live in British Columbia, where April showers bring May showers, and let’s face it, the joke doesn’t work as well with June flowers and pilgrims.

It’s been a big week in the DataOps world. First off, if you missed it (or even if you didn’t and want to listen to it again—thanks, Mom), check out the recording of the joint webinar I did last week with Information Builders’ marketing VP Jake Freivald, “DataOps in the Real World.” We talked collaborative data orchestration (long hashtag), DataOps in healthcare, and fast-talkers. Some fun things you’ll learn:

  • Information Builders’ latest Omni-Gen release includes a unique, tiered-functionality offering of three different toolsets, including Integration, Data Quality, and MDM editions.
  • The Information Builders engagement with customer St. Luke’s University Health Network (a relationship I profiled here in an earlier DataOps research piece) was so successful that the two parties have collaborated to package the solution as a healthcare-vertical-targeted BI and analytics solution.
  • They can’t hear you if you knock your headset microphone away from your face.
  • No matter its relevance, “COMAECAL” is not a particularly marketable DataOps acronym. (Sing it with me, Collaborate! Orchestrate! Measure! Accelerate!…)

dataops_landing_890x200_1Qubole founders (and former Facebook infrastructure engineers, and Apache Hive co-developers) Ashish Thusoo and Joydeep Sen Sarma have just authored “Creating a Data-Driven Enterprise with DataOps.” The book—published by O’Reilly—evangelizes both DataOps corporate culture and platform. It also features case examples from the likes of eBay, Twitter, and Uber. Expect some promotion (!), presentations, and available copies at the upcoming Qubole-sponsored Data Platforms 2017 conference next month. (Check out my “Questioning Authority” DataOps interview with Qubole CEO Thusoo here.)

Also, in case you missed it, the big news last week was Infor’s acquisition of cloud BI and analytics developer Birst. The move is an interesting one, in part because it raises the profile of BI in an enterprise context: Infor offers ERP solutions, and now Birst BI tools will snap into that portfolio.

It’s still a work in progress but if you’re committed to DataOps like the folks at DataKitchen, check out the draft DataOps Manifesto developed by a consortium of DataOps leaders. (I’m a big fan of DataOps manifestos.) It’s a call to action for the DataOps-faithful, and a series of (evolving) DataOps principles.

Finally, I’m looking forward to the upcoming Talend Connect and Informatica World events in California. Find me and let’s talk DataOps ‘til we’re blue in the face. (Just kidding. I’ll stop at flushed pink.)

Posted in Blog | Tagged , , , , | Leave a comment

This Week in DataOps: The Tradeshow Edition

TWIDO logoDataOps wasn’t the most deafening sound at Strata + Hadoop World San Jose this year, but as data-workflow orchestration models go, the DataOps music gets louder with each event. I’ve written before about Boston-based DataOps startup Composable Analytics. But several Strata startups are starting to get attention too.

Still-in-stealth-mode-but-let’s-get-a-Strata-booth-anyway San Francisco-based startup Nexla is pitching a combined DataOps + machine-learning message. The Nexla platform enables customers to connect, move, transform, secure, and (most significantly) monitor their data streams. Nexla’s mission is to get end users deriving value from data rather than spending time working to access it. (Check out Nexla’s new DataOps industry survey.)

DataKitchen is another DataOps four-year-overnight success. The startup out of Cambridge, Massachusetts also exhibited at Strata. DataKitchen users can create, manage, replicate, and share defined data workflows under the guise of “self-service data orchestration.” The DataKitchen guys—“Head Chef” Christopher Bergh and co-founder Gil Benghiat—wore chef’s outfits and handed out logo’ed wooden mixing spoons. (Because your data workflow is a “recipe.” Get it?)

DataOps at Strata - Nexla and DataKitchen booths

DataOps in the wild — The Nexla and DataKitchen exhibition booths at Strata + Hadoop World San Jose.

Another DataOps-y theme at Strata: “Continuous Analytics.” In most common parlance, the buzzphrase suggests “BI on BI,” enabling data-workflow monitoring/management to tweak and improve, with the implied notion of consumable, always-on, probably-streaming, real-time BI. Israeli startup Iguazio preaches the continuous analytics message (as well as plenty of performance benchmarking) as part of its “Unified Data Platform” offering.

I got the chance to talk DataOps with IBM honchos Madhu Kochar and Pandit Prasad of the IBM Almaden Research Center. Kochar and Prasad are tasked with the small challenge of reinventing how enterprises derive value from their data with analytics. IBM’s recently announced Watson AI partnership with Salesforce Einstein is only the latest salvo in IBM’s efforts to deliver, manage, and shape AI in the enterprise.

Meanwhile, over in the data-prep world, the data wranglers over at Trifacta are working to “fix the data supply chain” with self-service, democratized data access. CEO Adam Wilson preached a message of business value—Trifacta’s platform shift aims to resonate with line-of-business stakeholders, and is music to the ears of a DataOps wonk like me. (And it echoes CTO Joe Hellerstein’s LOB-focused technical story from last fall.)

Many vendors are supplementing evangelism efforts with training outreach programs. DataRobot, for example, has introduced its own DataRobot University. The education initiative is intended both for enterprise training, but also for grassroots marketing, with pilot academic programs already in place at a major American university you’ve heard of but shall remain nameless, as well as the National University of Singapore and several others.

Another common theme: The curse of well-intentioned technology. Informatica’s Murthy Mathiprakasam identifies two potential (and related) data transformation pitfalls: cheap solutions for data lakes that can turn them into high-maintenance, inaccessible data swamps, and self-service solutions that can reinforce data-access bad habits, foster data silos, and limit process repeatability. (In his words, “The fragmented approach is literally creating the data swamp problem.”) Informatica’s approach: unified metadata management and machine-learning capabilities powering an integrated data lake solution. (As with so many fundamentals of data governance, the first challenge is doing the metadata-unifying. The second will be evangelizing it.)

I got the opportunity to meet with Talend customer Beachbody. Beachbody may be best known for producing the “P90” and “Insanity” exercise programs, and continues to certify its broad network of exercise professionals. What’s cool from a DataOps perspective: Beachbody uses Talend to provide transparency, auditability, and control via a visible data workflow from partner to CEO. More importantly, data delivery—at every stage of the data supply chain—is now real time. To get to that, Beachbody moved its information stores to AWS and—working with Talend—built a data lake in the cloud offering self-service capabilities. After a speedy deployment, Beachbody now enjoys faster processing and better job execution using fewer resources.

More Strata quick hits:

  • Qubole is publishing a DataOps e-book with O’Reilly. The case-study focused piece includes use-case examples from the likes of Walmart.
  • Pentaho is committed to getting its machine-learning technology into common use in the data-driven enterprise. What’s cool (to me): the ML orchestration capabilities, Pentaho’s emphasis on a “test-and-tune” deployment model.
  • Attunity offers three products using two verbs and a noun. Its Replicate solution enables real-time data integration/migration, Compose delivers a data-warehouse automation layer, but it is Attunity’s Visibility product that tells the most interesting DataOps story: It provides “BI-on-BI” operations monitoring (focused on data lakes).
  • Check out Striim’s BI-on-BI approach to streaming analytics. It couples data integration with a DataOps-ish operations-monitoring perspective on data consumption. It’s a great way to scale consumption with data volume growth. (The two i’s stand for “Integration” and “Intelligence.” Ah.)
  • Along those same lines, anomaly-detection technology innovator Anodot has grown substantially in the last six months, and promises a new way to monitor line-of-business data. Look for new product, package, and service announcements from Anodot in the next few months.

Last week I attended Domo’s annual customer funfest Domopalooza in Salt Lake City. More on Domo’s announcements coming soon, but a quick summary:

  • Focus was noticeably humble (core product has improved dramatically from four years ago, when it wasn’t so great, admitted CEO Josh James in his first keynote) and business-value-focused. (James: “We don’t talk about optimizing queries. (Puke!) We talk about optimizing your business.”)
  • There was a definite scent of DataOps in the air. CSO Niall Browne presented on Domo data governance. The Domo data governance story emphasizes transparency with control, a message that will be welcomed in IT leadership circles.
  • Domo introduced a new OEMish model called “Domo Everywhere.” It allows partners to develop custom Domo solutions, with three tiers of licensing: white label, embed, and publish.
  • Some cool core enhancements include new alert capabilities, DataOps-oriented data-lineage tracking in Domo Analyzer, and Domo “Mr. Roboto” (yes, that’s what they’re calling it) AI functionality.
  • Domo also introduced its “Business-in-a-Box” package of pre-produced dashboard elements to accelerate enterprise deployment. (One cool dataviz UI element demoed at the show: Sample charts are pre-populated with applicable data, allowing end users to view data in the context of different chart designs.)

Finally, and not at all tradeshow-related, Australian BI leader Yellowfin has just announced its semi-annual upgrade to its namesake BI solution. Yellowfin version “7.3+” comes out in May. (The “+” might be Australian for “.1”.) The news is all about extensibility, with many, many new web connectors. But most interesting (to me at least) is its JSON connector capability that enables users to establish their own data workflows. (Next step, I hope: visual-mapping of that connectivity for top-down workflow orchestration.)

Posted in Blog | Tagged , , , , , , , , , , , , , , , | Leave a comment