Topics of Interest Archives: Enterprise Performance Management

The Pumpkin Spice School of Big Data

Source: Pumpkin Spice Trident Layers Gum by Mike Mozart

In our particular pocket of New England, the leaves are turning golden, and football is replacing baseball on the TVs. This means one thing to coffee drinkers: the re-emergence of the Pumpkin Spice Latte at Starbucks. Over the past ten years, the drink has gone from cult oddity to a phenomenon so large that it has earned its own hashtag on Twitter: #PSL.

At the same time, one has to wonder, “What is Pumpkin Spice?” (Other than possibly the long-lost American cousin of the Spice Girls?) Pumpkin spice doesn’t actually contain any pumpkin, and it’s far from the spiciest flavor out there. Yet the phrase “pumpkin spice” evokes something handmade, traditional, and uniquely American in a way that draws people into wanting to consume it. Despite the complete lack of pumpkin and relative lack of spice, the flavor itself is almost secondary to the cultish conceit constructed around “Pumpkin Spice.”

Unfortunately, the hype, conceptualization, and ubiquity of Pumpkin Spice are matched in the enterprise world by the most overhyped phrase in tech: Big Data. Like Pumpkin Spice, everybody wants Big Data, everybody wants to invest in Big Data tools, and everybody thinks that we are currently in a season or era of Big Data. In the past, we’ve explained why we reluctantly think the term “Big Data” is still necessary. But when you go behind the curtain and try to figure out what Big Data actually is, what do you find?

For one thing, “Big Data” often isn’t that big. Although we talk about petabytes of data, some practitioners describe “Big Data” problems that involve only hundreds of megabytes. That is not a trivial amount of data, but problems at that scale are entirely manageable with traditional analytics tools.

And even when Big Data is “big,” big is a relative term. Even when organizations collect terabytes of data, text, and binaries, most of what is collected is rarely analyzed on a daily basis; we still lack the sentiment analysis, video analysis, and audio analysis needed to quickly make sense of data at that volume. And we know that data is about to grow by at least one order of magnitude, if not two, as the Internet of Things and its accompanying billions of sensors start to embed themselves into our planet.

Even outside of the Internet of Things, the entirety of the biological ecosystem represents yet another large source of data that we are just starting to tap. We are nowhere close to understanding what happens in each of our organs, much less in each cell of our bodies. Getting to that level of detail for any life form would represent additional orders of magnitude of data.

And there is an even higher level of truly Big Data when we track matter, molecules, and atomic behavior at a broad scale to understand the nature of chemical reactions and mechanical physics. Compared to all of this, we are just starting to collect data on Planet Earth. And yet we call it Big Data.


So, our “Big Data” isn’t big in comparison to the amount of data that actually exists on Earth. And the types of data that we collect are still very limited in nature, since they almost always come from electronic sources, and often lack the level of detail that could legitimately recreate the environment and context of the transaction in question. And yet we are already calling it Big Data and setting ourselves up to start talking about “Bigger Data,” “Enormous Data,” and “Insanely Large Data.”

To get past the hype, we should start thinking about Big Data in terms of the scope of what is actually being collected and supported. There is nothing wrong with talking about the scale of “log management data” or “sensor data” or “video data” or “DNA genome data.” Those of us who live in each of these worlds know that log management gets measured in terabytes per day, or that the human genome has roughly 3 billion base pairs and approximately 3 million SNP (single-nucleotide polymorphism) variants. Framed that way, we start talking about meaningful measurements of data again, rather than simply defaulting to the overused Big Data term.
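To make those scales concrete, here is a rough back-of-envelope comparison in Python. The genome figures follow the estimates above; the log pipeline’s daily volume is a hypothetical number chosen purely for illustration.

```python
# Back-of-envelope sizing for two of the data domains named above.
# The 5 TB/day log volume is an assumed, illustrative figure, not a benchmark.

GENOME_BASE_PAIRS = 3_000_000_000   # ~3 billion base pairs in a human genome
BITS_PER_BASE = 2                   # A, C, G, T fit in 2 bits each

genome_bytes = GENOME_BASE_PAIRS * BITS_PER_BASE / 8
print(f"Raw human genome: ~{genome_bytes / 1e9:.2f} GB")   # ~0.75 GB

ASSUMED_LOG_TB_PER_DAY = 5          # hypothetical "terabytes per day" pipeline
log_bytes_per_second = ASSUMED_LOG_TB_PER_DAY * 1e12 / 86_400
seconds_to_match = genome_bytes / log_bytes_per_second
print(f"A {ASSUMED_LOG_TB_PER_DAY} TB/day log pipeline ingests a genome's worth "
      f"of raw bytes roughly every {seconds_to_match:.0f} seconds")
```

Even this crude arithmetic shows why “terabytes per day” and “3 billion base pairs” say far more than “Big Data” does.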

I will say that there is one big difference between Pumpkin Spice season and Big Data season: around the end of the year, I can count on Pumpkin Spice season ending. The imprecise cult of Big Data, however, seems far from over; the community of tech thought leaders continues to push more and more use cases into Big Data rather than provide clarity on what actually is “Big,” what actually constitutes “Data,” and how to actually use these tools correctly in the Era of Big Data.

In this light, Blue Hill Research promises to keep the usage of the phrase “Big Data” to a minimum. We believe there are more valuable ways to talk about data, such as:

- Our primary research in log and machine data management
- Our scheduled research in self-service topics including data quality, business intelligence, predictive analytics, and enterprise performance management
- Tracking the $3 billion spent on analytics over the past five years
- Cognitive and neuro-inspired computing

By focusing on the actual data topics that provide financial, operational, and line-of-business value, Blue Hill will do its best to minimize the extension of Big Data season.


Blue Hill's Q4 Self-Service Analytics Research

Data Science Venn Diagram

There is a fundamental issue in the world of enterprise analytics and data management that is vital to the future of business intelligence and analytics: are employees free and able to pursue the deep analytical insights needed to advance their business goals? The concept of the analytical business has become more popular in recent years as statistics and algorithms have become sexy concepts. One need only look at “Moneyball” to see that statistics are no longer relegated to the nerd squad. When Brad Pitt becomes the face of gaining analytic advantages in the workplace, analytics has arrived as a mainstream business topic.

But the popularity of analytics does not mean that it has been translated into a set of tools that are ubiquitous and easy to use. Although we have seen phenomenal strides in the tools made available to support business intelligence over the past five years, we are still largely in a world of haves and have-nots when it comes to analytic access.

Why is this? Part of the problem is that we as an industry are defining analytic freedom in different ways. A simple way to think of this is to consider the enterprise-wide view, the department-wide view, and the individual view.

Some of us look at this from the enterprise-wide view, where analytic freedom means having agile data warehousing, robust ETL, a portfolio of analytic applications custom-made for each department, an army of number crunchers to handle each predictive request, and a fully realized BI Center of Excellence.

Yet others look at the department-wide view, where the key is to provide each employee within a department with relevant data. For a marketing department, this might mean a 360-degree view of all campaigns, products, and customers. For a manufacturing department, this might mean full access to operational efficiencies, production, and Six Sigma efforts. These needs are often met by department-specific applications such as CRM and marketing automation management. But outside of the department’s purview, everybody else’s data problems are irrelevant. As a result, these department-specific solutions merely create a silo, where data-driven enlightenment is limited to a few individuals and to certain tasks within a single department.

And finally, there is the individual’s need for data. There is the 1% of data analysts who are able to independently work with the vast majority of data sources, statistically analyze them, and find key connections that have previously escaped detection. We call them data scientists, and the only thing we truly know about this rare and prized species is that there is an enormous shortage of these individuals. But for the rest of us, vendors still need to catch up and provide a variety of tools that will give the typical knowledge worker the same access to data and analytics that the data analysts and data scientists have. This is no small task, as it requires transformative products to be developed in multiple areas: data cleansing, data management, business intelligence, predictive analytics, and performance management.

To make good decisions, individuals first have to find the correct data sources, and then make sure that the data is clean and reliable. This means going through everything: formal business data repositories, third-party data, collected survey and sensor data, informal spreadsheets and tallies, and more. In doing so, employees are often tasked with cleaning up the manual mistakes associated with data collection and collation. The subsequent task of data cleansing is estimated to take up three-quarters of a data analyst’s time. To reallocate this time to more valuable tasks, such as direct data analysis or business alignment of results with specific initiatives, companies need to take advantage of self-service and automated data management tools that solve basic problems in data management. This may include issues as mundane as changing “Y” to “Yes” or providing a default value for any null values in a column. Or this may include the automatic joining of fields in unrelated data sources that have never been linked before. As Blue Hill looks at data management, we plan to look at vendors ranging from market leaders such as IBM and Informatica to emerging startups such as Tamr, Trifacta, and Paxata to determine how each solution supports Blue Hill’s key stakeholders in technology, finance, and the line of business.
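As a simple illustration of how mundane, and how automatable, these fixes are, the short pandas sketch below covers the two examples just mentioned; the column names and values are hypothetical.

```python
# A minimal sketch of the cleanup steps described above: normalizing "Y" to
# "Yes" and providing a default value for nulls. Column names are made up.
import pandas as pd

df = pd.DataFrame({
    "opted_in": ["Y", "Yes", None, "N"],
    "region":   ["Northeast", None, "Southwest", "Northeast"],
})

# Normalize inconsistent codings to a single canonical value.
df["opted_in"] = df["opted_in"].replace({"Y": "Yes", "N": "No"})

# Provide a default value for any null entries left in a column.
df["opted_in"] = df["opted_in"].fillna("No")
df["region"] = df["region"].fillna("Unknown")

print(df)
```

Automating even steps this small, across hundreds of columns and sources, is where self-service data preparation tools earn back the analyst time described above.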

There has been a recent wave of innovation focused on self-service business intelligence. The vendors that have caught Blue Hill’s attention to the greatest extent in this regard include Adaptive Insights, Birst, GoodData, IBM Watson Analytics, Microsoft PowerPivot, Qlik Sense, SAP Lumira, Tableau, and Yellowfin. One of the most interesting aspects of this evolution is that end users may initially assume that the startup vendors mentioned would be less scalable, whereas the established enterprise vendors would be more difficult to use. This assumption is a false dichotomy: all of the leading vendors in this space, regardless of size, must both scale and be easy to use. The key differentiators among these vendors tend to be the roles that they play within the enterprise and the extent to which they play into the Blue Hill Corporate Hierarchy of Needs.


Predictive analytics has been a more difficult area in which to innovate from a usability perspective. The biggest challenge has traditionally been the basic hurdle of statistical knowledge. For instance, Microsoft Excel has long had a statistical package sufficient to handle basic requests, but the vast majority of Excel users don’t know how to access or use it. Likewise, the statistical software giants, IBM SPSS and SAS, are easy enough to find in the academic world, where students cut their teeth on statistical analysis. But for knowledge workers who were not number crunchers in their college days, this availability is (appropriately enough) academic compared to their day-to-day requests for sales projections, production forecasts, and budget estimates. Because of this, the drag-and-drop workflows of Alteryx, the natural language inputs of Watson Analytics, and the modeling ease of Rapidminer and SAP InfiniteInsight will become increasingly important as companies seek to change from reactive monitors of data into predictive and cognitive analyzers of data-driven patterns.
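For a sense of how modest those day-to-day requests usually are, here is a minimal sketch of a straight-line sales projection in Python; the monthly figures are invented for illustration, and a real tool would of course layer seasonality and confidence intervals on top.

```python
# A minimal sales projection: fit a straight-line trend to monthly sales and
# extend it one quarter ahead. The sales figures below are hypothetical.
import numpy as np

monthly_sales = np.array([112, 118, 121, 130, 128, 137, 141, 150])  # units sold
months = np.arange(len(monthly_sales))

# Ordinary least-squares trend line: sales ~ slope * month + intercept.
slope, intercept = np.polyfit(months, monthly_sales, 1)

# Project the next three months from the fitted trend.
future_months = np.arange(len(monthly_sales), len(monthly_sales) + 3)
projection = slope * future_months + intercept
print(f"Trend: {slope:.1f} units/month; next 3 months: {np.round(projection, 1)}")
```

The statistics are trivial; the usability gap lies in getting this kind of projection in front of the knowledge worker without requiring them to build it by hand.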

Finally, Enterprise Performance Management (EPM) represents an important subset of business intelligence focused on financial and operational planning, and it is a core capability for any business’s planning. Small companies typically handle this analysis in spreadsheets. However, as companies take on multi-currency operations, multi-country footprints, complex supply chains, diverse tax structures, and even treasury activities, they increasingly need a dedicated EPM solution that can be shared among multiple finance officers. At the same time, EPM needs to remain easy to use, or companies risk paying for the assurance of compliance with delays of days or even weeks in financial closes and budgeting activities. In light of this core challenge, Blue Hill is looking at the offerings of both large software vendors (such as Oracle, IBM, SAP, and Infor) and newer upstarts (such as Adaptive Insights, Host Analytics, Tidemark, and Tagetik) to see how they have worked to simplify the Enterprise Performance Management space.
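As a toy illustration of one EPM chore mentioned above, the sketch below rolls up subsidiary budgets reported in local currencies into a single reporting currency; every entity, amount, and exchange rate is a made-up assumption.

```python
# Consolidating hypothetical subsidiary budgets into one reporting currency.
# All entities, amounts, and exchange rates below are invented for illustration.

budgets = {                       # local-currency operating budgets
    "US":      ("USD", 4_200_000),
    "Germany": ("EUR", 3_100_000),
    "Japan":   ("JPY", 380_000_000),
}

usd_per_unit = {"USD": 1.0, "EUR": 1.10, "JPY": 0.0072}   # assumed FX rates

consolidated = sum(amount * usd_per_unit[currency]
                   for currency, amount in budgets.values())
print(f"Consolidated budget: ${consolidated:,.0f} USD")
```

A spreadsheet can do this for three subsidiaries; the case for dedicated EPM is keeping it auditable and current across dozens of entities, rate tables, and close cycles.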

These are the key research efforts that Blue Hill will pursue this quarter as we seek to understand the advancement of self-service in analytics, business intelligence, and data management. We are seeking the true differentiators that buyers can hang their hats on in 2014 and into 2015, as they affect the three key stakeholders: financial, technological, and line-of-business managers.

