There is a lot of buzz about the opportunity to leverage Big Data to drive business outcomes. However, some companies struggle to form a comprehensive strategy that takes a use case all the way into production on a sound infrastructure. One of the many challenges clients face is, “How do I stand up an infrastructure environment to support putting a Big Data use case into production at the right TCO and with low risk to my business?” (considering we could be talking about hundreds of terabytes up to petabytes of data).
First, let’s take a look at what the analysts say about Big Data and the current trends of data growth on our planet:
In their paper “The Digital Universe of Opportunities: Rich Data and the Increasing Value of the Internet of Things,” IDC recently predicted a massive explosion of data, with 10x growth from 2013 to 2020. At the same time, they say that less than 5% of today’s data is actually analyzed to drive decisions and outcomes.
They also say that data from embedded systems (signals from components of the Internet of Things) will grow from 2% of all data in 2013 to 10% in 2020. This is part of the overall data growth, and it represents a growing ability to drive insight from data collected across devices and products around the world.
For an SAP customer looking at a Big Data strategy, a good starting point is the sample use cases that exist in certain industry verticals:
- Energy: in Oil & Gas there are opportunities to leverage Big Data around oil field automation and risk management, and in the utility industry around preventative maintenance of energy stations and end-user smart meter devices.
- Financial Services: there are use cases around trading & risk aggregation.
- Retail: there are use cases around customer sentiment, customer marketing programs, and brand loyalty.
- Telecommunications: there are use cases around customer retention and customer acquisition.
Many of these use cases are derived from Structured & Unstructured Data sources:
- Structured data that lives in SAP systems includes data about company transactions, customers, and products. There can also be structured data sources outside SAP systems, since many SAP clients run other transactional systems around their SAP implementation. These data sources may not be high “volume,” but they are of high “value” to organizations.
- Unstructured Data sources may live within an SAP customer’s environment or outside in the public domain, and can include sensor data, machine data, meter data, and social data. These data sources are usually high “volume” but may not be of high “value” until analyzed in the right manner.
One of the reasons for the large data growth taking place in the world is the billions of devices, such as mobile devices and company products, that have the ability to send sensor data. This is the “Internet of Things” (IoT), a term Cisco has helped popularize to describe how these “smart devices” operate and the opportunity they represent.
The most powerful use cases can come from combining and analyzing SAP data sources from transactions, customers and products with the unstructured data sources from sensors, meters and in the social space.
A simple example would be a telco provider pulling social data from public data sources and combining it with data about its existing clients in order to better understand when and how to provide better services. This can directly impact its ability to retain or gain clients. To achieve such a use case, you often have to collect hundreds of terabytes of data.
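To make the telco example concrete, here is a minimal, hypothetical sketch of the combining step: joining public social-sentiment mentions with internal customer records to flag customers at risk of leaving. The field names, sentiment scale, and threshold are illustrative assumptions, not an SAP or EMC API.

```python
# Internal (structured) customer records, e.g. from a transactional system.
customers = [
    {"id": "C1", "handle": "@alice", "plan": "premium"},
    {"id": "C2", "handle": "@bob",   "plan": "basic"},
]

# Public (unstructured) social mentions, scored from -1.0 (negative) to 1.0.
mentions = [
    {"handle": "@alice", "sentiment": -0.8},
    {"handle": "@alice", "sentiment": -0.6},
    {"handle": "@bob",   "sentiment": 0.4},
]

def flag_at_risk(customers, mentions, threshold=-0.5):
    """Average sentiment per social handle; flag customers below the threshold."""
    totals = {}
    for m in mentions:
        s, n = totals.get(m["handle"], (0.0, 0))
        totals[m["handle"]] = (s + m["sentiment"], n + 1)
    at_risk = []
    for c in customers:
        if c["handle"] in totals:
            s, n = totals[c["handle"]]
            if s / n < threshold:
                at_risk.append(c["id"])
    return at_risk

print(flag_at_risk(customers, mentions))  # customer C1 averages -0.7 and is flagged
```

In production this join would of course run over hundreds of terabytes across the data platform rather than in-memory lists, but the logic of enriching internal customer data with external signals is the same.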
The way SAP helps clients with a Big Data strategy is the “SAP HANA Data Platform,” which tiers data by temperature across data platforms: “Hot Data” in SAP HANA, “Warm Data” in SAP IQ (formerly Sybase IQ), and “Cold Data” in Hadoop, so that a large data set can be managed economically. SAP also provides the intelligence to move data between these tiers as business conditions change. For example, the data sitting in SAP HANA for real-time analysis today may be different in a few months, especially if a company acquires a new division or brings a new product to market.

At EMC we believe this is a sound Big Data strategy for an SAP client. However, there can be challenges in making it a reality, especially the complexities of standing up the various infrastructures, managing data protection and security, and having a good overall data management strategy. These all represent challenges, and a potentially higher overall TCO, in a Big Data project, and they are risks clients have to weigh in order to justify an initiative.
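The data temperature idea above can be sketched in a few lines of code. This is a hypothetical illustration only: the tier names map to the HANA/IQ/Hadoop tiers described above, but the age thresholds and record fields are assumptions a real policy engine would replace with business-driven rules.

```python
from datetime import datetime, timedelta

# Illustrative thresholds -- real tiering policies would be business-driven.
HOT_DAYS = 30    # accessed within a month  -> in-memory tier (e.g. SAP HANA)
WARM_DAYS = 365  # accessed within a year   -> columnar disk  (e.g. SAP IQ)
# anything older                            -> low-cost tier  (e.g. Hadoop)

def classify_tier(last_accessed: datetime, now: datetime) -> str:
    """Map a record's last-access timestamp to a data temperature tier."""
    age = now - last_accessed
    if age <= timedelta(days=HOT_DAYS):
        return "hot"
    if age <= timedelta(days=WARM_DAYS):
        return "warm"
    return "cold"

def plan_moves(records, now):
    """Return (id, current_tier, target_tier) for records that should move."""
    return [
        (rec["id"], rec["tier"], classify_tier(rec["last_accessed"], now))
        for rec in records
        if rec["tier"] != classify_tier(rec["last_accessed"], now)
    ]

now = datetime(2014, 5, 1)
records = [
    {"id": "order-1", "tier": "hot",  "last_accessed": datetime(2014, 4, 20)},
    {"id": "order-2", "tier": "hot",  "last_accessed": datetime(2013, 1, 5)},
    {"id": "meter-7", "tier": "cold", "last_accessed": datetime(2014, 4, 28)},
]
for rec_id, current, target in plan_moves(records, now):
    print(f"{rec_id}: {current} -> {target}")
```

The point of the sketch is the second function: tiering is not a one-time placement but a recurring re-evaluation, which is exactly why data that is “hot” today (say, before an acquisition) may warrant a different tier a few months later.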
To simplify the ability to execute an SAP Big Data strategy (assuming there is a justifiable Big Data use case) we have built an “EMC Big Data Platform for SAP” which we also refer to as the “Jupiter Platform” (the largest planet in our solar system…get it?! 🙂 ).
This EMC Big Data Platform for SAP is a purpose-built solution for the SAP HANA Data Platform strategy, with all the compute, network, and storage needed to stand up a Big Data project. The infrastructure is fully virtualized to drive out cost (including the ability to run HANA virtually). Think of it as an “appliance” or packaged solution for hot, warm, and cold data that can scale enormously at the right TCO, since the platform manages data through a data temperature framework. The platform also includes management & orchestration, security, and data protection (HA, backup/recovery, and disaster tolerance).
We believe this platform represents a much simpler, lower-risk way for clients to take a Big Data use case into production. With EMC’s 30+ years of experience in managing, storing, protecting, and analyzing data, we have thought through all the infrastructure items clients need as they embark upon their Big Data journey.
Gartner often defines Big Data through four attributes, “the 4 V’s”: Volume, Velocity, Variety, and Value. Let’s take a look at how the solutions in EMC’s Big Data Platform for SAP align with these attributes:
- Data Volume: The Big Data Platform for SAP can start in the 50 TB range, and scale to 10+ PBs with an optimized TCO in mind by moving most data to warm or cold sources at lower cost.
- Data Velocity: The Platform can ingest data at loading speeds of over 100 TB per hour, so clients can make use of captured data before it possibly becomes obsolete. The Platform also enables fast analysis turnaround on large data sets, i.e., “Fast Data.”
- Data Variety: The Platform can ingest any type of structured or unstructured data source, such as sensor data, files, email/SMS, social data, and multimedia.
- Data Value: Measured by a company’s ability to quickly find new business opportunities or ways to optimize, such as a manufacturing process, logistics, or maintenance. The Platform reduces risk and gives a predictable, low $/TB by simplifying the Big Data deployment.
Since the Platform uses common tools and products, the required skillsets are likely ones that already exist in-house. Hence, the overall TCO of the platform is optimized so clients can focus on taking their Big Data use cases from “concept to business outcomes” for their organizations.
I look forward to talking more in detail about our SAP Big Data Platform at EMC World (May 5-8), SAP Sapphire (June 3-5), and at future SAP Week EBC events.
You can also engage in a conversation with me on Twitter: @henrikwagner73
…as well as check out a five-minute video on EMC’s Big Data platform for SAP.