The speed of technology transformation is enormous. And that is probably an understatement. It is mind-blowing. Well, even that probably doesn't capture it. It is hard to believe how much things have changed on the technology front, what has been predicted for the future, and how data has grown and continues to grow at lightning speed.
Just when we are grappling with the idea of 'Big Data', it turns out to be not good enough: there is now a buzz about 'Extreme Data'. This is the new kid on the block. What is so extreme about it? Big Data primarily deals with volume, but volume is just one dimension of Extreme Data; there are two others, velocity and variety, and these dimensions are critical when it comes to winning the battle for control of enterprise data, with its conflicting requirements of business intelligence and reporting vs. data mining and statistical analysis.
Today, enterprises spend a lot of time managing data volumes. But volume alone hardly matters anymore: increasingly affordable disk space has made handling it relatively easy. The situation, however, is complicated by variety. Even now, enterprises barely get tabular data from their online transaction processing systems into and out of their data warehouses in a timely manner, when all they need to handle are static reports and business intelligence. What happens when they try to add data from weblogs, application logs, business control systems, RFID, and so on, with new types seemingly appearing every day? The need of the hour for enterprises is to understand the patterns in their business, and to achieve this we need to address variety as well as volume.
The challenge here won't be about storing the data (a bigger bucket would address that) but about effectively getting data into and out of the store. Throwing old technology at these new data problems won't give results. The need of the hour is performance, and that leads us to the other dimension: velocity. Velocity is how rapidly an enterprise can move its data volume from source to user community. Not all users have the same velocity requirements. Users doing advanced analytics, such as data mining to understand market basket behavior, pattern-seeking for cross-selling opportunities, or online processing for financial analytics, need far more velocity than users doing simple BI and reporting. They need extreme velocity. In a world of location- and context-aware devices with persistent connectivity, from iPhone to iPad, Blackberry to Galaxy Tab, extreme velocity is a crucial prerequisite for loading and merging real-time feeds and getting the required performance for real-time, advanced analytics. One way to achieve this is through embedded analytics: pushing specialized functions into the database itself, an approach that is gaining more and more popularity.
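To make the idea of embedded analytics concrete, here is a minimal sketch in Python using SQLite as a stand-in for an analytic database. Instead of pulling every row out to the application and computing there, we register a custom analytic function inside the database engine so the query itself does the work and only results cross the boundary. The table, column, and function names are illustrative, not from any real product.

```python
# Minimal sketch of embedded analytics: register a custom function inside the
# database engine, then let the SQL query run the analytic in place.
import sqlite3
import math

def log_return(prev_price, price):
    """Log return between two consecutive ticks, a common financial-analytics primitive."""
    if prev_price is None or price is None or prev_price <= 0 or price <= 0:
        return None
    return math.log(price / prev_price)

conn = sqlite3.connect(":memory:")
# Push the function into the database: it becomes callable from SQL.
conn.create_function("log_return", 2, log_return)

conn.executescript("""
    CREATE TABLE ticks (symbol TEXT, ts INTEGER, price REAL);
    INSERT INTO ticks VALUES
        ('ACME', 1, 100.0), ('ACME', 2, 101.0), ('ACME', 3, 99.0);
""")

# The analytic runs inside the query; only the computed results come back out.
rows = conn.execute("""
    SELECT t.ts, log_return(p.price, t.price) AS r
    FROM ticks t JOIN ticks p
      ON p.symbol = t.symbol AND p.ts = t.ts - 1
    ORDER BY t.ts
""").fetchall()

for ts, r in rows:
    print(ts, round(r, 4))
```

In a real analytic database the same idea shows up as stored procedures or in-database libraries; the design point is the same: move the computation to the data rather than the data to the computation.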
SAP's answer to this extreme data demand is HANA, an in-memory database that can serve as both an OLTP and an OLAP database. HANA is an appliance (yes, that's what SAP calls it): a bundled combination of hardware and software. It provides powerful features such as very high processing speed, the ability to handle very large and unstructured data, predictive capabilities, and text mining.
How 'extreme' is HANA?
HANA has four key qualities for tackling Extreme Data: first, it handles data that comes in very large amounts; second, it is sophisticated and can deal with both structured and unstructured data; third, it deals with data in real time, which means that gathering, analyzing, acting upon, distributing, and deploying the data can all be done in real time; and fourth, it enables the application of Extreme Data, the knowledge and intelligence that already resides in your own company, to make better decisions. This takes us far beyond BI. SAP HANA is built specifically to be the platform that can not only handle Extreme Data but also wring maximum value out of it. It is a platform for planning, forecasting, analyzing, storing, and eventually for OLTP at that extreme level.
Opportunities with Extreme Data
A recent Gartner report says that there is no such thing as 'information overload'. By 2015, companies that have built a modern information management system will outperform their peers financially by 20 percent. What we today call 'unstructured data' makes up 67 to 85 percent of the information in an organization. The total volume of data is growing at 59 percent every year, while the number of files grows at 88 percent per year. The report also suggests three big opportunities that can come from extreme information management in marketing, planning, and operations: the addition of sentiment and network analysis from social software to existing marketing processes; the combination of operational technology monitoring and metrics with logistics and supply planning; and adherence to a high-fidelity data strategy that builds datasets, creating competitive advantages in operations.
The new technology advancement comes with its own challenges. It costs three times more to store data than to acquire it, and although 70 to 85 percent of it is unstructured, most companies spend the majority of their analytics effort only on structured data. Basically, they are ignoring most of the information while spending a lot to save it. The variety of information assets is already crushing, and the diversity of information keeps growing. With small applications and millions of meters and sensors, the amount of data created by operational technology keeps increasing. These challenges will need to be addressed. Governance poses another huge problem: how to deal with data moving out of the application or the database and into services and metadata. It will be a wait-and-watch approach to see how such things are tackled going forward. But up front, the industry has already done something very easy to do: it has coined new acronyms, EIA (Enterprise Information Architecture) and EIM (Enterprise Information Management).
It will be interesting to see for how long the extremeness of data remains meaningful. Maybe, in another few years (or possibly months), we can expect to be swept away by another new wave called 'Super Extreme Data'.
Let us brace ourselves!