I was in a meeting with a customer the other day. We were discussing the features and functionality of various “Big Data” technologies, specifically Hadoop. The customer acknowledged that he had run tests with Hadoop, but said his organization was not ready to go any further because it didn’t have “Big Data”. My response was that many companies with less than a terabyte of data are implementing so-called “Big Data” solutions.
Organizations should consider whether the technologies labeled as “Big Data” can give them access to data analysis that offers new perspectives, allows their businesses to operate more efficiently, and perhaps uncovers better ways to serve their customers (of course, there are many other valuable applications of these solutions). Technologies such as Hadoop can harvest data from sources (e.g., social media, machine sensors, database logs) that were historically inaccessible because of the architecture of legacy data management environments and the “lack of structure” in the data (although our Chief Data Officer would argue that data has always had structure).
Insights derived from these and other data sources can be valuable to organizations of any size. Taking Hadoop as an example, with cloud-based options and pricing as low as 5-7K per server, this solution can provide tremendous value to companies at a very affordable price.
We are also seeing the price of data warehouse appliances come down significantly (e.g., IBM PureData System for Analytics, formerly Netezza). Many vendors are releasing entry-level appliances starting around $50K (all hardware, software and analytics functions included in a single unit), designed to give small to mid-sized organizations a simple yet powerful analytics platform. Compared to traditional data warehouses, these appliances provide very fast time to value (they can be operational within hours and producing data within days in some cases*), powerful analytics, very fast response times and very low operational overhead. They can also ingest data that would be almost impossible for “traditional” data warehousing solutions to handle.
So, as we all know, the term “Big Data” has been heavily over-marketed. Perhaps the term we should use to describe today’s data management needs is “Relative Data”. The data should be labeled as relative because every organization has different data sizes and requirements. Not everyone will have “Big Data” management issues. However, organizations should not exclude these technologies from consideration because they feel that they are below the “Big Data” clip level, whatever that is. Companies today should investigate technologies in the “Big Data” category to see whether they can provide new insights through access to data that would have been almost impossible to reach with previous generations of technology.
*This timeframe may vary depending upon the existing data foundation and level of complexity of the data environment.