August 26, 2013
As we all know, a supermarket is a big store where we buy groceries, food, and household items, all organized in aisles, and every supermarket has its own style. Similarly, the private vendors who ship big data distributions (the grocery stores) offer enterprises a wide range of big data services built on the Hadoop ecosystem (the food and household products) to complement the typical business intelligence system. In this blog, we are going to push a trolley through the various big data supermarkets.
Cloudera Hadoop Distribution (CDH): A leading big data platform offering an enterprise-ready distribution of Apache Hadoop and related projects.
IBM InfoSphere BigInsights: Brings the power of Hadoop to the enterprise with enhanced functionality and improved consumability, making it easier to use Hadoop with existing skills, build big data applications, and uncover insights in your data.
MapR: Delivers on the promise of Hadoop, making managing and analyzing big data a reality for more business users. The MapR distribution emphasizes dependability, speed, and ease of use.
Hortonworks Data Platform (HDP): Positioned as the most stable and reliable Apache Hadoop distribution, built and packaged by core architects, builders, and operators of Hadoop.
Karmasphere: Provides what data professionals need to put big data to work. From predicting customer needs to tracking user behavior to optimizing development processes, Karmasphere was built to help you turn data into a strategic asset.
Where does the Hadoop ecosystem (products) fit into the typical BI system?
Component | Description |
---|---|
Hadoop Distributed File System (HDFS) | Highly fault-tolerant file system designed to be deployed on low-cost commodity hardware. Data is distributed across all nodes. |
Hive | Data warehouse system built on top of Hadoop for analyzing large data sets. |
HBase | Column-oriented database system for random, real-time read/write access. |
MapReduce | Programming model for processing large data sets in parallel across a cluster of commodity machines, with the data held in the distributed file system. |
Flume | Distributed service that collects data from different sources. |
Sqoop | Imports data from an RDBMS into Hadoop and exports it back. |
Hiho | Moves data between any database and Hadoop. |
Chukwa | Tool for displaying, monitoring, and analyzing the results of large log collections. |
Pig | High-level platform with a dataflow scripting language that runs on the MapReduce engine. |
HiveQL | Query language for accessing Hive. |
Jaql | Query language for JSON data on Hadoop. In IBM BigInsights, text analytics is provided separately by an executable program and a built-in annotator library. |
Mahout | Core algorithms for clustering, classification, and collaborative filtering of large data sets. |
Hue | User interface framework and software development kit (SDK) for visual Hadoop applications. |
Beeswax | User interface for querying and analyzing data in Hive. |
ZooKeeper | Coordination service for distributed applications. |
Oozie | Workflow scheduler system for managing Hadoop jobs. |
Whirr | Cloud-neutral way to run services. |
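
To make the MapReduce entry above more concrete, here is a minimal sketch of the map, shuffle, and reduce phases as a word count in plain Python. It runs in a single process and does not use any Hadoop API; the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) are hypothetical and chosen only for illustration. On a real cluster, the framework would run the map and reduce steps in parallel across nodes and perform the shuffle itself.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence (a local simulation, not Hadoop)."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data big insights", "hadoop stores big data"]
    print(word_count(sample))
```

Because each call to `map_phase` depends only on its own line, and each call to `reduce_phase` only on its own key, both steps can be spread over many machines, which is what lets the model scale across commodity hardware.
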

The following table compares the five distributions feature by feature.

Feature | MapR | IBM InfoSphere BigInsights | Cloudera | Hortonworks | Karmasphere |
---|---|---|---|---|---|
NameNode high availability | Available | Available | Available | Available | Integrated with MapR |
Connector to Social media | Flume | Social Data Accelerator | Flume | Flume | No |
Administering and Monitoring | Not Available | Available | Available | Available | Available |
On Cloud | Google Cloud Platform and Amazon EMR | IBM SmartCloud Enterprise | No | No | Karmasphere Analytics for EMR |
Machine learning | Mahout | Machine Data Accelerator | Mahout | No | No |
Web User Interface | Available | Available | Available | Available | Available |
Cluster setup | Easy | Easy | Easy | Easy | Medium |
Product | MapR Edition | InfoSphere BigInsights | CDH | Hortonworks Data Platform | Karmasphere Studio |
For Windows | No | No | No | Yes | Yes |
Text analysis | Informatica HParser | Annotation Query Language (AQL) | No | No | Ability to use SAS, SPSS, and R analytic models |