Hadoop Analytics with EMC Greenplum
Realizing the Promise of Hadoop
Organizations across all industries are looking to leverage their ever-growing big data assets to identify key trends and new opportunities that can help accelerate their business. Hadoop has emerged as an innovative big data analytics engine that supports data-intensive distributed applications and reduces the time required for data analysis, especially of unstructured data.
EMC provides the industry’s first end-to-end Hadoop solution that lets you quickly extract new insight from your data, thereby accelerating business productivity. This single-vendor solution combines EMC Isilon scale-out NAS with EMC Greenplum HD, along with EMC consulting, installation, training and support offerings for Hadoop, to deliver powerful data analytics capabilities on a highly flexible and efficient data storage platform.
EMC Isilon scale-out NAS is the first and only enterprise NAS solution that natively integrates with the Hadoop Distributed File System (HDFS) layer. Because Isilon treats HDFS as an over-the-wire protocol, you can quickly deploy a comprehensive big data analytics solution that combines Greenplum HD with Isilon NAS storage systems to provide a powerful, highly efficient and flexible data storage and analytics ecosystem.
Simple to Implement and Manage
By combining the simplicity of EMC Isilon scale-out NAS storage with the leading-edge analytics tools of EMC Greenplum HD, we can provide you with a highly integrated, one-stop Hadoop solution that lets your users quickly extract new insight from your data, thereby accelerating business productivity. This approach also eliminates the risk associated with a complex hardware and software configuration process. With Isilon scale-out NAS, you can avoid the complexities of managing large pools of direct-attached storage. You also gain the ability to access HDFS and further simplify your entire workflow by loading data over standard protocols. In addition, you can use EMC Isilon InsightIQ to monitor the performance of your Hadoop analytics storage environment.
Increase Efficiency to Reduce Costs
Our solution helps you dramatically increase efficiency, reduce your IT footprint and drive down the total cost of ownership of your Hadoop solution. EMC Isilon scale-out NAS can provide 80% or more utilization from a single pool of shared storage. This compares to Hadoop storage deployments using direct-attached storage (DAS), which typically require three times more capacity. Our high-density storage and Reed-Solomon parity striping approach also help you gain added efficiencies. With our unique integration with HDFS, you can also eliminate the resource-intensive import and export of data into and out of Hadoop.
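As a rough, back-of-the-envelope illustration of the capacity comparison above, the short Python sketch below estimates the raw storage needed for a given amount of data under each approach. The 100 TB dataset size is a hypothetical figure chosen only for the example, and the sketch assumes that the "three times more capacity" for DAS corresponds to keeping three full copies of each block (the HDFS default replication factor). Actual requirements depend on your configuration.

```python
def raw_capacity_needed(data_tb: float, utilization: float) -> float:
    """Raw capacity (TB) needed to hold data_tb at a given utilization ratio."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in the range (0, 1]")
    return data_tb / utilization


# Hypothetical dataset size, chosen only for illustration.
DATASET_TB = 100.0

# Utilization figure quoted in the text for a single shared storage pool.
SHARED_POOL_UTILIZATION = 0.80

# Assumption: DAS-based Hadoop stores three full copies of every block.
DAS_REPLICATION_FACTOR = 3

shared_raw_tb = raw_capacity_needed(DATASET_TB, SHARED_POOL_UTILIZATION)
das_raw_tb = DATASET_TB * DAS_REPLICATION_FACTOR

print(f"Shared pool raw capacity: {shared_raw_tb:.0f} TB")
print(f"DAS raw capacity:         {das_raw_tb:.0f} TB")
print(f"DAS needs {das_raw_tb / shared_raw_tb:.1f}x the raw capacity")
```

Under these assumptions, the shared pool needs about 125 TB of raw capacity for 100 TB of data, while the DAS deployment needs 300 TB.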
Flexibility to Scale Easily
With EMC Isilon scale-out NAS for EMC Greenplum HD, you gain the flexibility to expand your Hadoop data capacity and compute resources independently of each other. You can also leverage multiple standard protocols to distribute insight to other components of the big data analytics workflow. Our scale-out architecture enables you to expand capacity quickly and economically: you can scale performance and capacity in less than 60 seconds, while the entire Hadoop ecosystem remains online.
Highly Reliable
EMC Isilon’s multi-node architecture removes the risk of a single point of failure in the metadata NameNode or the JobTracker that is associated with traditional Hadoop storage deployments. You also benefit from the automatic load balancing and failover capabilities of EMC Isilon storage. With EMC Isilon FlexProtect, you can provide robust and efficient distributed data protection at a file or folder level. To further protect your Hadoop data efficiently, you can use the powerful snapshot capabilities of SnapshotIQ, data replication with SyncIQ for disaster recovery protection, and SmartLock for WORM data protection from EMC Isilon.
View our Hadoop solution overview for additional details.