Combining Operational and Analytical Big Data Using Couchbase and Spark: A Market Basket Analysis Example

Couchbase is emerging as a platform of choice in the enterprise NoSQL market. It is engineered to handle the operational side of big data. However, the platform is continually being enhanced to integrate with related technologies that address the analytical side of big data, and that integration gives organizations disruptive new solution capabilities. […]

From Data to Wisdom

Much is made today of the wisdom that might be gleaned from the wealth of available data. I am one of those who, through experience, believe this to be true. However, a couple of steps are necessary to get from data to wisdom. Knowledge from information leads to action, and when actions are repeated enough, organizations […]

Getting Started with Giraph

Apache Hadoop’s core analytical tools (e.g., MapReduce, Hive, Pig) are great for performing batch analytics over large, unstructured data sets. However, many data sets have a more graph-like structure. Examples include a map with cities connected by roads, a social network with people connected by relationships, airports connected […]

Hadoop Ecosystem Cheat Sheet

HDP 2.2 Components

For someone evaluating Hadoop, the long list of components in the Hadoop ecosystem can be overwhelming. Below you’ll find a reference table of keywords you may have heard in discussions about Hadoop, each with a brief description. Image courtesy of Hortonworks.

Name    Description
HDFS    Hadoop’s underlying distributed file system
YARN    Provides resource management […]