From Data to Wisdom

Much is made today of the possible wisdom to be gleaned from a wealth of available data. I am one of those who, through experience, believe this to be true. However, there are a couple of steps necessary to achieve wisdom from data. Knowledge from information leads to action, and when actions are repeated enough, organizations […]

Getting Started with Giraph

Apache Hadoop’s core analytical tools (e.g., MapReduce, Hive, Pig) are great for performing batch analytics over large, unstructured data sets. However, many data sets have a more graph-like structure. Examples include a map with cities connected by roads, a social network with people connected by relationships, airports connected […]

Hadoop Ecosystem Cheat Sheet

HDP 2.2 Components

For someone evaluating Hadoop, the long list of components in the Hadoop ecosystem can be overwhelming. Below you’ll find a reference table of keywords you may have heard in discussions about Hadoop, each with a brief description. Image courtesy of Hortonworks.

Name    Description
HDFS    Hadoop’s underlying distributed file system
YARN    Provides resource management […]