Featured Post

Getting Started with Giraph

Apache Hadoop's core analytical tools (e.g. MapReduce, Hive, Pig) are great for performing batch analytics over large, unstructured data sets. However, many data sets have a more graph-like structure. Examples include a map with cities connected by roads, a social network with people connected by relationships, airports connected by flight paths, and computers connected via a network (an obvious one to those of us in IT!). Graph processing is usually an iterative process that relies heavily on communication between vertices. In addition, any exploratory analysis of a graph data set requires multiple iterations and subsequent calculations. The implication for those of us developing applications in Hadoop: MapReduce isn't the right tool for the job when working with graph data sets. Enter Apache Giraph, a distributed, fault-tolerant graph processing framework built on top of Hadoop. Giraph is based on Google's Pregel, … [Read More...]

Other Recent Posts


MarkLogic Gets JavaScript

Six months ago, I was down at the MarkLogic World San Francisco conference just south of SFO, listening to the opening speeches. I was there primarily to hear more about what would end up being in the second phase of their Semantics offering, when a mention … [Read More...]

DAM and Flexible Data Models Using Document Databases

In my last post, I demonstrated a "flexible" way to store digital asset metadata inside a relational database. The results were not perfect. Relational databases require a strict data schema so that all information fits into pre-defined two-dimensional tables. … [Read More...]

Hadoop Ecosystem Cheat Sheet

For someone evaluating Hadoop, the long list of components in the Hadoop ecosystem can be overwhelming. Below you’ll find a reference table of keywords you may have heard in discussions about Hadoop, along with a brief description of each. Image … [Read More...]

Big Data takes a back seat to Big Dreams

It’s easy to find oneself lost in the mad rush to compete and conquer in business. That is certainly the case in the fast-paced and competitive world of IT, particularly in the hyper-fast world of enterprise IT consulting. Against that backdrop we brought … [Read More...]