My Journey to Apache Lucene/Solr Committer

Introduction: I joined Avalon Consulting, LLC a little over three years ago. I’ve wanted to write a reflection on my time here, and becoming an Apache Lucene/Solr committer was a great catalyst for that. My time at Avalon has been a whirlwind of technologies and projects with a focus on search and search with Hadoop. […]

Getting Started with Giraph

Apache Hadoop’s core analytical tools (e.g. MapReduce, Hive, Pig) are great for performing batch analytics over large, unstructured data sets. However, many data sets have a more graph-like structure. Examples of such data sets include: a map with cities connected by roads, a social network with people connected by relationships, airports connected […]

Hadoop Ecosystem Cheat Sheet

HDP 2.2 Components

For someone evaluating Hadoop, the long list of components in the Hadoop ecosystem can be overwhelming. Below you’ll find a reference table with keywords you may have heard in discussions concerning Hadoop, each with a brief description. Image courtesy of Hortonworks.

Name: Description
HDFS: Hadoop’s underlying distributed file system
YARN: Provides resource management […]

Getting Started with Hadoop

Avalon is helping a number of our clients derive business benefit from Hadoop. In that process, we see a very common problem: many of the great developers and architects we encounter just don’t know where to start in building a base level of technical knowledge in Hadoop. And they’re too busy […]