Machine Learning Solutions: Recommender System Design

Recommender systems are now ubiquitous in our daily lives. From Amazon showing similar products, to Netflix suggesting TV shows, down to which version of a given advertisement you get in the mail, nearly every business seems to use recommender systems to improve its service. While recommender systems may seem too complex to […]

My Journey to Apache Lucene/Solr Committer

Introduction: I joined Avalon Consulting, LLC a little over three years ago. I’ve wanted to write a reflection on my time here, and becoming an Apache Lucene/Solr committer was a great catalyst for that. My time at Avalon has been a whirlwind of technologies and projects with a focus on search and search with Hadoop. […]

Getting Started with Giraph

Apache Hadoop’s core analytical tools (e.g. MapReduce, Hive, Pig) are great for performing batch analytics over large, unstructured data sets. However, many data sets have a more graph-like structure. Examples of such data sets include: a map with cities connected by roads, a social network with people connected by relationships, airports connected […]

Hadoop Ecosystem Cheat Sheet

HDP 2.2 Components

For someone evaluating Hadoop, the long list of components in the Hadoop ecosystem can be overwhelming. Below you’ll find a reference table of keywords you may have heard in discussions about Hadoop, each with a brief description. Image courtesy of Hortonworks.

Name: Description
HDFS: Hadoop’s underlying distributed file system
YARN: Provides resource management […]