Data Data Everywhere and All the Ships Did Sink

This is my second blog post focusing on data technology. In my first installment, “Data Data Everywhere,” I explored NoSQL technologies, specifically Couchbase. Recently, I had the opportunity to attend the SAP Data Hub training course. In that course, we focused on developing pipelines and moving data from one storage technology to another. We also looked […]

Machine Learning Solutions: Recommender System Design

Recommender systems are now ubiquitous in our daily lives. From Amazon indicating similar products, to Netflix suggesting TV shows, even down to which version of a given advertisement you get in the mail, every business seems to be using recommender systems in order to improve their service. While recommender systems may seem too complex to […]

My Journey to Apache Lucene/Solr Committer

Introduction: I joined Avalon Consulting, LLC a little over three years ago. I’ve wanted to write a reflection on my time here, and becoming an Apache Lucene/Solr committer was a great catalyst for that. My time at Avalon has been a whirlwind of technologies and projects with a focus on search and search with Hadoop. […]

Getting Started with Giraph

Apache Hadoop’s core analytical tools (e.g. MapReduce, Hive, Pig) are great for performing batch analytics over large, unstructured data sets. However, many data sets have a more graph-like structure. Examples of such data sets include: a map with cities connected by roads, a social network with people connected by relationships, airports connected […]