Apache Solr JDBC Introduction

Apache Zeppelin (incubating) Solr JDBC

In my first post, I detailed the history that led up to my working on the Apache Solr JDBC driver and becoming an Apache Lucene/Solr committer. This post will describe the Solr JDBC driver and its usage. The next few posts in the series will be detailed guides on how to use the Solr JDBC driver with […]

My Journey to Apache Lucene/Solr Committer

Introduction: I joined Avalon Consulting, LLC a little over three years ago. I’ve wanted to write a reflection on my time here, and becoming an Apache Lucene/Solr committer was a great catalyst for that. My time at Avalon has been a whirlwind of technologies and projects with a focus on search and search with Hadoop. […]

Getting Started with Giraph


Apache Hadoop’s core analytical tools (e.g. MapReduce, Hive, Pig) are great for performing batch analytics over large, unstructured data sets. However, many data sets have a more graph-like structure. Examples of such data sets include: a map with cities connected by roads, a social network with people connected by relationships, airports connected […]

Hadoop Ecosystem Cheat Sheet

HDP 2.2 Components

For someone evaluating Hadoop, the long list of components in the Hadoop ecosystem can be overwhelming. Below you’ll find a reference table with keywords you may have heard in discussions concerning Hadoop, each with a brief description. Image courtesy of Hortonworks.

Name | Description
HDFS | Hadoop’s underlying distributed file system
YARN | Provides resource management […]