MarkLogic Gets JavaScript

Six months ago, I was down at the MarkLogic World San Francisco conference just south of SFO, listening to the opening speeches. I was there primarily to hear more about what would end up being in the second phase of their Semantics offering, when a mention was made about V8 integration. Now V8 in this case was not a reference to the tomato juice drink (or its many, many variants), nor to a high performance gas guzzling car that seems to be going the way of the dodo.

No, in the world of NoSQL databases and web application servers, V8 could only mean one thing – the Google JavaScript engine that effectively kick-started the modern web application world, and had since gone on to power Google Chrome and the Node.js server. MarkLogic was putting V8 into their MarkLogic 8 server. Okay, let me state that again: MarkLogic was PUTTING THE FASTEST FRIGGIN’ JAVASCRIPT ENGINE ON THE PLANET INTO MARKLOGIC SERVER!!!

So why the ALL-CAPS? I’ve been using MarkLogic server since about 2005. When Chris Lindblad, the founder of the company, put together the first version of the server, he was interested primarily in making database indexes more efficient, and at the time attacked a problem that was pretty major – building a workable XML database. XML at the time was the hot technology, and while other XML indexing schemes were around, most relied upon having pre-defined schemas.

Lindblad was one of very few people (Wolfgang Meier of eXist-db was another) who realized that there were ways of using an optimized indexing scheme to more productively store all of the elements and attributes and text fields that make up XML, and so this became the focus of his new product. The choice proved prophetic in the short term, as XML was really just beginning to catch on in the publishing world at the time, and the tiny company soon morphed into a considerably bigger company … and then began to run into a shift that no one could have seen coming.

During that same time, the Mozilla project was coming back to life, and one of the things that made that possible was a much more robust JavaScript layer, along with a part of the stack borrowed from Microsoft – a browser component with the unwieldy name of the XML HTTP Request Object, which had first been formulated primarily for moving SOAP information around. Writer Jesse James Garrett dubbed this new stack the Asynchronous JavaScript and XML object (aka, AJAX) while Douglas Crockford started making the case for sending a special serial form of JavaScript objects over this to better communicate with the server.

JavaScript web developers, who found that with only a bit of an upgrade they were suddenly in demand once again, easily adopted this new JavaScript Object Notation (JSON), which became a de facto lightweight data standard that seemed superior to XML (especially the very verbose SOAP structures) for the kinds of things that web developers liked to write – terse messages, simple objects and arrays. Just as XML-based systems worked better with a searchable XML database in the back end, so too was there a demand for a back-end JSON database – a demand soon met by dedicated systems such as Couchbase and MongoDB, Redis, Memcached, Cassandra, and several others, all now lumped into a category called NoSQL databases.

For MarkLogic, even though they started offering a way to convert JSON into and out of XML, their reputation was now solidly entrenched in the XML world, and after a period of explosive growth, XML had reached a degree of maturity that lent itself well to publishing but not to web applications. One big culprit for this was XQuery, a language which, while having a certain innate elegance, nonetheless was used by a comparatively small number of developers. If you wanted to take advantage of the rich API within MarkLogic, you had to develop XQuery expertise. This had the potential to end badly.

I’ve had a chance to evaluate that JavaScript layer within the upcoming MarkLogic 8 version, and MarkLogic has pulled out a transformation that will ensure that the company will be doing very well in the future. The server has always been unique in that it has placed what amounts to an entire mid-tier layer intelligently above the database, and indeed, I think that even many users of the product have tended to perceive MarkLogic as a database without necessarily understanding the amount of power that’s contained in its application layer.

With the introduction of V8, developers will be able to use JavaScript, rather than XQuery, for all layers of their application within the MarkLogic server. The extensive XQuery function API (several thousand functions, overall) is now available as JavaScript functions that allow control over every aspect of the server, from querying databases to creating whole new applications to creating web service pipes and running analytics functions. Beyond that, it is now possible both to create new function library modules in JavaScript and to take advantage of many of the same modules that currently work for node.js, one of the fastest growing libraries of community modules in the IT sphere.

It’s worth taking a moment to explore the significance of this. MarkLogic now stores and retrieves JSON from its databases, can search that JSON to the same (extensive) degree that it can search XML, can be programmed in JavaScript, and can dynamically evaluate JavaScript, all with nary an XML angle bracket in sight.
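To give a flavor of what that looks like, here is a minimal sketch of a JSON search written in server-side JavaScript. It assumes the built-in cts.search and cts.jsonPropertyValueQuery functions from MarkLogic's JavaScript API, plus a toObject() method on the returned documents; the findByProperty helper and the "author" property are hypothetical names chosen purely for illustration. Inside the server, cts is a global, but it is passed in as a parameter here so the function can also be tried against a stand-in object outside MarkLogic.

```javascript
// Hypothetical helper: find JSON documents whose `property` equals `value`.
// `cts` is MarkLogic's built-in search object (a global inside the server);
// it is a parameter here only so the sketch can run against a stand-in.
function findByProperty(cts, property, value) {
  // Build a query that matches a JSON property with the given value.
  var query = cts.jsonPropertyValueQuery(property, value);

  // Iterate over the search results and convert each document
  // to a plain JavaScript object (assumes documents expose toObject()).
  var results = [];
  for (var doc of cts.search(query)) {
    results.push(doc.toObject());
  }
  return results;
}
```

Within a MarkLogic module, this would be called as findByProperty(cts, "author", "Kurt Cagle"), with no XQuery and no XML involved at any step.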

In a recent RedMonk survey, JavaScript made up approximately 13 percent of all new GitHub projects, more than any other language, having surpassed Ruby in 2011.

It is also one of the most heavily used languages in web development, both because of its presence on all web browsers and because of node.js, which has steadily been gaining strength as both a stand-alone server and as an easily spawned auxiliary web services node. By migrating into the realm of database servers (as it has been doing in Couch and Mongo and now MarkLogic) it is also becoming a viable alternative to the rather clunky realm of stored procedures.

While getting a good count of the skilled practitioners of any given language is hard to do, it is likely that there are a hundred JavaScript programmers for every XQuery developer. There are few certification programs for XQuery (MarkLogic just announced its own certification program this month, and that’s focused on the server rather than the language) while JavaScript is now taught in nearly every university, community college and, increasingly, high school computer science program in the country.

Does that mean that everyone who knows JavaScript will immediately jump to MarkLogic? No, of course not. What it does mean is that any organization that already has a significant investment in people with JavaScript skills can now leverage those people for working with the server. It means that new hires with strong JavaScript skills can be trained up on MarkLogic quickly and relatively painlessly. And it means that it becomes increasingly possible to build a complete application stack on JavaScript alone, with more shared libraries that work for both client and server, as well as better interoperability with other systems such as node.js and mobile clients that use JavaScript through PhoneGap or jQueryMobile.

The strategy and massive investment by MarkLogic in establishing JavaScript as an equal (or perhaps even slightly superior) partner in the growing suite of data technologies that is MarkLogic Server is another bold master stroke by MarkLogic CEO Gary Bloom, who also made adding semantics technologies into the server a key part of their strategy last year. By my own evaluation, the strategy has paid off technologically, and has a good chance of catapulting MarkLogic out of the realm of specialty data servers and into the top tier of data service technology providers such as Oracle, Microsoft and IBM.

In my next post I intend to do a deeper dive into the Server Side JavaScript features of MarkLogic, with a brief excursion into the node.js library, as part of a general series on JavaScript, semantics and NoSQL data systems.

As always, thoughts and comments are welcome.

Kurt Cagle is an information architect, ontologist, author and industry analyst specializing in the NoSQL and JavaScript space. He is available for consultation. His clients have included Fortune 500 companies and US and European Federal Agencies. He lives in Issaquah, Washington, where he’s working on his latest novel, Storm Crow.

About Kurt Cagle

Kurt Cagle is the Principal Evangelist for Semantic Technology with Avalon Consulting, LLC, and has designed information strategies for Fortune 500 companies, universities and Federal and State Agencies. He is currently completing a book on HTML5 Scalable Vector Graphics for O'Reilly Media.
