Calling .NET code from XQuery

April 9th, 2012 by Demian Hess

As I mentioned in a blog post several weeks ago, I have been working on a way to call .NET code directly from MarkLogic.

My goal is to be able to access the functionality of existing .NET assemblies so that I don’t need to spend time re-implementing any logic in XQuery. Some typical use cases might include authenticating against a proprietary access control system written in .NET, or connecting to S3 buckets using Amazon’s .NET SDK.

I’m calling my code library “ML.NET” and am pleased to report that it now has just about all the features needed to provide real value to my projects, including:

  • Dynamically compiling .NET code for invocation
  • Caching compiled assemblies so that code does not need to be re-compiled
  • Support for data types such as strings, booleans, integers, floats, XML documents, binaries, and also arrays of all these types
  • Authentication to prevent unauthorized users from executing .NET code
  • Automatic system cleanup to remove unused cached assemblies

To make this a bit more concrete, I’ll step through a simple example that uses ML.NET to integrate Amazon’s Simple Storage Service (S3) with data stored in MarkLogic.

A Simple Example Using ML.NET

In this example, I’ve stored image metadata in MarkLogic while the images themselves are kept in an S3 bucket. Although Amazon provides a RESTful API to retrieve objects from buckets, signing the requests and parsing the responses can be tedious in pure XQuery. To simplify the process, I’ll use ML.NET to access Amazon’s .NET SDK from within XQuery.

The first step is to embed C# code in my XQuery by storing the text in a variable named $class-code. The $class-code defines a static class that retrieves an object from an S3 bucket and converts it to a byte array. Note that the “real” work is done by the AmazonS3, GetObjectRequest, and GetObjectResponse classes, which are all part of the Amazon SDK.

let $class-code := '
using Amazon;
using Amazon.S3;
using Amazon.S3.Model;
using System.IO;
using System;

public static class S3Retriever{
   public static byte[] Retrieve(AmazonS3 client, string bucket, string objectKey){
      GetObjectRequest request = new GetObjectRequest(){
         BucketName = bucket, Key = objectKey
      };
      GetObjectResponse response = client.GetObject(request);
      using (BinaryReader reader = new BinaryReader(response.ResponseStream))
      { return reader.ReadBytes( (int)response.ContentLength ); }
   }
}'
   

Next, I create “executable” code (stored inside $execute-code) that calls the static class and provides the image identifier and an S3 client that holds our Amazon credentials.

let $execute-code := '
   AmazonS3 s3Client = AWSClientFactory.CreateAmazonS3Client(key, secret);
   bytes = S3Retriever.Retrieve(s3Client, bucket, assetId);
'

Finally, in pure XQuery, I compile and invoke the C# in order to retrieve 10 images from Amazon and save them to my local filesystem.

let $_ := (
   mlnet:start("https://localhost/mlnetservice/mlnet.ashx", $username, $password),
   mlnet:custom-assemblies("AWSSDK.dll"),
   mlnet:classes($class-code),
   mlnet:set-outparam("bytes", mlnet:bytearray())
)
let $execution-id := ""
let $_ :=
   for $img at $index
      in (//asset[mime-type eq 'image/jpeg' and byte-size lt 1024 * 500])[1 to 10]
   let $path := fn:concat("c:\images\", fn:string($img/filename))
   return (
      mlnet:set-param("key", mlnet:string($key)),
      mlnet:set-param("secret", mlnet:string($secret)),
      mlnet:set-param("bucket", mlnet:string($bucket)),
      mlnet:set-param("assetId", mlnet:string($img/asset-id)),
      (
         if ($index eq 1) then xdmp:set($execution-id, mlnet:execute($execute-code))
         else mlnet:re-execute($execution-id)
      ),
      xdmp:save($path, mlnet:get("bytes"))
   )

return mlnet:end()

Note that at the beginning of the XQuery, I call mlnet:start() with three parameters: the location of the ML.NET web service, a username, and a password. The web service is part of the ML.NET library and is responsible for compiling and invoking all the .NET code. In this case, I am accessing the service over SSL, which ensures that the username and password are encrypted. The service will check these user credentials against the MarkLogic Security database, thus ensuring that only authorized users are able to compile and execute code on the server.

After calling mlnet:start(), I reference the Amazon SDK assembly, set the class code so that it is available for use, and define an output parameter named “bytes” that I will use to retrieve the image as a byte array.

In the next block of code, I query MarkLogic for ten small JPEG files and then iterate over the sequence. In the first iteration, the code calls mlnet:execute(), which compiles and invokes the C# stored in $execute-code. Calling mlnet:execute() also returns an id that I can then pass to mlnet:re-execute() in all subsequent iterations (in this example, I am using xdmp:set() to assign the id to $execution-id). mlnet:re-execute() re-uses previously compiled assemblies, thus avoiding the overhead of compiling .NET code in each iteration.

How long are compiled assemblies kept by the web service? In this case, the assemblies are deleted as soon as the XQuery calls mlnet:end(). Calling mlnet:start() creates a compilation and execution context. Calling mlnet:end() deletes all the assemblies that were created within that context.

If you forget to call mlnet:end(), the system will eventually delete old assemblies after an “expiration period”. You can specify the expiration period when you first set up the ML.NET web service; I generally set the duration to 20 minutes.
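To illustrate the caching and expiration behavior described above, here is a minimal sketch of an in-memory cache that hands out an id when something is stored, refreshes an entry each time it is re-used, and sweeps out entries that have sat idle past the expiration period. This is not the ML.NET service code: the ExpiringCache class, its method names, and the choice to key entries by a GUID are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch only; the real ML.NET service may work differently.
public sealed class ExpiringCache<T>
{
    private sealed class Entry
    {
        public T Value;
        public DateTime LastUsedUtc;
    }

    private readonly Dictionary<string, Entry> entries = new Dictionary<string, Entry>();
    private readonly TimeSpan expiration;
    private readonly object gate = new object();

    public ExpiringCache(TimeSpan expiration)
    {
        this.expiration = expiration;
    }

    // Store a compiled artifact and return the id used to find it again.
    public string Add(T value, DateTime nowUtc)
    {
        string id = Guid.NewGuid().ToString("N");
        lock (gate)
        {
            entries[id] = new Entry { Value = value, LastUsedUtc = nowUtc };
        }
        return id;
    }

    // Look up an entry by id; a hit refreshes its last-used time.
    public bool TryGet(string id, DateTime nowUtc, out T value)
    {
        lock (gate)
        {
            Entry entry;
            if (entries.TryGetValue(id, out entry) && nowUtc - entry.LastUsedUtc < expiration)
            {
                entry.LastUsedUtc = nowUtc;
                value = entry.Value;
                return true;
            }
        }
        value = default(T);
        return false;
    }

    // Explicit cleanup, comparable to ending an execution context.
    public bool Remove(string id)
    {
        lock (gate)
        {
            return entries.Remove(id);
        }
    }

    // Periodic cleanup: drop entries idle for at least the expiration period.
    // Returns the number of entries removed.
    public int Sweep(DateTime nowUtc)
    {
        lock (gate)
        {
            List<string> stale = entries
                .Where(pair => nowUtc - pair.Value.LastUsedUtc >= expiration)
                .Select(pair => pair.Key)
                .ToList();
            foreach (string id in stale)
            {
                entries.Remove(id);
            }
            return stale.Count;
        }
    }
}
```

With a 20-minute expiration, an entry that was last used 25 minutes ago is removed by the next call to Sweep, while one used 5 minutes ago survives. Passing the current time in as a parameter, rather than reading the clock inside the class, keeps the behavior easy to test.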

Getting More Information

I plan to have a poster on ML.NET at the upcoming MarkLogic World conference in Washington, DC (May 1-3, 2012). Stop by Avalon Consulting at the Diamond Booth and I’ll be happy to demo the ML.NET library and discuss the technical implementation in greater detail.

What is next in Enterprise Search

March 15th, 2012 by Joe Hilger

Mashable just published an interesting article about Google’s latest advancements in search.  The article, Google Knowledge Graph Could Change Search Forever, describes a “developing vision for search that takes it beyond mere words and into the world of entities, attributes and the relationship between those entities.”

Google is building a huge knowledge base describing entities, relationships and attributes of those entities.  Google’s vision is to make search smarter by identifying the people, places and things in the content that is being indexed.  Once Google understands what is in the content, the engine can recommend related information and present more relevant results.  To do this, Google is evolving from using statistical algorithms, thesauri and keyword matching to capturing over 200 million entities and their associated relationships.

For example, if you search on Jeremy Lin, Google will understand that you are looking for a basketball player in New York who played for Harvard.  Proper results will include New York Knicks box scores and recent basketball news from the NBA, New York Knicks and Harvard.  This is a lot more powerful than a list of links to articles containing the words “Jeremy Lin.”

Can we make Enterprise Search (search on your websites) just as smart?  We need to.  Even well-developed search, using facets and other techniques, is not satisfying the needs of your customers, stakeholders, and employees.

I believe entity-based search is a coming trend and I am not alone in this thought.  After posting the story about Google on Avalon’s internal collaboration site, my peers responded with positive and informative comments.

Brian:

When I first saw the description, I thought it sounded like a step toward RDBMS, but this is more like eduction on steroids. Not only are they extracting entities, but deciding how they relate with other entities. I thought that a leading search engine vendor’s entity extraction with eduction was impressive, but this goes far beyond that.

Kurt:

The mention of Freebase is significant, because Freebase is (was?) a Semantic Web database. In essence, you take 12 million entities and decompose all of the facts known about them into collections of assertions, some of them class relations:
person:BarackObama politics:holdsOffice office:PresidentOfTheUnitedStates.
person:BarackObama potus:predecessor person:GeorgeBush.
person:BarackObama politics:affiliation politics:USDemocraticParty.
person:MichelleObama relationship:spouseOf person:BarackObama.
person:MichelleObama educated:matriculatedFrom school:UniversityOfChicago.

some of them value relationships or labels:
person:BarackObama potus:firstTermStart 2008.
person:BarackObama potus:firstTermEnd 2012.
person:BarackObama foaf:FirstName “Barack”.

With several hundred million such assertions, you can then ask generalized questions such as “give me all Democratic presidents who were preceded by Republican presidents in the twenty-first century, and whose wives matriculated at the University of Chicago.”

The resulting datasets in turn form graphs, which don’t necessarily correspond to the tree structures that you would expect from XML groves.

Bing has been going down this road for a couple of years – it started out as a Semantic database rather than simply a lexical search base. This is also important because this is more than just data enrichment, which is still largely lexical in nature – it also establishes relationships between entities within the datasets. That matters because the same entity may have multiple names or designators, but if you have a system that can identify sameAs relationships, then your database becomes intelligent in that it can make inferences.

This is part of the reason why it’s important to not lose sight of this aspect of Big Data – Hadoop/MapReduce is significant because it provides a standardized way of performing parallel processing across commodity hardware. Semantic Web is important because it provides a way to make the resulting data internally aware and referential.
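To make Kurt’s description a little more concrete, here is a minimal sketch of how a handful of his assertions could be held as subject/predicate/object triples and queried by joining them. It is only an illustration, not how Freebase or Google store their data. The Triple and TripleDemo names are made up, the party affiliation for person:GeorgeBush is an extra assertion added so the query has something to match, and the “twenty-first century” condition is left out to keep the example short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// One assertion: subject, predicate, object.
public sealed class Triple
{
    public readonly string Subject;
    public readonly string Predicate;
    public readonly string Obj;

    public Triple(string subject, string predicate, string obj)
    {
        Subject = subject;
        Predicate = predicate;
        Obj = obj;
    }
}

public static class TripleDemo
{
    public static readonly List<Triple> Facts = new List<Triple>
    {
        new Triple("person:BarackObama", "politics:holdsOffice", "office:PresidentOfTheUnitedStates"),
        new Triple("person:BarackObama", "potus:predecessor", "person:GeorgeBush"),
        new Triple("person:BarackObama", "politics:affiliation", "politics:USDemocraticParty"),
        new Triple("person:MichelleObama", "relationship:spouseOf", "person:BarackObama"),
        new Triple("person:MichelleObama", "educated:matriculatedFrom", "school:UniversityOfChicago"),
        // Assumption: added for this example; not in the list above.
        new Triple("person:GeorgeBush", "politics:affiliation", "politics:USRepublicanParty")
    };

    // Every object o for which (subject, predicate, o) is asserted.
    public static IEnumerable<string> Objects(string subject, string predicate)
    {
        return Facts.Where(t => t.Subject == subject && t.Predicate == predicate)
                    .Select(t => t.Obj);
    }

    // Every subject s for which (s, predicate, obj) is asserted.
    public static IEnumerable<string> Subjects(string predicate, string obj)
    {
        return Facts.Where(t => t.Predicate == predicate && t.Obj == obj)
                    .Select(t => t.Subject);
    }

    // Democrats whose predecessor was a Republican and whose spouse
    // matriculated at the University of Chicago.
    public static List<string> DemocratsAfterRepublicansWithChicagoSpouse()
    {
        return Subjects("politics:affiliation", "politics:USDemocraticParty")
            .Where(person => Objects(person, "potus:predecessor")
                .Any(prev => Objects(prev, "politics:affiliation")
                    .Contains("politics:USRepublicanParty")))
            .Where(person => Subjects("relationship:spouseOf", person)
                .Any(spouse => Objects(spouse, "educated:matriculatedFrom")
                    .Contains("school:UniversityOfChicago")))
            .ToList();
    }
}
```

Each condition in the question becomes one hop across the graph of assertions, which is why the results form graphs rather than trees.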

Mike:

Interesting. It seems like this has the potential to close (eliminate?) the gap between 2 majorly different types of search – Discovery vs Targeted. The difference today is a big reason why Google search works so well and why so many firms have terrible Enterprise search. As the article said, I can Google “the 10 deepest lakes in the U.S” and get decent results because Google indexes so much content and their “cross our fingers and hope someone on the web has written about these things or topics” approach works.

But in the Enterprise, people’s searches are often much more targeted. On our Client X call today, we went through some very specific use cases where users needed to locate a very specific regulatory filing document. Here is one: “I’ve received a question from the agency regarding the US NDO sBLA submission and need to find the original submission information to answer it”. That is a different problem to solve. Because our engine doesn’t have the AI this article talks about, our solution is to leverage the real Intelligence of the user’s mind by giving him the ability to manipulate and filter the results using a combination of facets, keywords and sorting. And it works pretty well – much better than their current solution.

The approach this article takes would theoretically make it possible to cut right to the chase based on the original query. Pretty cool if they can get it to work. Question is – when will something like that be available – and affordable – to businesses?

Mike is asking the right question.  You don’t need to invest in covering every possible search, as Google has.  Instead, I believe most enterprises can enrich their search by adding as few as 200 relevant entities and relationships.  An intranet or public site search that has been made intelligent in this way can deliver the results that are specifically relevant to your users.

I’m excited about this new trend in search.  Let me know if your organization has started exploring this approach.  Avalon has helped dozens of organizations improve the usability and effectiveness of their search experience.  We are ready to help you take the next step.

 

Creating MLJAM for .NET

March 5th, 2012 by Demian Hess

XQuery is a powerful language. Add MarkLogic-specific extension functions and constructs and there is very little you can’t do all in XQuery.

But, of course, sometimes you do find things you can’t accomplish–like generating a GUID value. Or let’s say you have a lot of business logic already implemented in a programming language like Java or C#. Why recreate all that work in XQuery?

Most XQuery developers working in MarkLogic know about MLJAM, a code library that lets you instantiate and use Java classes from XQuery. It’s a great way to access functionality that XQuery either can’t support or would take too long to re-implement.

But what about those of us working in a Microsoft environment? In my old organization, I had many .NET assemblies already written to handle complex business processes (especially getting information out of and back into SQL Server databases and pre-MarkLogic Content Management Systems). It would be great if I could access all that functionality.

What I really need is a .NET version of MLJAM—an “MLNET”, as it were. The problem is that MLJAM is built on top of an open source project named “BeanShell,” which provides a scripting environment for evaluating Java. The .NET framework does not have a comparable scripting engine—it requires compiled code, which XQuery can’t provide.

After thinking about the problem, however, I realized that .NET has a set of classes that are perfectly capable of dynamically compiling source code. Indeed, after a little more research, I also learned that there is a .NET scripting language called “PowerShell,” as well as a set of classes that can provide a run-time environment for PowerShell scripts.
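As a rough illustration of what dynamic compilation looks like, the sketch below uses CSharpCodeProvider from System.CodeDom.Compiler, one of the .NET Framework's compilation APIs. It is not the MLNET source, which goes through PowerShell. The SnippetRunner and Snippet names are made up for this example, and it assumes the classic .NET Framework; newer .NET runtimes do not support compiling through this provider.

```csharp
using System;
using System.CodeDom.Compiler;
using System.Reflection;
using System.Text;
using Microsoft.CSharp;

// Hypothetical helper; assumes the classic .NET Framework.
public static class SnippetRunner
{
    // Wraps a method body in a static class, compiles it in memory,
    // invokes it, and returns the string it produces.
    public static string Run(string body)
    {
        string source =
            "public static class Snippet {\n" +
            "   public static string Run() {\n" + body + "\n   }\n" +
            "}";

        CompilerParameters options = new CompilerParameters();
        options.GenerateInMemory = true;     // do not write a .dll to disk
        options.GenerateExecutable = false;  // build a library, not an .exe
        options.ReferencedAssemblies.Add("System.dll");

        using (CSharpCodeProvider provider = new CSharpCodeProvider())
        {
            CompilerResults results = provider.CompileAssemblyFromSource(options, source);
            if (results.Errors.HasErrors)
            {
                StringBuilder message = new StringBuilder("Compilation failed:");
                foreach (CompilerError error in results.Errors)
                {
                    message.AppendLine().Append(error.ToString());
                }
                throw new InvalidOperationException(message.ToString());
            }

            // Find the generated method by reflection and call it.
            MethodInfo method = results.CompiledAssembly.GetType("Snippet").GetMethod("Run");
            return (string)method.Invoke(null, null);
        }
    }
}
```

Calling SnippetRunner.Run("return System.Guid.NewGuid().ToString();") compiles the one-line body and returns a freshly generated GUID as a string. Compiling is the slow part, which is why it pays to hold on to the compiled assembly when the same code will run more than once.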

This past weekend, I decided to see if I could build a simple version of MLNET that would take arbitrary C# code, compile it, and run commands. I was pleasantly surprised to find that it was not nearly as difficult as I thought.

My simple version of MLNET consists of a set of XQuery functions that send C# code to a web service. The service uses PowerShell to compile and execute the code, then sends back the results. As this was only a test, my version of MLNET is limited in scope. It doesn’t have any real security or error handling; and while it can evaluate any C# code, it can only return strings as results.

For those interested in the technical details, I’ve provided source code and set-up instructions with this post.

But what can MLNET do? Let’s take that hypothetical case of needing to generate a GUID. The .NET framework has a dedicated class that generates these identifiers, which I can now invoke like this:

let $guid := net:execute((),"return System.Guid.NewGuid().ToString();")

I can also call existing custom classes with MLNET. For example, let’s say that I have a Data Access Layer class already defined that saves a new username and password to a database.  The class is named “Acme.DAL.UserManager” and is compiled inside an assembly named AcmeDAL.dll. I can invoke it like this:

let $_ := net:assemblies("AcmeDAL.dll")
let $result := net:execute(("user", $username, "password", $password),
'
try{
   Acme.DAL.UserManager mgr = new Acme.DAL.UserManager();
   mgr.AddUser(user, password);
   return "OK";
}
catch (Exception e){
   return "Error";
}
')

Note that in this example, I pass the username and password values into the net:execute() function. In other words, I don’t need to know any values ahead of time–the XQuery can supply the necessary information when the code is run.

And of course, I can also use classes that I declare directly in XQuery. For example, I can write and instantiate my own class to calculate an MD5 hash:

let $_ := net:assemblies("System.Security.dll")
let $_ := net:classes('
 using System.Security.Cryptography;
 using System.Text;
 public class Hasher{
   public string Hash(string str){
      using (MD5 md5Hash = MD5.Create()){
         byte[] data = md5Hash.ComputeHash(Encoding.UTF8.GetBytes(str));
         StringBuilder sBuilder = new StringBuilder();
         for (int i = 0; i < data.Length; i++)
            sBuilder.Append(data[i].ToString("x2"));
         return sBuilder.ToString();
      }
   }
 }
')
let $hash := net:execute( ("str", "string I want hashed"), '
Hasher h = new Hasher();
return h.Hash(str);
' )

MLNET fills an important gap that was preventing me from developing all my applications in pure XQuery. Or at least, it will once I get the code in better shape!

Unfortunately, the current version of MLNET is just a proof-of-concept. It can evaluate the code shown in these examples, but it is not efficient and doesn’t provide a rich set of features, like support for passing in and returning arrays of values, or handling any data type other than a string.

Nevertheless, I don’t think it would be very difficult to write a true, production-ready version of MLNET. Indeed, I encourage anyone who’s interested to download and improve the attached code. I know I am planning to keep working on the project.

Who knows? With a few more weekends of effort, .NET developers could soon be “MLJamming”, too.

Downloads


Download Example Code

(Note: I’m sharing the code through Google Docs, so the link will take you to a page displaying everything inside the zip. Select File > Download to get the zip file itself.)

Is this the start of e-Textbooks?

February 28th, 2012 by Demian Hess

“Is this the start of e-textbooks?” It was 2006 and I was in a production meeting with the head of our college textbooks division. She had just passed around the new Sony ebook reader and we were trying to decide if this was the device that would start the transition from print to electronic textbooks.

The answer then was clearly no: textbooks have large, colorful pages packed with images, boxes, time lines, and tables. None of these features translated to the small, black-and-white screen of the Sony ereader, or to that of the first-generation Kindle (which felt like reading on an Etch-A-Sketch).

Fast forward to 2010. My boss had just purchased a new iPad and we were passing it around the table. Everyone knew that this was what we’d been waiting for. Here was a device that could support a wider range of content than had ever been possible. The only problem was, we didn’t have the content for it.

If you haven’t been in publishing, it’s hard to describe how much of what we do is based on what has been done before. Publishers and authors create books based on a mental image of how the words will look on a page, how professors will teach their classes, and how students will study.

But none of us had an image of what a textbook should look like on an iPad. The screen was gorgeous, but still too small for the normal textbook trim size. And what would be the point of simply putting a print book on the device? Static images and tables seemed stale on the iPad. We needed a whole new paradigm, but didn’t know what it should be.

With the release of the iPad 3 only a few weeks away, I believe that textbook publishers are finally ready to take advantage of the new generation of tablet devices. This past January, Apple released the iBooks Author app, which lets publishers create an interactive textbook with 3D images, quizzes, glossaries, and iPad-optimized navigation and search. Most importantly, publishers can immediately preview their work, thus allowing a seamless learning cycle as content is written, previewed, and then revised to take better advantage of the device.

While the Author app is good, I am even more impressed with a San Francisco-based start-up called Inkling, which has developed its own online platform for creating digital textbooks. I like Inkling because its books can be read not only on the iPad, but also the Kindle Fire, Samsung Galaxy Tab, and Barnes and Noble Nook. Additionally, where Apple only allows books created with Author to be sold through the iBookstore, Inkling lets publishers sell their e-textbooks on their own websites.

Printed textbooks are certainly not going to disappear overnight. After all, tablet computers are expensive and printed textbooks are familiar and ubiquitous. It reminds me of the early 1980s, when I started college and my grandparents bought me a brand new electronic typewriter. I lugged that typewriter around for years, but it wasn’t my first choice for writing papers. Apple had just released the Macintosh, and if the computer center wasn’t too full, I preferred to do all my work on that.