Creating MLJAM for .NET

XQuery is a powerful language. Add MarkLogic-specific extension functions and constructs, and there is very little you can’t do entirely in XQuery.

But, of course, sometimes you do find things you can’t accomplish, like generating a GUID value. Or let’s say you have a lot of business logic already implemented in a programming language like Java or C#. Why recreate all that work in XQuery?

Most XQuery developers working in MarkLogic know about MLJAM, a code library that lets you instantiate and use Java classes from XQuery. It’s a great way to access functionality that XQuery either can’t support or would take too long to re-implement.

But what about those of us working in a Microsoft environment? In my old organization, I had many .NET assemblies already written to handle complex business processes (especially getting information out of and back into SQL Server databases and pre-MarkLogic Content Management Systems). It would be great if I could access all that functionality.

What I really need is a .NET version of MLJAM—an “MLNET”, as it were. The problem is that MLJAM is built on top of an open source project named “BeanShell,” which provides a scripting environment for evaluating Java. The .NET framework does not have a comparable scripting engine—it requires compiled code, which XQuery can’t provide.

After thinking about the problem, however, I realized that .NET has a set of classes that are perfectly capable of dynamically compiling source code. Indeed, after a little more research, I also learned that there is a .NET scripting language called “PowerShell,” as well as a set of classes that can provide a run-time environment for PowerShell scripts.
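
To make the dynamic-compilation idea concrete, here is a minimal sketch that uses the CodeDOM classes in System.CodeDom.Compiler. It is not the MLNET code (which goes through PowerShell), and it assumes the classic .NET Framework, since CSharpCodeProvider cannot compile on .NET Core. The names DynamicCompileSketch, Snippet, and Execute are made up for illustration.

```csharp
using System;
using System.CodeDom.Compiler;
using System.Reflection;
using Microsoft.CSharp;

// Hypothetical helper, not part of MLNET: compiles a C# method body at
// run time and returns whatever string the body returns.
public static class DynamicCompileSketch
{
    public static string Run(string body)
    {
        // Wrap the body in a class so the compiler sees a complete unit.
        string source =
            "public static class Snippet {" +
            "  public static string Execute() { " + body + " }" +
            "}";

        using (var provider = new CSharpCodeProvider())
        {
            // Compile to an in-memory assembly rather than a file on disk.
            var options = new CompilerParameters { GenerateInMemory = true };
            options.ReferencedAssemblies.Add("System.dll");

            CompilerResults results =
                provider.CompileAssemblyFromSource(options, source);

            // Report the first real error (warnings are skipped).
            foreach (CompilerError error in results.Errors)
            {
                if (!error.IsWarning)
                {
                    throw new InvalidOperationException(error.ToString());
                }
            }

            // Find the generated static method and invoke it via reflection.
            MethodInfo method = results.CompiledAssembly
                .GetType("Snippet")
                .GetMethod("Execute");
            return (string)method.Invoke(null, null);
        }
    }

    public static void Main()
    {
        // Prints a freshly generated GUID string.
        Console.WriteLine(Run("return System.Guid.NewGuid().ToString();"));
    }
}
```

Each call compiles a new assembly, which is slow and never unloaded from the process, so a real service would probably want to cache compiled snippets.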

This past weekend, I decided to see if I could build a simple version of MLNET that would take arbitrary C# code, compile it, and run commands. I was pleasantly surprised to find that it was not nearly as difficult as I thought.

My simple version of MLNET consists of a set of XQuery functions that send C# code to a web service. The service uses PowerShell to compile and execute the code, then sends back the results. As this was only a test, my version of MLNET is limited in scope. It doesn’t have any real security or error handling; and while it can evaluate any C# code, it can only return strings as results.

For those interested in the technical details, I’ve provided source code and set-up instructions with this post.

But what can MLNET do? Let’s take that hypothetical case of needing to generate a GUID. The .NET framework has a dedicated class that generates these identifiers, which I can now invoke like this:

let $guid := net:execute((),"return System.Guid.NewGuid().ToString();")

I can also call existing custom classes with MLNET. For example, let’s say that I have a Data Access Layer class already defined that saves a new username and password to a database. The class is named “Acme.DAL.UserManager” and is compiled inside an assembly named AcmeDAL.dll. I can invoke it like this:

let $_ := net:assemblies("AcmeDAL.dll")
let $result := net:execute(("user", $username, "password", $password),
'
try{
   Acme.DAL.UserManager mgr = new Acme.DAL.UserManager();
   mgr.AddUser(user, password);
   return "OK";
}
catch (System.Exception){
   return "Error";
}
')

Note that in this example, I pass the username and password values into the net:execute() function. In other words, I don’t need to know any values ahead of time: the XQuery can supply the necessary information when the code is run.
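
One way a service could turn those name/value pairs into variables is to make each name a string parameter of a generated method and pass the values as arguments at invocation time. The sketch below is only a guess at such a mechanism, not the code behind net:execute(). ParameterBindingSketch, BuildSource, Snippet, and Execute are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of MLNET.
public static class ParameterBindingSketch
{
    // Splits a flat ("name", "value", ...) list into a parameter list for
    // the generated method and the matching argument values.
    public static string BuildSource(
        IList<string> pairs, string body, out object[] args)
    {
        if (pairs.Count % 2 != 0)
        {
            throw new ArgumentException("Expected name/value pairs.");
        }

        var declarations = new List<string>();
        var values = new List<object>();
        for (int i = 0; i < pairs.Count; i += 2)
        {
            declarations.Add("string " + pairs[i]); // name becomes a parameter
            values.Add(pairs[i + 1]);               // value becomes an argument
        }
        args = values.ToArray();

        // The caller's code body is placed inside the generated method.
        return "public static class Snippet {" +
               " public static string Execute(" +
               string.Join(", ", declarations) + ") { " + body + " } }";
    }

    public static void Main()
    {
        object[] args;
        string source = BuildSource(
            new[] { "user", "jdoe", "password", "s3cret" },
            "return user;",
            out args);
        Console.WriteLine(source);
        Console.WriteLine(args.Length);
    }
}
```

Passing the values as arguments, rather than splicing them into the source text, avoids having to escape quotes and keeps user-supplied data out of the compiled code.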

And of course, I can also use classes that I declare directly in XQuery. For example, I can write and instantiate my own class to calculate an MD5 hash:

let $_ := net:assemblies("System.Security.dll")
let $_ := net:classes('
 using System.Security.Cryptography;
 using System.Text;
 public class Hasher{
   public string Hash(string str){
      using (MD5 md5Hash = MD5.Create()){
         byte[] data = md5Hash.ComputeHash(Encoding.UTF8.GetBytes(str));
         StringBuilder sBuilder = new StringBuilder();
         for (int i = 0; i < data.Length; i++)
            sBuilder.Append(data[i].ToString("x2"));
         return sBuilder.ToString();
      }
   }
 }
')
let $hash := net:execute( ("str", "string I want hashed"), '
Hasher h = new Hasher();
return h.Hash(str);
' )

MLNET fills an important gap that was preventing me from developing all my applications in pure XQuery. Or at least, it will once I get the code in better shape!

Unfortunately, the current version of MLNET is just a proof-of-concept. It can evaluate the code shown in these examples, but it is not efficient and doesn’t provide a rich set of features, like support for passing in and returning arrays of values, or handling any data type other than a string.

Nevertheless, I don’t think it would be very difficult to write a true, production-ready version of MLNET. Indeed, I encourage anyone who’s interested to download and improve the attached code. I know I plan to keep working on the project.

Who knows? With a few more weekends of effort, .NET developers could soon be “MLJamming”, too.

Downloads


Download Example Code

(Note: I’m sharing the code through Google Docs, so the link will take you to a page displaying everything inside the zip. Select File > Download to get the zip file itself.)

About Demian Hess

Demian Hess is Avalon Consulting, LLC's Director of Digital Asset Management and Publishing Systems. Demian has worked in online publishing since 2000, specializing in XML transformations and content management solutions. He has worked at Elsevier, SAGE Publications, Inc., and PubMed Central. After studying American Civilization and Computer Science at Brown University, he went on to complete a Master's in English at Oregon State University, as well as a Master's in Information Systems at Drexel University.
