04 February 2018

Backing up with rsync to a bitlocker disk mounted under Ubuntu on Raspberry Pi

Needed to set up backup to a BitLocker drive (corp policy - not allowed to put anything on a drive that's not BitLocker'd).  Corp laptops are set up such that they want to inspect all data written to external devices, so backups ended up being painfully slow (especially when jar files were involved, since the scanner wanted to unzip them and inspect everything inside).  Decided to go with rsync over ssh from a Cygwin bash shell.

For fun (and since I didn't want this stuff on my personal machines), I decided to buy a Raspberry Pi and use that as my rsync target, with the BitLocker disk mounted using dislocker.

I started out trying to install a packaged version with apt-get, but had dependency problems (might have had something to do with trying to run under Ubuntu on a Raspberry Pi), so I reverted to these instructions: http://blog.airbuscybersecurity.com/post/2016/01/Mounting-Bitlocker-Volumes-Under-Linux

On the server side, I ended up with two scripts, one to mount and one to unmount:

# mount script: decrypt the BitLocker volume with dislocker (-u prompts
# for the user passphrase), then loop-mount the decrypted image as NTFS
dislocker-fuse -V /dev/sda1 -u -- /mnt/dislocker
mount -t ntfs-3g -o loop /mnt/dislocker/dislocker-file /mnt/vmdisk
mount | grep vmdisk

# unmount script: tear down in reverse order
umount /mnt/vmdisk && umount /mnt/dislocker
mount | grep dislocker

The mount script will prompt you for your BitLocker passphrase.  Note that shutting down the server without unmounting dislocker first can supposedly cause scary things to happen to the BitLocker encryption.

I also set up user mapping to translate between Windows SIDs and my Linux user/group.  This goes in /mnt/vmdisk/.NTFS-3G/UserMapping:
# User mapping proposal :
# -------------------- cut here -------------------
1001::S-1-5-21-4271255075-229453548-3213529333-6405
:1001:S-1-5-21-4271255075-229453548-3213529333-513
::S-1-5-21-4271255075-229453548-3213529333-10000
# -------------------- cut here -------------------
On the client side, it's just standard rsync commands to send the data. Again, corp security required more dancing - Cygwin needed to run as admin, but you can't do that directly anymore, so you have to run a DOS window as admin and launch Cygwin from there. Also, trying to rsync the C drive itself doesn't work well, because only the TrustedInstaller user has write privs on the root, so rsync gets locked out as soon as it syncs the root directory of C: (which is the first thing it does).  The trick is to specify directories under C: as the rsync sources rather than the C drive itself.

01 July 2011

Recursive queries from a custom validator

Just spent the morning tracking down a stack overflow while trying to implement a Grails custom validator.

I have a domain class that has a parent/child relationship with itself (instances form a tree). When I update instances, I need to make sure that I haven't created a cycle (i.e., that an instance ends up as its own parent/ancestor).

To do this, I created a custom validator to check that the object doesn't have itself as its own parent and that its parent is not also one of its children/descendants:

class Afsc {
  String id
  String code
  Afsc parent

  def afscService
  def sessionFactory

  static constraints = {
    parent(nullable: true, validator: { val, obj ->
      if ((obj.id && val?.id == obj.id)
          || obj.afscService.getChild(obj, val?.code)) {
        return 'AFSC.cannot.be.cyclic'
      }
    })
  }
}

The fun part is that getChild() does a recursive search, checking that the instance doesn't have its proposed parent among its descendants:

  def getChild(parent, targetCode) {
    def children = Afsc.findAllByParent(parent)
    def child = children.find {
      it.code == targetCode
    }
    if (!child) {
      children.each {
        child = child ?: getChild(it, targetCode)
      }
    }
    return child
  }

When I tried this, I started getting stack overflow exceptions.  Long story short, it turns out that using findAllByParent() causes Hibernate to flush pending changes before executing the query.  Well, guess what happens when changes are flushed?  That's right - the validator gets called.  Which calls getChild().  Which calls findAllByParent().  Which causes a flush.  Which gives us infinite recursion and a stack overflow.

Once I figured out what was going on, it wasn't too hard to find the solution - tell Grails/Hibernate not to automatically flush changes:

    parent(nullable: true, validator: { val, obj ->
      if (obj.id && val) {
        if (val.id == obj.id) {
          return 'AFSC.cannot.be.cyclic'
        }
        // querying from a validator normally triggers a flush, which
        // re-runs the validator - disable auto-flush while we query
        def originalFlushMode = obj.sessionFactory.currentSession.flushMode
        obj.sessionFactory.currentSession.flushMode = org.hibernate.FlushMode.MANUAL
        try {
          if (obj.afscService.getChild(obj, val.code)) {
            return 'AFSC.cannot.be.cyclic'
          }
        } catch (Exception e) {
          return e.getMessage()
        } finally {
          obj.sessionFactory.currentSession.flushMode = originalFlushMode
        }
      }
    })

Update 04-Apr-2014:

Today I was writing a custom validator and something was telling me that querying the database from a validator might cause problems, so I decided to google a bit to see if I was right.  I guess it was my subconscious reminding me that I'd already solved this problem once before.

Having seen a couple of other solutions to this problem since, it seems a better approach might be to use the withNewSession() domain-class dynamic method so the query runs in a separate session with its own flush behavior.  See http://adhockery.blogspot.com/2010/01/upgrading-grails-11-12.html for an example.

28 September 2010

The Windows Way vs the Unix Way

A colleague comparing VCS systems commented that while TFS includes pretty good bug and issue tracking components, systems like Git and Mercurial have to be combined with Trac or Bugzilla to get the same set of functionality.  As I was reading his comments, it struck me that this was another great example of the Windows Way vs the Unix Way.

The Unix way is to have lots of small focused tools that do one thing well and that can be plugged together in any combination that meets your particular needs.  Often, there are multiple tools that do approximately the same thing, but in slightly different ways, each convenient to a particular set of needs.  This leads to highly tuned solutions - you can pick exactly the right set of tools that both solve your problem and fit best with your environment.  The down-side to this is that you need to be skilled in knowing how to fit the right pieces together.  Knowing which tools are best suited to your environment takes a good bit of experience - but once you have a solution, it fits like a glove.

The Windows way is pretty much the opposite.  The Windows Way is to construct a single monolithic piece of software that solves a whole general class of problems.  The great thing about this approach is that there's no assembly required.  You don't need to know about lots of different tools and how to make them all work together - you just need to know that one tool.  And because you only need to know one, you can get to be pretty knowledgeable about it.  The downside to this, of course, is that while general solutions can usually do a lot of things, they often do none of them well.  General-purpose solutions have to make assumptions about the context in which they will be used, and if you deviate from those assumptions, the solution doesn't operate as efficiently as it could.

So is one better than the other?  Not really.  At least not without having more context.  Sometimes a general purpose solution is sufficient.  Other times, you need the flexibility to be able to create a highly specific solution.  The trick is to know when each is appropriate.

08 May 2007


"Engineer for serendipity."

--Roy Fielding


28 February 2007


So, earlier I had said that I wasn't really interested in going to the W3C Web of Services Workshop. It ended up that someone else couldn't go, and since I was curious to hear reactions to some of the stuff being presented on day 2, I decided to go ahead and go.

I'm glad I went. There was a lot of good and interesting discussion. I don't think any of the world's problems were solved and I don't think anybody changed their minds about Web Services - although there did seem to be a lot of agreement that REST is good and worth investing time in.

One of the recurring topics was the uniform interface - it seems some people get it and others don't. It always seems to end up in an argument about dispatching - either you do it at the operation level or you do it at the message-type level. If I define operations, they're strongly typed and I know exactly what kind of data I'll be getting. If I only have one operation that has to handle different kinds of data, then how do I know what to do with the data - I have to write a big if-statement to figure out how to handle it. How is that an improvement? Aren't you just pushing the dispatching to a different place?

And that's the wrong thing to focus on. I stumbled on this as I was trying to explain to my very dyed-in-the-wool WS-* coworker why the uniform interface is useful. I finally made progress with the following.

Imagine I have a printer with an embedded web server. The printer makes available a web service with associated WSDL that defines an operation called getPrinterStatus, and that operation returns an xml document of a type we'll just call DeviceInfo. If I'm writing a client to retrieve the printer's status, I pull in the WSDL, generate the stub code, and fill in the business logic. Now I can monitor the status of the printer.

Now imagine that some time later, I purchase a copy machine. This copier also has an embedded web server and makes available a web service. The web service has a number of operations, but one of them is getDeviceInfo, and that operation also happens to return an xml document that has the same format as what the printer returns - DeviceInfo.

If I now want my printer monitor client to also be able to monitor my copier, I have to modify the client's code - I have to pull in the copier's WSDL, generate the stubs for its operations and then I can get the DeviceInfo document for the copier.

Now back up and imagine that each of those devices had used HTTP GET in a RESTful way. Because my printer monitor client knows how to handle documents of type DeviceInfo, all I have to do is tell it what the appropriate URI is for each of my devices - I don't have to change any software. Now, instead of only being able to interact with my printer, my client software can interact with any resource that produces DeviceInfo documents in response to an HTTP GET. For free.
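To make that concrete, here's a minimal sketch (the device URIs are entirely made up) of what the RESTful monitor client reduces to - the same GET for every device, with no per-device code. The echo prints the request instead of issuing it, since these hosts don't exist; drop it to actually fetch:

```shell
#!/bin/sh
# Hypothetical URIs; any resource that answers GET with a DeviceInfo
# document can be monitored by this same loop, unmodified.
for uri in http://printer.example/deviceinfo http://copier.example/deviceinfo; do
  echo curl -s "$uri"
done
```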

I could hear the light bulb click on.

06 February 2007

Hot vs Cold

Every winter, there are people who say they'd rather be too hot than too cold. And every summer, there are people who say they'd rather be too cold than too hot. And I'm pretty sure that some of those people say different things depending on the season.

Right now, I know I'd rather be too hot than too cold. Problem is, I'm concerned that I might have a seasonal opinion - but I really can't remember what I thought when it was actually hot out. I'm pretty certain I face this same dilemma every year, and probably twice a year (but I can't remember for sure).

So this year, I'm going to do something about it - I'm writing down my winter-time position on the hot vs cold debate: I prefer heat to cold.

Now, I just have to remember to check back when it's really hot out.

28 January 2007

Catching Up

So much for practicing writing - well, at least not here. I have, however, been doing quite a bit of writing at work lately, some of which is the basis for this position paper for the W3C Workshop on Web of Services for Enterprise Computing. If I'm lucky, maybe I'll finally get to meet fellow RESTafarian Mark Baker who's also presenting something.

Although I contributed to the position paper (somewhat unwittingly), at this point I'm not sure I'm looking forward to going to the workshop. In fact, if you'd asked me before our paper was submitted whether I was interested in attending such a workshop, my response would have been something along the lines of "why would I want to work to improve something (WS-*) that I'd prefer to see fade away?"

The project I'm on now is a research-oriented project for a military customer where we're looking at SOA, ESBs and Web Services (among other things). In a nutshell, we're supposed to help our customer figure out whether or not this SOA stuff and its corresponding technologies (which in their eyes is WS-*) will actually work and be useful for their purposes. The downside is that I'm working with stuff that I don't really believe in (the WS-* part, not the SOA part). The upside is that I have the opportunity to point out the failings as I see them (and the customer actually seems willing to listen).

Our team pretty much covers the spectrum from WS-* on one end to REST on the other (that's me), so we occasionally have some spirited debates.

For the last few months, we've been looking at service discovery, trying to really focus in on what service discovery is and why you might need it. You see, the military is in the midst of an effort to get themselves some SOA goodness and they're cranking out the architectural guidelines and building themselves some infrastructure to support all these new services that'll be part of their SOA. One piece of that infrastructure is discovery.

Apparently, there's some debate as to what discovery is. If you ask one group (apparently the majority), they claim it means content discovery - being able to discover information (i.e., search) - and that if you squint the right way, services are just information sources whose output can be treated as content. However, there's another group that believes there's a fundamental distinction between services and content and that the two require different approaches for discovery.

So we've been looking at service discovery and asking lots of questions - like what's the difference between design-time and run-time discovery, and is there really a need for such a thing? It's been kind of frustrating, because any time we talk to people about it, they either just point to UDDI, or they start talking about all the cool things you could do if you could discover arbitrary services at runtime. Unfortunately, there are never any real details as to how any of this would actually work. And worse, when we ask for real-world scenarios where this would be useful, we either get more hand-waving, or something that would require a whole lot more AI than the industry's currently able to muster.

We've managed to make some progress - to the point where I've managed to formulate a somewhat coherent picture of service discovery in my head; and over the last month, I've tried to put some of it on paper. Mind you, none of it's earth-shattering; just a healthy dose of reasoning about the needs of design-time discovery and run-time discovery and some thoughts about the sort of environment in which run-time discovery would actually make sense.

One conclusion we've drawn is that (assuming run-time discovery is actually a useful thing), what's currently out there in terms of tools, technologies, and specifications probably isn't sufficient - especially not in the military world. Problem is, at this point, I have no idea what would be needed. My esteemed colleague (author of our position paper, and solidly in the WS-* camp) has decided this gap should be addressed by the W3C - and thus the position paper. Me - I'm not so sure. I'm not even convinced there's a real need for run-time discovery - at least not as an infrastructure service.

Thus, my conundrum - I may have to make a case for something I'm not even sure is a problem, and I have to do it at a workshop that I'd otherwise have no interest in. Oh well, if I do go, at least I'll finally get to meet a bunch of cool people - like Mark, Noah and Dave - whose writings I've followed in such august places as the W3C TAG mailing list or the REST-discuss group.

(Let's see if I can do another one of these without waiting another two years.)