A Maglev Store-y

Jan 29 2010

Maglev has been getting hype ever since its developers became magicians, with bunnies and a neat persistence model for Ruby objects, of all things. “But PStore has been persisting my objects for ages”, you respond. Ah, ever the skeptic. So true, yet I also happen to know one of PStore’s weaknesses:

  # MRI's infamous PStore, on the scene!
  # ruby -v 
  # => ruby 1.8.7 (2008-08-11 patchlevel 72) [universal-darwin10.0]

  require 'pstore'
  pstore = PStore.new("filestore")
  pstore.transaction {|store| store[Proc] = Proc.new { "Save me!" } }
  # => TypeError: no marshal_dump is defined for class Proc

  # Maglev, the rookie with something to prove!
  # WARNING: This example does a crash and burn with maglev-irb
  # maglev-ruby -v
  # => maglev 0.6 (ruby 1.8.6) (2010-01-22 rev 22780-1096) [Darwin i386]

  Maglev::PERSISTENT_ROOT[Proc] = Proc.new { "Save me!" }
  Maglev.commit_transaction
  # => Doesn't even blink

Yowza! How’s that for a “stored procedure”?
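PStore hands its values to Marshal for serialization, so the same failure can be poked at without creating a file. A minimal sketch, where marshalable? is a helper name I made up for illustration:

```ruby
# PStore serializes with Marshal, so Marshal alone reproduces the problem.
# marshalable? is a hypothetical helper, not part of any library.
def marshalable?(object)
  Marshal.dump(object)
  true
rescue TypeError
  false
end

marshalable?({ "name" => "lupus" })    # => true, plain data dumps fine
marshalable?(Proc.new { "Save me!" })  # => false, Procs carry live code and bindings
```

The exact error message varies between Ruby versions, which is why the sketch only rescues the TypeError instead of matching on its text.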

For the uninitiated, Maglev rolls with persistence by reachability. We get an enchanted hash conveniently parked at Maglev::PERSISTENT_ROOT at the start of each Ruby session that’s fired up. Utter the magic words Maglev.commit_transaction and everything that the hash references gets siphoned and persisted into Maglev’s repository.
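Reachability is easier to see than to describe. Here is a rough sketch that sticks to core classes and falls back to a plain Hash when Maglev isn't around (in which case nothing is actually persisted). The ROOT constant and the :ward key are my own placeholders:

```ruby
# On Maglev, use the real persistent root; on MRI, fall back to a plain Hash
# so the sketch still runs (without any persistence).
ROOT = defined?(Maglev) ? Maglev::PERSISTENT_ROOT : {}

# Hang a small object graph off the root: Hash -> Array -> Strings.
ROOT[:ward] = { "bed 1" => ["vertigo"] }

# Anything we can walk to from ROOT counts as reachable, including
# objects added to the graph after the first assignment.
ROOT[:ward]["bed 1"] << "weight gain"

# Only meaningful on Maglev: persist everything reachable from the root.
Maglev.commit_transaction if defined?(Maglev)

ROOT[:ward]["bed 1"] # => ["vertigo", "weight gain"]
```

Nothing in the graph asks to be saved individually. Being referenced, directly or indirectly, by the root is the whole contract.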

Anyway, being the unproven tech adopter that I am, I want to give Maglev a spin. Use it as an ORM substitute for a small project. I’m writing a script, you see.

  # The Persistable module aims for elegance and simplicity. 
  # #persist, #desist and an easy way to apply them to vanilla Ruby classes.
  require 'persistable'
  
  class Diagnosis < Struct.new(:name, :type)
    include Persistable
  end

  # It's time for another exciting episode of House!
  # Today's plot: the girl's fallen and she can't get up.

  # House: Differential diagnosis time!

  # Cameron: Lack of fever rules out infection, but she does have a family 
  #          history of diabetes type 2 and she's gained 30 pounds in less than
  #          a month, suggesting Cushing's. 
  Diagnosis.new("diabetes", :endocrine).persist # => true
  Diagnosis.new("cushings", :endocrine).persist # => true
  
  # House: Aren't you missing something, your _favorite_ something?
  # Cameron: ... lupus?
  # House: Exactly!
  Diagnosis.new("lupus", :autoimmune).persist # => true

  # Foreman: Loss of equilibrium could suggest MS.
  Diagnosis.new("multiple sclerosis", :neurological).persist # => true
  
  # Chase (in Australian drawl): Maybe her blood sugar was just low. 
  # *everyone glares at Chase*
  # Chase (from down under): Fine then. It's sarcoidosis.
  Diagnosis.new("sarcoidosis", :inflammatory).persist # => true
  
  # House: Throw in Paraneoplastic Syndrome for good luck. 
  Diagnosis.new("paraneoplastic syndrome", :cancer).persist # => true
  # House: Run the really expensive and unnecessary tests to rule out those...
  Diagnosis.count # => 6
  # House: ... six completely unrelated diseases. 

  # House: Oh, and Cameron, by the way. It's never lupus. 
  # *Cameron frowns*
  Diagnosis.delete_if {|disease| disease.name == "lupus"}

  # One commercial later...
  
  # Foreman: MRI and LP came back completely normal, ruling out any neurological disorder.
  # Chase (back from the outback): Lung biopsy revealed _nothing_, except that it's not 
  #                                sarcoidosis.
  Diagnosis.delete_if {|disease| disease.type == :neurological || disease.name == "sarcoidosis"}
  
  # Cameron: I'm telling you, it's gotta be hormonal.
  Diagnosis.select {|disease| disease.type == :endocrine}
  
  # ... 30 minutes, 2 cardiac arrests and 1 stroke later, it's none of the above anyway.
  Diagnosis.delete_all

The code for Persistable is available here. More than showing off the implementation, however, I want to discuss the steps to making it all work on Maglev.

The original test code is something akin to:

  require 'test/unit'
  
  class PersistableTest < Test::Unit::TestCase
    # [...]
  end

Running the test file with ruby persistable_test.rb works hunky dory. Running it with maglev-ruby, though…

  # $ maglev-ruby persistable_test.rb
  # during at_exit handler 1: error ,  each_object not implemented

The test file doesn’t even run. The solution: a good ol’ twin substitution. Trade Tia for Tamera, if you will.

  # maglev-gem install minitest
  
  require 'minitest/unit'

  MiniTest::Unit.autorun

  class PersistableTest < MiniTest::Unit::TestCase
    # [...]
  end

Progress! But Maglev still has some issues (just as any other adolescent would). As far as errors go, though, this is one of the more interesting ones. Back to the original code:

  module Persistable

    # A store should be able to, uh, store objects. Also:
    # - should provide a deletion mechanism
    # - should provide a querying mechanism
    # - should not accept duplicates
    # - should be Set
    # Ok, so that last one isn't a requirement. Gosh darned, it's the solution!
    require 'set'

    def self.included(klass)
      klass.class_eval do

        # The code should at least run on standard MRI, even if it doesn't really
        # provide persistence then.
        if defined?(Maglev)
          # Flag the class as persistable by Maglev
          # The documentation mentions #maglev_persistable=(true)
          # The documentation is also wrong
          self.maglev_persistable 
          @@store = ( Maglev::PERSISTENT_ROOT[self] ||= Set.new )
        else
          @@store ||= Set.new
        end

        # You are now entering the metaclass zone
        class << self
          
          include Enumerable
        
          def each(&block)
            @@store.each &block
          end          
        
          # other class methods which access @@store go here
        end
        
        # instance methods (which also access @@store) go here 
      end
    end
  end

The issue seems to be that class << self, i.e. the metaclass, doesn’t share its class variables with its own class.

  # maglev-irb
  
  class Foo
    include Persistable
  end
  
  Foo.class_variables
  # => ["@@store"]
  
  Foo.all
  # => NameError: undefined class variable @@store

I dunno why, maybe the metaclass didn’t have good parenting when growing up and never learned the art of sharing. In any case, the solution is to rip the methods from the metaclass’s grubby hands and define them on the class proper.

  module Persistable

    def self.included(klass)
      klass.class_eval do
    
        # [...]
      
        # No more metaclass fun
       
        extend Enumerable
        
        def self.each(&block)
          @@store.each &block
        end        
        
        # [...]
      
      end
    end
  
  end
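Another way around the squabble, which is not what Persistable does, is to skip class variables altogether and keep the store in a class-level instance variable behind an accessor. A sketch under that assumption, with made-up names (InstanceVarStore, Pancake) and tested only in my head against plain MRI semantics, not Maglev:

```ruby
require 'set'

# Hypothetical alternative to @@store: a class-level instance variable.
module InstanceVarStore
  def self.included(klass)
    klass.extend(ClassMethods)
  end

  module ClassMethods
    include Enumerable

    # Each including class lazily gets its own Set.
    def store
      @store ||= Set.new
    end

    def each(&block)
      store.each(&block)
    end
  end

  # Returns true if the object was added, false if it was already stored.
  def persist
    !self.class.store.add?(self).nil?
  end
end

class Pancake < Struct.new(:topping)
  include InstanceVarStore
end

Pancake.new(:syrup).persist   # => true
Pancake.new(:butter).persist  # => true
Pancake.new(:syrup).persist   # => false, already in the store
Pancake.count                 # => 2
```

On Maglev the Set would still have to be reachable from Maglev::PERSISTENT_ROOT to survive the session, so treat this as a way to structure the lookup rather than a drop-in replacement.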

One more shot at maglev-ruby persistable_test.rb and it’s down to two errors. Not bad for a third try. The bump is much lower too: Set#count demands a block. Seriously, what’s up with Maglev’s avarice? Once again, it’s substitution time:

  module Persistable
  
    # [...]

    def self.count
      # @@store.count was too demanding in Maglev
      @@store.size
    end
  
    # [...]
  
  end
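For the record, the swap costs nothing: #size answers the same question as a blockless #count, and a filtered count can be had without #count at all. A tiny sketch on a bare Set:

```ruby
require 'set'

store = Set.new([:soggy, :crunchy, :chunky])

# Total number of elements, no block required.
store.size                                    # => 3

# A filtered count that doesn't rely on #count either.
store.select { |type| type != :soggy }.size   # => 2
```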

Run tests again and… funny, it must be St. Patrick’s day, because all I’m seeing is green.

  # maglev-irb
  class Bacon < Struct.new(:type)
    include Persistable
  end
  
  Bacon.new(:soggy).persist
  Bacon.new(:crunchy).persist
  Bacon.new(:chunky).persist
  # Commit changes
  Maglev.commit_transaction
  # open up another maglev-irb session
  Bacon.count
  # => 3
  # Querying
  Bacon.select {|piece| piece.type == :chunky}
  # Deleting
  Bacon.delete_if {|piece| piece.type == :soggy}
  # Commit changes again
  Maglev.commit_transaction

And with that, we get a workable (for loose definitions of work) solution to handling persisted instances of Ruby classes. Don’t leave yet though, there’s so much more!

#persist may be aliased as #save, but they should be treated as distant cousins. After an object calls #persist, Persistable stores the object reference and does not make a copy, so any changes you make are live.

  # Run with irb
  # When run with maglev-irb, Bacon.all.inspect returns "[...]" for some reason
  #   but is otherwise functionally equivalent

  # Bacon class definition, including Persistable

  # Let's get some bacon and save it in our stash
  snack = Bacon.new(:chunky)
  snack.persist
  Bacon.all
  # => [#<struct Bacon type=:chunky>]

  # Oh no, some water fell on our bacon...
  snack.type = :soggy
  # ... and our stash already knows!
  Bacon.all
  # => [#<struct Bacon type=:soggy>]

Just keep in mind that when you’re editing an object, you’re editing it live. Mistakes can be rolled back with Maglev.abort_transaction, while nothing gets committed to the Maglev repository until Maglev.commit_transaction is called.
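One wrinkle of live editing with a Set-backed store: a Struct's #hash is computed from its fields, so changing a field after the object went into the Set can leave it filed under a stale hash. The sketch below uses a stand-in Strip struct (my name, not part of Persistable) and Set#reset, which only exists on Ruby 2.6 and later:

```ruby
require 'set'

Strip = Struct.new(:type) # hypothetical stand-in for Bacon

stash = Set.new
snack = Strip.new(:chunky)
stash.add(snack)

snack.type = :soggy
stash.first.type # => :soggy, same object, so the stash sees the edit

# The element was indexed under its old hash, so this lookup can miss.
# Don't rely on either answer at this point.
stash.include?(snack)

# Where available, Set#reset re-indexes elements after such a mutation.
stash.reset if stash.respond_to?(:reset)
stash.include?(snack)
```

If the fields that feed #hash are going to change, it may be simpler to remove the object, edit it, and persist it again.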

If we want validations, attribute accessors are the way to go.

  # Let's reopen our delicious class
  class Bacon
    attr_reader :expiration_date
    def expiration_date=(date)
      raise ArgumentError, "this bacon is already expired?!" if date < Time.now
      @expiration_date = date # must match the ivar that attr_reader exposes
    end
  end

Finally, a word on uniqueness. Set uses an object’s #hash and #eql? results to determine if it already contains said object. What this means is that we can establish uniqueness by altering said methods.

  # Back to the House metaphor
  class Diagnosis < Struct.new(:name, :type)

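    # We want uniqueness based on type. Set consults both #hash and #eql?,
    # so the two have to agree; matching hashes alone are not enough.
    def hash
      type.hash
    end

    def eql?(other)
      other.is_a?(Diagnosis) && type == other.type
    end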
  end
  
  Diagnosis.new("sjogren's disease", :autoimmune).persist # => true
  Diagnosis.new("lupus", :autoimmune).persist # => false, only one autoimmune disease per episode allowed
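To see why both methods matter, here is a small comparison on a bare Set, outside of Persistable. The class names HashOnly and HashAndEql are invented for the demonstration:

```ruby
require 'set'

# Overrides #hash only; Struct#eql? still compares every field.
class HashOnly < Struct.new(:name, :type)
  def hash
    type.hash
  end
end

# Overrides both, so two objects with the same type count as duplicates.
class HashAndEql < Struct.new(:name, :type)
  def hash
    type.hash
  end

  def eql?(other)
    other.is_a?(HashAndEql) && type == other.type
  end
end

loose = Set.new
loose.add(HashOnly.new("sjogren's disease", :autoimmune))
loose.add(HashOnly.new("lupus", :autoimmune))
loose.size # => 2, same hash bucket but the names differ, so not eql?

strict = Set.new
strict.add(HashAndEql.new("sjogren's disease", :autoimmune))
strict.add(HashAndEql.new("lupus", :autoimmune))
strict.size # => 1, the second autoimmune disease is rejected
```

A matching #hash only gets two objects into the same bucket; #eql? makes the final call on whether they are the same element.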

Needless to say, I learned more about metaclasses while dealing with this code than I had a need to previously and had fun while doing it. Things like this are the benefit of trying out an alternative Ruby implementation, so give it a whirl, with full expectations that some methods may not work as expected. Doesn’t have to be Maglev, either; go out and party with Rubinius or have some drinks with MacRuby. Be forewarned though: they don’t have as cool a logo.

Update: As if you needed proof that the implementation authors care. I had been tweeting about some of the troubles I came across while working with Maglev, and got a wonderful email regarding them from one of GemStone’s founders:

Hi Ecin,

I noticed your tweets.

We can look into this one:
“The metaclass can’t access a class’s class_variables in maglev. This will complicate things…”

If you find other holes, let us know. We’re always glad for anything that helps us make forward progress. Bug reports, fixes, examples, etc.

[…]

Cheers,
Monty