Nasty Ruby Bug Affecting Test::Unit

Introduction

Some time ago, while we were pair-programming, Chris alerted me to a Ruby bug he’d come across which was interfering with the diagnosis of a bug in our application. I couldn’t find much written about it, so I’ve done a bit of investigation and thought I’d post the results here in case they’re useful to anyone else. The bug has long since been fixed, but I’m sure there are still people out there running the affected versions of Ruby 1.8.6.

Ruby bug

As far as I understand it, the bug is in Ruby’s Kernel.at_exit hook. A call to Kernel.exit(false) should cause the process to exit with an exit status of 1, indicating that the process did not complete successfully. With the bug, calling Kernel.exit(false) from within a Kernel.at_exit block incorrectly causes the process to exit with an exit status of 0.
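The behaviour described above can be checked with a small script. This is only a minimal sketch: it assumes the current interpreter can launch itself as a child process, and the names RUBY_BIN and exit_status_of are my own, not anything from Ruby itself.

```ruby
require 'rbconfig'

# Path of the interpreter running this script. RbConfig is available
# from Ruby 1.8.5 onwards; EXEEXT is empty on non-Windows platforms.
RUBY_BIN = File.join(RbConfig::CONFIG['bindir'],
                     RbConfig::CONFIG['ruby_install_name'] +
                     RbConfig::CONFIG['EXEEXT'].to_s)

# Hypothetical helper: runs a one-line script in a child interpreter
# and returns the exit status that child reported.
def exit_status_of(script)
  system(RUBY_BIN, '-e', script)
  $?.exitstatus
end

status = exit_status_of('at_exit { exit(false) }')
puts "exit(false) inside at_exit gave exit status #{status}"
puts(status == 0 ? 'this interpreter looks affected' : 'this interpreter looks fine')
```

An unaffected interpreter should report a status of 1 here; going by the description above, an affected 1.8.6 build would report 0.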

The most relevant bug report is #9300 and the most relevant mailing list thread is made up of:-

The fix seems to be in changeset 12126…

r12126 | nobu | 2007-03-23 16:53:42 +0000 (Fri, 23 Mar 2007) | 9 lines
* eval.c (ruby_cleanup): exit by SystemExit and SignalException in END block.  [ruby-core:10609]
* test/ruby/test_beginendblock.rb (test_should_propagate_exit_code): test for exit in END block.  [ruby-core:10760]
* test/ruby/test_beginendblock.rb (test_should_propagate_signaled): test for signal in END block.

Implications for Test::Unit & Rake::TestTask

The bug has some important consequences. Test::Unit relies on exactly this mechanism, calling exit from within an at_exit block, to report test failures through the exit status. Unfortunately, the bug means that a Test::Unit process will always return an exit status of 0 even when there have been test failures.

From test/unit.rb :-

at_exit do
  unless $! || Test::Unit.run?
    exit Test::Unit::AutoRunner.run
  end
end

This in turn means that a Rake::TestTask process will also always return an exit status of 0 even when there have been test failures. This is significant, because many continuous integration systems rely on Rake::TestTask processes returning an exit status of 1 to indicate that there have been test failures. Thus you will get false positives, i.e. builds that pass despite failing tests – not good.
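To make the effect on a build concrete, here is a rough sketch of the kind of check a continuous integration system performs. It is an illustration under assumptions, not how any particular CI tool is implemented: build_passed? is a hypothetical helper, and the one-line child script merely stands in for a real test run by copying the test/unit.rb pattern of calling exit from at_exit.

```ruby
require 'rbconfig'

# Path of the interpreter running this script (RbConfig exists in 1.8.5+).
RUBY_BIN = File.join(RbConfig::CONFIG['bindir'],
                     RbConfig::CONFIG['ruby_install_name'] +
                     RbConfig::CONFIG['EXEEXT'].to_s)

# Stand-in for a test run with failures: like test/unit.rb, it calls
# exit from at_exit, passing false as a failing runner would.
FAILING_RUN = 'at_exit { exit(false) }'

# Hypothetical CI helper: the build "passes" only if the command exits
# with status 0, which is all the exit-status check can see.
def build_passed?(*command)
  system(*command)
  $?.success?
end

if build_passed?(RUBY_BIN, '-e', FAILING_RUN)
  puts 'build reported as PASSED (wrong: the run had failures)'
else
  puts 'build reported as FAILED (correct)'
end
```

On an affected interpreter the first branch would be taken, which is the false positive described above.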

Affected versions of Ruby

I’ve built and installed a number of versions of Ruby and run tests on them to try to establish which ones are affected. Although they aren’t comprehensive, here are the results…

Affected?  Version
N          ruby 1.8.4 (2005-12-24) [i686-darwin8.10.3]
N          ruby 1.8.5 (2006-08-25) [i686-darwin8.10.3]
N          ruby 1.8.5 (2007-03-16 patchlevel 37) [i686-darwin8.10.3]
N          ruby 1.8.5 (2008-03-03 patchlevel 115) [i686-darwin8.10.3]
Y          ruby 1.8.6 (2007-02-17 patchlevel 0) [i686-darwin8.10.3]
Y          ruby 1.8.6 (2007-03-13 patchlevel 0) [i686-darwin8.10.3]
Y          ruby 1.8.6 (2007-03-16 patchlevel 2) [i686-darwin8.10.3]
Y          ruby 1.8.6 (2007-03-19 patchlevel 4) [i686-darwin8.10.3]
Y          ruby 1.8.6 (2007-05-22 patchlevel 5) [i686-darwin8.10.3]
Y          ruby 1.8.6 (2007-05-22 patchlevel 6) [i686-darwin8.10.3]
Y          ruby 1.8.6 (2007-05-22 patchlevel 7) [i686-darwin8.10.3]
N          ruby 1.8.6 (2007-05-22 patchlevel 8) [i686-darwin8.10.3]
N          ruby 1.8.6 (2007-05-23 patchlevel 9) [i686-darwin8.10.3]
N          ruby 1.8.6 (2007-05-23 patchlevel 10) [i686-darwin8.10.3]
N          ruby 1.8.6 (2007-08-22 patchlevel 50) [i686-darwin8.10.3]
N          ruby 1.9.0 (2007-11-28 patchlevel 0) [i686-darwin8.10.3]