Tuesday, June 12, 2007

Hacking JRuby: BigDecimal and Ruby Internals

I've submitted another patch for JRuby (viewable here), to implement the BigDecimal.mode() class method. In doing so, I learned quite a bit about JRuby's implementation strategy, as well as the internals of the C source code for MRI (Matz's Ruby Implementation).

BigDecimal.mode() explained

BigDecimal.mode() is a funky little method on the BigDecimal class, which is not part of the core API but part of the standard library that ships with Ruby. It's kind of multi-variate -- what it does, exactly, depends on how it's called.

The first parameter to BigDecimal.mode() is required, and it must be a Fixnum: either the constant BigDecimal::ROUND_MODE or the exception mode to be set (more on that later). If it's BigDecimal::ROUND_MODE and there is no second argument, then mode() just returns the current rounding mode. If a second argument is present, it must also be a Fixnum, and it must equal one of the seven rounding modes Ruby recognizes (e.g., BigDecimal::ROUND_UP, BigDecimal::ROUND_FLOOR, etc.). In this case, mode() sets the rounding mode (for all BigDecimals, remember, since this is a class method) to the value of the second argument.
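Here's a minimal sketch of the rounding-mode half of that, assuming a Ruby with the bigdecimal library available. Note that the constant is spelled BigDecimal::ROUND_MODE in the library. The variable names (original, current) are just mine for illustration.

```ruby
require 'bigdecimal'

# One argument: ask for the current rounding mode (an Integer/Fixnum).
original = BigDecimal.mode(BigDecimal::ROUND_MODE)

# Two arguments: set the rounding mode to one of the ROUND_* constants.
BigDecimal.mode(BigDecimal::ROUND_MODE, BigDecimal::ROUND_FLOOR)

# Query again to confirm the change took effect.
current = BigDecimal.mode(BigDecimal::ROUND_MODE)
puts current == BigDecimal::ROUND_FLOOR

# Put things back the way we found them, since the setting is shared.
BigDecimal.mode(BigDecimal::ROUND_MODE, original)
```

Saving and restoring the original value is just good manners here: the setting is not tied to any one BigDecimal instance.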

If the first argument is a Fixnum that is not equal to BigDecimal::ROUND_MODE, then it is expected to have at least one bit set that corresponds to one of the known exception modes (e.g., BigDecimal::EXCEPTION_INFINITY). Again, if there is no second argument, mode() simply reports the current exception mode(s) (each bit in the returned value corresponds to a single exception mode set). If there is a second argument, it must be either 'true' or 'false'. If 'true', mode() sets the mode passed in the first argument. If 'false', mode() unsets (i.e., turns off) the mode passed in the first argument.
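The exception-mode half can be sketched the same way. This assumes the bigdecimal library is available and uses BigDecimal::EXCEPTION_NaN as the example flag; the local names (flag, was_on, and so on) are mine.

```ruby
require 'bigdecimal'

flag = BigDecimal::EXCEPTION_NaN

# One argument: mode() returns a bitmask of the exception modes currently set.
was_on = (BigDecimal.mode(flag) & flag) != 0

# Second argument true: turn the flag on.
BigDecimal.mode(flag, true)
on_after_set = (BigDecimal.mode(flag) & flag) != 0

# Second argument false: turn the flag off again.
BigDecimal.mode(flag, false)
on_after_clear = (BigDecimal.mode(flag) & flag) != 0

# Restore whatever state the flag was in before we touched it.
BigDecimal.mode(flag, was_on)

puts on_after_set
puts on_after_clear
```

The bitwise AND is there because the return value covers every exception mode at once, not just the one passed in.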

Simple, huh?

Not So Fast...

When I picked up this task, mode() was just a default stub that printed a message to the console and returned nil. Not a lot to go on there. So I turned to the MRI source code to figure out just what it was supposed to do.

Introducing: rb_scan_args()

One of the first things MRI does (in a lot of methods, as it turns out) is to call the function rb_scan_args(), which is implemented in the file class.c with the following signature:
int rb_scan_args(int argc, const VALUE *argv, const char *fmt, ...)
It takes the number of arguments passed, a pointer to the array holding the values of those arguments, a format string of some sort, and...some other stuff. The number and values of the arguments are self-explanatory, but the format string and the trailing "other stuff" are decidedly not, so let's take a look at them.

The format string consists, minimally, of two digits. The first digit is the number of required arguments, the second is the number of optional arguments. rb_scan_args parses the format string to find these numbers, then it walks the list of argument values and stuffs each value into its corresponding reference (which is what the "other stuff" in the signature actually is: a group of references to store the values of the arguments in).

For example, BigDecimal.mode() makes this call to rb_scan_args:
if(rb_scan_args(argc,argv,"11",&which,&val)==1) val = Qnil;
In English:

  • get one required argument and store its value in the variable which

  • get the optional second argument if it exists and put its value in val

  • if rb_scan_args returned 1 (i.e., only one argument was provided), then set the value of the optional argument to its default of nil

So this is how MRI Ruby (as implemented in C) handles variable/optional arguments in a general way. There's more to it, of course, including an astonishing bit of hackery with C macros that actually implements putting the argument values in the right place for return. But I won't go into that until I understand it better. Also, the format string allows for the Ruby constructs of "rest args" (indicated by an '*' in the format string) and finally a "block arg" (indicated by an '&').
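To make the required/optional part concrete, here is a rough model of it in plain Ruby. This is emphatically not MRI's code: scan_args is a hypothetical helper of my own, it only understands the two-digit form of the format string (no '*' or '&'), and it returns the values in an array instead of writing them through pointers.

```ruby
# Hypothetical Ruby model of rb_scan_args for two-digit formats such as "11".
# Returns [argc, slot1, slot2, ...] rather than filling in C pointers.
def scan_args(args, fmt)
  required = Integer(fmt[0])  # first digit: number of required arguments
  optional = Integer(fmt[1])  # second digit: number of optional arguments
  max = required + optional

  unless args.length.between?(required, max)
    raise ArgumentError,
          "wrong number of arguments (#{args.length} for #{required}..#{max})"
  end

  # Fill any optional slots that were not supplied with nil.
  slots = args + [nil] * (max - args.length)
  [args.length, *slots]
end

# One argument supplied: the optional slot comes back as nil.
argc, which, val = scan_args([256], "11")
puts argc
puts val.inspect

# Two arguments supplied: both slots are filled.
argc2, which2, val2 = scan_args([256, 7], "11")
puts argc2
```

Returning the argument count first mirrors the C function's return value, which is what the ==1 check in the BigDecimal.mode() call is testing.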

Meanwhile, Back in JRuby...

This has gone on a bit long, so I'll just close by saying that JRuby does not have an equivalent for rb_scan_args(), or at least not one that is called on a per-method basis. The runtime is responsible for bundling arguments and calling the appropriate Java method based on the number of arguments actually present. This causes a bit of a problem right now for class methods that take optional arguments (as BigDecimal.mode() does), but that's a subject for another post.

2 comments:

Ola Bini said...

Hi David. Good work on the BigDecimal patch.

So, we do actually have two different ways of handling argument checking. Both are static methods on org.jruby.runtime.Arity.

The first one is called checkArgumentCount; this takes a runtime, the args array, minimum arguments and maximum arguments. The method will throw an appropriate exception if the argument array it gets is incorrect.

The second version is called scanArgs, and takes the runtime, the argument array, a count of required and a count of optional, and will return an array of length required+optional, with all arguments set, and the rest set to nil.

Neither of these handles rest arguments, of course; you'll have to do that yourself.

David Rupp said...

Thanks, Ola (for the compliment and for the information about Arity). I'll take a look at the codebase and see if I can use checkArgumentCount() and/or scanArgs() to help with my class-method-with-opt-args problem.