The really interesting thing about Git::Wrapper

3 jun 2020

Since I maintain the Git::Wrapper module, I was super happy to see the recent ETOOBUSY post about it. But that post, which starts with the TL;DR “Git::Wrapper is an interesting Perl module.”, didn’t mention what I think is the module’s most interesting feature.

In my opinion, the most interesting part of the module is the core of the implementation (which was originally written by HDP). Because of the way Git::Wrapper is written, it will continue to support any new sub-commands or sub-command options as they’re added to the Git binary, without requiring a single update to the Git::Wrapper code itself!

This is accomplished by the use of Perl’s AUTOLOAD subroutine. AUTOLOAD works like this: if a method is called that doesn’t exist in a package (or any of its parent classes), and that package defines an AUTOLOAD subroutine, Perl calls AUTOLOAD instead of throwing an error about a missing subroutine, and makes the name of the method that was called available to it. (Folks may be familiar with similar features in other languages, like Ruby’s method_missing.)
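If you haven’t seen AUTOLOAD in action before, here is a minimal sketch of the mechanism on its own. The Greeter package and its hello method are made up for illustration and have nothing to do with Git::Wrapper.

```perl
use strict;
use warnings;

package Greeter;   # hypothetical package, for illustration only

our $AUTOLOAD;     # Perl stores the fully qualified method name here

sub new { return bless {}, shift }

# Called for any method that Greeter does not define itself.
sub AUTOLOAD {
  my $self = shift;
  my $name = $AUTOLOAD;             # e.g. "Greeter::hello"
  return if $name =~ /::DESTROY$/;  # ignore object destruction
  return "you called $name with (@_)";
}

package main;

my $g = Greeter->new;
print $g->hello('world'), "\n";  # prints "you called Greeter::hello with (world)"
```

Note that hello is never defined anywhere: the call falls through to AUTOLOAD, which sees both the method name and the original arguments.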

Using AUTOLOAD means the core of the module is as simple as this:

sub AUTOLOAD {
  my $self = shift;

  (my $meth = our $AUTOLOAD) =~ s/.+:://;
  return if $meth eq 'DESTROY';

  $meth =~ tr/_/-/;

  return $self->RUN($meth, @_);
}

Git::Wrapper is an object-based module, so the first thing we do is get the object our missing method was called on. In Perl, you do this by extracting the first element from the special @_ variable. (@_ contains all the arguments that were provided when the method was called. I’m pointing this out now, because it will be relevant here in a paragraph or three.)

  my $self = shift;

Inside the AUTOLOAD method, the name of the method that was called is available in the special $AUTOLOAD package variable (the our declaration lets us refer to it by its short name). That name is always fully qualified, so in this module it is going to be prefixed with the package name, Git::Wrapper::. The substitution in the first line below strips that prefix off, so we get just the bare name of the method that was called.

To give a concrete example, if we had a Git::Wrapper object called $git, and we had a $git->branch() method call, after these lines execute, $meth would be set to branch.

  (my $meth = our $AUTOLOAD) =~ s/.+:://;
  return if $meth eq 'DESTROY';

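(DESTROY is the name of a method that Perl calls on an object when the last reference to it goes away, typically when it goes out of scope, to allow it to run clean-up code. We don’t have any clean-up code to run, and we don’t define a DESTROY method, so Perl ends up calling AUTOLOAD instead. If this AUTOLOAD invocation is because the object is being destroyed, we just bail out; otherwise we would end up trying to run a DESTROY sub-command.)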
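To see what that substitution does in isolation, here is a small sketch. The bare_name helper is hypothetical and exists only for this example; the strings passed in stand in for the values $AUTOLOAD would hold.

```perl
use strict;
use warnings;

# bare_name() is a hypothetical helper, only here to show the substitution.
sub bare_name {
  my ($full) = @_;
  (my $meth = $full) =~ s/.+:://;  # greedy .+ removes everything up to the last "::"
  return $meth;
}

print bare_name('Git::Wrapper::branch'), "\n";   # prints "branch"
print bare_name('Git::Wrapper::DESTROY'), "\n";  # prints "DESTROY"
```

Because .+ is greedy, the pattern strips the whole package prefix no matter how many :: separators it contains.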

Git sub-commands are written in kebab-case, but Perl method names can’t contain hyphens, so callers have to write them in snake_case. We convert from snake_case back to kebab-case in this line:

  $meth =~ tr/_/-/;
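Here is a quick sketch of that transliteration applied to a few method names. The to_subcommand helper is made up for the example; cherry-pick and rev-parse are just two Git sub-commands that happen to contain hyphens.

```perl
use strict;
use warnings;

# to_subcommand() is a hypothetical helper wrapping the transliteration.
sub to_subcommand {
  my ($meth) = @_;
  $meth =~ tr/_/-/;   # replace every underscore with a hyphen
  return $meth;
}

print to_subcommand($_), "\n" for qw(branch cherry_pick rev_parse);
# prints "branch", "cherry-pick" and "rev-parse", one per line
```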

And now, the big finale: we call the RUN method with the transformed method name, and pass in whatever other arguments were provided in the original method call that resulted in AUTOLOAD being invoked. RUN calls the git binary, with the transformed method name as the sub-command — so $git->branch() results in a shell execution of git branch! RUN also handles parsing and organizing the sub-command options, collects the output from the command execution, deals with any errors, and basically does all the actual work.

  return $self->RUN($meth, @_);

The RUN method is factored out into a distinct method instead of being inline in AUTOLOAD because there are some use cases for the module where somebody might want to call the RUN method directly, to take advantage of the error and output parsing in there.
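To watch the whole dispatch without needing Git installed, here is a rough sketch using a stand-in class. Fake::Wrapper is hypothetical, and its RUN is a deliberate simplification: it only returns the command line it would run, whereas the real RUN executes git and handles options, output and errors. The AUTOLOAD body is the same one shown above.

```perl
use strict;
use warnings;

package Fake::Wrapper;   # hypothetical stand-in, not the real Git::Wrapper

sub new { return bless {}, shift }

# Simplified stand-in for RUN: build the command line instead of running it.
sub RUN {
  my ($self, $cmd, @args) = @_;
  return join ' ', 'git', $cmd, @args;
}

sub AUTOLOAD {
  my $self = shift;

  (my $meth = our $AUTOLOAD) =~ s/.+:://;
  return if $meth eq 'DESTROY';

  $meth =~ tr/_/-/;

  return $self->RUN($meth, @_);
}

package main;

my $git = Fake::Wrapper->new;
print $git->branch(), "\n";                  # prints "git branch"
print $git->cherry_pick('abc123'), "\n";     # prints "git cherry-pick abc123"
print $git->RUN('rev-parse', 'HEAD'), "\n";  # prints "git rev-parse HEAD"
```

The last line calls RUN directly, which is the escape hatch described above: the sub-command name is passed as-is, with no AUTOLOAD involved.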

So, that is what I think is the most interesting part of Git::Wrapper: it contains what I’m pretty sure is the only justifiable use of AUTOLOAD that I’ve ever personally worked on!