Groovy variable scope is a massive game of Nomic

2013-10-20 in groovy java
Not my favourite programming language

At first glance, Groovy’s scope rules are much like Java’s. A variable name refers to a local variable, a variable from a containing scope, a member inherited from a superclass or superinterface of an enclosing class, or possibly a static import. Except in contrived cases, the system is fairly easy to reason about, and in any case the compiler will catch you if you do something wrong.
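As a rough illustration of those Java lookup sources, here is a minimal sketch. The class names `ScopeBase`, `ScopeDemo` and `Inner` are made up for this example, and static imports are left out to keep it short.

```java
class ScopeBase {
    String inherited = "superclass field";
}

class ScopeDemo extends ScopeBase {
    String field = "enclosing-class field";

    class Inner {
        String describe() {
            String local = "local variable";
            // Every bare name below is bound by the compiler, not at run time:
            // 'local' lives in this method, 'field' belongs to the enclosing
            // ScopeDemo instance, and 'inherited' comes from its superclass.
            return local + ", " + field + ", " + inherited;
        }
    }

    public static void main(String[] args) {
        // prints "local variable, enclosing-class field, superclass field"
        System.out.println(new ScopeDemo().new Inner().describe());
    }
}
```

Misspell any of the three names and javac rejects the file, which is the safety net the rest of this post is about.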

Groovy, of course, had to take a decent system and try to make it more “dynamic”. In some ways, the scoping system is more like Python, especially in that variable existence is determined partially at run-time. Python’s rules, however, are fairly simple, and make that system basically work.

example.py

class example:
    def function(self):
        # x is assigned something, it *must* be a local variable, which is
        # created if necessary.
        x = 42
        # FOO is not assigned in this scope, it *must* be a global variable or
        # constant
        x += FOO
        # member is explicitly stated to be part of object self; there is no
        # implicit object-member access.
        return x + self.member

One key point, however, is that whether a global variable/constant exists is determined at run-time, so access to such a variable also fails at run-time if the global does not exist. In Python, this isn’t a huge problem, since this issue is limited to function-local variable names.

Groovy merges Python’s simple, dynamic system into Java’s complex, static system, creating a variable resolution system whose error-proneness rivals that of Fortran 77. Then it throws some misunderstood Smalltalk concepts into the mix.

Let’s start small, with Java.

Main.java

package gl.lin;

public class Main {
  public static void main(String args[]) {
    new Main().hello("world");
  }

  private void hello(String whom) {
    System.out.println("hello " + who);
  }
}

If you even try to compile this file, the compiler will helpfully point out that you misspelled the variable in the print statement.

src/main/groovy/gl/lin/Main.java:10: cannot find symbol
symbol  : variable who
location: class gl.lin.Main
    System.out.println("hello " + who);
                                  ^
1 error

If you take this file and rename it to .groovy (and change String args[] to String[] args, since Groovy doesn’t support the C++-style syntax), however, it compiles just fine. But, when you run it:

Exception in thread "main" groovy.lang.MissingPropertyException: No such property: who for class: gl.lin.Main
	at org.codehaus.groovy.runtime.ScriptBytecodeAdapter.unwrap(ScriptBytecodeAdapter.java:50)
	at org.codehaus.groovy.runtime.callsite.GetEffectivePogoPropertySite.getProperty(GetEffectivePogoPropertySite.java:86)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.callGroovyObjectGetProperty(AbstractCallSite.java:231)
	at gl.lin.Main.hello(Main.groovy:10)
	at gl.lin.Main.this$2$hello(Main.groovy)
	at gl.lin.Main$this$2$hello.call(Unknown Source)
	at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:42)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:108)
	at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:116)
	at gl.lin.Main.main(Main.groovy:6)

So far, basically like Python, except for the uselessly verbose stack trace. As with Java, of course, there’s also implicit member access. Here’s where things start to get horrifying, because there are so many places from which a variable may be introduced.

The most obvious way for a variable name to be introduced, other than by defining it in the local scope, is for it to be in the containing class, or in a class or interface that that class inherits from. This is just like Java, so I won’t spend too much time on it. If you add the line

  String who;

anywhere at class scope in Main.groovy, the code runs fine and prints hello null, pretty much just like Java. Also like Java, variables can be captured from enclosing scopes in the case of nested classes.
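For reference, a minimal Java sketch of the same situation (the class name `HelloField` is made up here, and `hello` returns the string instead of printing it so it is easier to check):

```java
class HelloField {
    // Declared but never assigned, so it holds the default value null.
    String who;

    String hello(String whom) {
        // 'who' now resolves to the field above; the parameter 'whom' is
        // still unused, so the original typo has quietly become valid code.
        return "hello " + who;
    }

    public static void main(String[] args) {
        System.out.println(new HelloField().hello("world")); // prints "hello null"
    }
}
```

The difference is only in when the name is resolved: Java binds `who` to the field at compile time, while Groovy finds it when the line runs.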

So far, the scope system is “just like Java”, so why doesn’t the compiler check scopes at compile-time like Java does? A first thought is that maybe Codehaus is just that lazy or incompetent. But no, they actually do check variable scope in static contexts, albeit with a stupidly and needlessly verbose error message.

Static.groovy

package gl.lin;

class Static {
  static void foo() {
    xyzzy = 5;
  }
}

src/main/groovy/gl/lin/Static.groovy: 6: Apparent variable 'xyzzy' was found in a static scope but doesn't refer to a local variable, static field or class. Possible causes:
You attempted to reference a variable in the binding or an instance variable from a static context.
You misspelled a classname or statically imported field. Please check the spelling.
You attempted to use a method 'xyzzy' but left out brackets in a place not allowed by the grammar.
 @ line 6, column 5.
       xyzzy = 5;
       ^

1 error

I have no idea why they thought such an explanation was necessary; “cannot find symbol” really says everything the programmer needs to know (though the third explanation hints at another weirdness I’ll get to later). Regardless, in a static context, you do get scope checking, which really makes Groovy more usable as a procedural language than as an OO language.

The lack of checked non-static scopes is due to two terrible extensions to Java’s scope system: dynamic typing, and Groovy’s “metaclass” system, probably the single most unused “feature” added by the language. We won’t be going into depth on metaclasses today. The important thing is that metaclasses allow you to effectively alter the scoping rules of variables at run-time.

Since Groovy tries to be more “dynamic”, its object system is very much like Python’s or Ruby’s, in that you can refer to a member that is not defined in the declared class of a value. This is sometimes useful for callers (especially in reasonable languages like Ruby or Python), but combined with Java’s implicit scoping rules, it means that a subclass can override its superclass’s captures. See the program below for an illustration.

Override.groovy

package gl.lin;

public class Override {
  String whom = "world";

  class A {
    void doit() {
      System.out.println("hello " + whom);
    }
  }

  class B extends A {
    String whom = "nobody"; // !
  }

  private void run() {
    new A().doit();
    new B().doit(); // !
  }

  public static void main(String[] args) {
    new Override().run();
  }
}

In Java, this program would print

  hello world
  hello world

because the static scoping captures Override.whom in all cases. When run as Groovy, however, you get

  hello world
  hello nobody

This is because Groovy follows a relaxation of Java’s scoping rules at run-time. Since whom is not a local variable in A.doit, it then checks for a property named whom on the A instance. In the first case, it finds none, and proceeds up to the containing class, where it finds Override.whom. In the second case, however, it finds B.whom while running A.doit, and thus stops there. The subclass has changed the scoping rules of its parent.
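To make the two lookup orders concrete, here is a rough Java sketch. `staticLookup` is what Java actually does. `dynamicLookup` and the `resolve` helper are hypothetical, written only to model the run-time order described above with plain reflection; they are not how Groovy is implemented, and the class name `ResolutionDemo` is made up.

```java
class ResolutionDemo {
    String whom = "world";

    class A {
        // Java: 'whom' is bound at compile time to ResolutionDemo.this.whom.
        String staticLookup() {
            return "hello " + whom;
        }

        // Model of the run-time order: the receiver first, then the outer instance.
        String dynamicLookup() {
            return "hello " + resolve("whom", this, ResolutionDemo.this);
        }
    }

    class B extends A {
        String whom = "nobody"; // invisible to A's compile-time binding
    }

    // Hypothetical helper: try each candidate object in order, searching its
    // run-time class and superclasses for a field with the given name.
    static Object resolve(String name, Object... candidates) {
        for (Object candidate : candidates) {
            for (Class<?> c = candidate.getClass(); c != null; c = c.getSuperclass()) {
                try {
                    java.lang.reflect.Field f = c.getDeclaredField(name);
                    f.setAccessible(true);
                    return f.get(candidate);
                } catch (NoSuchFieldException e) {
                    // Not declared on this class; keep climbing.
                } catch (IllegalAccessException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        throw new IllegalStateException("No such property: " + name);
    }

    public static void main(String[] args) {
        ResolutionDemo outer = new ResolutionDemo();
        A a = outer.new A();
        A b = outer.new B();
        System.out.println(a.staticLookup());  // hello world
        System.out.println(b.staticLookup());  // hello world
        System.out.println(a.dynamicLookup()); // hello world
        System.out.println(b.dynamicLookup()); // hello nobody
    }
}
```

The only thing that differs between the last two calls is the run-time class of the receiver, which is exactly the information the compiler does not have when it looks at A on its own.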

This seemingly simple change profoundly complicates the name resolution system, and compromises any safety the compiler could possibly hope to provide. Furthermore, not once have I ever seen anyone intentionally use an unqualified name to refer to a member of a subclass (ignoring for a second how fundamentally wrong it is to do that); most people don’t even know it’s possible, and just assume it’s more Groovy brokenness. In any case, the Groovy name resolution system creates a sprawling mess of possible introductory points, including not only your containing scopes and superclasses, but, if you happen to need to capture variables from another scope, any class that subclasses your class, ever. As a Groovy application grows, the number of possible places a variable could be located only grows, even when the code in question is not modified.

The problems introduced by the metaclass system don’t generally cause trouble in practice, for much the same reason that chainsaw bayonets rarely cause injuries in real life: nobody uses them. Since this isn’t the post about metaclasses, I present the closing example without further comment.

Horror.groovy

package gl.lin;

public class Horror {
  String whom = "world";

  class A {
    void run() {
      hello();
      new Helper().help(this);
      hello();
    }

    void hello() {
      System.out.println("hello " + whom);
    }
  }

  public static void main(String[] args) {
    new Horror().run();
  }

  void run() {
    new A().run();
  }
}

class Helper {
  void help(obj) {
    obj.metaClass.getWhom = {
      return System.getProperty("user.name");
    };
  }
}