Groovy Method Overloading is Dynamically Broken

2013-11-24 in groovy java
A fractal of bad implementation

Java permits the programmer to define more than one function or method of the same name, which can be differentiated based on argument count and types; this is called “overloading”. While most modern languages consider it a fairly bad thing, it allows Java to partially overcome two serious deficiencies:

  • Java does not have default arguments. With overloading, you can define versions of a function with fewer arguments, which simply call the real version with the missing arguments filled in.

  • Java gives no way to generically perform identical operations across disparate types, especially when it comes to arrays. Overloading allows the same function to be copy-pasted with different type declarations, or to do forwarding with conversions, in order to achieve such operations.
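Both workarounds can be sketched in a few lines. This is a minimal illustration only; the class and method names (OverloadDefaults, greet, sum) are made up for the example and do not come from any real project.

```java
public class OverloadDefaults {
    // The "real" version, taking every argument.
    public static String greet(String name, String greeting) {
        return greeting + ", " + name + "!";
    }

    // Overload standing in for a default argument: it fills in the greeting
    // and forwards to the real version.
    public static String greet(String name) {
        return greet(name, "Hello");
    }

    // The same operation copy-pasted per primitive array type, since a
    // single generic method cannot cover both int[] and double[].
    public static double sum(int[] values) {
        double total = 0;
        for (int v : values) total += v;
        return total;
    }

    public static double sum(double[] values) {
        double total = 0;
        for (double v : values) total += v;
        return total;
    }

    public static void main(String[] args) {
        System.out.println(greet("world"));               // Hello, world!
        System.out.println(greet("world", "Goodbye"));    // Goodbye, world!
        System.out.println(sum(new int[] {1, 2, 3}));     // 6.0
        System.out.println(sum(new double[] {0.5, 1.5})); // 2.0
    }
}
```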

The Java compiler resolves all overload usages at compile-time in a well-defined manner. Consider the following example Java program.

Main.java

package gl.lin;

public class Main {
  private static class Base {
    public void overload(Object o) {
      System.out.println("Got an object");
    }

    public void overload(int i) {
      System.out.println("Got an int (primitive)");
    }

    public void overload(Integer i) {
      System.out.println("Got an integer (object)");
    }

    public void overload(Double d) {
      System.out.println("Got a double (object)");
    }

    public void overload(double d) {
      System.out.println("Got a double (primitive)");
    }
  }

  private static class Derived extends Base {
    public void overload(String s) {
      System.out.println("Got a string in the derived class!");
    }
  }

  public static void main(String[] args) {
    System.out.println("Running on Base...");
    runExample(new Base());
    System.out.println("\nRunning on Dervied...");
    runExample(new Derived());
  }

  private static void runExample(Base obj) {
    /* call with specific types */
    line(1); obj.overload(obj);
    line(2); obj.overload(1);
    line(3); obj.overload(1.2);
    line(4); obj.overload(1.2f);
    line(5); obj.overload("str");
    /* call with (essentially) erased types */
    line(6); call(obj, obj);
    line(7); call(obj, 1);
    line(8); call(obj, 1.2);
    line(9); call(obj, 1.2f);
    line(10);call(obj, "str");
  }

  private static void call(Base obj, Object arg) {
    obj.overload(arg);
  }

  private static void line(int n) {
    System.out.print(n + ": ");
  }
}

Running this program in the base project yields the output

$ ./gradlew run

...

Running on Base...
1: Got an object
2: Got an int (primitive)
3: Got a double (primitive)
4: Got a double (primitive)
5: Got an object
6: Got an object
7: Got an object
8: Got an object
9: Got an object
10: Got an object

Running on Dervied...
1: Got an object
2: Got an int (primitive)
3: Got a double (primitive)
4: Got a double (primitive)
5: Got an object
6: Got an object
7: Got an object
8: Got an object
9: Got an object
10: Got an object

On lines 1 through 5, the method overload is called directly with varying types of arguments. On 1, the argument is of type Base; the most specific option (and, in fact, the only one that even applies) is the first, taking Object. On 2 and 3, we pass in primitive types. These exactly match the two overload versions taking primitive types, which are more specific as they do not require “boxing” into object types. Line 4 passes in a float, which gets promoted to double (the primitive) and then works as in 3. Finally, in 5, we pass in a string, which can only go to the version taking Object.

On lines 6 through 10, we pass in the same arguments by proxy of a function taking only Object. Because the static type of the argument is Object, only that particular overload is called.

These ten lines are run for both the Base object and the Derived object. While Derived defines another version of overload, it is not called even on line 5, as the static type Base does not define it.

All in all, Java’s overload rules are fairly intuitive and consistent. The one ambiguous case, passing null where more than one applicable method takes an unrelated object type (here, Integer and Double), is a compile-time error.
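The null case can be seen with a cut-down set of the same overloads. This is a small sketch (the class name NullOverload is made up for illustration); the ambiguous call is left commented out because it does not compile, and a cast is what tells the compiler which overload is meant.

```java
public class NullOverload {
    public static String overload(Object o)  { return "object"; }
    public static String overload(Integer i) { return "integer"; }
    public static String overload(Double d)  { return "double"; }

    public static void main(String[] args) {
        // overload(null);  // rejected by javac: Integer and Double are both
        //                  // applicable and neither is more specific.

        // A cast gives null a static type, so resolution is unambiguous.
        System.out.println(overload((Integer) null)); // integer
        System.out.println(overload((Double) null));  // double
        System.out.println(overload((Object) null));  // object
    }
}
```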

Now rename Main.java to Main.groovy and observe the chaos.

$ ./gradlew clean run

...

Running on Base...
1: Got an object
2: Got an integer (object)
3: Got an object
4: Got a double (primitive)
5: Got an object
6: Got an object
7: Got an integer (object)
8: Got an object
9: Got a double (primitive)
10: Got an object

Running on Dervied...
1: Got an object
2: Got an integer (object)
3: Got an object
4: Got a double (primitive)
5: Got a string in the derived class!
6: Got an object
7: Got an integer (object)
8: Got an object
9: Got a double (primitive)
10: Got a string in the derived class!

There are several things going on here, which primarily revolve around Groovy’s pathological “dynamic” type system. First off, notice that for both classes, lines 1 through 5 exactly match 6 through 10. This occurs because Groovy essentially promotes all values to Object before invoking a method, and at runtime examines the possible options, then decides which overload to call and down-casts the arguments as necessary. Thus, the fact that call erased the static type information is no longer relevant: going through call has the same effect as calling overload directly with what should be (but isn’t) static type information.
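To make the contrast concrete, here is a rough Java emulation of choosing an overload from runtime types. It is only a sketch under loud assumptions: dynamicCall is a hypothetical helper written for this example, it is not Groovy's actual selection algorithm, and it ignores primitives, ties, and multiple arguments. It simply picks, via reflection, the public one-argument overload whose parameter type is the most specific match for the argument's runtime class.

```java
import java.lang.reflect.Method;

public class RuntimeDispatch {
    public static class Base {
        public String overload(Object o)  { return "object"; }
        public String overload(Integer i) { return "integer"; }
    }

    public static class Derived extends Base {
        public String overload(String s) { return "string (derived)"; }
    }

    // Hypothetical helper (an assumption for illustration, NOT Groovy's real
    // algorithm): select an overload using the runtime classes of both the
    // receiver and the argument, ignoring all static type information.
    public static String dynamicCall(Object receiver, Object arg) {
        Method best = null;
        for (Method m : receiver.getClass().getMethods()) {
            if (!m.getName().equals("overload") || m.getParameterCount() != 1) continue;
            Class<?> param = m.getParameterTypes()[0];
            if (!param.isInstance(arg)) continue; // not applicable to this value
            // Keep the candidate with the narrowest parameter type seen so far.
            if (best == null || best.getParameterTypes()[0].isAssignableFrom(param)) {
                best = m;
            }
        }
        if (best == null) throw new IllegalStateException("no applicable overload");
        try {
            return (String) best.invoke(receiver, arg);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Base obj = new Derived();
        Object erased = "str";
        System.out.println(obj.overload(erased));            // object (compile-time choice)
        System.out.println(dynamicCall(obj, erased));        // string (derived)
        System.out.println(dynamicCall(new Base(), erased)); // object
        System.out.println(dynamicCall(obj, 1));             // integer
    }
}
```

The first call is ordinary Java: the compiler only sees Base and Object. The reflective calls behave like the Groovy output above, where the erased static types stop mattering and the subclass's overload becomes visible.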

Line 1 works as expected. As overload(Object) is the only method that can possibly accept a Base (or a Derived), it gets called. Just like Java. The same can’t be said about any of the following lines.

Line 3 is interesting. We pass in a double, which one would expect Groovy to promote to Double, then pass into the respective overload. But that’s not what happens: it goes into the one taking an Object. Some digging reveals that this occurs because Groovy boxes 1.2 into a BigDecimal instead of a Double (because a multi-thousand-fold performance decrease totally doesn’t matter, right?), and perhaps surprisingly doesn’t implicitly demote it back to a Double (or a double), instead preferring to pass it as a raw Object.
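The consequence of the decimal literal being a BigDecimal can be reproduced in plain Java by constructing the BigDecimal by hand. This is a sketch with a made-up class name (DecimalLiteral); it only shows that a BigDecimal is neither a Double nor convertible to double without an explicit call, so Object is the sole overload that accepts it.

```java
import java.math.BigDecimal;

public class DecimalLiteral {
    public static String overload(Object o) { return "object"; }
    public static String overload(Double d) { return "double (object)"; }
    public static String overload(double d) { return "double (primitive)"; }

    public static void main(String[] args) {
        // Stand-in for what the text says Groovy creates for the literal 1.2.
        BigDecimal boxed = new BigDecimal("1.2");
        Object asObject = boxed;

        System.out.println(asObject instanceof Double);        // false
        System.out.println(overload(boxed));                   // object
        // Only an explicit conversion reaches the double overloads.
        System.out.println(overload(boxed.doubleValue()));     // double (primitive)
        System.out.println(overload(Double.valueOf(boxed.doubleValue()))); // double (object)
    }
}
```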

Lines 2 and 4 are interesting when looked at together. On line 4, our float literal is in fact boxed to a Float, which is a legal type to pass in as a primitive double (with an implicit unboxing and promotion). But on line 2, Groovy doesn’t unbox the Integer it creates, instead calling the object Integer version of the method. What’s the difference here? The order the methods are defined in the class. In Groovy, ambiguity is resolved (probably accidentally) by preferring the method defined last in the source file (or the compiled .class file really, but in practice the two are usually the same). Thus, overload(Double) is entirely dead, as it precedes overload(double), whereas overload(int) is dead by means of preceding overload(Integer).

Finally, line 5 demonstrates that Groovy completely ignores its static type information even for the object being called, as it discovers and calls the overload(String) variant which isn’t part of the declared class. In fact, that overload could even be private and static, and Groovy will still call it on lines 5 and 10. The implementation details of a subclass interact with overloading of methods in a superclass!

Of course, the metaclass system also interacts with overloading in strange ways, though not terribly different from what’s been shown here already.

This mayhem is the reason OKish-to-good dynamic languages (Python, Ruby, Tcl, etc.) either don’t have overloading, or require it to be orchestrated more manually (Common Lisp). But Groovy needed to support overloading for backwards compatibility and interop with Java. Yet its designers somehow managed to botch both the design and the implementation, in a way that makes the inherited language feature dangerous and virtually useless due to its unpredictability.