A little lesson learned from Java

A post about good books, language design and JIT compilation, in which one bug turns into another and then back…

Recently I started looking through an excellent book “Java™ Puzzlers”, where Joshua Bloch and Neal Gafter provide a list of Java’s “Traps, Pitfalls, and Corner Cases”, i.e. programs that make you think they do what they really don’t. My idea is to see how many of the puzzlers are ruled out or fixed by Kotlin. I’ve looked through the first 24 items, and 15 of them are fixed in Kotlin, which is over 60%.

Some of the puzzlers can’t be fixed without severe implications for compatibility with the rest of the world: for example, most of the tricky things about IEEE 754 floating-point numbers. But some others, though not fixed in Kotlin yet, may be fixed. One particular example is Puzzler 26, “In the Loop”:

/**
 * Bloch, Joshua; Gafter, Neal (2005-06-24).
 * Java™ Puzzlers: Traps, Pitfalls, and Corner Cases (p. 57).
 * Pearson Education (USA). Kindle Edition.
 */
public class InTheLoop{
    public static final int END = Integer.MAX_VALUE;
    public static final int START = END - 100;

    public static void main(String[] args){
        int count = 0;
        for(int i = START; i <= END; i++){
            count++;
        }

        System.out.println(count);
    }
}

Don’t read further until you figure out what this program prints.

This program prints nothing and simply loops forever, because variable ‘i’ is of type int, and ANY int is less than or equal to Integer.MAX_VALUE: incrementing ‘i’ past that value wraps it around to Integer.MIN_VALUE.
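
The wrap-around is easy to check directly. Below is a minimal sketch in current Kotlin syntax; the helper name nextInt is made up for this illustration and only performs the same 32-bit increment as i++:

```kotlin
// Hypothetical helper: the plain 32-bit increment that `i++` performs.
fun nextInt(i: Int): Int = i + 1

fun main() {
    // Incrementing the largest Int wraps around to the smallest one,
    // so a condition like `i <= Int.MAX_VALUE` can never become false.
    println(nextInt(Int.MAX_VALUE) == Int.MIN_VALUE) // true
    println(Int.MIN_VALUE <= Int.MAX_VALUE)          // true
}
```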

Now, if I write this in Kotlin:

val end = Integer.MAX_VALUE
val start = end - 100
var count = 0
for(i in start..end){
    count++
}

println(count)

It does NOT loop, and prints “101”, which is the size of the range of iteration…

This is the point where you think: “Didn’t he say that this puzzler is not yet fixed by Kotlin?” Yes, I did.

This Kotlin program SHOULD loop forever. And it does not. Sigh. I had already opened the “New issue” dialog in our tracker when I got too curious and looked at the code our compiler emits. You know what? I found nothing bad there. Written in Java (I am your honest decompiler today), it would look like this:

int end = Integer.MAX_VALUE;
int start = end - 100;
int count = 0;
for(int i = start; i <= end; i++){
    count++;
}

System.out.println("count = " + count);

And this TERMINATES and prints “101”. That’s where I really got puzzled.

After some experimentation, I discovered that making variable ‘end’ final makes the program loop forever. “It must be the JIT,” I thought, and I was right: when run with “java -Xint”, this code loops forever, and so does the Kotlin code.

How come? Well, I run a 64-bit JVM. Most likely, the JIT optimizer makes the loop variable 64-bit, because registers are this big, or something like this, and it does not overflow, but just becomes Integer.MAX_VALUE + 1.

Sigh. I closed our “New issue” dialog, and opened HotSpot’s one… (Some technicalities prevent me from finishing the reporting process right now, but I will do it on Monday).

Now, what lesson can we learn from this? I don’t think I can learn much from hitting a JIT bug. Bugs happen — that’s the lesson here, I think.

But what about the initial puzzler? I teach Pascal at high school, and one thing I really like about this language is that a for loop ALWAYS TERMINATES there. We cannot have the same in Kotlin, because in general for uses an iterator that may have arbitrary logic in it. But what we can do is guarantee that iteration over a range of numbers always terminates.

BTW, if a range was just a list of numbers, the loops would terminate, right? So it IS a bug in the Kotlin compiler, after all.
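
One way to get such a guarantee can be sketched by hand. The function below is only an illustration in current Kotlin syntax, not what any compiler actually emits, and the name countInclusive is an assumption: it tests for the last element before incrementing, so the increment can never overflow.

```kotlin
// Hypothetical helper: visits every Int in start..end exactly once and
// always terminates, even when end == Int.MAX_VALUE.
fun countInclusive(start: Int, end: Int): Int {
    var count = 0
    if (start > end) return count // empty range: nothing to visit
    var i = start
    while (true) {
        count++
        if (i == end) break // stop before the increment could overflow
        i++
    }
    return count
}

fun main() {
    println(countInclusive(Int.MAX_VALUE - 100, Int.MAX_VALUE)) // 101
}
```

The price is one extra equality check per iteration instead of relying on the i <= end comparison alone.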

DSLs in Kotlin: Part 1. What’s in the toolbox + Builders

If you have a very nice API, it is the fashion nowadays to call it an internal DSL, because the code that uses such an API reads almost like a language inside your language of choice. Fluent interfaces serve as one of the most popular examples.

Many modern languages provide some advanced means for creating internal DSLs, and Kotlin is no exception here. In this post I will briefly list the features that are useful for this purpose.

Let’s start with extension functions. We are all familiar with Java’s utility classes, like java.util.Collections and the like. Such classes are simply containers for a bunch of static methods, which are intended to be used with such and such classes. So we end up writing code like this:

Collections.sort(list);
int index = Collections.binarySearch(list, x);

and this does not look very pretty. Static imports make it prettier, but they don’t solve an important problem of discoverability: we all navigate through APIs using the IDE’s code completion.

And wouldn’t it be cool to discover those utility functions the same way? So we have extension functions that are called in the form “a.foo()” even if foo() is not a member of the class of a. For example, those utility functions from Collections could be defined as extension functions, and be called like this:

list.sort();
val index = list.binarySearch(x);

These are still statically dispatched utility functions, i.e. the bytecode emitted by the compiler is the same as in Java, but the syntax is better, and code completion works. Note that, unlike members, extension functions cannot be overridden in subclasses, i.e. some special implementation of List could not override sort() to be more efficient.
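
The static dispatch can be seen in a small sketch, written in current Kotlin syntax. The classes Shape and Circle and the function describe are made up for this illustration:

```kotlin
open class Shape
class Circle : Shape()

// Two hypothetical extension functions with the same name.
fun Shape.describe(): String = "shape"
fun Circle.describe(): String = "circle"

// The call is resolved from the declared type of the parameter (Shape),
// not from the runtime class of the object that is passed in.
fun describeAsShape(s: Shape): String = s.describe()

fun main() {
    println(describeAsShape(Circle())) // shape
    println(Circle().describe())       // circle
}
```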

To define an extension function, we just put a receiver type in front of its name:

fun <T : Comparable<T>> List<T>.sort(){
    Collections.sort(this);
}

Note that I can use a ‘this’ reference that represents my receiver object. See more here.

Now, what do extension functions give us, DSL creators? First of all you can turn any interface into a fluent one. For example, let’s create a new buffered reader with a given charset:

val reader = FileInputStream("mytext.txt").buffered().reader("utf-8")

Is it a special class I wrote to be able to get this? No. It’s only two functions:

fun InputStream.buffered() = BufferedInputStream(this);
fun InputStream.reader(charset : String) = InputStreamReader(this, charset)

Then, they play very well together with operator overloading: in Kotlin, most operators, such as plus, minus and so on, are compiled by convention to named function calls. For example, when I say “a + b”, Kotlin reads “a.plus(b)” (see more in our docs). This means that by adding an extension function named “plus” to my type I can have a binary ‘+’ working on it. For example, I could make my own ‘+’ for list concatenation:

fun <T> List<T>.plus(other : List<T>) : List<T> {
    val result = ArrayList(this)
    result.addAll(other)
    return result
}

And call it like this:

val l1 = list(1, 2, 3)
val l2 = list(4, 5, 6)
val l3 = l1 + l2 // a new list of length 6 is created
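
For readers trying this today, here is a minimal sketch of the same convention in current Kotlin, where the function additionally has to be marked operator (and where the standard library already ships a ‘+’ for lists). The Money type is hypothetical and used only to keep the example self-contained:

```kotlin
// Hypothetical type used only for this illustration.
data class Money(val cents: Long)

// An extension function named `plus`, marked `operator`, enables `a + b`.
operator fun Money.plus(other: Money): Money = Money(cents + other.cents)

fun main() {
    val total = Money(150) + Money(275) // resolved as Money(150).plus(Money(275))
    println(total) // Money(cents=425)
}
```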

And there’s more: since indexation is compiled to calls of get() and set() functions, we can have pretty sublists (or “slices”) that look like this:

val sublist = list[a..b]

By defining an extension function get() on a list:

fun <T> List<T>.get(range : IntRange<Int>) : List<T> 
    = subList(range.start, range.end)
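
A runnable variant in current Kotlin syntax might look like the sketch below. It assumes the range is meant to be inclusive on both ends, which is a design choice of this sketch rather than something the post specifies, so the exclusive upper bound handed to subList is last + 1:

```kotlin
// Hypothetical slicing operator: the range is treated as inclusive on both
// ends, so the exclusive upper bound passed to subList is last + 1.
operator fun <T> List<T>.get(range: IntRange): List<T> =
    subList(range.first, range.last + 1)

fun main() {
    val letters = listOf("a", "b", "c", "d", "e")
    println(letters[1..3]) // [b, c, d]
}
```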

Infix function calls add more on top of that, because you can say, for example

it hasPrivilege WRITE

instead of

it.hasPrivilege(WRITE)
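
As a sketch of how such a function could be declared in current Kotlin, where it has to carry the infix modifier; the User and Privilege types are assumptions made up for the example:

```kotlin
enum class Privilege { READ, WRITE }

// Hypothetical user type for illustration.
class User(val name: String, private val privileges: Set<Privilege>) {
    // `infix` allows the call to be written without the dot and parentheses.
    infix fun hasPrivilege(p: Privilege): Boolean = p in privileges
}

fun main() {
    val admin = User("admin", setOf(Privilege.READ, Privilege.WRITE))
    val guest = User("guest", setOf(Privilege.READ))
    println(admin hasPrivilege Privilege.WRITE)  // true
    println(guest.hasPrivilege(Privilege.WRITE)) // false
}
```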

And, of course, you get a whole lot of fun with higher-order functions and function literals (i.e. “closures”). For example, check this out:

lock (myLock){
  // Do something
}

Is this a built-in construct, like Java’s synchronized section? No, it’s a function call. It uses a very handy convention: you can pass the last function literal outside the parentheses you put around your argument list. So this call is the same as “lock(myLock, {…})”, but looks prettier.

More about this example can be found here.
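
A possible definition of such a function, sketched in current Kotlin syntax on top of java.util.concurrent.locks; the exact signature is an assumption, not necessarily the one used in the linked example:

```kotlin
import java.util.concurrent.locks.Lock
import java.util.concurrent.locks.ReentrantLock

// A plain function: acquires the lock, runs the body, always releases the lock.
fun <T> lock(lock: Lock, body: () -> T): T {
    lock.lock()
    try {
        return body()
    } finally {
        lock.unlock()
    }
}

fun main() {
    val myLock = ReentrantLock()
    var counter = 0
    // The function literal is passed outside the parentheses.
    lock(myLock) {
        counter++
    }
    println(counter) // 1
}
```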

There’s one other nice convention that makes something very close to LINQ possible:

users
    .filter{ it hasPrivilege WRITE }
    .map{ it => it.fullName }
    .orderBy{ lastName }

The convention is: If a function with only one parameter is expected, the parameter declaration may be omitted, and the default name ‘it’ will be used. I.e. “filter {it.foo()}” is the same as “filter {it => it.foo()}”.
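
The same convention in a runnable sketch. Note the assumptions: current Kotlin writes the arrow of a function literal as -> rather than =>, the standard library function sortedBy stands in for orderBy, and the User type is made up for the example:

```kotlin
// Hypothetical data for illustration.
data class User(val fullName: String, val canWrite: Boolean)

fun writerNames(users: List<User>): List<String> =
    users
        .filter { it.canWrite }        // implicit single parameter `it`
        .map { user -> user.fullName } // the same kind of parameter, named explicitly
        .sortedBy { it }

fun main() {
    val users = listOf(
        User("Grace Hopper", true),
        User("Alan Turing", false),
        User("Ada Lovelace", true)
    )
    println(writerNames(users)) // [Ada Lovelace, Grace Hopper]
}
```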

And finally, if we put all this (and just a tiny little bit more) together, we can get something really nice. Look at this code:

html {
    head {
        title { + "XML encoding with Kotlin" }
    }

    body {
      h1 { + "XML encoding with Kotlin" }
      p { + "this format is now type-safe" }

      /* an element with attributes and text content */
      a(href = "http://jetbrains.com/kotlin") { + "Kotlin" }
    }
}

Is it Groovy? No, it’s Kotlin, and unlike Groovy, it’s statically typed. Yes, we can do builders like Groovy, but better. I added a detailed explanation of this example to our wiki; you can find it here.
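
To give a rough idea of how the pieces fit together, here is a heavily reduced sketch in current Kotlin syntax. It is not the library from the wiki: the class names, the initTag helper and the string rendering are all assumptions, and only a few tags are covered. It combines function literals with a receiver, the “last function literal outside the parentheses” convention, and an overloaded unary plus:

```kotlin
// A tiny, hypothetical subset of an HTML builder.
open class Tag(val name: String) {
    val children = mutableListOf<Any>()            // nested tags or text
    val attributes = linkedMapOf<String, String>()

    // Runs the initializer with the new child as receiver, then attaches it.
    protected fun <T : Tag> initTag(tag: T, init: T.() -> Unit): T {
        tag.init()
        children.add(tag)
        return tag
    }

    // Unary plus on a String appends it as text content of the current tag.
    operator fun String.unaryPlus() {
        children.add(this)
    }

    override fun toString(): String {
        val attrs = attributes.entries.joinToString("") { " ${it.key}=\"${it.value}\"" }
        return "<$name$attrs>${children.joinToString("")}</$name>"
    }
}

class A : Tag("a")
class P : Tag("p")

class Body : Tag("body") {
    fun p(init: P.() -> Unit) = initTag(P(), init)
    fun a(href: String, init: A.() -> Unit) =
        initTag(A().also { it.attributes["href"] = href }, init)
}

class Html : Tag("html") {
    fun body(init: Body.() -> Unit) = initTag(Body(), init)
}

fun html(init: Html.() -> Unit): Html = Html().apply(init)

fun main() {
    val page = html {
        body {
            p { +"this format is now type-safe" }
            a(href = "http://jetbrains.com/kotlin") { +"Kotlin" }
        }
    }
    println(page)
}
```

Because p and a are members of Body only, calling them directly inside html { } without an enclosing body { } is rejected at compile time, which is where the static typing pays off.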

Multiple Inheritance Part 2: Possible directions

In the previous post in this series we discussed the disadvantages of the inheritance model we initially planned for Kotlin. Today we will talk about alternative designs.

Note that these posts are intended to provoke a discussion, so that we can benefit from your feedback and come up with a better design.

What’s out there

The previous post concluded with the following (incomplete) list of solutions to the problem of multiple inheritance available in other languages:

  • Java and C# have classes and interfaces, i.e. multiple interface inheritance and single implementation inheritance;
  • Scala has classes and traits that may implement methods and even have state, but their constructors can not have parameters;
  • Some other languages, like Fortress, do not allow state in traits;
  • <Your favorite language here>

We all know that Java’s approach is rock-solid, but imposes severe limitations on code reuse, so we would like to relax these limitations, but without getting ourselves into trouble. The “first degree” of relaxing the limitations would be stateless traits (like in Fortress, and in [1]): no state in traits, no implicit overrides. Or we can trade inheritance of traits off for state and get mixins (like in Ruby). Relaxing the limitations even more we get to Scala’s traits that have state but no parameters for constructors, and one trait may override functions of another. Then we get to CZ’s classes with requires (as presented in [2]). The next step, I guess, would already be unrestricted multiple inheritance, like in C++.

We will skip a thorough analysis of each of these solutions, and just make a remark about state.

State. One important consideration is whether to allow multiple inheritance of state in this or that form. On the one hand, it seems to be very useful, but on the other hand, it imposes problems. One problem was discussed in the previous post under the name of Problem 2:

the implementation of Left assumes it’s initialized with 3, but it may call bar() that is implemented in Right and assumes everything is initialized with 4. This may cause some inconsistent behavior.

Another problem is that having state in a unit of inheritance (a class or trait or mixin) implies having a constructor there, and constructors may have side effects, and it’s important that those come in a predictable order.

Problem 2 is rather elegantly fixed by Scala’s approach of having no parameters in the trait constructors. Unfortunately, the problem of constructor side-effects still stands: changing inheritance relations between traits (e.g. done by a library-writer) may reorder side-effects of their constructors upon creating a subclass instance (see this comment below). And this problem seems to be inevitable no matter what approach to multiple inheritance of state we choose (I wish someone could prove me wrong here!).

All that said, I’ll explain a design we are currently considering. As mentioned above, the purpose of this text is to start a discussion, so your feedback is very welcome.

The Kotlin way (Attempt #2)

First, I would like to note that at this point, we prefer conservative solutions, so that we could naturally extend them later if the set of features they provide is not enough.

In short, the current design can be described as follows:

  • Stateless traits: no properties with backing fields, no constructors,
  • that can “require” classes to be present in the set of supertypes of a concrete class that uses the trait,
  • with no automatic resolution for overriding conflicts: if a class inherits two implementations of something, it must override this something and provide its own implementation (i.e., choose from the inherited ones, or write its own from scratch, or mix the two).

So, we refrain from having multiple inheritance of state for now. I think it’s OK if we try it this way and consider relaxing the limitations later, if there’s a real demand for that.

Syntax. Now, let’s render this in some syntax. First question here is “Should we still call those stateless guys classes, or have a special term?” They differ from classes by imposing some limitations, and for now it is that they don’t have any state. If there are only classes, the user will fall into the following situation:

  • Nothing tells me that this class is special, so
  • I add a piece of state, and
  • the compiler complains about having no constructor, so
  • I simply add a constructor, and
  • get errors from some other classes telling me that I broke someone’s supertype lists, so
  • it takes some time before I track down the real cause of the error, which is no good.

It would be a lot better if I knew that this class bears some restrictions in the first place, so I wouldn’t make any changes blindly, and if I make a bad change, the compiler would know that I have violated a local restriction, and would complain appropriately. So, it’s better to have traits differ syntactically from unrestricted classes. So, let’s have a keyword trait in front of the declaration, rather than class.

So, we have classes (state and all, but single inheritance) and traits (no state, no constructor, but multiple inheritance). Traits can declare (“require”, in CZ terms) one superclass, but not initialize it:

open class MyClass(){
    fun foo(){…}
}

trait MyTrait : MyClass {// MyClass is not initialized
    fun bar() {
      foo()// calls foo() from MyClass
    }
}

class Child : MyClass(), MyTrait{// MyClass is initialized

}

class ChildErr : MyTrait{// ERROR: MyClass must be a supertype

}

This allows traits to use members of a base class without interfering with the initialization logic.

One other syntactic issue is whether we should have a single homogeneous supertype list (like in the example above) or something like Java’s “extends” and “implements” clauses, or even Scala’s “extends Class with Trait1 with Trait2 with Trait3” syntax. The idea of making things explicit speaks for some syntactic separation of a class in the supertype list, for it is privileged in some way, i.e. having something like Java’s “extends” and “implements”, at least. On the other hand, we all know this annoying case in Java, when I turn an interface into an abstract class, and have to change all those subclasses that used to implement the interface, and now must extend the class. The change that could be syntactically local becomes non-local. This is why we’re inclined to have a homogeneous supertype list, as in the example above.

Using traits

Now, we prohibit state in traits. It certainly is a significant limitation, but I would like to point out what it is not.

You CAN have properties in your traits. The limitation is that those properties can not have backing fields or initializers, but properties themselves may appear in traits:

trait Trait {
    val property : Int // abstract

    fun foo(){
      print(property)
    }
}

class C() : Trait{
    override val property : Int = 239
}

Our trait declares an abstract property, and the class overrides it with a stateful one. Now, the trait can use the property, and by late binding of calls, if we call foo() on an object of C, we get 239 printed.

You CAN access state in your traits. The previous example shows how you can do it sort of indirectly, by making subclasses override a property you define, but there is another way. Remember that a trait may declare (require) a superclass:

open class A(x : Int){
    val y = x * 2
}

trait B : A {
    fun foo(){
        print(y)
    }
}

class C() : A(239), B{}

In this example, we have a base class A, that defines a concrete property y and initializes it. The trait B extends this class, but does not pass a constructor parameter in, because traits have no initialization code at all. Note that B has access to the property y defined in A. Now, class C extends A and initializes it with 239, and extends B. Extending B is OK because B requires A, and we extend A, all right.

Now, what happens when we call foo() on an instance of C? It prints 478 (239*2), because the value of y is obtained from this instance, and C has initialized A with 239, so A’s initializer has written 239 * 2 there.

Now, let’s look at the last example about traits:

How to resolve overriding conflicts. When we declare many types in our supertype list, it may turn out that we inherit more than one implementation of the same method. For example:

trait A {
  fun foo() { print("A") }
  fun bar()
}

trait B {
  fun foo() { print("B") }
  fun bar() { print("bar") }
}

class C() : A {
  override fun bar() { print("bar") }
}

class D() : A, B {
  override fun foo(){
    super<A>.foo()
    super<B>.foo()
  }
}

Traits A and B both declare functions foo() and bar(). Both of them implement foo(), but only B implements bar() (bar() is not marked abstract in A, because this is the default for traits, if the function has no body). Now, if we derive a concrete class C from A, we, obviously, have to override bar() and provide an implementation. And if we derive D from A and B, we don’t have to override bar(), because we have inherited only one implementation of it. But we have inherited two implementations of foo(), so the compiler does not know which one to choose, and forces us to override foo() and say what we want explicitly.
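
For comparison, here is how the example can be written so that it compiles with current Kotlin, where the trait keyword has been replaced by interface. The functions return strings instead of printing only to make the result easy to inspect. As far as I can tell, the current compiler is stricter than the design described in this post: D has to override bar() as well, even though only one implementation of it is inherited.

```kotlin
interface A {
    fun foo(): String = "A"
    fun bar(): String           // abstract, because it has no body
}

interface B {
    fun foo(): String = "B"
    fun bar(): String = "bar"
}

class C : A {
    // A provides no implementation of bar(), so C must supply one.
    override fun bar(): String = "bar"
}

class D : A, B {
    // Two inherited implementations of foo(): the choice has to be explicit.
    override fun foo(): String = super<A>.foo() + super<B>.foo()

    // Current Kotlin also asks for an explicit decision about bar().
    override fun bar(): String = super<B>.bar()
}

fun main() {
    println(C().foo()) // A
    println(D().foo()) // AB
    println(D().bar()) // bar
}
```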

I think, it’s enough for today. Your comments are very welcome, as usual.

Multiple Inheritance Part 1: Problems with the existing design

I’m back from my vacation, and it’s time to get to one of the biggest issues pointed out in the feedback we received during conference presentations and in the comments to the docs. I’m talking about inheritance.

I plan to write a series of posts on this topic. These posts are intended to provoke a discussion, so that we can benefit from your feedback and come up with a better design.

This is the first post in the series, and I discuss the design we presented in July 2011. It features the following approach to inheritance:

  • there were no interfaces, only classes;
  • each class could have multiple superclasses;
  • if some non-abstract member (property or method) was inherited from two of the supertypes, the compiler required the user to override it and specify manually what code to run.

(For more details, see our wiki as of July 20th 2011.)

This is, basically, the infamous multiple inheritance story, and we remember from the C++ times that it is sort of bad. Let’s look closer.

It’s all about initialization

Let’s take a look at the following example:

abstract class Base(x : Int){…}

open class Left(x : Int) : Base(x) {…}
open class Right(x : Int) : Base(x) {…}

class Child : Left(3), Right(4) {…}

So, we have a diamond: Base at the top, Left and Right on the sides, and Child at the bottom. One thing looks suspicious here: Child initializes its superclasses passing different numbers to them: 3 to Left and 4 to Right. Now, they, in turn, initialize Base with those numbers… What is Base initialized with?

Actually, there are two “instances” of Base created: one, initialized with 3, is hidden inside Left(3), and another, initialized with 4 — inside Right(4). I.e. it works like non-virtual inheritance in C++. (On the Java platform, we implemented it by delegation, which is invisible for the user.)

Now, what happens when you call a function that is defined in Base? For example, let’s say that Base defines two abstract functions:

abstract class Base(x : Int){
    fun foo()
    fun bar()
}

Now, let Left override foo() and Right override bar():

open class Left(x : Int) : Base(x){
    override fun foo() { print(x) }
}

open class Right(x : Int) : Base(x){
    override fun bar() { print(x) }
}

In this case Child inherits two declarations of foo() and two declarations of bar(), but at the same time it inherits only one implementation for each of these functions, so it’s OK, the behavior is determined. So, when we say

val c = Child(0)
c.foo()
c.bar()

The output is

34

Because foo() was called for Left, and bar() was called for Right.

If Child inherited more than one implementation of, say, foo(), the compiler would have complained until we override foo() in Child and specify the behavior explicitly. So, we are guaranteed to have no ambiguity when calling functions of Child.

So far, so good, but there still is something wrong with this approach…

Problem 1: the constructor for Base is called twice whenever we create an instance of Child. It’s bad because if it has side-effects, they are duplicated, and the author of the Child class may not know about it, because someone may have changed the inheritance graph, turning it into a diamond that was not there before.

Problem 2: the implementation of Left assumes it’s initialized with 3, but it may call bar() that is implemented in Right and assumes everything is initialized with 4. This may cause some inconsistent behavior.

Problem 3: being implemented by delegation, deep hierarchies will degrade performance by having long delegation chains.
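
The first two problems can be reproduced with a hand-written approximation in current Kotlin. This is only a sketch of the delegation idea, not the code the 2011 compiler generated: the counter baseInitCount and the private delegate fields are assumptions introduced for the illustration.

```kotlin
// Counts how many times the Base constructor runs (a visible side effect).
var baseInitCount = 0

open class Base(val x: Int) {
    init { baseInitCount++ }
}

class Left(x: Int) : Base(x) {
    fun foo(): Int = x // sees the Base copy initialized by Left
}

class Right(x: Int) : Base(x) {
    fun bar(): Int = x // sees the Base copy initialized by Right
}

// Hypothetical stand-in for `class Child : Left(3), Right(4)`:
// each "superclass" is a hidden delegate with its own copy of Base.
class Child {
    private val left = Left(3)
    private val right = Right(4)
    fun foo(): Int = left.foo()
    fun bar(): Int = right.bar()
}

fun main() {
    val c = Child()
    println(baseInitCount) // 2: the Base constructor ran twice (Problem 1)
    println(c.foo())       // 3
    println(c.bar())       // 4: a different view of the "same" state (Problem 2)
}
```

Each additional level of such forwarding adds one more call per member access, which is the delegation chain Problem 3 refers to.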

(Im)Possible ways of fixing it

Now, how can we fix our design? C++ copes with Problems 1 and 3 by having virtual inheritance. On the Java platform and with separate compilation in mind, I do not think we can get rid of delegation when a class inherits state from two sources, so Problem 3 stands for us anyway. And having two flavors of inheritance is no good, as we learned from C++…

Virtual inheritance does not fix Problem 2: being initialized differently, parts of the inherited implementation may make inconsistent assumptions about the overall state of the object. This problem seems intractable in the general case, but let’s be accurate and make sure it really is.

We could try to guarantee that everything is initialized consistently. In the general case, when we pass arbitrary expressions to Left and Right, there’s no way to be sure they yield the same results, even if they are textually the same. Then, we could impose some constraints here. For example: only allow passing compile-time constants or immutable variables to superclass constructors. In this case the compiler could examine the whole class hierarchy and make sure every base class is initialized consistently. There is a problem, though: if one of the superclasses changes its initialization logic even slightly, subclasses may become inconsistent, so this will be a big evolution problem, for example, for libraries.

And, of course, it would be too restrictive to impose those constraints on all classes. So we end up with two flavors of classes…

Well, it seems that the “there are only classes (i.e. no interfaces or the like)” approach did not work out. Now, it’s time to consider other approaches.

What’s out there

Different languages manage multiple inheritance differently, and I summarize some of the approaches here.

  • Java and C# have classes and interfaces, i.e. multiple interface inheritance and single implementation inheritance;
  • Scala has classes and traits that may implement methods and even have state, but their constructors can not have parameters;
  • Some other languages, like Fortress, do not allow state in traits;
  • <Your favorite language here>

In the next post of this series we will discuss the options in detail.

And now it’s time for your comments. They are very welcome.

The Kotlin issue tracker is now public

Following the tradition of other JetBrains projects, we’ve opened up the issue tracker for Kotlin to the public. In the issue tracker, you can see some of our thinking and things we’re working on, and you can also file issues asking for new features in the language or changes in the current design. We hope that the tracker will let us keep the discussion more structured than the comments in the blog and on Confluence pages.

Why JetBrains needs Kotlin

The question of motivation is one of the first asked when someone learns that someone else is working on a new programming language. Kotlin documentation offers a fairly detailed overview of why the language exists. Still, we would like to make it clearer what exactly JetBrains expects to gain from the whole endeavor. We’re obviously in it for the long run, and yes, we realize it will take years to reach our goals. And here’s why we are willing to make this investment.

First and foremost, it’s about our own productivity. Although we’ve developed support for several JVM-targeted programming languages, we are still writing all of our IntelliJ-based IDEs almost entirely in Java. The IntelliJ build system is based on Groovy and Gant, some Groovy is also used for tests, there is some JRuby code in RubyMine, and that’s it. We want to become more productive by switching to a more expressive language. At the same time, we cannot accept compromises in terms of either Java interoperability (the new language is going to be introduced gradually, and needs to interoperate smoothly with the existing code base) or compilation speed (our code base takes long enough to compile with javac, and we cannot afford making it any slower).

The next thing is also fairly straightforward: we expect Kotlin to drive the sales of IntelliJ IDEA. We’re working on a new language, but we do not plan to replace the entire ecosystem of libraries that have been built for the JVM. So you’re likely to keep using Spring and Hibernate, or other similar frameworks, in your projects built with Kotlin. And while the development tools for Kotlin itself are going to be free and open-source, the support for the enterprise development frameworks and tools will remain part of IntelliJ IDEA Ultimate, the commercial version of the IDE. And of course the framework support will be fully integrated with Kotlin.

The final point is less obvious but still important: new programming languages are a topic that many people really enjoy talking about, and the first days that have passed since we unveiled Kotlin prove that. We see that people who are already familiar with JetBrains trust the company to be able to do a good job with this project. Thus, we believe that this trust and the increasing community awareness of JetBrains will not only drive the company’s business, but will attract even more people to our approach to building development tools, and let them Develop with Pleasure.

And we’d like to reiterate that our work on Kotlin does not in any way affect our investment into other development tools, and in particular the Scala plugin. If you’re already happy with Scala and have no need for another new language, we’ll continue to do our best providing you with first-class Scala development tooling.

Hello World

Today at the JVM Language Summit, JetBrains is unveiling the new project we’ve been working on for almost a year now. The project is Kotlin, a new statically typed programming language for the JVM.

With Kotlin, we’re building upon the many years of experience creating development tools for different languages, and hoping to provide a language which is productive enough for today’s environment and at the same time simple enough for the ordinary programmer to learn.

Right now Kotlin is under active development and nowhere near mature enough to be used by anyone outside of the development team. What you can do today is read the language documentation and leave feedback on the design of the language: what features you like, which ones are missing, what’s confusing and so on.

One thing to note: since we’re a development tools company, we’re building first-class IDE support for Kotlin in parallel with the language itself. And as soon as the language reaches its beta stage (currently planned for the end of 2011), we’ll release both the compiler and the development tools as open-source under the Apache 2 license.

There’s still a huge amount of work ahead of us, and we’re excited to hear what you guys think about our latest endeavor. So let the discussions begin!