Log in

Doing some compact code in Java 8

My good friend Rafael Naufal has posted a very interesting challenge which consists in comparing the expressiveness of different languages to a very single problem involving object containers, and as much as I can see that Java is the more verbose between his examples I could provide a more expressive version of the Java 8 solution:

   1 Map<String, List<String>> mapOfLists = new HashMap<>();
   2 mapOfLists.put("a", Arrays.asList("1", "2", "3"));
   3 mapOfLists.put("b", Arrays.asList("4", "5", "6"));
   4 mapOfLists.put("c", Arrays.asList("7"));
   6 class Pair {
   7     String key, val;
   9     public Pair(String key, String val) {
  10         this.key = key;
  11         this.val = val;
  12     }
  14     public String toString() {
  15         return String.format("[%s, %s]", key, val);
  16     }
  17 }
  19 List<Pair> pairs = new LinkedList<>();
  20 mapOfLists.forEach((k, list) -> list.forEach(e -> pairs.add(new Pair(k, e))));
  21 System.out.println(pairs)

The output is:

[[a, 1], [a, 2], [a, 3], [b, 4], [b, 5], [b, 6], [c, 7]]

The key issues are:

  1. Other languages can do the creation of the Map in pretty much a single line, Java could would be this 4 lines yes;

  2. In the absence of Tuples we must create a 'Pair' class (getting 10 more lines), although I consider the toString significantly improved;

Other than that the actual transformation (line 20 above) is just one line like in the other languages, which says a lot about Java, being statically typed and all (the comparison used Ruby and Groovy).

I also did not use the Java 8 enhanced APIs (particulary the Stream package), but it was also not necessary, but once I study it I can provide another code, being less compact or not.

I'll try to provide an example in both Kotlin and Scala in the future (I think both provide one-liners for class declarations and/or tuples).


From some yesterday posts I got from {1 and 2} to:

  1. LLVM compiler infrastructure to:

  2. Rust language to a comment saying that Go had a GC: "Rust does not use an automated garbage collection system like those used by Go, Java or .NET." and finally to this article:

  3. Go GC: Prioritizing low latency and simplicity;

Which is from 31-Aug-2015 (and will do a follow up of more recent info later on) and says interesting things like:

To create a garbage collector for the next decade, we turned to an algorithm from decades ago. Go's new garbage collector is a concurrent, tri-color, mark-sweep collector, an idea first proposed by Dijkstra in 1978. This is a deliberate divergence from most "enterprise" grade garbage collectors of today ...

And yes, they talk about JVMs GC 'under the lines' several times, which is funny to read (not in a disrespectful way, just trivially funny to see them avoiding to mention the JVM).

They mention this GC which will do as much as possible concurrently (which is a strong characteristic of Golang) and Richard Hudson gives us a nice summarized explanation of the tri-color collector:

In a tri-color collector, every object is either white, grey, or black and we view the heap as a graph of connected objects. At the start of a GC cycle all objects are white. The GC visits all roots, which are objects directly accessible by the application such as globals and things on the stack, and colors these grey. The GC then chooses a grey object, blackens it, and then scans it for pointers to other objects. When this scan finds a pointer to a white object, it turns that object grey. This process repeats until there are no more grey objects. At this point, white objects are known to be unreachable and can be reused.

The "Devil is in the details" section that follows is very honest and elucidating too, but they are explicit running away from giving the users all kinds of configurations to play with the GC strategy of even to change it, which the Oracle's JVMs offers in a very complex|powerful way, instead they want their GC to able to handle things intelligently for the user with only one single configuration or 'knob':

At a higher level, one approach to solving performance problems is to add GC knobs, one for each performance issue. The programmer can then turn the knobs in search of appropriate settings for their application. The downside is that after a decade with one or two new knobs each year you end up with the GC Knobs Turner Employment Act. Go is not going down that path. Instead we provide a single knob, called GOGC. This value controls the total size of the heap relative to the size of reachable objects. The default value of 100 means that total heap size is now 100% bigger than (i.e., twice) the size of the reachable objects after the last collection. 200 means total heap size is 200% bigger than (i.e., three times) the size of the reachable objects. If you want to lower the total time spent in GC, increase GOGC. If you want to trade more GC time for less memory, lower GOGC.

Which is to say the only thing the user can choose is the ratio between the heap size and the number of objects in use (not elligible to be collected), which is also a very common type of GC 'knob' presented (for a long time...) by the Java HotSpot VM Options related to its GC:

  • InitialSurvivorRatio: Sets the initial survivor space ratio used by the throughput garbage collector ...

  • MaxHeapFreeRatio: Sets the maximum allowed percentage of free heap space (0 to 100) after a GC event.

  • MinHeapFreeRatio: Sets the minimum allowed percentage of free heap space (0 to 100) after a GC event.

  • NewRatio: Sets the ratio between young and old generation sizes.

  • SurvivorRatio: Sets the ratio between eden space size and survivor space size.

Not to mention the long avaiable Parallel GC option '+UseParallelGC' that as one could guess "Enables the use of the parallel scavenge garbage collector" and also +UseParallelOldGC, +UseParNewGC, etc.

Credit where it is due Golang designers: even going back to Dijkstra in 1978 there is still some Java inspiration in there, but you just come to cite (not giving names) the things of the Hotspot VM you consider bad and seeks to avoid... not fair. That said, the Go lang goal to prevent the "GC Knobs Turner Employment Act" is noble but we will always have to prepare ourselves to The Law of Leaky Abstractions as the wise Joel remembered us back in 2002.
This article presents some good arguments towards assembly level programming using LLVM Intermediate Representation (IR) code. He shows how LLVM front ends all generate IR code from many distinct languages, then general optimization can take place at the IR code which then can be used to generate the real assembly targeting some processor architecture.

He argues in favour of using the expert assembly knownlegde one might have in order to help to enhance the optmization or the {IR to assembly} steps instead of keep repeating this work manually, which is a strong and good argument and he closes saying:

I think it's amazing that once written pseudo-assembly code can be retargetted to any architecture. For me IR has all the advantages of assembly without any of its problems: fast, expressive, retargettable and maintainable.


I've posted a few entries based on this article about Graal & Truffle, here are some highlights of this article:

Truffle is a framework for writing interpreters with annotations and small bits of extra code in them which, when Truffle is paired with its sister project Graal, allow those interpreters to be converted into JIT compiling VMs … automatically. The resulting runtimes have peak performance competitive with the best hand-tuned language-specific compilers on the market. For example, the TruffleJS engine which implements JavaScript is competitive with V8 in benchmarks. The RubyTruffle engine is faster than all other Ruby implementations by far. The TruffleC engine is roughly competitive with GCC.

This project could advance an order of grandess the quick experimentation|variability and consequent evolution of new programming languages|platforms. The modern state of the art characteristics of successful programming languages are given without effort to those willing to work with this toolset, which is already showing the remarkable results cited in the excerpt above.

Graal is designed from the start as a multi-language compiler, but its set of optimization techniques is especially well suited to compiling programs with high levels of abstraction and dynamism. It runs Java as fast as the existing JVM compilers do, but when applied to Scala programs it runs them about 20% faster. Ruby programs get 400% faster than the best alternative runtime (i.e. not MRI)."

Some notable languages already experimented with are:

  • Smalltalk;

  • LLVM bitcode, allowing C/C++/Objective-C/Swift programs to run on it;

  • Python 3 (yes, 3);

  • R;

  • Lua (Lua can get 'for free' a state of the art VM... awesome);

Mike Hearn also reminds us that:

To give a feel for how easy it is to write these engines, TruffleJS is only about 80,000 lines of code compared to about 1.7 million for V8.

80k to 1.7mi? That is a gap... and we can thank Oracle Labs for that, and they say:

Graal is a new just-in-time compiler for the JVM, written in Java and focused on performance and language interoperability. Graal offers performance advantages to Java code, thanks to new techniques of method inlining, eliding object allocations and speculative execution - and makes possible high-performance scripting language engines.

Fast scripting languages, this is addressing one of the most common flaws of languages of this nature, one that has become the most popular in the past years.
Recently I've posted some entries citing LLVM and IR, amidst the texts I was reading one came with real good summarized definitions for both terms:

LLVM is an umbrella project for a modular and reusable compiler infrastructure written in C++. It includes a compiler frontend clang for compiling C, C++, Objective C and Objective C++ to LLVM bitcode IR. Many of the other tools such as the optimizer opt, assembler, linker, and backends then operate on the LLVM IR, to finally produce machine code. LLVM envisions that transformations and analyses can be applied during compile-time, link-time, runtime, and offline.

LLVM IR is a language that resembles assembler, but which provides type-safety and has virtual registers that are in Static Single Assignment (SSA) form.

Here is some pieces of human readeable generated LLVM IR:

@.str = private unnamed_addr constant [14 x i8]
    c"Hello World \0A\00", align 1
@str = internal constant [13 x i8] c"Hello World \00"

define i32 @main() nounwind uwtable {
%puts = tail call i32 @puts(i8* getelementptr inbounds
  ([13 x i8]* @str, i64 0, i64 0))
ret i32 0

One could recognize some traces of the classic C hello world below that was used to generate the IR above:

#include <stdio.h>
int main() {
    printf("Hello World \n");

All this from the Sulong project intro page:

Sulong (Graal LLVM) is an interpreter for LLVM IR written in Java using the Truffle language implementation framework and Graal as a just-in-time (JIT) compiler.

With Sulong you can execute C/C++, Fortran, and other programs written in a LLVM language on the JVM. To execute a program by Sulong, you have to compile the program to LLVM IR by a LLVM front end such as Clang. By using Truffle and Java the interpreter implementation is simple and is thus a great platform for experimentation. On the other hand, dynamic optimizations and JIT compilation with Graal still provides native execution speed (improving performance is work in progress). Through Truffle's language interoperability capabilities, you will soon be able to call functions from/to other languages on Truffle such as Ruby, JavaScript, or R.

bitcode vs. bytecode

This statement at this article about Graal & Truffe got me thinking about the differences between bitcode and bytecodes (or portable code):

Most Truffle runtimes interpret source code, but there’s nothing that says you have to do that. The Sulong project is creating a Truffle interpreter for LLVM bitcode.

So I found this definition of what are bit codes at the LLVM documentation (IR = Intermediate Representation):

What is commonly known as the LLVM bitcode file format (also, sometimes anachronistically known as bytecode) is actually two things: a bitstream container format and an encoding of LLVM IR into the container format.

The bitstream format is an abstract encoding of structured data, very similar to XML in some ways. Like XML, bitstream files contain tags, and nested structures, and you can parse the file without having to understand the tags. Unlike XML, the bitstream format is a binary encoding, and unlike XML it provides a mechanism for the file to self-describe “abbreviations”, which are effectively size optimizations for the content.

LLVM IR files may be optionally embedded into a wrapper structure, or in a native object file. Both of these mechanisms make it easy to embed extra data along with LLVM IR files

So looking further I found helpful these two bits of information at Stack Overflow:

The format is literally a bitstream, not a bytestream. See this document for more details: http://llvm.org/docs/BitCodeFormat.html

The biggest difference between JVM bytecode and and LLVM bitcode is that JVM instructions are stack-oriented, whereas LLVM bitcode is not. This means that rather than loading values into registers, JVM bytecode loads values onto a stack and computes values from there. I believe that an advantage of this is that the compiler doesn't have to allocate registers, but I'm not sure.

LLVM bitcode is closer to machine-level code, but isn't bound by a particular architecture. For instance, I think that LLVM bitcode can make use of an arbitrary number of logical registers.

And openned the path to this comment "ARC in Swift is much better than Java GC" pointing to this article at Quora: Why doesn't Apple Swift adopt the memory management method of garbage collection like in Java? Which I'll definately read in the next days!
I was remembering details of the LLVM project (or more specifically an umbrella of projects), driven by this wonderful article about Graal & Truffle:

LLVM support
Most Truffle runtimes interpret source code, but there’s nothing that says you have to do that. The Sulong project is creating a Truffle interpreter for LLVM bitcode.

Sulong is still very new and code run this way has many limitations. But by running bitcode with Graal & Truffle, the framework should in theory gain support for not only C, but also C++, Objective-C, FORTRAN, Swift and potentially even Rust.

Then Rust came to be mentioned and I was surprised to know that it's heavly tailored for performance and parallel processing, which is a best fit for a experimental web browser layout engine like Servo.

The prototype seeks to create a highly parallel environment, in which many components (such as rendering, layout, HTML parsing, image decoding, etc.) are handled by fine-grained, isolated tasks. Source code for the project is written in the Rust programming language.

Not only that but Rust got the 1st place at a survey of Stack Overflow for the Most Loved Programming Language in 2016.

Rust is a general-purpose, multi-paradigm, compiled programming language sponsored by Mozilla Research. It is designed to be a "safe, concurrent, practical language", supporting pure-functional, imperative-procedural, and object-oriented styles.

The language grew out of a personal project by Mozilla employee Graydon Hoare. Mozilla began sponsoring the project in 2009 and announced it in 2010. The same year, work shifted from the initial compiler (written in OCaml) to the self-hosting compiler written in Rust. Known as rustc, it successfully compiled itself in 2011. rustc uses LLVM as its back end.
Although its development is sponsored by Mozilla, it is an open community project. The design of the language has been refined through the experiences of writing the Servo web browser layout engine and the Rust compiler.

Delphi is still alive

Surprised to know that Delphi is not dead but doing even mobile apps and some modern features like:
... the language has grown and supports many other modern language features, including generics and anonymous methods ...

Kotlin Hello World

Yesterday I did a Hello World in Kotlin following the first free chapter of the book Kotlin in Action:

   1 data class Person(val name:String, val age: Int? = null)
   3 fun main(args: Array<String>) {
   5     val persons = listOf(Person("Alice", age=35), Person("Bob", age=27))
   6     val oldest = persons.maxBy { it.age ?: 0 }
   8     println("The oldest is: $oldest")
   9 }

Using this nice web IDE for experimentation: try.kotlinlang.org there is an option to see the kotlin source code converted to Javascript:

   1 Kotlin.out.flush();
   2 (function (Kotlin) {
   3   'use strict';
   4   var _ = Kotlin.defineRootPackage(null, /** @lends _ */ {
   5     Person: Kotlin.createClass(null, function (name, age) {
   6       if (age === void 0)
   7         age = null;
   8       this.name = name;
   9       this.age = age;
  10     }, /** @lends _.Person.prototype */ {
  11       component1: function () {
  12         return this.name;
  13       },
  14       component2: function () {
  15         return this.age;
  16       },
  17       copy_bm4lxs$: function (name, age) {
  18         return new _.Person(name === void 0 ? this.name : name, 
  19             age === void 0 ? this.age : age);
  20       },
  21       toString: function () {
  22         return 'Person(name=' + Kotlin.toString(this.name) + (', age=' + 
  23             Kotlin.toString(this.age)) + ')';
  24       },
  25       hashCode: function () {
  26         var result = 0;
  27         result = result * 31 + Kotlin.hashCode(this.name) | 0;
  28         result = result * 31 + Kotlin.hashCode(this.age) | 0;
  29         return result;
  30       },
  31       equals_za3rmp$: function (other) {
  32         return this === other || (other !== null && (typeof other === 'object' && 
  33             (Object.getPrototypeOf(this) === Object.getPrototypeOf(other) && 
  34                 (Kotlin.equals(this.name, other.name) && 
  35                     Kotlin.equals(this.age, other.age)))));
  36       }
  37     }),
  38     main_kand9s$: function (args) {
  39       var oldest;
  40       var persons = Kotlin.modules['stdlib'].kotlin.collections.listOf_9mqe4v$(
  41           [new _.Person('Alice', 35), new _.Person('Bob', 27)]);
  42       maxBy_l82ugp$break: {
  43         var iterator = persons.iterator();
  44         if (!iterator.hasNext()) {
  45           oldest = null;
  46           break maxBy_l82ugp$break;
  47         }
  48         var maxElem = iterator.next();
  49         var tmp$0;
  50         var maxValue = (tmp$0 = maxElem.age) != null ? tmp$0 : 0;
  51         while (iterator.hasNext()) {
  52           var e = iterator.next();
  53           var tmp$1;
  54           var v = (tmp$1 = e.age) != null ? tmp$1 : 0;
  55           if (Kotlin.compareTo(maxValue, v) < 0) {
  56             maxElem = e;
  57             maxValue = v;
  58           }
  59         }
  60         oldest = maxElem;
  61       }
  62       Kotlin.println('The oldest is: ' + Kotlin.toString(oldest));
  63     }
  64   });
  65   Kotlin.defineModule('moduleId', _);
  66   _.main_kand9s$([]);
  67 }(Kotlin));
  69 Kotlin.out.buffer;
That is cool to see.

Latest Month

July 2016



RSS Atom
Powered by LiveJournal.com
Designed by Tiffany Chow