I have found it!
A lightweight, open-source,
platform-independent
nd-array library
for the JVM
Platform-independent,
extensible, and
compatible
with any JVM language
Completely open-source
and free to use,
MIT-licensed
Vendor-agnostic
GPU acceleration
through OpenCL
Why Neureka?
Not only is it a flexible nd-array library for general-purpose use, it is also a tensor library for doing deep learning.
Neureka trains your neural network using a computation graph recorder.
This is contrary to the approach found in other frameworks such as TensorFlow, Theano, Caffe, and CNTK,
which require the definition of a computation graph ahead of time.
In that case a developer has to build a neural network structure that
cannot change during runtime.
Neureka, on the other hand, uses the recorded
computation graph to apply a technique called
reverse-mode auto-differentiation.
This technique allows your network structure to change arbitrarily during
runtime, with no additional lag or overhead.
This powerful feature was inspired by PyTorch,
which also uses a dynamic computation graph to achieve such a high degree
of flexibility.
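Because the graph is recorded while the forward pass runs, ordinary control flow can reshape the network on every call. Here is a minimal Groovy sketch of that idea, using only the Tensor calls shown in the example further below (the random branch merely stands in for any decision made at runtime):
var x = Tensor.of(2d).setRqsGradient(true)
var w = Tensor.of(3d)
var y = x * w                      // the forward pass records the graph as it runs
if ( new Random().nextBoolean() )  // network structure decided at runtime...
    y = ( y + x ) ** 2             // ...this branch simply records additional nodes
y.backward(1)                      // reverse-mode auto-differentiation over whichever
                                   // graph was actually recorded in this call
println x.getGradient()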
Why Java?
Although Java is a robust and safe language, it
is oftentimes considered too verbose and explicit for
simple prototyping or more exploratory workloads...
Therefore, popular machine learning and tensor / deep learning libraries
rely on Python, which in many cases offers a more concise syntax.
So one might come to wonder: why would anybody ever build
a deep learning library for Java?
The answer is simple!
Nobody did!
This library was written for all JVM languages:
Groovy, Kotlin, Scala, and Jython, just to name a few.
Take a look at the following example!
Neureka can be used from any language which compiles to, or understands,
JVM bytecode!
var x = Tensor.of(3d).setRqsGradient(true) // a tensor which requires a gradient
var b = Tensor.of(-4d)                     // bias
var w = Tensor.of(2d)                      // weight
var y = ((x+b)*w)**2                       // the forward pass records the computation graph
y.backward(1)                              // back-propagate (reverse-mode auto-differentiation)
// x.getGradient(): "(1):[-8]"
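The reported gradient checks out by hand: y = ((x+b)*w)**2, so dy/dx = 2*(x+b)*w**2 = 2*(3-4)*2**2 = -8.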
If you prefer fast prototyping
with Jupyter, then Neureka can be used there too.
BeakerX
is a Jupyter extension that supports many JVM
languages, such as Groovy, Scala, Clojure, Kotlin, and Java.
Performance
Not only are the operations within the default backend implemented to be as generalized, modular, and concise as possible, they are also optimized for multi-threading and specifically designed to be auto-vectorized by the JVM into SIMD machine-code instructions.
Performance-wise, however, Neureka still has lots of room for improvement. Because it is a lightweight and highly extensible library with a consistent API designed to allow support for any backend, you can easily go the extra mile and improve performance for your specific use case, for example by implementing a more specialized kind of OpenCL kernel for convolution...
Currently Neureka is mostly held back
by the JVM not allowing for more memory-localized types and
by the lack of an API for consistent SIMD vectorization
(...take a look at the upcoming Vector API...).
This upcoming Vector API, alongside the introduction of inline/value types from Project Valhalla, will greatly benefit the performance of Neureka as well as improve machine learning on the JVM in general.
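To give a sense of what that Vector API looks like, here is a minimal Groovy sketch of an explicitly vectorized element-wise multiplication using the JDK's incubating jdk.incubator.vector module. This is plain JDK code, not part of Neureka's backend, and the method is only illustrative:
// Requires a recent JDK, launched with: --add-modules jdk.incubator.vector
import jdk.incubator.vector.FloatVector

def multiply(float[] a, float[] b, float[] c) {
    def species = FloatVector.SPECIES_PREFERRED   // widest SIMD shape the CPU supports
    int upper = species.loopBound(a.length)
    int i = 0
    while (i < upper) {                           // vectorized main loop
        def va = FloatVector.fromArray(species, a, i)
        def vb = FloatVector.fromArray(species, b, i)
        va.mul(vb).intoArray(c, i)                // one SIMD lane-wise multiplication
        i += species.length()
    }
    while (i < a.length) {                        // scalar tail for the remaining elements
        c[i] = a[i] * b[i]
        i++
    }
}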