NML -- The ultimate Lambda for Scientific Programming
 
 
When we first began implementing NML in OCaml back in 1999, the only unboxed arrays of floats the language offered were its float Vector objects.
 
We needed to interface with external data sources and sinks, and we wanted fewer restrictions on the maximum possible size of arrays. So in addition to OCaml float Vectors we implemented our own Foreign Arrays, and then made NML treat Array objects as either OCaml float Vectors or our Foreign Array objects.
 
After several years the OCaml team came up with their own version of something similar to our ForeignArrays, called BigArrays, but we continued along our own lines, not wanting to go through the effort to change over.
 
Recently we have been working on Mac OS X, and in many cases we would like to use its implementations of the DSPLib and LAPACK routines. These are blazing fast implementations, originally written for the PowerPC AltiVec engine and now also for the Intel Core Duo with its SSE and SSE2 extensions. But these routines have strict data alignment requirements that must be met to obtain the maximum processing speed.
 
Our Foreign Arrays always offered the proper data alignment, but OCaml float Vectors still do not, unless you operate in a 64-bit environment. OCaml BigArray objects, however, use unboxed arenas that are external to the OCaml heap, and they call the system malloc() routine to create those data arenas. The OS X malloc() routine guarantees proper data alignment for high-performance vectorized math processing.
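As a rough illustration of the kind of allocation described above, here is a minimal sketch that creates a one-dimensional array of doubles with the standard Bigarray module. The helper name make_doubles is ours, purely for illustration, and the sketch assumes a recent OCaml (4.07 or later) in which Bigarray ships with the standard library. It does not check the alignment of the underlying storage, which cannot be observed from plain OCaml code.

```ocaml
open Bigarray

(* Allocate n doubles outside the OCaml heap, in C layout (0-based indices).
   make_doubles is a hypothetical helper name, not part of NML. *)
let make_doubles n : (float, float64_elt, c_layout) Array1.t =
  let a = Array1.create float64 c_layout n in
  Array1.fill a 0.0;   (* Array1.create leaves the contents uninitialized *)
  a

let () =
  let a = make_doubles 1024 in
  a.{0} <- 3.14;
  Printf.printf "dim = %d, a.{0} = %g\n" (Array1.dim a) a.{0}
```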
 
We finally bit the bullet and made the switch to OCaml BigArrays, alongside our ForeignArrays, for all of our numeric arrays. The switchover to OCaml BigArrays was not really so horrendous. I think it took all of 1-2 hours to do. Such is the beauty of OCaml as the implementation language. Its rigid type checking helped make that transition relatively painless.
 
We still needed the ability to access foreign data, which is why the NML GeneralArray type is defined as:
 
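open Bigarray   (* brings Array1, float64_elt and c_layout into scope *)

type general_double_array = (float, float64_elt,
                                c_layout) Array1.t
(* foreign_array is NML's own Foreign Array type, defined elsewhere *)
type general_array =
    ML_Array      of general_double_array
  | Foreign_Array of foreign_array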
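
To show how code might dispatch over this two-case type, here is a small self-contained sketch. The real foreign_array definition is not shown in this post, so the record below (with its base_address and n_elements fields) is only a hypothetical stand-in, and the length function is our own illustrative name rather than an actual NML routine.

```ocaml
open Bigarray

type general_double_array = (float, float64_elt, c_layout) Array1.t

(* Hypothetical stand-in: the real NML foreign_array is not shown here. *)
type foreign_array = { base_address : nativeint; n_elements : int }

type general_array =
    ML_Array      of general_double_array
  | Foreign_Array of foreign_array

(* Number of elements, whichever representation holds the data. *)
let length = function
  | ML_Array a      -> Array1.dim a
  | Foreign_Array f -> f.n_elements

let () =
  let a = ML_Array (Array1.of_array float64 c_layout [| 1.0; 2.0; 3.0 |]) in
  let f = Foreign_Array { base_address = 0n; n_elements = 5 } in
  Printf.printf "%d %d\n" (length a) (length f)
```

Because the variant is closed, the compiler warns if a match forgets one of the two cases, which is the sort of type checking that made the switchover quick.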
 
But for the most part now, OCaml BigArrays carry the majority of the unboxed array values in NML. There is no severe size restriction on OCaml BigArrays as there was on OCaml Vectors, and BigArrays are always properly data aligned for DSPLib speed.
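
For anyone curious about the size limit on their own build, the following sketch prints the maximum length of an ordinary float array and then copies a small one into a BigArray. It assumes OCaml 4.08 or later for Sys.max_floatarray_length, and to_bigarray is just an illustrative name of ours. The printed limit depends on the platform, so no particular value is claimed here.

```ocaml
open Bigarray

(* Copy an ordinary OCaml float array into a Bigarray of doubles.
   to_bigarray is a hypothetical helper name, not part of NML. *)
let to_bigarray (v : float array) : (float, float64_elt, c_layout) Array1.t =
  Array1.of_array float64 c_layout v

let () =
  Printf.printf "max float array length on this build: %d\n"
    Sys.max_floatarray_length;
  let b = to_bigarray (Array.init 10 float_of_int) in
  Printf.printf "copied %d elements, last = %g\n" (Array1.dim b) b.{9}
```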
 
In the past, whenever an array value might exceed the maximum permissible size of an OCaml Vector, we had to coerce the data to one of our ForeignArray objects. That is no longer necessary.
 
As an experiment in speeding up the system, at the cost of some portability from Mac OS X to Windows XP, we have been slowly replacing some of the NML kernel vectorized math routines with calls to the lower level DSPLib routines. Those specialized routines presumably make use of the AltiVec and SSE/SSE2 extensions.
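
To give a feel for what such a kernel routine looks like before it is swapped out, here is a sketch of a plain OCaml element-wise add over BigArrays. It is not the actual NML kernel code, and the names vec and vadd are ours. A tuned library routine would be reached through a C stub bound with an external declaration, which is not shown here.

```ocaml
open Bigarray

type vec = (float, float64_elt, c_layout) Array1.t

(* Plain OCaml element-wise add: dst.{i} <- a.{i} +. b.{i}.
   vadd is a hypothetical name; a vendor routine could replace the loop. *)
let vadd (a : vec) (b : vec) (dst : vec) =
  let n = Array1.dim a in
  if Array1.dim b <> n || Array1.dim dst <> n then
    invalid_arg "vadd: length mismatch";
  (* Lengths were checked above, so the unchecked accessors are safe. *)
  for i = 0 to n - 1 do
    Array1.unsafe_set dst i (Array1.unsafe_get a i +. Array1.unsafe_get b i)
  done

let () =
  let a = Array1.of_array float64 c_layout [| 1.0; 2.0; 3.0 |] in
  let b = Array1.of_array float64 c_layout [| 10.0; 20.0; 30.0 |] in
  let dst = Array1.create float64 c_layout 3 in
  vadd a b dst;
  Printf.printf "%g %g %g\n" dst.{0} dst.{1} dst.{2}
```

Keeping the OCaml version around as a fallback is one way to limit the portability loss on platforms without the tuned libraries.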
 
(AltiVec cannot vectorize double-precision float data, only single-precision floats, but the SSE2 extensions can handle double-precision floats just fine.)
 
Perhaps in time, M$ will support LAPACK the way Apple has done, making Windows machines more science friendly. But M$ also needs something as elegant as Display PDF like Apple has, for nice quality graphical output.
Saturday, February 3, 2007
Recent Changes to Arrays in NML