The Mathematics of Arrays is an algebra for representing operations on arrays that was created by Lenore Mullin, after she had worked on the APL interpreter. It separates the Denotational Normal Form of the array computation, which just thinks about arrays as a shape and a function form indices to value, from the Operational Normal Form, which deals with memory layout (strides, etc). Her PSI calculus compiler defines a number of reductions in the DNF, so that complex array operations can be reduced to basic indexing operations and computation, which reduce the need for intermediate values. Then the DNF is transformed to the ONF through one of the layout functions (gamma) for different implementations dependending on the hardware requirements.
I am working on understanding how this compiler works and what is possible with it. I wanted to make it more accessible for others to look at.
How can we represent this MoA structure in Python? Can we create a friendly lazy array API that can then target different backends for the generated array code? These could be native Python, NumPy, XND, llvm, TVM, Tensorflow, PyTorch, Tensor Comprehensions, etc. What about a sparse or compressed array backend?
Then, could we create a subset of the NumPy API on top of that? For example, starting with what xarray and sparse arrys need.
Finally, could we extend this to variable dimension arrays? If we were able to support these, then we effectively have support for arbitrary nested structures, and could see how this could be used to target the Parquet/Arrow/JSON use cases.
This work is supported by Quansight. If you are interested in discussing this further, please reach out.