Releases: morloc-project/morloc

Sockets and Shared Memory (10,000X speedup)

02 Jan 17:31

This release includes a full rewrite of the morloc backend. In prior releases,
every cross-language call required writing all arguments to files on disk and
calling a morloc-generated executable for the given language with the
arguments as temporary files. For interpreted languages like R and Python,
starting these executables required initializing the interpreter at a cost
of ~300ms or ~50ms, respectively, so cross-language calls were very expensive.

This release replaces file-based communication with shared memory and cold calls
to executables with UNIX domain socket messages between daemons. Compiling a
morloc module creates a nexus executable that serves as the command-line
interface to the exported functions. The nexus accepts function arguments as raw
JSON, JSON files or MessagePack binary files. Calling a specific function will
first initialize a daemon for every language the morloc function uses. Each
daemon listens over a UNIX domain socket for commands (either from the nexus or
another pool). When the nexus or a language daemon makes a cross-language call,
arguments are converted to a generic binary form in a shared memory pool. The
relative pointers to these arguments are sent to the downstream daemon via a
message over a UNIX domain socket. The downstream daemon performs a computation,
writes the result back to the shared memory, and returns a message over the
socket telling the caller where to find the result. These messages also encode
error status, allowing error messages and possibly other metadata to propagate
between languages and ultimately back to the user.
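
The call sequence described above can be sketched in Python. This is only an illustration under stated assumptions, not morloc's actual implementation: the message layout (a status byte, a relative offset, and an element count), the names daemon and call_inc, and the use of a thread in place of a separate daemon process are all invented for the example. It assumes Python 3.8+ on a POSIX system.

```python
import socket
import struct
import threading
from multiprocessing import shared_memory

# Hypothetical message layout: status byte, relative offset, element count.
MSG = struct.Struct("<BQQ")
OK, FAIL = 0, 1


def recv_exact(sock, n):
    """Read exactly n bytes from a stream socket."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed early")
        buf += chunk
    return buf


def daemon(sock, shm_name):
    """Stand-in for a language daemon: increments the integers it is pointed at."""
    shm = shared_memory.SharedMemory(name=shm_name)
    try:
        _, offset, count = MSG.unpack(recv_exact(sock, MSG.size))
        try:
            xs = struct.unpack_from(f"<{count}q", shm.buf, offset)
            out_offset = offset + 8 * count  # write the result after the input
            struct.pack_into(f"<{count}q", shm.buf, out_offset, *(x + 1 for x in xs))
            sock.sendall(MSG.pack(OK, out_offset, count))
        except struct.error:
            # Report failure through the status field instead of crashing.
            sock.sendall(MSG.pack(FAIL, 0, 0))
    finally:
        shm.close()
        sock.close()


def call_inc(xs):
    """Caller side: put arguments in shared memory and send only their location."""
    shm = shared_memory.SharedMemory(create=True, size=max(16 * len(xs), 16))
    caller, callee = socket.socketpair()  # AF_UNIX on POSIX systems
    worker = threading.Thread(target=daemon, args=(callee, shm.name))
    worker.start()
    try:
        struct.pack_into(f"<{len(xs)}q", shm.buf, 0, *xs)
        caller.sendall(MSG.pack(OK, 0, len(xs)))
        status, offset, count = MSG.unpack(recv_exact(caller, MSG.size))
        if status != OK:
            raise RuntimeError("daemon reported an error")
        return list(struct.unpack_from(f"<{count}q", shm.buf, offset))
    finally:
        caller.close()
        worker.join()
        shm.close()
        shm.unlink()
```

Only the 17-byte message crosses the socket; the integers themselves stay in shared memory, which is why the per-call overhead can remain roughly constant regardless of argument size.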

Cross-language communication now has a constant overhead of a few microseconds
for sending a message over a socket, plus the time required to convert
argument data to and from generic binary forms in shared memory. As a simple test,
the morloc function map inc xs -- where map is a C++ loop, inc is a Python
function that increments an integer, and xs is a list of integers -- runs
at under 3 microseconds per integer. This is a roughly 10 to 20 thousand-fold
improvement over the previous cost of ~50ms per call to Python.
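
As a quick check of that figure, the ratio can be computed from the two numbers quoted above. The variable names are only illustrative, and 3 microseconds is an upper bound, so the true speedup is at least this large. The result lands at roughly 16,700, inside the stated range.

```python
old_cost_s = 50e-3  # ~50 ms per cold call to Python (from the text above)
new_cost_s = 3e-6   # under 3 microseconds per integer (upper bound from the text)

# Ratio of the old per-call cost to the new per-call cost.
speedup = old_cost_s / new_cost_s
print(f"about {speedup:,.0f}x faster")
```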

None of these changes to the backend have any effect on the code the morloc
programmer writes. The type annotations that were added in the past are
sufficient for the morloc compiler to convert all types from all languages to
and from generic binary structures.

Better Typing

10 May 11:45

The main changes since the 0.43.0 release are the addition of typeclasses, basic value checking, and explicit function type parameters.

Typeclasses cannot yet be used as explicit constraints in function signatures, so their value in modeling data is limited. But at least we can now use a single add function name for both integers and doubles. Also, packing and unpacking have been re-implemented using a new Packable typeclass. This means we no longer need the special pack and unpack descriptors in signatures.

Value checking is important since morloc can define multiple definitions for one term. For example, it is legal to write:

x = 1
x = 2

This would not redefine x, as is done in many languages, but would rather associate both values with the variable name and attempt to disambiguate them later (which in the past implementation would have arbitrarily picked the last one). Now I have a very rudimentary value checker that will check for contradictions between primitives. It cannot descend past a source function call. In the future, I will need to extend the value checker to compare different sourced functions. This will likely have an LLM solution.
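
The idea of checking primitives for contradictions can be sketched in Python. This is a simplified illustration, not the morloc compiler's algorithm: the function name find_contradictions and the representation of a program as (name, value) pairs are assumptions made for the example.

```python
def find_contradictions(definitions):
    """Map each name to its conflicting primitive values, if it has any.

    `definitions` is a list of (name, value) pairs in source order.
    """
    distinct = {}
    for name, value in definitions:
        bucket = distinct.setdefault(name, [])
        # Key on the type as well, so that 1, 1.0 and True are not conflated.
        key = (type(value), value)
        if key not in bucket:
            bucket.append(key)
    # A name is contradictory when it is bound to more than one distinct value.
    return {
        name: [value for _, value in keys]
        for name, keys in distinct.items()
        if len(keys) > 1
    }
```

With this sketch, find_contradictions([("x", 1), ("x", 2)]) reports x as bound to both 1 and 2, while repeating the same definition twice is accepted.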

Explicit function parameters are now added to function signatures to provide an order for the generic type variables. For example:

snd a b :: (a, b) -> b

This deviates from Haskell syntax, but clarifies the relationship between the morloc type signature and type signatures in other languages, such as the C++ prototype:

template <class A, class B>
B snd(tuple<A,B>);

This also allows us to conveniently refer to functions as parameterized types, e.g.: snd Int Bool. Possibly such type functions could be used in signatures as well. I will explore this later.

Nearly Useful

16 Jan 18:26

The full influenza case study, a re-implementation of OctoFlu, is supported in this release. This proves that morloc can be used to solve non-trivial problems. Here are the main advancements of this release:

  1. Infer all concrete types directly from the general type
  2. Allow file inputs rather than only raw JSON -- this allows large data sets to be processed without hitting argument size limits
  3. Better (though still far from good) debugging options and error messages
  4. Clean import/export system with wildcards
  5. Support for eta-reduction
  6. Many bug fixes and greatly extended test coverage

However, the language is still weak in many areas:

  1. No type classes
  2. No effect handling (e.g., exceptions, mutations, non-determinacy)
  3. Weak record/table/object support
  4. No pattern matching or sum types
  5. No binary operator support (I'm getting a little tired of writing add rather than +)
  6. Limited debugging features
  7. Limited language support
  8. Slow compile times (due to one specific issue in the frontend type system)
  9. Inefficient serialization scheme (uses JSON currently, should convert to some sort of remote procedure call system)
  10. No formal specification of the type-system -- the conversions from general to concrete to serial, the resolution of ambiguous trees, the propagation of types through segmentation, the threading of arguments -- all this is very involved but not yet mathematically defined. I am not confident that it is all sound.
  11. No shiny paladin salesperson, only a grumpy morlock who thinks only about problems

Where scoping

24 Oct 19:16

This release adds scoped where to morloc and fixes several subtle bugs in the type system.

Under the hood, the entire architecture has been refactored. Previously, concrete and abstract typechecking were done in a single step. Now the general types are inferred first, then the trees are disambiguated, and finally the concrete types are inferred. Also, typechecking is now done AFTER the raw expressions have been parsed and desugared into the set of ASTs that will be exported from a module. There are many, many more changes in the implementation. If you are curious, read the commit messages.

Pretty good typechecking and serialization

03 Nov 23:04

This release sets the foundation for morloc. Basic typechecking/inference, code generation, interoperability, and serialization are all working well. Finally, morloc is sufficiently developed to be useful.

The main future goals break down as follows:

  • Richer type system - typeclasses, "shapes", semantic types (probably use a logic engine like z3)
  • Effect handling and error/warning propagation
  • Optimization - all current optimization steps are basically stubs
  • Doxygen-like documentation, caching, manifold hooks and such (see the last release)
  • Improved build system
  • Support for many more languages and a streamlined language onboarding process
  • The MorlocIO package manager and community portal
  • MorlocStudio

PyCon 2019

06 May 07:13
Pre-release

This release marks the version of morloc that was used in the poster presented at PyCon 2019 in Cleveland.

Minimal Haskell Prototype

24 Sep 06:52
Pre-release

This release presents a very simple Morloc prototype. It is mostly experimental and will change greatly in the future with no attempt to preserve backwards compatibility.

This prototype includes:

  • A simple, typed, functional scripting language
  • A compiler to translate these scripts into RDF graphs and then executable code
  • Simple type checking
  • Support for Python and R
  • A system for specifying language-specific types and transforming the data as needed
  • Syntax for specifying type constraints

Pre-release of Haskell prototype

22 Mar 09:55
Pre-release

This prototype is (currently) much less sophisticated than the C prototype. However, the code is far more elegant and will serve as a more flexible foundation for future development.

It can currently run R code in a simple shell interface. For example:

> sum [1,2,3]
6

This passes the Morloc vector [1,2,3] into the R function sum and returns the result.

This pre-release is an experimental foundation for the Morloc language. The syntax and features will change wildly in the future with no attempt at maintaining backwards compatibility.

Final version of the C prototype

22 Mar 09:45

This is the final version of the C prototype.

The features are described in the README. Here is an overview:

  • integrates R, Python, and Bash through a simple type system
  • workflows are pull-based graphs
  • explores the "manifold" template idea and multi-dimensional workflows
  • compilation exposes all exported functions through the manifold nexus
  • allows checks and effects to be added outside of the core workflow

This prototype will not be maintained in the future.