Lang
Design notes for a theoretical programming language “Lang” Pulling inspiration from Haskell, Koka, Javascript, Nix, ... Note: I use a mix of syntaxes in this document, particularly Haskell and Javascript. Do not interpret these too strictly; I do not guarantee, for instance, that Haskell-looking code will successfully typecheck.
Basic Syntax
Arrays [ a, b, let d = e, c ]
Objects [ a: b, let c = d, e: f ]
Plain bindings let var = val in expr
Lambdas \var -> expr Effect affordances are supplied as a plain parameter named aff
Effect scopes { a = mkA; stmt a; b <- mkB; stmt b; stmt (mkC;); } An effect scope evaluates to the composition of all its effects (think do) Paranthetical expressions ending in a semicolon—like (this;)—are embedded effects which will be lifted to the innermost effect scope. So { f (g;); } is equivalent to { x <- f; g x; }.
Effects
Effects in Lang are implemented on top of delimited continuations. Using delimited continuations for effects allows us to have managed effects in an ergonomic way without types!
Effects: anchor
Lang includes a single continuation control operator called anchor which has semantics as follows: anchor = \body -> reset \prompt -> body (shift prompt) -- or this, if it's more your thing anchor = reset . (. shift) Where reset and shift have their usual meaning, except are multi-prompt. The thought is just that anchor gives a slightly smoother interface than reset/shift: anchor \pivot -> { x <- pivot (\k -> k 10 + k 20); x * x; } -- results in 10*10 + 20*20 = 500 -- equivalent to reset \prompt -> { x <- shift prompt (\k -> k 10 + k 20); x * x; } Restriction: an expression anchor (\pivot -> f pivot) is disallowed from evaluating to something which contains a reference to pivot by storing it in a structure, closing over it, or through other means. Failure to adhere to this restriction is a runtime error. This restriction ensures that pivots never go out of the bounds in which they are meaningful.
Effects: affordances
All Lang code is evaluated effectfully. In Haskell terms, the evaluator executes in something like a Control.Monad.CC.CCT rg IO. However, this does not mean that arbitrary Lang expressions can perform arbitrary effects. Quite the opposite; effects in Lang are managed. In Haskell, any bit of code can reference putStrLn, but can’t do much with it except to (at most) thread it into a result IO action. More generally, most languages with managed effects allow unmitigated access to effectful operations but restrict where said operations can be executed. In Lang, it’s the opposite: any expression can, in theory, do something effectful, except that only select expressions are able to access effectful operations in the first place. There is no importable Lang IO module. Instead, main is supplied with a single argument, usually called aff, which is an object containing typical I/O operations as well as the anchor control operator. This passed-in object which provides effectful operations is called an affordance. main may choose to pass aff down to subroutines or to pass down a less expansive affordance, effectively restricting the effects that a subroutine is able to execute. For example, we may have a program that looks like this: main = \aff -> { aff.io.putStrLn "Hello"; subaff = [ print: \it -> aff.io.putStrLn (toString it) ]; subroutine subaff 10; } subroutine = \aff n -> { for (range 1 n) \n -> aff.print n; } for = \xs f -> List.match xs :cons [ :x, :xs ] -> { List.cons (f x;) (for f xs;) } :nil -> List.empty # note that 'for' is able to chain effectful computations # without requiring an affordance because it does not 'itself' # perform any effects! -- result: Hello!\n1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n Because subroutine is only supplied print as an affordance, we know that it is unable to perform any effects other than printing to the console. A more complex affordance, and one that does not invoke I/O, may be to use anchor in order to provide nondeterminism: main = \aff -> aff.anchor \pivot -> { subaff = [ fork: \a b -> pivot \k -> k a ++ k b ]; List.singleton (subroutine subaff); }; subroutine = \aff -> { x <- fork 1 2; y <- fork 10 20; x + y; } -- result: [11, 21, 12, 22]
Effects: problems with uneval
The following two desires are in tension:
To have an uneval function which can unevaluate any value back into an expression
To have managed effects via affordances
The issue is that almost all affordances will either be closures around the toplevel aff or around a pivot, neither of which can be effectively unevaluated. We cannot unevaluate the toplevel aff because it contains built-in code. We cannot unevaluate a pivot because doing so makes no sense: when it is again evaluated, there is no anchor to which it can pivot. (Remark. one solution here is static typing: track which expressions may contian a pivot, and which definitely don’t; allow only the latter to be unevaluated. But I don’t want to dip my toes into static typing just yet, and when I do I want to fuck around with prevaluation.)
I am torn as to whether or not there is some other design which may allow for managed effects in a similar manner without this issue. For now, I am deciding to stick with this style of managed effects, but I have at least considered other options:
Terms and Types
In the Lang universe, terms are almost complete blackboxes. Given an arbitrary term v, the only guarantee you have about it is the existence of two other values called type v and type_view v. Colloquially, the type of a value answers the question “what this value?" and the typeview answers “how do I interact with values of that type?". Semantically, types in Lang should be thought of as sets (or “concrete/arity-0 types1), except that we are guarnateed that no two types ever contain the same values. Specifically, even if two Lang values are the same ‘under the hood’, if they have different types, they are considered distinct.
At this point, Lang does not support more complex type features such as higher-kinded types. The hope is to have these at some point, built on top of an existing semantic for terms.
Operationally, a Lang type is just a value, usually an fqn, distinct from all other types. A type does not on its own provide any behaviour but only acts to tag a value as having that particular type. All user-defined types are simply tags over primitive types. (ie, only newtypes exist) The typeview of a value is a map from fqns to values. This map stores traits ( %% general %% % shorthands \newcommand{\cl}[1]{ \mathcal{#1} } \newcommand{\sc}[1]{ \mathscr{#1} } \newcommand{\bb}[1]{ \mathbb{#1} } \newcommand{\fk}[1]{ \mathfrak{#1} } \renewcommand{\bf}[1]{ \mathbf{#1} } \renewcommand{\sf}[1]{ \mathsf{#1} } \renewcommand{\rm}[1]{ \mathrm{#1} } \newcommand{\floor}[1]{ { \lfloor {#1} \rfloor } } \newcommand{\ceil}[1]{ { \lceil {#1} \rceil } } \newcommand{\ol}[1]{ \overline{#1} } \newcommand{\t}[1]{ \text{#1} } \newcommand{\norm}[1]{ { \lvert {#1} \rvert } } % norm/magnitude (REMOVE) \newcommand{\mag}[1]{ { \left\lvert {#1} \right\rvert } } % magnitude \newcommand{\smag}[1]{ { \lvert {#1} \rvert } } % short mag \newcommand{\card}{ \t{cd} } % cardinality \newcommand{\dcup}{ \sqcup } % disjoint untion \newcommand{\tup}[1]{ \langle {#1} \rangle } % tuples \newcommand{\tl}{ \tilde } \newcommand{\wt}{ \widetilde } % f \onall X = { f(x) : x ∈ X } \newcommand{\onall}[1]{ { \llbracket {#1} \rrbracket } } % shorthands: various brackets \newcommand{\tpar}[1]{ \left( {#1} \right) } % "tall parens" \newcommand{\tbrak}[1]{ \left[ {#1} \right] } % "tall brackets" \newcommand{\tbrac}[1]{ \left\{ {#1} \right\} } % "tall braces" % reverse \mapsto (FIXME: make better) \newcommand{\mapsfrom}{ \mathop{\leftarrow\!\mid} } % reverse-order composition \newcommand{\then}{ \operatorname{\ ;\ } } % Like f' represents "f after modification", \pre{f} % represents "f before modification" % TODO: remove this? \newcommand{\pre}[1]{{ \small `{#1} }} % hook arrows \newcommand{\injects}{ \hookrightarrow } \newcommand{\embeds}{ \hookrightarrow } \newcommand{\surjects}{ \twoheadrightarrow } \newcommand{\projects}{ \twoheadrightarrow } \newcommand{\id}{ \,\mathrm d } % integration d % derivatives: use {\ddn n x y} for (dy/dx) \newcommand{\ddn}[3]{ \frac{ {\mathrm d}^{#1} {#2} }{ {\mathrm d} {#3}^{#1} } } % nth derivative \newcommand{\dd}{ \ddn{} } % first derivative \newcommand{\d}{ \dd{} } % first derivative (no numerator) \newcommand{\dn}[1]{ \ddn{#1}{} } % nth derivative (no numerator) % derivatives: use {\D n x y} for (∂_x y) \newcommand{\Dn}[2]{ \partial^{#1}_{#2} } \newcommand{\D}{ \Dn{} } % no power \newcommand{\ig}[2]{ \int {#2} \, \mathrm d {#1} } % first integral %% category theory %% % category names \newcommand{\cat}[1]{{ \sf{#1} }} % yoneda embedding \newcommand{\yo}{よ} % represents an anonymous parameter % eg. $f(\apar)$ usually denotes the function $x \mapsto f(x)$ % TODO: remove this? \newcommand{\apar}{ {-} } %% computability %% % turing machines \newcommand{\halts}{ {\downarrow} } \newcommand{\loops}{ {\uparrow} } \approx typeclass instances). ntfs: the typview design is bad. Then we can have two values a, b of the same type reporting different monoid instances. Either combine the type with its view (gross, then we get into newtype territory) or go the way of Scala implicits (not sure how to implement efficiently) or require manual dictionary-passing.
Fields
The code [ k: v ].k evaluates to v. It would be tempting, to infer that Lang uses records as a fundamental datatype, but this is not the case. Strictly speaking, the value universe of Lang is inhabited totally by opaque values, which can only be interacted with by being combined with other opaque values. Essentially the only primitive is function application. On top of this, however, Lang has particular affordances for creating and using record-like values. As such, “fields” do not exist in Lang in the usual way. It is technically incorrect to say that the value [ k: v ] “has the field” k. The value itself is opaque, but its type provides a lens for the symbol key :k, which allows usage of the “field” k. Field syntax looks something like this:
a.b.c is a.(.b.c)
.b.c is Fields.compose (Fields.make :b) (Fields.make :c)
a.(f) is a |> Fields.get f
f .= x is Fields.setting f x, use like a |> .b.c .= x
Loose notes
Perhaps to allow for nominal types / nominal modules we require them exist statically at toplevel and then assign them FQNs as the first step of computation? almost a step before computation? Or maybe just make gensym an effect? (er, an affordance)
Here’s an interesting idea: make imports a part of aff. In other words, imports don’t exist. Instead, a “module” is just a function whic accepts an aff which has an aff.imports dict which is a map taking a (relative) path and producing the evaluation of the source file at that path. Then when you actually invoke your program, you pass in a --lib flag which contains all source files, eg --lib ./src/**/*.lang. This sounds crazy, but it’s basically how things work anyway. At least in the case of Haskell-likes, we “import” a module name but then have to manually provide the source files where the module resides. Instead of passing --lib we set up a .cabal or configure purs-nix, etc. But the --lib pattern is still there, even if it’s not explicit. Anyway, the benefit of this design is two-fold. (1) removal of language features; we get “imports” for free. (2) increase in purity; to evaluate a file we need nothing but the language interpreter. The evaluate a whole program we need nothing but the language interpreter. To execute a program we then need --lib. Perhaps have syntactic sugar, so import aff ../some/other/file.lang desugars to let file = aff.imports.get("../some/other/file.lang").
You know, this “affordances for effects” thing could be implemented ‘on top of’ an existing language. Create a “new language” called jsaff whose syntax is a subset of javascript’s and whose semantics are equal. In other words, it’s just javascript but you’re not allowed to reference console.log directly; you must get it through aff. This will be statically enforced. Disallow nonlocal mutation as well.
Just realized that with this design there’s no way of limiting the effects of something incoming. If I write, for instance, a function f = \g -> g 100, it’s impossible for me to limit what effects g can have; g could close over all sorts of pivots. One reason we might want to limit the effects of something incoming is as follows. If we were writing an elmlike, we would want to enforce that the view function be pure. I think maybe we can add this kind of limiting as a language feature, but as of yet it’s not quite clear to me how to do so...
ok I’m thinking about affordances again. I had the idea that it would be nice to design the language so that by using it in a traditional way you get the benefits of affordances with doing no extra work. ie, you get dependency injection ‘for free’ For instance. Perhaps affordances are not a parameter named aff but just an implicit parameter all functions have (so code lives in ReaderT Aff (ContT IO) where Aff ~ Map String Dynamic). To retrieve an affordance instead of writing aff.thing you use the special syntax, say, <thing>. To locally overwrite an affordance you use the special syntax, say, with <THING> as VALUE { EXPR } And say as well that importing was an affordance, so instead of doing import get_codes, launch_nukes from pentagon you write let { get_codes, launch_nukes } = <import>("pentagon") where <import> retrieves the import affordance which is a plain ol’ function. One could teach a beginner that “the language just has strange import syntax” and if desired they could use it without knowing about affordances. But—whether or not they’re aware—they have all the power of affordances for free. Say that module is mod.lang. Any parent module may swap out a <import>("mod.lang") import with with <import> as new_import { <import>("mod.lang") } where new_import overrides the pentagon module with something else. Since mod.lang is written only in terms of the expression syntax and <import>, this means that parent modules essentially have full dependency injection on mod.lang, for free. IMO overriding imports is not as clean as having explicit affordances, since writing a module intended to be overwritten is likely to be more subtle and difficult than writing a module intended only to be imported. In other words, there should be explicit opt-in to intentionally being an affordance which signifies that “I, the affordance author, have thought about how this object will work as an affordance” (for an example of why writing an affordance is more subtle than writing a module, see the next bulletpoint)
Here’s an open problem. Say I have a loggig affordance log which exposes e.g. log.info and log.warn. Under the hood log.warn(str) just calls log._print("[warn]", colors.red, str) and log.info(str) just calls log._print("[info]", colors.none, str). If i want to locally override this affordance then I should only have to produce a new log._print and ideally the rest of the functions should be defined automagically. At the same time, log._print should not be a part of the public API. Probably the way to do this is have log also expose log.override which accepts a new _print value and produces a new logging affordance. ... A similar example that maybe works better. Say I have an io module which exposes methods like readFile and putStrLn and the like. Some of these methods will be defined based off of other methods: if we have putStr then we should get putStrLn for free. Hence if I want to locally override the io import then I shouldn’t have to create an entire new object specifying both putStr and putStrLn — I should have some way to specify only the minimal ‘basis’ methods and have the rest be filled in automagically. hm... maybe the solution here is also that io have a .override method? Could do something like Nix derivation .override where the io module is constructed as io = { ...(func params), override: (\newParams -> func newParams) } where params are the default values for the minimal ‘basis’ methods. Solutions to this kinda thing in the wild:
Inheritance — subclass provides ‘basis’ methods and superclass provides derived defaults
Typeclasses — instances provide values for basis methods and derived functions are written generically over the instance
naturally I think the typeclass solution is best, but it’s kind of a non-starter here. if the io module provides only basis methods, then what provides derived methods? io_derived? do you call io_derived(io).putStrLn("my string")? gross