One Conceptualization of Distributive Laws
Distributive Laws
Scattered around in mathematics and some areas of programming are laws/conditions that look kind of like this:

φ (x × y) = φ(x) • φ(y)

I call these ‘distributivity conditions’ because they look a little like distributivity.
Frankly, I’ve never really understood these conditions. I mean, I ‘get’ them to some extent: the above law means that × becomes • under φ. But what does that mean in context? Why is it interesting? And why does it show up so often?
Luckily for me, I stumbled into an experience which I felt deeply enriched my understanding of these kinds of conditions, and I hope to codify and preserve that understanding here.
As is traditional for informal texts containing Haskell code, we will explore the topic by way of definitely-not-contrived example.
Distributivity as Semantics-Assignment
Say we are—for some reason—interested in working with values of type String -> String, and say that it is of use to us to be able to tell when two values of type String -> String are equal. Unfortunately for us, the type String -> String does not admit an Eq instance1. Sounds like game over. However, we realize that all throughout our program we only ever construct String -> String values of a highly specific form.
If this feels contrived to you, I can offer the following situation as a context in which these desires arise somewhat naturally.
Say we’re writing some logging code for an application. Callsites invoke log :: MonadLog m => String -> m (), and the provided string is printed to stdout. However, it is first decorated by middleware of type String -> String which had already been registered higher up along the callstack.
For instance, a callFooBarService function might register logging middleware to prefix each logged string with "[in foo-bar] ", thereby contextualizing logging performed during its lifetime.
Then our emitted logs might hypothetically look something like this:
application start
[in initialization] config found in env vars
[in initialization] no grasshopper found; falling back to cicada
main loop start
[in main loop] begin main loop
[in main loop] need to call foo-bar
[in main loop] [in foo-bar] calling with foo=100 and bar=22
[in main loop] [in foo-bar] error! foo-bar returned bad response
Say that we decide that this logging format is noisy, and we’d like to collate log lines which have the same middleware, so the above becomes
application start
[in initialization]
config found in env vars
no grasshopper found; falling back to cicada
main loop start
[in main loop]
begin main loop
need to call foo-bar
[in main loop] [in foo-bar]
calling with foo=100 and bar=22
error! foo-bar returned bad response
In order to do this, we need to be able to tell when two messages were logged with the same middleware. Since middleware is of type String -> String, this amounts to asking for an Eq (String -> String) instance.2
(Footnote 2: Little remark. In reality we’d probably want to allow for logging middleware of type String -> m String so that we can, for instance, decorate each log line with the current time. The strategy discussed in this text to (roughly) equip String -> String with an Eq instance could also theoretically be applied to String -> m String. Although it might get annoying.)
(Footnote 1: Turing machines can be encoded as functions of type String -> String. Hence if there were an Eq (String -> String) instance, we would have decidable equality on the type of all Turing machines, which is impossible.)

To return to the main thread: it happens that each f :: String -> String we construct can be written as:
f :: String -> String
f = \a -> intercalate a ss
  = \a -> ss₀ <> a <> ss₁ <> a <> … <> a <> ssᵢ
for some ss :: [String]3. (The intercalate function joins a list of strings, placing the given delimiter string between consecutive elements.)
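For instance (my own GHCi illustration, foreshadowing an example below):

> intercalate "🐛" ["<", " & ", ">"]
"<🐛 & 🐛>"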
(Footnote 3: This is not—as I had teased—actually a contrived example. In the codebase that inspired this text, I actually had desire to compare String -> String values for equality; I actually found that I only ever used String -> String values of the form above; and I actually used the technique discussed in this text to acquire my desired Eq instance.)

We can create a type called Weave4 which gives a concrete representation to functions of this kind. A Weave is a sequence of elements, each of which is either a known string or a distinguished value representing the function parameter.
(Footnote 4: The name Weave is intended as an allusion to intercalate.)

newtype Weave = Weave { elems :: [Elem] }

data Elem = Hole | Elem String
  deriving (Eq)
The intended interpretation of a Weave value is given by
interp :: Weave -> (String -> String)
interp (Weave elems) = foldMap interp1 elems
  where
    interp1 (Elem a) = const a
    interp1 Hole = id
For example, we have
> interp (Weave [ Elem "<", Hole, Elem " & ", Hole, Elem ">" ]) "🐛"
"<🐛 & 🐛>"
Here each Hole gets ‘filled in’ by the provided string "🐛", each other Elem is left as-is, and everything is concatenated to produce the final String.
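Where does the concatenation come from? Functions a -> b form a Monoid whenever b does (here b = String), with (f <> g) s = f s <> g s, and interp simply foldMaps each element’s little function together. Unfolding the example above by hand (my own illustration):

> (const "<" <> id <> const " & " <> id <> const ">") "🐛"
"<🐛 & 🐛>"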
Some reasonable operations on Weave are given by
-- Create a Weave which produces the given value
str :: String -> Weave
str s = Weave [Elem s]

-- Create a Weave which reads the parameter
par :: Weave
par = Weave [Hole]

-- Concatenate two Weaves
(.++) :: Weave -> Weave -> Weave
(.++) (Weave xs) (Weave ys) = Weave $ xs <> ys

-- Compose two Weaves right-to-left as functions
-- This works by "plugging in" each Hole in the first Weave
-- with all the values in the second Weave
(.<<) :: Weave -> Weave -> Weave
(.<<) (Weave ys) (Weave xs) = Weave $
  ys >>= \y -> case y of
    Elem e -> [Elem e]
    Hole -> xs
Let’s see how these operations work
brack, brace, twice :: Weave
brack = str "[" .++ par .++ str "]" -- (\s -> "[" <> s <> "]")
brace = str "{" .++ par .++ str "}" -- (\s -> "{" <> s <> "}")
twice = par .++ par -- (\s -> s <> s)
> interp (brace .++ twice .++ brack) "🌵"
"{🌵}🌵🌵[🌵]"
> interp (brace .<< twice .<< brack) "🌵"
"{[🌵][🌵]}"
Truly stupendous.
Now we can do what we came here for, which is to write an Eq instance for Weave
instance Eq Weave where
  w == v = normalize w == normalize v

-- Combines adjacent non-Hole Elems
normalize :: Weave -> [Elem]
normalize = normElems . elems
  where
    normElems :: [Elem] -> [Elem]
    normElems elems = case elems of
      (Elem a : Elem b : rest) -> normElems $ Elem (a <> b) : rest
      (x : rest) -> x : normElems rest
      [] -> []
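As a quick check (a GHCi session of my own), two Weaves built up differently but denoting the same function now compare equal:

> (str "¿" .++ str "¡" .++ par) == (str "¿¡" .++ par)
True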
Sounds all good, right? And it is! We’ve identified that our program, which makes use of String -> String values, doesn’t actually utilize all of String -> String (which is quite large5); we’ve identified which collection of String -> String values our program does need, created a concrete representation for this collection, and in doing so achieved an Eq instance, just as we wanted. Great!
(Footnote 5: The entire String -> String type has a lot of values, including a bunch that are probably not of use to us. For instance, it includes the String -> String value which transforms the given String to uppercase if its length is prime; it has the value which rotates odd-indexed characters; it has the value which ignores all characters besides those in 0123456789+*(), interprets this reduced string as a mathematical expression, and produces the result as a String; it has the value which removes all vowels in the given string beyond the first three.)

Now is the part where I try to convince you that all of this has everything to do with distributivity.
The key observation to make is this. We created Weave with the intent of being a “concrete representation for a substructure of String -> String”. However, strictly speaking, Weave is just some type, some isomorphism of [Either () String]6. There is nothing requiring us, or some other programmer, to ‘treat’ a Weave as a String -> String.

(Footnote 6: Which, incidentally, is another way of saying that it’s isomorphic to Int. Such is the way of a kind-* type in Haskell!)

We can, for instance, write operations over Weave which are perfectly valid but nonetheless don’t really make any sense if we’re considering Weave as a substructure of String -> String. For instance, we may write:
removeNthElem :: Int -> Weave -> Weave
removeNthElem n (Weave xs) = Weave $ take n xs <> drop (n + 1) xs
This function makes no sense because—if we’re thinking of Weave as a substructure of String -> String—it may have different effects on two Weave values which “are the same”. For instance, the function removeNthElem 1 has a different effect on str "¿" .++ str "¡" .++ par as compared to str "¿¡" .++ par, despite them “being the same String -> String”.
> interp (removeNthElem 1 (str "¿" .++ str "¡" .++ par)) "🐝"
"¿🐝"
> interp (removeNthElem 1 (str "¿¡" .++ par)) "🐝"
"¿¡"
-- ¿¡where'd my bug friend go!? ;(
So Weave is ‘just some type’ with no particular semantics attached. But we certainly don’t treat it ‘just some way’. We know that we are thinking of it as a substructure of String -> String, and the operations we’ve written—str, par, .++, and .<<—abide by that intended treatment.
Right?
If you’ll excuse the poor segue, what I’d like to address now is the following question. How can we justify the claim that the operations we’ve written “abide by the intended interpretation of Weave”? If we were showing the code to someone new, how would we convince them that our definitions for .++ & friends are correct, beyond saying “it works”? Let us try.
When we defined Weave, we had intention for it to be “a concrete representation of a substructure of String -> String”. And when we defined .++, we had intention for it to be “concatenation”. And if our definition of .++ is correct, then .++ will actually be concatenation. That is, if we have two Weave values f, g respectively representing functions f' and g' of type String -> String, then it ought to be that f .++ g represents the function f' <> g'7.
(Footnote 7: For anyone unfamiliar, here f' <> g' is equivalent to \s -> f' s ++ g' s, where ++ concatenates two strings. If you’re curious, this is due to the Monoid instances for String and a -> b.)

In that paragraph, if we interpret “f :: Weave represents f' :: String -> String” as interp f = f', then we get a direct translation of the claim “if we have f, g :: Weave representing f', g' :: String -> String, then f .++ g ought to represent f' <> g'”. Namely, this translates as:
if   interp f = f'
     interp g = g'
then
     interp (f .++ g) = f' <> g'
By plugging in the equalities, we can re-phrase this as:
interp (f .++ g) = interp f <> interp g
For good measure, let’s read this back out again in English. It says that if we have f, g :: Weave and apply the .++ operation to them—which allegedly represents concatenation—and then interp the alleged-concatenation to find out what String -> String it represents, we will find that the function it represents is the actual concatenation <> of the functions that f and g represent.
This, hopefully, would be a satisfactory justification to our hypothetical inquisitor as to how we know our definition of .++ is correct. (And hopefully is a satisfactory justification to you!)
And—hey!—do you see it? It’s a distributivity law!
interp (f .++ g) = interp f <> interp g
φ (x × y) = φ(x) • φ(y)
The .++ operation does indeed satisfy this law8 (Footnote 8: the truth of this has been divined from the Platonic Realm directly and requires no proof), and additionally the other operations we defined on Weave satisfy ‘distributivity conditions’ of their own; namely:
interp (str s) = const s
interp par = id
interp (w .<< v) = interp w . interp v
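For readers who would rather test than divine, these laws can be sketched as QuickCheck properties. This block is my addition, not the original’s; the Show and Arbitrary instances are ad hoc scaffolding, and since functions can’t be compared directly, each law is checked pointwise at an arbitrary input s:

{-# LANGUAGE StandaloneDeriving #-}

import Test.QuickCheck

-- Scaffolding: printable counterexamples and random Weave generation
deriving instance Show Elem
deriving instance Show Weave

instance Arbitrary Elem where
  arbitrary = oneof [pure Hole, Elem <$> arbitrary]

instance Arbitrary Weave where
  arbitrary = Weave <$> arbitrary

prop_str :: String -> String -> Bool
prop_str x s = interp (str x) s == const x s

prop_par :: String -> Bool
prop_par s = interp par s == id s

prop_append :: Weave -> Weave -> String -> Bool
prop_append w v s = interp (w .++ v) s == (interp w <> interp v) s

prop_compose :: Weave -> Weave -> String -> Bool
prop_compose w v s = interp (w .<< v) s == (interp w . interp v) s

Running, say, quickCheck prop_compose should report success across all generated cases.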
Each of these conditions (and distributivity laws in general) follows a common pattern. On the left of the = we have some expression of type Weave which is then passed into interp; on the right of the = we have some expression of type String -> String which often itself internally makes use of interp.
We now give an English-language conceptualization of laws of this form. Several phrasings, in fact. Choose whichever makes the most sense to you.
The laws tell us that interp will turn certain Weave-related operations into corresponding String -> String-related operations.

We created a datatype Weave and assigned it a semantics with interp. We identified some operations on String -> String which were useful to us (const, id, <>, .), and defined the functions on Weave which correspond to them exactly. We “pulled” those operations from String -> String “back onto” Weave9. (Footnote 9: This is not to be confused with the term ‘pullback’, which refers to either (a) a related-but-slightly-different concept, if you are talking to most people; or (b) a completely unrelated concept, if you are talking to a category theorist.)

The laws say the following. If we find ourselves working with values of type Weave, and we know that whatever value we finally produce will be passed into interp :: Weave -> (String -> String), then we have a guarantee about how the operations str, par, .++, and .<< will affect the resultant String -> String.

With softer words, if you’ll allow: via interp, the Weave type acts as a kind of ‘window’ into a subset of String -> String. Certain operations performed on Weave will be ‘reflected’ by interp onto String -> String in the expected manner.

In my head, it’s almost as if Weave and String -> String are two dolls on the same marionette. Operations correspond to hand movements on the marionette strings; some operations are ‘well-behaved’ in that they induce the same movement on both dolls.
Hopefully one of these phrasings has been at least a little bit helpful to you. There is more to discuss, but first we must rest. Please join me by the campfire.

Let us now generalize from Weave/interp to all ‘distributivity-looking’ laws.
Say we find ourselves in a situation where we have two types T and S and a function f : T -> S between them, and say that we know our function satisfies some condition of the form

f (a × b) = f a • f b

where

(×) :: T -> T -> T
(•) :: S -> S -> S

We may interpret this condition as follows. When we are working with values of type T, if we know that whatever T we finally produce will be interpreted by f (ie, passed into f), then we can think of × as ‘secretly performing’ •, insofar as f will interpret × as •.

Further, the same may be said more generally about conditions satisfying the form

f ( lhs(a,b,…) ) = rhs(a,b,…)

for some lhs :: T, rhs :: S, and where e(x) signifies that the expression e contains the variable x. Here, if we find ourselves working within T and know that we will eventually interpret with f, then whenever we happen to operate on our T values in a way matching the lhs, we can think of it as ‘secretly being’ the rhs.
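For a familiar law of exactly this shape from outside the Weave story (my example, not the article’s): length interprets lists as Ints, and under that interpretation, concatenation of lists ‘secretly performs’ addition of Ints. As a QuickCheck-style property:

prop_lengthDistributes :: [Int] -> [Int] -> Bool
prop_lengthDistributes xs ys =
  length (xs <> ys) == length xs + length ys
-- Here f = length, (×) = (<>) on lists, and (•) = (+) on Ints.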
Subtlety: Relativity to the Interpretation Function
At least one subtlety in the situation has so far been brushed under the rug.
During our exploration we admitted that Weave is ‘just some type’ and there is no way to ensure that operations on Weave treat Weave how we intend for it to be treated—ie, with Elem representing a string constant, Hole representing a parameter, and juxtaposition representing <>. We gave the example removeNthElem as something which does not respect this intention, because it does not treat juxtaposition as <>10.
(Footnote 10: If it did, it would not be able to distinguish str "ab" .++ str "c" from str "a" .++ str "bc".)

Despite this realization that Weave does not come with ‘semantics attached’, we then uttered phrases like:

“[...] turn certain Weave-related operations into corresponding String -> String-related operations”
That is, despite accepting that Weave does not come with ‘semantics attached’, we then sometimes used language suggesting there is only one interpretation for Weave.
It’s worth driving the point home that such a conclusion would be mistaken. Weave indeed does not have an intrinsic semantics. Semantics for operations are in fact given by interp; note how the distributivity law for .<< only tells us what .<< does if it is interpreted by interp. If we choose to interpret Weave in some other way, the law tells us nothing. Distributivity laws are relative to a chosen interpretation function!
To exemplify this fact, we can look at a new way of interpreting Weave as a substructure of String -> String. This new interpretation also allows for writing str, par, .++, and .<<. However, it is completely incompatible with the existing interp function.
Our new interpretation function, interp', works as follows. Given a Weave, it considers each Hole to be a delimiter and accordingly splits the Weave into a series of lists over Elem. Each of these lists gets a Hole placed between each element and then is individually interpreted as per interp. The result is a sequence of String -> Strings; these are then composed as functions. Let us present that now in code, along with operations str', par', !++, and !<<:
import Data.List (intersperse)
import Data.List.Split (splitOn)

interp' :: Weave -> (String -> String)
interp' = foldEm . map subInterp . splitEm
  where
    splitEm :: Weave -> [[Elem]]
    splitEm = splitOn [Hole] . elems

    subInterp :: [Elem] -> (String -> String)
    subInterp = interp . Weave . intersperse Hole

    foldEm :: [String -> String] -> (String -> String)
    foldEm = foldl (.) id

str' :: String -> Weave
str' s = Weave [Elem s]

par' :: Weave
par' = Weave [Elem "", Elem ""]

(!++) :: Weave -> Weave -> Weave
(!++) w v = pull (push w .++ push v)
  where
    push = foldl (.<<) par . map (Weave . intersperse Hole) . splitOn [Hole] . elems
    pull = Weave . filter (/= Hole) . normalize

(!<<) :: Weave -> Weave -> Weave
(!<<) (Weave u) (Weave v) = Weave $ u <> [Hole] <> v
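The session below uses primed analogues of our earlier brack, brace, and twice. The original doesn’t restate their definitions, so the following is my reconstruction using the primed operations:

brack', brace', twice' :: Weave
brack' = str' "[" !++ par' !++ str' "]"   -- interp' brack' = \s -> "[" <> s <> "]"
brace' = str' "{" !++ par' !++ str' "}"   -- interp' brace' = \s -> "{" <> s <> "}"
twice' = par' !++ par'                    -- interp' twice' = \s -> s <> s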
> interp' (brace' !++ twice' !++ brack') "🌵"
"{🌵}🌵🌵[🌵]"
> interp' (brace' !<< twice' !<< brack') "🌵"
"{[🌵][🌵]}"
The details of how this works aren’t really important11. The point is that we’ve given a new way to interpret Weave values as String -> String values. Further, this new interpretation is just as good as the old one, as the operations of interest (str/str', par/par', .++/!++, .<</!<<) can be performed equally well with either representation.
(Footnote 11: Also, if you look closely, you’ll see that I “cheated” a bit by using .++ to define !++. Certainly !++ could be defined ‘on its own’, but it would be tricky. This ‘cheating’ does not invalidate any of the analysis in this text.)

If we wanted to characterize this new representation, we would write down the same conditions as before:
interp' (w !++ v) = interp' w <> interp' v
interp' (w !<< v) = interp' w . interp' v
interp' par' = id
interp' (str' s) = const s
From the point of view of wanting to use Weave as a substructure of String -> String, both interpretations are essentially the same.
But here’s the key: they don’t mix!
> interp' (brace !++ twice !++ brack') "🌵"
"{[🌵]"
Trying to mix operations from one representation with operations from the other simply does not work.
The distributive laws only guarantee that our operations of interest (str/str' & friends) work as intended when interpreted by the proper function. That is, we have
interp (w .++ v) = interp w <> interp v
interp' (w !++ v) = interp' w <> interp' v
but not, for instance,
interp (w !++ v) = interp w <> interp v
interp' (w .++ v) = interp' w <> interp' v
So, Weave is, indeed, ‘just a type’, and does not come with semantics attached. We can assign Weave a semantics by choosing some interpretation function. If we do, distributive laws will give operations over Weave a semantics relative to the chosen interpretation function. However, there very well may be other interpretation functions out there with just-as-rich operations.
Subtlety: Effects of ‘Incompatible’ Operations
There’s one other subtlety to the situation worth calling attention to.
Our conceptualization of a distributivity law right now is the following. Just as we treat the function
interp :: Weave -> (String -> String)
as “giving a semantics” to Weave, we treat distributivity laws like
interp (w .<< v) = interp w . interp v
as “giving a semantics” to .<<: it tells us what .<< does, under interp.
This is true, but also a slight oversimplification. For full accuracy, we must recognize that distributive laws are conditional: they apply so long as—and only so long as—what’s to the left of the = is matched. This may sound like an obvious thing to draw attention to, but the point is that they do not necessarily apply.
Example time! Say we construct the following function:
announce :: Weave -> Weave
announce w = (par .++ par) .<< (w .++ str "! ")
> w = str "My " .++ par
> interp w "🍈"
"My 🍈"
> interp (announce w) "🍈"
"My 🍈! My 🍈! "
Via the distributive laws, we can deduce exactly what the semantics of interp . announce are:
interp . announce
= [ apply definitions of announce and (.) ]
\w -> interp $ (par .++ par) .<< (w .++ str "! ")
= [ distribute interp over .<< ]
\w -> interp (par .++ par) . interp (w .++ str "! ")
= [ distribute interp over .++ ]
\w -> (interp par <> interp par) . (interp w <> interp (str "! "))
= [ apply 'interp par' and 'interp (str x)' laws ]
\w -> (id <> id) . (interp w <> const "! ")
= [ rewrite ]
\w -> \s -> interp w s <> "! " <> interp w s <> "! "
Looks good. Via the distributivity laws we were able to find the exact value of interp . announce.
However, say we throw a ‘naughty’ operation into the mix which doesn’t have a distributivity law:
announce' :: Weave -> Weave
announce' w = (par .++ par) .<< removeNthElem 0 (w .++ str "! ")
We can attempt to perform the same deduction as before, but eventually we’ll get stuck:
interp . announce'
= [ apply definitions of announce' and (.) ]
\w -> interp $ (par .++ par) .<< removeNthElem 0 (w .++ str "! ")
= [ distribute over as many operations as possible, then rewrite ]
\w -> (\s -> s <> s) . interp (removeNthElem 0 (w .++ str "! "))
Since we don’t in general know how removeNthElem relates to interp, we can’t really go any further from here with the distributivity laws.
Hence, saying plainly that “distributivity laws give semantics to operations” is a bit misleading. More accurately, distributivity laws give semantics up to the operations they mention. As soon as we hit an operation without an applicable distributivity law, we’ll need to look somewhere else.
Accordingly, the only time distributivity laws can give semantics to an entire value is when all of that value can be expressed in a way that the distributivity laws know how to consume. This is not necessarily an unreasonable condition: with str, par, .++, and .<<, for instance, we can indeed construct every single12 Weave value. (In fact, .<< isn’t even necessary to do so; see the sketch below.)
(Footnote 12: Well... every single finite Weave value.)
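To substantiate that claim, here is a small sketch of my own (not in the original) that rebuilds an arbitrary finite Weave from str, par, and .++ alone:

-- The rebuilt Weave may differ from the input by a trailing Elem "",
-- but it denotes the same function: interp (rebuild w) = interp w.
rebuild :: Weave -> Weave
rebuild (Weave es) = foldr ((.++) . fromElem) (str "") es
  where
    fromElem Hole = par
    fromElem (Elem s) = str s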
Case Study: Monad Morphisms

To wrap up, let’s take a look at the Monad algebraic structure, whose morphism laws are very ‘distributivity-like’. We’ll use the perspective presented in this text to analyze the meaning of these laws.
If we have two monads (M, bind, return) and (N, bind', return'), a monad morphism between them is a function mor :: forall a. M a -> N a abiding by:
mor (return x) = return' x
mor (mx `bind` f) = mor mx `bind'` (mor . f)
If we understand mor as ‘interpreting’ or ‘executing’ an M a computation within N, then we can phrase these laws as follows. If we build up a computation of type M a through repeated use of bind and return, and then finally execute it via mor to produce an N a, we are guaranteed that:
a call to bind within M becomes a call to bind' within N—ie, the sequencing of computations is not mangled; we do not add, remove, or reorder computations, or do anything else ‘unusual’

a call to return within M becomes a call to return' within N—ie, mor does not add any effects to return
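The text leaves these laws abstract, so here is a concrete instance of my own: maybeToList from Data.Maybe is a monad morphism from Maybe to []. Below is a pointwise check of both laws, specialized to Int and to one particular choice of f for simplicity:

import Data.Maybe (maybeToList)

-- Law 1: mor (return x) = return' x
prop_return :: Int -> Bool
prop_return x = maybeToList (return x) == return x

-- Law 2: mor (mx `bind` f) = mor mx `bind'` (mor . f)
prop_bind :: Maybe Int -> Bool
prop_bind mx = maybeToList (mx >>= f) == (maybeToList mx >>= maybeToList . f)
  where
    f x = if even x then Just (x + 1) else Nothing

Intuitively: executing a Maybe computation as a list computation neither duplicates nor reorders anything; Nothing becomes the empty list, and Just becomes a singleton.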
That’s all. A sincere ‘thank you’ to everyone reading.