A Category of Systems

Exploring the consequences of Systems & Cybernetics in Engineering
Author Tom Westbury License Creative Commons Licence

Patterns for types

Date 2020-09-06
Tags

In my last post, I showed how, with a lit­tle ad­di­tional syn­tax, Data Types can be given a lot more power within a UM­L/SysML mod­el. In this post I’d like to dial up the power one step fur­ther and talk about how we can spot pat­terns in the us­age of cer­tain Data Types and en­code this as an ad­di­tion to UM­L/SysML that will give us bet­ter in­sight into the cor­rect­ness of our mod­els and al­low us to de­fine pat­terns we use again and again in an ab­stract way. This pat­tern is some­times called the “trait” (in the Rust lan­guage) or the “type class” (in Haskel­l/S­cala/Idris).

Let’s re­turn to the UML de­f­i­n­i­tion of a prim­i­tive Data Type:

A Prim­i­tiveType de­fines a pre­de­fined DataType, with­out any sub­struc­ture. A Prim­i­tiveType may have al­ge­bra and op­er­a­tions de­fined out­side of UML, for ex­am­ple, math­e­mat­i­cal­ly. The run-­time in­stances of a Prim­i­tiveType are val­ues that cor­re­spond to math­e­mat­i­cal el­e­ments de­fined out­side of UML (for ex­am­ple, the In­te­gers). [1]

In the last post we ex­plored how prim­i­tive types can be ex­pressed us­ing a few new pieces of UML syn­tax. Now though, I’d like to ex­plore the sec­ond sen­tence of the de­f­i­n­i­tion: A prim­i­tive type may have al­ge­bra and op­er­a­tions de­fined out­side of UML… I’d like to ask the ques­tion: why should­n’t we de­fine al­ge­bras and op­er­a­tions in­side of UML? and ex­plore what that might en­tail.

What is an Al­ge­bra?

To start with lets talk about how math­e­mati­cian­s,­physi­cists and com­puter sci­en­tists model the world: through al­ge­bra. The con­cept of al­ge­braic struc­tures is usu­ally new to en­gi­neers who aren’t nor­mally taught ab­stract maths at uni­ver­sity lev­el.

An al­ge­braic struc­ture is a set of ob­jects, a set of op­er­a­tions that you can use with them and a set of laws that they must con­form to. Let’s have a look at the ex­am­ple of a Monoid to show you what I mean. A Monoid is any set where two mem­bers of the set can be com­bined with a “mul­ti­pli­ca­tion” op­er­a­tion to cre­ate an­other mem­ber of the set. There must also be an Id el­e­ment that when com­bined with other el­e­ments, re­turns the other el­e­ment un­scathed. Let’s use the pro­gram­ming lan­guage “Idris” to il­lus­trate this:

 in­ter­face Monoid m where   -- m is any type that we're defin­ing a monoid for
   mId : m                  -- mId is the iden­tity
   mPlus : m -> m -> m      -- mPlus is our com­bin­ing op­er­a­tion

I’ve cho­sen the Idris lan­guage as it can be used to de­fine the laws as­so­ciate with a monoid. These are il­lus­trated be­low:

   mI­dIsIdR         : mPlus mId m = m                           -- mPlussing the Id to any­thing re­turns that thing
   mI­dIsIdL         : mPlus m mId = m                           -- Same as last, but left­-handed be­cause monoids don't necess­sar­ily com­mute
   mPlusAs­so­ci­ates : mPlus a (m­Plus b c) = mPlus (m­Plus a b) c  -- mPlus is as­so­cia­tive

Those who’ve read my pre­vi­ous blog post on al­ge­braic data types might no­tice that these monad laws are be­ing ex­pressed as propo­si­tions as the type of a func­tion. This is a cool as­pect of a few pro­gram­ming lan­guages that en­able you to math­e­mat­i­cally prove as­pects of your pro­gramme (for in­stance, this Idris in­ter­face for monoid forces you to pro­vide a proof for each of the monad laws be­fore you can use it). Proofs are be­yond the scope of this ar­ti­cle, but if you’d like to know more, the wikipedia ar­ti­cle on the Cur­ry-Howard-Lam­bek Iso­mor­phism is a good place ot start.

By this point, you’ve prob­a­bly al­ready thought of a few monoids that we use in daily life. Pos­si­bly the most ubiq­ui­tous ex­am­ple is the monoid of the nat­ural num­bers and ad­di­tion. In this case, mPlus is plain old ad­di­tion and mId is 0. In Idris, this would look some­thing like:

im­ple­men­ta­tion Monoid Nat where
  mId = 0       -- The ad­di­tive iden­tity is 0
  mPlus = (+)   -- The com­bin­ing op­er­a­tor is nor­mal ad­di­tion

To ham­mer the point home, here are a cou­ple of other well used monoids:

im­ple­men­ta­tion Monoid Nat­Mult where -- A mul­ti­plica­tive monoid for the nat­ural num­bers
  mId = 1       -- 1 is the iden­tity
  mPlus = (*)   -- Mul­ti­pli­ca­tion is the bi­nary op­er­a­tor

im­ple­men­ta­tion Monoid Bool where
  mId = True    -- True is the iden­tity
  mPlus = (&&)  -- Boolean AND is the bi­nary op­er­a­tor

im­ple­men­ta­tion Monoid String where
  mId = ""      -- The iden­tity is the empty string
  mPlus = (++)  -- The op­er­a­tor is string con­cate­na­tion

The power of know­ing what sort of al­ge­braic struc­ture you are play­ing with is that, al­most cer­tain­ly, math­e­mati­cians have shown all sorts of in­ter­est­ing and use­ful re­sults for that struc­ture. Just by show­ing that the prob­lem you are fac­ing con­forms to a cer­tain al­ge­braic struc­ture gives you a lot of in­for­ma­tion for free. From a pro­gram­ming point of view it en­ables us to use the same func­tional chain for dif­fer­ent data types: if the be­hav­iour of our pro­gram uses only func­tions from al­ge­braic struc­tures, we can change the un­der­ly­ing type of data be­ing fed through the “pipeline” with­out hav­ing to mod­ify too much of the func­tion­al­i­ty. This can also ap­ply to UML and SysML, take the ex­am­ple of the join node in ac­tiv­ity di­a­grams.

There is a good rea­son why I started by men­tion­ing the Monoid. Not only is the Monoid pos­si­bly the sim­plest in­tu­itive al­ge­braic struc­ture, it could be used to give mean­ing to the join node in ac­tiv­ity di­a­grams for ob­ject flows.

Activity diagram fragment showing two object flows connecting to a join node

In the cur­rent UM­L/SysML in­ter­pre­ta­tion of join nodes, they can­not be used eas­ily with ob­ject flows as in the di­a­gram above. This is be­cause if two to­kens reach the node, what value do we give the out­go­ing to­ken? In the cur­rent spec, you have to de­fine each join node with a con­straint de­tail­ing how it com­bines the in­puts Spec­i­fy­ing that all of the in­com­ing flows must be of the same type with a de­fined monoid solves this prob­lem. We sim­ply de­fine the value of the out­go­ing to­ken as the “m­Plus” of the val­ues of the in­com­ing to­kens. As a monoid must be as­so­cia­tive, this pat­tern would work for any num­ber of in­com­ing ob­ject flows.

There are far more use­ful and in­ter­est­ing al­ge­braic struc­tures that em­body many pat­terns that arise in en­gi­neer­ing sit­u­a­tions. I’ll ex­plore a few in some fu­ture blog posts, per­haps I may even try to tackle the dreaded Monad in UML.

A UML im­ple­men­ta­tion for Al­ge­braic Struc­tures

So how would we im­ple­ment such thing as an al­ge­braic struc­ture in UML? We’re go­ing to need to work out a way to rep­re­sent the as­pects of the Set, the op­er­a­tions and the laws. We’re also go­ing to have to en­sure that one al­ge­braic struc­ture can be ap­plied to mul­ti­ple dif­fer­ent sets pos­si­bly mul­ti­ple times (e.g. mul­ti­pli­ca­tion and ad­di­tion monoids of nat­ural num­ber­s).

The Set

The set as­so­ci­ated with the al­ge­braic struc­ture could be a class, a data type or in­deed any clas­si­fi­er. This points to­wards the well un­der­stood idea of “in­ter­faces” in Java where an ab­stract class (a class that can­not di­rectly type an ob­ject) is used to im­part cer­tain op­er­a­tions onto a class.

There­fore, at­tach­ing a data type to a al­ge­braic struc­ture is done sim­ply with a gen­er­al­i­sa­tion re­la­tion­ship. This scans well as gen­er­al­i­sa­tion is an “is a” re­la­tion­ship; any data type that is gen­er­alised by the al­ge­braic struc­ture is a spe­cial­i­sa­tion of the al­ge­braic struc­ture.

The Op­er­a­tions

The op­er­a­tions of the al­ge­braic struc­ture are just op­er­a­tions. Thanks to the sub­typ­ing poly­mor­phism given to us by UM­L’s gen­er­al­i­sa­tion, we only need to de­fine the op­er­a­tions in terms of the al­ge­braic struc­ture. In UML no­ta­tion, the monoid op­er­a­tion might look some­thing like this:

 mplus (a : Monoid, b : Monoid) : Monoid

Note that with UML, the mZero func­tion be­comes an OCL con­straint. In­ter­est­ing­ly, in Idris all of the op­er­a­tions and laws are, in fact, func­tions due to the Cur­ry-Howard-Lam­bek iso­mor­phism where propo­si­tions iso­mor­phic to types; the func­tion con­struc­tor ’’‘->’’’ is iso­mor­phic to im­pli­ca­tion in prepo­si­tional log­ic.

The Laws

Laws are quite eas­ily added to the new clas­si­fier in the form of OCL con­straints. In fact the monoid laws can be rewrit­ten in OCL as:

 con­text Monoid
 inv:    self -> forAl­l(m | ex­ist­s(mZero | mAd­d(mZe­ro, m) == m))
 inv:    self -> forAl­l(m | ex­ist­s(mZero | mAd­d(m, mZe­ro) == m))
 inv:    self -> forAl­l(m | forAll (n | forAll (o | mAd­d(m, mAd­d(n, o)) == mAd­d(­mAd­d(m, n), o))))

Per­son­al­ly, I find the OCL a lot harder to read than Idris. Its some­what un­for­tu­nate that the UML and OCL de­vel­op­ers had­n’t en­coun­tered the ML fam­ily of lan­guages as OCL could be have been more eas­ily grokkable to the av­er­age en­gi­neer with­out los­ing its for­mal­i­ty. Per­haps we’d have had a UML based on cat­e­gory the­o­ry; the third part of the Cur­ry-Howard-Lam­bek Iso­mor­phism.

con­clu­sion

In this post, we ex­plored what al­ge­braic struc­tures are and how they can be use­ful to us as en­gi­neers at­tempt­ing to model sys­tems. This re­sulted in a pro­posal for a way of mod­el­ling some­thing anal­o­gous to type classes in UML us­ing stereo­types as a base. We were, how­ev­er, left with one as­pect of type classes that UML can’t han­dle: how to im­ple­ment the laws. Im­ple­ment­ing laws and proofs is out­side the scope of UML and for now we’ll have to set­tle for OCL con­straints to im­ple­ment our laws. Would­n’t it be nice, though, to have one lan­guage? To have a graph­i­cal equiv­a­lent of “Idris”? I’ll ex­plore what makes Idris Idris in an­other blog post where we’ll ex­plore de­pen­dent types and what a mod­el­ling lan­guage would have to be to in­clude them.

Ref­er­ences

[1] OMG 2017 OMG™ Uni­fied Mod­el­ing Lan­guage™ (OMG UM­L™) Ver­sion 2.5.1 http://www.omg.org/spec/SysM­L/1.4/