KRC

The
Semantic
Elegance
D. A.
University
of A p p l i c a t i v e
Languages
Turner
of K e n t a t C a n t e r b u r y
An overview
In w h a t d o e s the a l l e g e d s u p e r i o r i t y of
applicative languages consist?
In the
l a s t a n a l y s i s the a n s w e r m u s t be in t e r m s
o f t h e r e d u c t i o n in t h e t i m e r e q u i r e d to
p r o d u c e a c o r r e c t p r o g r a m to s o l v e a g i v e n
problem.
On r e f l e c t i o n I d e c i d e d t h a t
t h e b e s t w a y to d e m o n s t r a t e t h i s w o u l d be
to t a k e some r e a s o n a b l y n o n - t r i v i a l
p r o b l e m a n d s h o w how, by p r o c e e d i n g w i t h i n
a c e r t a i n k i n d of a p p l i c a t i v e l a n g u a g e
f r a m e w o r k it w a s p o s s i b l e to d e v e l o p a
w o r k i n g s o l u t i o n w i t h a f r a c t i o n of t h e
e f f o r t that w o u l d h a v e b e e n n e c e s s a r y in
a conventional imperative language.
The
particular problem I have chosen also
b r i n g s o u t a n u m b e r of g e n e r a l p o i n t s of
interest which I shall discuss briefly
afterwards.
of KRC
A K R C p r o g r a m is a c o l l e c t i o n of e q u a t i o n s
by m e a n s of w h i c h t h e u s e r a t t a c h e s n a m e s
to v a r i o u s o b j e c t s in K R C ' s u n i v e r s e of
discourse.
T h e u n i v e r s e of d i s c o u r s e c o n tains four types of object.
There are two
b a s i c d a t a t y p e s , n u m b e r s and s t r i n g s ;
a n d t w o t y p e s of s t r u c t u r e d o b j e c t - l i s t s
and functions.
Numbers and strings have
the s o r t of p r o p e r t i e s o n e w o u l d e x p e c t .
L i s t s a r e d e n o t e d by u s i n g s q u a r e b r a c k e t s
a n d c o m m a s - t h u s [i, 2, 3, 4] a n d t h e
e m p t y l i s t is w r i t t e n [].
L i s t s m a y be a c c e s s e d by i n d e x i n g , so if L
is the n a m e of a l i s t L 3 d e n o t e s its
third component.
An important operator on
l i s t s is ":" w h i c h j o i n s a n e w m e m b e r at
t h e front.
So
B e f o r e p r o c e e d i n g it w i l l be n e c e s s a r y
for m e to q u i c k l y o u t l i n e t h e l a n g u a g e
f r a m e w o r k w i t h i n w h i c h we s h a l l b e w o r k ing.
V e r y b r i e f l y it c a n be s u m m a r i s e d
a s ( n o n - s t r i c t , h i g h e r order) r e c u r s i o n
e q u a t i o n s + set a b s t r a c t i o n .
Obviously
w h a t m a t t e r s are the u n d e r l y i n g s e m a n t i c
c o n c e p t s , n o t the p a r t i c u l a r s y n t a x t h a t
is u s e d to e x p r e s s them, b u t for t h e
s a k e of d e f i n i t e n e s s I s h a l l u s e t h e
n o t a t i o n of KRC (= " K e n t R e c U r s i v e
C a l c u l a t o r " ) , an a p p l i c a t i v e p r o g r a m m i n g
s y s t e m i m p l e m e n t e d at t h e U n i v e r s i t y of
K e n t [ T u r n e r 81].
K R C is f a i r l y
c l o s e l y b a s e d o n the e a r l i e r l a n g u a g e
SASL, [ T u r n e r 763, b u t I h a v e a d d e d a
f a c i l i t y for set a b s t r a c t i o n .
0:
[1,
2,
33
h a s t h e v a l u e [0, i, 2, 33.
The elements
of a l i s t m a y be of a n y t y p e a n d n e e d n o t
all be of t h e same type.
L i s t s m a y be
f i n i t e o r i n f i n i t e , for e x a m p l e t h e
equation
x = l:x
d e f i n e s x to be a n a m e
l i s t [I, i, 1 ...].
for t h e
infinite
The universe of discourse contains all
possible infinite lists (actually not all
of t h e m c a n b e d e n o t e d b y w r i t i n g d o w n
e q u a t i o n s , b u t for r e a s o n s of c o n t i n u i t y
in t h e s e m a n t i c s w e h a v e t o say t h e y a r e
all there).
A u s e f u l p i e c e of s h o r t h a n d
is t h e ".." n o t a t i o n , t h u s
[a..b]
Permission to copy without fee all or part of this material is granted
provided that the copies are not made or distributed for direct
commercial advantage, the ACM copyright notice and the title of the
publication and its date appear, and notice is given that copying is by
permission of the Association for Computing Machinery. To copy
otherwise, or to republish, requires a fee and/or specific permission.
d e n o t e s t h e list of n u m b e r s
f r o m a t o b.
This also has
form, so for e x a m p l e
[I..]
denotes
©
1981 ACM 0-89791-060-5/81-10/0085
(inclusive)
an open-ended
$00.75
85
the
list of all
natural
numbers.
Functions are denoted by writing down one
o r m o r e e q u a t i o n s , w i t h t h e n a m e of t h e
f u n c t i o n f o l l o w e d by s o m e f o r m a l p a r a m e t e r s on t h e l e f t a n d a n e x p r e s s i o n
g i v i n g a v a l u e for the f u n c t i o n o n t h e
right.
So for e x a m p l e f u n c t i o n s f o r
squaring
a n u m b e r a n d for c a l c u l a t i n g
f a c t o r i a l s w o u l d be w r i t t e n r e s p e c t i v e l y
( " p r o d u c t " is a l i b r a r y f u n c t i o n ) .
zfexpression
qualifier
[i..n]
0 = A
A m n = A
(n-l)),
(a:x)
= a * product
f x = {a
I a ÷ x;
f a}
T h e u s e of t h e c o n s t r u c t is p a r t i c u l a r l y
c o n v e n i e n t w h e n t h e r e is m o r e t h a n o n e
g e n e r a t o r i n v o l v e d , for e x a m p l e a f u n c t i o n
for g e n e r a t i n g t h e c a r t e s i a n p r o d u c t of
two l i s t s (i.e. a l i s t of a l l p a i r s f o r m e d
by d r a w i n g o n e f r o m each) c a n b e w r i t t e n
m>0&n>0
N o t i c e t h a t t h e o r d e r in w h i c h the a b o v e
e q u a t i o n s a r e w r i t t e n h a s no l o g i c a l s i g nificance
(we i n s e r t g u a r d s w h e r e n e c e s s a r y to e n s u r e t h i s ) .
As another
example of the use of pattern matching,
t h e l i b r a r y f u n c t i o n " p r o d u c t " c a n be
defined
[1 = 1
I a ÷ x}
The library "filter", which filters a list
t h r o u g h a g i v e n p r e d i c a t e , c o u l d be
defined
i, m > 0
(m-l) (A m
product
f x = {f a
filter
(m-l)
product
::= b o o l e a n - e x p r e s s i o n
map
0 n = n + 1
A m
::= v a r + l i s t - e x p r e s s i o n
guard
O n e e x a m p l e is the f o l l o w i n g d e f i n i t i o n of
t h e (2nd order) l i b r a r y f u n c t i o n "map"
which applies a function to every element
of a l i s t
A f u n c t i o n c a n be d e f i n e d b y m o r e t h a n
one equation, with the different cases
b e i n g d i s t i n g u i s h e d b y the u s e of p a t t e r n
m a t c h i n g in t h e f o r m a l p a r a m e t e r s a n d / o r
t h e u s e of g u a r d s ( b o o l e a n e x p r e s s i o n s ,
w r i t t e n o n t h e r i g h t of a n e q u a t i o n
following a comma).
Ackermann's function
for e x a m p l e , c o u l d be d e f i n e d b y t h e
following three equations
A
generator
T h e v a r i a b l e o n t h e l e f t o f a g e n e r a t o r is
a l o c a l v a r i a b l e o f t h e zf e x p r e s s i o n a n d
r a n g e s o v e r t h e m e m b e r s of t h e l i s t o n the
r i g h t o f t h e "÷" sign.
sq n = n * n
fac n = p r o d u c t
::= { e x p l q u a l i f i e r ; -..
...; q u a l i f i e r }
::= g e n e r a t o r l g u a r d
cp x y = {[a,b]la÷x;b÷y}
T h e a d d i t i o n of set a b s t r a c t i o n r e p r e s e n t s
a v e r y c o n s i d e r a b l e i n c r e a s e in t h e expressive power of an applicative language.
C o n s i d e r for e x a m p l e t h e p r o b l e m of
defining a function which will return a
l i s t o f all p o s s i b l e p a r t i t i o n s o f a
number into positive integers.
For the
s a k e o f d e f i n i t e n e s s , l e t us s a y w i t h
p e r m u t a t i o n s , so [2.11 a n d [1,21 a r e t w o
d i f f e r e n t p a r t i t i o n s o f 3.
Such a funct i o n c a n b e e x p r e s s e d r a t h e r s u c c i n c t l y as
follows
x
N o t e t h a t t h e u s e of p a t t e r n s i n v o l v i n g
":" o n t h e l e f t of a n e q u a t i o n a v o i d s
t h e n e e d for e x p l i c i t u s e of t h e s e l e c t o r s
"hd" a n d "tl" o n lists.
F u n c t i o n s c a n be n o n - s t r i c t if the e q u a t i o n s i m p l y t h a t t h e y s h o u l d be.
For
e x a m p l e if w e d e f i n e f by
f x = 3
partitions
0 =
[[]]
partitions
n = {i:p
I i÷
[l..n];
p+ partitions(n-i)}
then of course f must always return the
r e s u l t 3, e v e n if its a r g u m e n t is r e presented by a non-terminating computation.
[We d o n o t d i s c u s s i m p l e m e n t a t i o n s at a l l here, b u t as t h e
reader will have already deduced from
the p r e s e n c e of i n f i n i t e lists, it
m u s t i n v o l v e s o m e f o r m of l a z y
evaluation.]
N o t i c e b y t h e w a y t h a t t h e v a r i a b l e s introduced by generators come into scope
f r o m l e f t to r i g h t , so l a t e r g e n e r a t o r s
c a n i n v o l v e e a r l i e r o n e s (but n o t v i c e
versa).
Following
this
extremely brief overview
of t h e n o t a t i o n of KRC, w e n o w t u r n o u r
a t t e n t i o n t o a s o m e w h a t l a r g e r p r o b l e m , as
promised.
Set A b s t r a c t i o n
We use a language construct closely based
on Zermelo-Frankel
set a b s t r a c t i o n
(except t h a t t h e r e s u l t is h e r e a l i s t
r a t h e r t h a n a set).
A n (informal)
syntax for these expressions may be given
as f o l l o w s
86
The
statement
p o s s i b l e , b u t in t h i s c a s e it is c l e a r
t h a t a c e n t r a l p r o b l e m is g o i n g to be def i n i n g an e q u i v a l e n c e r e l a t i o n o n m o l e c u l e s , a n d I f o u n d it d i f f i c u l t to b r i n g
t h i s into f o c u s w i t h o u t f i x i n g on an (at
l e a s t p r o v i s i o n a l ) r e p r e s e n t a t i o n for
molecules.
of the problem
T h e p r o b l e m w h i c h I h a v e c h o s e n to u s e as
a n i l l u s t r a t i o n h e r e a r i s e s f r o m the f i e l d
of organic chemistry.
I d a r e say t h a t
f r o m the p o i n t of v i e w o f a c h e m i s t w h o
h a s m a d e a s t u d y of t h e s e m a t t e r s it w i l l
seem a rather easy problem and perhaps
somebody, somewhere has a FORTRAN program
sitting on their desk which already solves
it.
N e v e r t h e l e s s f r o m t h e p o i n t of v i e w
of a p r o g r a m m e r w h o h a d n o t t r i e d it
b e f o r e , the p r o b l e m s e e m e d d i f f i c u l t
e n o u g h to be i n t e r e s t i n g .
At least
s e v e r a l c o m p e t e n t p r o g r a m m e r s I k n o w rep o r t e d to m e t h a t t h e y h a d f o u n d it so.
It t h e r e f o r e s e e m e d a g o o d t e s t i n g g r o u n d
for the l a n g u a g e f r a m e w o r k o u t l i n e d in the
o p e n i n g s e c t i o n of t h i s p a p e r .
In w h a t
f o l l o w s I h a v e c o n c e n t r a t e d on r e p r o d u c i n g
fairly honestly the thought processes
w h i c h f i r s t led m e to a r u n n a b l e s o l u t i o n ,
r a t h e r t h a n o n p r e s e n t i n g the m o s t e l e g a n t
or e f f i c i e n t s o l u t i o n p o s s i b l e .
Ideally we would like to define a canonical
o r i e n t a t i o n for e a c h m o l e c u l e so t h a t e a c h
d i s t i n c t i s o m e r is r e p r e s e n t e d by o n e a n d
only one data structure.
T h e r e d o e s not
s e e m to be a n y s t r a i g h t f o r w a r d w a y o f defining such a canonical orientation, however, (at l e a s t I c o u l d n o t see one) so w e
s h a l l go for t h e a l t e r n a t i v e p l a n of g i v ing a n o n - u n i q u e r e p r e s e n t a t i o n t o g e t h e r
w i t h a set of l a w s for d e t e r m i n i n g w h i c h
r e p r e s e n t a t i o n s are e q u i v a l e n t
(in t h e
sense that they are different orientations
of t h e s a m e m o l e c u l e ) .
O n e o b v i o u s w a y to r e p r e s e n t a p a r a f f i n
m o l e c u l e in t e r m s of l i s t s t r u c t u r e s is
as f o l l o w s .
We pick on one carbon atom
a r b i t r a r i l y a n d d e e m it to be t h e " l e a d i n g "
one.
W e t h e n r e p r e s e n t t h e m o l e c u l e as
a 4 - 1 i s t , t h e c o m p o n e n t s of t h e l i s t b e i n g
the sub-molecules
( " r a d i c a l s " in c h e m i c a l
t e r m i n o l o g y ) a t t a c h e d to t h e 4 b o n d s of
the l e a d i n g c a r b o n atom.
Each radical
is r e p r e s e n t e d e i t h e r b y t h e s t r i n g "H",
if it is j u s t a h y d r o g e n ato~p, or e l s e by
a 3 - 1 i s t c o r r e s p o n d i n g to a c a r b o n a t o m
w i t h 3 f u r t h e r r a d i c a l s a t t a c h e d t o it.
T h e p r o b l e m is to e n u m e r a t e , w i t h o u t r e p e t i t i o n s a n d in o r d e r of i n c r e a s i n g size,
all p o s s i b l e p a r a f f i n m o l e c u l e s .
For
those who have forgotten their high school
c h e m i s t r y , p a r a f f i n s a r e b u i l t up u s i n g
o n l y c a r b o n , w h i c h h a s a v a l e n c y of 4 a n d
h y d r o g e n , w h i c h h a s a v a l e n c y of i.
The
first few paraffin molecules, together
w i t h t h e i r n a m e s a r e s h o w n in f i g u r e I.
In a p a r a f f i n o n e is a l l o w e d n e i t h e r
d o u b l e b o n d s n o r c y c l e s , so all p a r a f f i n s
with n carbon atoms share the empirical
formula
In t e r m s of t h e s e c o n v e n t i o n s , then,
methane would be represented thus ["H",
Cn
H2n+2
"H",
"H"l
a n d o n e of s e v e r a l p o s s i b l e
t i o n s for p r o p a n e is
b u t for all n ~
there are several distinct molecules
("isomers") w i t h the s a m e
formula.
T h e n u m b e r of i s o m e r s r i s e s
r a t h e r r a p i d l y w i t h n.
In c o u n t i n g
i s o m e r s it s h o u l d be b o r n e in m i n d t h a t
t h e four b o n d p o s i t i o n s o n a g i v e n c a r b o n
a t o m a r e in f a c t i n d i s t i n g u i s h a b l e
(i.e.
the four bond positions
can
be
freely
i n t e r c h a n g e d ).
So w h a t s e e m
at f i r s t
to be d i f f e r e n t m o l e c u l e s m a y in fact
t u r n out to be d i f f e r e n t o r i e n t a t i o n s of
the same m o l e c u l e .
N o t i c e in p a r t i c u l a r
t h a t the p h e n o m e n o n of " s t e r e o i s o m e r i s m " (a m o l e c u l e b e i n g d i f f e r e D t
f r o m its m i r r o r image) is n o t p o s s i b l e
with paraffins.
representa-
[ [ "H", "H", "H" l, [ "H", "H", "H" ], "H" p "H" 3
W e c o u l d o b v i o u s l y s a v e a g r e a t d e a l of
space by not representing hydrogen atoms
explicitly, but following a principle of
s e p a r a t i o n of c o n c e r n s let us l e a v e t h a t
as an o p t i m i s a t i o n to be c a r r i e d o u t later.
D e f i n i n 9 an E q u i v a l e n c e
Molecules
Relation
on
T h e a b o v e r e p r e s e n t a t i o n for a m o l e c u l e
has t w o e l e m e n t s o f a r b i t r a r i n e s s - f i r s t
that any carbon atom could have been
p i c k e d o n as t h e l e a d i n g o n e - s e c o n d t h a t
for e a c h c a r b o n a t o m t h e o r d e r in w h i c h
its d e p e n d e n t r a d i c a l s a r e l i s t e d is
a r b i t r a r y and c o u l d be f r e e l y p e r m u t e d .
T h e p r o b l e m as p o s e d is n o t m e r e l y to
c o u n t t h e n u m b e r of i s o m e r s for e a c h n,
b u t a c t u a l l y to p r o d u c e a r e p r e s e n t a t i o n
(just one) of e a c h m o l e c u l e .
Choosing a Representation
Type "Paraffin Molecule"
"H",
for t h e D a t a
W e c a p t u r e t h i s b e l o w in t e r m s of t h r e e
transformations which we can carry out on
the representation of a molecule " i n v e r t " , " r o t a t e " a n d "swap".
Each
t r a n s f o r m a t i o n is e x p r e s s e d h e r e by a f u n c t i o n w h i c h c a n be a p p l i e d to a r e p r e s e n t a t i o n o f a m o l e c u l e to r e t u r n a ( p e r h a p s
L e t us b e g i n by c h o o s i n g a r e p r e s e n t a t i o n
for t h e m o l e c u l e s .
N o r m a l l y in t o p - d o w n
prograr~ning o n e is a d v i s e d to d e l a y
d e c i s i o n s a b o u t r e p r e s e n t a t i o n as l o n g as
87
H
-
H
H
H
H
H
I
I
I
I
i
C
-
H
H
-
C
H
H
-
H
H
-
H
I
C
-
H
H
H
I
i
I
C
I
-
I
H
C
-
I
H
C
l
H
H
-
I
-
I
C
-
H
-
H
i
H
H
H
H
t
-
C
propane
H
H
-
H
[
C
-
C
i
-
I
I
H
C
I
ethane
methane
-
C
I
I
H
-
H
I
H
H
C
H
H - C - H
normal
I
butane
H
iso-butane
H
-
H
H
H
H
i
I
I
I
C
-
C
-
C
-
C
H
I
-
C
I
I
i
I
I
H
H
H
H
H
normal
-
H
H
I
pentane
H
-
C
-
H
"H
H
]
H
I
H
-
C
-
H
H
H
I
I
I
C
I
I
H
H
-
C
-
C
H
-
C
I
- -
C
C
I
-
I
H
H
I
H
H
-
C
-
i
H
H
H-C-H
I
H
neo-pentane
iso-pentane
Figure
1
"Some
Paraffin
Molecules"
88
H
-
H
modified)
cule.
representation
invert
of the
[[a, b, c~ d, e, f3
= [a, b, c [d, e,
quotient
quotient
x i = "H"
rotate
[a, b,
c, d3 = [b, c, d, a3
b,
c, d3 = [b,
a b = member
equivclass
a, c, d3
closure
(equivclass
a = closure
under
under
para
a)b
[rotate,invert,swap3[a]
laws f s =
s ++ c l o s u r e ' f s s
0 = ["H"]
output
= layn
everything we need
s o l u t i o n to o u r
t o d e f i n e an o u t of " p a r a f f i n " ,
(append(map
paraffin
El..J))
a n d p r i n t i n g " o u t p u t " w i l l g i v e us the
r e q u i r e d l i s t o f p a r a f f i n m o l e c u l e s in
o r d e r o f i n c r e a s i n g size.
T h e l i s t is
i n f i n i t e a n d so the p r o c e s s of p r i n t i n g
it w i l l go o n
f o r e v e r , or at l e a s t
u n t i l t h e u s e r i n t e r r u p t s it at t h e terminal.
(Note - " a p p e n d " is a l i b r a r y
f u n c t i o n w h i c h t a k e s a l i s t of l i s t s a n d
j o i n s t h e n all t o g e t h e r w i t h "++"; "layn"
is a s t a n d a r d l a y o u t f u n c t i o n w h i c h
c a u s e s t h e e l e m e n t s of a l i s t to be
p r i n t e d o n e p e r l i n e w i t h n u m b e r e d lines.)
s a})
T h e k e y idea is e m b o d i e d in t h e f u n c t i o n
" c l o s u r e u n d e r l a w s " w h i c h t a k e s a set
of f u n c t Y o n s a ~ d a set of o b j e c t s a n d
finds the closure of the latter under
r e p e a t e d a p p l i c a t i o n s of the m e m b e r s
of t h e f o r m e r .
C l e a r l y t h i s is a
f u n c t i o n w h i c h c o u l d f i n d application.s
in a w i d e r a n g e of p r o b l e m s b e y o n d t h e
p r e s e n t one.
T h e a b o v e s o m e w h a t ind i r e c t d e f i n i t i o n , v i a the a u x i l i a r y
function closure' and closure", was
c h o s e n f o r r e a s o n s of e f f i c i e n c y .
Molecules
b}
f [] = []
At this point we have
to p r o d u c e a r u n n a b l e
problem.
We have only
p u t s t r u c t u r e in t e r m s
thus
In t h e a b o v e " m e m b e r " , "map" a n d " m k s e t "
are library functions - "mkset" removes
r e p e t i t i o n s f r o m a list, and "++" is t h e
a p p e n d o p e r a t o r o n lists.
All
a
n = {[a, b, c] I i , j , k ÷ [ 0 . . n - 1 3 ;
i~j~k; i + j + k = n - l ;
a ÷ p a r a i; b + p a r a j; c ÷ p a r a k}
c l o s u r e " f s t = [3, t = [3
= t ++ c l o s u r e ' f (s++t) t, t ~ [1
Generating
Size.
f x;
f
para
laws
closure' f s t = closure" f s
( m k s e t { a l f ' ÷ f ; a ÷ m a p f' t;
~ member
(n-l)}
= a:{blb÷quotient
W h e r e '~para", to be d e f i n e d b e l o w is a
f u n c t i o n w h i c h r e t u r n s a l i s t (perhaps
w i t h r e p e t i t i o n s ) of all p a r a f f i n r a d i c a l s of a g i v e n size.
The function
"quotient" defined above takes the
q u o t i e n t o f a set w i t h r e s p e c t to a
g i v e n e q u i v a l e n c e r e l a t i o n (i.e. r e t u r n s
a set c o n t a i n i n g o n l y o n e r e p r e s e n t a t i v e
of e a c h e q u i v a l e n c e c l a s s p r e s e n t in the
o r i g i n a l set) a n d is u s e d a b o v e t o e n s u r e
t h a t e a c h m o l e c u l e is r e p r e s e n t e d o n l y
o n c e in t h e f i n a l o u t p u t .
There follows
a d e f i n i t i o n o f "para"
It is c l e a r , m o r e o v e r , t h a t all t h e r e p r e s e n t a t i o n s of a g i v e n m o l e c u l e c a n be o b tained from any one representation by rep e a t e d a p p l i c a t i o n s of the a b o v e t r a n s formations.
We can freely permute the
b o n d s of t h e l e a d i n g c a r b o n a t o m
by comb i n e d a p p l i c a t i o n s o f " r o t a t e " a n d "swap"
a n d by c o m b i n i n g t h e s e w i t h a p p l i c a t i o n s
of " i n v e r t " a n y c a r b o n a t o m c a n e v e n t u a l l y
be b r o u g h t into t h e l e a d i n g p o s i t i o n .
We
c a n n o w d e f i n e a p r e d i c a t e " e q u i v " on
r e p r e s e n t a t i o n s of p a r a f f i n m o l e c u l e s
s u c h t h a t " e q u i v a b" d e t e r m i n e s w h e t h e r
a a n d b r e p r e s e n t t h e s a m e m o l e c u l e . Thus:
equiv
(a:x)
~
x = x,
[a,
f
equiv
1 x ÷ para
f]]
invert
swap
paraffin n = quotient
{Ix, "H", "H", "H"l
same m o l e -
~his solution, however, runs with appalling s l o w n e s s (I t r i e d it) m a i n l y b e c a u s e
of e a s i l y r e m o v a b l e i n e f f i c i e n c i e s in o u r
d e f i n i t i o n of "para".
T h e r e is a m i n o r
problem and a major problem.
T h e m i n o r p r o b l e m is t h a t t h e w a y w e
c h o o s e i, j a n d k in t h e d e f i n i t i o n of
p a r a n is n e e d l e s s l y w a s t e f u l .
We could
in fact f i r s t c h o o s e i in t h e set
[0.. (n-l)/3] t h e n c h o o s e j in t h e set
[i. (n-l-i)/23, w h e r e u p o n k is f i x e d to be
n-l-i-j.
T h i s l e a d s us to r e w r i t e t h e
s e c o n d l i n e in t h e d e f i n i t i o n of "para"
as
of a G i v e n
B e c a u s e of t h e a b s e n c e o f c y c l e s e v e r y
p a r a f f i n m o l e c u l e m u s t c o n t a i n at
l e a s t o n e o c c u r r e n c e of t h e m e t h a n e
r a d i c a l C H 3 a n d w e c a n w i t h o u t l o s s of
g e n e r a l i t y c h o o s e t h i s to t h e " l e a d i n g "
c a r b o n atom.
A function for generating
a list containing
(once each) a l l t h e
paraffin molecules with n carbons can
therefore be written
p a r a n = {[a, b, c3 I i ÷ [0.. (n-i)/33;
j ÷ [i.. ( n - l - i ) / 2 ] ;
a ÷ para i ;
b ÷ p a r a j ; c ÷ p a r a (n-l-i-j) }
T h e m a j o r p r o b l e m is t h a t in e v a l u a t i n g
" p a r a n" w e r e p e a t e d l y r e - e v a l u a t e p a r a i
for e a c h i < n a l a r g e
n u m b e r of times.
W e n e e d to m a k e p a r a into a " m e m o f u n c t i o n " [ M i c h i e 681 i.e. a f u n c t i o n
w h o s e v a l u e is c a l c u l a t e d o n l y o n c e for
89
e a c h a r g u m e n t , n a m e l y o n its f i r s t c a l l
a n d t h e r e u p o n s t o r e d in a table, so t h a t
if it is r e q u i r e d a g a i n it c a n be f o u n d
by t a b l e lookup, r a t h e r t h a n b y r e c a l c u lation.
F o r a r e c u r s i v e f u n c t i o n , like
"para", m e m o - i s a t i o n l e a d s to an e x p o n e n t i a l i m p r o v e m e n t in p e r f o r m a n c e .
(or to
p u t it a n o t h e r way, f a i l u r e to m e m o - i s e
l e a d s an e x p o n e n t i a l d e t e r i o r a t i o n in
performancel)
T h e f i r s t g e n e r a l l e s s o n I w o u l d d r a w is
t h a t b y t h e u s e of a n a p p r o p r i a t e l y d e signed applicative language the effort
n e c e s s a r y to a r r i v e at (and the s p a c e
n e c e s s a r y to e x p r e s s ) a n e x e c u t a b l e s o l u tion to a problem can be
r e d u c e d to a
s m a l l f r a c t i o n of t h a t r e q u i r e d in a t r a ditional programming language.
E v e n in
the present situation, where we lack the
h a r d w a r e n e c e s s a r y for t h e d i r e c t s u p p o r t
of a p p l i c a t i v e l a n g u a g e s , a n i m p l e m e n t a t i o n of a n a p p l i c a t i v e l a n g u a g e c a n b e an
e x t r e m e l y v a l u a b l e t o o l for t h e d e v e l o p m e n t a n d t e s t i n g of a l g o r i t h m s .
For
e x a m p l e I h a d a n u m b e r of m i s c o n c e p t i o n s
a b o u t the p a r a f f i n s p r o b l e m (which I
e l i d e d f r o m t h e a b o v e a c c o u n t ) of w h i c h I
w a s f a i r l y q u i c k l y d i s a b u s e d by i n t e r a c t ing with the KRC system.
If I ~ n o w h a d to
s o l v e t h e p r o b l e m in, say, P A S C A L , I w o u l d
do so w i t h m u c h g r e a t e r c o n f i d e n c e .
A t f i r s t s i g h t it s e e m s t h a t a m e m o f u n c t i o n i n v o l v e s in an e s s e n t i a l w a y t h e
u s e o f s i d e - e f f e c t s for its e x p r e s s i o n .
T h i s is, h o w e v e r , n o t t h e case.
T h e r e is
in S A S L a s t a n d a r d t r a n s f o r m a t i o n for
t u r n i n g a f u n c t i o n into a m e m o f u n c t i o n
in a p u r e l y a p p l i c a t i v e w a y - see
[ T u r n e r 81a, C h a p t e r 43.
A p p l y i n g the
i d e a to "para" l e a d s us to r e w r i t e its
d e f i n i t i o n as f o l l o w s :
para
0 = ["H"3
para
n = paralist
paralist
T h e s e c o n d g e n e r a l a b s e r v a t i o n is t h a t t h e
language framework we are using here
supports very nicely the following separat i o n of c o n c e r n s (which h a s of c o u r s e b e e n
advocated many times before).
In a f i r s t
step we concentrate on writing down a
l o g i c a l l y c o r r e c t d e f i n i t i o n of the d e sired function, completely ignoring cons i d e r a t i o n s of e f f i c i e n c y .
Recursion
p l u s set a b s t r a c t i o n is a v e r y p o w e r f u l
c o m b i n a t i o n for t h i s p u r p o s e , e n a b l i n g us
to t h i n k v e r y "big t h o u g h t s " in o n e go.
Typically, however, the definitions we
a r r i v e at in t h i s w a y h a v e a n e x p o n e n t i a l
or combinatorial run-time, whereas there
m a y e x i s t an a l g o r i t h m w h i c h is l i n e a r
(or at l e a s t p o l y n o m i a l ) .
In a s e c o n d
s t e p w e r e p a i r t h e e f f i c i e n c y of t h e
d e f i n i t i o n , by a p p l y i n g t r a n s f o r m a t i o n s
know to preserve correctness.
In a surp r i s i n g l a r g e n u m b e r of c a s e s it t u r n s
o u t t h a t a s m a l l n u m b e r of s t a n d a r d o p t i misations are sufficient to bring about
t h e n e c e s s a r y i m p r o v e m e n t in p e r f o r m a n c e .
T w o in p a r t i c u l a r s e e m to be of s u c h
g e n e r a l a p p l i c a b i l i t y as to d e s e r v e s p e c i a l m e n t i o n in a n e x t (and final) s e c t i o n
of t h i s p a p e r .
n
= map genpara
If..]
g e n p a r a n = {[a, b, c] I i + [0.. (n-i)/33;
j ÷ [i.. ( n - l - i ) / 2 ] ; a ÷ p a r a i ;
b + p a r a j ; c ÷ p a r a (n-l-i-j)}
In t h e a b o v e " g e n p a r a " p e r f o r m s the c a l c u l a t i o n , b u t t h e r e c u r s i o n is r e p l a c e d by
table lookup.
T h e l o o k u p t a b l e is r e p r e sented by the (infinite)list "paralist"
(in S A S L a n d K R C l i s t s c a n b e i n d e x e d b y
a p p l y i n g t h e m to a n i n t e g e r ) .
The elem e n t s o f " p a r a l i s t " a r e i n i t i a l i s e d by
c a l l i n g " g e n p a r a " b u t t h a n k s to l a z y
evaluation they only come into existence
as t h e y a r e a c c e s s e d .
W e n o w h a v e a r u n n a b l e p r o g r a m for o u r
paraffin problem.
The complete text
o f t h e p r o g r a m is s h o w n in f i g u r e 2,
a c c o m p a n i e d b y a n i n i t i a l s e g m e n t of
its o u t p u t .
T h e c o m m a n d " o u t p u t S " is
a n i n s t r u c t i o n to t h e KRC s y s t e m to p r i n t
t h e list, o u t p u t .
Lessons
drawn
T h e t h i r d a n d f i n a l o b s e r v a t i o n I w i s h to
m a k e r e l a t e s m o r e s p e c i f i c a l l y to t h e
paraffin problem.
I believe that the
reason why this seems on first inspection
to b e r a t h e r a h a r d p r o b l e m is b e c a u s e it
involves an unfree data type and I suspect
t h i s is c h a r a c t e r i s t i c of a lot of the
more recalcitrant problems one meets.
A
g e n e r a l w a y of c h a r a c t e r i s i n g a n u n f r e e
d a t a t y p e , w h i c h w e u s e d in t h i s e x a m p l e ,
is as t h e q u o t i e n t o f a f r e e d a t a t y p e
under an equivalence relation and a good
way of defining an equivalence relation
is t o g i v e a set o f l a w s of w h i c h it is
the closure.
Our function "closure
u n d e r l a w s " g i v e s us a c o n v e n i e n t h ~ n d l e
o n t o ~ h i s a n d I h o p e it w i l l t u r n o u t to
b e u s e f u l for o t h e r a p p l i c a t i o n s in t h e
future.
The above program was the fruit of about
a n h o u r ' s l a b o u r at a t e r m i n a l a n d s e e m s
a r e a s o n a b l y c o n v i n c i n g d e m o n s t r a t i o n of
t h e u t i l i t y of r e c u r s i o n e q u a t i o n s p l u s
set a b s t r a c t i o n as a l a n g u a g e f r a m e w o r k .
T h e p r o g r a m is far f r o m f u l l y p o l i s h e d
a n d h a s v e r y m u c h t h e s t a t u s of a f i r s t
cut.
B y t h e a p p l i c a t i o n of t w o o b v i o u s
o p t i m i s a t i o n s - (a) t h e r e m o v a l o f t h e
r e d u n d a n t "H"s f r o m t h e i n t e r n a l r e p r e s e n t a t i o n of m o l e c u l e s a n d (b) t h e u s e
of a n i d e a c a l l e d " f i l t e r p r o m o t i o n "
(see
later), I was subsequently led to a program which ran perhaps ten times faster
than the above.
Rather than pursuing
t h e s e f u r t h e r r e f i n e m e n t s in d e t a i l here,
however, this seems an appropriate moment
to b r e a k o f f f r o m t h e c o n s i d e r a t i o n of t h i s
particular problem and draw some general
lessons.
90
output = layn (append (map paraffin [I..]))
paraffin n = quotient equiv {[x,"H","H","H"]Ix<-para
(n-
I)}
para 0 : ["H"]
para n = paralist n
paralist = map genpara [I..]
genpara n = {[a,b,c][i<-[0..(n-])/3];j<-[i..(n-]-i)/2];
a<-para i;b<-para j;c<-para ( n -
] - i - j)}
equiv a b = member (equivclass a) b
equivclass x = closure under laws [invert,rotate,swap]
invert [[a,b,c],d,e,f]
Ix]
= [a,b,c,[d,e,f]]
invert x = x, × ] = "H"
rotate [a,b,c,d] = [b,c,d,a]
swap [a,b,c,d] : [b,a,c,d]
closure'
closure"
f s t = closure"
f s (mkset {a',f'<-f;a<-map f' t;\member s a})
f s t = [], t = []
= t ++ closure'
f (s ++ t) t
closure under laws f s : s ++ closure'
quotient f (a:x) = a:{b[b<-quotient
f s s
f x;\f a b}
quotient f [] = []
output!
I)
[ "H" ," H" ," H" ," H" ]
2)
[ [ "H" ,"H" ,"H" ] ," H" ," H" ," H" ]
3)
[ [ "H" ," H" ,[ "H" ," H" ,"H" ] ] ,"H" ," H" ," H" ]
4)
[ [ "H"," H", [ "H"," H", [ "H" ," H" ," H" ] ] ], "H" ," H"," H" ]
5)
[ [ "H" ,[ "H" ," H" ,"H"], [ "H" ," H" ," H" ] ] ,"H" ,"H" ,"H" ]
6)
[ [ "H" ," H" ,[ "H" ,"H" ,[ "H" ," H" ,[ "H" ,"H" ," H" ] ] ] ] ,"H" ,"H" ,"H"]
7)
[ [ "H" ," H", [ "H", [ "H" ," H" ,"'H" ], [ "H" ," H" ," H" ] ] ] ," H" ," H"," H" ]
8)
[ [ [ "H" ," H" ," H" ], [ "H" ," H" ," H" ], [ "H" ," H" ," H" ] ], "H" ," H" ," H" ]
9)
[ [ "H" ," H" ,[ "H" ," H", [ "H" ," H", [ "H" ," H", [ "H" ," H" ," H" ] ] ] l ] ," H" ," H"," H" ]
I0)
[ [ "H" ," H", [ "H" ," H", [ "H", [ "H" ," H" ," H" ], [ "H"," H" ," H" ] ] ] ]," H" ," H" ," H" ]
11)
[ [ "H" ," H", [ "H" ,[ "H" ," H" ,"H" ], [ "H" ," H", [ "H" ," H" ," H" ] ] ] ], "H" ," H" ,"H" ]
12)
[ [ "H" ," H" , [ [ "H" ," H" ,"H" ], [ "H" ," H" ," H" ], [ "H" ," H" ,"H" ] ] ], "H" ,"H" ," H" ]
13)
[ [ "H" ,[ "H" ," H" ," H" ], [ "H", [ "H" ," H" ," H" ], [ "H" ,"H" ," H" ] ] ] , " H " ,"H" ,"H" ]
•
.
•
•
FIGURE 2
IT.e Paraffins program in K R C , with some initial output.
91
Two useful
optimisations
eliminate permutations, say by deciding
to a l l o w o n l y i n c r e a s i n ~ p a r t i t i o n s , e.g.
a m o n g t h e p a r t i t i o n s of 3 w e a l l o w
[ i , i , i ] a n d [ l , 2 ] b u t n o t [2,1].
Our
f i r s t t h o u g h t c o u l d b e to a p p l y a f i l t e r
to our original function
T h e t w o o p t i m i s a t i o n s w h i c h w a r r a n t spec i a l m e n t i o n h e r e are m e m o - i s a t i o n
( o r i g i n a l l y d u e to D o n a l d M i c h i e ) a n d
"filter promotion"
(both t h e n a m e a n d t h e
idea o f w h i c h a r e d u e to J o h n D a r l i n g t o n )
[ D a r l i n g t o n 79].
partitions'
(a) M e m o - i s a t i o n
(n-i)
+ fib
(n-2),
partitions'
partitions'
n>2
A l t h o u g h t h i s h a s s o m e c l a i m to be c o n s i d e r e d the m o s t n a t u r a l d e f i n i t i o n it
suffers from a run-time that increases
e x p o n e n t i a l l y w i t h n.
W e c o u l d of c o u r s e
program the well-known linear algorithm
explicitly
(by t a i l r e c u r s i o n ) b u t it is
in f a c t p o s s i b l e to a c h i e v e a l i n e a r r u n t i m e w i t h o u t a b a n d o n i n g t h e s t r u c t u r e of
the above definition.
= m a p f i b [i..3
i
i
f i b l i s t (n-l)
+ fiblist(n-2),
T h e r u n - t i m e of "fib"
s t e a d of e x p o n e n t i a l ~
is n o w
linear
Filter
0 = [[]]
n = {i : p I i ÷ F l . . n ] ;
p÷partitions'
(n-i) ;
p = [] v i ~ h d p}
Note
T h e u s e of Z e r m e l o - F r a n k e l set a b s t r a c t i o n
as a n i m p l e m e n t a b l e l a n g u a g e f e a t u r e s e e m s
o r i g i n a l l y to h a v e b e e n p r o p o s e d b y J o h n
D a r l i n g t o n [ D a r l i n g t o n 75].
REFERENCES
D a r l i n g t o n 75
" A p p l i c a t i o n s of p r o g r a m
t r a n s f o r m a t i o n to p r o g r a m s y n t h e s i s "
Proceedings conference on proving and
i m p r o v i n g p r o g r a m s , A r c et S e n a n s 1 9 7 5
D a r l i n g t o n 79
"A s y n t h e s i s
sorting algorithms"
A c t a I n f o r m a t i c a 1979
n>2
in-
of
several
M i c h i e 68
"Memo-functions - a language
feature with role learning properties"
in E X P E R I M E N T A L P R O G R A M M I N G 1 9 6 6 - 7 ,
Edinburgh University, Dept.of Machine
Intelligence and Perception, January
1968
O b v i o u s l y t h e t e c h n i q u e c a n be a p p l i e d
to a n y f u n c t i o n o f i n t e g e r a r g u m e n t s .
Notice
that this purely applicative
a p p r o a c h to m e m o f u n c t i o n s d e p e n d s
heavily on the fact that the language
in w h i c h w e a r e w o r k i n g h a s a n o n strict semantics.
(Incidentally, a
m o r e r a d i c a l a p p r o a c h , w h i c h m i g h t be
w o r t h p u r s u i n g , w o u l d be to t r y a n d g e t
t h e s y s t e m to p e r f o r m t h i s c l a s s of o p timisations automatically.
One quite
promising approach, with which I have
been experimenting,
is to m o d i f y t h e
r u n - t i m e s y s t e m by k e e p i n g a n a s s o c i a t i v e c a c h e of t h e r e s u l t s o f a l l r e c e n t
function applications.)
(b)
n)
Like memo-isation, filter promotion can
l e a d t o v e r y d r a m a t i c i m p r o v e m e n t s in
performance.
W e do t h i s b y t u r n i n g "fib" i n t o a m e m o function.
We introduce a data structure
" f i b l i s t " in w h i c h w e s t o r e t h e v a l u e s of
t h e f u n c t i o n a n d r e p l a c e a l l c a l l s to t h e
f u n c t i o n in t h e r e s t of t h e p r o g r a m , including the recursive calls inside the
d e f i n i t i o n of "fib" i t s e l f , b y t a b l e lookup.
Thus
fiblist
fib i =
fib 2 =
fib n =
increasing
(partitions
where we can easily give a recursive
d e f i n i t i o n of " i n c r e a s i n g " as a p r e d i c a t e
o n lists.
W e c a n g e t a c o n s i d e r a b l e :imp r o v e m e n t in p e r f o r m a n c e , h o w e v e r , b y
pushing the filter inside the generator
" p a r t i t i o n s " so t h a t t h e u n w a n t e d l i s t s
a r e n o t c r e a t e d in t h e f i r s t p l a c e (the
reader should compare this with the defin i t i o n s of p a r t i t i o n s e a r l i e r in t h e
paper):
This optimisation technique has already
been demonstrated earlier, on the funct i o n "para".
A similar example, however, m a y b r i n g o u t t h e m e t h o d m o r e
clearly.
Consider the following"obvious"
d e f i n i t i o n o f t h e f u n c t i o n "fib n" w h i c h
returns the n'th fibonacci number.
fib i = i
fib 2 = i
fib n = fib
n = filter
T u r n e r 76
" S A S L L a n g u a g e M a n u a l " , St.
Andrews University Technical Report,
December 1976
T u r n e r 81
"KRC L a n g u a g e M a n u a l " ,
U n i v e r s i t y of K e n t (in p r e p a r a t i o n )
T u r n e r 81a
D.Phil.Thesis,
University 1981
Promotion
Again we can best bring out the technique
by means of a simple example.
Suppose
we a r e a s k e d to m o d i f y o u r e a r l i e r d e f i n i t i o n of t h e f u n c t i o n " p a r t i t i o n s " so as to
92
Oxford