The Semantic Elegance D. A. University of A p p l i c a t i v e Languages Turner of K e n t a t C a n t e r b u r y An overview In w h a t d o e s the a l l e g e d s u p e r i o r i t y of applicative languages consist? In the l a s t a n a l y s i s the a n s w e r m u s t be in t e r m s o f t h e r e d u c t i o n in t h e t i m e r e q u i r e d to p r o d u c e a c o r r e c t p r o g r a m to s o l v e a g i v e n problem. On r e f l e c t i o n I d e c i d e d t h a t t h e b e s t w a y to d e m o n s t r a t e t h i s w o u l d be to t a k e some r e a s o n a b l y n o n - t r i v i a l p r o b l e m a n d s h o w how, by p r o c e e d i n g w i t h i n a c e r t a i n k i n d of a p p l i c a t i v e l a n g u a g e f r a m e w o r k it w a s p o s s i b l e to d e v e l o p a w o r k i n g s o l u t i o n w i t h a f r a c t i o n of t h e e f f o r t that w o u l d h a v e b e e n n e c e s s a r y in a conventional imperative language. The particular problem I have chosen also b r i n g s o u t a n u m b e r of g e n e r a l p o i n t s of interest which I shall discuss briefly afterwards. of KRC A K R C p r o g r a m is a c o l l e c t i o n of e q u a t i o n s by m e a n s of w h i c h t h e u s e r a t t a c h e s n a m e s to v a r i o u s o b j e c t s in K R C ' s u n i v e r s e of discourse. T h e u n i v e r s e of d i s c o u r s e c o n tains four types of object. There are two b a s i c d a t a t y p e s , n u m b e r s and s t r i n g s ; a n d t w o t y p e s of s t r u c t u r e d o b j e c t - l i s t s and functions. Numbers and strings have the s o r t of p r o p e r t i e s o n e w o u l d e x p e c t . L i s t s a r e d e n o t e d by u s i n g s q u a r e b r a c k e t s a n d c o m m a s - t h u s [i, 2, 3, 4] a n d t h e e m p t y l i s t is w r i t t e n []. L i s t s m a y be a c c e s s e d by i n d e x i n g , so if L is the n a m e of a l i s t L 3 d e n o t e s its third component. An important operator on l i s t s is ":" w h i c h j o i n s a n e w m e m b e r at t h e front. So B e f o r e p r o c e e d i n g it w i l l be n e c e s s a r y for m e to q u i c k l y o u t l i n e t h e l a n g u a g e f r a m e w o r k w i t h i n w h i c h we s h a l l b e w o r k ing. V e r y b r i e f l y it c a n be s u m m a r i s e d a s ( n o n - s t r i c t , h i g h e r order) r e c u r s i o n e q u a t i o n s + set a b s t r a c t i o n . Obviously w h a t m a t t e r s are the u n d e r l y i n g s e m a n t i c c o n c e p t s , n o t the p a r t i c u l a r s y n t a x t h a t is u s e d to e x p r e s s them, b u t for t h e s a k e of d e f i n i t e n e s s I s h a l l u s e t h e n o t a t i o n of KRC (= " K e n t R e c U r s i v e C a l c u l a t o r " ) , an a p p l i c a t i v e p r o g r a m m i n g s y s t e m i m p l e m e n t e d at t h e U n i v e r s i t y of K e n t [ T u r n e r 81]. K R C is f a i r l y c l o s e l y b a s e d o n the e a r l i e r l a n g u a g e SASL, [ T u r n e r 763, b u t I h a v e a d d e d a f a c i l i t y for set a b s t r a c t i o n . 0: [1, 2, 33 h a s t h e v a l u e [0, i, 2, 33. The elements of a l i s t m a y be of a n y t y p e a n d n e e d n o t all be of t h e same type. L i s t s m a y be f i n i t e o r i n f i n i t e , for e x a m p l e t h e equation x = l:x d e f i n e s x to be a n a m e l i s t [I, i, 1 ...]. for t h e infinite The universe of discourse contains all possible infinite lists (actually not all of t h e m c a n b e d e n o t e d b y w r i t i n g d o w n e q u a t i o n s , b u t for r e a s o n s of c o n t i n u i t y in t h e s e m a n t i c s w e h a v e t o say t h e y a r e all there). A u s e f u l p i e c e of s h o r t h a n d is t h e ".." n o t a t i o n , t h u s [a..b] Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission. d e n o t e s t h e list of n u m b e r s f r o m a t o b. This also has form, so for e x a m p l e [I..] denotes © 1981 ACM 0-89791-060-5/81-10/0085 (inclusive) an open-ended $00.75 85 the list of all natural numbers. Functions are denoted by writing down one o r m o r e e q u a t i o n s , w i t h t h e n a m e of t h e f u n c t i o n f o l l o w e d by s o m e f o r m a l p a r a m e t e r s on t h e l e f t a n d a n e x p r e s s i o n g i v i n g a v a l u e for the f u n c t i o n o n t h e right. So for e x a m p l e f u n c t i o n s f o r squaring a n u m b e r a n d for c a l c u l a t i n g f a c t o r i a l s w o u l d be w r i t t e n r e s p e c t i v e l y ( " p r o d u c t " is a l i b r a r y f u n c t i o n ) . zfexpression qualifier [i..n] 0 = A A m n = A (n-l)), (a:x) = a * product f x = {a I a ÷ x; f a} T h e u s e of t h e c o n s t r u c t is p a r t i c u l a r l y c o n v e n i e n t w h e n t h e r e is m o r e t h a n o n e g e n e r a t o r i n v o l v e d , for e x a m p l e a f u n c t i o n for g e n e r a t i n g t h e c a r t e s i a n p r o d u c t of two l i s t s (i.e. a l i s t of a l l p a i r s f o r m e d by d r a w i n g o n e f r o m each) c a n b e w r i t t e n m>0&n>0 N o t i c e t h a t t h e o r d e r in w h i c h the a b o v e e q u a t i o n s a r e w r i t t e n h a s no l o g i c a l s i g nificance (we i n s e r t g u a r d s w h e r e n e c e s s a r y to e n s u r e t h i s ) . As another example of the use of pattern matching, t h e l i b r a r y f u n c t i o n " p r o d u c t " c a n be defined [1 = 1 I a ÷ x} The library "filter", which filters a list t h r o u g h a g i v e n p r e d i c a t e , c o u l d be defined i, m > 0 (m-l) (A m product f x = {f a filter (m-l) product ::= b o o l e a n - e x p r e s s i o n map 0 n = n + 1 A m ::= v a r + l i s t - e x p r e s s i o n guard O n e e x a m p l e is the f o l l o w i n g d e f i n i t i o n of t h e (2nd order) l i b r a r y f u n c t i o n "map" which applies a function to every element of a l i s t A f u n c t i o n c a n be d e f i n e d b y m o r e t h a n one equation, with the different cases b e i n g d i s t i n g u i s h e d b y the u s e of p a t t e r n m a t c h i n g in t h e f o r m a l p a r a m e t e r s a n d / o r t h e u s e of g u a r d s ( b o o l e a n e x p r e s s i o n s , w r i t t e n o n t h e r i g h t of a n e q u a t i o n following a comma). Ackermann's function for e x a m p l e , c o u l d be d e f i n e d b y t h e following three equations A generator T h e v a r i a b l e o n t h e l e f t o f a g e n e r a t o r is a l o c a l v a r i a b l e o f t h e zf e x p r e s s i o n a n d r a n g e s o v e r t h e m e m b e r s of t h e l i s t o n the r i g h t o f t h e "÷" sign. sq n = n * n fac n = p r o d u c t ::= { e x p l q u a l i f i e r ; -.. ...; q u a l i f i e r } ::= g e n e r a t o r l g u a r d cp x y = {[a,b]la÷x;b÷y} T h e a d d i t i o n of set a b s t r a c t i o n r e p r e s e n t s a v e r y c o n s i d e r a b l e i n c r e a s e in t h e expressive power of an applicative language. C o n s i d e r for e x a m p l e t h e p r o b l e m of defining a function which will return a l i s t o f all p o s s i b l e p a r t i t i o n s o f a number into positive integers. For the s a k e o f d e f i n i t e n e s s , l e t us s a y w i t h p e r m u t a t i o n s , so [2.11 a n d [1,21 a r e t w o d i f f e r e n t p a r t i t i o n s o f 3. Such a funct i o n c a n b e e x p r e s s e d r a t h e r s u c c i n c t l y as follows x N o t e t h a t t h e u s e of p a t t e r n s i n v o l v i n g ":" o n t h e l e f t of a n e q u a t i o n a v o i d s t h e n e e d for e x p l i c i t u s e of t h e s e l e c t o r s "hd" a n d "tl" o n lists. F u n c t i o n s c a n be n o n - s t r i c t if the e q u a t i o n s i m p l y t h a t t h e y s h o u l d be. For e x a m p l e if w e d e f i n e f by f x = 3 partitions 0 = [[]] partitions n = {i:p I i÷ [l..n]; p+ partitions(n-i)} then of course f must always return the r e s u l t 3, e v e n if its a r g u m e n t is r e presented by a non-terminating computation. [We d o n o t d i s c u s s i m p l e m e n t a t i o n s at a l l here, b u t as t h e reader will have already deduced from the p r e s e n c e of i n f i n i t e lists, it m u s t i n v o l v e s o m e f o r m of l a z y evaluation.] N o t i c e b y t h e w a y t h a t t h e v a r i a b l e s introduced by generators come into scope f r o m l e f t to r i g h t , so l a t e r g e n e r a t o r s c a n i n v o l v e e a r l i e r o n e s (but n o t v i c e versa). Following this extremely brief overview of t h e n o t a t i o n of KRC, w e n o w t u r n o u r a t t e n t i o n t o a s o m e w h a t l a r g e r p r o b l e m , as promised. Set A b s t r a c t i o n We use a language construct closely based on Zermelo-Frankel set a b s t r a c t i o n (except t h a t t h e r e s u l t is h e r e a l i s t r a t h e r t h a n a set). A n (informal) syntax for these expressions may be given as f o l l o w s 86 The statement p o s s i b l e , b u t in t h i s c a s e it is c l e a r t h a t a c e n t r a l p r o b l e m is g o i n g to be def i n i n g an e q u i v a l e n c e r e l a t i o n o n m o l e c u l e s , a n d I f o u n d it d i f f i c u l t to b r i n g t h i s into f o c u s w i t h o u t f i x i n g on an (at l e a s t p r o v i s i o n a l ) r e p r e s e n t a t i o n for molecules. of the problem T h e p r o b l e m w h i c h I h a v e c h o s e n to u s e as a n i l l u s t r a t i o n h e r e a r i s e s f r o m the f i e l d of organic chemistry. I d a r e say t h a t f r o m the p o i n t of v i e w o f a c h e m i s t w h o h a s m a d e a s t u d y of t h e s e m a t t e r s it w i l l seem a rather easy problem and perhaps somebody, somewhere has a FORTRAN program sitting on their desk which already solves it. N e v e r t h e l e s s f r o m t h e p o i n t of v i e w of a p r o g r a m m e r w h o h a d n o t t r i e d it b e f o r e , the p r o b l e m s e e m e d d i f f i c u l t e n o u g h to be i n t e r e s t i n g . At least s e v e r a l c o m p e t e n t p r o g r a m m e r s I k n o w rep o r t e d to m e t h a t t h e y h a d f o u n d it so. It t h e r e f o r e s e e m e d a g o o d t e s t i n g g r o u n d for the l a n g u a g e f r a m e w o r k o u t l i n e d in the o p e n i n g s e c t i o n of t h i s p a p e r . In w h a t f o l l o w s I h a v e c o n c e n t r a t e d on r e p r o d u c i n g fairly honestly the thought processes w h i c h f i r s t led m e to a r u n n a b l e s o l u t i o n , r a t h e r t h a n o n p r e s e n t i n g the m o s t e l e g a n t or e f f i c i e n t s o l u t i o n p o s s i b l e . Ideally we would like to define a canonical o r i e n t a t i o n for e a c h m o l e c u l e so t h a t e a c h d i s t i n c t i s o m e r is r e p r e s e n t e d by o n e a n d only one data structure. T h e r e d o e s not s e e m to be a n y s t r a i g h t f o r w a r d w a y o f defining such a canonical orientation, however, (at l e a s t I c o u l d n o t see one) so w e s h a l l go for t h e a l t e r n a t i v e p l a n of g i v ing a n o n - u n i q u e r e p r e s e n t a t i o n t o g e t h e r w i t h a set of l a w s for d e t e r m i n i n g w h i c h r e p r e s e n t a t i o n s are e q u i v a l e n t (in t h e sense that they are different orientations of t h e s a m e m o l e c u l e ) . O n e o b v i o u s w a y to r e p r e s e n t a p a r a f f i n m o l e c u l e in t e r m s of l i s t s t r u c t u r e s is as f o l l o w s . We pick on one carbon atom a r b i t r a r i l y a n d d e e m it to be t h e " l e a d i n g " one. W e t h e n r e p r e s e n t t h e m o l e c u l e as a 4 - 1 i s t , t h e c o m p o n e n t s of t h e l i s t b e i n g the sub-molecules ( " r a d i c a l s " in c h e m i c a l t e r m i n o l o g y ) a t t a c h e d to t h e 4 b o n d s of the l e a d i n g c a r b o n atom. Each radical is r e p r e s e n t e d e i t h e r b y t h e s t r i n g "H", if it is j u s t a h y d r o g e n ato~p, or e l s e by a 3 - 1 i s t c o r r e s p o n d i n g to a c a r b o n a t o m w i t h 3 f u r t h e r r a d i c a l s a t t a c h e d t o it. T h e p r o b l e m is to e n u m e r a t e , w i t h o u t r e p e t i t i o n s a n d in o r d e r of i n c r e a s i n g size, all p o s s i b l e p a r a f f i n m o l e c u l e s . For those who have forgotten their high school c h e m i s t r y , p a r a f f i n s a r e b u i l t up u s i n g o n l y c a r b o n , w h i c h h a s a v a l e n c y of 4 a n d h y d r o g e n , w h i c h h a s a v a l e n c y of i. The first few paraffin molecules, together w i t h t h e i r n a m e s a r e s h o w n in f i g u r e I. In a p a r a f f i n o n e is a l l o w e d n e i t h e r d o u b l e b o n d s n o r c y c l e s , so all p a r a f f i n s with n carbon atoms share the empirical formula In t e r m s of t h e s e c o n v e n t i o n s , then, methane would be represented thus ["H", Cn H2n+2 "H", "H"l a n d o n e of s e v e r a l p o s s i b l e t i o n s for p r o p a n e is b u t for all n ~ there are several distinct molecules ("isomers") w i t h the s a m e formula. T h e n u m b e r of i s o m e r s r i s e s r a t h e r r a p i d l y w i t h n. In c o u n t i n g i s o m e r s it s h o u l d be b o r n e in m i n d t h a t t h e four b o n d p o s i t i o n s o n a g i v e n c a r b o n a t o m a r e in f a c t i n d i s t i n g u i s h a b l e (i.e. the four bond positions can be freely i n t e r c h a n g e d ). So w h a t s e e m at f i r s t to be d i f f e r e n t m o l e c u l e s m a y in fact t u r n out to be d i f f e r e n t o r i e n t a t i o n s of the same m o l e c u l e . N o t i c e in p a r t i c u l a r t h a t the p h e n o m e n o n of " s t e r e o i s o m e r i s m " (a m o l e c u l e b e i n g d i f f e r e D t f r o m its m i r r o r image) is n o t p o s s i b l e with paraffins. representa- [ [ "H", "H", "H" l, [ "H", "H", "H" ], "H" p "H" 3 W e c o u l d o b v i o u s l y s a v e a g r e a t d e a l of space by not representing hydrogen atoms explicitly, but following a principle of s e p a r a t i o n of c o n c e r n s let us l e a v e t h a t as an o p t i m i s a t i o n to be c a r r i e d o u t later. D e f i n i n 9 an E q u i v a l e n c e Molecules Relation on T h e a b o v e r e p r e s e n t a t i o n for a m o l e c u l e has t w o e l e m e n t s o f a r b i t r a r i n e s s - f i r s t that any carbon atom could have been p i c k e d o n as t h e l e a d i n g o n e - s e c o n d t h a t for e a c h c a r b o n a t o m t h e o r d e r in w h i c h its d e p e n d e n t r a d i c a l s a r e l i s t e d is a r b i t r a r y and c o u l d be f r e e l y p e r m u t e d . T h e p r o b l e m as p o s e d is n o t m e r e l y to c o u n t t h e n u m b e r of i s o m e r s for e a c h n, b u t a c t u a l l y to p r o d u c e a r e p r e s e n t a t i o n (just one) of e a c h m o l e c u l e . Choosing a Representation Type "Paraffin Molecule" "H", for t h e D a t a W e c a p t u r e t h i s b e l o w in t e r m s of t h r e e transformations which we can carry out on the representation of a molecule " i n v e r t " , " r o t a t e " a n d "swap". Each t r a n s f o r m a t i o n is e x p r e s s e d h e r e by a f u n c t i o n w h i c h c a n be a p p l i e d to a r e p r e s e n t a t i o n o f a m o l e c u l e to r e t u r n a ( p e r h a p s L e t us b e g i n by c h o o s i n g a r e p r e s e n t a t i o n for t h e m o l e c u l e s . N o r m a l l y in t o p - d o w n prograr~ning o n e is a d v i s e d to d e l a y d e c i s i o n s a b o u t r e p r e s e n t a t i o n as l o n g as 87 H - H H H H H I I I I i C - H H - C H H - H H - H I C - H H H I i I C I - I H C - I H C l H H - I - I C - H - H i H H H H t - C propane H H - H [ C - C i - I I H C I ethane methane - C I I H - H I H H C H H - C - H normal I butane H iso-butane H - H H H H i I I I C - C - C - C H I - C I I i I I H H H H H normal - H H I pentane H - C - H "H H ] H I H - C - H H H I I I C I I H H - C - C H - C I - - C C I - I H H I H H - C - i H H H-C-H I H neo-pentane iso-pentane Figure 1 "Some Paraffin Molecules" 88 H - H modified) cule. representation invert of the [[a, b, c~ d, e, f3 = [a, b, c [d, e, quotient quotient x i = "H" rotate [a, b, c, d3 = [b, c, d, a3 b, c, d3 = [b, a b = member equivclass a, c, d3 closure (equivclass a = closure under under para a)b [rotate,invert,swap3[a] laws f s = s ++ c l o s u r e ' f s s 0 = ["H"] output = layn everything we need s o l u t i o n to o u r t o d e f i n e an o u t of " p a r a f f i n " , (append(map paraffin El..J)) a n d p r i n t i n g " o u t p u t " w i l l g i v e us the r e q u i r e d l i s t o f p a r a f f i n m o l e c u l e s in o r d e r o f i n c r e a s i n g size. T h e l i s t is i n f i n i t e a n d so the p r o c e s s of p r i n t i n g it w i l l go o n f o r e v e r , or at l e a s t u n t i l t h e u s e r i n t e r r u p t s it at t h e terminal. (Note - " a p p e n d " is a l i b r a r y f u n c t i o n w h i c h t a k e s a l i s t of l i s t s a n d j o i n s t h e n all t o g e t h e r w i t h "++"; "layn" is a s t a n d a r d l a y o u t f u n c t i o n w h i c h c a u s e s t h e e l e m e n t s of a l i s t to be p r i n t e d o n e p e r l i n e w i t h n u m b e r e d lines.) s a}) T h e k e y idea is e m b o d i e d in t h e f u n c t i o n " c l o s u r e u n d e r l a w s " w h i c h t a k e s a set of f u n c t Y o n s a ~ d a set of o b j e c t s a n d finds the closure of the latter under r e p e a t e d a p p l i c a t i o n s of the m e m b e r s of t h e f o r m e r . C l e a r l y t h i s is a f u n c t i o n w h i c h c o u l d f i n d application.s in a w i d e r a n g e of p r o b l e m s b e y o n d t h e p r e s e n t one. T h e a b o v e s o m e w h a t ind i r e c t d e f i n i t i o n , v i a the a u x i l i a r y function closure' and closure", was c h o s e n f o r r e a s o n s of e f f i c i e n c y . Molecules b} f [] = [] At this point we have to p r o d u c e a r u n n a b l e problem. We have only p u t s t r u c t u r e in t e r m s thus In t h e a b o v e " m e m b e r " , "map" a n d " m k s e t " are library functions - "mkset" removes r e p e t i t i o n s f r o m a list, and "++" is t h e a p p e n d o p e r a t o r o n lists. All a n = {[a, b, c] I i , j , k ÷ [ 0 . . n - 1 3 ; i~j~k; i + j + k = n - l ; a ÷ p a r a i; b + p a r a j; c ÷ p a r a k} c l o s u r e " f s t = [3, t = [3 = t ++ c l o s u r e ' f (s++t) t, t ~ [1 Generating Size. f x; f para laws closure' f s t = closure" f s ( m k s e t { a l f ' ÷ f ; a ÷ m a p f' t; ~ member (n-l)} = a:{blb÷quotient W h e r e '~para", to be d e f i n e d b e l o w is a f u n c t i o n w h i c h r e t u r n s a l i s t (perhaps w i t h r e p e t i t i o n s ) of all p a r a f f i n r a d i c a l s of a g i v e n size. The function "quotient" defined above takes the q u o t i e n t o f a set w i t h r e s p e c t to a g i v e n e q u i v a l e n c e r e l a t i o n (i.e. r e t u r n s a set c o n t a i n i n g o n l y o n e r e p r e s e n t a t i v e of e a c h e q u i v a l e n c e c l a s s p r e s e n t in the o r i g i n a l set) a n d is u s e d a b o v e t o e n s u r e t h a t e a c h m o l e c u l e is r e p r e s e n t e d o n l y o n c e in t h e f i n a l o u t p u t . There follows a d e f i n i t i o n o f "para" It is c l e a r , m o r e o v e r , t h a t all t h e r e p r e s e n t a t i o n s of a g i v e n m o l e c u l e c a n be o b tained from any one representation by rep e a t e d a p p l i c a t i o n s of the a b o v e t r a n s formations. We can freely permute the b o n d s of t h e l e a d i n g c a r b o n a t o m by comb i n e d a p p l i c a t i o n s o f " r o t a t e " a n d "swap" a n d by c o m b i n i n g t h e s e w i t h a p p l i c a t i o n s of " i n v e r t " a n y c a r b o n a t o m c a n e v e n t u a l l y be b r o u g h t into t h e l e a d i n g p o s i t i o n . We c a n n o w d e f i n e a p r e d i c a t e " e q u i v " on r e p r e s e n t a t i o n s of p a r a f f i n m o l e c u l e s s u c h t h a t " e q u i v a b" d e t e r m i n e s w h e t h e r a a n d b r e p r e s e n t t h e s a m e m o l e c u l e . Thus: equiv (a:x) ~ x = x, [a, f equiv 1 x ÷ para f]] invert swap paraffin n = quotient {Ix, "H", "H", "H"l same m o l e - ~his solution, however, runs with appalling s l o w n e s s (I t r i e d it) m a i n l y b e c a u s e of e a s i l y r e m o v a b l e i n e f f i c i e n c i e s in o u r d e f i n i t i o n of "para". T h e r e is a m i n o r problem and a major problem. T h e m i n o r p r o b l e m is t h a t t h e w a y w e c h o o s e i, j a n d k in t h e d e f i n i t i o n of p a r a n is n e e d l e s s l y w a s t e f u l . We could in fact f i r s t c h o o s e i in t h e set [0.. (n-l)/3] t h e n c h o o s e j in t h e set [i. (n-l-i)/23, w h e r e u p o n k is f i x e d to be n-l-i-j. T h i s l e a d s us to r e w r i t e t h e s e c o n d l i n e in t h e d e f i n i t i o n of "para" as of a G i v e n B e c a u s e of t h e a b s e n c e o f c y c l e s e v e r y p a r a f f i n m o l e c u l e m u s t c o n t a i n at l e a s t o n e o c c u r r e n c e of t h e m e t h a n e r a d i c a l C H 3 a n d w e c a n w i t h o u t l o s s of g e n e r a l i t y c h o o s e t h i s to t h e " l e a d i n g " c a r b o n atom. A function for generating a list containing (once each) a l l t h e paraffin molecules with n carbons can therefore be written p a r a n = {[a, b, c3 I i ÷ [0.. (n-i)/33; j ÷ [i.. ( n - l - i ) / 2 ] ; a ÷ para i ; b ÷ p a r a j ; c ÷ p a r a (n-l-i-j) } T h e m a j o r p r o b l e m is t h a t in e v a l u a t i n g " p a r a n" w e r e p e a t e d l y r e - e v a l u a t e p a r a i for e a c h i < n a l a r g e n u m b e r of times. W e n e e d to m a k e p a r a into a " m e m o f u n c t i o n " [ M i c h i e 681 i.e. a f u n c t i o n w h o s e v a l u e is c a l c u l a t e d o n l y o n c e for 89 e a c h a r g u m e n t , n a m e l y o n its f i r s t c a l l a n d t h e r e u p o n s t o r e d in a table, so t h a t if it is r e q u i r e d a g a i n it c a n be f o u n d by t a b l e lookup, r a t h e r t h a n b y r e c a l c u lation. F o r a r e c u r s i v e f u n c t i o n , like "para", m e m o - i s a t i o n l e a d s to an e x p o n e n t i a l i m p r o v e m e n t in p e r f o r m a n c e . (or to p u t it a n o t h e r way, f a i l u r e to m e m o - i s e l e a d s an e x p o n e n t i a l d e t e r i o r a t i o n in performancel) T h e f i r s t g e n e r a l l e s s o n I w o u l d d r a w is t h a t b y t h e u s e of a n a p p r o p r i a t e l y d e signed applicative language the effort n e c e s s a r y to a r r i v e at (and the s p a c e n e c e s s a r y to e x p r e s s ) a n e x e c u t a b l e s o l u tion to a problem can be r e d u c e d to a s m a l l f r a c t i o n of t h a t r e q u i r e d in a t r a ditional programming language. E v e n in the present situation, where we lack the h a r d w a r e n e c e s s a r y for t h e d i r e c t s u p p o r t of a p p l i c a t i v e l a n g u a g e s , a n i m p l e m e n t a t i o n of a n a p p l i c a t i v e l a n g u a g e c a n b e an e x t r e m e l y v a l u a b l e t o o l for t h e d e v e l o p m e n t a n d t e s t i n g of a l g o r i t h m s . For e x a m p l e I h a d a n u m b e r of m i s c o n c e p t i o n s a b o u t the p a r a f f i n s p r o b l e m (which I e l i d e d f r o m t h e a b o v e a c c o u n t ) of w h i c h I w a s f a i r l y q u i c k l y d i s a b u s e d by i n t e r a c t ing with the KRC system. If I ~ n o w h a d to s o l v e t h e p r o b l e m in, say, P A S C A L , I w o u l d do so w i t h m u c h g r e a t e r c o n f i d e n c e . A t f i r s t s i g h t it s e e m s t h a t a m e m o f u n c t i o n i n v o l v e s in an e s s e n t i a l w a y t h e u s e o f s i d e - e f f e c t s for its e x p r e s s i o n . T h i s is, h o w e v e r , n o t t h e case. T h e r e is in S A S L a s t a n d a r d t r a n s f o r m a t i o n for t u r n i n g a f u n c t i o n into a m e m o f u n c t i o n in a p u r e l y a p p l i c a t i v e w a y - see [ T u r n e r 81a, C h a p t e r 43. A p p l y i n g the i d e a to "para" l e a d s us to r e w r i t e its d e f i n i t i o n as f o l l o w s : para 0 = ["H"3 para n = paralist paralist T h e s e c o n d g e n e r a l a b s e r v a t i o n is t h a t t h e language framework we are using here supports very nicely the following separat i o n of c o n c e r n s (which h a s of c o u r s e b e e n advocated many times before). In a f i r s t step we concentrate on writing down a l o g i c a l l y c o r r e c t d e f i n i t i o n of the d e sired function, completely ignoring cons i d e r a t i o n s of e f f i c i e n c y . Recursion p l u s set a b s t r a c t i o n is a v e r y p o w e r f u l c o m b i n a t i o n for t h i s p u r p o s e , e n a b l i n g us to t h i n k v e r y "big t h o u g h t s " in o n e go. Typically, however, the definitions we a r r i v e at in t h i s w a y h a v e a n e x p o n e n t i a l or combinatorial run-time, whereas there m a y e x i s t an a l g o r i t h m w h i c h is l i n e a r (or at l e a s t p o l y n o m i a l ) . In a s e c o n d s t e p w e r e p a i r t h e e f f i c i e n c y of t h e d e f i n i t i o n , by a p p l y i n g t r a n s f o r m a t i o n s know to preserve correctness. In a surp r i s i n g l a r g e n u m b e r of c a s e s it t u r n s o u t t h a t a s m a l l n u m b e r of s t a n d a r d o p t i misations are sufficient to bring about t h e n e c e s s a r y i m p r o v e m e n t in p e r f o r m a n c e . T w o in p a r t i c u l a r s e e m to be of s u c h g e n e r a l a p p l i c a b i l i t y as to d e s e r v e s p e c i a l m e n t i o n in a n e x t (and final) s e c t i o n of t h i s p a p e r . n = map genpara If..] g e n p a r a n = {[a, b, c] I i + [0.. (n-i)/33; j ÷ [i.. ( n - l - i ) / 2 ] ; a ÷ p a r a i ; b + p a r a j ; c ÷ p a r a (n-l-i-j)} In t h e a b o v e " g e n p a r a " p e r f o r m s the c a l c u l a t i o n , b u t t h e r e c u r s i o n is r e p l a c e d by table lookup. T h e l o o k u p t a b l e is r e p r e sented by the (infinite)list "paralist" (in S A S L a n d K R C l i s t s c a n b e i n d e x e d b y a p p l y i n g t h e m to a n i n t e g e r ) . The elem e n t s o f " p a r a l i s t " a r e i n i t i a l i s e d by c a l l i n g " g e n p a r a " b u t t h a n k s to l a z y evaluation they only come into existence as t h e y a r e a c c e s s e d . W e n o w h a v e a r u n n a b l e p r o g r a m for o u r paraffin problem. The complete text o f t h e p r o g r a m is s h o w n in f i g u r e 2, a c c o m p a n i e d b y a n i n i t i a l s e g m e n t of its o u t p u t . T h e c o m m a n d " o u t p u t S " is a n i n s t r u c t i o n to t h e KRC s y s t e m to p r i n t t h e list, o u t p u t . Lessons drawn T h e t h i r d a n d f i n a l o b s e r v a t i o n I w i s h to m a k e r e l a t e s m o r e s p e c i f i c a l l y to t h e paraffin problem. I believe that the reason why this seems on first inspection to b e r a t h e r a h a r d p r o b l e m is b e c a u s e it involves an unfree data type and I suspect t h i s is c h a r a c t e r i s t i c of a lot of the more recalcitrant problems one meets. A g e n e r a l w a y of c h a r a c t e r i s i n g a n u n f r e e d a t a t y p e , w h i c h w e u s e d in t h i s e x a m p l e , is as t h e q u o t i e n t o f a f r e e d a t a t y p e under an equivalence relation and a good way of defining an equivalence relation is t o g i v e a set o f l a w s of w h i c h it is the closure. Our function "closure u n d e r l a w s " g i v e s us a c o n v e n i e n t h ~ n d l e o n t o ~ h i s a n d I h o p e it w i l l t u r n o u t to b e u s e f u l for o t h e r a p p l i c a t i o n s in t h e future. The above program was the fruit of about a n h o u r ' s l a b o u r at a t e r m i n a l a n d s e e m s a r e a s o n a b l y c o n v i n c i n g d e m o n s t r a t i o n of t h e u t i l i t y of r e c u r s i o n e q u a t i o n s p l u s set a b s t r a c t i o n as a l a n g u a g e f r a m e w o r k . T h e p r o g r a m is far f r o m f u l l y p o l i s h e d a n d h a s v e r y m u c h t h e s t a t u s of a f i r s t cut. B y t h e a p p l i c a t i o n of t w o o b v i o u s o p t i m i s a t i o n s - (a) t h e r e m o v a l o f t h e r e d u n d a n t "H"s f r o m t h e i n t e r n a l r e p r e s e n t a t i o n of m o l e c u l e s a n d (b) t h e u s e of a n i d e a c a l l e d " f i l t e r p r o m o t i o n " (see later), I was subsequently led to a program which ran perhaps ten times faster than the above. Rather than pursuing t h e s e f u r t h e r r e f i n e m e n t s in d e t a i l here, however, this seems an appropriate moment to b r e a k o f f f r o m t h e c o n s i d e r a t i o n of t h i s particular problem and draw some general lessons. 90 output = layn (append (map paraffin [I..])) paraffin n = quotient equiv {[x,"H","H","H"]Ix<-para (n- I)} para 0 : ["H"] para n = paralist n paralist = map genpara [I..] genpara n = {[a,b,c][i<-[0..(n-])/3];j<-[i..(n-]-i)/2]; a<-para i;b<-para j;c<-para ( n - ] - i - j)} equiv a b = member (equivclass a) b equivclass x = closure under laws [invert,rotate,swap] invert [[a,b,c],d,e,f] Ix] = [a,b,c,[d,e,f]] invert x = x, × ] = "H" rotate [a,b,c,d] = [b,c,d,a] swap [a,b,c,d] : [b,a,c,d] closure' closure" f s t = closure" f s (mkset {a',f'<-f;a<-map f' t;\member s a}) f s t = [], t = [] = t ++ closure' f (s ++ t) t closure under laws f s : s ++ closure' quotient f (a:x) = a:{b[b<-quotient f s s f x;\f a b} quotient f [] = [] output! I) [ "H" ," H" ," H" ," H" ] 2) [ [ "H" ,"H" ,"H" ] ," H" ," H" ," H" ] 3) [ [ "H" ," H" ,[ "H" ," H" ,"H" ] ] ,"H" ," H" ," H" ] 4) [ [ "H"," H", [ "H"," H", [ "H" ," H" ," H" ] ] ], "H" ," H"," H" ] 5) [ [ "H" ,[ "H" ," H" ,"H"], [ "H" ," H" ," H" ] ] ,"H" ,"H" ,"H" ] 6) [ [ "H" ," H" ,[ "H" ,"H" ,[ "H" ," H" ,[ "H" ,"H" ," H" ] ] ] ] ,"H" ,"H" ,"H"] 7) [ [ "H" ," H", [ "H", [ "H" ," H" ,"'H" ], [ "H" ," H" ," H" ] ] ] ," H" ," H"," H" ] 8) [ [ [ "H" ," H" ," H" ], [ "H" ," H" ," H" ], [ "H" ," H" ," H" ] ], "H" ," H" ," H" ] 9) [ [ "H" ," H" ,[ "H" ," H", [ "H" ," H", [ "H" ," H", [ "H" ," H" ," H" ] ] ] l ] ," H" ," H"," H" ] I0) [ [ "H" ," H", [ "H" ," H", [ "H", [ "H" ," H" ," H" ], [ "H"," H" ," H" ] ] ] ]," H" ," H" ," H" ] 11) [ [ "H" ," H", [ "H" ,[ "H" ," H" ,"H" ], [ "H" ," H", [ "H" ," H" ," H" ] ] ] ], "H" ," H" ,"H" ] 12) [ [ "H" ," H" , [ [ "H" ," H" ,"H" ], [ "H" ," H" ," H" ], [ "H" ," H" ,"H" ] ] ], "H" ,"H" ," H" ] 13) [ [ "H" ,[ "H" ," H" ," H" ], [ "H", [ "H" ," H" ," H" ], [ "H" ,"H" ," H" ] ] ] , " H " ,"H" ,"H" ] • . • • FIGURE 2 IT.e Paraffins program in K R C , with some initial output. 91 Two useful optimisations eliminate permutations, say by deciding to a l l o w o n l y i n c r e a s i n ~ p a r t i t i o n s , e.g. a m o n g t h e p a r t i t i o n s of 3 w e a l l o w [ i , i , i ] a n d [ l , 2 ] b u t n o t [2,1]. Our f i r s t t h o u g h t c o u l d b e to a p p l y a f i l t e r to our original function T h e t w o o p t i m i s a t i o n s w h i c h w a r r a n t spec i a l m e n t i o n h e r e are m e m o - i s a t i o n ( o r i g i n a l l y d u e to D o n a l d M i c h i e ) a n d "filter promotion" (both t h e n a m e a n d t h e idea o f w h i c h a r e d u e to J o h n D a r l i n g t o n ) [ D a r l i n g t o n 79]. partitions' (a) M e m o - i s a t i o n (n-i) + fib (n-2), partitions' partitions' n>2 A l t h o u g h t h i s h a s s o m e c l a i m to be c o n s i d e r e d the m o s t n a t u r a l d e f i n i t i o n it suffers from a run-time that increases e x p o n e n t i a l l y w i t h n. W e c o u l d of c o u r s e program the well-known linear algorithm explicitly (by t a i l r e c u r s i o n ) b u t it is in f a c t p o s s i b l e to a c h i e v e a l i n e a r r u n t i m e w i t h o u t a b a n d o n i n g t h e s t r u c t u r e of the above definition. = m a p f i b [i..3 i i f i b l i s t (n-l) + fiblist(n-2), T h e r u n - t i m e of "fib" s t e a d of e x p o n e n t i a l ~ is n o w linear Filter 0 = [[]] n = {i : p I i ÷ F l . . n ] ; p÷partitions' (n-i) ; p = [] v i ~ h d p} Note T h e u s e of Z e r m e l o - F r a n k e l set a b s t r a c t i o n as a n i m p l e m e n t a b l e l a n g u a g e f e a t u r e s e e m s o r i g i n a l l y to h a v e b e e n p r o p o s e d b y J o h n D a r l i n g t o n [ D a r l i n g t o n 75]. REFERENCES D a r l i n g t o n 75 " A p p l i c a t i o n s of p r o g r a m t r a n s f o r m a t i o n to p r o g r a m s y n t h e s i s " Proceedings conference on proving and i m p r o v i n g p r o g r a m s , A r c et S e n a n s 1 9 7 5 D a r l i n g t o n 79 "A s y n t h e s i s sorting algorithms" A c t a I n f o r m a t i c a 1979 n>2 in- of several M i c h i e 68 "Memo-functions - a language feature with role learning properties" in E X P E R I M E N T A L P R O G R A M M I N G 1 9 6 6 - 7 , Edinburgh University, Dept.of Machine Intelligence and Perception, January 1968 O b v i o u s l y t h e t e c h n i q u e c a n be a p p l i e d to a n y f u n c t i o n o f i n t e g e r a r g u m e n t s . Notice that this purely applicative a p p r o a c h to m e m o f u n c t i o n s d e p e n d s heavily on the fact that the language in w h i c h w e a r e w o r k i n g h a s a n o n strict semantics. (Incidentally, a m o r e r a d i c a l a p p r o a c h , w h i c h m i g h t be w o r t h p u r s u i n g , w o u l d be to t r y a n d g e t t h e s y s t e m to p e r f o r m t h i s c l a s s of o p timisations automatically. One quite promising approach, with which I have been experimenting, is to m o d i f y t h e r u n - t i m e s y s t e m by k e e p i n g a n a s s o c i a t i v e c a c h e of t h e r e s u l t s o f a l l r e c e n t function applications.) (b) n) Like memo-isation, filter promotion can l e a d t o v e r y d r a m a t i c i m p r o v e m e n t s in performance. W e do t h i s b y t u r n i n g "fib" i n t o a m e m o function. We introduce a data structure " f i b l i s t " in w h i c h w e s t o r e t h e v a l u e s of t h e f u n c t i o n a n d r e p l a c e a l l c a l l s to t h e f u n c t i o n in t h e r e s t of t h e p r o g r a m , including the recursive calls inside the d e f i n i t i o n of "fib" i t s e l f , b y t a b l e lookup. Thus fiblist fib i = fib 2 = fib n = increasing (partitions where we can easily give a recursive d e f i n i t i o n of " i n c r e a s i n g " as a p r e d i c a t e o n lists. W e c a n g e t a c o n s i d e r a b l e :imp r o v e m e n t in p e r f o r m a n c e , h o w e v e r , b y pushing the filter inside the generator " p a r t i t i o n s " so t h a t t h e u n w a n t e d l i s t s a r e n o t c r e a t e d in t h e f i r s t p l a c e (the reader should compare this with the defin i t i o n s of p a r t i t i o n s e a r l i e r in t h e paper): This optimisation technique has already been demonstrated earlier, on the funct i o n "para". A similar example, however, m a y b r i n g o u t t h e m e t h o d m o r e clearly. Consider the following"obvious" d e f i n i t i o n o f t h e f u n c t i o n "fib n" w h i c h returns the n'th fibonacci number. fib i = i fib 2 = i fib n = fib n = filter T u r n e r 76 " S A S L L a n g u a g e M a n u a l " , St. Andrews University Technical Report, December 1976 T u r n e r 81 "KRC L a n g u a g e M a n u a l " , U n i v e r s i t y of K e n t (in p r e p a r a t i o n ) T u r n e r 81a D.Phil.Thesis, University 1981 Promotion Again we can best bring out the technique by means of a simple example. Suppose we a r e a s k e d to m o d i f y o u r e a r l i e r d e f i n i t i o n of t h e f u n c t i o n " p a r t i t i o n s " so as to 92 Oxford