.EQ
delim $$
.EN
.ls 1
.ce
PROGRAMMING BY EXAMPLE REVISITED
.sp
.ce
by John G. Cleary
.ce
Man-Machine Systems Laboratory
.ce
University of Calgary.
.sp
.sh "Introduction"
.pp
Efforts to construct an artificial intelligence have relied on ever more complex and carefully prepared programs.
While useful in themselves, these programs are unlikely to be useful in situations where ephemeral and low-value knowledge must be acquired.
For example, a person (or robot) working in a normal domestic environment knows a lot about which cupboards have sticky doors and where the marmalade is kept.
It seems unlikely that it will ever be economic to program such knowledge, whether this be via a language or a discourse with an expert system.
.pp
It is my thesis, then, that any flexible robot system working in the real world must contain a component of control intermediate between hard-wired 'reflex' responses and complex intellectual reasoning.
Such an intermediate system must be adaptive, be able to carry out complex patterned responses, and be fast in operation.
It need not, however, carry out complex forward planning or be capable of introspection (in the sense that expert systems are able to explain their actions).
.pp
In this talk I will examine a system that acquires knowledge by constructing a model of its input behaviour and uses this model to select its actions.
It can be viewed either as an automatic adaptive system or as an instance of 'programming by example'.
Other workers have attempted to do this by constructing compact models in some appropriate programming language: e.g. finite state automata [Bierman, 1972], [Bierman and Feldman, 1972]; LISP [Bierman and Krishnaswamy, 1976]; finite non-deterministic automata [Gaines, 1976], [Gaines, 1977], [Witten, 1980]; high-level languages [Bauer, 1979], [Halbert, 1981].
These efforts, however, suffer from the flaw that for some inputs their computing time is super-exponential in the number of inputs seen.
This makes them totally impractical in any system which is continuously receiving inputs over a long period of time.
.pp
The system I will examine comprises one or more simple independent models.
Because of their simplicity, and because no attempt is made to construct models which are minimal, the time taken to store new information and to make predictions is constant and independent of the amount of information stored [Cleary, 1980].
This leads to a very integrated and responsive environment.
All actions by the programmer are immediately incorporated into the program model.
The actions are also acted upon, so that their consequences are immediately apparent.
However, the amount of memory used could grow linearly with time.
[Witten, 1977] introduces a modelling system related to the one described here which does not grow continually and which can be updated incrementally.
.pp
It remains to be shown that such very simple models are capable of generating any interestingly complex behaviour.
In the rest of this talk I will use the problem of executing a subroutine to illustrate the potential of such systems.
The example will also illustrate some of the techniques which have been developed for combining multiple models [Cleary, 1980], [Andreae and Cleary, 1976], [Andreae, 1977], [Witten, 1981].
It has also been shown in [Cleary, 1980] and [Andreae, 1977] that such systems can simulate any Turing machine when supplied with a suitable external memory.
.sh "The modelling system"
.pp
Fig. 1 shows the general layout of the modeller.
Following the flow of information through the system, it first receives a number of inputs from the external world.
These are then used to update the current contexts of a number of Markov models.
Note that each Markov model may use different inputs to form its current context, and that the models may be attempting to predict different inputs.
A simple robot which can hear and move an arm might have two models: one, say, in which the last three sounds it heard are used to predict the next word to be spoken, and another in which the last three sounds and the last three arm movements are used to predict the next arm movement.
.pp
When the inputs are received, each such context and its associated prediction (usually an action) are added to the Markov model.
(No counts or statistics are maintained \(em they are not necessary.)
When the context recurs later it will be retrieved along with all the predictions which have been stored with it.
.pp
After the contexts have been stored they are updated by shifting in the new inputs.
These new contexts are then matched against the model and all the associated predictions are retrieved.
These independent predictions from the individual Markov models are then combined into a single composite prediction.
(A general theory of how to do this has been developed in [Cleary, 1980].)
.pp
The final step is to present this composite prediction to a device I have called the 'choice oracle'.
This uses whatever information it sees fit to choose the next action.
There are many possibilities for such a device.
One might be to choose from amongst the predicted actions if reward is expected and to choose some other random action if reward is not expected.
The whole system then looks like a reward-seeking homeostat.
At the other extreme the oracle might be a human programmer who chooses the next action according to his own principles.
The system then functions more like a programming-by-example system \(em [Witten, 1981] and [Witten, 1982] give examples of such systems.
[Andreae, 1977] gives an example of a 'teachable' system lying between these two extremes.
.pp
After an action is chosen it is transmitted to the external world and the resultant inputs are used to start the whole cycle again.
Note that the chosen action will be an input on the next cycle.
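.pp
The cycle just described is simple enough to sketch directly.
The fragment below is a minimal illustration, in Python, of a single Markov model and one pass of the loop of Fig. 1; the names (MarkovModel, choice_oracle and so on) are invented for the sketch, the oracle shown is only one of the possibilities mentioned above, and the combination of several models is omitted.
.sp
.nf
# Minimal sketch (Python) of the store/predict cycle of Fig. 1.
# All names are illustrative; this is not the system's actual code.
from collections import defaultdict
import random

class MarkovModel:
    """Fixed-order Markov model: a context is the tuple of the last
    'order' relevant inputs, and each context maps to the set of
    inputs seen to follow it.  No counts or statistics are kept."""

    def __init__(self, order):
        self.order = order
        self.memory = defaultdict(set)   # context -> predictions stored with it
        self.context = ()                # the current context

    def store(self, observed):
        # Pair the current context with the input that followed it.
        self.memory[self.context].add(observed)

    def shift(self, relevant_input):
        # Update the context by shifting in the new input.
        self.context = (self.context + (relevant_input,))[-self.order:]

    def predict(self):
        # Retrieve every prediction stored against the current context.
        return set(self.memory.get(self.context, set()))

def choice_oracle(predictions, all_actions):
    # One possible oracle: take a predicted action if there is one,
    # otherwise explore with some other action chosen at random.
    return random.choice(sorted(predictions) or list(all_actions))

def cycle(model, new_input, all_actions):
    # One pass of the loop of Fig. 1 for a single model.  The action
    # returned is sent to the world and becomes an input next cycle.
    model.store(new_input)           # old context, paired with the new input
    model.shift(new_input)           # the new input now forms part of the context
    predictions = model.predict()    # predictions stored against the new context
    return choice_oracle(predictions, all_actions)
.fi
.sp
.pp
Because the memory is simply a table keyed on contexts, both storing and predicting take time independent of how much has already been stored, which is the property claimed for the system in the introduction; several such models would run side by side, with their predictions merged before being handed to the oracle.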
.sh "Subroutines"
.pp
An important part of any programming language is the ability to write a fragment of a program and then have it used many times without its having to be reprogrammed each time.
A crucial feature of such shared code is that after it has been executed the program should be controlled by the situation which held before the subroutine was called.
A subroutine can be visualised as a black box with an unknown and arbitrarily complex interior.
There are many paths into the box, but after passing through, each splits off again and goes its own way, independent of what happened inside the box.
.np
Also, if there are $p$ paths using the subroutine and $q$ different sequences within it, then the amount of programming needed should be proportional to $p + q$ and not $p * q$.
The example to follow possesses both these properties of a subroutine.
.rh "Modelling a Subroutine."
The actual model we will use is described in Fig. 2.
There are two Markov models (model-1 and model-2), each seeing and predicting different parts of the inputs.
The inputs are classified into four classes: ACTIONs that move a robot (LEFT, RIGHT, FAST, SLOW), patterns that it 'sees' (danger, moved, wall, stuck), and two types of special 'echo' actions, # actions and * actions (*home, #turn).
The # and * actions have no effect on the environment; their only purpose is to be inputs and to act as place keepers for relevant information.
They may be viewed as comments which remind the system of what it is doing.
(The term echo was used in [Andreae, 1977], where the idea was first introduced, in analogy to spoken words of which one hears an echo.)
.pp
Model-2 is a Markov model of order 2; it uses only # actions in its context and seeks to predict only * actions.
Model-1 is a Markov model of order 3 and uses all four classes of inputs in its context.
It seeks to predict ACTIONs, # actions and * actions.
However, * actions are treated specially.
Rather than attempting to predict the exact * action, it stores only * to indicate that some * action has occurred.
This special treatment is also reflected in the procedure for combining the predictions of the two models.
The prediction of model-2 is used only if model-1 predicts an *.
That is, model-1 predicts that some * action will occur and model-2 is used to select which one.
If model-1 does not predict an * then its prediction is used as the combined prediction and that from model-2 is ignored.
.pp
The choice oracle used for this example has two modes.
In programmer mode a human programmer is allowed to select any action she wishes, or to acquiesce in the current prediction, in which case one of the actions in the combined prediction is selected.
In execution mode one of the predicted actions is selected and the programmer is not involved at all.
.pp
Before embarking on the actual example, some points about the predictions extracted from the individual Markov models should be noted.
If no context can be found stored in the memory which equals the current context, then the context is shortened by one input and a search is made for any recorded contexts which are equal over the reduced length.
If necessary this is repeated until the length is zero, whereupon all possible allowed actions are predicted.
.pp
Fig. 3 shows the problem to be programmed.
If a robot sees danger it is to turn and flee quickly.
If it sees a wall it is to turn and return slowly.
The turning is to be done by a subroutine which, if it gets stuck when turning left, turns right instead.
.pp
Fig. 4 shows the contexts and predictions stored when this is programmed.
This is done by two passes through the problem in programmer mode: once to program the fleeing and the turning left, and once to program the wall sequence and the turning right.
Fig. 5 then shows how this programming is used in execution mode for one of the combinations which had not been explicitly programmed earlier (a wall sequence with a turn left).
The figure shows the contexts and associated predictions for each step.
(Note that predictions are made and new contexts are stored in both modes.
They have been omitted from the diagrams to preserve clarity.)
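.pp
The combination rule and the shortening of contexts can be sketched in the same style as before.
The fragment below builds on the earlier MarkovModel sketch and again uses invented names; in particular, keeping model-1's ordinary predictions alongside model-2's * actions, and shortening contexts by dropping the oldest input first, are one plausible reading rather than a specification of the actual system.
.sp
.nf
# Sketch (Python) of the two-model combination used in the example.
# Assumes the MarkovModel class sketched earlier; names are illustrative.

def predict_with_fallback(model, all_allowed):
    # If the current context has never been stored, shorten it one input
    # at a time (oldest input dropped first) and search again; at length
    # zero every allowed action is predicted.
    context = model.context
    while context:
        if context in model.memory:
            return set(model.memory[context])
        context = context[1:]
    return set(all_allowed)

def combined_prediction(model1, model2, all_allowed):
    # Model-1 predicts ACTIONs, # actions, or the generic marker '*'.
    # Only when '*' is among its predictions is model-2 consulted, and
    # model-2 then selects which particular * action (e.g. '*home') is meant.
    p1 = predict_with_fallback(model1, all_allowed)
    if '*' in p1:
        p2 = predict_with_fallback(model2, all_allowed)
        star_actions = {a for a in p2 if a.startswith('*')}
        return (p1 - {'*'}) | star_actions
    return p1                        # model-2 is ignored in this case
.fi
.sp
.pp
In programmer mode the result of combined_prediction would be offered to the programmer to accept or override; in execution mode one of its members would simply be selected.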
.sh "Conclusion"
.pp
The type of simple modelling system presented above is of interest for a number of reasons.
Seen as a programming-by-example system, it is very closely integrated.
Because it can update its models incrementally in real time, functions such as input/output, programming, compilation and execution are subsumed into a single mechanism.
Interactive languages such as LISP or BASIC gain much of their immediacy and usefulness by being interpretive and not requiring a separate compilation step when altering the source program.
By making execution integral with the process of program entry, (some of) the consequences of new programming become immediately apparent.
.pp
Seen as an adaptive controller, the system has the advantage of being fast and being able to encode any control strategy.
Times to update the model do not grow with memory size, and so it can operate continuously in real time.
.pp
Seen as a paradigm for understanding natural control systems, it has the advantage of a very simple underlying storage mechanism.
Also, the ability to supply an arbitrary choice oracle allows for a wide range of possible adaptive strategies.
.sh "References"
.in +4m
.sp
.ti -4m
ANDREAE, J.H. 1977 Thinking with the Teachable Machine. Academic Press.
.sp
.ti -4m
ANDREAE, J.H. and CLEARY, J.G. 1976 A New Mechanism for a Brain. Int. J. Man-Machine Studies 8(1):89-119.
.sp
.ti -4m
BAUER, M.A. 1979 Programming by examples. Artificial Intelligence 12:1-21.
.sp
.ti -4m
BIERMAN, A.W. 1972 On the Inference of Turing Machines from Sample Computations. Artificial Intelligence 3(3):181-198.
.sp
.ti -4m
BIERMAN, A.W. and FELDMAN, J.A. 1972 On the Synthesis of Finite-State Machines from Samples of their Behavior. IEEE Transactions on Computers C-21, June:592-597.
.sp
.ti -4m
BIERMAN, A.W. and KRISHNASWAMY, R. 1976 Constructing programs from example computations. IEEE Transactions on Software Engineering SE-2:141-153.
.sp
.ti -4m
CLEARY, J.G. 1980 An Associative and Impressible Computer. PhD thesis, University of Canterbury, Christchurch, New Zealand.
.sp
.ti -4m
GAINES, B.R. 1976 Behaviour/structure transformations under uncertainty. Int. J. Man-Machine Studies 8:337-365.
.sp
.ti -4m
GAINES, B.R. 1977 System identification, approximation and complexity. Int. J. General Systems 3:145-174.
.sp
.ti -4m
HALBERT, D.C. 1981 An example of programming by example. Xerox Corporation, Palo Alto, California.
.sp
.ti -4m
WITTEN, I.H. 1977 An adaptive optimal controller for discrete-time Markov environments. Information and Control 34, August:286-295.
.sp
.ti -4m
WITTEN, I.H. 1979 Approximate, non-deterministic modelling of behaviour sequences. Int. J. General Systems 5, January:1-12.
.sp
.ti -4m
WITTEN, I.H. 1980 Probabilistic behaviour/structure transformations using transitive Moore models. Int. J. General Systems 6(3):129-137.
.sp
.ti -4m
WITTEN, I.H. 1981 Programming by example for the casual user: a case study. Proc. Canadian Man-Computer Communication Conference, Waterloo, Ontario, 105-113.
.sp
.ti -4m
WITTEN, I.H. 1982 An interactive computer terminal interface which predicts user entries. Proc. IEE Conference on Man-Machine Interaction, Manchester, England.
.in -4m