Nick Thomas honors thesis

Is interactive computation a superset of Turing computation?

Nick Thomas

September 21, 2012

Abstract

Modern computers interact with the external environment in complex ways: for instance, they interact with human users via keyboards, mice, monitors, etc., and with other computers via networking. Existing models of computation (Turing machines, λ-calculus functions, etc.) cannot model these behaviors completely. Some additional conceptual apparatus is required in order to model processes of interactive computation.

1 Introduction

It is generally believed that everything that computers do can be done by Turing machines. We believe that, for every task that a computer can carry out, we can write a Turing machine that does the same thing.

Recently this belief has been challenged by proponents of the “interactive paradigm” of computing. They claim that there is a class of computations, called “interactive” computations, which computers can perform, but which Turing machines cannot perform. See, for example, Goldin and Wegner [2005], Goldin and Wegner [2008], Eberbach et al. [2004], van Leeuwen and Wiedermann [2001], Dodig-Crnkovic [2011], Smolka et al. [2006].

Interactive computations are computations which involve the computing device receiving input, and producing output, while in the middle of computation. The idea is that a Turing machine cannot perform these computations, because it can only receive input and produce output at the beginning and at the end of its computations.

Recall that a Turing machine’s “input” is simply the contents of its tape when it starts running. Its “output” is simply the contents of its tape when it stops running. The concept of a Turing machine contains no provision for receiving input, or giving output, while in the middle of computation. It starts with its input, it performs its computation, and it ends with its output.

This means that there are certain computer programs which Turing machines cannot implement. Goldin and Wegner [2008] give the example of a computer program designed to drive a car from one location to another. The program must take input from the environment (the movements of other cars, the movements of pedestrians, traffic signals, etc.) and respond to this input by giving instructions to the car to change its behavior in various ways (turning, accelerating, braking, etc.).

This program, by its nature, requires taking input from the environment and giving output to the environment while in the middle of computation. A computer can implement this program; a Turing machine cannot.

The reason that a Turing machine cannot implement this program is that a Turing machine cannot take new input while it is running. It must take all of its input at the start of computation, and produce all of its output at the end of computation. So in order to implement a driving program with a Turing machine, it would have to be the case that it was possible to supply all of the information about the environment before the computation began, run the computation, and then carry out all of the driving instructions after the computation was finished. But this is not possible, because what happens in the environment is dependent upon the instructions the computer gives to the car; so the environmental information at a given moment is unknown until the driving instructions of the previous moment have already been carried out.
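
The causal dependency described above can be illustrated with a minimal sketch in Python. Everything in it is a hypothetical toy invented for illustration (the `Road` class, the `controller` policy, and the one-dimensional "gap to the car ahead"); none of it comes from Goldin and Wegner. The only point is that each observation depends on the previous command, so the full sequence of inputs cannot be written down before the run begins.

```python
class Road:
    """Toy environment (hypothetical): tracks the gap to the car ahead."""

    def __init__(self, gap):
        self.gap = gap

    def observe(self):
        # What the controller "sees" at this moment.
        return self.gap

    def apply(self, command):
        # The environment's next state depends on the controller's output.
        if command == "brake":
            self.gap += 2
        else:  # "accelerate"
            self.gap -= 1


def controller(observation):
    """Toy driving policy: brake when the gap is small."""
    return "brake" if observation < 3 else "accelerate"


def drive(road, steps):
    """Alternate input and output; return observed inputs and issued commands."""
    inputs, outputs = [], []
    for _ in range(steps):
        obs = road.observe()    # input arrives in the middle of the run
        cmd = controller(obs)   # output is produced in the middle of the run
        road.apply(cmd)         # ...and that output shapes the next input
        inputs.append(obs)
        outputs.append(cmd)
    return inputs, outputs


inputs, outputs = drive(Road(gap=4), steps=5)
```

In this toy run the third observation is 2 only because the first two commands were "accelerate"; a different policy would have produced a different input sequence.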

Other examples of programs like this are programs which interact with human users, robotics programs, and programs which dynamically interact with other programs over the Internet (e.g., web servers). The interactive view of computation says that all of these programs are examples of interactive computations, and that none of them can be implemented by Turing machines.

Every behavior that Turing machines can implement can be thought of as computing a mathematical function. The behaviors of computers which Turing machines cannot implement are behaviors which cannot be thought of as computing a mathematical function. These behaviors involve interacting with processes outside the computation while the computation is running.

Various formal models of interactive computation exist. These include the π-calculus [Milner, 1999] and persistent Turing machines [Goldin, 2000; Goldin et al., 2004]. These models offer a rigorous way to talk about interactive computation.

However, as we will see, we do not need to depart completely from existing computational paradigms in order to model interactive computation. It is sufficient to augment existing models (e.g., Turing machines) with additional conceptual apparatus. We can reinterpret existing work on interactive computation in these terms.

1.1 Turing machines

Turing machines are a mathematical model which formalizes the concept of computation. They were invented by Alan Turing as a way of studying the mathematical properties of computers and things like computers.

A Turing machine consists of a tape, a tape head, and a set of state transition rules. The tape is a series of squares, each of which may be blank, or contain a symbol. The tape is unbounded in length in one direction, to the right. The tape head is located at a definite position on the tape at any given time, and it is capable of reading the symbol in the square it stands over, and writing a new symbol into that square.

There is a set of states, representing steps in the algorithm the Turing machine executes. The states include one starting state, and one halting state. The state transition rules specify what the Turing machine may do in each state.

A state transition rule says something of the form, “if you are in state s and the symbol under the tape head is c, then transition to state t and perform this action.” The action may be one of, “move the tape head left,” “move the tape head right,” or “write the symbol d under the tape head.”

So a Turing machine is similar to a computer, except that it is extremely spartan in its basic design.

Every Turing machine can be thought of as computing some mathematical function. We can think of the contents of the tape when the machine starts running as the input to the function; and we can think of the contents of the tape when the machine halts as the output of the function. If there exists a Turing machine which can compute any value of a given function in finite time, then that function is said to be “Turing-computable.”

Most mathematical functions encountered in practice are Turing-computable, but it is known that there are some functions which are not. For instance, no Turing machine can solve the “halting problem”: that is, the problem of determining whether or not an arbitrary Turing machine, with an arbitrary input, will eventually halt.

As another example, no Turing machine can compute the “busy beaver function,” introduced by Rado [1962]. The busy beaver function Σ(n) is (roughly) the largest number that can be written out by a halting Turing machine with n distinct states, started on a blank tape. No Turing machine can compute the busy beaver function. In outline: such a TM would need to have k states for some k, but it would output Σ(k+1) when given k + 1 as input. However, by definition, the largest number it can possibly output is Σ(k), leading to a contradiction. (This is only a sketch; a rigorous proof must also account for the states needed to write the input onto an initially blank tape.)

Let us finish by giving the set-theoretic definition of a Turing machine, as may be found, e.g., in Bridges [1994].

Definition 1.1.1. A Turing machine M = (K, Σ, δ, s0) is:

• A finite set K of states.

• A finite alphabet Σ, including the blank symbol # and not including L or R.

• A transition function δ : K × Σ → (K ∪ {h}) × (Σ ∪ {L, R}).

• s0 ∈ K is the initial state.

• h ∉ K is the halting state.

We describe the state of a Turing machine at a given time using a structure called a “configuration.” The configuration says what state the Turing machine is in, what the contents of its tape are, and where the tape head is located on the tape.

Definition 1.1.2. A configuration C = (s, w, n) of M is:

• s ∈ K ∪ {h}, the state of M.

• w ∈ Σ∗, the contents of M’s tape.¹

• n ∈ N, the position of M’s tape head.

Now we need to describe the way that the Turing machine’s configuration changes over time. We define a relation C ↦ C′, read “C yields C′,” meaning that if the Turing machine is in configuration C, its next computational step will leave it in configuration C′.

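Definition 1.1.3. Let M be a Turing machine, and C, C′ two configurations of M:

\[
C = (s, w, n), \qquad C' = (s', w', n').
\]

Then C ↦ C′ iff:

\[
(s, w[n], s', c) \in \delta, \qquad c \in \Sigma \cup \{L, R\};
\]

\[
n' =
\begin{cases}
n + 1 & \text{if } c = R, \\
n - 1 & \text{if } c = L \text{ and } n \neq 1, \\
n & \text{otherwise;}
\end{cases}
\]

\[
w' =
\begin{cases}
w & \text{if } c \in \{L, R\}, \\
w[c/n] & \text{otherwise.}
\end{cases}
\]

Here w[n] denotes the nth symbol of w, and w[c/n] denotes w with its nth symbol replaced by c.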
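
Definitions 1.1.1 to 1.1.3 can be read almost directly as a program. The following is a minimal sketch in Python, under some assumptions that are not part of the definitions: δ is represented as a dictionary, squares beyond the written part of the tape are treated as blank by padding with `#`, and a step budget (`max_steps`) is added so that the simulator itself always terminates. The example machine `delta` is hypothetical.

```python
BLANK = "#"
HALT = "h"


def step(delta, config):
    """One application of the 'yields' relation: C -> C'."""
    s, w, n = config                 # state, tape contents, head position (from 1)
    w = list(w)
    while len(w) < n:                # assumption: unwritten squares are blank
        w.append(BLANK)
    s2, c = delta[(s, w[n - 1])]
    if c == "R":
        n2 = n + 1
    elif c == "L":
        n2 = n - 1 if n != 1 else n  # the head cannot move left of square 1
    else:
        w[n - 1] = c                 # c is a symbol: write it, head stays put
        n2 = n
    return (s2, "".join(w), n2)


def run(delta, tape, start="s0", max_steps=10_000):
    """Iterate 'step' from the initial configuration until the halting state."""
    config = (start, tape, 1)
    for _ in range(max_steps):
        if config[0] == HALT:
            return config
        config = step(delta, config)
    raise RuntimeError("step budget exhausted (the machine may not halt)")


# Hypothetical example machine: overwrite each 'a' with 'b', halt at the first blank.
delta = {
    ("s0", "a"): ("s0", "b"),    # write b over a
    ("s0", "b"): ("s0", "R"),    # move right past b
    ("s0", BLANK): (HALT, BLANK),
}
final = run(delta, "aab")        # ("h", "bbb#", 4)
```

Because Σ excludes L and R, the simulator can tell a move from a write just by inspecting c, which is presumably why the definition imposes that restriction.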

The importance of Turing machines is that they are a general model of computation. Every computable function can be computed by a Turing machine. This fact is expressed in the “Church-Turing thesis.”

1.2 The Church-Turing thesis

The Church-Turing thesis sprang out of research into the concept of “effective calculability.” This was an informal, intuitively understood concept which mathematicians of the time worked with. It corresponds to our modern concept of computability.

A function is said to be effectively calculable if there is some procedure (an “effective method”) to produce the values of the function, which a person can carry out in finite time, following a precisely and finitely describable set of instructions, without requiring any insight or ingenuity on the part of the person. An example of an effective method is the technique of performing long division which is taught in elementary schools.

¹ Note the use of the Kleene star. Σ∗ means “the set of all strings over symbols in Σ.”

Several individuals attempted to formalize the concept of effective calculability, by inventing mathematical formalisms in which, it was hoped, every effective method could be expressed. There were three such contemporaneous attempts:

• Alan Turing invented Turing machines.

• Alonzo Church invented the λ-calculus.

• Alonzo Church, Stephen Kleene, and J.B. Rosser invented µ-recursive functions.

It was proven that all three of these models computed the same set of functions. Subsequent work involved designing many different models of computation in order to determine if any of these models had capabilities that surpassed those of Turing machines, the λ-calculus, and µ-recursive functions. In every case they were found to be equivalent.

For example, it was found that non-deterministic Turing machines are equivalent to Turing machines [Lewis and Papadimitriou, 1981]. A non-deterministic Turing machine is like a Turing machine, except that at every step of computation, it is capable of splitting into multiple copies of itself which follow divergent computation paths.
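
One standard way to see why this equivalence is plausible is that a deterministic procedure can explore all branches of a non-deterministic machine breadth-first. The sketch below, in Python, illustrates that idea only; it is not the construction from Lewis and Papadimitriou. It assumes that δ maps each (state, symbol) pair to a set of possible outcomes, that "success" means some branch reaches the halting state, and that a budget (`max_configs`) bounds the search. The example machine is hypothetical.

```python
from collections import deque

BLANK, HALT = "#", "h"


def successors(delta, config):
    """All configurations reachable in one step from config."""
    s, w, n = config
    w = w + BLANK * max(0, n - len(w))   # assumption: unwritten squares are blank
    result = []
    for s2, c in sorted(delta.get((s, w[n - 1]), ())):
        if c == "R":
            result.append((s2, w, n + 1))
        elif c == "L":
            result.append((s2, w, max(1, n - 1)))
        else:
            result.append((s2, w[:n - 1] + c + w[n:], n))
    return result


def some_branch_halts(delta, tape, start="s0", max_configs=10_000):
    """Breadth-first search over branches; False if none halts within the budget."""
    frontier = deque([(start, tape, 1)])
    seen = set(frontier)
    while frontier and len(seen) <= max_configs:
        config = frontier.popleft()
        if config[0] == HALT:
            return True
        for nxt in successors(delta, config):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False


# Hypothetical machine: on 'a' either keep scanning or jump to a dead state 'd';
# on 'b' halt. Some branch halts exactly when the tape contains a 'b'.
delta = {
    ("s0", "a"): {("s0", "R"), ("d", "R")},
    ("s0", "b"): {(HALT, "b")},
}
found = some_branch_halts(delta, "aab")
not_found = some_branch_halts(delta, "aaa")
```

The search is deterministic even though the machine it explores is not, which is the intuition behind the equivalence result.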

The observation that all of these models of computation compute the same set of functions is expressed in the Church-Turing thesis.

The Church-Turing thesis (CT) states that every effectively calculable function can be computed by a Turing machine. It therefore follows that every effectively calculable function can also be computed by the λ-calculus, by µ-recursive functions, by a non-deterministic Turing machine, etc. Thus, according to the CT, the notions of computability, effective calculability, and Turing-computability are equivalent. A mathematical function is computable if and only if it is possible to write a Turing machine which computes it.

Since the notions of computability and effective calculability have no formal, mathematical definitions, the CT is not a mathematical theorem, and it cannot be formally proven. Nonetheless, it is universally accepted by computer scientists, and it is one of the foundational ideas of the field.

1.3 Interpretations of the Church-Turing thesis

The Church-Turing thesis, since it is an informal and imprecise claim, has various interpretations. A conservative interpretation, what Piccinini [2011] calls the “Mathematical CT,” is that every computable function can be computed by a Turing machine, and that all formalisms for specifying computable functions (the λ-calculus, µ-recursive functions, etc.) are equivalent to Turing machines.

The Church-Turing thesis can also be understood as having implications about the capabilities of physical systems. Piccinini defines the “Modest Physical CT,” which says that it is impossible to construct a physical system which can compute a non-Turing-computable function.

Piccinini also defines the “Bold Physical CT,” which says, roughly, that the behavior of every physical system is Turing-computable. This would mean, for instance, that it would be possible to construct a Turing machine which computed everything about the future state of the universe, given complete information about the present state of the universe.²

Goldin and Wegner [2008] define another interpretation of the CT, which they call the “Strong Church-Turing Thesis” (SCT). They describe the SCT as the claim that Turing-computation and computation are the same thing. Roughly, the idea is that everything that physically realizable computers do can be done by Turing machines.

Goldin and Wegner claim that the SCT is false. In formulating this claim, they have interactive systems in mind. They want to say that computation is the same thing as interactive computation, and that Turing computation is a subset of interactive computation. They have in mind examples such as the program that drives a car, the program that interacts with a user via teletype, etc. Before examining Goldin and Wegner’s claim in more detail, we want to clarify the concept of an interactive system.

1.4 Interactive systems

An “interactive system” is a set of components, some or all of which are computer algorithms, which execute behaviors that involve communicating with each other. We follow Milner [1999] in referring to the components of interactive systems as “processes.”

A process is a physical system which exchanges information with other processes. For our purposes it is either a computer, or something that can exchange information with a computer, such as an I/O device or something that can interact with an I/O device. Thus, for instance, we can think of a computer keyboard as a process, and we can also think of a human typing at a computer keyboard as a process.

Let us consider some real-world examples of interactive systems. Such systems include the Internet, cell phone networks, and concurrent software which involves multiple interacting threads. They also include cases in which a human user interacts with a computer; we consider the human to be one process, and the computer to be another process. We may also think of object-oriented programming as an interactive paradigm, by thinking of each object as a process, and saying that they interact with each other when they invoke each other’s methods.

² There is a practical qualifier to this line of thinking, which is that such a Turing machine would presumably be too large to fit inside the universe. So this Turing machine can exist only in thought experiment, and not in reality.

Figure 1: An example of an interactive system, drawn abstractly.

There are various mathematical models of interactive computation. The problem of formulating an ideal formal model of interactive computation is an interesting area of active research. Later we shall work with some such models: namely, persistent Turing machines and the π-calculus. For now we shall discuss interactive computation in an informal fashion.

We can draw an interactive system in a diagram such as Figure 1. Each node represents a process, and each arrow represents a point of communication between two processes, with information being sent in the direction of the arrow.

Each process has its own internal state, which it updates periodically. Furthermore, each process periodically sends and receives information along its communication channels. (We do not specify, in this informal introduction, the precise nature of this communication. For instance, we do not specify whether it is synchronous or asynchronous.)

We can consider two types of processes: computable processes, and non-computable processes.³ Intuitively, a computable process is one whose behavior can be simulated by a Turing machine; and a non-computable process is one whose behavior cannot be simulated by a Turing machine.

A computable process could represent a computer or a computer program. A non-computable process could represent something such as a human user or an instrument for interacting with the outside environment. (For example, in the car-driving program, we could have a non-computable process representing the video camera which monitors the road.)

If a process is computable, then there is some Turing machine which takes as input the internal state of the process and the contents of its input channels at a given moment, and produces as output the internal state of the process and the contents of its output channels at the next moment. If a process is non-computable, then there is no such Turing machine.

Various formal models of interactive systems exist. Let us now turn to one such model: the persistent Turing machine, or PTM.

³ We do not assume the Bold Physical CT, under which there are no non-computable processes.

1.5 Persistent Turing machines

A persistent Turing machine (PTM) is a modification of a Turing machine designed for modeling interactive computation. It is a non-deterministic three-tape Turing machine (N3TM) with a read-only input tape, a read-write work tape, and a write-only output tape.

A PTM is non-deterministic. What this means is that, at every step in computation, it may branch into multiple copies of itself, which follow different computation paths. (One may notice the similarity to quantum non-determinism.) Note carefully that nobody thinks that PTMs’ non-determinism makes them more powerful than Turing machines; non-deterministic Turing machines are Turing-equivalent.

The PTM goes through a series of cycles, called “macrosteps.” In each macrostep, it gets an input string from the environment, performs a computation, and supplies an output string to the environment, in that order. In each macrostep, it starts with the work tape contents that were present at the end of the last macrostep: so we say that the work tape is “persistent” across macrosteps.
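
The macrostep cycle can be sketched informally in Python. This is a toy stand-in rather than a PTM in the formal sense: it is deterministic, it abstracts each macrostep as an ordinary function from (input string, work tape) to (output string, work tape), and the names `PersistentMachine` and `remember_all` are hypothetical.

```python
class PersistentMachine:
    """Toy stand-in for a PTM: the work tape persists across macrosteps."""

    def __init__(self, macrostep, work=""):
        # macrostep: (input_string, work_tape) -> (output_string, work_tape)
        self.macrostep = macrostep
        self.work = work

    def interact(self, input_string):
        """Run one macrostep: take input, compute, give output, keep the work tape."""
        output, self.work = self.macrostep(input_string, self.work)
        return output


def remember_all(input_string, work):
    """Example macrostep: output everything seen so far, including the new input."""
    new_work = work + input_string
    return new_work, new_work


machine = PersistentMachine(remember_all)
outputs = [machine.interact(x) for x in ["a", "b", "c"]]
```

Here the outputs are "a", "ab", "abc": the response to an input depends on the history stored on the work tape, not on the current input alone.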

We can model an interactive system as a set of PTMs strung together in a sequence. For instance, maybe we have PTMs called A, B, and C. The output of A becomes the input of B, and the output of B becomes the input of C.
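
A chain of this kind can be sketched as follows, again as an informal Python toy rather than a formal construction. The helper `make_machine` and the three example macrosteps are hypothetical; each machine keeps its own work tape between calls, and each environment input is passed through A, then B, then C.

```python
def make_machine(macrostep, work=""):
    """Return an interact(input) function whose work tape persists between calls."""
    state = {"work": work}

    def interact(input_string):
        output, state["work"] = macrostep(input_string, state["work"])
        return output

    return interact


def shout(inp, work):
    """A: stateless, upper-cases its input."""
    return inp.upper(), work


def number(inp, work):
    """B: prefixes a running count of macrosteps, stored on the work tape."""
    n = int(work or "0") + 1
    return f"{n}:{inp}", str(n)


def pair_with_previous(inp, work):
    """C: outputs the previous input joined to the current one."""
    return f"{work}|{inp}", inp


a, b, c = make_machine(shout), make_machine(number), make_machine(pair_with_previous)

# The output of A becomes the input of B; the output of B becomes the input of C.
results = [c(b(a(x))) for x in ["go", "stop"]]
```

The second result, "1:GO|2:STOP", depends on state held separately by B (the count) and by C (the previous string).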

One shortcoming of PTMs is that they can only easily model interactive systems in which all of the processes are chained together in a sequence like this. They could not, for instance, model the system drawn in Figure 1, because some of the processes have multiple inputs and multiple outputs. So we say that PTMs model “sequential interaction,” rather than interaction in general.

See Subsection 2.1 for the formal definition of a PTM. It is important to note that a PTM is simply an N3TM, interpreted in a certain way. The set-theoretic definition of a PTM is the set-theoretic definition of an N3TM.

The definition of an N3TM, in turn, is a simple variation on the definition of a Turing machine given in Section 1.1. The only differences are that there are three tapes, and δ is one-to-many, rather than one-to-one (a given state and tape reading may have several possible outcomes), capturing the idea that the N3TM is non-deterministic. (We call it a transition relation, rather than a transition function.)

The difference between an N3TM and a PTM is that we go on to make some informal statements about how that set-theoretic definition is to be interpreted. We say that the first tape is a “read-only input tape,” the second tape is a “read-write work tape,” and the third tape is a “write-only output tape.” Then we say that the PTM goes through a series of “macrosteps,” in each of which it receives input, performs a computation, and produces output.

1.6 Is interactive computation a superset of Turing computation?

Let us scrutinize the claim that interactive computation is a superset of Turing computation. Goldin and Wegner [2008] say that a certain interpretation of the Church-Turing thesis, which they call the “Strong Church-Turing Thesis,” is false, because Turing machines cannot model interactive systems. Are they correct in saying this?

Goldin and Wegner argue that certain real-world computer systems (e.g., a car-driving program) cannot be implemented by a Turing machine. They claim, furthermore, that they have formally proven that the SCT is false (see page 29 of Goldin and Wegner [2008]). However, they give only a vague sketch of such a proof; and we can see on independent grounds that there cannot be such a proof.

The Church-Turing thesis is not a formal claim of pure mathematics; for this reason, there is no formal proof of the Church-Turing thesis. The same is true of the SCT. There cannot be a formal proof that the SCT is true or false, because it is not a formal claim.

In asking whether the SCT is true or false, we need to realize that what we are asking is not a mathematical question. It is more of a philosophical question.

Goldin and Wegner argue that PTMs can model systems that Turing machines cannot model, and that the SCT is therefore false. One important question to ask here is, “are PTMs actually a different thing from Turing machines?” I suggest that PTMs are actually a kind of Turing machine.

If we say “PTMs and Turing machines are different things,” how are we to ascribe meaning to that statement? It cannot mean, “a (K, Σ, δ, s0) is not a (K, Σ, δ, s0),” because that is false. But if that is not what it means, then I don’t know what it means.

Informally, we think of a PTM as having a read-only input tape, a read-write work tape, and a write-only output tape. We think of it as being something which can interact with other processes. But why can’t I think of an N3TM as having those properties? Is a PTM anything other than an N3TM which I think of as having those properties?

Formally speaking, the difference between a PTM and an N3TM is that a PTM is a Turing machine which has been augmented with the concept of macrosteps. We say that a PTM goes through a series of macrosteps, through which a work tape persists; whereas we do not say this of an N3TM.

But instead of saying, “PTMs go through a series of macrosteps, whereas N3TMs do not,” couldn’t I say, “N3TMs sometimes go through a series of macrosteps, and sometimes do not?” Does the content of those two statements differ in any way? If so, what is the difference?

This is clearer if we substitute the terms “PTM” and “N3TM” with their set-theoretic definitions. Then the first statement becomes, “a (K, Σ, δ, s0) goes through a series of macrosteps, whereas a (K, Σ, δ, s0) does not.” The second statement becomes, “a (K, Σ, δ, s0) sometimes goes through a series of macrosteps, and sometimes does not.”

We could throw out the word “PTM” entirely without losing any expressive power. I could just replace every instance of “PTM” with “N3TM with macrosteps,” and I wouldn’t be saying anything different.

For instance, let us re-write the statement, “PTMs can model interactive systems, whereas Turing machines cannot.” This statement becomes, “when N3TMs are augmented with the concept of macrosteps, they can model interactive systems, and otherwise they cannot.”

Incidentally, we can take the second statement as a clarified version of, “the SCT is false.” It seems to be true that Turing machines cannot model interactive systems without invoking some additional concepts; and this captures the thrust of Goldin and Wegner’s argument.

(We say, “some additional concepts,” rather than “the concept of macrosteps,” for the following reason. Turing machines plus macrosteps cannot model all interactive systems; they can only model sequentially interactive systems. So it is not clear that macrosteps are the right concept with which to augment Turing machines. But I think that it is clear that some conceptual augmentation is required for Turing machines to model interactive systems.)

Let us take this as our working formulation of the SCT: “Turing machines can model interactive systems without invoking any additional concepts.” (Later we shall consider other, non-equivalent formulations.) Now we may ask: is the SCT true or false?

We have already pointed out that the SCT is not a claim of pure mathematics. There cannot be a formal proof that it is true or false, because it cannot even be stated formally.

The SCT, besides not being a mathematical claim, is also not an empirical claim. There is no scientific experiment we could perform which would determine whether the SCT is true or false.

It would seem, then, that we are not likely to be able to rigorously prove or falsify the SCT. We cannot expect to meet the scientist’s or the mathematician’s standard of proof. We may at best hope for the philosopher’s standard of proof.

Before addressing this issue in further detail, it is worth reflecting on the most mysterious word in our formulation of the SCT: that is, the word “model.” What does it mean for a mathematical formalism (such as a Turing machine or a Turing machine with macrosteps) to be a model of a physical system? Let us consider an example.

I can write a computer program (a Perl program, let us say) which prints out a list of the first 1,000 prime numbers. I can also design a Turing machine which does the same thing. We say that the Turing machine is a “model” of the Perl program.

There are many differences between the Turing machine and the Perl program. The Turing machine is not written in Perl. They may differ in many details of the algorithm they use. The Perl program may write its output to a file, or to a teletype interface; whereas the Turing machine writes its output to its output tape.

A Perl program does not have a set of tapes; it does not have a read-write head; it does not have a table of transition rules. But still we say that the Turing machine is a model of the Perl program.

There must be some relevant similarity between a model and the thing it models. In this case the similarity is just this: that the Turing machine and the Perl program compute the same function.

It is this sort of similarity that we are talking about when we say that a PTM is a “model” of a car-driving program. The similarity is not that they are computing the same function, because neither one of them computes a function. But in some sense, they have the same behavior. If the car-driving program produces a given output from a given input and internal state, then the PTM produces the same output from the same input and internal state. That is one way of describing the similarity.

What does it mean for something to model something else? We speak, for example, of a scale model of a building: a little construction of wood, glue, and paint which captures the general features of a real or planned building.

We also say that Newtonian physics models the behavior of the physical world, within certain restrictions of scale, speed, etc. This is closer to the meaning we are discussing. In this context, by “model” we specifically mean a mathematical formalism which captures some salient features of certain physically existing things.

In order to call something a “model,” we might want it to say everything that can be said about the thing modeled. For example, perhaps a physicist could say what properties one would have to enumerate, for a particular atom, in order to have exhaustively described that atom.

But something can also be a model without being an exhaustive description. Generally when we model physical objects, we do not model everything about them. For instance, a description of a baseball in the language of Newtonian physics probably does not mention the color of the stitches on the ball, or whether or not it had previously been thrown by a famous baseball player. It may mention nothing more than the mass and position. And yet we still call this a model of the baseball.

So a model does not have to be an exhaustive description of a physical object; it can be something which merely captures some salient features of the object. And it is in this sense that our Turing machine that computes prime numbers is a model of the Perl program that computes prime numbers. The salient feature that the Turing machine captures is the function that the Perl program computes.

Now let us try to stretch our understanding of the word “model.” Suppose I find a strange snake in the wilderness which has red, green, and blue scales, arranged in a complex pattern on its back. Suppose, furthermore, that I find an algorithm which predicts the pattern, and I write a Turing machine (T) which computes that algorithm. Uncontroversially, we say that T models the pattern of the snake’s scales.

Suppose, however, that T does not output the correct pattern. Suppose, furthermore, that I can write another Turing machine (U) which takes T’s output, and processes it to produce the correct pattern. We would probably say that T and U together form a (somewhat awkward) model of the pattern of the snake’s scales.

But, here is a hard question: is T by itself a model of the pattern of the snake’s scales? On the one hand, we can say that it is, because T “contains” the desired information, in a coded, implicit form, which can be extracted with the appropriate method. On the other hand, we can say that it is not, because the relationship is too indirect.

What I would like to illustrate with this dilemma is that “model” has no precise meaning. It is a concept with fuzzy boundaries. In the gray area between “model” and “not a model,” there are things which are neither clearly models, nor clearly not models.

I would say that T is sort of a model of the snake’s scales, and I do notthink we will get anywhere in trying to give a hard “yes” or “no” answer to thequestion of whether or not it is a model of the snake’s scales.

Now let us return to our question: is interactive computation a superset ofTuring computation? Is the SCT true or false? As I have already stated, Ithink that we can re-formulate this question as: can Turing machines modelinteractive systems without invoking any additional concepts?

By “interactive systems” we mean physically realized computing systemswhich interact with their environment while running: the car-driving program,a program interacting with a human user, etc. We want to know whether ornot Turing machines can model these systems.

As I said before, we cannot resolve this question either with a formal proof,or by the scientific method. So it is not entirely clear how it is to be resolved.Nonetheless, I think that it has a fairly clear answer.

I think that the SCT is false; that Turing machines cannot model interactive systems without invoking additional concepts. Let us briefly reiterate the argument. A Turing machine cannot accept new input while it is running. This implies, in particular, that a Turing machine’s input cannot have causal dependencies upon its output. I see no way around these problems that doesn’t involve introducing the concept of macrosteps, or something analogous.

Having stated what I think is a true interpretation of “the SCT is false,” I will now state what I think are false interpretations of that claim.

First of all, I take issue with the statement, “PTMs can model interactive systems, whereas Turing machines cannot.” I am inclined to re-write it as, “a (K, Σ, δ, s0) can model interactive systems, whereas a (K, Σ, δ, s0) cannot.” If that is what the statement means, then it is false; and if that is not what the statement means, then I don’t know what it means.

Another interpretation of “the SCT is false” would be, “interactive systems can compute non-Turing-computable functions.” Is this true or false?

Theorem 2.4.10 will show that there is a “timeless simulation” of every sequentially interactive system. In other words, given any sequence of PTMs strung together, and a sequence of input tape states to be given to that sequence of PTMs, there exists a Turing machine which will compute the series of output tape states that the sequence of PTMs produces.

This would seem to demonstrate that (sequentially) interactive systems cannot compute non-Turing-computable functions. But it doesn’t quite do that, for the following reason. If a sequentially interactive system contains non-computable processes, then the simulating Turing machine does not say what those non-computable processes do; it only gives the set of possible things that those non-computable processes could do. The simulating Turing machine cannot say what the non-computable processes actually do, but only what they might do.

So, suppose that we construct an interactive system which contains a non-computable process that can compute a non-computable function. For instance, suppose that we have a PTM interacting with a magical oracle which can solve the halting problem. The PTM is programmed to take a Turing machine as input, give the Turing machine to the oracle, receive the oracle’s answer concerning whether or not the Turing machine halts on all inputs, and then give us that answer.

We would say that this interactive system computes a non-Turing-computable function. But there are limitations to the significance of this point. First, we have no reason to believe that such an oracle actually physically exists in our universe; so this interactive system exists for us only in thought experiment. Second, the PTM in this thought experiment is not, so to speak, doing the “heavy lifting” of actually computing a non-computable function; it just passes the job on to the oracle and then reports the oracle’s findings. Third, this is not a new thought experiment. The oracle Turing machine idea has been around since Turing [1939]; and the interactive paradigm doesn’t really add anything new to the thought experiment.

The essential point is that we can’t actually build an interactive system which computes a non-Turing-computable function. Suppose one went up to the CEO of a hardware company and said, “hey, interactive systems can compute non-Turing-computable functions; you should invest in researching them.” The CEO would be excited to hear about this powerful new technology. But one would need to add, “oh, sorry boss, none of the interactive systems we can actually build can do that.” After hearing that, the CEO wouldn’t be so excited.

Goldin and Wegner do not seem to be claiming that interactive systems can compute non-Turing-computable functions. So “the interactive thesis is false because interactive systems can’t compute non-Turing-computable functions” is a strawman argument.

That said, it would be easy to misinterpret Goldin and Wegner as claiming that interactive systems can compute non-Turing-computable functions. From Cockshott and Michaelson [2007], it looks like such a miscommunication has already occurred at least once in the literature. I expect this particular miscommunication to continue to occur in the future. People have a mental habit of equating the topic of the limitations of Turing machines with the topic of Turing-computable functions. Our question concerning interactive systems is one of the few places in which it isn’t correct to equate the two topics.

1.7 Implications

Why should we care whether or not interactive computation is a superset of Turing computation?

Firstly, Goldin and Wegner’s findings point out a certain deficiency in the common understanding of the Church-Turing thesis. Computer scientists would typically say that every physically realized computational process can be modeled as a Turing machine. We have seen that this is not quite correct.

Secondly, the interactive paradigm in general is a useful one. Our formal models of computation were designed in the early days of computing. Computers were things that solved math problems in batch mode. At that time, our models of computation corresponded in a natural way to real computers.

Computers have since changed. They are no longer things that solve math problems in batch mode. They are things that do a vast panoply of tasks; and these tasks usually involve dynamic interaction with people and the physical world. Models of interactive computation are a more natural way of modeling modern computers. They are a map with a tidier correspondence to the territory.

It is not really accurate, however, to say that interactive systems constitute a form of “super-Turing computation.” Interactive systems cannot compute non-Turing-computable functions. So I would consider the phrase to be a misleading one.

2 Persistent Turing Machines

Now we shall consider some formal properties of PTMs, which are relevant to the question of whether or not interactive computation is a superset of Turing computation.

A persistent Turing machine, or PTM, is a minimal modification of a Turing machine which lets it act as a process in an interactive system.

A PTM is a non-deterministic three-tape Turing machine (N3TM). One of its tapes is a read-only input tape, one of its tapes is a read-write work tape, and one of its tapes is a write-only output tape. It goes through a series of “macrosteps,” each of which is a single halting Turing machine computation that begins in the starting state and ends in the halting state.

In every macrostep, it is assumed that the PTM begins having received a string on its input tape from some external input source. It performs computations which leave a string on its output tape, and it is assumed that that string is given to some external output sink once the PTM is finished computing this macrostep.

Furthermore, the PTM’s work tape state is preserved between macrosteps; thus, the work tape state at the start of a given macrostep is the work tape state at the end of the previous macrostep. This is an important property, because it lets the PTM “remember” what happened before this macrostep, in whatever way is computationally important.
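The macrostep cycle described above can be illustrated with a minimal sketch in Python. It is not Goldin et al.'s construction; it assumes a deterministic PTM abstracted as a function from (work tape, input tape) to (new work tape, output tape), with tapes represented as strings. The names `run_macrosteps` and `running_total` are hypothetical and introduced only for illustration.

```python
from typing import Callable, Iterable, Iterator, Tuple

# Assumption: a deterministic PTM is abstracted as a function
# (work tape, input tape) -> (new work tape, output tape).
Macrostep = Callable[[str, str], Tuple[str, str]]


def run_macrosteps(step: Macrostep, inputs: Iterable[str]) -> Iterator[str]:
    """Feed inputs to the PTM one macrostep at a time, yielding each output."""
    work = ""  # the work tape is empty before the first macrostep
    for w_in in inputs:
        # The work tape left by this macrostep is the one the next starts with.
        work, w_out = step(work, w_in)
        yield w_out


def running_total(work: str, w_in: str) -> Tuple[str, str]:
    """Hypothetical PTM: remembers the sum of all inputs seen so far."""
    total = int(work or "0") + int(w_in)
    return str(total), str(total)
```

Because `run_macrosteps` is a generator, the caller may choose each input only after seeing the previous output, which is the kind of dependence of input on output that the surrounding discussion is concerned with. For example, `list(run_macrosteps(running_total, ["3", "4", "5"]))` evaluates to `["3", "7", "12"]`.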

Why are PTMs non-deterministic? They don’t need to be. The interactivity of PTMs doesn’t depend on their non-determinism. Nobody is trying to claim that the non-determinism of PTMs lets them do anything that ordinary TMs can’t do. Non-deterministic TMs are Turing-equivalent. PTMs could have been defined as deterministic, and we would still be saying most of the same things about them.

The non-determinism has the following advantage. With it, we can model non-deterministic processes in the real world. For instance, suppose we have a process which is an instrument that reads the state of a quantum system, say a digital Geiger counter. Because the Geiger counter’s behavior is unpredictable, we can’t write a deterministic PTM to model the Geiger counter. What algorithm would the PTM implement?

We can model the Geiger counter as a non-deterministic PTM. Instead of writing an algorithm which says what output the Geiger counter produces, we can write an algorithm which non-deterministically branches into all of the possible output states that the Geiger counter could have. This gives us a provisional sort of model that we can use to talk about the Geiger counter.

Now we shall review the formal definition of PTMs, and explore some of their mathematical properties which are relevant to our question.

2.1 Definition of persistent Turing machines

We repeat the definition of persistent Turing machines, as it is given by Goldin et al. [2004]. See that paper for further details. Persistent Turing machines are a modification of nondeterministic three-tape Turing machines (N3TMs). Hence, we begin with the definition of N3TMs.

Definition 2.1.1 A nondeterministic three-tape Turing machine (N3TM) $M = (K, \Sigma, \delta, s_0)$ is:

• A finite set $K$ of states.

• A finite alphabet $\Sigma$, including the blank symbol $\#$ and not including $L$ or $R$.

• A transition relation $\delta \subseteq K \times \Sigma \times \Sigma \times \Sigma \times (K \cup \{h\}) \times (\Sigma \cup \{L,R\}) \times (\Sigma \cup \{L,R\}) \times (\Sigma \cup \{L,R\})$.

• $s_0 \in K$ is the initial state.

• $h \notin K$ is the halting state.

An N3TM is deterministic (a D3TM) iff $\delta$ is a function $\delta : K \times \Sigma \times \Sigma \times \Sigma \to (K \cup \{h\}) \times (\Sigma \cup \{L,R\}) \times (\Sigma \cup \{L,R\}) \times (\Sigma \cup \{L,R\})$.

The N3TM is equipped with three unbounded tapes. Each tape is divided into a series of cells, each of which holds a symbol from $\Sigma$; the contents of a whole tape are thus a string in $\Sigma^*$.

Each tape has a tape head which, at any given time, is over a particular cell. The configuration of the N3TM at any given time is described by giving its state, the contents of its tapes, and the locations of its tape heads. Formally:

Definition 2.1.2 A configuration $C = (s, w_1, w_2, w_3, n_1, n_2, n_3)$ of $M$ is:

• $s \in (K \cup \{h\})$, the state of $M$.

• $w_1, w_2, w_3 \in \Sigma^*$, the contents of $M$’s tapes.

• $n_1, n_2, n_3 \in \mathbb{N}$, the positions of $M$’s tape heads.

The transition relation describes how the configuration of the N3TM evolves over time. Each transition is called, in Goldin et al.’s terminology, a “microstep.”

The N3TM evolves according to the following process. Suppose that the N3TM is in state $s$, and the symbols under its tape heads are $a$, $b$, and $c$. Find all rules $(s, a, b, c, s', a', b', c') \in \delta$. If there are multiple such rules, then the N3TM splits into multiple copies of itself, each of which follows one of the available rules.

In the new configuration for a given copy, the N3TM’s state is $s'$. If $s' = h$, then the N3TM halts. If $a' = L$, then the first tape’s head moves one cell to the left. If $a' = R$, then the first tape’s head moves one cell to the right. Otherwise, the symbol in the current cell becomes $a'$. The same rule applies to $b'$ and the second tape, and to $c'$ and the third tape. Formally:

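Definition 2.1.3 Let $M$ be an N3TM and $C, C'$ be two configurations of $M$:

$$C = (s, w_1, w_2, w_3, n_1, n_2, n_3), \qquad C' = (s', w'_1, w'_2, w'_3, n'_1, n'_2, n'_3).$$

Then $C \mapsto C'$, i.e. $C$ yields $C'$ in one microstep, iff, for $i = 1, 2, 3$:

$$(s, w_1[n_1], w_2[n_2], w_3[n_3], s', c_1, c_2, c_3) \in \delta,$$

$$n'_i = \begin{cases} n_i + 1 & \text{if } c_i = R; \\ n_i - 1 & \text{if } c_i = L \text{ and } n_i \neq 1; \\ n_i & \text{otherwise,} \end{cases}$$

$$w'_i = \begin{cases} w_i & \text{if } c_i = L \text{ or } c_i = R; \\ w_i[c_i/n_i] & \text{otherwise.} \end{cases}$$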

If M is a D3TM, then for every configuration C there is at most one solution C′ to the relation C ↦ C′, because δ is a function.
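As an illustration of the microstep relation, here is a minimal sketch in Python. It rests on several assumptions that are not part of the definition: tapes are stored as tuples of symbols, any cell beyond the stored portion of a tape is read as the blank symbol, and δ is given as a set of 8-tuples. The helper names `read`, `write`, and `microsteps` are hypothetical.

```python
BLANK = "#"  # the blank symbol of the alphabet


def read(tape, n):
    """Symbol under a head at 1-based position n (assumed blank past the end)."""
    return tape[n - 1] if n <= len(tape) else BLANK


def write(tape, n, symbol):
    """Return a copy of the tape with cell n replaced by symbol, i.e. w[c/n]."""
    cells = list(tape) + [BLANK] * max(0, n - len(tape))
    cells[n - 1] = symbol
    return tuple(cells)


def microsteps(delta, config):
    """Return the set of all configurations C' such that C yields C'."""
    s, w1, w2, w3, n1, n2, n3 = config
    tapes, heads = (w1, w2, w3), (n1, n2, n3)
    scanned = tuple(read(w, n) for w, n in zip(tapes, heads))
    successors = set()
    for rule in delta:
        # A rule applies when its state and three scanned symbols match.
        if rule[0] != s or tuple(rule[1:4]) != scanned:
            continue
        s_next, actions = rule[4], rule[5:8]
        new_tapes, new_heads = [], []
        for w, n, c in zip(tapes, heads, actions):
            if c == "R":
                new_tapes.append(w)
                new_heads.append(n + 1)
            elif c == "L":
                new_tapes.append(w)
                new_heads.append(n - 1 if n != 1 else n)  # no move off the left end
            else:
                new_tapes.append(write(w, n, c))
                new_heads.append(n)
        successors.add((s_next, *new_tapes, *new_heads))
    return successors
```

When δ is a function, at most one rule matches any configuration, so the returned set has at most one element. When several rules match, each element of the set corresponds to one of the “copies” into which the machine splits.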

It will also be useful to define a microsequence, i.e., a series of microsteps:

Definition 2.1.4 Let $M$ be an N3TM and $C, C'$ two configurations of $M$. We say that $C \mapsto^* C'$ iff there exist configurations $C_0, \ldots, C_n$ for some $n \geq 0$ such that $C = C_0$, $C' = C_n$, and $C_i \mapsto C_{i+1}$ for all $0 \leq i < n$. $\mapsto^*$ is the reflexive transitive closure of $\mapsto$.

This completes our definition of N3TMs. Now we proceed to PTMs.

A persistent Turing machine is an N3TM in which the first tape is considered to be a read-only input tape, the second tape is considered to be a read-write work tape, and the third tape is considered to be a write-only output tape.

The PTM goes through an unbounded series of “macrosteps,” each of which consists of a series of microsteps terminating in the halting state $h$. Each macrostep begins with the PTM having been provided with the contents $w_i$ of its input tape, which we suppose to have come from some other process. It ends with the PTM having produced the contents $w_o$ of its output tape, which we suppose to be passed on to some other process.

Throughout the series of macrosteps, the contents of the PTM’s work tape are preserved (hence the term “persistent”), with the contents of the work tape at the end of one macrostep being the contents of the work tape at the beginning of the next macrostep.

When $P$ is a PTM, $w \xrightarrow{w_i/w_o}_P w'$ means that when $P$ starts running with work tape $w$ and input tape $w_i$, it halts with work tape $w'$ and output tape $w_o$. Formally:

Definition 2.1.6 Let $M$ be a PTM, $\Sigma$ its alphabet, and $w_i, w, w', w_o \in \Sigma^*$. Then $w \xrightarrow{w_i/w_o}_M w'$, i.e. $w$ yields $w'$ in one macrostep, iff, for some $s, n_i, n'_i$, with $i = 1, 2, 3$:

$$(s, w_i, w, \varepsilon, n_1, n_2, n_3) \mapsto^* (h, w_i, w', w_o, n'_1, n'_2, n'_3).$$

($\varepsilon$ denotes the empty tape.)

Note that if $M$ is deterministic, then for every $w, w_i$ there is at most one solution $w_o, w'$ to the relation $w \xrightarrow{w_i/w_o}_M w'$.
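The macrostep relation can be sketched as a search over microsequences. The following Python fragment is an illustration under stated assumptions, not a definitive implementation: the microstep relation is supplied by the caller as a function `successors` from a configuration to a set of configurations; the macrostep is assumed to start in the initial state with all heads at position 1 (the definition above only says “for some” state and positions); and a bound `max_configs`, which is not part of the definition, guards against searches that do not terminate. The name `macrostep` is hypothetical.

```python
from collections import deque

HALT = "h"  # the halting state


def macrostep(successors, s0, w_in, w_work, max_configs=10_000):
    """Return every pair (w', w_o) reachable in one macrostep.

    successors: function mapping a configuration to the set of
                configurations it yields in one microstep.
    """
    # Assumed start: state s0, empty output tape, all heads at cell 1.
    start = (s0, tuple(w_in), tuple(w_work), (), 1, 1, 1)
    seen = {start}
    queue = deque([start])
    outcomes = set()
    while queue:
        config = queue.popleft()
        if config[0] == HALT:
            # Record the final work tape and output tape of this branch.
            outcomes.add((config[2], config[3]))
            continue
        for nxt in successors(config):
            if nxt not in seen:
                if len(seen) >= max_configs:
                    raise RuntimeError("search bound exceeded")
                seen.add(nxt)
                queue.append(nxt)
    return outcomes
```

For a deterministic machine the result contains at most one pair, matching the note above. For a non-deterministic machine it contains one pair per distinct halting branch.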

That completes our description of persistent Turing machines.

2.2 Sequential interaction

PTMs formalize the notion of “sequential interaction.” Sequential interaction is a subset of interaction in general, in which every process receives input from at most one process, and gives output to at most one process.

Since sequential interaction is only a subset of general interaction, results about sequential interaction do not necessarily extend to general interaction. But it is not straightforward to extend PTMs to capture general interaction. So we shall content ourselves, in studying PTMs, with studying sequential interaction. We will later find that many of the conclusions that we draw about sequential interaction do in fact extend to general interaction.

Let us clarify the notion of a sequentially interactive system. We have some number of processes, and they are chained together, with the output of one process being the input to the next.

We will postulate that the number of processes is finite. This is because we are interested in physically realizable systems, and an infinite chain of processes is not physically realizable.

We will also postulate that the first process takes input from some unspecified outside source, and the last process gives output to an unspecified outside sink.

We therefore define sequentially interactive systems, or SIS’s.

Definition 2.2.1 A sequentially interactive system (SIS) $S = (P, \Sigma)$ is:

• A finite sequence $P : [1, n] \to \{\text{the set of PTMs}\}$ of processes.

• A finite alphabet $\Sigma$, where we require that every process has $\Sigma$ as its alphabet.

In order for the PTMs to communicate with each other, they must have the same alphabet. It would be possible to arrange affairs so that the PTMs had a common alphabet for communication, but used different alphabets internally; or, to arrange affairs so that each connection used a single alphabet, but different connections had different alphabets. For simplicity, however, we restrict all PTMs in a given system to using a single common alphabet.

Now we must describe the behavior of SIS’s: that is, the manner in which they map inputs to outputs. We do this by extending the “yields in one macrostep” ($\longrightarrow$) operator of Goldin et al. [2004] to work with SIS’s. Instead of speaking of the macrosteps of a single PTM, we now speak of the macrosteps of an SIS.

When $P$ is a PTM, $w \xrightarrow{w_i/w_o}_P w'$ means that when $P$ starts running with work tape $w$ and input tape $w_i$, it halts with work tape $w'$ and output tape $w_o$.

When $S$ is an SIS, we write $W \xrightarrow{w_i/w_o}_S W'$, where $W$ and $W'$ are sequences of tape states, with one tape state per PTM in $S$. The elements of $W$ are the starting work tape states, and the elements of $W'$ are the ending work tape states. $w_i$ is the input to the system, and the input to the initial PTM. $w_o$ is the output of the system, and the output of the final PTM.

We also use a sequence of tape states, $W^i$, which represents the series of intermediate outputs that the system produces. The first element of $W^i$ is the input to $S$ (i.e., $w_i$), and the remaining elements of $W^i$ are algorithmically generated, with a non-initial element of $W^i$ being the output of one PTM and the input to the next PTM. So, for example, the first PTM takes as input $W^i_1$, and produces as output $W^i_2$. The second PTM takes as input $W^i_2$, and produces as output $W^i_3$; and so forth.

Definition 2.2.3 Let $S = (P, \Sigma)$ be an SIS. Let $p$ be the length of $P$. Let $W, W' \in [1, p] \to \Sigma^*$ be sequences over $\Sigma^*$, and $w_i, w_o \in \Sigma^*$. Then $W$ yields $W'$ in one macrostep, i.e.:

$$W \xrightarrow{w_i/w_o}_S W'$$

if and only if there exists a $W^i \in [1, p] \to \Sigma^*$ such that $W^i_1 = w_i$ and:

$$
\begin{aligned}
W_1 &\xrightarrow{W^i_1/W^i_2}_{P_1} W'_1 \text{ and} \\
W_2 &\xrightarrow{W^i_2/W^i_3}_{P_2} W'_2 \text{ and} \\
&\;\;\vdots \\
W_p &\xrightarrow{W^i_p/w_o}_{P_p} W'_p.
\end{aligned}
$$

The first clause in the biconditional characterizes the behavior of the sequentially interactive system. The second clause, consisting of a series of conjunctions, characterizes the corresponding behavior of the PTMs that make up the system.

This series of conditions can be read as, “each PTM, starting in the initial work tape state belonging to it, with the input tape contents belonging to it, yields in one macrostep the input to the next PTM (or the output of the system, if it is the final PTM), and ends in the final work tape state belonging to it.”

For example, suppose that the SIS is a video decoder. Video decoders often consist of a set of algorithms which are applied in series: the first algorithm’s output is the input to the second algorithm, the second algorithm’s output is the input to the third algorithm, etc.

Consider a video decoder consisting of three algorithms; call them A, B, and C. The above series of conditions would be read, “A takes the starting input $w_i = W^i_1$ and yields as output $W^i_2$. B takes the input $W^i_2$ and yields as output $W^i_3$. C takes the input $W^i_3$ and yields as output $w_o$.”
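The chaining in Definition 2.2.3 can be sketched in Python for the deterministic case. This is an illustration only: each PTM is abstracted as a function from (work tape, input tape) to (new work tape, output tape), tapes are strings, and the three stages `upper`, `reverse`, and `count` are hypothetical stand-ins for the algorithms A, B, and C rather than anything a real video decoder does.

```python
def sis_macrostep(processes, W, w_in):
    """One macrostep of a deterministic SIS.

    processes: list of functions (work, input) -> (new work, output).
    W:         list of starting work tapes, one per process.
    Returns (W', w_o): the ending work tapes and the system output.
    """
    W_next = []
    signal = w_in  # the first intermediate tape is the system input
    for step, work in zip(processes, W):
        # The output of this process becomes the input of the next one.
        work_next, signal = step(work, signal)
        W_next.append(work_next)
    return W_next, signal  # the last signal is the system output


def upper(work, w):      # hypothetical stage A: stateless
    return work, w.upper()


def reverse(work, w):    # hypothetical stage B: stateless
    return work, w[::-1]


def count(work, w):      # hypothetical stage C: numbers its outputs
    n = int(work or "0") + 1
    return str(n), f"{n}:{w}"
```

Calling `sis_macrostep([upper, reverse, count], ["", "", ""], "abc")` gives `(["", "", "1"], "1:CBA")`; feeding the returned work tapes back in with input `"de"` gives `(["", "", "2"], "2:ED")`, which shows the work tapes persisting across macrosteps of the system.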

2.3 Modeling physical processes

We can take the mathematical abstractions of PTMs and SIS’s as representing processes in the physical world. Physical systems which alternate between receiving input data, and giving output data, can be modeled as PTMs. Groups of such processes, sequentially interacting with each other, can be modeled as SIS’s. For instance, we can model the “driving home from work” example, or a human interacting with a computer terminal, or a streaming video decoder, as SIS’s.

We shall model non-computable processes by taking nondeterminism as representing non-computable behavior. For instance, suppose we have a random number generator, which outputs a series of numbers between 1 and 4, using a source of quantum nondeterminism in order to select the numbers. We can model this system using a PTM which implements the following algorithm (for one macrostep):

Algorithm 2.3.1.

1. Branch into four computation paths.

2. In one of these paths, write 1 to the output tape. In one of these paths, write 2. In one of these paths, write 3. In one of these paths, write 4.

3. Halt.
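Algorithm 2.3.1 can be sketched in Python by representing a non-deterministic macrostep as a function that returns the set of all (final work tape, output tape) pairs, one per computation path. This representation, and the names `rng_macrostep` and `is_possible`, are assumptions made for illustration.

```python
def rng_macrostep(work, w_in):
    """All possible outcomes of one macrostep of the random number PTM.

    The input and work tapes are ignored; each of the four computation
    paths writes a different number to the output tape and halts.
    """
    return {(work, str(k)) for k in (1, 2, 3, 4)}


def is_possible(macrostep, work, w_in, observed_output):
    """Check whether an observed output is one the model allows."""
    return any(out == observed_output for _, out in macrostep(work, w_in))
```

The sketch can confirm that an observed output such as `"3"` is among the possibilities, but nothing in it selects which of the four outputs the physical generator actually produces.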

The model specifies a set of possible final tape contents (specifically final work tape and output tape contents, since the input tape is constant). The actual system yields one of the possibilities in this set, such that no Turing machine can predict which of the possibilities is actually selected, no matter how much information it is given about the state of the system at the beginning of the macrostep.

This is how we choose to model the behavior of non-computable processes.

As a further example, we can model the behavior of a human user typing at a keyboard by writing a PTM which branches into paths that output every possible string to the output tape. If the string lengths are unbounded then there will be an infinite number of halting configurations, and the PTM as a whole will never halt; but this is not problematic. Of course, this PTM will be an idealization of the actual physically realized system, because the physical system will at some point reach a limit to the length of its strings.

These PTMs are an idealization of the actual systems in a further way, which is that they do not capture the statistical properties of the systems in question. For instance, it may be that the quantum random number generator is twice as likely to output a 2 as it is to output a 1; but our model does not capture this fact. Similarly, our model does not capture the fact that a human is more likely to type certain things (e.g., coherent sentences or computer commands) than they are to type other things (e.g., random gibberish). In short, our model specifies only what behavior is possible, without specifying what behavior is probable.

A Turing machine, as we will see in Subsection 2.4, can compute the set of possible behaviors which a non-computable process may exhibit. But a Turing machine cannot determine which behavior the physically realized non-computable process actually exhibits. It cannot determine which number the quantum random number generator actually outputs, or which string the human user actually types. So the behavior of these physically realized non-computable processes is, unsurprisingly, not Turing computable.

Now we will specify more precisely the behavior of the physical system that a sequentially interactive system models.

Specification 2.3.2. Given an SIS $S = (P, \Sigma)$, we say that $S$ models a given physically realized system iff, for every process in the physical system corresponding to a process $P_n$ in $P$, during any given execution of the system’s algorithm:

• The physical process alternates between taking input data, and giving output data. A cycle of taking one unit of input data and giving one unit of output data corresponds to one macrostep of $P_n$. Call the physical process corresponding to a macrostep a “physical macrostep.” Number the macrosteps and corresponding physical macrosteps as 1, 2, 3, etc.

• For every $m$, the process of computing the physical macrostep $m$ temporally precedes the process of computing the physical macrostep $m + 1$.

• The physical process of computing macrostep $m$ temporally follows the physical process of receiving input data for macrostep $m$, and temporally precedes the physical process of giving output data for macrostep $m$.

• If $P_n$ is not the last process (i.e., $n$ is not the length of $P$), then for every $m$, the process of computing the physical macrostep $m$ of $P_n$ temporally precedes the process of computing the physical macrostep $m$ of $P_{n+1}$.

• The physical process’s input/output behavior follows the rules specified by $P_n$. In every macrostep, it gives one of the outputs which is possible given its current input data and the contents of $P_n$’s work tape at the beginning of the macrostep.

• In macrostep 1, the physical process behaves as if $P_n$’s work tape is empty. In subsequent macrosteps, the physical process behaves as if $P_n$’s work tape is in the state which resulted from the previous macrostep. This state will be one of the states which is possible according to $P_n$. (Just as one macrostep may yield multiple possible output tape states, it also may yield multiple possible final work tape states. We nondeterministically select one of the final work tape states associated with the output tape state that actually occurred, as being the final work tape state which actually occurred.)

2.4 Computing the behavior of sequentially interactive systems

The behavior of PTMs and sequentially interactive systems is Turing-computable in the following sense. Given a PTM or SIS, we can construct a computable function which takes as input a series of inputs to be given to the PTM/SIS, and produces as output the series of outputs which the PTM/SIS yields from that series of inputs.

We have already pointed out that a PTM/SIS in a certain sense cannot be emulated by a Turing machine, because a Turing machine cannot accept new input while it is running. So in a certain sense PTMs/SIS’s are a more expressive model than Turing machines.

Now we wish to point out that in another sense PTMs/SIS’s are only as expressive as Turing machines. If a Turing machine is given as input a whole series of inputs to be given to a PTM/SIS over a series of macrosteps, it can produce as output the series of outputs that that PTM/SIS would produce from those inputs, over the same series of macrosteps.

Hereafter we shall restrict our attention to SIS’s, because we can consider a PTM as an SIS consisting of one PTM. Given an SIS $S$, we define the “behavior relation” of $S$ as a relation $\multimap$ over sequences of strings, $(\multimap) \subseteq ([1, n] \to \Sigma^*)^2$:

$i_1, \ldots, i_n \multimap o_1, \ldots, o_n$ iff

$$
\begin{aligned}
\varepsilon &\xrightarrow{i_1/o_1}_S W_1, \\
W_1 &\xrightarrow{i_2/o_2}_S W_2, \\
&\;\;\vdots \\
W_{n-1} &\xrightarrow{i_n/o_n}_S W_n,
\end{aligned} \tag{1}
$$

for some $W_1, \ldots, W_n$, where the $W_i$ are sequences of work tape states. In the deterministic case, the behavior relation is actually a behavior function.

Just as we speak of Turing machines computing functions, we may speak of nondeterministic Turing machines computing relations. We say that an NTM computes a relation $R$ just in case the NTM produces the output $o$ from the input $i$ iff $i R o$. And, using that notion, we may speak of a relation being computable or not. With all that, now we wish to say that the behavior relation of every SIS is computable.

Theorem 1. Let $S$ be an SIS, and $\multimap$ its behavior relation. $\multimap$ is computable.

Proof. It is sufficient to construct an NTM which computes $\multimap$. We shall sketch how that could be done.

First, notice that a single macrostep of the execution of a PTM is computable. That is, given an input tape and initial work tape, we can compute the resulting (set of) output tape(s) and final work tape(s). This is a trivial consequence of the fact that PTMs are NTMs and one macrostep of a PTM’s execution is a complete execution for the PTM-considered-as-an-NTM.

Suppose $S$ consists of PTMs $P_1, \ldots, P_n$. Computing a macrostep of $S$ with input $i$ consists of computing a macrostep of $P_1$ with input $i$, letting its output be the input to $P_2$ and computing a macrostep of $P_2$, and so forth until the output of $P_{n-1}$ is the input to $P_n$ and the output of $P_n$ is the output of the system $S$.

Suppose that the initial work tape state sequence was $W$, so that $W_i$ was the initial work tape state for $P_i$. Then we are left with some $W'$, where $W'_i$ was the final work tape state for $P_i$.

Clearly, given that a macrostep of $P_i$ is always computable, this whole process is computable. So we can compute a macrostep of $S$. Next we note that computing the behavior relation $\multimap$ consists of computing a series of macrosteps of $S$, with the input being given in each case, and the initial work tape state sequence being the final work tape state sequence of the previous macrostep. Since this procedure is a simple iteration of a computable procedure, it is itself computable. That is, $\multimap$ is computable.
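The iteration in the proof sketch can be illustrated in Python. The fragment below is a sketch under assumptions that the proof does not make explicit: each PTM is abstracted as a function returning the finite set of its possible (new work tape, output tape) pairs, every computation path halts, and tapes are strings. The names `sis_outcomes`, `behaviors`, `coin`, and `count` are hypothetical.

```python
def sis_outcomes(processes, W, w_in):
    """All (W', w_o) pairs reachable in one macrostep of the SIS."""
    partial = {((), w_in)}  # (work tapes fixed so far, current signal)
    for step, work in zip(processes, W):
        partial = {
            (done + (work_next,), out)
            for done, signal in partial
            for work_next, out in step(work, signal)
        }
    return partial


def behaviors(processes, inputs):
    """All output sequences related to the given input sequence."""
    # Every work tape is empty before the first macrostep.
    states = {(tuple("" for _ in processes), ())}
    for w_in in inputs:
        states = {
            (W_next, outs + (w_out,))
            for W, outs in states
            for W_next, w_out in sis_outcomes(processes, W, w_in)
        }
    return {outs for _, outs in states}


def coin(work, w):   # hypothetical non-deterministic PTM: appends 0 or 1
    return {(work, w + "0"), (work, w + "1")}


def count(work, w):  # hypothetical deterministic PTM: numbers its outputs
    n = int(work or "0") + 1
    return {(str(n), f"{n}:{w}")}
```

For the two-process system `[coin, count]` and the input sequence `["a", "b"]`, `behaviors` returns four output sequences, one for each combination of coin results. The whole input sequence must be supplied before the computation starts, which is the sense in which the simulation is “timeless.”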

2.5 Discussion

We noted in Section 1 that Turing machines cannot emulate the behavior of PTMs, because they cannot accept new input while they are running. The same is true with respect to Turing machines and SIS’s. We also noted that every SIS (and therefore every PTM, considering PTMs as the limiting case of SIS’s with one PTM) has a computable behavior relation. That is, given a sequence of inputs for an SIS, we can compute the sequence(s) of outputs which that SIS would produce from those inputs.

These two results make contrasting statements about the relative expressiveness of Turing machines and PTMs as models of computation. There is no logical contradiction; but it is not immediately clear how to interpret the results. The key point to be made, I think, is that an NTM that computes a PTM’s behavior relation is not, in a semantic sense, a model of the process that the PTM models. The information it provides is similar; but it is not a mechanism for thinking about an interactive system, because it lacks the notion of dynamic receptivity.

3 The π-Calculus

The π-calculus is a model of interactive computation. It is more general than PTMs, because it can capture non-sequential interactions, and interactions in which processes and process connections dynamically come into and go out of existence. The latter capabilities are called “mobility.”

The π-calculus is of interest to us because we have drawn various conclusions about interactive systems by studying PTMs, and we want to see that these conclusions extend to a more general model of interactive computation. In particular, we want to see that Turing machines can simulate the behaviors of non-sequential, mobile interactive systems.

The basic concept in the π-calculus is that of a “process.” Processes are abstract objects which can send messages to, and receive messages from, each other. Processes are defined recursively; all but the simplest processes consist of collections of concurrent, communicating processes.

Processes communicate through named “ports.” If one process wants to send along a port of a given name, and another process wants to receive along a port of the same name, then a “reaction” can occur between the two processes, in which the first process sends information to the second process. The information that is sent is the name of another port.

The π-calculus is defined, similarly to the λ-calculus, using a formal language in which expressions are constructed according to simple syntactic rules, and transformed into other expressions according to simple syntactic rules. The expressions represent different interactive systems.

Most models of interactive computation work by beginning with a non-interactive model of computation, and adding interaction as an extra layer on top of the non-interactive model. For instance, PTMs begin with the non-interactive Turing machine paradigm, and add extra conceptual apparatus which enables the modeling of interactive systems.

The π-calculus takes a different approach. It takes the notion of interaction as primitive, and all computations, even the simplest computations, are accomplished by systems of concurrently interacting processes. For instance, to add three and three, one would have a process representing the first three, a process representing the second three, and an “adder” process which interacted with the two threes to create another process representing a six.

3.1 Definition of the π-calculus

We repeat the definition of the π-calculus, as given in Milner [1999]. Somefurther acquaintance with the π-calculus is probably necessary in order to readthe remainder of this paper, as the π-calculus is a fairly complex system. Anextended introduction of the π-calculus is beyond the scope of this paper; so werefer the reader to Milner [1993].

Let there be an infinite set N of names. Names are denoted by lower-case letters: x, y, z, ....

An action prefix represents an atomic behavior which a process can carry out. The possibilities are sending a message, receiving a message, or performing an internal, unobservable action. The set π of action prefixes is defined by the following syntax:

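\[
\begin{array}{rcll}
\pi & ::= & x(y) & \text{receive } y \text{ along } x \\
    & \mid & x\langle y\rangle & \text{send } y \text{ along } x \\
    & \mid & \tau & \text{unobservable action}
\end{array}
\tag{2}
\]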

Definition 3.1.1 The set Pπ of π-calculus expressions is defined by the following syntax:

\[
P \;::=\; \sum_{i \in I} \pi_i.P_i \;\;\Big|\;\; P_1 \mid P_2 \;\;\Big|\;\; \mathbf{new}\ a\ P \;\;\Big|\;\; {!P}
\tag{3}
\]

where I is any finite indexing set. The processes $\sum_{i \in I} \pi_i.P_i$ are called summations or sums. The empty sum is written 0.

The restriction new y and the input action x(y) both bind the name y. y is free in the output action x⟨y⟩. The set of free names in a process expression P is written fn(P).

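The result of replacing all free occurrences of a name y with a name x in a process expression P is written {x/y}P. This is the reading required by the REACT rule of Definition 3.1.5, where the received name z replaces the bound name y in {z/y}P.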
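To make the grammar and the notion of free names concrete, here is a minimal sketch of the syntax of Definition 3.1.1 as Python data types, together with a function computing fn(P). The class names (`Recv`, `Send`, `Tau`, `Sum`, `Par`, `New`, `Bang`) and the encoding of names as strings are assumptions made for illustration; they do not come from Milner [1999].

```python
from dataclasses import dataclass
from typing import Tuple, Union

# Action prefixes of equation (2). All class names here are illustrative.
@dataclass(frozen=True)
class Recv:          # x(y): receive y along x; binds y in the continuation
    chan: str
    var: str

@dataclass(frozen=True)
class Send:          # x<y>: send y along x
    chan: str
    msg: str

@dataclass(frozen=True)
class Tau:           # unobservable action
    pass

Prefix = Union[Recv, Send, Tau]

# Process expressions of equation (3).
@dataclass(frozen=True)
class Sum:           # sum of guarded terms pi_i.P_i; Sum() is the empty sum 0
    terms: Tuple[Tuple[Prefix, "Proc"], ...] = ()

@dataclass(frozen=True)
class Par:           # P1 | P2
    left: "Proc"
    right: "Proc"

@dataclass(frozen=True)
class New:           # new a P; binds a in P
    name: str
    body: "Proc"

@dataclass(frozen=True)
class Bang:          # !P
    body: "Proc"

Proc = Union[Sum, Par, New, Bang]

def fn(p) -> frozenset:
    """Return the set of free names of the process expression p."""
    if isinstance(p, Sum):
        names = set()
        for prefix, cont in p.terms:
            if isinstance(prefix, Recv):
                # The received variable is bound in the continuation only.
                names |= {prefix.chan} | (fn(cont) - {prefix.var})
            elif isinstance(prefix, Send):
                names |= {prefix.chan, prefix.msg} | fn(cont)
            else:
                names |= fn(cont)
        return frozenset(names)
    if isinstance(p, Par):
        return fn(p.left) | fn(p.right)
    if isinstance(p, New):
        return fn(p.body) - {p.name}
    if isinstance(p, Bang):
        return fn(p.body)
    raise TypeError(f"not a process expression: {p!r}")

# x(y).y<z>.0 : y is bound by the input, while x and z are free.
example = Sum(((Recv("x", "y"), Sum(((Send("y", "z"), Sum()),))),))
print(sorted(fn(example)))  # ['x', 'z']
```

Wrapping the example in `New("x", ...)` removes x from the free names, mirroring the way restriction binds a name.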

The atomic computational steps of a π-calculus process are called “reactions.” In order to define the reaction rules we need a “structural congruence” relation, which says when two process expressions are identical, discounting variations in syntax which have no semantic significance.


Definition 3.1.2 Two π-calculus process expressions P and Q are structurally congruent, written P ≡ Q, if we can transform one into the other by using the following equations (in either direction):

1. Change of bound names (alpha-conversion);

2. Reordering of terms in a summation;

3. P ∣ 0 ≡ P; P ∣ Q ≡ Q ∣ P; P ∣ (Q ∣ R) ≡ (P ∣ Q) ∣ R;

4. new x (P ∣ Q) ≡ P ∣ new x Q if x ∉ fn(P); new x 0 ≡ 0; new xy P ≡ new yx P;

5. !P ≡ P ∣ !P .

Definition 3.1.3 A process expression

\[
\mathbf{new}\ \vec{a}\ (M_1 \mid \cdots \mid M_m \mid {!Q_1} \mid \cdots \mid {!Q_n})
\]

is said to be in standard form if each $M_i$ is a non-empty sum, and each $Q_j$ is itself in standard form. (If m = n = 0 then the form is $\mathbf{new}\ \vec{a}\ 0$; if $\vec{a}$ is empty then there is no restriction.)

Proposition 3.1.4 Every process is structurally congruent to a standard form.

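Definition 3.1.5 The reaction relation → over Pπ contains exactly those transitions which can be inferred from the rules in the table below:

\[
\begin{array}{ll}
\text{TAU:} & \tau.P + M \rightarrow P \\[1.5ex]
\text{REACT:} & (x(y).P + M) \mid (x\langle z\rangle.Q + N) \rightarrow \{z/y\}P \mid Q \\[1.5ex]
\text{PAR:} & \dfrac{P \rightarrow P'}{P \mid Q \rightarrow P' \mid Q} \\[2.5ex]
\text{RES:} & \dfrac{P \rightarrow P'}{\mathbf{new}\ x\ P \rightarrow \mathbf{new}\ x\ P'} \\[2.5ex]
\text{STRUCT:} & \dfrac{P \rightarrow P'}{Q \rightarrow Q'} \quad \text{if } P \equiv Q \text{ and } P' \equiv Q'
\end{array}
\]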

This completes our definition of the π-calculus.
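The rules of Definition 3.1.5 can be exercised on a small worked example. The processes below are chosen purely for illustration and do not appear in Milner [1999]: a receiver that takes a port name along x and then sends a fixed name w along whatever port it was handed, composed with a sender that offers the name z along x.

```latex
\begin{align*}
  & x(y).\,y\langle w\rangle.0 \;\mid\; x\langle z\rangle.0 \\
  \rightarrow\;& \{z/y\}\bigl(y\langle w\rangle.0\bigr) \;\mid\; 0
      && \text{by REACT, with } M = N = 0 \\
  =\;& z\langle w\rangle.0 \;\mid\; 0
      && \text{carrying out the substitution} \\
  \equiv\;& z\langle w\rangle.0
      && \text{since } P \mid 0 \equiv P
\end{align*}
```

By STRUCT, the last line is itself a reaction target of the first: the composite process reacts to $z\langle w\rangle.0$.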


3.2 Deterministic and non-deterministic processes

As we did with PTMs, we will think of deterministic π-calculus processes as modeling physical systems with computable behavior, and non-deterministic π-calculus processes as modeling physical systems with non-computable behavior. In order to do this, we must define what it means for a π-calculus process to be deterministic or non-deterministic.

Intuitively, a process is deterministic if it can react to every input from the environment in only one way. This means that, in the process expression’s standard form, there cannot be two input actions receiving along the same channel, and there cannot be two output actions sending along the same channel, except in cases where those two actions guard the same process expression.⁴ Furthermore, if the process is capable of performing a τ-transition, performing this transition must be the only thing it is capable of doing.

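Definition 3.2.1 Let P be a process expression in standard form:

\[
\begin{aligned}
P &= \mathbf{new}\ \vec{a}\ \bigl(M_1 \mid \cdots \mid M_{n_1} \mid {!Q_2} \mid \cdots \mid {!Q_m}\bigr) \\
  &= \mathbf{new}\ \vec{a}_1\ \Bigl(
       \bigl(\sum_{i \in I_{1,1}} \pi_{1,1,i}.P_{1,1,i}\bigr) \mid
       \bigl(\sum_{i \in I_{1,2}} \pi_{1,2,i}.P_{1,2,i}\bigr) \mid \cdots \mid
       \bigl(\sum_{i \in I_{1,n_1}} \pi_{1,n_1,i}.P_{1,n_1,i}\bigr) \mid {} \\
  &\qquad\quad {!}\Bigl(\mathbf{new}\ \vec{a}_2\ \Bigl(
       \bigl(\sum_{i \in I_{2,1}} \pi_{2,1,i}.P_{2,1,i}\bigr) \mid
       \bigl(\sum_{i \in I_{2,2}} \pi_{2,2,i}.P_{2,2,i}\bigr) \mid \cdots \mid
       \bigl(\sum_{i \in I_{2,n_2}} \pi_{2,n_2,i}.P_{2,n_2,i}\bigr)\Bigr)\Bigr) \mid {} \\
  &\qquad\quad {!}\Bigl(\mathbf{new}\ \vec{a}_3\ \Bigl(
       \bigl(\sum_{i \in I_{3,1}} \pi_{3,1,i}.P_{3,1,i}\bigr) \mid
       \bigl(\sum_{i \in I_{3,2}} \pi_{3,2,i}.P_{3,2,i}\bigr) \mid \cdots \mid
       \bigl(\sum_{i \in I_{3,n_3}} \pi_{3,n_3,i}.P_{3,n_3,i}\bigr)\Bigr)\Bigr) \mid {} \\
  &\qquad\quad \vdots \\
  &\qquad\quad {!}\Bigl(\mathbf{new}\ \vec{a}_m\ \Bigl(
       \bigl(\sum_{i \in I_{m,1}} \pi_{m,1,i}.P_{m,1,i}\bigr) \mid
       \bigl(\sum_{i \in I_{m,2}} \pi_{m,2,i}.P_{m,2,i}\bigr) \mid \cdots \mid
       \bigl(\sum_{i \in I_{m,n_m}} \pi_{m,n_m,i}.P_{m,n_m,i}\bigr)\Bigr)\Bigr)\Bigr)
\end{aligned}
\]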

m is the number of replications plus one; the first index k of π_{k,j,i} ranges from 1 to m. k = 1 refers to the summations which are not inside a replication. n_1 is the number of concurrent, non-replicated summations. I_{1,a} is the indexing set for the terms of the ath non-replicated summation. n_k with k > 1 is the number of concurrent summations in the kth block, that is, the replication written !Q_k. I_{k,a} is the indexing set of the ath summation in that replication.

If, for all k, j, i, π_{k,j,i} = τ and, for all k, j, i, k′, j′, i′, P_{k,j,i} ≡ P_{k′,j′,i′}, then P is deterministic.

If, for all k, j, i, k′, j′, i′, a, x, y:

• π_{k,j,i} = a(x) and π_{k′,j′,i′} = a(y) only if {y/x}P_{k,j,i} ≡ P_{k′,j′,i′} (that is, the two continuations agree once the bound name x is renamed to y), and

• π_{k,j,i} = a⟨x⟩ and π_{k′,j′,i′} = a⟨y⟩ only if x = y and P_{k,j,i} ≡ P_{k′,j′,i′},

then P is deterministic. Otherwise, P is non-deterministic.
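Two small processes illustrate how the second pair of conditions separates the cases. They are invented for this purpose, and the classification reflects our reading of Definition 3.2.1 rather than an example from the cited literature.

```latex
\begin{align*}
  D &= x(y).\,y\langle a\rangle.0 \;\mid\; z(w).0
      && \text{the two inputs use distinct ports } x \text{ and } z \\
  N &= x(y).\,y\langle a\rangle.0 \;\mid\; x(y).0
      && \text{two inputs on } x \text{, with } y\langle a\rangle.0 \not\equiv 0
\end{align*}
```

In D no two top-level prefixes share a port, so both conditions hold vacuously for distinct terms and D is deterministic. In N a message sent along x could be consumed by either summand, and the two continuations are not structurally congruent, so the first condition fails and N is non-deterministic.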

⁴ We will often run across the latter case, because it obtains in any process expression that involves a replication.


3.3 Modeling physical processes

A π-calculus process can model any physical system which produces and/or consumes information, and can interact with a computer. As with PTMs, we model non-computable processes as non-deterministic π-calculus processes. The set of possible reactions of a process to a given input represents the set of possible behaviors of the real-world system it models in response to that input.

In the π-calculus, pieces of information (numbers, strings, etc.) are represented as processes. Thus we must model not only the physical information processing system, but also the information that it processes, using π-calculus processes.

Suppose that we want to model an interactive system in which a human user interacts with a computer via a keyboard and teletype display. Our model will have four processes: the human, the keyboard, the display, and the computer.

The computer has an output port along which it sends messages to the display, and an input port along which it receives messages from the keyboard. The human has an output port into the keyboard, and an input port from the display. The messages which the processes exchange with each other represent strings.
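One way the four-process system might be written down is sketched below. The port names h, k, c, d and the relay definitions of the keyboard and display are assumptions introduced for illustration; the human and the computer are left as unspecified process expressions.

```latex
\[
  \mathit{System} \;=\; \mathbf{new}\ h\,k\,c\,d\
    \bigl(\mathit{Human} \mid \mathit{Keyboard} \mid \mathit{Computer} \mid \mathit{Display}\bigr)
\]
\begin{align*}
  \mathit{Keyboard} &= {!}\,h(s).\,k\langle s\rangle.0
      && \text{relay from the human (port } h\text{) to the computer (port } k\text{)} \\
  \mathit{Display}  &= {!}\,c(s).\,d\langle s\rangle.0
      && \text{relay from the computer (port } c\text{) to the human (port } d\text{)}
\end{align*}
```

Here each message s is a name, which we take to be the port of a process representing a string. Human sends along h and receives along d, while Computer receives along k and sends along c.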

3.4 Decidability of the reaction relation

π-calculus processes cannot compute non-Turing-computable functions. In this sense, they are Turing-equivalent. We want to make a more precise statement of the relationship between the π-calculus and Turing machines. We can achieve the desired precision by saying that the reaction relation of the π-calculus is decidable.

If the reaction relation is decidable, that means that for every π-calculus process, there exists a Turing machine which can say what it will do in the future. (For non-deterministic processes, the Turing machine says only what the set of possible futures is, without saying which future actually obtains. This is the same situation that we saw with the timeless simulations of sequentially interactive systems.)

The decidability of the reaction relation is predicated on the decidability of the structural congruence relation, since the definition of reaction uses structural congruence (Definition 3.1.5). The decidability of the structural congruence relation, in turn, was historically a difficult open question. However, Engelfriet and Gelsema [1999] gave a positive resolution to this question; the structural congruence relation is decidable. So the way is clear to prove the decidability of the reaction relation.

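Theorem 3.4.1. The reaction relation → is decidable.

Proof. Let P, Q ∈ Pπ be processes. We want to give an algorithm to decide whether or not P → Q.

Let P′ be a standard form structurally congruent to P. Let:

\[
\begin{aligned}
P' &= \mathbf{new}\ \vec{a}\ \bigl(M_1 \mid \cdots \mid M_n \mid {!Q_1} \mid \cdots \mid {!Q_m}\bigr) \\
   &= \mathbf{new}\ \vec{a}\ \Bigl(
        \bigl(\sum_{i \in I_1} \pi_{1,i}.P_{1,i}\bigr) \mid
        \bigl(\sum_{i \in I_2} \pi_{2,i}.P_{2,i}\bigr) \mid \cdots \mid
        \bigl(\sum_{i \in I_n} \pi_{n,i}.P_{n,i}\bigr) \mid
        {!Q_1} \mid {!Q_2} \mid \cdots \mid {!Q_m}\Bigr)
\end{aligned}
\]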

Furthermore, for all Q_i with 1 ≤ i ≤ m, let Q_i ≡ M_j for some 1 ≤ j ≤ n. That is, let P′ feature at least one expansion of every replication.

We will enumerate a set of standard forms $\mathcal{Q}$ ⊂ Pπ such that P → Q iff Q ≡ Q′ for some Q′ ∈ $\mathcal{Q}$.

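Whenever, for some j, i, π_{j,i} = τ, let P_{j,i} ∈ $\mathcal{Q}$. Whenever, for some j, i, j′, i′, with j ≠ j′, π_{j,i} = x(y) and π_{j′,i′} = x⟨z⟩, let:

\[
M_1 \mid M_2 \mid \cdots \mid \{z/y\}P_{j,i} \mid P_{j',i'} \mid \cdots \mid M_n \mid {!Q_1} \mid {!Q_2} \mid \cdots \mid {!Q_m} \;\in\; \mathcal{Q}.
\]

P → Q iff Q ≡ Q′ for some Q′ ∈ $\mathcal{Q}$. Given P, it is possible to algorithmically construct P′. Given P′, it is possible to algorithmically construct $\mathcal{Q}$, which is finite because every summation in P′ has finitely many terms. Given $\mathcal{Q}$, it is possible to algorithmically decide whether or not Q ≡ Q′ for some Q′ ∈ $\mathcal{Q}$, because structural congruence is decidable. Therefore, → is decidable.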
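The enumeration step of the proof can be sketched in code. The following is a simplified illustration rather than the full algorithm: it assumes a process with no restriction and no replication (so just a parallel composition of sums), assumes that received names never clash with bound names, and does not implement the structural congruence check. The tuple encoding and the helper names (`recv`, `send`, `summ`, `par`, `subst`, `successors`) are hypothetical.

```python
# A process is a tuple of sums (their parallel composition); () plays the role of 0.
# A sum is a tuple of (prefix, continuation) pairs; a continuation is again a process.
NIL = ()
TAU = ("tau",)

def recv(x, y):
    """Prefix x(y): receive y along x."""
    return ("in", x, y)

def send(x, z):
    """Prefix x<z>: send z along x."""
    return ("out", x, z)

def summ(*terms):
    return tuple(terms)

def par(*sums):
    return tuple(sums)

def subst(proc, new, old):
    """Replace free occurrences of `old` by `new`; assumes `new` is not bound in `proc`."""
    result = []
    for summation in proc:
        terms = []
        for prefix, cont in summation:
            kind = prefix[0]
            if kind == "tau":
                terms.append((prefix, subst(cont, new, old)))
                continue
            chan = new if prefix[1] == old else prefix[1]
            if kind == "in":
                # An input that rebinds `old` shields its continuation.
                body = cont if prefix[2] == old else subst(cont, new, old)
                terms.append((("in", chan, prefix[2]), body))
            else:
                msg = new if prefix[2] == old else prefix[2]
                terms.append((("out", chan, msg), subst(cont, new, old)))
        result.append(tuple(terms))
    return tuple(result)

def successors(proc):
    """List the processes reachable from `proc` by one TAU or REACT step."""
    found = []
    for j, receiver in enumerate(proc):
        for prefix, cont in receiver:
            if prefix[0] == "tau":
                # TAU: the chosen summand's continuation replaces the whole sum.
                found.append(proc[:j] + cont + proc[j + 1:])
            elif prefix[0] == "in":
                for k, sender in enumerate(proc):
                    if k == j:
                        continue
                    for prefix2, cont2 in sender:
                        if prefix2[0] == "out" and prefix2[1] == prefix[1]:
                            # REACT: substitute the sent name for the bound name.
                            rest = tuple(m for n, m in enumerate(proc) if n not in (j, k))
                            found.append(rest + subst(cont, prefix2[2], prefix[2]) + cont2)
    return found

# x(y).y<w>.0 | x<z>.0 has exactly one successor, z<w>.0.
forward = par(summ((send("y", "w"), NIL)))
system = par(summ((recv("x", "y"), forward)), summ((send("x", "z"), NIL)))
print(successors(system))
```

To decide P → Q along the lines of the proof, each element of the returned list would still have to be compared with Q up to structural congruence, which this sketch leaves out.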

4 Acknowledgements

I wish to thank Dr. Brad Armendt for his helpful comments throughout the writing process.

References

Douglas Bridges. Computability: A Mathematical Sketchbook. Number 146 in Graduate Texts in Mathematics. Springer-Verlag, 1994.

Paul Cockshott and Greg Michaelson. Are there new models of computation? Reply to Wegner and Eberbach. The Computer Journal, 50(2):232–247, 2007.

Gordana Dodig-Crnkovic. Significance of models of computation, from Turing model to natural computation. Minds and Machines, 21:301–322, 2011.

Eugene Eberbach, Dina Goldin, and Peter Wegner. Turing’s ideas and models of computation. In Alan Turing: Life and Legacy of a Great Thinker, pages 159–194. Springer-Verlag, 2004.

Joost Engelfriet and Tjalling Gelsema. Multisets and structural congruence of the π-calculus with replication. Theoretical Computer Science, 211(1):311–337, 1999.

D. Goldin and P. Wegner. The Church-Turing thesis: Breaking the myth. In New Computational Paradigms, volume 3526 of Lecture Notes in Computer Science, pages 152–168. Springer-Verlag Berlin, 2005.


D. Goldin and P. Wegner. The interactive nature of computing: Refuting the strong Church-Turing thesis. Minds and Machines, 18(1):17–38, 2008.

Dina Q. Goldin. Persistent Turing machines as a model of interactive computation. In FoIKS, volume 1762 of Lecture Notes in Computer Science, pages 116–135, 2000.

Dina Q. Goldin, Scott A. Smolka, Paul C. Attie, and Elaine L. Sonderegger. Turing machines, transition systems, and interaction. Information and Computation, 194:101–128, 2004.

Harry R. Lewis and Christos H. Papadimitriou. Elements of the Theory of Computation. Prentice-Hall, 1981.

Robin Milner. The polyadic π-calculus: A tutorial. In F. L. Bauer, W. Brauer, and H. Schwichtenberg, editors, Logic and Algebra of Specification. Springer-Verlag, 1993.

Robin Milner. Communicating and Mobile Systems: The π-Calculus. Cambridge University Press, 1999.

Gualtiero Piccinini. The physical Church-Turing thesis: Modest or bold? British Journal for the Philosophy of Science, 62:733–769, 2011.

Tibor Rado. On non-computable functions. Bell System Technical Journal, 41(3), 1962.

Scott A. Smolka, Dina Q. Goldin, and Peter Wegner, editors. Interactive Computation: The New Paradigm. Springer-Verlag Berlin, 2006.

Alan Turing. Systems of logic based on ordinals. Proceedings of the London Mathematical Society, 45, 1939.

Jan van Leeuwen and Jiří Wiedermann. Beyond the Turing limit: Evolving interactive systems. In SOFSEM 2001: Theory and Practice of Informatics, volume 2234 of Lecture Notes in Computer Science, pages 90–109. Springer-Verlag Berlin, 2001.
