Philosophy and Phenomenological Research Vol. LIX, No. 3, September 1999
Densmore and Dennett on Virtual Machines and Consciousness
PAUL M. CHURCHLAND
University of California, San Diego
In the preceding article, Densmore and Dennett (hereafter, D&D) urge me to note the parallels, and the potentially complementary fit, between Dennett's views on consciousness and my own. They are correct in arguing that I had not fully appreciated the parallels. I am pleased and honored by every one of them. Still, it will be more illuminating to press the remaining differences between us. They are several and they are important, quite aside from the question of which of our views is correct.

Like good functionalists, D&D's first and almost unrestrained impulse is to see the mechanism of a recurrent PDP network as a (welcome and perhaps correct) story of how the abstract cognitive organization of humans and animals gets implemented in the physical brain. The familiar metaphors of "virtual machines" and "explanatory levels" are once again confidently deployed, and the mind is assimilated to "a software program that is installed upon the parallel network of the brain." D&D are of course aware that these metaphors now need (at least) an additional layer of cautionary commentary, and, just as Dennett did in Consciousness Explained (CE), they do not hesitate to offer it.

But the commentary still seems to me to be self-deceptive and uncomprehending. Those metaphors do not need to be qualified; they need to be junked. Though we have all grown up on them, they should be ushered swiftly into our history, at least where talk of minds is concerned. Those metaphors derive their primary meanings from a perfectly sensible theory, a theory that is perfectly adequate to classical computational procedures and to classical computing machines (and, let us add, to classical functionalism in philosophy). But that well-polished theory, according to some of us, has turned out to be entirely the wrong model for the cognitive activity of biological brains. It is wrong in its story of our basic computational mechanisms, it is wrong in its account of how occurrent information is coded and manipulated, it is wrong in its account of how skills are embodied and how "regular" behavior is typically produced, it is wrong about the nature of learning, and it is even wrong about language.

Particularly misleading, it seems to me, is D&D's (qualified) assimilation of "training a network" to "installing a program onto a classical machine." In both cases, to be sure, there is an implastic part: the basic functional architecture. And there is a plastic part: the zero/one option across all the memory registers in the classical case, and the changeable synaptic weights in the network case. But the similarity ends there.

Dennett briefly acknowledges that networks are typically trained on thousands or millions of examples, which reconfigure the weights only very slowly, whereas the configuration of zeros and ones in a classical memory is typically done in one fell swoop by downloading a pre-written program. But this difference is not the important one, and it is superficial in any case. If one knows the desired weight configuration beforehand, one can set a network's weights directly or "by hand," without a single "training experience" (see the sketch below). And on the other side, a classical program can grow very slowly in a classical machine's memory as the result of three weeks' painful cut-and-try hacking on the part of some groping human programmer.

The important difference lies elsewhere. In the classical case, the "acquired competence" resides in an identifiable set of discrete and recursively applicable instructions or rules, lodged in labeled locations so the CPU can retrieve them when and as each is required, rules whose diverse syntactic structures are causally efficacious in producing the diverse operations of the machine. In the network case, there are no such rules, they are not lodged in labeled locations, and the behavior of the machine is not to be traced to its retrieving such rules and following their discrete instructions. Its "acquired competence" resides instead in the acquired shape of its high-dimensional dynamical state space, a continuous manifold with preferred trajectories, stable limit cycles, similarity gradients, hierarchical partitions, and unstable bifurcation points. It is no more "following rules" than is the water of the Mississippi following rules in order to meander its way down a literal landscape from the northwest plains to the Gulf. It is no more "following rules" than is a galloping cat in directing its leg movements.
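Both points, that weights can simply be set by hand and that the resulting competence is nowhere stored as a rule, are easily made concrete. Here is a minimal sketch in Python (the particular weights are merely ones I have written down for illustration, with XOR standing in for any acquired skill; nothing here is drawn from D&D's text): a two-layer threshold network whose weights are fixed directly, without a single training episode.

```python
# A tiny feedforward network whose weights are set directly "by hand,"
# with no training whatsoever. Its competence (XOR, a stand-in for any
# acquired skill) resides entirely in the weight configuration; there is
# no stored rule for the machine to fetch and follow.
import numpy as np

def step(x):
    # Simple threshold activation: 1 if the input exceeds 0, else 0.
    return (x > 0).astype(float)

# Hand-chosen weights: hidden unit 1 fires when at least one input is on,
# hidden unit 2 fires only when both are; the output unit subtracts the second.
W_hidden = np.array([[1.0, 1.0],
                     [1.0, 1.0]])
b_hidden = np.array([-0.5, -1.5])
W_out = np.array([1.0, -2.0])
b_out = -0.5

def network(x):
    h = step(W_hidden @ x + b_hidden)
    return step(np.dot(W_out, h) + b_out)

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, "->", network(np.array(x, dtype=float)))
# [0, 0] -> 0.0, [0, 1] -> 1.0, [1, 0] -> 1.0, [1, 1] -> 0.0
```

Nothing in this little machine fetches an instruction from a labeled location; an input vector is simply pushed through a fixed weight landscape, and the "skill" is nowhere stated as a rule.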
Now, Dennett knows this, and in CE (p. 225, cited also in the review at hand) he attempts to mark the relevant contrast: "...in place of the precise, systematic 'fetch-execute cycle' or 'instruction cycle' that brings each new instruction to the instruction register to be executed [in a classical von Neumann machine], we should look for imperfectly marshalled, somewhat wandering, far-from-logical transition 'rules'..." (emphasis mine).

But this way of drawing the contrast makes the network's mode of operation look like a (charmingly) "failed case" of the classical machine's mode of operation, a case of "the same thing, except sloppier and fuzzier." But it isn't. The network isn't following fuzzy rules, or imperfectly marshalled rules, or virtual rules; it isn't following rules at all. What we should look for, in explanation of the network's behavior, is the acquired dynamical landscape of its global activation space. That, plus its current activational state, is what dictates its behavior. Rules play no causal role at all, and neither do 'rules'. To use that term in scare quotes, as Dennett does, is just to undermine the primary negative point here, and to set up explanatory hopes and expectations (concerning "virtual machines" and "design-level explanations") that are doomed to go unsatisfied.

D&D have gently criticized me already for exaggerating the distance between classical machines and neural networks. So that I will not seem here to be compounding a prior felony, let me address their announced worries. They focus (pp. 750-51) on the fact that most artificial networks are simulated, apparently successfully, on classical machines. (They are wrong, incidentally, to say that all of the networks I discuss in The Engine of Reason are classically simulated. Chapter 9 contains eight pages and four figures concerning Carver Mead's etched-silicon VLSI retina. That is a piece of devoted nonclassical hardware, and so is his artificial cochlea. But I am quibbling.)

The classical-machine simulations to which Dennett and Densmore refer are indeed successful, on the whole, and neural network research would still be in the dark ages without them. And yet they do have two limitations. One is subtle, the other is brutish, and both are insuperable.

The first concerns their inability to do infinite-precision arithmetic, which means that their simulations will always be blind to any nonlinear dynamical features that are sensitive to differences or perturbations below the accuracy of an eight-, or twelve-, or whatever-place rational/decimal representation of the state of the system being simulated. This limitation is currently invisible and irrelevant because of the small size and simple dynamical character of the neural networks so far simulated. But it will grow in importance as we come to model interacting systems of very large recurrent networks, because their exquisitely nonlinear dynamics will be increasingly sensitive to such fine-grained but inevitably unrepresented differences in state.

The topic of larger networks brings us to the second and more urgent limitation: time. A successful classical simulation of the human brain, for example, would have to compute, at a bare minimum, the transformations effected at each one of our 10¹⁴ synapses, at least ten times each second, for a total of at least 10¹⁵ computations per second. A current 100 MHz machine, performing one such computation each CPU cycle (a blistering pace), would take 10¹⁵/10⁸ = 10⁷ seconds, or almost four months, to complete that same one-second task. (Well, we may hope it will be "the same": recall, once more, the first limitation.) In short, classical-machine simulations of neural networks, even when they do work, cannot function in anything remotely like real time.
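Both limitations can be checked with a minimal sketch, assuming nothing beyond the round figures used above (10¹⁴ synapses, ten updates per second, a machine executing 10⁸ computations per second), and using a chaotic logistic map as a mere stand-in for the fine-grained nonlinear sensitivity at issue:

```python
# Back-of-envelope check of the timing estimate, plus a toy demonstration
# of the precision worry. All figures are the round numbers from the text,
# not empirical claims.
import numpy as np

# Second limitation: time.
synapses = 1e14              # assumed synapse count
updates_per_second = 10      # transformations per synapse per second
total_ops = synapses * updates_per_second     # 1e15 ops for one brain-second
cpu_ops_per_second = 1e8     # a 100 MHz machine, one computation per cycle
seconds = total_ops / cpu_ops_per_second      # 1e7 seconds
print(f"One second of brain activity: {seconds:.0f} s "
      f"(about {seconds / 86400:.0f} days)")  # roughly four months

# First limitation: finite precision. Iterate the same chaotic map in
# 32-bit and in 64-bit arithmetic; rounding differences far below the
# eighth decimal place are amplified until the two "simulations" of one
# and the same system disagree in every digit.
x32 = np.float32(0.4)
x64 = np.float64(0.4)
for _ in range(60):
    x32 = np.float32(3.9) * x32 * (np.float32(1.0) - x32)
    x64 = 3.9 * x64 * (1.0 - x64)
print(f"After 60 steps: float32 -> {float(x32):.6f}, float64 -> {x64:.6f}")
```

The two trajectories begin less than a hundred-millionth apart, yet within a few dozen iterations they disagree completely; that is the sense in which a finite-precision simulation can be blind to the very dynamics it purports to reproduce.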
These are not exaggerations, but they are old points. And they concern classical simulations of PDP networks, instead of PDP "simulations" of serial processes, which is D&D's primary concern. Let me therefore make a new point. It concerns a difference between Dennett's "programmed recurrent PDP network" theory of consciousness, and the "recurrent PDP network" theory of consciousness advanced in The Engine of Reason.

For Dennett, it is the specific nature of the programming that elevates a recurrent network to the achievement of consciousness. That is why, on his account, human consciousness differs from animal consciousness: our "programming" is importantly different. Quite aside from our disagreements over what such "programming" involves or produces, notice that, on my account of consciousness, the configuration of the network's synaptic weights (what Dennett wants to call its "program") plays no explanatory role in the emergence or generation of consciousness itself. The weight configuration will indeed play a decisive role in what concepts the creature deploys, what features of the world it cares about, and what behaviors it is capable of producing, but it plays no role in giving rise to consciousness in the first place. That, according to the account in Engine, is owed to the basic recurrent architecture of the network at issue.

The human brain does not have to be carefully programmed in order to exhibit the basic dynamical features of short-term memory, steerable attention, plastic interpretation, sensory-free activity (both day-dreaming and night-dreaming), quiescence (deep sleep), and polymodal integration. These basic features arise automatically from the basic physical structure of the system, independently of the details or the maturity of its conceptual development. This means that, on my account, creatures with inchoate or still poorly developed conceptual frameworks, creatures such as human infants and nonhuman animals, still can and must be conscious in the same ways that we adult humans are. Their concerns may be narrow and their comprehension dim, but when they wake up in the morning they, too, are conscious.

By contrast, it is a consequence of Dennett's theory that human infants, as well as nonhuman animals, cannot be truly conscious, even when awake. I continue to find this implausible, but that is not the important point. The important point is that Dennett is forced to bite this particular bullet precisely because he locates the dynamical source of the "Magnificent Seven" cognitive features in the plastic or "programmable" aspects of the physical brain, aspects that require much time and cultural instruction, instead of locating their dynamical source in the endogenously specified and fixed aspects of the brain, namely, its integrated polymodal recurrent PDP architecture.

I am deeply encouraged that D&D regard the Seven Features as focal dimensions of consciousness, and also by their attempt to propose a nontrivial alternative account of them. But Dennett can lose an embarrassing and inescapable consequence of his own "program-level" account by adopting instead an "architecture-level" account of the general sort I have proposed.

There is another reason for adopting the architecture-level account over the program-level story. The former provides a simple and unitary account of all seven features. They fall out of the recurrent architecture without artifice, and they form a mutually coherent family. By contrast, D&D's accounts of the seven features, despite their successes in each case, look positively Ptolemaic, at least to these eyes. One is impressed by the intricacy and the high level of the artifice displayed, rather than by its absence.

It looks Ptolemaic in a further respect. Upon hearing of Newton's Laws of Motion and Theory of Universal Gravitation, a really determined Ptolemaist, sensing the inevitable, might well have tried to welcome them as a novel account of the "dynamical implementation" of the Ptolemaic system. He could argue that "virtual deferent circles" and "virtual epicycles," dictated (indirectly, of course) by gravity, are responsible for the roughly elliptical paths of the planets. After all, those venerable devices do generate a set of entirely "real patterns." But virtual epicycles do not give rise to the planetary motions. And virtual machines, I suggest, do not give rise to consciousness. The machines that do are entirely real, and for the bare production of consciousness it matters little or none how they have been programmed by human culture.

I close with a concession. D&D point out that PDP models currently offer no serious account of episodic or narrative memory, beyond the seconds-to-minutes range of what I have called "short-term" memory. This is correct, and the phenomenon is a major puzzle for PDP research. Empirical research (e.g., Zola-Morgan and Squire 1991) indicates that biographical episodes take several weeks to become fixed in long-term memory, and that the hippocampus (a focal point of several important recurrent pathways) is essential for the successful fixing of such episodes. This suggests that salient or important biographical episodes (only a tiny portion of one's history is ever remembered) get fixed by an extended process of synaptic adjustment guided by the activity of the hippocampus. But this still leaves us grasping for an explanatory mechanism. Providing it remains an undischarged obligation of the PDP research program, just as D&D claim.
References

Dennett, D. C. (1991), Consciousness Explained (Boston: Little, Brown).

Zola-Morgan, S., and Squire, L. (1991), "The primate hippocampal formation: evidence for a time-limited role in memory storage," Science 250: 288-89.