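$$= \bigcap_{\rho} p^{\rho}_{\alpha(x)\wedge\beta(x)} = \bigcap_{\rho}\bigl(p^{\rho}_{\alpha(x)} \cap p^{\rho}_{\beta(x)}\bigr) = \Bigl(\bigcap_{\rho} p^{\rho}_{\alpha(x)}\Bigr) \cap \Bigl(\bigcap_{\rho} p^{\rho}_{\beta(x)}\Bigr) = p^{f}_{\alpha(x)} \cap p^{f}_{\beta(x)}.$$

Finally, by using Eqs. (3.2) and (2.5), we get

$$p^{f}_{\alpha(x)\vee\beta(x)} = \bigcap_{\rho} p^{\rho}_{\alpha(x)\vee\beta(x)} = \bigcap_{\rho}\bigl(p^{\rho}_{\alpha(x)} \cup p^{\rho}_{\beta(x)}\bigr) \supseteq \Bigl(\bigcap_{\rho} p^{\rho}_{\alpha(x)}\Bigr) \cup \Bigl(\bigcap_{\rho} p^{\rho}_{\beta(x)}\Bigr) = p^{f}_{\alpha(x)} \cup p^{f}_{\beta(x)}.$$

To close up, we note that the definitions of $\prec$ and $\approx$ on $\phi(x)$ imply that the poset $(\phi(x)/{\approx}, \prec)$, whose order is induced on $\phi(x)/{\approx}$ by the preorder $\prec$ defined on $\phi(x)$, is order-isomorphic to $(\mathcal{P}^{f}, \subseteq)$.

4. The general notion of testability

The intended physical interpretation of $\mathcal{L}(x)$ suggests that a sentence of $\mathcal{L}(x)$ can be classified as empirically decidable, or testable, iff it can be associated with a registration procedure that allows one (under physical conditions to be carefully specified, see Sec. 2) to determine its truth value whenever an interpretation $\rho$ of the variable $x$ is given. Since all elementary sentences are testable, one is thus led to define the subset $\phi_T(x) \subseteq \phi(x)$ of all testable wffs of $\phi(x)$ as follows.

$$\phi_T(x) = \{\alpha(x) \in \phi(x) \mid \exists E_{\alpha} \in \mathcal{E} : \alpha(x) \equiv E_{\alpha}(x)\}. \tag{4.1}$$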
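The subset $\mathcal{P}^{f}_{T} \subseteq \mathcal{P}^{f}$ of all physical propositions associated with wffs of $\phi_T(x)$ will then be called the set of all testable physical propositions. More formally,

$$\mathcal{P}^{f}_{T} = \{p^{f}_{\alpha(x)} \in \mathcal{P}^{f} \mid \alpha(x) \in \phi_T(x)\}. \tag{4.2}$$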
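As a hedged illustration of the physical propositions entering Eq. (4.2), the following Python sketch builds a toy model and computes, for a wff, the set of states in which it is true for every object. The two states, the objects and the extensions of the properties E and F are hypothetical values chosen for the example. The sketch also assumes that every object in the universe of a state is picked out by some interpretation of x, so that quantifying over interpretations amounts to quantifying over objects.

```python
# Hypothetical toy model: two states, each with its own universe of objects.
UNIVERSE = {"S1": {"a", "b"}, "S2": {"c", "d"}}

# EXT[S][E]: objects in state S that possess property E (illustrative values).
EXT = {
    "S1": {"E": {"a"}, "F": {"b"}},
    "S2": {"E": {"c", "d"}, "F": set()},
}


def holds(wff, state, obj):
    """Classical truth value of a wff for one object in one state."""
    op = wff[0]
    if op == "atom":
        return obj in EXT[state][wff[1]]
    if op == "not":
        return not holds(wff[1], state, obj)
    if op == "and":
        return holds(wff[1], state, obj) and holds(wff[2], state, obj)
    if op == "or":
        return holds(wff[1], state, obj) or holds(wff[2], state, obj)
    raise ValueError(f"unknown connective: {op}")


def physical_proposition(wff):
    """States in which the wff is true for every object (certainly true)."""
    return {
        state
        for state, objects in UNIVERSE.items()
        if all(holds(wff, state, obj) for obj in objects)
    }


E, F = ("atom", "E"), ("atom", "F")
p_E, p_F = physical_proposition(E), physical_proposition(F)

# Conjunction: equality with the intersection.
conjunction_ok = physical_proposition(("and", E, F)) == p_E & p_F
# Disjunction: only an inclusion, which is strict in this model.
disjunction_strict = physical_proposition(("or", E, F)) > p_E | p_F
```

In this model E ∨ F is certainly true in S1 although neither E nor F is, which is why the proposition of the disjunction is strictly larger than the union of the two propositions.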
Of course, $\phi_T(x)$ is preordered by the restrictions to it of the preorders $\le$ and $\prec$ defined on $\phi(x)$. For the sake of simplicity, we will denote preorders and equivalence relations on $\phi_T(x)$ by the same symbols used to denote them on $\phi(x)$. Hence, the logical preorder $\le$ implies the physical preorder $\prec$, and the logical equivalence $\equiv$ implies the physical equivalence $\approx$, also on $\phi_T(x)$. We thus get two preorder structures, $(\phi_T(x), \le)$ and $(\phi_T(x), \prec)$, and two posets, $(\phi_T(x)/{\equiv}, \le)$ and $(\phi_T(x)/{\approx}, \prec)$. The latter, in particular, is isomorphic to $(\mathcal{P}^{f}_{T}, \subseteq)$. We shall see in the next sections some further characterizations of the foregoing posets within the framework of specific theories.

5. Classical mechanics (CM)

It is well known that in classical mechanics (CM) all physical objects in a given state $S$ possess the same properties. This feature of CM can be formalized here by introducing the following assumption.

CMS. For every $S \in \mathcal{S}$ and $E \in \mathcal{E}$, either $\mathrm{ext}_S E = \mathcal{U}_S$ or $\mathrm{ext}_S E = \emptyset$.

It follows from assumption CMS that, for every interpretation $\rho \in \mathcal{R}$, $\rho_S(x) \in \mathrm{ext}_S E$ iff $\mathrm{ext}_S E = \mathcal{U}_S$, and $\rho_S(x) \notin \mathrm{ext}_S E$ iff $\mathrm{ext}_S E = \emptyset$. Therefore, the assignment function $\sigma^{\rho}_{S}$ does not depend on the specific interpretation $\rho$. More explicitly, for every interpretation $\rho$ and state $S$,

$$\sigma^{\rho}_{S}(E(x)) = T \ \text{ iff } \ \mathrm{ext}_S E = \mathcal{U}_S, \qquad \sigma^{\rho}_{S}(E(x)) = F \ \text{ iff } \ \mathrm{ext}_S E = \emptyset,$$

$$\sigma^{\rho}_{S}(E(x) \wedge F(x)) = T \ \text{ iff } \ \mathrm{ext}_S E = \mathcal{U}_S = \mathrm{ext}_S F, \qquad \sigma^{\rho}_{S}(E(x) \wedge F(x)) = F \ \text{ iff } \ \mathrm{ext}_S E = \emptyset \ \text{ or } \ \mathrm{ext}_S F = \emptyset$$
(where $E, F \in \mathcal{E}$), etc. Since $\sigma^{\rho}_{S}$ does not depend on $\rho$, the individual proposition $p^{\rho}_{\alpha(x)}$ does not depend on $\rho$ either, and we can omit the index $\rho$ in both symbols. Thus, for every $\rho \in \mathcal{R}$, the individual proposition associated with $\alpha(x) \in \phi(x)$ is given by

$$p_{\alpha(x)} = \{S \in \mathcal{S} : \sigma_S(\alpha(x)) = T\}. \tag{5.1}$$
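More explicitly, we have

$$p_{E(x)} = \{S \in \mathcal{S} : \mathrm{ext}_S E = \mathcal{U}_S\}, \tag{5.2}$$

$$p_{E(x)\wedge F(x)} = \{S \in \mathcal{S} : \mathrm{ext}_S E = \mathcal{U}_S = \mathrm{ext}_S F\} = p_{E(x)} \cap p_{F(x)}, \tag{5.3}$$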
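etc. The set $\mathcal{P}^{\rho}$ of all individual propositions associated with wffs of $\mathcal{L}(x)$ obviously does not depend on $\rho$, and will be simply denoted by $\mathcal{P}$. Because of the above specific features, the general notions introduced in Secs. 2, 3 and 4 particularize in CM as follows. For every $\alpha(x), \beta(x) \in \phi(x)$ and $S \in \mathcal{S}$,

$$\sigma_S(\alpha(x)) = T \ \text{ iff } \ S \in p_{\alpha(x)},$$

$$\alpha(x) \le \beta(x) \ \text{ iff } \ p_{\alpha(x)} \subseteq p_{\beta(x)},$$

$$\alpha(x) \equiv \beta(x) \ \text{ iff } \ p_{\alpha(x)} = p_{\beta(x)}.$$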
It also follows from the general case that the Lindenbaum-Tarski algebra $(\phi(x)/{\equiv}, \le)$ of $\mathcal{L}(x)$ is isomorphic to the Boolean lattice of individual propositions $(\mathcal{P}, \subseteq)$, so that the two lattices can be identified. Coming to physical propositions, we get, for every $\alpha(x) \in \phi(x)$,

$$p^{f}_{\alpha(x)} = p_{\alpha(x)}, \tag{5.4}$$
and, therefore, $\mathcal{P}^{f} = \mathcal{P}$. Thus, the set of all physical propositions coincides in CM with the set of all individual propositions, and the notions of true and certainly true also coincide. Furthermore, the intended physical interpretation suggests that every sentence of the language $\mathcal{L}(x)$ is testable in CM. This motivates the following assumption.

CMT. The set of all testable sentences of the language $\mathcal{L}(x)$ coincides in CM with the set of all sentences of $\mathcal{L}(x)$, that is, $\phi_T(x) = \phi(x)$ in CM.

Assumption CMT implies that $\mathcal{P}^{f}_{T} = \mathcal{P}^{f} = \mathcal{P}$, whence $(\mathcal{P}^{f}_{T}, \subseteq) = (\mathcal{P}, \subseteq)$. More explicitly, the poset of all testable physical propositions of a physical system $\Omega$ coincides with the poset of all individual propositions of its language $\mathcal{L}(x)$, and has the structure of a Boolean lattice. This result explains, in particular, the common statement in the literature that "the logic of a classical mechanical system is a classical propositional logic". This statement is, however, misleading in our opinion, since it ignores the conceptual difference between individual, physical and testable physical propositions, which coincide in CM only because of assumptions CMS and CMT.
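The CM case can be sketched in a few lines of Python. This is only an illustration under assumption CMS: since a property is then possessed by all objects or by none in a given state, each property is described here by the set of states in which it holds. The three states and the two properties E and F are hypothetical.

```python
# Hypothetical set of states of a classical system.
STATES = frozenset({"S1", "S2", "S3"})

# Under CMS each property holds for all objects or for none in a state,
# so a property is fully described by the states in which it holds.
HOLDS_IN = {"E": frozenset({"S1", "S2"}), "F": frozenset({"S2", "S3"})}


def proposition(wff):
    """Individual (equivalently, physical) proposition of a wff in CM."""
    op = wff[0]
    if op == "atom":
        return HOLDS_IN[wff[1]]
    if op == "not":
        return STATES - proposition(wff[1])
    if op == "and":
        return proposition(wff[1]) & proposition(wff[2])
    if op == "or":
        return proposition(wff[1]) | proposition(wff[2])
    raise ValueError(f"unknown connective: {op}")


E, F = ("atom", "E"), ("atom", "F")

# Negation is set complementation and disjunction is set union,
# so the propositions form a Boolean algebra of subsets of STATES.
excluded_middle = proposition(("or", E, ("not", E))) == STATES
distributive = proposition(("and", E, ("or", F, ("not", F)))) == (
    proposition(("and", E, F)) | proposition(("and", E, ("not", F)))
)
```

Here every connective acts as the corresponding set operation, which is the Boolean behaviour described above. It is lost in the QM case, where extensions may be proper nonempty subsets of the universe of a state.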
6. Quantum mechanics (QM)

We have stressed in Sec. 2 that our semantics (hence the general scheme in Secs. 2, 3 and 4) is unsuitable for QM whenever the standard interpretation of this theory is accepted. As anticipated in the Introduction and in Sec. 2, we therefore adopt in the present paper the SR interpretation of QM worked out by one of the authors and by other authors in a series of articles,$^{11-13,15,17,18}$ according to which $\mathrm{ext}_S E$ can be defined in every physical situation (we show in Sec. 7 that the new perspective also allows us to elucidate the concept of quantum truth underlying the standard interpretation of QM). At variance with CM, it may then occur in QM that $\emptyset \neq \mathrm{ext}_S E \neq \mathcal{U}_S$, so that the assignment function $\sigma^{\rho}_{S}$ generally depends on the interpretation $\rho$. The formulas written down for the general case cannot be simplified as in Sec. 5. In particular, $\mathcal{P}^{f} \neq \mathcal{P}$, assumptions CMS and CMT do not hold, and $\mathcal{P}^{f}_{T} \subset \mathcal{P}^{f}$.

In order to discuss how the general case particularizes when QM is considered, let us briefly recall the mathematical representations of physical systems, states and properties within this theory. Let $\Omega$ be a physical system. Then, $\Omega$ is associated with a separable Hilbert space $\mathcal{H}$ over the field of complex numbers. Let us denote by $(\mathcal{L}(\mathcal{H}), \subseteq)$ the poset of all closed subspaces of $\mathcal{H}$, partially ordered by set-theoretical inclusion, and let $\mathcal{A} \subseteq \mathcal{L}(\mathcal{H})$ be the set of all one-dimensional subspaces of $\mathcal{H}$. Then (in absence of superselection rules) a mapping

$$\varphi : S \in \mathcal{S} \longrightarrow \varphi(S) \in \mathcal{A} \tag{6.1}$$
exists which maps bijectively the set $\mathcal{S}$ of all pure states of $\Omega$ onto $\mathcal{A}$ (for the sake of simplicity, we will not consider mixed states in this paper, so that we understand the word pure in the following).$^{d}$ In addition, a mapping

$$\chi : E \in \mathcal{E} \longrightarrow \chi(E) \in \mathcal{L}(\mathcal{H}) \tag{6.2}$$
exists which maps bijectively the set $\mathcal{E}$ of all properties of $\Omega$ onto $\mathcal{L}(\mathcal{H})$. The poset $(\mathcal{L}(\mathcal{H}), \subseteq)$ is characterized by a set of mathematical properties. In particular, it is a complete, orthocomplemented, weakly modular, atomic lattice which satisfies the covering law.$^{22-24}$ We denote by $^{\perp}$, $\Cap$ and $\Cup$ orthocomplementation, meet and join, respectively, in $(\mathcal{L}(\mathcal{H}), \subseteq)$ (it is important to observe that $\Cap$ coincides with the set-theoretical intersection $\cap$ of subspaces of $\mathcal{L}(\mathcal{H})$, while $^{\perp}$ does not generally coincide with the set-theoretical complementation $'$, nor does $\Cup$ coincide with the set-theoretical union $\cup$).

$^{d}$ It follows easily that every pure state $S$ can also be represented by any vector $|\psi\rangle \in \varphi(S) \in \mathcal{A}$, which is the standard representation adopted in elementary QM. Moreover, a pure state $S$ is usually represented by an (orthogonal) projection operator on $\varphi(S)$ in more advanced QM. However, the representation $\varphi$ introduced here is more suitable for our purposes in the present paper.

Furthermore, we note that $\mathcal{A}$ obviously coincides with the set of all atoms of $(\mathcal{L}(\mathcal{H}), \subseteq)$. Let us denote by $\prec$ the order induced on $\mathcal{E}$, via the bijective representation $\chi$, by the order $\subseteq$ defined on $\mathcal{L}(\mathcal{H})$. Then, the poset $(\mathcal{E}, \prec)$ is order-isomorphic to $(\mathcal{L}(\mathcal{H}), \subseteq)$, hence it is characterized by the same mathematical properties characterizing $(\mathcal{L}(\mathcal{H}), \subseteq)$. In particular, the unary operation induced on it, via $\chi$, by the orthocomplementation defined on $(\mathcal{L}(\mathcal{H}), \subseteq)$ is an orthocomplementation, and $(\mathcal{E}, \prec)$ is an orthomodular (i.e., orthocomplemented and weakly modular) lattice, usually called the lattice of properties of $\Omega$. By abuse of language, we denote the lattice operations on $(\mathcal{E}, \prec)$ by the same symbols used above to denote the corresponding lattice operations on $(\mathcal{L}(\mathcal{H}), \subseteq)$.

Orthomodular lattices are said to characterize semantically orthomodular QLs in the literature.$^{3}$ The lattice of properties $(\mathcal{E}, \prec)$ is a less general structure in QM, since it inherits a number of further properties from $(\mathcal{L}(\mathcal{H}), \subseteq)$, and can be identified with the concrete, or standard, sharp QL mentioned in Sec. 1 (simply called QL here for the sake of brevity).

A further lattice, isomorphic to $(\mathcal{E}, \prec)$, will be used in the following. In order to introduce it, let us consider the mapping
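$$\theta : E \in \mathcal{E} \longrightarrow \mathcal{S}_E = \{S \in \mathcal{S} \mid \varphi(S) \subseteq \chi(E)\} \in \mathcal{L}(\mathcal{S}), \tag{6.3}$$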
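where $\mathcal{L}(\mathcal{S}) = \{\mathcal{S}_E \mid E \in \mathcal{E}\}$ is the range of $\theta$, and generally is a proper subset of the power set $\mathcal{P}(\mathcal{S})$ of $\mathcal{S}$. The poset $(\mathcal{L}(\mathcal{S}), \subseteq)$ is order-isomorphic to $(\mathcal{L}(\mathcal{H}), \subseteq)$, hence to $(\mathcal{E}, \prec)$, since [...] because of the analogous result holding in $(\mathcal{L}(\mathcal{H}), \subseteq)$.$^{e}$

On the basis of the above definitions, we now introduce the following assumption.

QMT. The poset $(\mathcal{P}^{f}_{T}, \subseteq)$ of all testable physical propositions associated with statements of $\phi_T(x)$ (equivalently, with atomic statements of $\mathcal{L}(x)$) coincides in QM with the lattice $(\mathcal{L}(\mathcal{S}), \subseteq)$ of all closed subsets of $\mathcal{S}$.

Assumption QMT is intuitively natural, and can be justified by using the standard statistical interpretation of QM. We do not insist on this topic here for the sake of brevity. We note instead that assumption QMT implies that the posets $(\phi_T(x)/{\approx}, \prec)$ and $(\mathcal{P}^{f}_{T}, \subseteq)$, on one side, and the lattices $(\mathcal{L}(\mathcal{S}), \subseteq)$, $(\mathcal{L}(\mathcal{H}), \subseteq)$ and $(\mathcal{E}, \prec)$, on the other side, are order-isomorphic. Therefore the operations of meet, join and orthocomplementation on $(\phi_T(x)/{\approx}, \prec)$ and $(\mathcal{P}^{f}_{T}, \subseteq)$ will also be denoted by the symbols $\Cap$, $\Cup$ and $^{\perp}$, respectively. Moreover, for every $\alpha(x), \beta(x) \in \phi_T(x)$,

$$(p^{f}_{\alpha(x)})^{\perp} \subseteq \mathcal{S} \setminus p^{f}_{\alpha(x)}, \tag{6.4}$$

$$p^{f}_{\alpha(x)} \cap p^{f}_{\beta(x)} = p^{f}_{\alpha(x)} \Cap p^{f}_{\beta(x)}, \tag{6.5}$$

$$p^{f}_{\alpha(x)} \cup p^{f}_{\beta(x)} \subseteq p^{f}_{\alpha(x)} \Cup p^{f}_{\beta(x)}. \tag{6.6}$$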
The isomorphisms above allow one to recover QL as a quotient algebra of sentences of $\mathcal{L}(x)$. They, however, make intuitively clear that associating the properties (or 'propositions') of QL with sentences of $\mathcal{L}(x)$ is not trivial. The association requires indeed selecting testable wffs of $\phi(x)$ and the lattice operations of QL. To this end, let us note that statements (i), (ii) and (iii) in Sec. 3, if compared with Eqs. (6.4), (6.5) and (6.6), respectively, yield, for every $\alpha(x), \beta(x) \in \phi_T(x)$,

$$p^{f}_{\neg\alpha(x)} \subseteq \mathcal{S} \setminus p^{f}_{\alpha(x)} \supseteq (p^{f}_{\alpha(x)})^{\perp}, \tag{6.7}$$

$$p^{f}_{\alpha(x)\wedge\beta(x)} = p^{f}_{\alpha(x)} \cap p^{f}_{\beta(x)} = p^{f}_{\alpha(x)} \Cap p^{f}_{\beta(x)}, \tag{6.8}$$

$$p^{f}_{\alpha(x)\vee\beta(x)} \supseteq p^{f}_{\alpha(x)} \cup p^{f}_{\beta(x)} \subseteq p^{f}_{\alpha(x)} \Cup p^{f}_{\beta(x)}. \tag{6.9}$$

$^{e}$ Whenever the dimension of $\mathcal{H}$ is finite, the lattice $(\mathcal{L}(\mathcal{H}), \subseteq)$ and/or the lattice $(\mathcal{L}(\mathcal{S}), \subseteq)$ can be identified with Birkhoff and von Neumann's modular lattice of experimental propositions, which was introduced in the 1936 paper that started the research on QL.$^{25}$ This identification is impossible if the dimension of $\mathcal{H}$ is not finite, since $(\mathcal{L}(\mathcal{H}), \subseteq)$ and $(\mathcal{L}(\mathcal{S}), \subseteq)$ are weakly modular but not modular in this case. Birkhoff and von Neumann's requirement of modularity has deep roots in von Neumann's concept of probability in QM according to some authors.$^{2}$
Eq. (6.8) shows that, if $\alpha(x)$ and $\beta(x)$ belong to $\phi_T(x)$, then $\alpha(x) \wedge \beta(x)$ belongs to $\phi_T(x)$, and establishes a strong connection between the connective $\wedge$ of $\mathcal{L}(x)$ and the lattice operation $\Cap$ of QL. Eqs. (6.7) and (6.9) establish instead only weak connections between the connectives $\neg$ and $\vee$, on one side, and the lattice operations $^{\perp}$ and $\Cup$, on the other side. Hence, no simple structural correspondence can be established between $\mathcal{L}(x)$ and QL.

One can, however, obtain a more satisfactory correspondence between the sentences of a suitable language and the 'propositions' of QL by using a fragment of $\mathcal{L}(x)$ in order to construct a new quantum language $\mathcal{L}_{TQ}(x)$, as follows.

First of all, we consider two properties $E, F \in \mathcal{E}$ and observe that, since the mapping $\chi$ introduced in Eq. (6.2) is bijective, $E$ and $F$ coincide whenever they are represented by the same subspace in $\mathcal{L}(\mathcal{H})$. This implies that the following sequence of equivalences holds.

$$p^{f}_{E(x)} = p^{f}_{F(x)} \ \text{ iff } \ E = F \ \text{ iff } \ E(x) \approx F(x) \ \text{ iff } \ E(x) \equiv F(x).$$
It follows in particular that every equivalence class of $\phi_T(x)/{\approx}$ contains one and only one atomic wff of $\mathcal{L}(x)$. Since the set $\mathcal{E}(x)$ of all atomic wffs of $\mathcal{L}(x)$ (Sec. 2) is contained in $\phi_T(x)$, we conclude that the correspondence that maps every $\alpha(x) \in \phi_T(x)$ onto the atomic wff $E_{\alpha}(x)$, the existence of which is guaranteed by Eq. (4.1), is a surjective mapping. Moreover, this mapping maps all physically equivalent wffs of $\phi_T(x)$ onto the same atomic wff of $\mathcal{E}(x)$.

Secondly, let us consider the set $\phi_{\wedge}(x)$ of all wffs of $\mathcal{L}(x)$ which either are atomic or contain the connective $\wedge$ only. Because of Eq. (6.8), the proposition associated with a wff $\alpha(x) \wedge \beta(x)$ of this kind belongs to $\mathcal{P}^{f}_{T}$, hence $\alpha(x) \wedge \beta(x)$ belongs to $\phi_T(x)$, so that $\phi_{\wedge}(x) \subseteq \phi_T(x)$. Then, let us introduce a new connective $\neg_Q$ (quantum negation) which can be applied (repeatedly) to wffs of $\phi_{\wedge}(x)$ following standard formation rules for negation connectives. We thus obtain a new formal language $\mathcal{L}_{TQ}(x)$, whose set of wffs will be denoted by $\phi_{TQ}(x)$. We adopt the semantic rules introduced in Sec. 2 for all wffs of $\phi_{\wedge}(x) \subseteq \phi_{TQ}(x)$, and complete the semantics of $\mathcal{L}_{TQ}(x)$ by means of the following rule.
QN. Let $\alpha(x) \in \phi_{TQ}(x)$ and let a wff $E_{\alpha}(x) \in \mathcal{E}(x)$ exist such that $\alpha(x)$ is true iff $E_{\alpha}(x)$ is true. Then, $\neg_Q \alpha(x)$ is true iff $E^{\perp}_{\alpha}(x)$ is true.

It follows that, for every $\alpha(x) \in \phi_{TQ}(x)$, an elementary wff $E_{\alpha}(x)$ exists such that $\alpha(x)$ is true iff $E_{\alpha}(x)$ is true. This conclusion has the following immediate consequences.

(i) One can define, for every interpretation $\rho$ of the variable $x$ and state $S$, an assignment function $\tau^{\rho}_{S} : \phi_{TQ}(x) \longrightarrow \{T, F\}$. Hence, a logical preorder and a logical equivalence relation (that we still denote by the symbols $\le$ and $\equiv$, respectively, by abuse of language) can be defined on $\phi_{TQ}(x)$ by using the definitions in Sec. 2 with $\phi_{TQ}(x)$ in place of $\phi(x)$ and $\tau^{\rho}_{S}$ in place of $\sigma^{\rho}_{S}$.

(ii) One can associate a physical proposition with every $\alpha(x) \in \phi_{TQ}(x)$ by using Eq. (3.1) with $\tau^{\rho}_{S}$ in place of $\sigma^{\rho}_{S}$. Hence, a physical preorder and a physical equivalence relation (that we still denote by the symbols $\prec$ and $\approx$, respectively, by abuse of language) can be defined on $\phi_{TQ}(x)$ by using the definitions in Sec. 3 with $\phi_{TQ}(x)$ in place of $\phi(x)$ (one can also show that $\approx$ coincides with $\equiv$ on $\phi_{TQ}(x)$).

(iii) The notion of testability introduced in Sec. 4 can be extended to $\mathcal{L}_{TQ}(x)$ by using Eq. (4.1) with $\phi_{TQ}(x)$ in place of $\phi(x)$, obtaining that all wffs of $\phi_{TQ}(x)$ are testable, so that the set of all physical propositions associated with wffs of $\phi_{TQ}(x)$ coincides with $\mathcal{P}^{f}_{T}$.

It follows from (ii) and (iii) that $(\phi_{TQ}(x)/{\approx}, \prec)$ is isomorphic to the lattice $(\mathcal{P}^{f}_{T}, \subseteq)$, so that these two order structures can be identified.

The set of connectives defined on $\mathcal{L}_{TQ}(x)$ can now be enriched by introducing derived connectives. In particular, a quantum join can be defined by setting, for every $\alpha(x), \beta(x) \in \phi_{TQ}(x)$,

$$\alpha(x) \vee_Q \beta(x) = \neg_Q(\neg_Q \alpha(x) \wedge \neg_Q \beta(x)). \tag{6.10}$$
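It is then easy to show that the following equalities hold.

$$p^{f}_{\neg_Q \alpha(x)} = (p^{f}_{\alpha(x)})^{\perp}, \tag{6.11}$$

$$p^{f}_{\alpha(x)\wedge\beta(x)} = p^{f}_{\alpha(x)} \Cap p^{f}_{\beta(x)}, \tag{6.12}$$

$$p^{f}_{\alpha(x)\vee_Q\beta(x)} = p^{f}_{\alpha(x)} \Cup p^{f}_{\beta(x)}. \tag{6.13}$$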
The equations above establish a strong connection between the logical operations defined on $\phi_{TQ}(x)$ and the lattice operations of QL. Hence, a structural correspondence exists between $\mathcal{L}_{TQ}(x)$ and QL, and the latter can be recovered within our general scheme also by first considering the set of all elementary wffs of $\mathcal{L}(x)$, and then constructing $\mathcal{L}_{TQ}(x)$ and the quotient algebra $(\phi_{TQ}(x)/{\approx}, \prec)$. It is now apparent that the semantic rules for quantum connectives have an empirical character (they depend on the mathematical representation of states and properties in QM and on assumption QMT) and that they coexist with the semantic rules for classical connectives in our approach (the deep reason for this is, of course, our adoption of the SR interpretation of QM). In our opinion, these conclusions are relevant, since they deepen and formalize a new perspective on QL that has been propounded in some previous papers$^{10-18}$ and is completely different from the standard viewpoint about this kind of logic.

To conclude, let us observe that a further derived connective $\to_Q$ can be introduced in $\phi_{TQ}(x)$ by setting, for every $\alpha(x), \beta(x) \in \phi_{TQ}(x)$,

$$\alpha(x) \to_Q \beta(x) = (\neg_Q \alpha(x)) \vee_Q (\alpha(x) \wedge \beta(x)). \tag{6.14}$$
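The derived connectives of Eqs. (6.10) and (6.14) can be tried out on a very small stand-in for the lattice of closed subspaces. The Python sketch below is an illustration only, under the assumption that the Hilbert space is replaced by the real plane: a proposition is then the zero subspace, the whole plane, or a line through the origin labelled by its angle in whole degrees modulo 180. The names `line`, `orth`, `meet`, `q_or` and `sasaki` are hypothetical helpers introduced for this example.

```python
# Closed subspaces of the real plane (a stand-in for the lattice of
# closed subspaces of a Hilbert space): the zero subspace, the whole
# plane, or a line through the origin given by its angle in degrees.
ZERO, PLANE = "zero", "plane"


def line(deg):
    """Line through the origin at the given angle (degrees, modulo 180)."""
    return ("line", deg % 180)


def orth(a):
    """Orthocomplement: plays the role of quantum negation."""
    if a == ZERO:
        return PLANE
    if a == PLANE:
        return ZERO
    return line(a[1] + 90)


def meet(a, b):
    """Lattice meet, which here is the set-theoretical intersection."""
    if a == b or b == PLANE:
        return a
    if a == PLANE:
        return b
    return ZERO  # two distinct lines, or anything met with ZERO


def q_or(a, b):
    """Quantum join defined through De Morgan's rule, as in Eq. (6.10)."""
    return orth(meet(orth(a), orth(b)))


def sasaki(a, b):
    """Derived implication of Eq. (6.14): (not_Q a) or_Q (a and b)."""
    return q_or(orth(a), meet(a, b))


a, b, c = line(0), line(45), line(90)

# The join of two distinct lines is the whole plane, far more than
# the set-theoretical union of the two lines.
join_ab = q_or(a, b)

# Distributivity fails: a meet (b join c) differs from
# (a meet b) join (a meet c).
lhs = meet(a, q_or(b, c))
rhs = q_or(meet(a, b), meet(a, c))
```

In this model `lhs` is the line `a` while `rhs` is the zero subspace, a standard witness of non-distributivity. `sasaki(a, a)` returns the whole plane, and `sasaki(a, b)` for two distinct lines returns the orthocomplement of `a`.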
One can thus recover within $\mathcal{L}_{TQ}(x)$ the Sasaki hook, the role of which is largely discussed in the literature on QL.$^{2,3,24}$

7. Quantum truth

The general notion of certainly true introduced in Sec. 3 is defined for all wffs of $\mathcal{L}(x)$. Yet, according to our approach, only wffs of $\phi_T(x)$ can be associated with empirical procedures which allow one to check whether they are certainly true or not. Whenever $\alpha(x) \in \phi_T(x)$, the notion of certainly true can be worked out in order to define a verificationist notion of quantum truth (Q-truth) in QM, as follows.

QT. Let $\alpha(x) \in \phi_T(x)$. Then, we put:

$\alpha(x)$ is Q-true in $S \in \mathcal{S}$ iff $S \in p^{f}_{\alpha(x)}$;

$\alpha(x)$ is Q-false in $S \in \mathcal{S}$ iff $S \in (p^{f}_{\alpha(x)})^{\perp}$;

$\alpha(x)$ has no Q-truth value in $S \in \mathcal{S}$ (equivalently, $\alpha(x)$ is Q-indeterminate in $S$) iff $S \in \mathcal{S} \setminus (p^{f}_{\alpha(x)} \cup (p^{f}_{\alpha(x)})^{\perp})$.

It obviously follows from definition QT that $\alpha(x)$ is Q-true in $S$ iff it is certainly true in $S$. Definition QT can be physically justified by using the analysis of the notion of truth in QM recently provided by ourselves$^{26}$ and subsequently deepened by one of us.$^{21}$ We only note here that it is equivalent to defining a wff $\alpha(x) \in \phi(x)$ as Q-true (Q-false) in $S$ iff:

(i) $\alpha(x)$ is testable;

(ii) $\alpha(x)$ can be tested and found to be true (false) on the physical object $x$ without altering the state $S$ of $x$.

The proof of the equivalence of the two definitions is rather simple but requires some use of the laws of QM (see again Refs. 21 and 26).

It is apparent that the notions of truth and Q-truth coexist in our approach. Indeed, a wff $\alpha(x) \in \phi(x)$ is Q-true (Q-false) for a given state $S$ of the physical system iff it belongs to $\phi_T(x)$ and it is true (false) independently of the interpretation of the variable $x$ (equivalently, iff it belongs to $\phi_T(x)$ and can be empirically proved to be true or false without altering the state $S$ of $x$). This realizes an integrated perspective, according to which the classical and the quantum conceptions of truth are not mutually incompatible.$^{21,26,27}$

However, definition QT introduces the notion of Q-truth on a fragment only (the set $\phi_T(x) \subset \phi(x)$) of the language $\mathcal{L}(x)$. If one wants to introduce this notion on the set of all wffs of a suitable quantum language, one can refer to the language $\mathcal{L}_{TQ}(x)$ constructed at the end of Sec. 6. Then, all wffs of $\phi_{TQ}(x)$ are testable, and definition QT can be applied in order to define Q-truth on $\mathcal{L}_{TQ}(x)$ by simply substituting $\phi_{TQ}(x)$ for $\phi_T(x)$ in it. Again, classical truth and Q-truth may coexist on $\mathcal{L}_{TQ}(x)$ in our approach.
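Definition QT can be illustrated with a small Python sketch. It is a toy model only, under the assumption that the Hilbert space is the real plane: a pure state is a ray labelled by its angle in whole degrees modulo 180, and a testable physical proposition is identified with the subspace (zero, a line, or the whole plane) that contains exactly the states belonging to it. The names `line`, `orth`, `contains` and `q_truth` are hypothetical helpers.

```python
# Subspaces of the real plane standing in for testable propositions.
ZERO, PLANE = "zero", "plane"


def line(deg):
    """Line through the origin at the given angle (degrees, modulo 180)."""
    return ("line", deg % 180)


def orth(a):
    """Orthocomplement of a subspace of the plane."""
    if a == ZERO:
        return PLANE
    if a == PLANE:
        return ZERO
    return line(a[1] + 90)


def contains(subspace, state_deg):
    """True iff the pure state (ray at state_deg degrees) lies in the subspace."""
    if subspace == PLANE:
        return True
    if subspace == ZERO:
        return False
    return subspace == line(state_deg)


def q_truth(subspace, state_deg):
    """Q-truth value of a proposition in a pure state, following definition QT."""
    if contains(subspace, state_deg):
        return "Q-true"
    if contains(orth(subspace), state_deg):
        return "Q-false"
    return "Q-indeterminate"


horizontal = line(0)
values = [q_truth(horizontal, deg) for deg in (0, 90, 45)]
```

For the proposition represented by the horizontal line, the state at 0 degrees makes it Q-true, the state at 90 degrees makes it Q-false, and the state at 45 degrees leaves it without a Q-truth value, since that state lies neither in the proposition nor in its orthocomplement.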
Let us close this section by commenting briefly on the notion of truth within the standard interpretation of QM. Whenever this interpretation is adopted, the languages $\mathcal{L}(x)$ and $\mathcal{L}_{TQ}(x)$ can still be formally introduced, but no classical semantics can be defined on them because of the impossibility of defining $\mathrm{ext}_S E$ for every $S \in \mathcal{S}$ and $E \in \mathcal{E}$ (see Sec. 2). One can still define, however, a notion of Q-truth for $\mathcal{L}_{TQ}(x)$. Indeed, one can first introduce a mapping $\xi : \alpha(x) \in \phi_{TQ}(x) \longrightarrow E_{\alpha} \in \mathcal{E}$ by means of recursive rules, as follows.

For every $\alpha(x) \in \phi_{TQ}(x)$, $\xi(\neg_Q \alpha(x)) = E^{\perp}_{\alpha}$.

For every $\alpha(x), \beta(x) \in \phi_{TQ}(x)$, $\xi(\alpha(x) \wedge \beta(x)) = E_{\alpha} \Cap E_{\beta}$.

Then, one can associate a physical proposition $p^{f}_{\alpha(x)} \in \mathcal{L}(\mathcal{S})$ with every $\alpha(x) \in \phi_{TQ}(x)$ by setting $p^{f}_{\alpha(x)} = \theta(E_{\alpha})$. Finally, one can define Q-truth on $\phi_{TQ}(x)$ by means of definition QT, independently of any classical definition of truth.

It is apparent that the above notion of Q-truth can be identified with the (verificationist$^{26}$) quantum notion of truth whose peculiar features have been widely explored by the literature on QL (in particular, a tertium non datur principle does not hold in $\mathcal{L}_{TQ}(x)$). Hence, the interpretation of QL as a new way of reasoning which is typical of QM seems legitimate. But this widespread opinion is highly problematic. Indeed, whenever $S$ is given, some wffs of $\phi_{TQ}(x)$ have a truth value and some do not, quantum connectives are not truth-functional, and the notion of truth appears rather elusive and mysterious.$^{5}$ Accepting our general perspective provides instead a reinterpretation of the notion of truth underlying the standard interpretation of QM, reconciling it with classical truth, and allows one to avoid the paradoxes following from the simultaneous (usually implicit) adoption of two incompatible notions of truth (classical and quantum).

8. The pragmatic interpretation of QL

The definition of Q-true in $S$ as certainly true in $S$ for wffs of $\phi_T(x)$ in Sec. 7 suggests, intuitively, that the assertion of a sentence $\alpha(x)$ of $\phi_T(x)$ should be considered justified in $S$ whenever $\alpha(x)$ is Q-true in $S$, unjustified otherwise. This informal definition can be formalized by introducing the assertion sign $\vdash$ and setting

$\vdash \alpha(x)$ is justified (unjustified) in $S$ iff $\alpha(x)$ is Q-true (not Q-true) in $S$.

The set of all elementary wffs of $\phi_T(x)$, each preceded by the assertion sign $\vdash$, can be identified with the set of all elementary assertive formulas of the quantum pragmatic language $\mathcal{L}^{P}_{Q}$ introduced by one of the authors in a recent paper$^{21}$ in order to provide a pragmatic interpretation of QL.$^{f}$ The set $\psi^{Q}_{A}$ of all assertive formulas (afs) of $\mathcal{L}^{P}_{Q}$ is made up of all the aforesaid elementary afs plus all formulas obtained by applying recursively the pragmatic connectives $N$, $K$, $A$ to elementary afs. For every $S \in \mathcal{S}$ a pragmatic evaluation function $\pi_S$ is defined which assigns a justification value (justified/unjustified) to every af of $\psi^{Q}_{A}$ and allows one to introduce on $\psi^{Q}_{A}$ a preorder $\prec$ and an equivalence relation $\approx$ following standard procedures.
More importantly, a p-decidable sublanguage $\mathcal{L}^{P}_{QD}$ of $\mathcal{L}^{P}_{Q}$ can be constructed whose set $\psi^{Q}_{AD}$ of afs consists of a suitable subset of all afs of $\psi^{Q}_{A}$ which have a justification value that can be determined by means of empirical procedures of proof (in particular, all elementary afs of $\psi^{Q}_{A}$ belong to $\psi^{Q}_{AD}$). $\mathcal{L}^{P}_{QD}$ can then be compared with the quantum language $\mathcal{L}_{TQ}(x)$ introduced at the end of Sec. 6 by constructing a one-to-one mapping $\tau$ of $\phi_{TQ}(x)$ onto $\psi^{Q}_{AD}$, as follows.

For every $E(x) \in \phi_{TQ}(x)$, $\tau(E(x)) = {\vdash} E(x)$,

For every $\alpha(x) \in \phi_{TQ}(x)$, $\tau(\neg_Q \alpha(x)) = N \vdash \alpha(x)$,

For every $\alpha(x), \beta(x) \in \phi_{TQ}(x)$, $\tau(\alpha(x) \wedge \beta(x)) = {\vdash} \alpha(x) \, K \vdash \beta(x)$,

For every $\alpha(x), \beta(x) \in \phi_{TQ}(x)$, $\tau(\alpha(x) \vee_Q \beta(x)) = {\vdash} \alpha(x) \, A \vdash \beta(x)$.

$^{f}$ It must be noted that the pragmatic interpretation of QL has some advantages with respect to the interpretation propounded in Sec. 6. In particular, it is independent of the interpretation of QM that is accepted (standard or SR), while our interpretation in this paper follows from adopting a classical notion of truth, hence from accepting the SR interpretation of QM.
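The recursive character of the mapping just defined can be sketched in Python. The sketch makes one assumption that goes beyond the rules as written: for a non-atomic wff, the assertion-signed expression on the right-hand side is read as the translation of that wff, so that the mapping is applied recursively down to elementary afs. The tuple encoding of wffs, the function name `tau` and the ASCII rendering `|-` of the assertion sign are hypothetical choices made for this example.

```python
def tau(wff):
    """Translate a wff of the quantum language into an assertive formula.

    Wffs are nested tuples: ("atom", "E"), ("notQ", a), ("and", a, b),
    ("orQ", a, b). The result is a string in which "|-" stands for the
    assertion sign and N, K, A are the pragmatic connectives.
    Assumption: the translation is applied recursively to subformulas.
    """
    op = wff[0]
    if op == "atom":
        return f"|- {wff[1]}(x)"
    if op == "notQ":
        return f"N {tau(wff[1])}"
    if op == "and":
        return f"({tau(wff[1])} K {tau(wff[2])})"
    if op == "orQ":
        return f"({tau(wff[1])} A {tau(wff[2])})"
    raise ValueError(f"unknown connective: {op}")


E, F = ("atom", "E"), ("atom", "F")
example = tau(("and", E, ("notQ", F)))
```

With this encoding, `example` is the string `(|- E(x) K N |- F(x))`: the conjunction becomes K, the quantum negation becomes N, and each atomic wff is prefixed by the assertion sign.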
Indeed, it is rather easy to show (we do not provide an explicit proof here for the sake of brevity) that the mapping $\tau$ preserves the preorder $\prec$ and the equivalence relation $\approx$ (in the sense that $\alpha(x) \prec \beta(x)$ iff $\tau(\alpha(x)) \prec \tau(\beta(x))$, and $\alpha(x) \approx \beta(x)$ iff $\tau(\alpha(x)) \approx \tau(\beta(x))$). Moreover, the wff $\alpha(x) \in \phi_{TQ}(x)$ is Q-true iff the af $\tau(\alpha(x)) \in \psi^{Q}_{AD}$ is justified, which translates a semantic concept (Q-true) defined on the language $\mathcal{L}_{TQ}(x)$ into a pragmatic concept (justified) defined on the pragmatic language $\mathcal{L}^{P}_{QD}$.

Bearing in mind our comments at the end of Sec. 6, we can summarize these results by saying that QL can be interpreted as a theory of the notion of testability in QM from a semantic viewpoint, and as a theory of the notion of empirical justification in QM from a pragmatic viewpoint. The two interpretations can be connected, via the mapping $\tau$, in such a way that Q-true transforms into justified, which is intuitively satisfactory.

9. Physical propositions and possible worlds

The formal language $\mathcal{L}(x)$ introduced in Sec. 2 is exceedingly simple from a syntactical viewpoint, even if it is very useful in order to illustrate what physicists actually do when dealing with QL. Its syntactical simplicity has forced us, however, to set up a somewhat complicated semantics, in which, in particular, states are formally treated as possible worlds of a Kripke-like semantics. A less intuitive but logically more satisfactory approach should provide an extended syntactical apparatus, simplifying the semantics. This could be done by enriching the alphabet of $\mathcal{L}(x)$ in two ways: (i) adding a universal quantifier (with standard semantics); (ii) adding the set of states as a new class of monadic predicates of $\mathcal{L}(x)$.

Let us comment briefly on these possible extensions of $\mathcal{L}(x)$. Firstly, let (i) only be introduced. Then, a family of individual propositions can be associated with the quantified wff $(\forall x)\alpha(x)$, and a proposition $p_{(\forall x)\alpha(x)} = \bigcap_{\rho} p^{\rho}_{\alpha(x)}$ can be associated with it. Hence, we get

$$p^{f}_{\alpha(x)} = p_{(\forall x)\alpha(x)},$$

which provides a satisfactory interpretation of the physical propositions introduced in Sec. 3 and of the related notion of certainly true.

Secondly, let us note that considering states as possible worlds is a common practice in QL,$^{3}$ but it does not fit well with the standard logical interpretation of possible worlds. In order to avoid this problem, one could introduce (ii), as one of us has done, together with other authors, in several papers.$^{17,18}$ In this case, states are not considered possible worlds, and propositions as defined in the present paper are not propositions in the standard logical sense (rather, an 'individual proposition' associated with a wff $\alpha(x)$ is the set of all states which make a sentence of the form $S(x) \to \alpha(x)$ true in a given interpretation of $x$, while a 'physical proposition' is a set of 'certainly yes' states which make a sentence of the form $(\forall x)(S(x) \to \alpha(x))$ true). We do not insist here on this more general scheme, and limit ourselves to observing that it is compatible with a standard Kripkean semantics, which can be enriched by introducing physical laboratories in order to characterize the truth mode of empirical physical laws in more detail and connect the notions of probability and frequency.$^{17,18}$ Yet, of course, an approach of this kind would make the interpretation of QL that we have discussed in this paper much less direct and straightforward.

References

1. M. Jammer, The Philosophy of Quantum Mechanics (Wiley, New York, 1974).
2. M. Rédei, Quantum Logic in Algebraic Approach (Kluwer, Dordrecht, 1998).
3. M. Dalla Chiara, R. Giuntini and R. Greechie, Reasoning in Quantum Theory (Kluwer, Dordrecht, 2004).
4. D. Aerts, in Quantum Physics and the Nature of Reality, D. Aerts and J. Pykacz eds. (Kluwer, Dordrecht, 1999).
5. B. C. van Fraassen, in The Logico-Algebraic Approach to Quantum Mechanics, Vol. I, C. A. Hooker ed. (Reidel, Dordrecht, 1975).
6. J. S. Bell, Physics 1, 195 (1964).
7. N. D. Mermin, Rev. Mod. Phys. 65, 803 (1993).
8. J. S. Bell, Rev. Mod. Phys. 38, 447 (1966).
9. S. Kochen and E. P. Specker, J. Math. Mech. 17, 59 (1967).
10. C. Garola and L. Solombrino, Found. Phys. 26, 1329 (1996b).
11. C. Garola, in Quantum Physics and the Nature of Reality, D. Aerts and J. Pykacz eds. (Kluwer, Dordrecht, 1999).
12. C. Garola, Found. Phys. 30, 1539 (2000).
13. C. Garola, Found. Phys. 32, 1597 (2002).
14. C. Garola, Found. Phys. Lett. 16, 599 (2003).
15. C. Garola and J. Pykacz, Found. Phys. 34, 449 (2004).
16. C. Garola, Int. J. Theor. Phys. 44, 807 (2005).
17. C. Garola, Int. J. Theor. Phys. 30, 1 (1991).
18. C. Garola and L. Solombrino, Found. Phys. 26, 1121 (1996a).
19. A. Tarski, in Semantics and the Philosophy of Language, L. Linsky ed. (Urbana, University of Illinois Press, 1944).
20. A. Tarski, in Logic, Semantics, Metamathematics, A. Tarski ed. (Oxford, Blackwell, 1956).
21. C. Garola, quant-ph/0507122 (2005).
22. G. W. Mackey, The Mathematical Foundations of Quantum Mechanics (Benjamin, New York, 1963).
23. C. Piron, Foundations of Quantum Physics (Benjamin, Reading, MA, 1976).
24. E. Beltrametti and G. Cassinelli, The Logic of Quantum Mechanics (Addison-Wesley, Reading, MA, 1981).
25. G. Birkhoff and J. von Neumann, Ann. Math. 37, 823 (1936).
26. C. Garola and S. Sozzo, Found. Phys. 34, 1249 (2004).
27. C. Garola, quant-ph/0510199 (2005).
THE ELECTROMAGNETIC CONCEPTION OF NATURE AND THE ORIGINS OF QUANTUM PHYSICS

ENRICO A. GIANNETTO

Department of 'Scienze della Persona', University of Bergamo, Piazzale S. Agostino 2, Bergamo 24129, Italy

The rise of quantum physics is analyzed by outlining the historical context in which different conceptions of Nature (mechanistic, thermodynamic and electromagnetic ones) were in competition to give a foundation to physics. In particular, the roots of quantum physics in the electromagnetic conception are shown, from Larmor's first attempts to Poincaré's and Heisenberg's new mechanics.
1. Introduction

1.1. Conceptions of Nature

As is well known, in the late XIXth century physics was no longer mechanics only, but also thermodynamics and electrodynamics. This new situation raised the problem of the very foundations of physics, and the correlated issue of the hierarchical relations among these different physical disciplines [1]. There were at least four different «fighting» conceptions of Nature.

The so-called Energetic conception of Nature looked at energy as the fundamental unifying concept of physics and had its most important proponents in Georg Helm (1851-1923) and Wilhelm Ostwald (1853-1932).

The Thermodynamic conception of Nature had energy, entropy and system as fundamental concepts and looked at thermodynamics as the real foundation block of physics. Its major exponents were Pierre Duhem (1861-1916) and Max Planck (1858-1947).

The Mechanical conception of Nature was the most conservative one, as it searched for a mechanical reduction of the other physical disciplines and of all the physical concepts in terms of mass, space and time by means of the models of material point and action-at-a-distance forces. Hermann von Helmholtz (1821-1894), Heinrich Hertz (1857-1894) and Ludwig Boltzmann (1844-1906) were the most representative scientists of this perspective.

The Electromagnetic conception of Nature, based on the concepts of field, energy and charge, looked at the theory of electromagnetism as the foundation level of the other physical disciplines. Among the physicists who gave the most relevant contributions to this perspective are Hendrik Antoon Lorentz (1853-1928), Joseph Larmor (1857-1942), Wilhelm Wien (1864-1928), Max Abraham (1875-1922) and Henri Poincaré (1854-1912).

The electromagnetic conception of Nature has deep roots in the history of mankind and was certainly developed through the elaboration of Brunian-Leibnizian physics and tradition. On one side, it was developed within German physics or Naturphilosophie, on the other side mainly within English physics. Electromagnetism had shown that physical reality was not only inertial and passive matter, but also a dynamical, active electromagnetic field, irreducible to a mechanical model of matter. Furthermore, Maxwell's equations admit vacuum solutions, that is, solutions in the absence of charged matter: the electromagnetic field exists even when there is no matter. Thus, the possibility of a new non-dualistic view of physical reality was considered: if matter cannot exist without the electromagnetic field and the electromagnetic field can exist without matter, the electromagnetic field could be the only physical reality and matter could be derived from the field.

1.2. Electromagnetic Conception of Nature and Relativity

Usually, the electromagnetic conception of Nature has been considered as superseded by the developments of XXth century physics. However, a deep historical inquiry shows that the electromagnetic conception of Nature is at the roots of both the relativistic and quantum transformations of physics. Concerning relativity, the 1900, 1902, 1904 and (5 June) 1905 papers written by Poincaré [2] show that special-relativistic dynamics derived from, and was a first realization of, the electromagnetic conception of Nature. Einstein's (30 June) 1905 paper was only an incomplete mechanistic version of this new dynamics.
This historical recognition is also fundamental to understanding the first reception of special relativistic dynamics in all countries, and in particular in Italy. A first complete presentation of this new dynamics appeared in the July 1905 paper written by Poincaré and published in 1906 [3]. In this paper the new dynamics was presented as invariant under the Lorentz-Poincaré transformation group; it was derived from Maxwell's theory of electromagnetism and also contained a theory of gravitation (absent in Einstein's 1905 paper). The starting point was the electromagnetic self-induction phenomenon related to the so-called radiation reaction. When a charged particle is subjected to the action of an electromagnetic field, it is accelerated and it radiates. This
radiation modifies the field and the new field modifies the acceleration of the particle, which again radiates, and so on. In this way, the electromagnetic field depends on all the time derivatives of position, up to infinite order. This means that there is also a contribution to the field force proportional to the acceleration, the coefficient of which involves an electromagnetic mass, that is, an electromagnetic contribution to the particle inertia. At this point, the question was: is it possible that mechanical (inertial and gravitational) mass is not a primitive concept and is indeed wholly due to this electromagnetic effect? Poincaré, among other scientists, realized that this was the case also for non-charged matter, as long as it is constituted by charged particles: that is, mechanical mass was nothing else than electromagnetic mass, and electromagnetic mass is not a static fixed quantity but depends on velocity. Mass is thus related to the electromagnetic field energy by the equation, well known today (but now considered from a mechanistic and not an electromagnetic perspective): $m = E_{\text{e.m. field}}/c^2$. If mass is nothing else than electromagnetic field energy, and charge can be defined, via Gauss' theorem, by the electric field flux through a certain surface in space, then matter can be completely understood in terms of the electromagnetic field, and it also has active and dynamical features beyond the passive and inertial ones. If mass must be understood in terms of the electromagnetic field, mechanics must be derived from electromagnetic theory, which becomes the fundamental theory of physics. If mass changes with velocity, Newtonian mechanics is no longer valid and must be modified. The new mechanics must have the same invariance group as electromagnetic theory, that is the Lorentz-Poincaré transformation group, to which a new relativity principle and a new gravitation theory (even gravitational mass changes with velocity) must also conform.

2. Electromagnetic Conception of Nature and Quantum Physics

The rise of quantum physics is conventionally related to the works of Planck during the years 1899-1900 [4]. However, Joseph Larmor, within an electromagnetic conception of Nature, had been working to understand the atomic structure of matter in terms of the electromagnetic field at least since 1893 [5]. After abandoning the idea of a "vortex atom", he considered the electrons as vortices in the sea of the electromagnetic field: this idea led him to what, many years later, was called a "quantum atom". Electrons as rotations in the electromagnetic field constitute stable, stationary, non-radiating configurations of atoms: these configurations correspond to given discrete values of the conserved
angular momentum. Radiation is emitted or absorbed by atoms in impulses only when these configurations change with respect to the minimal total energy. Thus, emission of radiation and loss of energy were not related to the absolute translations of the electron as an accelerated, charged material particle, but to the relative changes (within the atoms) of the inertial rotational motions constituting electrons (in any stable state the change of velocity in a period is zero). This idea furnished an explanation of atomic spectra and even a prediction of the Zeeman effect. This electromagnetic conception of the atomic structure of matter, that is, the recognition of these atomic matter structures within the electromagnetic field, would also be, as Larmor understood, the key to the calculation of specific heats in terms of internal energy and equipartition of energy within the kinetic theory of gases. Planck wanted to show the universality of thermodynamics and of its second principle by showing that it holds also for electromagnetic phenomena. Planck was forced to use Boltzmann's statistical thermodynamic concept of entropy, but showed that thermodynamics cannot be reduced to mechanics, because heat is not only disordered motion of matter but also electromagnetic radiation, and that thermodynamics could be deduced from electromagnetic theory too. In 1900 Planck introduced discrete values of energy as a heuristic tool within the statistical thermodynamics of radiation, in order to fit the experimental data on the black-body radiation distribution. That is, energy was treated by Planck not as a continuous mathematical variable, but as a discrete one: $E = nh\nu$, where $n$ is an integer, so that energy is given by an integer multiple of the product of the radiation frequency $\nu$ and a universal constant $h = 6.55 \times 10^{-27}$ erg·s having the physical dimension of an action.
Planck's words made reference to "energy elements" (Energieelemente), but Planck did not want to introduce an essential discontinuity within Nature, only to solve the problem of fitting the experimental data by the mathematical artifact of discreteness: he did not want to modify classical physics or to make a revolution. In 1899 Planck had already introduced this constant, naming it "b" and not "h"; it did not denote an action, and it was a constant in the different theoretical context of finding an absolute system of natural units of measure. The first actual physical meaning of this constant was given not by Einstein, but by Larmor in 1902 within his electromagnetic conception of Nature [6]. According to Larmor, Planck's constant was not related to a mathematical artifact but had to be interpreted in terms of the relationship between matter and the (ether) electromagnetic field, that is, as the ratio between matter energy (given by electromagnetic field energy) and radiation frequency. Planck's constant, for
Larmor, was a quantum of the conserved angular momentum, to be related to atomic electrons considered as vortices within the electromagnetic field. Larmor also proposed to abandon the abstract oscillator model of matter used by Planck and to take account of the actual electromagnetic nature and origin of matter. This implied using the simple idea of 'elementary receptacles of energy', that is, of cells in the phase space of physical systems. This idea was deduced from the consideration of the nature of radiation, constituted by discrete elements given by short trains of simple undulations. The phase space reformulation of Planck's problem led to the discreteness of the atomic conserved angular momentum, from which the discreteness of energy was deduced. J. W. Nicholson in 1912 [7] explored this explanation of the atomic structure, and his work was the starting point of Niels Bohr's model. From Larmor's perspective, that of the electromagnetic conception of Nature, the discrete, discontinuous, quantum nature of matter and radiation is easily understood, because matter is derived from the fundamental physical reality given by the electromagnetic field. Thus, the electromagnetic field must present wave but also corpuscular aspects in order to explain the origin of matter, and matter particles must present corpuscular but also wave aspects, as long as they derive from the electromagnetic field. Bohr [8] reconsidered Nicholson's model but completely changed its meaning: the atom was no longer understood in terms of the electromagnetic conception of Nature but in terms of an axiomatic approach, in which the meaning of Planck's constant is no longer given by the electromagnetic nature of the atomic structure of matter but by an abstract quantum of mechanical action.
Bohr followed Arnold Sommerfeld's perspective [9], which presumed to understand everything in terms of an a priori assumed and unexplained constant, that is Planck's constant: electromagnetic as well as thermodynamic and mechanical models were considered to be no longer suitable, because electromagnetic field theory as well as thermodynamics and mechanics had to be reformulated in order to fit experiments and to overcome the problem of their incompatibility. However, Sommerfeld and Bohr seem not to have understood that their interpretation of Planck's constant was mechanical, and that this put mechanics at the fundamental level of physics, restating a new mechanistic perspective. Something similar happened with the procedure of axiomatization which led to the loss of the electromagnetic meaning of the light velocity constant c in the mechanistic version of relativistic dynamics given by Einstein. The meaning variance of a revolutionary item (c as well as h), together with the change in its "title" ("Universal Constant"), is a well-known process which leads to a restoration, to a
dogma to be understood "mechanically" and to a myth of the foundations of a new religion as well as of a new scientific theory. From Larmor's perspective, Planck's statistical thermodynamics of electromagnetism implied that the continuous variables of classical electromagnetism lose meaning and cannot be precisely determined, but only probabilistically, precisely in order to derive matter corpuscles from the electromagnetic field. In 1905-1906 Einstein [10], just as he had done with Poincaré's new electromagnetic relativistic dynamics, noted, in criticizing Planck, the discontinuous and probabilistic character of radiation, but inverted Larmor's perspective and introduced the quanta of light to reduce electromagnetism (as a statistical theory) to corpuscular mechanics. In 1911-1912, from an electromagnetic conception of Nature, Poincaré [11] showed that these new characters of light and of the electromagnetic field cannot be understood in terms of the old corpuscular mechanics and that, on the contrary, these changes within electromagnetic theory imply a new mechanics. Indeed, if mechanics has to be built on electromagnetism and electromagnetism must be changed, then mechanics must also be modified: there must be a new "electromagnetic dynamics". From this perspective, electromagnetism cannot be reduced to mechanics but, on the contrary, mechanics must be modified again, and in a more radical way, by the relativistic electromagnetic dynamics: mechanics must be intrinsically probabilistic even for a single material particle, because the origin of matter is electromagnetic and electromagnetic radiation is discontinuous. Poincaré's new discontinuous electromagnetic mechanics, based on a discontinuous electromagnetic action, was mathematically very difficult for the other physicists and was not understood at all: it was the first form of a new revolutionary "electromagnetic quantum mechanics".
Only many years later, in 1925, did Heisenberg [12] state the necessity of, and lay the basis for, a new quantum mechanics: his starting point was not the electromagnetic conception of Nature, but an operational perspective. Heisenberg showed that at the atomic or microphysical level the only measurable variables were the electromagnetic variables of frequency and intensity of the electromagnetic radiation absorbed or emitted by electrons within atoms. From this point of view, mechanical variables, as long as they are not directly measurable and cannot be objects of absolute experimentation, intuition or visualization at the atomic microphysical level, must be redefined in terms of such measurable electromagnetic variables. This implied, as Heisenberg himself then stated in 1927 [13], a fundamental indeterminacy of mechanical variables. If physical reality is only what can be experimentally measured, from
Heisenberg's perspective the electromagnetic conception of Nature can be deduced without any aprioristic assumption. Its deduction follows merely from the requirement of an operational definition of physical variables at the microscopic level. Unfortunately, this original derivation and foundation of quantum mechanics has been completely forgotten and removed. It was for ideological reasons that mechanics had to be kept independent from electromagnetism and at the foundation level of the physical sciences. This priority of mechanics is related to the mechanistic conception of Nature. Considering Nature and the other nonhuman living beings as machines, that is as inert and passive matter, is the precondition for avoiding any ethical problem with respect to Nature and the other nonhuman living beings, and for the complete violent dominion over, and exploitation of, Nature and the other living beings.

References

1. E. Giannetto, Saggi di storie del pensiero scientifico (Sestante, Bergamo, 2005).
2. H. Poincaré, Revue de Métaphysique et de Morale 6, 1 (1898); H. Poincaré, Arch. Néerl. 5, 252 (1900); H. Poincaré, La Science et l'Hypothèse (Flammarion, Paris, 1902); H. Poincaré, Bulletin des Sciences Mathématiques 28, 302 (1904); H. Poincaré, Comptes Rendus de l'Académie des Sciences 140, 1504 (1905).
3. H. Poincaré, Rendiconti del Circolo Matematico di Palermo 21, 129 (1906).
4. M. Jammer, The Conceptual Development of Quantum Mechanics (McGraw-Hill, New York, 1966); M. Planck, Berliner Berichte 18 (May), 440 (1899); M. Planck, Verhandlungen der Deutschen Physikalischen Gesellschaft 2 (14 December), 237 (1900), Engl. transl. in The Old Quantum Theory, D. ter Haar ed. (Pergamon Press, Oxford, 1967).
5. J. Larmor, part I abstract, in Proc. Roy. Soc. 54, 438 (1893); part I, in Phil. Trans. Roy. Soc. 185, 719 (1894); part II abstract, in Proc. Roy. Soc. 58, 222 (1895); part II, in Phil. Trans. Roy. Soc. 186, 695 (1895); part III abstract, in Proc. Roy. Soc. 61, 272 (1897); part III, in Phil. Trans. Roy. Soc. A190, 205 (1897); J. Larmor, Phil. Mag. (5) 44, 503 (1897); J. Larmor, Aether and Matter (Cambridge University Press, Cambridge, 1900); B. Giusti Doran, "Origins and Consolidation of Field Theory in Nineteenth-Century Britain: From the Mechanical to the Electromagnetic View of Nature", Historical Studies in the Physical Sciences 6 (Princeton University Press, Princeton, 1975).
6. J. Larmor, "Theory of Radiation", Encyclopedia Britannica 8 (vol. XXXII of the complete work), 120 (1902), Black, London; J. Larmor, Reports Brit. Assoc. Adv. Sci. 1902, 546 (1903) (abstract of a paper presented at the Belfast meeting); J. Larmor, Proc. Roy. Soc. London A83, 82 (1909); J. Larmor, Preface (1911) to The Scientific Papers of S. B. McLaren (Cambridge University Press, Cambridge, 1925).
7. J. W. Nicholson, Monthly Notices of the Royal Astronomical Society 72, 49, 139, 677, 693, 729 (1912).
8. N. Bohr, Phil. Mag. 26, 1, 476, 857 (1913).
9. A. Sommerfeld, Physikalische Zeitschrift 12, 1057 (1911).
10. A. Einstein, Annalen der Physik 17, 132 (1905); A. Einstein, Annalen der Physik 20, 199 (1906).
11. H. Poincaré, Comptes Rendus de l'Académie des Sciences 153, 1163 (1911); H. Poincaré, Journal de Physique théorique et appliquée s. 5, t. 2, 5 (1912); H. Poincaré, Revue scientifique s. 4, t. 17, 225 (1912); R. Dugas, Histoire de la mécanique (Griffon, Neuchâtel, 1955), Engl. transl. by J. R. Maddox, A History of Mechanics (Dover, New York, 1988).
12. W. Heisenberg, Zeitschrift für Physik 33, 879 (1925); M. Born, W. Heisenberg and P. Jordan, Zeitschrift für Physik 35, 557 (1926).
13. W. Heisenberg, Zeitschrift für Physik 43, 172 (1927).
WHAT WE TALK ABOUT WHEN WE TALK ABOUT UNIVERSE COMPUTABILITY

SALVATORE GUCCIONE
Istituto di Fisica Teorica dell'Università di Napoli, Mostra d'Oltremare, pad. 20, Napoli 80125
Email: [email protected]
lost in time, lost in space, and in meaning. (The Rocky Horror Picture Show)
In the present work we will not follow the road of searching for a general definition of Computable Universe, but rather we will limit ourselves to advance a modest proposal regarding some adequate minimal conditions for the definition of Computable Universe. (In the present work we will have to do with only one Universe. In other words we will not treat Computability in parallel Universes).
1. Section 1

A. It is possible to propose a thesis according to which the physical universe is viewed as a (the?) computer: ".... no time, no space, and no law. The building element is the elementary 'yes, no' quantum phenomenon. It is an abstract entity. It is not localized in space and time." ([1], p. 570; but see also [2], [3], [4]).

B. It is possible to propose a weaker thesis, according to which physical processes can be viewed as computations (see, e.g., [5]). Thesis B is obviously weaker than thesis A: if we consider thesis A valid, then thesis B is also valid; whereas, if we consider thesis B valid, the validity of thesis A does not follow.

C. It is possible to propose a third thesis, according to which all physical theories, as such, are computable.

D. It is possible, finally, to propose a thesis according to which all present-day physical theories are computable.

It should be observed, en passant, that theses B, C, D do not seem to contain the strange assertions "no laws" (not even, so to say, computational laws?), "no time" and "no space" (anyway, what about meaning?).
It is clear that here "computability" means "effective computability", or - for those who accept the Turing-Church thesis - "calculability using a Turing machine". More generally, having to do with physical processes, and usually with physical theories (with the exception of the "no law" in [1]), we should ask: what do we mean when we talk about a computable physical theory? In my opinion, the definition proposed in 1974 by Kreisel [6] remains the most adequate definition of a computable physical theory, notwithstanding the fact that it is a feeble thesis, as observed in [7], where it was proposed to reinforce it by adding a requirement - epistemological in character - christened with the expression "uniformity condition" ([7], p. 161). The computability or non-computability of present-day physical theories (or also of physical theories as such) has been discussed by various authors: see, e.g., [8], [9], [10], [11], [12], and above all [13], [14], [15], and [16]; in fact, these last four works report results of great interest in favour of the thesis that it is possible to find elements of non-computability in Quantum Mechanics. And now an observation: when referring to physical theories, we usually speak of real numbers. For example, following Einstein in the second appendix of his "The Meaning of Relativity", we see that a continuum with a finite number of dimensions is necessary both in Newtonian mechanics and in Relativity. It is possible to assert that the continuum is at the basis of almost all theories of present-day physics. This is the reason why we talk of real numbers. It should be observed that this contrasts with the immaterial vision of Wheeler [1], which, however, will not be analysed here. (It is worthwhile to note, however, that partisans of the so-called strong hypothesis of artificial intelligence never sought, to my knowledge [17], to merge their strong hypothesis with the extra-strong vision of Wheeler).
At this point it is useful to remember that Turing-computability has to do with natural numbers. It therefore appears necessary to proceed from Turing machines to effective computational procedures on real numbers: see, e.g., the definition given by Grzegorczyk in terms of recursive functionals [18], [19], or the equivalent definition by Pour-El and Richards [20], or the notion of computability on real numbers by L. Blum et al. [21].
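As a minimal illustration of what an effective procedure on a real number can look like, the following sketch (not taken from any of the works cited) approximates the square root of 2 by rationals using only exact integer arithmetic. The function name `sqrt2_approx` and the choice of this particular real number are assumptions made purely for the example; the point is only that a single fixed program, given a precision parameter n, returns a rational within 2^-n of the target.

```python
from fractions import Fraction
from math import isqrt


def sqrt2_approx(n: int) -> Fraction:
    """Return a rational q with 0 <= sqrt(2) - q < 2**-n.

    Hypothetical helper for illustration: one fixed recipe serves
    every requested precision n; only the running time grows.
    """
    if n < 0:
        raise ValueError("precision exponent n must be non-negative")
    scale = 1 << n  # 2**n, computed exactly
    # isqrt(2 * scale**2) equals floor(sqrt(2) * 2**n), with no rounding error
    return Fraction(isqrt(2 * scale * scale), scale)


if __name__ == "__main__":
    for n in (1, 4, 16):
        print(n, sqrt2_approx(n))
```

Asking for more precision changes only the argument n, never the program itself, which is the feature that separates this kind of approximation from one that needs a new procedure at each level of accuracy.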
2. Section 2

In the previous section we proceeded from the question "is the Universe computable?" to questions - perhaps more precise from an epistemological point of view - regarding the computability of single physical theories. We should now first underline that, in the light of our present knowledge, nothing seems to forbid that, of two theories treating the same range of phenomena, one is computable (according to the definition previously given) whereas the other is not. Geroch and Hartle, for example, in the work previously quoted, distinguish the formulation of a theory from its possible implementation through an algorithm (computability); in addition they identify, within a specific approach to Quantum Gravity, a counterexample to the hypothesis (generally implicitly assumed by the community of physicists) of the computability of every physical theory. Geroch and Hartle, however, warn us that "one mathematical formulation of the theory may provide no algorithm for implementing the theory and yet another formulation does" ([16], p. 348). Note that in the previous assertion Geroch and Hartle clearly appear to accept Quine's doctrine of the under-determination of theories by data: "the doctrine that natural science is empirically under-determined; under-determined not just by past observations but by all observable events" ([22], p. 313). Note that, as underlined by Quine in the same work, the doctrine of under-determination of theories should not be confused with, e.g., the so-called Duhem-Quine thesis, regarding which see also, e.g., [23]. At this point it appears necessary to introduce the notion of completeness of a physical theory, for at least two reasons. The first reason is quite obvious: it can be expressed by saying that physical science, by its own nature, tends to elaborate complete theories, even if this assertion is contested by some epistemological currents, anti-realist in character, more interested in the so-called sociology of Science than in the logical structure of Science. The second reason is that the Quinian doctrine of the under-determination of scientific theories by data shows all its force when referred to complete theories. In fact, it could be argued that non-complete theories could be completed in such a way that they would not be empirically under-determined. The relationship between completability and under-determination thus appears to be a key point of great epistemological interest. In addition, in the case treated here, a further element of interest is the addition of computability to this key point.
The number of questions is therefore increased: one could, for example, ask whether theories demonstrated to be incomplete regarding the same range of phenomena, even if all computable and completely covering the designated phenomenological field, are reciprocally computable. In other words, the problem of the computability of the, so to say, reciprocal junctions of various theories would arise (see, e.g., [24]). But what do we mean here by completeness of a scientific theory? The notion of completeness in science has generated many discussions (see, e.g., [25]). We recall, for example, that Einstein considered both quantum mechanics and the relativistic theory of gravitation to be incomplete theories. The first is incomplete as it does not fulfill the condition that "every element of the physical reality must have a counterpart in the physical theory" ([26], p. 777), a condition which, in a way, is close to the condition of logical completeness à la Tarski; whereas the second is incomplete as it does not fulfill the condition that "every field theory must be constructed only by means of the primitive notion of field" ([27], p. 76). (For a detailed analysis of Einstein's position on this question, see, e.g., [28], [29], [30]). In any case it is important to underline that, no matter which reasonable definition of completeness for a scientific theory we consider, its possible non-computability (i.e. the non-computability of its mathematical apparatus) leads to a situation of undecidability of the theory, via Goedel's first incompleteness theorem [31]. (Note that Geroch and Hartle, for example, have demonstrated the non-computable case previously mentioned via a theorem of undecidability due to Hanken [32]). This, given any physically reasonable semantics, could lead to the incompleteness of the theory with respect to that semantics.
We say "could lead" and not "leads" as Goedel's theorem, strictly speaking, deals with axiomatic and consistent theories.

3. Section 3

Considering what has been said in the previous sections, a definition of the computability of the Universe in terms of the computability of physical theories which are complete and not reciprocally contradictory should face at least the following two orders of problems: a) problems related to theories which have been demonstrated to be non-computable; b) the problems previously characterized here with the expression "junction's computability". Essentially, we are concerned with the availability of "a unified theory of everything", i.e. a theory unifying Gravitation and all the Nuclear Forces,
Electromagnetism and Quantum Mechanics. In addition, such a theory of everything should turn out to be computable. In any case, it is well known that we do not possess such a theory (computable or non-computable), even if today promising candidates for such a role (independently of the problem of computability) are the so-called superstring theories. We will not follow here the way of searching for a general definition of Computable Universe, but will limit ourselves to following a more modest road, with the aim of proposing a possible definition of the minimal computability of the Universe. To this aim, a short digression will be useful on the notion of physical constants and on some recent works which question the very constancy of the so-called physical constants. First let us recall the words of Levy-Leblond: "In most formulae of physics or, more generally, in most theoretical analyses of any physical phenomenon, there appears one or more physical constants. Some of these play an essential and pervasive role in physics. They are variously called general or fundamental or universal physical constants" ([33], p. 87). Among these constants we recall, as examples, the velocity of light, designated as c, the gravitational constant, designated as G, Planck's constant, designated as h, the fine structure constant, designated as α, and the charge of the electron, designated as e. The cosmological constant, generally designated as Λ, has some peculiar aspects. It was introduced by Einstein in 1917 [34], after his attempt to apply his formulation of general relativity to the Universe as a whole. His philosophical guideline was that the Universe is static, and the introduction in his equations of the cosmological constant (which is not in contradiction with, but is not generated within, the mathematical structure) allowed a static Universe.
(As regards the cosmological constant, we will limit ourselves to recalling chapter V of the already quoted book by Barrow [9], with its rich bibliography relating not only to the cosmological constant but to physical constants in general, as well as the fundamental article by Weinberg [35] and the more recent work by Krauss and Turner [36]). In the present note we will consider only the more traditional physical constants and face two questions. First question: are the values of these constants measurable numbers according to the definition of Geroch and Hartle [16]? We recall that the notion of measurable number according to Geroch and Hartle is the following: "Regard number w as measurable if exists a finite set of
instructions for performing an experiment such that a technician given an abundance of unprepared raw material and an allowed error ε is able by following those instructions to perform the experiment yielding ultimately a rational number within ε of w." ([16], p. 542). Obviously, the instructions depend on ε, and prediction within a progressively decreasing error would require ever new materials, new instructions, new ideas (see [16], p. 549 and [12]). Second question: are the values of such constants computable real numbers? Here, to characterize a number as a computable real, we will follow the definition: "Roughly speaking a computable real is one which can be effectively approximated to any desired degree of precision by a computer program given in advance. Thus a number π is computable since there exist finite recipes for computing it. When more precision is desired the computation may take longer, but the recipe itself does not change" ([20], p. 13). We note first that, whereas for computable numbers an increasingly accurate rational approximation to a real number can be obtained using the same recipe, only at the expense of a longer computation, for measurable numbers an increasingly accurate approximation may require new recipes. Geroch and Hartle present examples of physical constants which are, so to speak, traditional and are measurable: for example, the fine structure constant α is measurable (see, e.g., [37]). In any case, strictly speaking, it cannot be taken for granted that Geroch and Hartle measurability (G-H measurability) is a property of all physical constants. Let us consider the example of the velocity of light, which is particularly meaningful. In general, c is used to indicate the one-way velocity of light, whereas we can only measure the velocity of light along a closed path. The constancy of the one-way velocity is directly deduced from two axioms of Einstein's Special Relativity [38].
Then we attribute to the one-way velocity of light the same numerical value (this one measurable!) as to the velocity of light along a closed path (this is the reason why conventionalism is mentioned in Special Relativity: the Reichenbach and Grünbaum thesis [39]; for other points of view see, e.g., [40], [41], [42]). Geroch and Hartle ([16], p. 544) assert that any computable number is measurable, and this seems acceptable (even though the example of the velocity of light invites caution). But the converse is certainly not true, as they have previously demonstrated that the set of computable numbers does not coincide with the set of measurable numbers. The information that we have on the computability or non-computability of physical constants is almost nil (even considering only those constants which, following Barrow, we have called
traditional). From the point of view of these notes this is an unpleasant gap. In fact, we suggest here that the following definition be taken seriously into consideration. Definition 1. We will call computable in the feeble sense a physical theory for which the values of all its constants are computable real numbers. Then a theory of everything computable in the feeble sense could, with some precision, tell us something about the computability of the Universe. Obviously it is not contradictory to envisage a theory completely devoid of constants. Such a theory would perhaps have pleased Einstein, as, in addition to removing the disturbing field/particle dualism, it would radically reduce everything to the shape of the field equations. However, at present, no theoretical approach seems to lead to such a theory. We will therefore, for the moment, set this point aside in order to face another problem: what if physical constants were not really constant?

4. Section 4

The so-called fine structure constant α represents, roughly speaking, a measure of the strength of the electromagnetic interaction between photons and electrons: its expression as a function of other physical constants is $\alpha = e^2 / (hc/2\pi)$, where e represents the electron charge, h Planck's constant and c the velocity of light. Well, there is experimental evidence that α increases, although slowly, with the cosmic time t ([43], [44], [45]). We said that α would grow slowly: in fact, according to experimental results, in the last 6-10 billion years there would have been a variation $\Delta\alpha/\alpha = (-0.72 \pm 0.18) \times 10^{-5}$. A behaviour, however, that would be decidedly in bad taste for a constant. It has been discussed whether this variation depends exclusively on one of the constants e, h, c (see, e.
g., [46], [47]), and a test has been suggested (on the basis of the thermodynamics of black holes [48]) which would make it possible to check the variability of the velocity of light $c$ (but the problem of whether the one-way value of the velocity of light is conventional would remain unresolved). Now a short digression on the argument based on the thermodynamics of black holes [48]. At the 17th International Conference on General Relativity and Gravitation (GR17), held in Dublin on July 18-24, 2004, Hawking gave a lecture on his new calculations regarding black hole information loss.
In his controversial lecture (controversial mainly for its use of the mathematical technique known as the Euclidean path integral method in place of the "more straightforward Lorentzian approach to gravity" [49]; on this subject I would advise reading the stimulating paper by R. L. Oldershaw, "The new Physics - Physical or Mathematical Science?" [50]), Hawking asserts that, in contrast with his own statements during the last 30 years, black holes do not destroy information. This is the text of the press release at GR17 for physicists and reporters:

"One of the most intriguing problems in theoretical physics has been solved by Professor Stephen Hawking of the University of Cambridge. He presented his findings at GR17, an International Conference in Dublin on Wednesday 21 July. Black holes are often thought of as being regions of space into which matter and energy can fall and disappear forever. In 1974 Stephen Hawking discovered that when one fused the ideas of quantum mechanics with those of general relativity it was no longer true that black holes were completely black. They emitted radiation now known as Hawking radiation. This radiation carried energy away from the black hole which meant that the black hole would gradually shrink and then disappear in a final explosive outburst. These ideas led to a fundamental difficulty, the information paradox, the resolution of which is to be revealed in Dublin. The basic problem is that black holes, as well as eating matter, also appear to eat quantum mechanical information. Yet the most fundamental laws of physics demand that this information be preserved as the universe evolves. The information paradox was explored and formalised by Hawking in 1975. Since then, many have tried to find a solution. Whilst most physicists think that there must be a resolution of the paradox, nobody has really produced a believable explanation.
In fact, seven years ago the issue prompted Hawking, together with Kip Thorne of Caltech, to make a wager against John Preskill, also of Caltech, that the information swallowed by black holes could never be recovered. On Wednesday, Hawking conceded that he has lost the bet. The way his new calculations work is to show that the event horizon, which is the surface of the black hole, has quantum fluctuations in it. These are the same uncertainties in position that were made famous by Heisenberg's uncertainty principle and are central to quantum mechanics. The fluctuations gradually allow all the information inside the black hole to leak out, thus allowing us to form a consistent picture. The information paradox is now unravelled. A complete description of this work will be published in professional journals and on the web in due course."
At the time of the present meeting (Cesena, 4-9 October 2004) comments on the Hawking talk are already available ([49], [51]-[53]; but see also the interesting paper by A. N. St. J. Farley and P. D. D'Eath [54]).

Going back to our topic, we now face functions which are not constant but take different values as $t$ varies. Now a first question: could the values in cosmic time $t$ of those pseudoconstants which would be slowly variable constants (CLV), such as the fine-structure constant $\alpha$, be measurable numbers according to Geroch and Hartle (G-H measurable)? In any case we give the following definitions:

Definition 2. A CLV t-function is said to be G-H measurable if each of its values is G-H measurable.

Definition 3. A CLV t-function is said to be G-H computable if each of its G-H measurable values is computable.

Of course in Definition 3 we use the term computable in the sense of Pour-El and Richards [17]. In addition, following Geroch and Hartle ([16], p. 544), we will hold that any computable number is a G-H measurable number. We could then say that:

Definition 4. A theory is said to be G-H computable in the feeble sense if all its CLV t-functions are G-H computable.

We thus reach a conclusion that in many respects resembles Kreisel's definition [6] but is weaker, in that it refers only to the fundamental constants present in a theory and not to "any real number which is well defined (observable) according to the theory" ([6], p. 11). On the other hand, this minimal definition of a computable physical theory (minimal in the sense that it is hardly possible to ask less of a definition of computability for a scientific theory) is sufficiently broad to encompass physical constants which in reality turn out to be slowly variable (in this paper we will not discuss the notion of slowness of variation). Finally, our Definition 4 is, like Kreisel's definition in [6], of a non-uniform nature (following the ideas of Kalmar [55]).
In fact we have said that: "Given a theory T for each fundamental constant of the theory there exists an effective procedure (in the sense of [10]) such that ".
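The statement just quoted has a non-uniform, "for each ... there exists" form; a uniform strengthening would exchange the two quantifiers. Schematically, and only as an illustration: the symbols $C_T$ (the set of fundamental constants of $T$), $p$ (an effective procedure) and $\Phi(p, c)$ (the condition left unstated in the quoted sentence) are placeholders introduced here, not notation taken from [6] or [7].

```latex
% Non-uniform form: the procedure p may depend on the constant c
\forall c \in C_T \;\; \exists p \;\; \Phi(p, c)

% Uniform form: a single procedure p serves every constant c
\exists p \;\; \forall c \in C_T \;\; \Phi(p, c)
```

The uniform form implies the non-uniform one, but not conversely, which is why the first is the weaker requirement.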
To strengthen the definition we could ask (in accordance with [7]) that: "given a theory T there is an effective procedure (in the sense of [20]) such that for each fundamental constant of the theory we have ". As you can see, it is once again a question of permuting logical quantifiers.

References

1. J. A. Wheeler, International Journal of Theoretical Physics 21, 557 (1982).
2. S. Lloyd, Nature 406, 1047 (2000).
3. Y. J. Ng, Physical Review Letters 86, 2946 (2001).
4. S. Lloyd, Physical Review Letters 88, 2946 (2002).
5. S. Wolfram, Physical Review Letters 54, 735 (1985).
6. G. Kreisel, Synthese 29, 11 (1974).
7. S. Guccione, G. Tamburrini and S. Termini, AGORA' 17, 159 (1998).
8. R. Penrose, The Emperor's New Mind (Oxford U. P., 1989).
9. J. D. Barrow, Theories of Everything (Oxford U. P., 1989).
10. J. Earman, A Primer on Determinism (D. Reidel Publ., 1986).
11. S. Guccione, 10th International Congress of Logic Methodology and Philosophy of Science, Abstracts, 532 (1995).
12. S. Guccione, in The Foundation of Quantum Mechanics: Historical Analysis and Open Questions, C. Garola and A. Rossi Eds. (World Scientific, 1999).
13. P. Benioff, J. Math. Phys. 11, 2253 (1970).
14. P. Benioff, J. Math. Phys. 12, 360 (1971).
15. A. B. Komar, Phys. Rev. B133, 542 (1964).
16. R. Geroch and J. B. Hartle, Foundations of Physics 20, 533 (1986).
17. S. Guccione, EPISTEMOLOGIA, (2004, in press).
18. A. Grzegorczyk, Fundamenta Mathematicae 42, 168 (1955).
19. A. Grzegorczyk, Fundamenta Mathematicae 44, 61 (1957).
20. M. Pour-El and I. Richards, Computability in Analysis and Physics (Springer Verlag, 1989).
21. L. Blum, in Lectures in Complex Systems, E. Jen Ed. (Addison Wesley, 1990).
22. W. V. O. Quine, Erkenntnis 9, 313 (1975).
23. A. Derecin and S. Guccione, EPISTEMOLOGIA VIII, 77 (1985).
24. S. Guccione, Boston Studies in the Philosophy of Science 47, 237 (1981).
25. R. Schlegel, Completeness in Science (Appleton Century-Croft, 1967).
26. A. Einstein, B. Podolsky and N. Rosen, Phys. Rev. 47, 777 (1935).
27. A. Einstein and N. Rosen, Phys. Rev. 48, 73 (1935).
28. R. de Ritis and S. Guccione, Fundamenta Scientiae 5, 103 (1984).
29. R. de Ritis and S. Guccione, Fundamenta Scientiae 8, 383 (1987).
30. R. de Ritis and S. Guccione, EPISTEMOLOGIA XVI, 97 (1993).
31. K. Goedel, Monatsh. Math. Phys. 38, 173 (1931).
32. W. Haken, in Word Problems, W. W. Boone, F. B. Cannonito and R. C. Lyndon Eds. (North Holland, 1973).
33. J. M. Levy-Leblond, Il Nuovo Cimento 7, 187 (1977).
34. A. Einstein, Sitzungsber. Preuss. Akad. Wiss. Phys. Math. Kl., 142 (1917).
35. S. Weinberg, Rev. Mod. Phys. 61, 1 (1989).
36. L. M. Krauss and M. S. Turner, GRG 27, 1137 (1995).
37. B. N. Taylor, W. H. Parker and D. N. Langenberg, Rev. Mod. Phys. 8, 375 (1969).
38. A. Einstein, Ann. Phys. Leipzig 17, 891 (1905).
39. H. Reichenbach, The Philosophy of Space and Time (Dover, 1956).
40. D. Malament, Nous 11, 293 (1977).
41. R. de Ritis and S. Guccione, GRG 17, 596 (1985).
42. R. de Ritis and S. Guccione, Fundamenta Scientiae 8, 57 (1987).
43. J. K. Webb et al., Phys. Rev. Letters 82, 884 (1999).
44. A. Songaila and L. L. Cowie, Nature 398, 667 (1999).
45. J. K. Webb et al., Phys. Rev. Letters 87, 1 (2001).
46. J. Barrow and J. Magueijo, Phys. Letters B443, 104 (1998).
47. S. K. Lamoreaux, Nature 416, 803 (2002).
48. P. C. W. Davies et al., Nature 418, 602 (2002).
49. C. Seife, Science 305, 586 (2004).
50. R. L. Oldershaw, Am. J. Phys. 56, 1075 (1988).
51. P. Rodger, PHYSICS WEB, (22 July 2004).
52. J. Baez, http://math.ucr.edu/home/baez/week207.html, (25 July 2004).
53. C. Seife, Science 305, 934 (2004).
54. A. N. St. J. Farley and P. D. D'Eath, arXiv:gr-qc/0407086v1, Univ. of Cambridge, U.K., (23 July 2004).
55. L. Kalmar, in Constructivity in Mathematics, Heyting Ed., 72 (1979).
BOHM AND BOHMIAN MECHANICS

GIANLUCA INTROZZI
Dipartimento di Fisica Nucleare e Teorica, Università di Pavia, Via Bassi 6, 27100 Pavia, Italy

MARCO ROSSETTI
Dipartimento di Fisica Nucleare e Teorica, Università di Pavia, Via Bassi 6, 27100 Pavia, Italy
The standard, or Copenhagen, formulation of quantum mechanics postulates that the complete specification of a quantum state is given by the corresponding state vector (completeness). A different approach is possible, assuming instead the incompleteness of the theory. Additional parameters, called "hidden variables" since they are not empirically known, would be needed to completely characterize the quantum state. The knowledge of these hidden variables would allow the precise determination of the values of the observables of the quantum system. In 1952 David Bohm, starting from such an assumption, proposed a hidden variables formulation of quantum mechanics that is empirically equivalent to standard quantum mechanics, but offers a more rational and coherent picture of reality. Bohm's model supplements ordinary quantum theory by introducing particle coordinates as hidden variables. Therefore, particles are distinguishable and describe trajectories in space or in configuration space that are causally determinate. In this context it is possible to explain double slit experiments and interference phenomena in terms of particle trajectories. Quantum probabilities become epistemic: the probabilistic nature of physical predictions is not an intrinsic characteristic of nature, but depends on our ignorance of the exact value of the hidden variables. Since Bohm's interpretation is, both epistemologically and ontologically, a natural extension of classical mechanics to the quantum domain, the visualization of physical processes is still possible, and the corresponding picture of reality is more intuitive. Bohm's interpretation clearly also has limits and weaknesses: a Lorentz invariant formulation of the model is still lacking, and all observables, with the sole exception of position observables, turn out to depend on the global context (contextuality).
1. The de Broglie-Bohm interpretation of quantum mechanics
The objective coexistence of a quantum wave and the associated particle is the fundamental physical assumption in the causal interpretation of quantum mechanics suggested by Bohm. In the de Broglie-Bohm (d.B.B.) model (or interpretation) each particle is always associated with a pilot wave guiding it. The converse is not true: there can be waves without an accompanying particle, called 'empty waves'. There are physical situations in which the wave associated with a particle splits into different waves with negligible spatial overlap. One of these waves, carrying the particle, is the pilot wave; the remaining waves are, by definition, empty waves. These empty waves do carry energy and momentum, and are physical waves in every respect: if an empty wave meets a particle at a later time, it will influence the particle trajectory, becoming a pilot wave again. The d.B.B. model requires, for the complete characterization of a system of N quantum particles, the specification of the corresponding wave function and, in addition, of the hidden variables, represented by the positions of the particles belonging to the system. Particles are supposed to really exist in space, to be distinguishable and to travel along trajectories. These trajectories cannot be exactly known for a specific particle, because the spatial positions are the hidden variables of the model. The causal interpretation of quantum mechanics is the evolution of the pilot wave concept, originally proposed by de Broglie, and is due to D. Bohm,1 J. P. Vigier and J. B. Hiley.3 It is based on the same dynamical law as quantum mechanics (the Schrödinger equation) and is empirically equivalent to standard quantum mechanics, since both interpretations assume the probability density to be equal to $\rho = |\psi|^2$. For a single particle, Bohm writes the wave function $\psi$ in polar form as

\[ \psi = R(x,t)\, e^{\,iS(x,t)/\hbar}, \tag{1} \]

with action $S(x,t)$ and amplitude $R(x,t)$ both real, and with $R$ positive. Assuming a probability density given by $\rho = |\psi|^2 = R^2$ and inserting the polar wave function into the Schrödinger equation, it is straightforward to get the following equations:

\[ \frac{\partial S}{\partial t} + \frac{(\nabla S)^2}{2m} + V(\mathbf{r}) - \frac{\hbar^2}{2m}\frac{\nabla^2 R}{R} = 0, \tag{2} \]

\[ \frac{\partial R^2}{\partial t} + \nabla \cdot \left( R^2\, \frac{\nabla S}{m} \right) = 0. \tag{3} \]
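The intermediate step behind Eqs. (2) and (3) is short enough to spell out. The sketch below assumes the single-particle Schrödinger equation in the form $i\hbar\,\partial_t\psi = -(\hbar^2/2m)\nabla^2\psi + V\psi$ (this form is not written out in the text above) and separates real and imaginary parts after dividing by the common phase factor.

```latex
\begin{align*}
i\hbar\,\partial_t \psi
  &= \left( i\hbar\,\partial_t R - R\,\partial_t S \right) e^{iS/\hbar}, \\
\nabla^2 \psi
  &= \left[ \nabla^2 R + \frac{2i}{\hbar}\,\nabla R \cdot \nabla S
      + \frac{i}{\hbar}\,R\,\nabla^2 S - \frac{1}{\hbar^2}\,R\,(\nabla S)^2 \right] e^{iS/\hbar}, \\[4pt]
\text{real part:}\quad
  -R\,\partial_t S
  &= -\frac{\hbar^2}{2m}\,\nabla^2 R + \frac{R\,(\nabla S)^2}{2m} + V R
  \;\Longrightarrow\;
  \partial_t S + \frac{(\nabla S)^2}{2m} + V - \frac{\hbar^2}{2m}\frac{\nabla^2 R}{R} = 0, \\
\text{imaginary part:}\quad
  \partial_t R
  &= -\frac{1}{2m}\left( 2\,\nabla R \cdot \nabla S + R\,\nabla^2 S \right)
  \;\Longrightarrow\;
  \partial_t R^2 + \nabla \cdot \left( R^2\,\frac{\nabla S}{m} \right) = 0 .
\end{align*}
```

The last implication follows on multiplying the imaginary-part equation by $2R$ and recognizing the divergence of $R^2\,\nabla S/m$.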
The classical limit ($\hbar \to 0$) of Eq. (2) gives

\[ \frac{\partial S}{\partial t} + \frac{(\nabla S)^2}{2m} + V(\mathbf{r}) = 0, \tag{4} \]

which is the classical Hamilton-Jacobi equation for a single particle with momentum $\mathbf{p} = \nabla S(\mathbf{r})$, influenced by a potential $V(\mathbf{r})$. The corresponding Newton equation is:

\[ m\,\frac{d\mathbf{v}}{dt} = -\nabla V. \tag{5} \]

Eq. (3) is the continuity relation for an ensemble of identical particles, distributed with density $\rho = R^2$, where

\[ \mathbf{v} = \frac{\nabla S}{m} \tag{6} \]

is interpreted as the velocity of a massive particle ($m$ being the mass), moving along a trajectory normal to the surface of constant action $S$, in a potential field $V(\mathbf{r})$. Therefore, the continuity Eq. (3) can be written:

\[ \frac{\partial \rho}{\partial t} + \nabla \cdot (\rho\,\mathbf{v}) = 0. \tag{7} \]

The term that was neglected when considering the classical limit of Eq. (2) defines the quantum potential:
\[ Q(\mathbf{r}) = -\frac{\hbar^2}{2m}\,\frac{\nabla^2 R}{R}. \tag{8} \]

Therefore, the quantum Hamilton-Jacobi equation for a single particle is:

\[ \frac{\partial S}{\partial t} + \frac{(\nabla S)^2}{2m} + V(\mathbf{r}) + Q(\mathbf{r}) = 0, \tag{9} \]

and the equation describing the causal evolution of the hidden positional variables, corresponding to the Newton equation for a classical particle, is:

\[ m\,\frac{d\mathbf{v}}{dt} = -\nabla (V + Q). \tag{10} \]

The force due to the classical potential is not the only one present in this case: there is also a quantum force due to the quantum potential $Q$ and determined by the amplitude $R$ of the wave function. The wave function is therefore responsible for both the probability density and the quantum force acting on the particle. The conditions granting identical predictions, and thus empirical equivalence of the d.B.B. and the standard interpretation of quantum mechanics, are the following.
• Wave function satisfying the Schrödinger equation.
• Probability density given by $\rho = |\psi|^2$.
• Particle momentum equal to $\mathbf{p} = \nabla S(\mathbf{r})$.
• Particle position determined within the precision limit imposed by Heisenberg's uncertainty relations.
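As a concrete illustration of the quantum potential of Eq. (8), the following minimal sketch evaluates $Q$ numerically in one dimension and compares it with the closed form for a Gaussian amplitude. The amplitude $R(x) = \exp(-x^2/4\sigma^2)$, the units $\hbar = m = 1$, the finite-difference step and all function names are assumptions chosen for illustration; none of them comes from the text.

```python
import math


def quantum_potential(R, x, hbar=1.0, m=1.0, dx=1e-4):
    """Q(x) = -(hbar^2 / 2m) * R''(x) / R(x), with R'' from a central difference."""
    second_derivative = (R(x + dx) - 2.0 * R(x) + R(x - dx)) / dx**2
    return -(hbar**2 / (2.0 * m)) * second_derivative / R(x)


def gaussian_amplitude(sigma):
    """Illustrative (assumed) amplitude R(x) = exp(-x^2 / (4 sigma^2))."""
    return lambda x: math.exp(-x**2 / (4.0 * sigma**2))


def gaussian_q_exact(x, sigma, hbar=1.0, m=1.0):
    """Closed form for the Gaussian: R''/R = x^2 / (4 sigma^4) - 1 / (2 sigma^2)."""
    return -(hbar**2 / (2.0 * m)) * (x**2 / (4.0 * sigma**4) - 1.0 / (2.0 * sigma**2))


if __name__ == "__main__":
    R = gaussian_amplitude(1.0)
    for x in (0.0, 0.5, 1.0, 2.0):
        print(x, quantum_potential(R, x), gaussian_q_exact(x, 1.0))
```

For this amplitude $Q$ is an inverted parabola, so the quantum force $-\partial Q/\partial x$ pushes particles away from the centre of the packet, which is the trajectory-level counterpart of wave-packet spreading.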
As a final remark, it should be noted that a Lorentz invariant formulation of the d.B.B. causal interpretation has not been proposed yet. The particle positions are expressed in terms of an absolute time, common to the entire quantum system.

2. Bohmian mechanics

In spite of the importance of the quantum potential $Q$ for the d.B.B. interpretation, it turns out to be possible to obtain a causal formulation of quantum mechanics without using such a concept (that Bohm himself defined "rather strange and arbitrary"4). Modern approaches to the d.B.B. interpretation, known as Bohmian mechanics,5 do not use the quantum potential. A system of $N$ particles is described by the wave function $\psi(X,t)$, where $X = (X_1, \ldots, X_N) \in \mathbb{R}^{3N}$ and $X_k$ denotes the position of the $k$-th particle. The wave function $\psi(X,t)$ satisfies the time-dependent Schrödinger equation

\[ i\hbar\,\frac{\partial \psi}{\partial t} = H\psi, \tag{11} \]

where $H$ is the non relativistic Hamiltonian operator. In the case of spinless particles, $H$ is given by

\[ H = -\sum_{k=1}^{N} \frac{\hbar^2}{2m_k}\,\nabla_k^2 + V, \tag{12} \]

where

\[ \nabla_k = \frac{\partial}{\partial X_k}. \tag{13} \]

Furthermore, $\psi(X,t)$ determines the particle motion, controlled by the equation of motion

\[ \frac{dX_k}{dt} = \frac{\hbar}{m_k}\,\frac{\operatorname{Im}(\psi^* \nabla_k \psi)}{\psi^* \psi}\,(X_1, \ldots, X_N). \tag{14} \]

According to Bohmian mechanics, a system of $N$ non relativistic particles is completely determined by Eqs. (11) and (14). Even if classical and Bohmian mechanics share common characteristics (see for instance the time evolution given by Eq. (10)), there are also substantial differences. Bohmian differential equations are first order equations, while Newtonian mechanics is characterized by second order equations. As a consequence, particle positions and velocities are not independent in Bohmian mechanics. The specification of initial particle positions is therefore sufficient to completely define a Bohmian system, while both initial positions and velocities are needed for the complete specification of a Newtonian system.
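The first-order character of Eq. (14) can be made concrete with a minimal sketch for a single particle in one dimension. It assumes a free Gaussian wave packet with zero mean momentum, units $\hbar = m = \sigma = 1$, a central-difference derivative and a fixed-step fourth-order Runge-Kutta integrator; these choices, and all function names, are illustrative assumptions rather than anything prescribed by the text. For this packet the trajectories are known in closed form, $x(t) = x_0 \sqrt{1 + (\hbar t / 2 m \sigma^2)^2}$, which gives a check on the integration.

```python
import cmath
import math


def psi_free_gaussian(x, t, sigma=1.0, hbar=1.0, m=1.0):
    """Free Gaussian packet, up to an x-independent factor that cancels in Eq. (14)."""
    a = hbar / (2.0 * m * sigma**2)
    return cmath.exp(-x**2 / (4.0 * sigma**2 * (1.0 + 1j * a * t)))


def bohm_velocity(psi, x, t, hbar=1.0, m=1.0, dx=1e-5):
    """One-particle form of Eq. (14): v = (hbar / m) * Im(psi' / psi)."""
    dpsi = (psi(x + dx, t) - psi(x - dx, t)) / (2.0 * dx)
    return (hbar / m) * (dpsi / psi(x, t)).imag


def trajectory(psi, x0, t_end, steps=1000):
    """Integrate dx/dt = v(x, t) from t = 0; only the initial position is needed."""
    dt = t_end / steps
    x, t = x0, 0.0
    for _ in range(steps):
        k1 = bohm_velocity(psi, x, t)
        k2 = bohm_velocity(psi, x + 0.5 * dt * k1, t + 0.5 * dt)
        k3 = bohm_velocity(psi, x + 0.5 * dt * k2, t + 0.5 * dt)
        k4 = bohm_velocity(psi, x + dt * k3, t + dt)
        x += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        t += dt
    return x


def analytic_position(x0, t, sigma=1.0, hbar=1.0, m=1.0):
    """Closed-form trajectory for the free Gaussian packet."""
    a = hbar / (2.0 * m * sigma**2)
    return x0 * math.sqrt(1.0 + (a * t) ** 2)


if __name__ == "__main__":
    for x0 in (-1.0, 0.0, 0.5, 1.0):
        print(x0, trajectory(psi_free_gaussian, x0, 2.0), analytic_position(x0, 2.0))
```

Note that `trajectory` takes an initial position but no initial velocity: the velocity is fixed by the wave function. Since every trajectory is a rescaling of its starting point by the same factor, an ensemble initially distributed as $|\psi_0|^2$ remains distributed as $|\psi_t|^2$ in this example.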
3. Double slit interference

As emphasized by R. Feynman in his lectures, a double slit experiment shows all the mysteries and paradoxes of quantum mechanics. Interference patterns can be observed by using microscopic particles in experimental setups equivalent to the double slit devices used in optics to reveal the wave properties of light, as originally suggested by Young. A point-like source emits waves/particles travelling to a screen with a double slit. After the two slits, the waves/particles continue their motion until they reach a second screen, where they are detected as point-like spots. Interference patterns have been obtained by using electrons (Jönsson, 1961; Merli, Missiroli and Pozzi, 1974; Lichte, 1986; Tonomura, 1989), neutrons (Zeilinger, 1988), helium atoms (Carnal and Mlynek, 1991) and fullerenes (Zeilinger, 2002). The experiments described by P. G. Merli, G. F. Missiroli and G. Pozzi,6 and subsequently by A. Tonomura et al.,7 are realized by using electrons emitted one by one, instead of a beam. By using a very low intensity current (equivalent to one electron reaching the second screen every 0.04 seconds) it is indeed possible to obtain the quantum interference pattern, even if there is no possible interference among the electrons. Clearly, the wave-like interference effect is not a collective property, but has to be attributed to each and every electron sent through the double slit and impinging on the second screen.
Figure 1. Double slit interference using single electrons (from A. Tonomura): (a) 8 electrons; (b) 270 electrons; (c) 2000 electrons; (d) 6000 electrons.
The standard interpretation of quantum mechanics is unable to explain the interference pattern emerging as single electrons accumulate on the second screen, since there is no possible interaction among these electrons (sent one by one through the double slit, so that only one electron is crossing the apparatus at any time). There is no explanation for the fact that a single electron behaves as a part of an ensemble of many electrons, even if there is no possible interaction among the electrons emitted by the source, each at a different time. On the contrary, the single particle double slit experiment is easily explained in the context of the d.B.B. interpretation: the single electron trajectory is causally defined by the position of the electron within the slit and by the quantum potential. During the experiment, the boundary conditions for the device are unchanged and therefore the quantum potential remains the same. Each electron feels a quantum force depending on its initial position within the slit and on the shape of the quantum potential $Q$ (both of which are time independent), and contributes to the overall interference pattern predicted by the theory. The electron motion is totally independent of the trajectories followed by earlier or later electrons, but it is determined by the quantum potential, which is the same for all the electrons belonging to the ensemble. The same interference pattern would result from the overlapping of many single electron images, each collected in one of many different double slit devices with the same geometry (and thus with the same quantum potential), possibly located far apart from each other. Adding up the results from each device, an interference pattern equal to the one produced by the same number of electrons sent one by one through a single double slit will emerge. In order to analyze the diffraction of a single particle through a double slit, let us consider a source $S_1$, a screen $P$ with two slits $A$ and $B$ centered at $(0, \pm y)$ and a second screen $S_2$. The particle beam is described by a plane wave emitted by the source $S_1$ and reaching the two slits.
Each slit generates a Gaussian wave that propagates after the first screen, and the two identical waves overlap on the second screen $S_2$. There is no classical potential in the region between $S_1$ and $S_2$. A time independent quantum potential is present, according to Eq. (8). At a given time, the wave function at position $(x,y)$ is, neglecting a normalization factor, represented by:

\[ \psi(x,y,t) = \left[ \psi_A(x,y,t) + \psi_B(x,y,t) \right]. \tag{15} \]

The wave function can be factorized into two orthogonal components. The particle motions along the $x$ and the $y$ axes are independent, and the quantum potential is just a function of the $y$ variable:

\[ Q = -\frac{\hbar^2}{2mR}\left( \frac{\partial^2 R}{\partial x^2} + \frac{\partial^2 R}{\partial y^2} \right) = -\frac{\hbar^2}{2m\,f(y,t)}\,\frac{\partial^2 f(y,t)}{\partial y^2}, \tag{16} \]

where $f(y,t)$ is the $y$-dependent factor of the amplitude $R$, the $x$ component being a plane wave.
The shape of the quantum potential Q for a double slit is shown in Fig. 2.
Figure 2. Double slit quantum potential (as seen from the second screen).
The single particle trajectory can be obtained by integrating $m\dot{\mathbf{x}} = \nabla S$ from a specified position $\mathbf{x}$. While the $x$ component of the velocity is uniform, the $y$ component is determined by the force

\[ F = -\frac{\partial Q}{\partial y} \tag{17} \]

resulting from the quantum potential. The trajectories are initially divergent from the slits, due to the repulsive effect of the central peak of the quantum potential $Q$. Subsequently the particles propagate in a spatial region where $Q$ is flat and there is no force acting upon the particles. Therefore they move with almost uniform speed and only a small transverse component of velocity. Due to the force $F$, particles can cross potential gaps, ending up in an adjacent potential trench (see again Fig. 2). The pattern of trajectories reaching the second screen $S_2$ clearly shows an interference structure, with a high central peak (high trajectory density), followed by two minima (absence of trajectories), then two lower maxima and so on. Assuming an initial probability density distribution $|\psi_0|^2$ for the particles, the final probability density distribution corresponds to the quantum probability density $|\psi|^2$, as shown in Fig. 3.
Figure 3. Possible particle trajectories through a double slit device.
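The transverse motion sketched in Fig. 3 can be explored with a minimal numerical model. The sketch below assumes that each slit contributes a free Gaussian packet in $y$, centred at $y = \pm Y$ with width $\sigma$, in units $\hbar = m = 1$; the values $Y = 1$ and $\sigma = 0.2$, the finite-difference step and the function names are illustrative assumptions, not parameters taken from the text. It computes the transverse guidance velocity $v_y = (\hbar/m)\,\mathrm{Im}(\partial_y \psi / \psi)$, the one-particle form of Eq. (14).

```python
import cmath


def two_slit_psi(y, t, Y=1.0, sigma=0.2, hbar=1.0, m=1.0):
    """Unnormalised sum of two free Gaussian packets centred at y = +Y and y = -Y.

    The common y-independent prefactor is omitted because it cancels in the velocity.
    """
    spread = 4.0 * sigma**2 * (1.0 + 1j * hbar * t / (2.0 * m * sigma**2))
    return cmath.exp(-(y - Y) ** 2 / spread) + cmath.exp(-(y + Y) ** 2 / spread)


def transverse_velocity(y, t, hbar=1.0, m=1.0, dy=1e-6):
    """v_y = (hbar / m) * Im(d_y psi / psi), with a central-difference derivative."""
    dpsi = (two_slit_psi(y + dy, t) - two_slit_psi(y - dy, t)) / (2.0 * dy)
    return (hbar / m) * (dpsi / two_slit_psi(y, t)).imag


if __name__ == "__main__":
    t = 1.0
    for y in (-0.6, -0.3, 0.0, 0.3, 0.6):
        print(y, transverse_velocity(y, t))
```

Because the model wave function is even in $y$, the velocity field is odd and vanishes on the axis $y = 0$. Trajectories therefore never cross the symmetry axis: in this model a particle detected in the upper half of the second screen came through the upper slit.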
The double slit experiment shows an additional property of the de Broglie-Bohm model: nonlocality. If one slit is closed at a specific time, the wave function changes instantaneously, and therefore a different pattern, corresponding to one slit diffraction instead of double slit interference, appears on the second screen $S_2$. A particle, even if localized far away from the slit that has just been closed, is instantaneously affected by the change of the wave function guiding the particle. This is a clear indication of the nonlocal character of the d.B.B. model. The nonlocal behaviour, originally seen in the d.B.B. interpretation and related to the nonlocal structure of the quantum potential $Q$, is also a feature of the modern formulation of the theory, namely Bohmian mechanics. In fact, according to Bohmian mechanics, the velocity of each particle depends on the positions of all the other particles of the ensemble, as shown by Eq. (14) or, equivalently,9 by

\[ \frac{dX_k}{dt} = \frac{\hbar}{m_k}\,\operatorname{Im}\!\left( \frac{\nabla_k \psi}{\psi} \right). \tag{18} \]
The connections between a particle and all the others belonging to the same ensemble imply the existence of nonlocal properties. This characteristic was initially considered a defect of the d.B.B. model. It was later realized, on the contrary, that nonlocality is a property of any possible formulation of quantum mechanics, and that the d.B.B. interpretation has the great advantage of showing it clearly: "That the guiding wave, in the general case, propagates not in ordinary three-space but in a multidimensional-configuration space is the origin of the notorious 'nonlocality' of quantum mechanics. It is a merit of the de Broglie-Bohm version to bring this out so explicitly that it cannot be ignored".10
The double slit experiment has also suggested a generalization of the Bohr complementarity principle, proposed by D. M. Greenberger and A. Yasin in 1988.11 As a particular case of a double slit device, let us consider the possibility of reducing arbitrarily the size of slit $A$, while the other slit $B$ remains unchanged. As we decrease the size of $A$, the probability of having a particle crossing the first screen through $B$ increases. This fact could be described by saying that, by reducing the size of one slit, we increase our knowledge of the trajectory, or corpuscular behaviour, of the quantum system. In the limit where the size of $A$ vanishes, we know for sure that the particle crossing the first screen went through the slit $B$. The complementary information about the wave-like behaviour of the system is therefore completely lost. Let us consider a wave function $\psi = \psi_A + \psi_B$, where $\psi_A$ and $\psi_B$ are arbitrary waves with amplitudes $R_A$ and $R_B$, and the intensity is defined as $I = |\psi|^2$. The wave-like behaviour of the quantum system, related to the relative intensity of the interference peak, is characterized by

\[ D = \frac{I_{\max} - I_{\min}}{I_{\max} + I_{\min}} = \frac{2 R_A R_B}{R_A^2 + R_B^2}, \tag{19} \]

while the particle-like character ("which path") of the quantum system is characterized by

\[ P = \frac{R_A^2 - R_B^2}{R_A^2 + R_B^2}. \tag{20} \]

The Greenberger-Yasin relation connects these two quantities:

\[ P^2 + D^2 = 1. \tag{21} \]

A single experiment could display both the wave-like and the particle-like behaviour of a quantum system simultaneously. This result is more general than Bohr's complementarity since the only two possible outcomes of an experiment are, according to Bohr, exclusively wave-like ($D = 1$; $P = 0$) or exclusively particle-like ($D = 0$; $P = 1$). The Greenberger-Yasin relation has been confirmed by neutron interference12 and in optical experiments.13

4. Features of the d.B.B. model

4.1. Causality

Bohm's formulation was clearly proposed in order to re-establish the causality principle in quantum mechanics. In fact, the d.B.B. interpretation describes particles moving along classical trajectories, defined by Eq. (10),
\[ m\,\frac{d\mathbf{v}}{dt} = -\nabla (V + Q). \]

This equation is known as the "quantum Newtonian equation", since the potential $(V + Q)$ contains a quantum potential term. The force acting on the particle (cause) is correlated to the particle motion (effect). Therefore, the causality principle is clearly effective. The probabilistic aspects of quantum mechanics do emerge, as related to a causally defined ensemble of possible trajectories (see Fig. 3).

4.2. Determinism

A theory is considered to be deterministic if the specification of the initial value of all the relevant variables of the system is sufficient to calculate the past values and to predict the future values of such variables for any arbitrary value of time. This formulation of determinism also implies that it is possible, for an arbitrary time, to assign a value to all the variables characterizing the system. The time evolution of the wave function, controlled by the Schrödinger equation, is deterministic. Quantum mechanics, however, is a non deterministic theory because of the probabilistic nature of the predictions for the values of the observables of a quantum system. It is not possible to formulate exact predictions (such as "this particle will decay in 15 seconds"), but only probabilistic ones ("the half-life of this particle is 20 seconds"), in the context of quantum mechanics. The Copenhagen interpretation considers probabilities as intrinsic features of quantum mechanics: all the predictions about the outcomes of a quantum measurement can only be expressed in terms of probabilities. As a consequence, standard quantum mechanics turns out to be a non deterministic theory at an ontological level. The d.B.B. model would require, for the complete definition of a quantum system, not only the specification of the probability density, but also the definition of the initial positions of all the particles belonging to the quantum system ("hidden variables" of the model).
Precise trajectories are therefore defined at any time, but they are not empirically known, for the initial positions are to be considered hidden variables. Since particle positions cannot be experimentally determined with an accuracy better than that given by the probability density $|\psi|^2$, the d.B.B. interpretation is unable to produce deterministic predictions, exactly as standard quantum mechanics. The d.B.B. model is indeed deterministic at an ontological level (since the equation of motion is the quantum equivalent of Newton's classical equation), but it remains a probabilistic model with respect to the possible predictions about the results of
a quantum measurement. In this case, probabilities are epistemic (i.e., linked to the limits of the observer's empirical knowledge of the quantum system) rather than ontological (i.e., related to the intrinsic features of the model).

4.3. Realism

Realism is usually defined by an ontological and an epistemic hypothesis: there is an external reality, existing independently of any observer ("ontological"); it is possible to have direct access to this external reality ("epistemic"). The d.B.B. interpretation describes a quantum system in terms of really existing particles, with a precise (even if not known) location in space at any time, and really existing waves, either guiding the particles or "empty". Furthermore, the result of a measurement is considered to be independent of the observer, who simply registers a physical result, as in classical physics. Hence, the d.B.B. interpretation has to be considered a realistic model, as long as the only measured quantities are positional ones. All other variables are contextual (see Sec. 4.6) and therefore not realistic. These variables are sometimes defined as "quasi-real".14

4.4. Nonlocality and holism

The quantum potential $Q$ connects the dynamical variables of all the particles belonging to a quantum system, independently of their spatial distance, and it is responsible for all the nonlocal and holistic features of the Bohmian model. In 1966 J. Bell15 demonstrated the impossibility of formulating a local hidden variables theory equivalent to standard quantum mechanics. Nonlocality was thus recognized as a fundamental property of any possible hidden variable theory, instead of a peculiar feature of the d.B.B. model.
To be precise, nonlocality was at first considered a rather strange feature of this model, but later on it was understood that the Copenhagen interpretation of quantum mechanics also has a nonlocal character (nonlocality is not inconsistent with special relativity because the lack of knowledge about the values of the hidden variables prevents using such information to send superluminal signals). If a system has nonlocal properties, it has to be considered as a whole, which cannot be divided into smaller parts and is irreducible to the sum of its constituents. This property (holism) is evident in the d.B.B. model and is also present, but less recognizable, in standard quantum mechanics.
4.5. Lorentz-invariance

A Lorentz-invariant formulation of the d.B.B. model has not been provided yet: in the equations of the model, the particle coordinates are expressed as functions of an absolute time, common to the entire system. The lack of a formulation consistent with special relativity is, along with contextuality (see Sec. 4.6), the major limit of Bohm's proposal.

4.6. Contextuality

Hidden variables are added to standard quantum theory in order to achieve a complete knowledge of the system. Such variables are required to have a precise (even if unknown) value, regardless of whether it has been determined by a measurement or not. A further requirement is the possibility to perform reliable measurements: if a precise value is ascribed to a variable, a measurement performed on the system for such a variable should have the expected outcome, regardless of the details of the measurement. This has been stated by J. S. Bell [10] as the "non contextuality principle": all possible measurements of an observable should give the same result, independently of all other measurements performed on different observables at the same time. J. S. Bell [15] in 1966 and S. Kochen and E. P. Specker [16] in 1967 demonstrated that the two requirements above are irreconcilable: hidden variable theories do violate the non contextuality principle enunciated by Bell. Contextuality implies that the value experimentally attributed to a variable in a quantum system does not depend just on that variable, but on the entire experimental context. Therefore, the attempt to complete standard quantum mechanics in a deterministic way could only be made at the cost of accepting the contextual character of such models. Not all variables are, however, contextual: for each quantum system there is at least one complete set of noncontextual compatible variables.
They are the only variables that could be measured in an objective way, independently of all other measurements performed on the system at the same time. In the d.B.B. interpretation, positional variables (not to be confused with the hidden variables represented by particle positions) are the only noncontextual observables of the model.

References

1. D. Bohm, Phys. Rev. 85, 166 (1952).
2. D. Bohm and J. P. Vigier, Phys. Rev. 96, 208 (1954).
3. D. Bohm and B. J. Hiley, Phys. Rep. 172, 93 (1989).
4. D. Bohm, Wholeness and the Implicate Order (Routledge, New York, 1980).
5. D. Dürr, S. Goldstein and N. Zanghì, in Bohmian Mechanics and Quantum Theory: An Appraisal, J. T. Cushing, A. Fine and S. Goldstein eds. (Kluwer, Dordrecht, 1996).
6. P. G. Merli, G. F. Missiroli and G. Pozzi, Am. J. Phys. 44 (3), 307 (1974).
7. A. Tonomura, J. Endo, T. Matsuda, T. Kawasaki and H. Ezawa, Am. J. Phys. 57, 117 (1989).
8. P. R. Holland, The Quantum Theory of Motion (Cambridge University Press, Cambridge, 1993).
9. D. Dürr, S. Goldstein and N. Zanghì, in Experimental Metaphysics: Quantum Mechanical Studies in Honor of Abner Shimony, R. S. Cohen, M. Horne and J. Stachel eds., Boston Studies in the Philosophy of Science (Kluwer, Dordrecht, 1996).
10. J. S. Bell, Speakable and Unspeakable in Quantum Mechanics (Cambridge University Press, Cambridge, 1987).
11. D. M. Greenberger and A. Yasin, Phys. Lett. A 128, 391 (1988).
12. H. Rauch, in Proceedings of the 3rd International Symposium on the Foundations of Quantum Mechanics, M. Kobayashi ed. (Physical Society of Japan, Tokyo, 1990).
13. P. Mittelstaedt, A. Prieur and R. Schieder, Found. Phys. 17, 891 (1987).
14. A. Fine, in Bohmian Mechanics and Quantum Theory: An Appraisal, J. T. Cushing, A. Fine and S. Goldstein eds. (Kluwer, Dordrecht, 1996).
15. J. S. Bell, Rev. Mod. Phys. 38, 447 (1966).
16. S. Kochen and E. P. Specker, J. Math. Mech. 17, 59 (1967).
AN OBJECTIVE BACKGROUND FOR QUANTUM THEORY RELYING ON THERMODYNAMIC CONCEPTS

L. LANZ AND B. VACCHINI
Dipartimento di Fisica dell'Università di Milano and INFN, Sezione di Milano, Via Celoria 16, I-20133, Milano, Italy
We come back to the rooting of quantum theory in an objectively given phenomenological context, as it was first sustained by Bohr and later taken by Ludwig as basic motivation of his axiomatic approach. It is shown that the question of compatibility of an objective phenomenological context with present day quantum theory can be answered in a positive way if also thermodynamic concepts are taken into account for a quantum description of macroscopic systems. A formalism is recalled accounting for non equilibrium thermodynamics, by introducing classical fields describing local equilibrium in the quantum field context. Also a non deterministic dynamical evolution of these classical fields appears as an important possibility, linked to the breakdown of the method appropriate for the deterministic case. In this connection a refinement of this method is indicated, which with some additional condition leads to the concept of microsystem and to quantum theory for particles interacting with a macrosystem.
1. The role of a phenomenological pretheory

In discussions about foundations of quantum mechanics some macroscopic level is generally invoked, often referred to as a classical level, to be described by some classical physics as opposed to a microscopic level, related to more accurate physical investigations touching the particle level which nature displays at a sufficiently small space scale and to which quantum mechanics applies: then an embarrassing question immediately arises, i.e. where the border lies between macroscopic and microscopic level. The rooting of quantum description inside a classical macroworld goes back to Bohr; a rather subtle answer and settlement of this question was given by Ludwig [1], whose approach actually leads to a richer mathematical structure than usual textbook quantum mechanics, based on states as statistical operators and transformations of states as affine maps on a linear space generated by the statistical operators. Let us mention that this same mathematical framework imposes itself when quantum mechanics is challenged with
problems in the realm of communication and evaluation of information [2]. Often two utterly different ways of thinking about physics appear intertwined: one way as a theory aiming at the description of the nature of things, the other way as a theory associated with Galilean experiments. Ludwig's approach is a systematic development of the second standpoint. The most profound motivation to research leads to a mixture of these two attitudes, escaping whenever possible from the second to the first. The progress of science just comes from both ways taken together, but when difficulties appear one has to follow the second one. Any experiment about a given theory is a setting identified and controlled in a phenomenological way or known in the context of a pretheory, separate from the theory the experiment is about. The formalism of quantum mechanics in its generalized form arises in Ludwig's approach as a mathematical representation of items given inside a pretheory and used as parts in an experiment which gives evidence of a physical object, relevant for the experimental setting but not of relevance for the things composing such setting. Quantum theory fits very well and seems to be a very simple realization of Ludwig's mathematical scheme; the latter is however much more general. Also classical mechanics for classical point-like atoms is compatible with it. This must be expected since classical atomism, though disproved by experiment, is a conceptually consistent scheme. To the phenomenological pretheory objectively given state parameters are generally associated, showing in many situations, but not necessarily always, a deterministic time evolution inside suitable time intervals. Generally indeed this phenomenological level does correspond with macroscopic properties of systems, while the result of an experiment performed using them is often linked with microphysical components.
The key point for a proper understanding of Ludwig's approach however implies being aware of the fundamental role also of phenomenological thermodynamics with its perhaps inelegant phenomenological artefacts like containers, walls, external fields and its obvious local equilibrium indexes like temperature, velocity, chemical potentials, not to be linked a priori with mechanical properties of classical or quantum components. While it was obvious that thermodynamics was necessary in order to supplement the physics of a continuum in classical physics, it is believed that one can dispense with it when the physics of interacting quantum fields is considered. This is based on a persistent mechanistic prejudice, by which quantum fields are merely a replacement of a collection of fundamental particles, thus promoting any model of interacting quantum fields to a fundamental mechanical model which has to involve only mechanical quantities. Nonrelativistic quantum mechanics is to a certain extent compatible with the existence of a system composed by a macroscopic part, to be described at a microphysical level, interacting with a microphysical system consisting of one or few particles, thus substantiating this fictitious representation of macro and micro systems; in the relativistic situation due to non conservation of mass this becomes actually inconsistent.

The contribution is organized as follows: in §2 objective state parameters are introduced with their deterministic time evolution; in §3 it is suggested how a breakdown of determinism can be linked to the emerging role of a microsystem; §4 is devoted to conclusions.

2. Deterministic dynamics of objective states

To represent the objective phenomenology of a system let us associate to it a set of quantum fields, here for simplicity a scalar field $\hat\psi(x)$ confined inside a region $\Omega \subset \mathbb{R}^3$, together with a Hamilton operator $\hat H$ and a countable set $M$ of linearly independent local or quasi-local observables $\hat A_j(x)$, built with the fields $\hat\psi(x)$, $\hat\psi^\dagger(x)$ and quasi-local in the sense that $\hat A_j(x)$ only depends on $\hat\psi(y)$, $\hat\psi^\dagger(y')$ for $|x - y| \lesssim r_0$, $|x - y'| \lesssim r_0$, and also $\hat H$ is obtained by a quasi-local density. The length $r_0$ can be taken to characterize the short distances limit of the physical description, its typical space scale being $\lambda \gg r_0$. A simple but already meaningful model is fluidodynamics, with a Hamiltonian density containing the interaction term $\frac{1}{2}\int_\Omega d^3y\, \hat\psi^\dagger(x)\hat\psi^\dagger(y)\, V(|x - y|)\, \hat\psi(y)\hat\psi(x)$, with $V(r)$ a short-range potential with range $r_0$, the relevant observables being mass, momentum and energy densities. A set of state parameters consists of functions $\zeta_j(x)$ such that the operator
$$\Phi(\zeta) = \sum_j \int_\Omega d^3x\, \zeta_j(x)\, \hat A_j(x) \qquad (1)$$
is essentially self-adjoint with a point spectrum and $\exp[-\Phi(\zeta)]$ is trace class. Then starting with the reference generalized Gibbs state

$$w_\zeta = \frac{e^{-\Phi(\zeta)}}{\mathrm{Tr}\, e^{-\Phi(\zeta)}} \qquad (2)$$

entropy is introduced by $S(\zeta) = -k\, \mathrm{Tr}\, w_\zeta \log w_\zeta$ and a partition function can be defined by $Z(\zeta) = \mathrm{Tr}\, e^{-\Phi(\zeta)}$. A deterministic time evolution of these parameters can be defined if one succeeds in constructing a solution $\rho_t$ of the Liouville von Neumann equation carrying only one family $\zeta_{t'}$ of state parameters, $t' < t$, where such a family is defined assuming that the
state (2) is equivalent to $\rho_t$ relatively to the relevant observables

$$\mathrm{Tr}\, \hat A_j(x)\,(\rho_t - w_{\zeta_t}) = 0 \qquad \forall \hat A_j(x) \in M,\ \forall t. \qquad (3)$$
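The definitions (1) to (3) can be tried out on a finite-dimensional toy model. The following is only a minimal sketch under strong assumptions: the fields and the space integral are replaced by a few Hermitian matrices, the Boltzmann constant is set to one, and condition (3) is solved for a single observable by bisection. All function names are hypothetical and chosen for this illustration.

```python
import numpy as np


def expm_hermitian(h):
    """Exponential of a Hermitian matrix via its eigendecomposition."""
    vals, vecs = np.linalg.eigh(h)
    return (vecs * np.exp(vals)) @ vecs.conj().T


def gibbs_state(zetas, observables):
    """Return (w, Z, Phi) with Phi = sum_j zeta_j A_j, Z = Tr exp(-Phi), w = exp(-Phi)/Z."""
    phi = sum(z * a for z, a in zip(zetas, observables))
    unnormalized = expm_hermitian(-phi)
    partition = float(np.real(np.trace(unnormalized)))
    return unnormalized / partition, partition, phi


def entropy(w, k=1.0):
    """S = -k Tr(w log w), evaluated on the nonzero eigenvalues of w."""
    p = np.linalg.eigvalsh(w)
    p = p[p > 1e-15]
    return float(-k * np.sum(p * np.log(p)))


def match_parameter(rho, a, lo=-20.0, hi=20.0, tol=1e-12):
    """Bisection for the single parameter zeta such that Tr A (rho - w_zeta) = 0."""
    target = float(np.real(np.trace(rho @ a)))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        w, _, _ = gibbs_state([mid], [a])
        # <A> in w_zeta decreases as zeta grows (its derivative is minus the variance of A).
        if float(np.real(np.trace(w @ a))) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    a = np.diag([0.0, 1.0, 2.0])        # toy "relevant observable"
    rho = np.diag([0.5, 0.3, 0.2])      # toy state whose expectation is to be matched
    zeta = match_parameter(rho, a)
    w, z, phi = gibbs_state([zeta], [a])
    print("zeta =", zeta, " Tr(A w) =", np.trace(w @ a), " entropy =", entropy(w))
```

In this toy setting the entropy of $w_\zeta$ equals $k[\log Z(\zeta) + \mathrm{Tr}(w_\zeta \Phi(\zeta))]$, which follows directly from (2) and gives a quick consistency check on the construction.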
Due to the fact that the selection of relevant variables in $M$ is generally not invariant under time evolution, $\rho_t$ must be represented as

$$\rho_t = Z_t^{-1}\, e^{-[\Phi(\zeta_t) + \hat S(t)]}, \qquad Z_t = \mathrm{Tr}\, e^{-[\Phi(\zeta_t) + \hat S(t)]}, \qquad (4)$$
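$\hat S(t)$ being a correction not involving relevant variables. Indicating by $\mathcal{U}_{tt_0}$ the unitary time evolution determined by the Hamiltonian one has

$$\rho_t = \mathcal{U}_{tt_0}\rho_{t_0} = Z_{t_0}^{-1}\, e^{-\mathcal{U}_{tt_0}[\Phi(\zeta_{t_0}) + \hat S(t_0)]}. \qquad (5)$$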
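In a formal way one can trivially rewrite $\mathcal{U}_{tt_0}\Phi(\zeta_{t_0})$ in terms of $\Phi(\zeta_t)$ by

$$\mathcal{U}_{tt_0}\Phi(\zeta_{t_0}) = \Phi(\zeta_t) - \int_{t_0}^{t} dt'\, \frac{d}{dt'}\left[\mathcal{U}_{tt'}\Phi(\zeta_{t'})\right] = \Phi(\zeta_t) - \sum_j \int_{t_0}^{t} dt' \int_\Omega d^3x\, \mathcal{U}_{tt'}\left(\zeta_{jt'}(x)\,\dot{\hat A}_j(x) + \frac{d\zeta_{jt'}(x)}{dt'}\,\hat A_j(x)\right) \qquad (6)$$

then one expects that also $\hat S(t_0)$ should have a similar form, so that

$$\hat S(t) = -\sum_j \int_{-\infty}^{t} dt' \int_\Omega d^3x\, \mathcal{U}_{tt'}\left(\zeta_{jt'}(x)\,\dot{\hat A}_j(x) + \frac{d\zeta_{jt'}(x)}{dt'}\,\hat A_j(x)\right) \qquad (7)$$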
where $\zeta_{t'}$, $t' < t$, describe the preparation of the system. One expects that their choice should be significant only for $t' \in [t_0 - \tau, t_0]$, $\tau$ being a typical time interval related to the duration of the preparation procedure. Further analysis of models is however necessary to establish whether the most obvious choice $\zeta_{t'} = 0$ for t' for to
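$$\rho_t = \frac{e^{-\Phi(\zeta_t)} + \hat{\mathcal S}_t\, e^{-\Phi(\zeta_t)} + e^{-\Phi(\zeta_t)}\,\hat{\mathcal S}_t^\dagger + \hat{\mathcal S}_t\, e^{-\Phi(\zeta_t)}\,\hat{\mathcal S}_t^\dagger}{\mathrm{Tr}\left[e^{-\Phi(\zeta_t)} + \hat{\mathcal S}_t\, e^{-\Phi(\zeta_t)} + e^{-\Phi(\zeta_t)}\,\hat{\mathcal S}_t^\dagger + \hat{\mathcal S}_t\, e^{-\Phi(\zeta_t)}\,\hat{\mathcal S}_t^\dagger\right]} \qquad (8)$$

where:

$$\hat{\mathcal S}_t = -\int_0^{1/2} du\, e^{-u[\Phi(\zeta_t) + \hat S(t)]}\, \hat S(t)\, e^{+u\Phi(\zeta_t)} \qquad (9)$$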
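and thus depends on times $t' < t$ due to expression (7) of $\hat S(t)$. Furthermore one can set in evidence at r.h.s. of equation (8) the reference generalized Gibbs state

$$w_{\zeta_t} = \frac{e^{-\Phi(\zeta_t)}}{\mathrm{Tr}\, e^{-\Phi(\zeta_t)}} \qquad (10)$$

writing

$$\rho_t = \frac{w_{\zeta_t} + \hat{\mathcal S}_t w_{\zeta_t} + w_{\zeta_t}\hat{\mathcal S}_t^\dagger + \hat{\mathcal S}_t w_{\zeta_t}\hat{\mathcal S}_t^\dagger}{1 + \mathrm{Tr}\,(\hat{\mathcal S}_t w_{\zeta_t} + w_{\zeta_t}\hat{\mathcal S}_t^\dagger + \hat{\mathcal S}_t w_{\zeta_t}\hat{\mathcal S}_t^\dagger)} = w_{\zeta_t} + \hat{\mathcal T}(t) \qquad (11)$$

where $\hat{\mathcal T}(t)$ is a traceless operator given by:

$$\hat{\mathcal T}(t) = \gamma(t)\left[\hat{\mathcal S}_t w_{\zeta_t} + w_{\zeta_t}\hat{\mathcal S}_t^\dagger + \hat{\mathcal S}_t w_{\zeta_t}\hat{\mathcal S}_t^\dagger - w_{\zeta_t}\,\mathrm{Tr}\,(\hat{\mathcal S}_t w_{\zeta_t} + w_{\zeta_t}\hat{\mathcal S}_t^\dagger + \hat{\mathcal S}_t w_{\zeta_t}\hat{\mathcal S}_t^\dagger)\right] \qquad (12)$$

with $\gamma(t) = 1/[1 + \mathrm{Tr}\,(\hat{\mathcal S}_t w_{\zeta_t} + w_{\zeta_t}\hat{\mathcal S}_t^\dagger + \hat{\mathcal S}_t w_{\zeta_t}\hat{\mathcal S}_t^\dagger)]$.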
^ W = EOt'WA'W + ^ f Wii(x)
(is)
where $x$ indicates localization. Such localization is lost, and the irrelevant contribution increases with increasing time in the tail part $\hat T(t)$ of

$$\hat{\mathcal S}_t = \hat{\mathcal S}(t, t-\tau) + \hat T(t) \qquad (14)$$
where S(t,t-T)=
f d% f 5 d u e - » l * K ' > + ^ f dfUtt,
(15)
and f(t)
=[<&[' due-u&^+SW Jn Jo
f 'df^ffCtl(x)e+0*^> J-oo
(16)
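due to the time evolution map, for a time interval $t - t' > \tau$. For $\tau$ large enough, only the general representation in terms of the local fields $\hat\psi(x)$, $\hat\psi^\dagger(x)$ can be given, and taking into account that relevant variables leave $n$ particles subspaces invariant, one has the general representation:

$$\hat T(t) = \sum_\alpha \int_\Omega d^3\eta_1 \ldots \int_\Omega d^3\eta_\alpha \int_\Omega d^3\xi_\alpha \ldots \int_\Omega d^3\xi_1\; \hat\psi^\dagger(\eta_1)\hat\psi^\dagger(\eta_2)\cdots\hat\psi^\dagger(\eta_\alpha)\,\hat\psi(\xi_\alpha)\cdots\hat\psi(\xi_2)\hat\psi(\xi_1)\; a_{\alpha t}(\eta_1,\ldots,\eta_\alpha,\xi_\alpha,\ldots,\xi_1) \qquad (17)$$

$a_{\alpha t}(\eta_1,\ldots,\eta_\alpha,\xi_\alpha,\ldots,\xi_1)$ being for large $t$ increasingly complicated complex valued functions describing the $n$ body dynamics of the system. To calculate expectations of local observables $\hat O(x)$, such as $\hat A_j(x)$ and $\dot{\hat A}_j(x)$, one considers expressions of the form:

$$\mathrm{Tr}\,(\hat O(x)\,\hat T(t)\, w_{\zeta_t}\, \hat T^\dagger(t)), \qquad \mathrm{Tr}\,\hat O(x)\,(\hat T(t)\, w_{\zeta_t} + w_{\zeta_t}\, \hat T^\dagger(t)), \qquad (18)$$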
the following typical approximation makes the job: in (17) one can replace the integration region D, by Q,\UJS(X) with w^(x) the sphere with centre in x and radius S C fi- If one calculates (18) with this modified T(t), due to locality of relevant variables Aj(x) in term of which ui£t is built and also locality of Aj(x) and Aj(x), the contribution in which only one factor T(t) or T* (t) arises are negligible due to the off-diagonality in x representation induced by the modified T(t), while for terms with both T(t), T*(t) the factorization •&(6(x)f(t)«; C t ft(*)) « Tv(d(x)wCt)Tr(f(t)wctft(t))
(19)
occurs. In this way if (8) is used to calculate the expectations of local observables, $\hat{\mathcal S}_t$ can typically be replaced by the head part $\hat{\mathcal S}(t, t-\tau)$ of (14); only the factor $\gamma(t)$ at r.h.s. of (12), where it affects the irreversible contribution to the dynamics, still depends on $\hat T(t)$. By the very definition of the state parameters these irreversible corrections strictly vanish for all relevant
variables; then one can reasonably expect that $\mathrm{Tr}\,\dot{\hat A}_j(x)\hat{\mathcal T}(t)$ is a small correction to $\mathrm{Tr}\,\dot{\hat A}_j(x) w_{\zeta_t}$. This is just the way in which phenomenological local equilibrium thermodynamics of a fluid, ruled by the Navier-Stokes equation, can be derived in a quantum field approach, as was initiated by Zubarev [4] and further developed by Morozov and Roepke [5]. Even if no systematic effort has been made in this direction, one can expect that a deterministic evolution of the state parameters can be constructed by the following procedure. One starts with a zero order dynamical evolution $\zeta^0_{jt}(x)$ solving the evolution equation of an ideal fluid, then taking the memory term with $\zeta_{jt}(x) = \zeta^0_{jt}(x)$ finds the first order dissipative and irreversible corrections to the evolution equations of the relevant expectations; this step also implies a first principle evaluation of phenomenological coefficients such as viscosity and thermal conductivity. Then one has to calculate a correction to the state parameters, $\zeta_{jt}(x) = \zeta^0_{jt}(x) + \zeta^1_{jt}(x)$, since state parameters are linked to expectations of relevant variables. Looking in this way to $\mathrm{Tr}\,\dot{\hat A}_j(x)\rho_t$ one goes beyond the usual linear approximation of thermodynamics of irreversible processes and a more sophisticated investigation of the subject can be envisaged, which should yield an increasingly precise deterministic evolution of the state parameters, also involving more precise evaluations of phenomenological coefficients and an increasing number of them. All this means a matching between a certain deterministic phenomenology of systems and some more or less fundamental quantum field model. Such a matching has been obtained relying on thermodynamic concepts.
It provides a way to face the macro-objectification problem, whose relevance and difficulty inside the usual axiomatics of quantum mechanics has been pointed out by Ghirardi and coworkers.6 Let us observe that in this way, by some well understood phenomenology, by maybe empirical knowledge of phenomenological coefficients and by the solution of classical evolution equations (e.g. Navier Stokes equation), one can bypass, if we limit the interest to the dynamics of the expectations of the relevant variables, the generally extraordinarily complex problem of full quantum field dynamics. This just happens since we put in the foreground the basic quantum field balance equation, e.g. in interaction picture, while we leave buried in the underground of the theoretical description the dynamics of the Fock-space states of the system. No explicit use of microsystems is strictly necessary nor useful. In a sense this situation is strongly reminiscent of an ancient and bright piece of physics: the parametrization of the solution of the Boltzmann equation by local equilibrium state parameters
by which Chapman and Enskog succeeded in deriving phenomenological fluidodynamics from the Boltzmann equation, taken as the fundamental equation for the dynamics of the molecules composing the system, without solving explicitly such equation: though collisions between molecules were the leading underlying idea, only the link to thermodynamics allowed the extraction from the theory of the relevant physics. Now interacting fields are the dynamical setting, the context is much more general and amenable to relativity and, as we shall see, something of the underlying microstructure acquires an autonomous phenomenological evidence, much stronger than molecules did inside Boltzmann's atomistic effort.
3. A way to microsystems

By the previous consideration one can understand that a simple phenomenological description, e.g. fluidodynamics, can hold for long times, with absence of memory effects, as it is in fact observed, despite the long time complex structure which is displayed by (7). However this is not the whole story: one can expect that the typical uniformity of the factors $a_{\alpha t}(\eta_1,\ldots,\eta_\alpha,\xi_\alpha,\ldots,\xi_1)$ in (17) that allowed the simplification $\Omega \to \Omega\setminus\omega_\delta(x)$ can be influenced by some, possibly not at all trivial, driving process, correlating the localization points $\eta$, $\xi$ in (17) with the localization points of some relevant variables: then a new feature is disclosed inside the previously described phenomenology. Let us investigate what happens if the simple behaviour we have considered before with respect to the uniform distribution of the integrand in (17) inside the space $\Omega\times\Omega\times\ldots\times\Omega$ ($2\alpha$ times) is analyzed in a more accurate way. We assume this simple behaviour for all variables $\eta$, $\xi$ but one, e.g. $\eta_1$, for times $t > \bar t$. Then $\hat T(t)$ can be represented as:

$$\hat T(t) = \int_\Omega d^3\eta\, \hat\psi^\dagger(\eta)\, \hat D_t(\eta) \qquad (20)$$

the operator

$$\hat D_t(\eta) = \sum_\alpha \int_\Omega d^3\eta_2 \ldots \int_\Omega d^3\eta_\alpha \int_\Omega d^3\xi_\alpha \ldots \int_\Omega d^3\xi_1\; \hat\psi^\dagger(\eta_2)\cdots\hat\psi^\dagger(\eta_\alpha)\,\hat\psi(\xi_\alpha)\cdots\hat\psi(\xi_2)\hat\psi(\xi_1)\; a_{\alpha t}(\eta,\eta_2,\ldots,\eta_\alpha,\xi_\alpha,\ldots,\xi_1) \qquad (21)$$

being a generalized annihilation operator in the sense that $\hat D_t(\eta)\hat N = (\hat N + 1)\hat D_t(\eta)$ $\forall\eta$. If (21) is evaluated replacing the integration region $\Omega \to$
fi\wa(x) but not in (20) one obtains TtiAjWfWwbPit))
- Tr(Aj(X)w
[ d3r) f dS [TrAjtoftWw^ir]') Jn Jn
(t)w
=
- Tr(i ; (x)iD c t ) " L ^ f a K ^ f a ' ) ) ] x TiiDMw^btir,'))
(22)
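and similarly for $\dot{\hat A}_j(x)$. The expression $g_t(\eta,\eta') = \mathrm{Tr}\,(\hat D_t(\eta)\, w_{\zeta_t}\, \hat D_t^\dagger(\eta'))$ defines a positive operator $g_t^{(1)}$ on $L^2(\Omega)$ by:

$$(g_t^{(1)} f)(\eta) = \int_\Omega d^3\eta'\, g_t(\eta,\eta')\, f(\eta'). \qquad (23)$$

Let us represent it in terms of orthonormal eigenstates $g_{jt}(\eta)$ and positive eigenvalues $\lambda_j(t)$:

$$\int_\Omega d^3\eta'\, g_t(\eta,\eta')\, f(\eta') = \sum_j \lambda_j(t)\, g_{jt}(\eta) \int_\Omega d^3\eta'\, g^*_{jt}(\eta')\, f(\eta'). \qquad (24)$$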
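The structure of (23) and (24) can be checked on a discretized toy version. In the sketch below, which is only an illustration under stated assumptions, the continuous label $\eta$ is replaced by a finite index, the operators $\hat D_t(\eta)$ by arbitrary random matrices and $w_{\zeta_t}$ by a random density matrix; the function names are hypothetical. The kernel built in this way is a Hermitian positive semidefinite matrix, so its eigenvalues are non-negative and its eigenvectors play the role of the orthonormal states $g_{jt}$.

```python
import numpy as np


def positive_kernel(d_ops, w):
    """Matrix g[a, b] = Tr(D_a w D_b^dagger) for a finite list of operators D_a."""
    n = len(d_ops)
    g = np.empty((n, n), dtype=complex)
    for a in range(n):
        for b in range(n):
            g[a, b] = np.trace(d_ops[a] @ w @ d_ops[b].conj().T)
    return g


def spectral_decomposition(g):
    """Eigenvalues and orthonormal eigenvectors (as columns) of the Hermitian kernel g."""
    g = 0.5 * (g + g.conj().T)  # remove rounding-level anti-Hermitian noise
    return np.linalg.eigh(g)


def apply_via_spectrum(eigenvalues, eigenvectors, f):
    """Discrete analogue of the r.h.s. of (24): sum_j lambda_j g_j <g_j, f>."""
    coefficients = eigenvectors.conj().T @ f
    return eigenvectors @ (eigenvalues * coefficients)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim, n_points = 4, 6
    d_ops = [rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
             for _ in range(n_points)]
    m = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    w = m @ m.conj().T
    w /= np.trace(w).real                      # toy density matrix
    g = positive_kernel(d_ops, w)
    lam, vecs = spectral_decomposition(g)
    f = rng.normal(size=n_points) + 1j * rng.normal(size=n_points)
    print("smallest eigenvalue:", lam.min())
    print("reconstruction error:",
          np.linalg.norm(g @ f - apply_via_spectrum(lam, vecs, f)))
```

The direct action of the kernel on a test vector and its action through the eigen-expansion agree up to rounding, which is the finite-dimensional counterpart of the identity (24).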
Setting Jn equation (21) becomes T r ( i ( ( x ) f ( ^ C t f t ( f ) ) - T r ( i ; ( x ) ^ C J T r ( f (t)«) C l ft(i)) = £
A,- (*) Tr i , ( x ) ^ ^ - Tr ( i , (x)u>c,) Tr ($ t u> c , ^ t )
(25)
j
where the typical mechanism by which T(t)w^tT^(t) contributes in a negligible way to expectations of local observables can also be stated as: Tr AtWtyMit
« Tr(i,(x)ii; Ct ) T r ( ^ t t « ) C | ^ )
(26)
and similarly for A,(x). Let us now assume that an atypical behaviour can be associated with a selected subset of creation operators for which a coherent dynamics prevents the decay of correlations described by (26). As a consequence, together with the macrostate u)Qt also the new states:
will have their role, for the dynamics at times t > i; as macrostates perturbed by a particle in the state gatLet us observe that representations of T(t) different from (20) are conceivable: instead of
^{r}) a destruction operator ip{r]), can be put in evidence, so that a hole will have the role of microsystem; more generally products tpHviWiw) •••V't(7?r)1/'(£s) •••V'feOVKCi) could be considered, so that composed structures (more particles and holes) arise as microsystem. We have concentrated on the simplest situation. To summarize, the last term in (8), which is a kind of shadow subcollection as long as dynamics is ruled by (10) (deterministic case) is decomposed at time i as in (10) and from a time i on we shall consider pt = U\pi , where pi according to (8) is represented as k =
%
+
fft> Ct -ft(^ + f(t-)
=
^
^
+
^ = * c , + f(i)flc,ftffl
m = rv-^m
(2g)
(29)
t
1 + Tr f (t) + Tr T(i)w(lTKi)
(30)
where we skip the explicit expression of T(t), essentially depending on S(i,i — T). By (28) the statistical operator pj is represented in terms of a new reference statistical operator wj, given by (29), which shows that a part of the previous shadow collection has been associated to the previous reference state WQ corrected by t(t), which is the traceless part of T(i). By our previous analysis wj can be represented as a stochastic generalization of u>Cr:
w-t = \fwCol + Y^ Kl ^ 7 ^ * " 7t
A
f-^f>0,
Af + ^ ^ = 1
(31) Thinking about the time evolution for t > t, the first subcollection at r.h.s. of (31) is associated with pursue for t > t of deterministic evolution of state parameters; the other subcollections might evolve in different ways, due to the perturbing particle in its different states gai. Then one can expect that for t > i a more adequate parametrization can be given by tuning Q to the different components of the mixture at r.h.s. of (31) so that to describe this stochastic time evolution of the system, the following stochastic generalized
Gibbs state will be introduced for t > t:
a
1±
TrV'at^t>4V'at
a
(32) the state parameters C,t and C,at will have at t = i the common value £j and (32) will coincide with (31). For t > i we can write a solution of the Liouville von Neumann carrying the new parameters by an obvious generalization of the treatment given in §2: pt = U\p-t =
(ftcG-M^x) + Hfnd3x
f(t)exp[- "£jn +E x exP [ - E ^
[Jnd3vUmHv) d3x
- Jn A jf*df
0«t(x)ii(x) + E ^
ACtWtkri)
f_dt'
d 3 x
|7(Z4<>(x)i;(x))] -^(UkaArtWHr)))
£ dt'
- J d3r, J* dt> ^(Kt-Cf
^.(UiCjafWAjW)}
(r,)i>{r}))
+ Ujr® (33)
for £ = F the time integrals disappear, and one recover the coefficients Cj<*f = Cif, and the functions # ( x ) = f | f ,
shadow collection appears and if it were inequivalent to wt for t > i a new stochastic reference state must be introduced showing a more complicated particle structure and thus leading to a stronger stochasticity. In this way interaction of a microsystem with a system can produce more complicated microsystems: not only decoherence happens but also increasing or varying complexity can occur, whenever behind the deterministic or less stochastic wt the shadow collection SwtS^ becomes significant. Due to the larger set of parameters entering wt a larger set of fixing conditions is necessary: now we shall indicate how by these conditions quantum mechanics is recovered. We assume that in order to determine wt in the simplest stochastic case one has to choose a suitable set M ^ of one particle observables and extend to these observables the former assumption (3): IV Aj (x) (^ - % ) = 0 Tv A(pt - wCt) = 0
VAj (x) e M, W VieAf
{1)
(34)
,Vt.
The one particle observables A € M^ are more appropriately given in the momentum description: A = Jd3pdV&Hp)A^(P,P')a(pl),
(35)
correspondingly also i/S0 in wa are replaced by ag = J d3pa(p)g*(p), 1, [6 s ,aJ] T = 1. Indicating by: wg <- = (1 ±
9 i 9 lTagWQa'g)
\\g\\ =
3 6
one has: Tr(AwgX) = (
(37)
with
Agl) ( P , p') = g(P) J d3V~g*(V)AMfo,p')+j A A& (p, r,)g(r,)g* (p') (38) Let us assume for simplicity that Tr agw^ag
choose observables (35) which are non diagonal in p representation to a degree high enough so that \Tr(Agwi;)\
= T r ^ , A^w[x)
(39)
a
w\ is a positive trace class operator with trace less or equal than one. One has \{t) = 1 — Tr w\ ' with \{i) the weight of the deterministic situation. The operator w\ in one particle Hilbert space expresses a universal feature of all physical systems whose phenomenology is covered by our objective description: by wt -> w\ ' all objective elements associated with wt have been erased. Inversely by the spectral decomposition of w\ ' one obtains the one particle states gatip) and the weights \at\ then by (34) one gets Cat and Q and finally the whole wt can be constructed. Open questions are: i) the precise characterization of M^; ii) to study the time evolution of all the parameters CtXat, gt,gat(p),h,Kt starting with TtAl{-K)pt = jtTv{Al{^)wt)
Vi,(x)eM
(40)
Tr Apt = 4- Tc(Awt) V i € M ^ at where pt has the from (33). By our discussion the dynamics of a system is finally reduced to the construction of a solution of (34) and (40): microsystems become the natural ingredient to pursue the description of a microsystem when the parametrization in terms of the sole state parameters £t would become insufficient. To catch something quantum mechanical inside this dynamical context that could appear completely extraneous to usual quantum mechanics let us think about the most simple contribution to the l.h.s. of (40) Tr Apt = - \ Tr A[H, pt] coming from the part of pt given by (33) which is related to ft (n) or better to the corresponding agt: here 2
the leading term in the expression of gat(p) is given by e _ ^ 2 ^ ^ _ t o ^ a t 0 ( p ) ! m being the mass of the Schrodinger field VKx) introduced in §2. Beside this free dynamics of a particle with wave function gat{p), also its collisions with the remaining local equilibrium system will have a major role, relating
this problem to irreversible dynamics of a microsystem interacting with a local equilibrium macrosystem: then the structure (33) could be useful to treat such problem without assuming a Markovian structure of dynamics. The choice of M^ is in a sense complementary to M: one has to look for observables which are possibly insensitive to the macrostates WQ , but sensitive to the microphysical additional structure of wt given by the operators ag ag. In typical treatments of a system composed by a macrosystem <S interacting with a my, one represents the states of the compound system in a Hilbert space % = 'Hi® Tis and taking g 6 Hi and w^ statistical operator on Us, all operators ,4 given by (34) can be taken and one removes in an artificial way the problem we are now meeting. Eq.(37) indicates an eventually more refined treatment: the term Tr(AgW{) should be small enough with respect to ( g ^ 1 ) ^ ) , by some orthogonality of Ag with respect to W(: then starting with w\' as a zero order statistical operator of the microsystem, a recursive construction of the full wt can be started.
4. Conclusions and outlook

By the presented construction microsystems emerge as stochasticity seeds inside an objective framework primarily based on local quantum field observables $\hat A_j(x) \in M$ and on classical fields $\zeta_{jt}(x)$ having the role of local equilibrium thermodynamic indexes. They are related to the expectations of the relevant variables as long as their evolution is deterministic. The pretheory we have mentioned in the introduction is no longer a purely phenomenological framework but becomes the deterministic sector of our description: it could be treated by replacing, at least to a certain extent, the phenomenological attitude with a more first principle one. In Ref. [7] this was initiated in the case of fluidodynamics. A breakdown of the deterministic regime can occur and contextually, just in order to describe a stochastic behaviour, an additional structure is pointed out, that can be associated with a microsystem: its quantum mechanical evolution becomes a necessary ingredient in order to face the more complicated situation. In this way the quantum mechanical nonlocality loses its apparently paradoxical aspects, just as was claimed by Ludwig. The purpose of this paper was to show why these seeds can be expected and how formally one can become aware of them inside a context that is usually treated as strictly deterministic. That basic difficulties of quantum mechanics can be overcome with thermodynamic concepts was indicated among the different open alternatives in this context also in a recent general essay [8].
Acknowledgments

This work was supported by INFN and by MIUR under FIRB.

References

1. G. Ludwig, Foundations of Quantum Mechanics (Springer, Berlin, 1983).
2. A. S. Holevo, Statistical Structure of Quantum Theory, Lecture Notes in Physics, Vol. m67 (Springer, Berlin, 2001).
3. L. Lanz, B. Vacchini and O. Melsheimer, in Quantum information, statistics, probability - celebration of Holevo's 60th birthday, O. Hirota ed. (Rinton Press, New York, 2004).
4. D. N. Zubarev, Non-equilibrium statistical thermodynamics (Consultants Bureau, New York, 1974).
5. D. N. Zubarev, V. Morozov and G. Roepke, Statistical mechanics of nonequilibrium processes (Akademie-Verlag, Berlin, 1996).
6. A. Bassi and G. C. Ghirardi, Phys. Rep. 379, 257 (2003).
7. E. Vitali, Degree thesis (University of Milan, Milan, 2005).
8. V. Allori, M. Dorato, F. Laudisa and N. Zanghì, La Natura delle Cose (Carocci, Roma, 2005).
THE ENTRANCE OF QUANTUM MECHANICS IN ITALY: FROM GARBASSO TO FERMI

MATTEO LEONE, NADIA ROBOTTI
Department of Physics, University of Genoa, Via Dodecaneso 33, I-16126 Genoa
The first steps of quantum mechanics in Italy will be discussed here, through the use of the available archives and printed sources. As will be shown, this development was closely linked with a spectroscopic tradition of research, whose major protagonists were three physicists working in Tuscany during the first two decades of the century, namely Antonio Garbasso, who worked in Arcetri (Florence) on the theoretical basis of the recently discovered Stark effect (1913-14); Rita Brunetti, in Arcetri as well, who made use of the quantum theory in order to explain X-ray emission (1918-20); and, finally, the young Enrico Fermi, who paid attention to the quantum theory since his days at the Scuola Normale Superiore in Pisa (1918-22).
Introduction

As is relatively well known, before Enrico Fermi's intervention in the 1920s, quantum ideas had been largely neglected by Italian physics. Many indicators corroborate this contention, notably the glaring fact that very few contributions on modern physics topics were published in the Italian physics literature during the first two decades of the century. As of 1912, the journal of the Italian Physical Society (Nuovo Cimento) had published only two review papers on quantum theory [1] [2], one of which was quite critical in its stance toward the subject [1]. Its author, Orso Mario Corbino (who eventually became director of the Rome Institute of Physics), indeed emphasized that "the hypothesis of finite variations of the contents of molecular energy according to whole multiples of the quantum ε = hν deeply repels all our mechanical conceptions". Moreover, as of 1912, no research paper based on quantum hypotheses had yet been published in Nuovo Cimento. This backward state of the art changed somewhat the following year, when spectroscopy became an important topic of quantum mechanics through the discovery of an important quantum effect, the so-called Stark effect. More importantly, and unexpectedly, Italian physics gave meaningful contributions to both the experimental set-up and the theoretical interpretation of
this effect. The role played by this and other research in allowing quantum mechanics to enter Italian physics is addressed here.
1. Garbasso, the Stark effect, and the problematic birth of quantum mechanics in Italy
In 1913 the Italian physicist Antonio Garbasso (1871-1933) left his position at the University of Genoa for the University of Florence to become director of its Institute of Experimental Physics. During the preceding years he had worked on electromagnetic waves, mirages, X-rays and, most importantly, spectroscopy. In 1906 he had published an important textbook on theoretical spectroscopy: Vorlesungen über theoretische Spektroskopie (Leipzig: J.A. Barth). At the University of Florence, Garbasso met the young assistant Antonino Lo Surdo (1880-1949). Lo Surdo had been pursuing research on terrestrial physics at Florence, but now Garbasso set him to investigating the Doppler effect in spectra produced by the retrograde rays, i.e. by the positive ions that were always present in discharge tubes in the vicinity of the cathode. Lo Surdo thus turned his attention to the radiation emitted in the dark space close to the cathode. At first he used standard discharge tubes, but he soon found it advantageous to use long, thin ones between 1.5 and 4 millimeters in diameter.
Figure 1. Antonio Garbasso (1871-1933). Source: [29].
Lo Surdo's intent was to look for a Doppler effect in the positive retrograde rays, but in the summer of 1913 he observed a far more significant phenomenon. As he recalled at the end of that year:
Since last summer, while studying the Doppler effect due to the positive retrograde rays close to the cathode by means of a discharge tube placed obliquely to the slit of a spectroscope, I observed that the hydrogen [spectral] lines appear to be resolved into several components.... I [subsequently] discovered that this phenomenon also makes its appearance when the tube was placed perpendicularly to the slit. This was, therefore, a new phenomenon [3].
The nature of this unexpected phenomenon and its explanation within a quantum-theoretical framework were discovered a few months later.
1.1. Stark effect
Since 1906 the German physicist Johannes Stark (1874-1957) had studied experimentally the effect of an electric field on spectral lines. His decision to do so was influenced by his interest in quantum theory, in particular by the possibility of interpreting within its framework an electric analogue of the Zeeman effect (in which spectral lines are split by intense magnetic fields). Earlier analyses had been based exclusively on classical electrodynamics, which did not predict such an electric analogue. By contrast, Stark believed that the new quantum theory might permit it. In the fall of 1913, i.e. a few months after Lo Surdo's unexpected (and until then unpublished) observation, Stark was able to carry out a systematic study of the influence of an electric field on spectral lines by means of a specially assembled canal-ray discharge tube. Through this work, he discovered that a transverse electric field caused hydrogen spectral lines to split into several components. He presented his results at a meeting of the Prussian Academy on November 20, 1913 [4].
The following month, Stark discovered the longitudinal effect as well, that is, the splitting of spectral lines when observed parallel to the direction of the electric field [5]. The discovery of this Stark effect was instrumental in gaining for Stark the Nobel Prize in Physics in 1919. On December 4, 1913, Stark's letter, where the discovery was announced, was published in Nature [6]. Lo Surdo read the letter and understood immediately that he too had observed the same effect. He realized that with his discharge tube, with its particular geometry, he had produced a strong electric field - of the same order of magnitude as Stark had - in the region where he had observed the splitting of the hydrogen spectral lines [7].
1.2. Theoretical explanation
Niels Bohr, a few months after his landmark 1913 paper on atomic spectra [8], proposed the first successful theoretical explanation of the Stark effect. In a paper published in the March 1914 issue of the Philosophical Magazine, he stated that "it seems possible on ... [my atomic] theory to account for some of the characteristic features of the recent discovery by Stark of the effect of an electric field on spectral lines...." [9]. According to Bohr's theory, an electric field causes the elliptical orbit of the electron to precess and to change its eccentricity. Bohr concluded from his analysis that only two stationary electronic orbits were allowed: "the orbits simply consist of a straight line through the nucleus parallel to the axis of the field, on each side of it." Thus, Bohr found that the change in frequency $\Delta\nu$ is proportional to the magnitude of the electric field, as Stark had found, and is given by:
$$\Delta\nu = \frac{3h}{4\pi^2 e m}\, E\,(n_2^2 - n_1^2)$$
where $E$ is the electric field, $h$ Planck's constant, $e$ and $m$ the charge and mass of the electron, and $n_1$ and $n_2$ the quantum numbers of the stationary states between which the electron undergoes a transition when emitting the spectral line (footnote a).
Footnote a. A completely satisfactory theoretical interpretation of the Stark effect was found only after Arnold Sommerfeld (1868-1951) generalized Bohr's theory in 1916 by introducing the "Sommerfeld conditions" to describe systems of more than one degree of freedom. His student, the Russian-born Paul Epstein (1883-1966), and independently the German astrophysicist Karl Schwarzschild (1873-1916), then used the Sommerfeld conditions to explain the Stark effect in hydrogen. They showed that in a transition of the electron from an initial state (quantum numbers $k_1, k_2, k_3$) to a final state ($n_1, n_2, n_3$), the line splitting $\Delta\nu$ in the first-order Stark effect is given by
$$\Delta\nu = \frac{3hE}{8\pi^2 e m}\left[(n_1 + n_2 + n_3)(n_2 - n_1) - (k_1 + k_2 + k_3)(k_2 - k_1)\right].$$
1.3. Antonio Garbasso's theory
Independently of Bohr, Garbasso proposed a theoretical interpretation of the Stark effect (renamed by Garbasso the "Stark - Lo Surdo phenomenon") at a session of the Accademia dei Lincei on December 21, 1913; its content was published in the Physikalische Zeitschrift, the Rendiconti of the Accademia, and Nuovo Cimento [10]. This was the first research paper based on quantum hypotheses to appear in the journal of the Italian Physical Society. Garbasso's interpretation was also based on Bohr's atomic theory and was similar to Bohr's. Garbasso learned about Bohr's theory of the Stark effect from a letter that Bohr published in Nature on January 15, 1914, where Bohr announced the publication of "a paper on the influence of electric and magnetic
fields on spectral lines, which will appear shortly in the Philosophical Magazine" [11]. Garbasso then wrote to Bohr on January 19, stating that:
From the last issue of Nature I see that you have applied your marvelous theory of spectral analysis to the consideration of the Zeeman and Stark phenomena, respectively. I am looking forward to your paper with great interest; I have myself tried to extend your theory in this direction, however, as you will see, with but little success. I contented myself with a preliminary calculation, since, from experiments done by my assistant, Dr. Lo Surdo, the matter seems to be very complicated experimentally [12].
Figure 2. Antonio Garbasso to Niels Bohr, January 19, 1914 [12].
In analogy with Bohr, Garbasso had found that when an electric field $E$ was present, the original hydrogen line was accompanied by a pair of spectral lines, symmetrical about the original line, whose width in frequency $\Delta\nu$ was proportional to the electric field. Garbasso deduced an expression for this change in frequency that differed from Bohr's only by a factor of 2 [expression illegible in the source]. Bohr responded on February 7, 1914, pointing out that Garbasso had made an error in his calculation that had led him to underestimate the difference in frequency between the components of the split lines [13] (figure 3). When Bohr's aforementioned paper was in press, the Physikalische Zeitschrift published Garbasso's paper, in which his calculations were briefly reported [10]. In a footnote added shortly before printing, Bohr wrote that "the arguments of Garbasso are stated very briefly, but seem of a type similar to those of the present paper" [9].
Figure 3. Niels Bohr to Antonio Garbasso, February 7, 1914 [13].
Notwithstanding Garbasso's minor error, his intervention was unprecedented: until then all Italian physicists had worked on subjects in classical physics, and Garbasso was the first Italian physicist to use the quantum hypothesis in his theoretical research. As emphasized by Giovanni Polvani (then director of the Institute of Physics in Milan) in his historiographical paper on one hundred years of Italian scientists (1839-1939), "Garbasso, better than anybody else [in Italy], was ready to take advantage of the new Bohr's conceptions" [14]. Thus, in order to explain the Stark effect, Garbasso adopted Bohr's model of the hydrogen atom before Bohr himself did. However, Garbasso's favorable attitude toward Bohr's model failed to open wide the door of Italian physics to quantum mechanics. The intellectual environment among Italian physicists was not yet ready for such a revolutionary change and, ironically, Garbasso himself was among the physicists who were not yet ready to accept quantum mechanics. A few months after his theoretical paper, he in fact wrote a paper in which he supported a revised version of J.J. Thomson's classical atomic model, formerly suggested by the German theoretical physicist Woldemar Voigt (1850-1919). As stressed by Garbasso:
It is of the utmost importance, from the logical viewpoint, that J.J. Thomson's model, from which Lorentz developed his theory of the Zeeman effect, forbids any influence of the electric field upon the emission process. [...] Since 1901 Voigt attempted to generalize Thomson's atom. At that time he supposed that the cubic density of charge was a function of the radius vector rather than constant. From this hypothesis it follows that, when an electric field is present, each line gives rise to a pair of lines whose elements are displaced toward the same side [...] with respect to the original line. However, Stark and Lo Surdo discovered a symmetrical configuration as regards the Balmer series lines. Voigt has recently [1914] taken up his calculations again, by supposing a potential shaped as
+b2)c + —k2c3
where a, b, and c are the components of the electron displacement [15]. By this ad hoc choice of potential, Garbasso showed in his paper that a revised Thomson model might account for the experimental observations of the simultaneous action of an electric field and a magnetic field upon the Hα hydrogen spectral line. No mention is made of Planck's theory or of the "new Bohr's conceptions". When the First World War broke out, the full entrance of quantum mechanics into Italian physics was not yet an accomplished fact.
2. Brunetti and the study of the process of X-ray emission
The "Stark - Lo Surdo phenomenon" became the focus of interest at Arcetri in the prewar years and during the conflict. Around 1915, the Arcetri Institute was "a prolific center for [physical] studies, and the spectroscopy - i.e. the most immediate tool for studying the major contemporary issues - was kept there in great esteem" [16]. Soon after Lo Surdo's discovery and Garbasso's theoretical papers, further experimental work on this subject was carried out by Garbasso's other assistants. Among the topics covered were the effect of the electric field on further lines of the Balmer series, polarization conditions, possible regularities with the order number of the series lines, and the Stark effect in helium. Much more importantly, one of the young experimental physicists at Arcetri, Rita Brunetti (1890-1942), played a key role in ensuring that Garbasso's occasional use of quantum mechanics did not remain an isolated effort. Rita Brunetti had obtained her degree in experimental physics from the University of Pisa in 1913 with a thesis in spectroscopy. Two years later she was appointed assistant in Arcetri under Garbasso's supervision. During the war, Garbasso went to the front as a volunteer lieutenant in the Corps of Engineers, where he set up an efficient phono-telemetric service. During his - and most of
Arcetri assistants' - leave, Brunetti took over the management of the institute. In 1928 she eventually became the first Italian woman to direct an Institute of Physics (at the University of Cagliari). Her relevance to the full entrance of quantum mechanics into Italian physics lies in the fact that, after her original work on the Stark effect, she worked between 1919 and 1920 on the process of X-ray emission with the help of quantum hypotheses. As a consequence of this research, she discovered a post-cathode emission caused by electronic ejection (1920) and verified the emission law of the characteristic radiations of various metals. She eventually arrived at a theoretical interpretation of the dependence of the polarization of the continuous background on the applied voltage (1926). As early as 1918, she had very cautiously hinted at Bohr's 1913 quantization of energy in her paper on the behavior of high-frequency spectra in a magnetic field [17]. Although she readily set limits to the use of quanta in physics, it was the first time in five years that an Italian physicist had looked at quantum theory. Quantum theory received the consideration it deserved only one year later, when Brunetti wrote her first paper on X-ray emission (1919) [18]. In this paper, Brunetti took a definite position in favor of the quantum theory, and her collection of experimental data was directed toward this goal. For example, in her introduction she wrote:
The theory concerning the mechanism of emission of radiant energy out of an oscillator has several interesting applications for the field of high frequency radiations. According to this theory, the oscillator may emit etheric waves only in such a way that the energy expended for their formation is a whole multiple of an elementary quantity whose value is $h\nu$, if $\nu$ is the frequency of the emitted waves and $h = 6.55 \times 10^{-27}$ is a universal constant.
In Brunetti's approach, quantum mechanics is seen as a very efficient theoretical tool for explaining the spectroscopic data. No attention is paid to the conceptual problems posed by this tool. In this approach, Brunetti shared both Garbasso's instrumentalist methodology and the experimentalist tradition of Italian physics. Most importantly, Brunetti shared Garbasso's interest in spectroscopy. However, unlike Garbasso, she did not turn back to the old models, as shown by her 1921 review paper on the atomic nucleus, where Bohr's theory is held in high esteem [19]. At the beginning of the new decade, skepticism about quantum mechanics was still dominant among the institutes of physics in Italy. However, Garbasso's and Brunetti's efforts were soon to be vindicated by a young student who was about to graduate in physics a few kilometers away from Arcetri, at the prestigious Scuola Normale Superiore of Pisa. His name was Enrico Fermi.
3. Fermi, the "quantum mechanics propagandist"
In 1918, Enrico Fermi (1901-1954) won a fellowship at the Scuola Normale. He spent four years at the University of Pisa, gaining his doctoral degree in physics in 1922 under Luigi Puccianti, then director of the Institute of Physics and formerly Garbasso's assistant in Arcetri. Although Puccianti had made important contributions to experimental spectroscopy when he worked in Arcetri, by the time of his Pisa directorship he was no longer involved in modern physics research. As there was a substantial lack of courses on quantum mechanics and other modern physics subjects in Italian universities, one may wonder where Fermi found instruction and guidance. The answer, as is well known, is that Fermi was largely self-taught from his high school years. The early and solitary attention paid by Fermi to modern physics topics, and in particular to the literature on spectroscopy and quantum mechanics, is documented by a set of recently surfaced juvenile notebooks, currently preserved at the Domus Galilaeana in Pisa (together with other manuscripts and notebooks relating to the scientific activity carried out by Fermi during his life in Italy) [20]. One of these notebooks, entitled "Riassunto di memorie di fisica" (Summary of papers on physics), contains a collection of summaries of papers written by other researchers and original papers by Fermi himself in his younger years (figure 4) [21]. Among others, there are papers by Einstein, Richardson, Sommerfeld, Laue, Debye and Levi-Civita. Most importantly, it includes a summary of Bohr's famous 1913 Philosophical Magazine paper on atomic spectra [8].
Figure 4. Enrico Fermi's Italian-language synopsis of Bohr's 1913 paper on atomic spectra [21].
Another juvenile notebook, preserved in the Enrico Fermi Collection, University of Chicago, throws further light on the extent of Fermi's knowledge of the old quantum theory. This notebook, titled "Alcune teorie fisiche" (Some physical theories), was written between July 12, 1919 and September 29, 1919, i.e. shortly before his second year as a physics student [22]. Its 102 pages are packed with notes and bibliographies on several branches of modern physics; e.g., chapter 2 (pp. 29-57) is devoted to the electronic theory of matter, and its bibliography has an extended section on spectroscopy (papers by Voigt, Stark, Bohr, Garbasso, etc.); chapter 3 (pp. 58-66) is devoted to the Planck equation and blackbody radiation. On January 30, 1920 he wrote to his close friend Enrico Persico (letter preserved by the Niels Bohr Library in Washington, D.C.) that he had "taken up again the study of the progress made in physics during the war". Furthermore, Puccianti had charged him with delivering lectures on quantum mechanics. Among the subjects to be touched on by Fermi in his lecture was the "Stark - Lo Surdo phenomenon". "Little by little", Fermi wrote,
I am becoming the most influential authority at the Institute of Physics. And more than that, one of these days I am going to deliver a conference on the quantum theory - about which I am always a propagandist - before a gathering of tycoons [emphasis added] [23].
As of May 1920 he had a complete mastery of the Bohr-Sommerfeld model, as shown by another letter to Persico where he hinted at his solution of a theoretical spectroscopy problem [24]. During the same year he likely carried out an in-depth study of Sommerfeld's Atombau und Spektrallinien, which for several years served as the reference textbook on quantum mechanics.
During the fall of 1920, Fermi's fellow students at Pisa (Franco Rasetti and Nello Carrara) came to recognize his immense superiority in the knowledge of mathematics and physics and "henceforth regarded him as their natural leader, looking to him rather than to the professors for instruction and guidance" [25]. During his third and fourth years at the Scuola Normale, Fermi published his first theoretical papers. These were mostly devoted to problems in electromagnetism and relativity, such as the first one (January 1921), dealing with the inert mass of a rigid system of electric charges [26]. Fermi's first lasting contributions to quantum mechanics derive from a group of papers on analytical mechanics which he completed during his stay in Göttingen, one year after graduation. In particular, P. Ehrenfest, who had delved deeply into the foundations of statistical mechanics, was impressed by Fermi's 1923 proof of the ergodic theorem [27]. At last, in 1926, while teaching theoretical physics at the University of Florence, he published his celebrated paper on the statistical mechanics of particles obeying the Pauli exclusion principle (fermions) [28]. This work immediately won him international fame, since Sommerfeld recognized its revolutionary significance for understanding the properties of conduction electrons in metals and many other phenomena. Quantum mechanics was no longer a forbidden topic in Italy.
Figure 5. Some of the protagonists of the entrance of quantum mechanics in Italy from a 1925 photograph: E. Fermi, N. Carrara, F. Rasetti (left to right) and R. Brunetti (2nd row) [30].
4. Conclusion
As is well known, the quantum theory originated in an empirical attempt to bring the then current theories of black-body radiation into line with experiment. Of vastly greater significance were the successes which the postulate $\varepsilon = h\nu$ had, during the first decade of the 1900s, in explaining other phenomena, such as the photoelectric effect and the specific heats of solids. As we have shown, these major developments were largely extraneous to Italian physics, as Planck's formula apparently "deeply repelled" the conceptions of Italian physicists. Thus, Italian physics was slow to tackle quantum mechanics for several years, and when things changed, the debates abroad on black-body radiation, the photoelectric effect, etc. played no role. As a matter of fact, quantum mechanics entered Italian physics only in 1913, i.e. when this field was seen as a fruitful tool for dealing with experimental spectroscopy. This research tradition in spectroscopy had an old and glorious heritage, dating back to Padre Angelo Secchi's stellar spectroscopy and to what was likely the first professional society devoted to astrophysics, namely the Società degli Spettroscopisti Italiani (1871).
Among the major protagonists of this tradition during the first two decades of the 1900s, as regards laboratory spectroscopy, were Garbasso and his Arcetri school. He was the first to apply Bohr's 1913 theory of atomic spectra with the goal of explaining the recently discovered Stark effect. This spectroscopy school kept flourishing during the second decade of the century through Garbasso's assistants, such as Lo Surdo, who improved the experimental techniques for detecting the Stark effect, and Brunetti, who was successful in using quantum mechanics to explain X-ray spectra. Finally, signs of this tradition are also present in the early work carried out by Fermi, whose landmark papers allowed quantum mechanics to fully enter Italian physics during the early 1920s. This contention is supported by some of his juvenile notebooks, preserved both at the Domus Galilaeana in Pisa and at the University of Chicago, in which spectroscopy issues and Bohr's theory receive wide coverage.
Acknowledgments
We are grateful to the Enrico Fermi Collection, University of Chicago, for access to the notebook "Alcune teorie fisiche", to the AIP Center for History of Physics Niels Bohr Library for access to the Fermi-Persico correspondence, and to the Accademia Nazionale delle Scienze "detta dei XL" for access to the Archive for History of Quantum Physics.
References
1. O.M. Corbino, N. Cimento 17, 256 (1909).
2. O.M. Corbino, N. Cimento 3, 368 (1912).
3. A. Lo Surdo, Rend. R. Acc. Lincei 22, 624 (1913).
4. J. Stark, Ann. d. Phys. 43, 965 (1914).
5. J. Stark, Ann. d. Phys. 43, 983 (1914).
6. J. Stark, Nature 92, 401 (1913).
7. M. Leone, A. Paoletti, N. Robotti, Phys. Perspect. 6, 271 (2004).
8. N. Bohr, Phil. Mag. 26, 1 (1913).
9. N. Bohr, Phil. Mag. 27, 506 (1914).
10. A. Garbasso, Rend. R. Acc. Lincei 22, 635 (1913); N. Cimento 6, 338 (1914); Phys. Zeit. 15, 122 (1914).
11. N. Bohr, Nature 92, 554 (1914).
12. Garbasso to Bohr, January 19, 1914, Archive for History of Quantum Physics (hereafter AHQP).
13. Bohr to Garbasso, February 17, 1914, Bohr Scientific Correspondence (2,5), AHQP.
14. G. Polvani, Atti SIPS, 669 (1939).
15. A. Garbasso, N. Cimento 9, 376 (1915).
16. Z. Ollano, N. Cimento 19, 221 (1942).
17. R. Brunetti, N. Cimento 16, 5 (1918).
18. R. Brunetti, N. Cimento 18, 266 (1919).
19. R. Brunetti, N. Cimento 22, 215 (1921).
20. M. Leone, N. Robotti, C.A. Segnini, Physis 37, 501 (2000).
21. E. Fermi, Notebook N2, Fermi Archives, Domus Galilaeana, Pisa, Italy.
22. E. Fermi, Notebook "Alcune teorie fisiche", Enrico Fermi Collection, University of Chicago.
23. Fermi to Persico, January 30, 1920, AIP Center for History of Physics - Niels Bohr Library.
24. Fermi to Persico, May 30, 1920, AIP Center for History of Physics - Niels Bohr Library.
25. F. Rasetti, in E. Fermi, Note e Memorie (Collected Papers), Vol. 1, Acc. Lincei - Univ. Chicago Press, Rome - Chicago 1961, p. 55.
26. E. Fermi, N. Cimento 22, 199 (1921).
27. E. Fermi, N. Cimento 25, 267 (1923).
28. E. Fermi, Rend. R. Acc. Lincei 3, 145 (1926); Z. Physik 36, 902 (1926).
29. R. Brunetti, N. Cimento 10, 129 (1933).
30. C. Bernardini, L. Bonolis, Conoscere Fermi, Società Italiana di Fisica, Bologna 2001, p. 338.
THE MEASURE OF MOMENTUM IN QUANTUM MECHANICS
FABRIZIO LOGIURATO
Department of Physics, University of Trento, 38050 Povo, Trento, Italy
CARLO TARSITANI
Department of Physics, University of Roma 'La Sapienza', Roma, Italy
The de Broglie relation $p = h/\lambda$ is often used in the heuristic deduction of the Schrödinger equation. Yet this relation does not appear among the postulates of quantum theory. In fact, most textbooks neglect the physical definition of the quantum concept of momentum. In this paper we show that the definition of momentum as a derived quantity, operationally founded on the typical measurement of the so-called "flight time", not only fits very well with the physical principles of the quantum theory, but can also help to avoid common ambiguities in the enunciation of Heisenberg's uncertainty principle.
1 Introduction
The complementarity and uncertainty principles are considered the two chief "pillars" that sustain the stately quantum edifice. After so many decades of theoretical, experimental and technological successes, this edifice appears to be both grand and solid. However, it is worth noticing that the two great principles which form the foundations of the theoretical framework, if thoroughly analyzed, still appear somewhat vague and uncertain [1]. In this paper we will focus our attention on the uncertainty principle (UP) and on the "meaning variance" of its different enunciations (footnote a). In particular, we will refer only to the well-known relation between position and momentum uncertainties. Usually, physics textbooks introduce UP either by describing the classical thought experiments by Heisenberg and Bohr [3, 4], or by carrying out a formal demonstration (for the one-dimensional case) based on the Fourier theorems or, more generally, on the algebra of operators [5, 6]. $\Delta x$ and $\Delta p_x$ are obviously linked to the dispersion of the measured values of the two quantities. It is thus presumed that the measurements are made on an ensemble of identically prepared systems, so that UP is obtained by measuring each time just one of the two observables (footnote b).
Footnote a. Indeed, under the label "uncertainty principle" one may find not only different physical statements, but also different formal expressions for the functions of the observables, for the relations with the physical states and for the various kinds of measurements [2]. Take, for instance, the common statement: "a simultaneous and indefinitely accurate measurement of a pair of noncommuting observables is impossible". Actually, two incompatible observables may share a subset of eigenstates. If the system is in a state belonging to that subset, the values of both observables can be simultaneously measured as accurately as we want!
Now, the question we want to answer is the following: what is the connection between the experimental procedure we adopt to verify UP, Fourier transforms, and the Heisenberg-Bohr thought experiments? In particular, how are measurements of momentum performed? Let us consider, for instance, the well-known experiment of diffraction by a single slit (Fig. 1). A beam of particles is directed towards a wall with a hole in it, whose diameter we denote by $d$. Each particle is prepared in the same state: its momentum has absolute value $p$, and its direction is orthogonal to the wall. On a screen parallel to the wall and at distance $L$ from it, we observe the typical diffraction pattern. Therefore, if we consider the particles as classical corpuscles, we are forced to admit that, while passing through the hole, the momentum of the particles acquires a transverse component.
Fig. 1. The diffraction experiment: the transverse component of the momentum of each particle is measured by means of the position detected on the screen.
Footnote b. In other words, we can imagine splitting the ensemble of identically prepared systems into two equal parts: for the first part we make an "ideal" (infinitely precise) measurement of the position and, for the second, an ideal measurement of the momentum.
The direction of the momentum of the great majority of particles lies between the angles $\theta$ and $-\theta$, where $\theta$ indicates the position of the first minimum of the diffraction pattern. For small $\theta$, we can assume
$$p_x = p\,\frac{x}{L}, \qquad (1)$$
where $x$ is the position of a particle on the screen and $L$ the distance between the wall and the screen. For each particle, the x-coordinate immediately after the hole will be known with an uncertainty $\Delta x \sim d$, and we can assume that the x-component of the momentum will acquire an uncertainty $\Delta p_x \approx p \sin\theta$. So far, our description of the process is based on a particle model of the quantum system. However, in order to obtain UP we need to introduce the wave model. From wave theory we know that, for the first minimum of the diffraction pattern, we have $\lambda = d \sin\theta$. So, by means of the de Broglie relation, it is easy to deduce the relation:
$$\Delta x\,\Delta p_x \approx h. \qquad (2)$$
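The chain of substitutions leading to relation (2) can be checked numerically. The following is a minimal sketch, assuming illustrative values that are not taken from the text (a wavelength of 50 pm and a slit width of 1 µm); the function name `slit_uncertainty_product` is hypothetical. It takes p from the de Broglie relation, the first-minimum angle from λ = d sin θ, and forms the product Δx Δp_x with Δx = d and Δp_x = p sin θ.

```python
import math

H = 6.62607015e-34  # Planck constant in J*s (exact SI value)

def slit_uncertainty_product(wavelength, slit_width):
    """Return (sin_theta, delta_p, delta_x * delta_p) for the single-slit estimate.

    Hypothetical helper: lengths in metres, momenta in kg*m/s.
    """
    if wavelength > slit_width:
        raise ValueError("no first minimum: wavelength exceeds slit width")
    p = H / wavelength                    # de Broglie relation p = h / lambda
    sin_theta = wavelength / slit_width   # first minimum: lambda = d * sin(theta)
    delta_x = slit_width                  # position uncertainty taken as d
    delta_p = p * sin_theta               # transverse momentum uncertainty
    return sin_theta, delta_p, delta_x * delta_p

# Illustrative (assumed) values: 50 pm wavelength, 1 micrometre slit.
sin_theta, delta_p, product = slit_uncertainty_product(50e-12, 1e-6)
print(f"sin(theta) = {sin_theta:.2e}")
print(f"delta_p_x  = {delta_p:.3e} kg m/s")
print(f"product/h  = {product / H:.6f}")
```

In this estimate the wavelength cancels, so the product equals h for any admissible choice of λ and d; the sketch only makes that cancellation explicit.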
Let's focus our attention on the quantity "momentum" in the relation (2). First, we have defined it in a purely classical way. Second, its measurement is indirect: actually, we can only measure positions and/or times. In the case of the single slit experiment, we don't need time measurements: we already know the absolute value of/?, so must measure just positions. According to Popper and Ballentine [7, 8], the demonstration of UP presupposes the assumption that, for each particle, both position and momentum are measurable so accurately that the product of the respective uncertainties can be less than h. Indeed, for each particle j which has gone through the hole and it is arrived to the screen, we have A^ = d, and Apx ~ 0. Indeed, this is a consequence of the fact that we have performed a classical measurement of the momentum, based on the measurement of two positions. In order to avoid this apparent paradox, we must make a sharp distinction between the concepts of state preparation and data prediction. In fact, UP states that it is impossible to find an ensemble of systems in a dispersion-free state both for momentum and for position (that is in a state in which all particles prepared with the same momentum can be detected at the same position). By taking into account the single particle j , if we know that AjX ~ d, we can predict that Ajpx ~ h/d. In other terms, after having measured the position of a single system with an uncertainty Ajx, our knowledge of the value of momentum is affected by an uncertainty not less than Apx ~ h/d. Often UP is stated as such: "it is impossible to know simultaneously and with exactness both the position and the momentum of a particle" [9]. However we have shown the ambiguity of this expression: if we don't know position we cannot know momentum. We could state UP in a more correct fashion by saying: "Once the position of a particle is measured, the value of its momentum
can be predicted with an uncertainty not less than that established by UP (and vice versa)"; or by saying: "It is impossible to prepare a system or an ensemble of systems which have simultaneously well defined values of position and momentum". Feynman, too, uses an ambiguous statement of UP in his Lectures [10]. However, it is worth noticing that he also writes:

"Sometimes people say quantum mechanics is all wrong. When the particle arrived from the left, its vertical momentum was zero. And now that it has gone through the slit, its position is known. Both position and momentum seem to be known with arbitrary accuracy. It is quite true that we can receive a particle, and on reception determine what its position is and what its momentum would have had to have been to have gotten there. That is true, but that is not what the uncertainty relation refers to [...] the fact that it went through the slit no longer permits us to predict the vertical momentum. We are talking about a predictive theory, not just measurements after the fact. So we must talk about what we can predict."$^c$
2 The definition of momentum in classical and quantum physics
Let us adopt an "operational" definition of momentum [11]. In classical physics, the momentum of a particle is deduced from the measurement of its velocity. The average velocity $\bar{v}$ of the particle is obtained by observing its positions $x$ and $x_0$ at two different instants $t$ and $t_0$.$^d$ We cannot adopt the same method in quantum theory because of the well-known fact that the first observation changes the system's state. However, this shortcoming can be easily avoided by the following procedure. Let $\Delta x$ be the uncertainty with which we know the initial position of the particle (or the dispersion of the positions of an ensemble of particles, each prepared in the same identical state). Let's also assume that the wave function $\psi_0(x)$, for $t = 0$, is centered on the origin of the spatial coordinates. If, at the time $t$, we observe the position of the particle at the point $x$, its velocity will be $v_x = x/t$, so that its momentum will be $p_x = mx/t$. If we suppose that we are able to measure both $x$ and $t$ with arbitrary precision, the uncertainty on the measured value of the momentum is $\delta p_x = m\Delta x/t$. Therefore, for large $t$, we can reduce the uncertainty of the momentum as much as we want, and the effect of the uncertainty on the initial position is negligible. Let's now calculate the probability distribution $P(p_x)$ of the momentum after a set of measurements on an ensemble of identically prepared systems. Since $p_x = mx/t$, the probability $P(p_x)\,dp_x$ that the value of the particle's momentum lies between $p_x$ and $p_x + dp_x$ is equal, according to the "flight time"
$^c$ Also Heisenberg specifies [4]: "the uncertainty relation does not refer to the past".
$^d$ If the particle is free, the average velocity is equal to the instantaneous velocity.
technique, to the probability $p(x)\,dx$ that the particle will be revealed, at the time $t$, between $x$ and $x + dx$. The "propagator" for the free particle is given by the following expression:

$$K(x,t;x',0) = \sqrt{\frac{m}{2\pi i\hbar t}}\,\exp\!\left[\frac{i m (x - x')^2}{2\hbar t}\right]. \qquad (3)$$
The probability amplitude that a particle will be observed at $x$ at the time $t$ is

$$\psi(x,t) = \int K(x,t;x',0)\,\psi_0(x')\,dx'. \qquad (4)$$
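From (3) it follows that

$$\psi(x,t) = \sqrt{\frac{m}{2\pi i\hbar t}}\,\exp\!\left(\frac{i m x^2}{2\hbar t}\right)\int \exp\!\left[\frac{i m}{2\hbar t}\left(x'^2 - 2xx'\right)\right]\psi_0(x')\,dx'. \qquad (5)$$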
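According to our operational definition of momentum, we can write, for large $t$,

$$p(x)\,dx = \frac{m\,dx}{2\pi\hbar t}\left|\int \exp\!\left[\frac{i m}{2\hbar t}\left(x'^2 - 2xx'\right)\right]\psi_0(x')\,dx'\right|^2 = P(p_x)\,dp_x. \qquad (6)$$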
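Putting $p_x = mx/t$, we obtain:

$$P(p_x)\,dp_x = \frac{dp_x}{2\pi\hbar}\left|\int \exp\!\left(\frac{i m x'^2}{2\hbar t} - \frac{i p_x x'}{\hbar}\right)\psi_0(x')\,dx'\right|^2. \qquad (7)$$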
If the initial amplitude $\psi_0(x)$ is different from zero only in a small region $\Delta x$ centred on the origin, then for times $t \to \infty$ the ratio $m\Delta x^2/2\hbar t$ tends to zero: the contribution of the first exponential to the integral will be negligible. Thus, the distribution of momentum is equal to the square modulus of the amplitude:

$$\varphi(p_x) = \frac{1}{\sqrt{2\pi\hbar}}\int \exp\!\left(-\frac{i p_x x'}{\hbar}\right)\psi_0(x')\,dx'. \qquad (8)$$
$^e$ This proof is essentially due to Feynman [12]. It is worth noticing that, when we passed from the distribution (7) to the amplitude (8), we omitted a phase factor. For a complete proof, see the old book by Kemble [13]. Its first version is due to Kennard [14].

Now, if the initial amplitude is

$$\psi_0(x) = \frac{1}{\sqrt{2\pi\hbar}}\exp\!\left(\frac{2\pi i x}{\lambda}\right), \qquad (9)$$
the corresponding probability amplitude $\varphi(p_x)$ for the momentum will be Dirac's function $\delta(p_x - h/\lambda)$, so that the momentum assumes the single value $p_x = h/\lambda$. All the measurements of momentum, performed on an ensemble of particles identically prepared in the state (9), would always give the same value. The values of momentum for the state (9) don't show any dispersion. Then the amplitude (9) represents a momentum eigenstate whose corresponding eigenvalue is given by the de Broglie relation. In contrast with the great majority of textbooks, there is no need to postulate the de Broglie relation: it can be deduced from the wave function.$^f$ However, we must observe that the above measurement of the momentum eigenstate is an "ideal" measurement. Indeed, the eigenstate (9) is represented by a function which is defined in the entire space and whose square modulus is constant and different from zero. Therefore, the measurement process that we have described so far cannot be performed. Actually, we deal with wave trains. In this case, we can easily show that by increasing the length of a finite wave train we obtain a narrower and narrower distribution, which tends to a Dirac "delta function". Let us choose, for instance, as initial probability amplitude, the following finite and normalized wave train of length $l$:

$$\psi_0(x) = \begin{cases} \dfrac{1}{\sqrt{2l}}\exp\!\left(\dfrac{2\pi i x}{\lambda}\right) & \text{if } |x| \le l, \\[2mm] 0 & \text{if } |x| > l. \end{cases} \qquad (10)$$

We obtain, as its Fourier transform,

$$\varphi(p_x) = \frac{1}{\sqrt{\pi\hbar l}}\,\frac{\sin\{l\,[(2\pi/\lambda) - (p_x/\hbar)]\}}{(2\pi/\lambda) - (p_x/\hbar)}. \qquad (11)$$
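The Fourier transform of the finite wave train can be compared with a direct numerical quadrature. This is a sketch under stated assumptions: natural units with $\hbar = 1$, a wave train supported on $|x| \le l$ with normalization $1/\sqrt{2l}$, and hypothetical function names chosen only for illustration.

```python
import numpy as np

HBAR = 1.0  # natural units, an assumption made only for this illustration


def phi_analytic(p, half_length, wavelength):
    """Closed form sin(l k) / (k sqrt(pi hbar l)), k = 2 pi / lambda - p / hbar."""
    k = 2.0 * np.pi / wavelength - p / HBAR
    # np.sinc(z) = sin(pi z) / (pi z), so sqrt(l / (pi hbar)) * sinc(l k / pi)
    # equals sin(l k) / (k sqrt(pi hbar l)) and stays finite at k = 0.
    return np.sqrt(half_length / (np.pi * HBAR)) * np.sinc(half_length * k / np.pi)


def phi_numeric(p, half_length, wavelength, samples=200001):
    """Trapezoidal evaluation of the momentum amplitude of the wave train."""
    x = np.linspace(-half_length, half_length, samples)
    psi0 = np.exp(2j * np.pi * x / wavelength) / np.sqrt(2.0 * half_length)
    integrand = np.exp(-1j * p * x / HBAR) * psi0
    dx = x[1] - x[0]
    integral = dx * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return integral / np.sqrt(2.0 * np.pi * HBAR)


if __name__ == "__main__":
    lam = 1.0
    p_peak = 2.0 * np.pi * HBAR / lam  # de Broglie value h / lambda
    for half_length in (5.0, 20.0):
        print(half_length,
              abs(phi_numeric(p_peak, half_length, lam)),
              phi_analytic(p_peak, half_length, lam))
```

In this sketch the peak value at $p_x = h/\lambda$ is $\sqrt{l/\pi\hbar}$, so quadrupling $l$ doubles the height of the maximum while the distribution narrows.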
For $l \to \infty$ the function (11) becomes gradually narrower around the value $p_x = h/\lambda$, where it has a maximum which goes to infinity as the square root of $l$. Differently from the state (9), in the new state both the function of position (10) and its Fourier transform (11) are normalized and correspond to concretely realizable physical states.$^g$

$^f$ For instance, Sakurai introduces the relation in a rather formal way by means of the infinitesimal translation operator and the analogy with the function that generates the corresponding canonical transformation of analytical mechanics [6].
$^g$ Obviously also the time required to perform the measurement tends to become infinite, as we can see from equation (5).

3 The diffraction by a single slit and the Fourier transform
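Let's come back to the single-slit diffraction experiment. For the particle at the slit the position probability amplitude is constant:

$$\psi(x) = \begin{cases} d^{-1/2} & \text{if } |x| \le d/2, \\ 0 & \text{if } |x| > d/2. \end{cases} \qquad (12)$$

The Fourier transform of (12) is:

$$\varphi(p_x) = \left(\frac{d}{2\pi\hbar}\right)^{1/2}\frac{\sin[p_x(d/2\hbar)]}{p_x(d/2\hbar)}. \qquad (13)$$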
Let's analyze the diffraction pattern revealed by the screen. Let's assume that the incident beam is made of plane waves. From the classical wave model, under Fraunhofer's condition ($L \gg d$) and for small angles ($\sin\theta \approx \theta$), the intensity of the waves is:

$$I(\theta) = I(0)\left\{\frac{\sin[(\pi d/\lambda)\theta]}{(\pi d/\lambda)\theta}\right\}^2, \qquad (14)$$

where $I(0)$ is a factor corresponding to the central maximum intensity [15]. Hence, the position probability density will be ($\theta = x/L$):

$$p(x) = |\psi(0)|^2\left\{\frac{\sin[(\pi d/\lambda)(x/L)]}{(\pi d/\lambda)(x/L)}\right\}^2. \qquad (15)$$

We know that, in the classical particle model, the transverse momentum $p_x$, which is acquired by the particle while it is going through the hole, is $p_x = p\,x/L$. By observing on the screen the particle's point of arrival, we will obtain an indirect measurement of its transverse momentum. Therefore the momentum distribution for an ensemble of particles is the following:

$$P(p_x) = |\varphi(0)|^2\left\{\frac{\sin[p_x(d/2\hbar)]}{p_x(d/2\hbar)}\right\}^2. \qquad (16)$$
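The step from the screen pattern to the momentum distribution amounts to the identity $(\pi d/\lambda)(x/L) = p_x d/2\hbar$, which follows from $p = h/\lambda$ and $p_x = p\,x/L$. The sketch below checks this identity and the normalization of the squared slit transform numerically; it assumes natural units ($\hbar = 1$), and the function names and sample values are illustrative only.

```python
import numpy as np

HBAR = 1.0  # natural units (assumption)


def momentum_density(px, d):
    """Squared slit transform: (d / 2 pi hbar) [sin(u) / u]^2, u = px d / (2 hbar)."""
    u = px * d / (2.0 * HBAR)
    return d / (2.0 * np.pi * HBAR) * np.sinc(u / np.pi) ** 2


def screen_argument(x, d, wavelength, L):
    """Argument of the sine in the screen pattern: (pi d / lambda)(x / L)."""
    return (np.pi * d / wavelength) * (x / L)


def momentum_argument(x, d, wavelength, L):
    """Same argument written with p = h / lambda and px = p x / L."""
    p = 2.0 * np.pi * HBAR / wavelength
    px = p * x / L
    return px * d / (2.0 * HBAR)


def total_probability(d, cutoff=4000.0, samples=400001):
    """Trapezoidal integral of the momentum density over [-cutoff, cutoff]."""
    px = np.linspace(-cutoff, cutoff, samples)
    y = momentum_density(px, d)
    dp = px[1] - px[0]
    return dp * (y.sum() - 0.5 * (y[0] + y[-1]))


if __name__ == "__main__":
    x = np.linspace(-0.01, 0.01, 11)
    print(np.allclose(screen_argument(x, 1.0, 0.3, 50.0),
                      momentum_argument(x, 1.0, 0.3, 50.0)))
    print(total_probability(1.0))
```

The integral approaches one only slowly as the cutoff grows, because the tails of the distribution decay like $1/p_x^2$.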
The connection between equation (16) and the Fourier transform is now evident: the momentum probability amplitude corresponds to the Fourier transform (13). So, the connection is established between the deductions of UP from Heisenberg's experiment and from the method of Fourier transforms. The diffraction pattern on the screen is deeply connected with the Fourier transform of the wave amplitude on the slit: it is nothing else than its square modulus. This is a widely known result in classical optics [16], but it is often missing in quantum mechanics textbooks. The "time of flight" and the "point of arrival" measurement techniques are equivalent. Both, in fact, adopt the classical particle model for quantum systems. Let t be the time spent by the particle to go from the slit to the screen. Given that
$$p_x = \frac{mx}{t} = m\,\frac{L}{t}\,\frac{x}{L} = p\,\frac{x}{L}, \qquad (17)$$

for the time of flight we get:

$$t = \frac{m\lambda L}{h}. \qquad (18)$$
Since $\Delta x \sim d$, and since the wavelength $\lambda$ has the same order of magnitude, for $L \gg \lambda$ the first term in the exponential in the expression (7) is negligible. Again, we obtain the momentum distribution amplitude as the Fourier transform of the position wave function. The condition $L \gg \lambda$ is certainly satisfied if we adopt Fraunhofer's approximation.$^h$ We have just seen that the dispersions $\Delta x$ and $\Delta p_x$ are connected by Fourier transforms. Yet, why do the "classical" formulation of UP as $\Delta x\,\Delta p_x \ge \hbar/2$ and the relation (2) of the first section appear so different? The answer is simple: as Uffink and Hilgevoord [18] have noticed, the reason is that both Heisenberg and Bohr, in their thought experiments, adopt for $\Delta x$ and $\Delta p_x$ definitions which cannot be interpreted as variances. According to Uffink and Hilgevoord, a good quantitative measure of the uncertainty cannot be given by the variance. In fact, in the single-slit experiment, by applying such a definition to the distributions $|\psi(x)|^2$ and $|\varphi(p_x)|^2$, we get $\Delta x = d/\sqrt{12}$ and $\Delta p_x = \infty$, so that the uncertainty on momentum does not depend on the width of the slit. The reason is that the function $|\psi(x)|^2$ is characterized by sharp boundaries and goes discontinuously to zero. Therefore Uffink and Hilgevoord are forced to introduce a different definition for uncertainties. For Braginsky and Khalili [2], the difficulty does not depend on the above definition: it is sufficient to consider more realistic measurements by eliminating the discontinuities of the distributions. According to their point of view, if we adopt the definition of uncertainty as "variance" and, at the same time, take a slit with "smooth" boundaries (we describe the distribution $|\psi(x)|^2$ as a Gaussian function), we again get the usual expression for UP. However, we must underline a conceptual difference.
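The two dispersions quoted for the sharp-edged slit can be checked numerically. The sketch below assumes natural units ($\hbar = 1$); the helper names are hypothetical. It computes the standard deviation of the uniform position density and shows that the second moment of the momentum density grows without bound as the integration cutoff is enlarged.

```python
import numpy as np

HBAR = 1.0  # natural units (assumption)


def trapezoid(values, step):
    """Composite trapezoidal rule on an equally spaced grid."""
    return step * (values.sum() - 0.5 * (values[0] + values[-1]))


def position_std(d, samples=200001):
    """Standard deviation of the uniform density 1/d on [-d/2, d/2]."""
    x = np.linspace(-d / 2.0, d / 2.0, samples)
    density = np.full_like(x, 1.0 / d)
    return np.sqrt(trapezoid(x ** 2 * density, x[1] - x[0]))


def truncated_momentum_second_moment(d, cutoff, samples=400001):
    """Integral of px^2 |phi(px)|^2 over [-cutoff, cutoff] for the sharp slit."""
    px = np.linspace(-cutoff, cutoff, samples)
    u = px * d / (2.0 * HBAR)
    density = d / (2.0 * np.pi * HBAR) * np.sinc(u / np.pi) ** 2
    return trapezoid(px ** 2 * density, px[1] - px[0])


if __name__ == "__main__":
    d = 2.0
    print(position_std(d), d / np.sqrt(12.0))
    for cutoff in (100.0, 200.0, 400.0):
        print(cutoff, truncated_momentum_second_moment(1.0, cutoff))
```

Doubling the cutoff roughly doubles the truncated second moment, which is the numerical signature of $\Delta p_x = \infty$ for a slit with discontinuous edges.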
In the usual expression of UP, the uncertainties do not refer to the measured values of the observables but are regarded as intrinsic features of the prepared state. On the contrary, the relation (2) can be interpreted in terms of uncertainties produced by the detector in the measurement process of the single particle state. By measuring position with an uncertainty $\Delta x_D$, we create an uncertainty on the previous value of momentum which is at least $\Delta p_{xD} = \hbar/(2\Delta x_D)$. So, we can symbolically distinguish the two kinds of uncertainty relations by introducing two different notations:$^i$

$$\delta x_\psi\,\delta p_{x\psi} \ge \hbar/2, \qquad \Delta x_D\,\Delta p_{xD} \ge \hbar/2. \qquad (19)$$

$^h$ For further details and for some applications of the classical definition of momentum in quantum mechanics see Ref. 17.

4 Conclusions
A clear presentation of the position-momentum uncertainty relation is deeply influenced by the use of the wave and the particle model. In quantum mechanics, if we want to define physical concepts by means of an operational approach, momentum can only be defined in terms of position and time measurements. Therefore the operational definition must be essentially "classical". For this very reason, we are forced to adopt a definite model of the system we deal with. We think that this example shows how, by paying more attention to the way the concepts are used in quantum physics, we could avoid misconceptions and prevent the confusion between verbs like to predict, to prepare and to measure. Moreover, by using the classical definition of momentum we can get both the de Broglie relation and what is already known from classical optics: the diffraction pattern, interpreted as the particle distribution amplitude, is just the Fourier transform of the particle amplitude at the slit. The deep connection between the heuristic deduction of UP by means of the usual thought experiments and its formal deduction by means of Fourier transforms is often missing in textbooks.

$^i$ A new class of measurements, which combines the uncertainties (19), can be defined [2]. We perform simultaneous measurements, which are of course inaccurate, both of the position and of the conjugate momentum for each member of the identically prepared set of systems. It is possible to show that the squares of the statistical dispersions sum up with the squares of the uncertainties due to the measurements. Then it is easy to show the validity of the relation $\Delta x_{\psi D}\,\Delta p_{\psi D} \ge \hbar$, so that the lower limit of the product is two times the usual one.

Acknowledgements

It is a pleasure to thank Prof. Stefano Oss for his kind support, and Dr. B. Danese, Dr. S. Defrancesco and Dr. L. M. Gratton for valuable discussions.

References

1. M. Jammer, The Philosophy of Quantum Mechanics (Wiley, New York, 1974).
2. M. G. Raymer, Am. J. Phys. 62, 986 (1994).
3. W. Heisenberg, Z. Phys. 43, 172 (1927).
4. W. Heisenberg, The Physical Principles of the Quantum Theory (University of Chicago Press, Chicago, 1930).
5. S. Gasiorowicz, Quantum Physics (Wiley, New York, 1974).
6. J. J. Sakurai, Modern Quantum Mechanics (Addison-Wesley, Reading, MA, 1985).
7. K. R. Popper, Quantum Theory and the Schism in Physics, from the Postscript to the Logic of Scientific Discovery (Routledge, London, 1992).
8. L. E. Ballentine, Rev. Mod. Phys. 42, 358 (1970).
9. M. Alonso and E. J. Finn, Fundamental University Physics, Vol. 3 (Addison-Wesley, Reading, MA, 1972).
10. R. P. Feynman, R. B. Leighton and M. Sands, The Feynman Lectures on Physics, Vol. 3 (Addison-Wesley, Reading, MA, 1989).
11. P. W. Bridgman, The Logic of Modern Physics (Macmillan Co., New York, 1927).
12. R. P. Feynman and A. R. Hibbs, Quantum Mechanics and Path Integrals (McGraw-Hill, New York, 1964).
13. E. C. Kemble, The Fundamental Principles of Quantum Mechanics (Dover, New York, 1958).
14. E. H. Kennard, Phys. Rev. 31, 876 (1928).
15. F. A. Jenkins and H. E. White, Fundamentals of Optics (McGraw-Hill, New York, 1957).
16. P. M. Duffieux, The Fourier Transform and Its Applications to Optics (Wiley, New York, 1983).
17. F. Logiurato, The Concept of Momentum in Classical Physics and Quantum Mechanics, unpublished (2005).
18. J. B. Uffink and J. Hilgevoord, Found. Phys. 15, 925 (1985).
ON THE TWO-SLIT INTERFERENCE EXPERIMENT: A STATISTICAL DISCUSSION
M. MINOZZO
Department of Economics, Finance and Statistics, University of Perugia,
Via A. Pascoli, 06100 Perugia, Italy
E-mail: [email protected]
In the last decades, many two-slit interference experiments have actually been performed as sequential experiments by sending as few as possible particles at a time through some interfering apparatus. In this work, under the usual axioms for probability theory of Kolmogorov, a novel purely particle statistical approach to the analysis of these interference experiments is proposed by explicating the sequential nature of the experimental observations.
1. Introduction

The origins of the two-slit experiment go back to Young's experiment in 1803, in which a beam of visible light separated by two pinholes combines to form an interference pattern. At the beginning of the twentieth century interference experiments were also performed for X-rays, in 1912, and for beams of electrons, in 1927. More recently, the groups of Tsuchiya et al.$^1$ and of Tonomura et al.$^2$, e.g., carried out, for photons and electrons respectively, two-slit experiments in which not beams, but particles, one after the other, were sent through the interference device. Since their discovery, all these and other interference phenomena have always been regarded as the strongest empirical evidence in support of wave-like theories in quantum physics. Indeed, these phenomena, together with diffraction phenomena, are the only direct evidence of the wave-like behaviour of light and matter. Although purely particle realizations of the two-slit experiment have been performed, up to now only a few purely particle-based explanations have been attempted, and in the mainstream theoretical developments interference experiments are always explained by appealing to some sort of wave-like entities, be they called waves, fields or strings. In this work, which is a development of ideas already presented by the author elsewhere,$^3$ after formulating in Sec. 2 our ideal purely particle two-slit experiment in a
Figure 1. Two-slit ideal experimental set-up (labels: Source; Barrier, coordinate x; Screen, coordinate y; contours (a), (b), (c)).
sequential statistical setting under the usual axioms for probability theory of Kolmogorov, we consider a particle-based toy model in Secs. 3 and 4 which explains the interference pattern in a purely statistical fashion, without using any wave-like mathematics or concept. In Sec. 4 we also show that this model accounts for the non-additivity paradox which usually arises by comparing the interference pattern with the patterns obtained by closing one slit at a time.

2. The Two-Slit Interference Experiment

Let us introduce the classical idealized version for particles of the two-slit interference experiment. For our purposes we will not make any distinction between different types of particles, such as photons, electrons or neutrons. A particle could be whatever entity with the property of being well localized in a small region of space and of moving from one place to another in a continuous way without splitting anywhere. As usual, although real experiments are three-dimensional phenomena, our discussion will focus on the usual two-dimensional geometrical section. The two-slit ideal experimental set-up (see Fig. 1) consists basically of three different elements: a source of particles, a barrier with two slits and a screen. The whole two-slit interference experiment is then made up of three distinct experiments: a first experiment with both slits open, a second experiment with only the first slit open, and a third experiment with only the second slit open. Consider first the experiment with both slits open. When the experiment starts, particles are sent out of the source sequentially, one after the other, towards the barrier. Whereas many stop against the barrier, some particles find their way through the slits and end up on the screen, where their position is recorded. When enough particles have arrived on the screen, the histogram of their positions will start to resemble the classical interference pattern described by the contour (a). On the other hand, considering the experiment with only the first slit open, sending one particle after the other towards the barrier and onto the screen we obtain an empirical histogram which approximates something like the contour (b), whereas with only the second slit open we obtain something like the contour (c). We come here to the essence of the problem. Whereas the histograms relative to the experiments with only one slit open are in accordance with what we would have expected, the histogram obtained with both slits open, against our common sense experience, is not the uniform mixture of the other two, and shows instead a pattern which is similar to the distribution of intensity of a wave. According to orthodox views, it is in this sense that light, as well as matter, behaves "sometimes like a particle and sometimes like a wave", and it is in this sense that we observe the so-called phenomenon of the "self-interference" of a single particle. From a purely particle viewpoint, the "non-additivity", in the sense of non-uniform mixture, of the three histograms seems to give rise to a paradoxical situation. In fact, this is in contrast with what we would have expected assuming that the barrier acts only as a static selection rule. However, all three experiments are actually performed sequentially, in a long single run, and this assumption is equivalent to assuming that before each particle is sent out of the source the whole experimental set-up can be considered as if it had been completely reset to the same identical initial conditions, that is, it is equivalent to ruling out all the interactions that could have occurred (in particular those between the particles and the barrier) up to then.
On the other hand, the non-additivity paradox disappears as soon as we abandon the hypothesis of a barrier acting as a static selection rule and we recognize that the three histograms are obtained in three distinct experiments in which different, dynamic or not, effects can take place. Thus, let us consider the following modeling framework. Each of the three experiments can be represented on its own probability space $(\Omega, \mathcal{F}, P)$ where the set $\Omega$ represents the set of all possible realizations $\omega \in \Omega$ of the experiment. For each of these probability spaces, to account for the position of the particles at the barrier we consider a stochastic process $(X_n)$. The position at the barrier of the $n$th particle in a given experimental realization $\omega$ is represented by $x_n = X_n(\omega)$. To represent the position of the particles at the screen we consider a stochastic process $(Y_n)$ assuming values in the extended real line $\mathbb{R} \cup \{\infty\}$. For any given experimental realization $\omega$, we assume $y_n = Y_n(\omega)$ equal to $\infty$ if the $n$th particle does not pass the barrier, and equal to some real value, representing the position at the screen of the particle, if the $n$th particle does indeed pass the barrier. For this framework we use the following ordinary interpretation of the concept of probability. Let us imagine that we have a very large, ideally infinite, number of ideal laboratories in the same identical conditions, where, in each of them, we have an experimental set-up identical to the one in Fig. 1. Also imagine that in each laboratory a single run of length $N$ of one of the three experiments is performed, that is, $N$ particles are sequentially sent out of the source one after the other. Then the probability $P(X_n \le x)$, $x \in \mathbb{R}$, for some given $n = 1, \ldots, N$, will represent the proportion of the laboratories in which the $n$th particle in the run had a position $x_n \le x$. For the experiment with both slits open, the distribution of the position of the particles at the screen in a given experimental realization $\omega \in \Omega$ of length $N$ is given by the empirical distribution function, for $y < \infty$, $F(y; N) = \#\{n : n \le N, Y_n(\omega) \le y\}/\#\{n : n \le N, Y_n(\omega) < \infty\}$, where $\#$ denotes cardinality. Considering a very long, ideally infinite, run, the form of the interference pattern is represented by the limiting empirical distribution function $F(y) = \lim_{N\to\infty} F(y; N)$, for $y < \infty$. For the other two experiments with only the first or the second slit open respectively, we similarly define empirical distribution functions and limiting empirical distribution functions, for which we have, for $y < \infty$, $F'(y) = \lim_{N\to\infty} F'(y; N)$ and $F''(y) = \lim_{N\to\infty} F''(y; N)$. Of course, these, finite or limiting, empirical distribution functions, representing the proportion of particles that in a given run, among all particles that passed the barrier, have a position $Y_n \le y$, are random quantities and not probabilities.
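The empirical distribution function $F(y; N)$ can be written directly as a ratio of two counts. The following is a minimal sketch; the function name and the sample record are illustrative assumptions, and `math.inf` encodes a particle that did not pass the barrier, as in the convention for $Y_n$.

```python
import math


def empirical_cdf(screen_positions, y):
    """F(y; N) = #{n <= N : Y_n <= y} / #{n <= N : Y_n < infinity}.

    `screen_positions` is the record (y_1, ..., y_N) of one run, with
    math.inf marking particles stopped by the barrier.
    """
    arrived = [value for value in screen_positions if value < math.inf]
    if not arrived:
        raise ValueError("no particle reached the screen in this run")
    return sum(1 for value in arrived if value <= y) / len(arrived)


# Hypothetical run of N = 5 particles, two of which hit the barrier.
run = [0.5, math.inf, -1.0, 2.0, math.inf]
print(empirical_cdf(run, 0.5))
```

The value depends on the particular run, which is why the text stresses that it is a random quantity and not a probability.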
3. A Dynamic Model with Memory Effects at the Barrier

To show the possibilities of our approach we shall now discuss, with no attempt to provide any physical theory, a toy model in which the barrier is allowed to dynamically interact with the particles. For the experiment with both slits open, let us suppose that the emission of particles by the source is governed by a stochastic process such that particles arrive at the barrier independently and identically uniformly distributed over a finite interval covering the two slits. Denoting with $m'$ and $m''$ the position of the middle of the two slits and with $2v'_0$ and $2v''_0$ their widths, we assume that $X_n$, $n = 1, 2, \ldots$, are independent and identically uniformly distributed over a finite interval $[-b, b]$ well covering $[m' - v'_0, m'' + v''_0]$. To specify the interaction between the particles and the barrier, we assume that at each instant of time $n = 1, 2, \ldots$, each slit can be in one
of three states: 0, 1 and 2, say. Denoting with $S'_n$ and $S''_n$ the states of the two slits at time $n$, the barrier as a whole can be described, at any given instant of time $n$, by the pair $(S'_n, S''_n)$, which can be in one of the 9 combinations (0,0), (0,1), (0,2), ..., (2,2). Then particles close to the barrier are assumed to be subjected to two effects taking place before (attraction) and after (redirection) the barrier respectively. To describe the attraction effect, we imagine that for each slit a funnel, with a width depending on the state of the slit, "captures" the particles coming from the source. Since the net result of the funnels would be to capture some particles that were originally meant to hit the barrier and to funnel them through the slits, the total effect of the funnels can just be described by a fictitious widening of the slits. Denoting with $W'_n$ and $W''_n$, $n = 1, 2, \ldots$, the fictitious half widths of the two slits, we assume the simple linear relationships $W'_n = v'_0 + cS'_n$ and $W''_n = v''_0 + cS''_n$, where $c$ is some real constant. Since the two slits can be in three different states, the half widths of the two slits can take three different values which we denote with $v'_0, v'_1, v'_2$, and $v''_0, v''_1, v''_2$. (To avoid having to take account of what happens when the two widenings overlap, we will deal with widenings small compared to the distance between the two slits.) Particles passing the barrier are also subjected to a redirection effect. Looking again at the net result, this effect is responsible for the position of the particles at the screen. For the first slit (and similarly for the second slit) we assume that if the $n$th particle is meant to pass through it and its position at the barrier is $X_n$, then its position at the screen is given by

$$Y_n = \begin{cases} X_n + d\big((X_n - m')/W'_n,\; S'_n\big), & \text{if } (X_n - m') \ge 0, \\ X_n - d\big((m' - X_n)/W'_n,\; S'_n\big), & \text{if } (X_n - m') < 0, \end{cases} \qquad (1)$$

where the "drift" function $d(X_n, S'_n)$, for $0 \le X_n \le 1$, is defined by
3 *(y»,^)=u«2 sB; ,v. ( i + (, n 2^ < - i )^ ),
if S'n = 0, «« o if s'n := ,1,2.
(2)
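The linear relation between the state of a slit and its fictitious half width, together with the uniform arrival law on $[-b, b]$, fixes the probability that a particle enters a given slit. The sketch below illustrates this; the function names are hypothetical, and the numerical values $v_0 = 0.1$, $c = 0.2$, $b = 3$ are the ones the text quotes for its simulated run.

```python
def fictitious_half_width(state, v0, c):
    """Half width W = v0 + c * S of a slit in state S (S = 0, 1 or 2)."""
    if state not in (0, 1, 2):
        raise ValueError("slit state must be 0, 1 or 2")
    return v0 + c * state


def passage_probability(half_width, b):
    """P(X_n in slit) for X_n uniform on [-b, b]: (2 W) / (2 b) = W / b.

    Assumes the widened slit lies entirely inside [-b, b].
    """
    return half_width / b


if __name__ == "__main__":
    v0, c, b = 0.1, 0.2, 3.0
    for state in (0, 1, 2):
        w = fictitious_half_width(state, v0, c)
        print(state, w, passage_probability(w, b))
```

With these values the three half widths are 0.1, 0.3 and 0.5, so a slit in state 2 captures five times as many particles as the same slit in state 0.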
From Formula (1), it is easy to see that the regions of the real line where the first derivative of $d(X_n, S'_n)$ with respect to $X_n$ is close to 0 correspond to the locations on the screen where the particles accumulate more. To describe how particles interact with the barrier, indicating with $\mathcal{S}'_n = [m' - W'_n;\, m' + W'_n]$ and $\mathcal{S}''_n = [m'' - W''_n;\, m'' + W''_n]$, $n = 1, 2, \ldots$, the (fictitious) openings of the two slits, and assuming that we start the experiment (with both slits open) at time $n = 1$ with the barrier in state $S'_1 = 0$ and $S''_1 = 0$, we make the following assumptions:
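i) if $X_n \in \mathcal{S}'_n$, then $S'_{n+1} = S'_n - 1$ and $S''_{n+1} = S''_n + 1$;
ii) if $X_n \in \mathcal{S}''_n$, then $S'_{n+1} = S'_n + 1$ and $S''_{n+1} = S''_n - 1$;
iii) if $X_n \notin \mathcal{S}'_n \cup \mathcal{S}''_n$, then $S'_{n+1} = S'_n$ and $S''_{n+1} = S''_n$.

We also assume that if the states of the two slits cannot decrease or increase, because they are already equal to 0 or 2, then they remain the same. Considering, e.g., assumption i) (assumptions ii) and iii) having a similar interpretation), if the $n$th particle passes through the first slit, then the $(n+1)$th particle will find the state of the first slit decreased by a unit and the state of the second slit increased by a unit.

4. Interference Pattern and Non-Additivity Paradox

Now, for the experiment with both slits open, let us define, for the position of the $n$th, $n = 1, 2, \ldots$, particle at the barrier, the conditional distributions $U'_i(x) = P(X_n \le x \mid X_n \in \mathcal{S}'_n, S'_n = i)$ and $U''_j(x) = P(X_n \le x \mid X_n \in \mathcal{S}''_n, S''_n = j)$, for $i = 0, 1, 2$ and $j = 0, 1, 2$. Then, considering the transformation determined by formula (1), we define the conditional distributions of the position of the $n$th particle at the screen as $G'_i(y) = P(Y_n \le y \mid X_n \in \mathcal{S}'_n, S'_n = i)$ and $G''_j(y) = P(Y_n \le y \mid X_n \in \mathcal{S}''_n, S''_n = j)$, whose densities $g'_i(y)$ and $g''_j(y)$ are symmetric, centered around the middles $m'$ and $m''$ of the respective slits, and, apart from $g'_0(y)$ and $g''_0(y)$, also bimodal. Then considering the first slit we have

$$P(Y_n \le y, X_n \in \mathcal{S}'_n \mid S'_n = i, S''_n = j) = P(Y_n \le y \mid X_n \in \mathcal{S}'_n, S'_n = i, S''_n = j)\,P(X_n \in \mathcal{S}'_n \mid S'_n = i, S''_n = j) = G'_i(y)\,2v'_i/2b,$$

for $i, j = 0, 1, 2$, and, similarly, for the second slit we have, for $i, j = 0, 1, 2$,

$$P(Y_n \le y, X_n \in \mathcal{S}''_n \mid S'_n = i, S''_n = j) = P(Y_n \le y \mid X_n \in \mathcal{S}''_n, S'_n = i, S''_n = j)\,P(X_n \in \mathcal{S}''_n \mid S'_n = i, S''_n = j) = G''_j(y)\,2v''_j/2b.$$

With these conditional distributions we can then write, for $i, j = 0, 1, 2$,

$$P(Y_n \le y \mid X_n \in \mathcal{S}'_n \cup \mathcal{S}''_n, S'_n = i, S''_n = j) = \frac{b}{v'_i + v''_j}\left[G'_i(y)\frac{v'_i}{b} + G''_j(y)\frac{v''_j}{b}\right] = \frac{1}{v'_i + v''_j}\left[G'_i(y)v'_i + G''_j(y)v''_j\right],$$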
Figure 2. States of the two slits for the experiment with both slits open.
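and so, for $i, j = 0, 1, 2$,

$$\begin{aligned} P(Y_n \le y, X_n \in \mathcal{S}'_n \cup \mathcal{S}''_n, S'_n = i, S''_n = j) &= P(Y_n \le y \mid X_n \in \mathcal{S}'_n \cup \mathcal{S}''_n, S'_n = i, S''_n = j) \\ &\quad\times P(X_n \in \mathcal{S}'_n \cup \mathcal{S}''_n \mid S'_n = i, S''_n = j)\,P(S'_n = i, S''_n = j) \\ &= (v'_i + v''_j)^{-1}\left[G'_i(y)v'_i + G''_j(y)v''_j\right](2v'_i + 2v''_j)(2b)^{-1}\,P(S'_n = i, S''_n = j). \end{aligned}$$

Moreover, for $i, j = 0, 1, 2$, we can also write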
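$$\begin{aligned} P(X_n \in \mathcal{S}'_n \cup \mathcal{S}''_n, S'_n = i, S''_n = j) &= P(X_n \in \mathcal{S}'_n \cup \mathcal{S}''_n \mid S'_n = i, S''_n = j)\,P(S'_n = i, S''_n = j) \\ &= \left[(2v'_i + 2v''_j)/(2b)\right]P(S'_n = i, S''_n = j). \end{aligned}$$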
Thus, considering that the evolution of the state of the barrier $(S'_n, S''_n)$ is as described in Fig. 2, and so that the actually admissible states for the barrier are only the states (0,0), (0,1), (0,2), (1,0), (1,1), (2,0), the distribution of a particle at the screen, for a given $n = 1, 2, \ldots$, is given by

$$P(Y_n \le y \mid X_n \in \mathcal{S}'_n \cup \mathcal{S}''_n) = \frac{\sum_{i=0}^{2}\sum_{j=0}^{2-i} P(Y_n \le y, X_n \in \mathcal{S}'_n \cup \mathcal{S}''_n, S'_n = i, S''_n = j)}{\sum_{i=0}^{2}\sum_{j=0}^{2-i} P(X_n \in \mathcal{S}'_n \cup \mathcal{S}''_n, S'_n = i, S''_n = j)} = \frac{\sum_{i=0}^{2}\sum_{j=0}^{2-i}\left[G'_i(y)v'_i + G''_j(y)v''_j\right]P(S'_n = i, S''_n = j)}{\sum_{i=0}^{2}\sum_{j=0}^{2-i}(v'_i + v''_j)\,P(S'_n = i, S''_n = j)}. \qquad (3)$$

Let us now evaluate the distribution of a particle at the screen for $n$ large. The evolution of the state of the barrier forms a discrete-time, discrete-state, homogeneous Markov chain. Starting with the barrier in state $(S'_1 = 0, S''_1 = 0)$, after some particles have passed the barrier, the Markov chain will have left forever, with probability one, every transient state for the set of non-transient states at the bottom of Fig. 2. For this set
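of non-transient states, for every $n = 1, 2, \ldots$, the (homogeneous) transition probability matrix is given by

$$P = \begin{pmatrix} p_{02,02} & p_{02,11} & 0 \\ p_{11,02} & p_{11,11} & p_{11,20} \\ 0 & p_{20,11} & p_{20,20} \end{pmatrix},$$

where, for $i', i'', j', j'' = 0, 1, 2$, $p_{i'i'',j'j''} = P(S'_{n+1} = j', S''_{n+1} = j'' \mid S'_n = i', S''_n = i'')$ are the probabilities of transition from state $(i', i'')$ to state $(j', j'')$, which, depending on the fictitious widths of the slits, are given by

$$\begin{aligned} p_{02,02} &= P(X_n \notin [m' - v'_0;\, m' + v'_0] \cup [m'' - v''_2;\, m'' + v''_2]) = 1 - (v'_0 + v''_2)/b, \\ p_{02,11} &= P(X_n \in [m' - v'_0;\, m' + v'_0]) + P(X_n \in [m'' - v''_2;\, m'' + v''_2]) = (v'_0 + v''_2)/b, \\ p_{11,02} &= P(X_n \in [m' - v'_1;\, m' + v'_1]) = v'_1/b, \\ p_{11,11} &= P(X_n \notin [m' - v'_1;\, m' + v'_1] \cup [m'' - v''_1;\, m'' + v''_1]) = 1 - (v'_1 + v''_1)/b, \\ p_{11,20} &= P(X_n \in [m'' - v''_1;\, m'' + v''_1]) = v''_1/b, \\ p_{20,11} &= P(X_n \in [m' - v'_2;\, m' + v'_2]) + P(X_n \in [m'' - v''_0;\, m'' + v''_0]) = (v'_2 + v''_0)/b, \\ p_{20,20} &= P(X_n \notin [m' - v'_2;\, m' + v'_2] \cup [m'' - v''_0;\, m'' + v''_0]) = 1 - (v'_2 + v''_0)/b. \end{aligned}$$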
Moreover, for these non-transient states, the limiting probabilities of recurrence $\pi_{02} = \lim_{n\to\infty} P(S'_n = 0, S''_n = 2)$, $\pi_{11} = \lim_{n\to\infty} P(S'_n = 1, S''_n = 1)$ and $\pi_{20} = \lim_{n\to\infty} P(S'_n = 2, S''_n = 0)$ can be obtained by solving the usual system of equations

$$\begin{aligned} \pi_{02} &= \pi_{02}\,p_{02,02} + \pi_{11}\,p_{11,02}, \\ \pi_{11} &= \pi_{02}\,p_{02,11} + \pi_{11}\,p_{11,11} + \pi_{20}\,p_{20,11}, \\ \pi_{20} &= \pi_{11}\,p_{11,20} + \pi_{20}\,p_{20,20}, \\ \pi_{02} + \pi_{11} + \pi_{20} &= 1. \end{aligned}$$
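The system for the limiting probabilities is linear and can be solved numerically. The sketch below takes the transition probabilities exactly as printed above and, as an assumption, uses the parameter values quoted in the text for the simulated run ($v'_0 = v''_0 = 0.1$, $c = 0.2$, $b = 3$), so that both slits share the half widths $v_0, v_1, v_2$. The function name is hypothetical.

```python
import numpy as np


def stationary_distribution(v0=0.1, c=0.2, b=3.0):
    """Return (P, pi) for the non-transient states ordered (0,2), (1,1), (2,0).

    Assumes equal unperturbed half widths for the two slits, so that
    v'_k = v''_k = v0 + c * k for k = 0, 1, 2.
    """
    v = [v0 + c * k for k in range(3)]
    p02_11 = (v[0] + v[2]) / b
    p11_02 = v[1] / b
    p11_20 = v[1] / b
    p20_11 = (v[2] + v[0]) / b
    P = np.array([
        [1.0 - p02_11, p02_11, 0.0],
        [p11_02, 1.0 - p11_02 - p11_20, p11_20],
        [0.0, p20_11, 1.0 - p20_11],
    ])
    # pi P = pi is rank deficient, so keep two balance equations and
    # replace the third with the normalisation sum(pi) = 1.
    A = np.vstack([(P.T - np.eye(3))[:2], np.ones(3)])
    rhs = np.array([0.0, 0.0, 1.0])
    return P, np.linalg.solve(A, rhs)


if __name__ == "__main__":
    P, pi = stationary_distribution()
    print(P)
    print(pi)
```

For these symmetric parameter values the balance equations give $\pi_{11} = 2\pi_{02} = 2\pi_{20}$, so the barrier spends half of the time in state (1,1).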
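Thus, finally, considering $n \to \infty$, for the distribution of a particle at the screen for the experiment with both slits open we have

$$\lim_{n\to\infty} P(Y_n \le y \mid X_n \in \mathcal{S}'_n \cup \mathcal{S}''_n) = \frac{\left[G'_0(y)v'_0 + G''_2(y)v''_2\right]\pi_{02} + \left[G'_1(y)v'_1 + G''_1(y)v''_1\right]\pi_{11} + \left[G'_2(y)v'_2 + G''_0(y)v''_0\right]\pi_{20}}{(v'_0 + v''_2)\pi_{02} + (v'_1 + v''_1)\pi_{11} + (v'_2 + v''_0)\pi_{20}}. \qquad (4)$$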
= i}/#{n:n
= i},
and, for y < oo, for j = 0,1, 2, Gj'(y;iV)=#{n:n
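Then, the empirical distribution function of the positions of the particles at the screen for a run of the experiment of length $N$ is given by, for $y < \infty$,

$$F(y; N) = \frac{(1/N)\,\#\{n : n \le N, Y_n \le y, X_n \in \mathcal{S}'_n \cup \mathcal{S}''_n\}}{(1/N)\,\#\{n : n \le N, X_n \in \mathcal{S}'_n \cup \mathcal{S}''_n\}} = \frac{\sum_{i=0}^{2}\sum_{j=0}^{2-i}(1/N)\,\#\{n : n \le N, Y_n \le y, X_n \in \mathcal{S}'_n \cup \mathcal{S}''_n, S'_n = i, S''_n = j\}}{\sum_{i=0}^{2}\sum_{j=0}^{2-i}(1/N)\,\#\{n : n \le N, X_n \in \mathcal{S}'_n \cup \mathcal{S}''_n, S'_n = i, S''_n = j\}},$$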
where, for i, j = 0,1,2, for y < oo, and for n < N, (l/JV)#{n : y„ < y, X„ S 5 ; U 5;', 5 ; = i, S'^ = j } _ # f n : X , < y , ^ e ^ , ^ = i , ^ / = j } # { n : ^ < y , J k €%',.%=», $'=.7}
^ " \
#{n:sk =
+G'J(y;N)
itS»=j}
JV # { n : X n G 5 ; ' , 5 ; = i,SZ=j}\ / # { n : S ' n = i,S^=j} #{n:S^ = i,S»=j} J\ JV
and, for $i, j = 0, 1, 2$, for $y < \infty$, and for $n \le N$,

$$\frac{1}{N}\,\#\{n : X_n \in \mathcal{S}'_n \cup \mathcal{S}''_n, S'_n = i, S''_n = j\} = \left(\frac{\#\{n : X_n \in \mathcal{S}'_n \cup \mathcal{S}''_n, S'_n = i, S''_n = j\}}{\#\{n : S'_n = i, S''_n = j\}}\right)\left(\frac{\#\{n : S'_n = i, S''_n = j\}}{N}\right).$$

Now, considering an infinite run of the experiment, the distribution of the positions of the particles at the screen is given, for $y < \infty$, by the limiting empirical distribution function $F(y) = \lim_{N\to\infty} F(y; N)$. In general, different infinite realizations would lead to different limiting empirical distribution functions; however, for the present model it is easy to see, by a repeated application of the strong law of large numbers,$^4$ that $F(y) = \lim_{n\to\infty} P(Y_n \le y \mid X_n \in \mathcal{S}'_n \cup \mathcal{S}''_n)$, almost surely, that is, for a set of infinite realizations of the experiment having probability one. So, for a given single run of the experiment of length $N$, with $N$ large enough, we would expect the contour of the histogram of the positions of the particles at the screen to assume a pattern close to the probability density

$$\frac{\left[g'_0(y)v'_0 + g''_2(y)v''_2\right]\pi_{02} + \left[g'_1(y)v'_1 + g''_1(y)v''_1\right]\pi_{11} + \left[g'_2(y)v'_2 + g''_0(y)v''_0\right]\pi_{20}}{(v'_0 + v''_2)\pi_{02} + (v'_1 + v''_1)\pi_{11} + (v'_2 + v''_0)\pi_{20}}. \qquad (6)$$

An evaluation of the empirical distribution function $F(y; N)$ has been obtained by simulating a single run of the experiment. For $m' = -2.1$, $m'' = 2.1$, $v'_0 = 0.1$, $v''_0 = 0.1$, $c = 0.2$, and $X_n$, $n = 1, 2, \ldots$, independently and identically uniformly distributed over $[-3, 3]$, Fig. 3 (left) shows the
Figure 3. (Left) Histogram of the positions $y_{n_k}$, $k = 1, \ldots, 10000$, of the particles at the screen for the simulated experiment with both slits open. (Right) Histograms (shown together) of the positions $y_{n_k}$, $k = 1, \ldots, 5000$, of the particles at the screen for the two simulated experiments with only the first or the second slit open respectively.
histogram of the simulated positions of a sequential record of 10,000 particles that arrived at the screen. Let us note that this histogram, which follows the pattern of the density (6), has been obtained by grouping together all the positions of the particles, from the 1st to the 10,000th, that arrived sequentially at the screen in a single run, and not, in particular, by grouping together 10,000 values $y_N$ (for the particles that actually arrived at the screen) obtained by simulating a large number of runs of the same given length $N$. Let us now consider the experiment with only the first slit open. Starting the experiment with $S'_1 = 0$, and considering that, if a particle passes through the open slit, that is, if $X_n \in S'_n$, then $S'_{n+1} = S'_n - 1$, and that, if $X_n \notin S'_n$, then $S'_{n+1} = S'_n$, for this experiment we have that $S'_n = 0$, $n = 1, 2, \ldots$. Thus, for $n = 1, 2, \ldots$, for the distribution of a particle at the screen we have $P(Y_n \le y \mid X_n \in S'_n) = P(Y_n \le y \mid X_n \in S'_n, S'_n = 0) = G'_0(y)$. On the other hand, the empirical distribution function of the positions of the particles at the screen for a run of the experiment of length $N$, for $y < \infty$, is given by

$$F'(y;N) = \frac{\#\{n : n \le N,\ Y_n \le y,\ X_n \in S'_n\}}{\#\{n : n \le N,\ X_n \in S'_n\}} = \frac{\#\{n : n \le N,\ Y_n \le y,\ X_n \in S'_n,\ S'_n = 0\}}{\#\{n : n \le N,\ X_n \in S'_n,\ S'_n = 0\}} = G'_0(y;N). \qquad (7)$$
Considering instead a run of infinite length, for the limiting empirical distribution function $F'(y) = \lim_{N \to \infty} F'(y;N)$, $y < \infty$, of the positions of the particles at the screen we have $F'(y) = G'_0(y)$, almost surely. Finally, consider the experiment with only the second slit open. Starting the experiment with $S''_1 = 0$, and considering that, if $X_n \in S''_n$, then $S''_{n+1} = S''_n - 1$, and that, if $X_n \notin S''_n$, then $S''_{n+1} = S''_n$, we have that $S''_n = 0$, $n = 1, 2, \ldots$. So, for $n = 1, 2, \ldots$, $P(Y_n \le y \mid X_n \in S''_n) = P(Y_n \le y \mid X_n \in S''_n, S''_n = 0) = G''_0(y)$. Then the empirical distribution function of the positions of the particles at the screen for a run of the experiment of length $N$, for $y < \infty$, is given by

$$F''(y;N) = \frac{\#\{n : n \le N,\ Y_n \le y,\ X_n \in S''_n\}}{\#\{n : n \le N,\ X_n \in S''_n\}} = \frac{\#\{n : n \le N,\ Y_n \le y,\ X_n \in S''_n,\ S''_n = 0\}}{\#\{n : n \le N,\ X_n \in S''_n,\ S''_n = 0\}} = G''_0(y;N). \qquad (8)$$
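The counting ratios that define these empirical distribution functions translate directly into code. The following is a minimal sketch, not the author's simulation: it assumes that each trial is recorded as a hypothetical pair `(passed, y)`, where `passed` says whether the particle went through an open slit and `y` is its position at the screen, and it returns the fraction of particles that reached the screen at a position not exceeding `y`.

```python
from bisect import bisect_right


def screen_positions(records):
    """Keep the screen position of every trial whose particle passed a slit.

    `records` is a hypothetical list of (passed, y) pairs, one per emitted
    particle, in the order in which the particles were emitted.
    """
    return [y for passed, y in records if passed]


def empirical_cdf(positions):
    """Return the function y -> #{k : y_k <= y} / #{k}."""
    ordered = sorted(positions)
    if not ordered:
        raise ValueError("no particle reached the screen")

    def cdf(y):
        # bisect_right counts how many of the sorted positions are <= y.
        return bisect_right(ordered, y) / len(ordered)

    return cdf


# Toy record of four emitted particles; the second one is stopped at the barrier.
records = [(True, 0.5), (False, None), (True, -1.0), (True, 2.0)]
F = empirical_cdf(screen_positions(records))
```

Here `F(0.5)` is 2/3, because two of the three particles that reached the screen landed at or below 0.5. Restricting the records to runs with only one slit open would give the analogues of the one-slit empirical distribution functions.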
And, considering a run of infinite length, for the limiting empirical distribution function $F''(y) = \lim_{N \to \infty} F''(y;N)$, $y < \infty$, of the positions of the particles at the screen we have $F''(y) = G''_0(y)$, almost surely. Thus, performing an infinite run with each of the three experiments we would obtain, for every $p \in [0,1]$, $F(y) \ne pF'(y) + (1 - p)F''(y)$ $(-\infty < y < \infty)$, almost surely. In other words, considering a very long typical realization from each of the three experiments, we should not expect the empirical distribution function of the positions of the particles at the screen obtained with both slits open to be a mixture, and in particular a uniform mixture with $p = 1/2$, of the empirical distribution functions obtained with only the first or the second slit open respectively. Keeping the same parameter values used for the generation of the data in Fig. 3 (left), but considering that now only one slit is open, we numerically evaluated by simulation the empirical distribution functions $F'(y;N)$ and $F''(y;N)$. Fig. 3 (right) shows together the histograms of the positions of two sequential records of 5,000 particles that arrived at the screen, obtained by simulating a single run from each of the two experiments with only one slit open.

5. Conclusions

In our investigation, even if we have not been concerned with any specific real set of experimental observations, emphasis has been given to the modeling of actual experimental data and not to the reproduction of the standard theoretical results of quantum mechanics. Indeed, the present investigation, although not in disagreement with classical published data, would suggest the gathering and analysis of new experimental data from freshly made purely particle sequential experiments. 
For these experiments it would then be possible to investigate, among other things, also sequential properties (and so to discriminate between different models) that cannot be accounted for by the classical calculus of wave functions of quantum mechanics. For instance, under our model, even if the positions of the particles
Figure 4. Plot of the positions $y_{n_k}$ versus $y_{n_{k-1}}$ of successive particles at the screen for the simulated experiment with both slits open.
at the barrier are not sequentially dependent, the dynamic interaction of the barrier with the particles, when the two slits are both open, implies that they are nevertheless dependent at the screen. Analysing the simulated sequential data of Fig. 3 (left), Fig. 4 shows the plot of $y_{n_k}$ versus $y_{n_{k-1}}$, the positions of two successive particles at the screen, and the pattern of empty and filled crossings indicates that there is a dependence at the first time lag. Let us observe that even if actual experimental observations were not to show any temporal dependence, this refutation could only be discussed in our framework and not with the quantum mechanical calculus. Whereas in this work we showed how to tackle interference experiments (we would proceed similarly for diffraction experiments), in Ref. 5 it is shown how to account, in a purely particle sequential statistical fashion, for the correlation experiments associated with Bell's inequalities and with the Einstein-Podolsky-Rosen-Bohm gedankenexperiment.

References
1. Y. Tsuchiya, E. Inuzuka, T. Kurono and M. Hosoda, Advances in Electronics and Electron Physics 64A, 21 (1985).
2. A. Tonomura, J. Endo, T. Matsuda, T. Kawasaki and H. Ezawa, Am. J. Phys. 57, 117 (1989).
3. M. Minozzo, Proceedings of the 4th World Congress of the Bernoulli Society, Vienna, 1996, p. 332.
4. A. N. Shiryayev, Probability (Springer, Berlin, 1996).
5. M. Minozzo, in The Foundations of Quantum Mechanics: Historical Analysis and Open Questions, C. Garola and A. Rossi eds. (World Scientific, Singapore, 2000).
WHY THE REACTIVITY OF THE ELEMENTS IS A RELATIONAL PROPERTY, AND WHY IT MATTERS
VALERIA MOSINI
Dipartimento di Chimica, 'La Sapienza', and Centre for the Philosophy of Natural and Social Sciences, LSE
In this paper I discuss the role of relational properties, first detected in quantum mechanics, in other branches of science, most specifically, in chemistry. The widening of the domain of existence of relational properties calls for a revision of the notion of realism, which can, and should, be modelled on that formulated by Henry Margenau back in the 1950s.
1. Introduction In 1950 Henry Margenau, professor of physics at Yale and one of the founders of Philosophy of Science, published a book entitled The nature of physical reality. The aim of the book was to offer a way out of the long-standing controversy on the interpretation of science that opposed realists to empiricists, the former seeing science as an approximately true account of reality, the latter as a convenient, albeit possibly fictional, means to describe and predict the phenomena. The disagreement between the two camps on the interpretation of science was grounded on an even deeper disagreement about the subject matter of science: an objective, mind-independent, physical reality for the realist, a subjective collection of sense data for the empiricist. Accordingly, the realist saw the aim of science as consisting in the one-to-one mapping between scientific concepts and observational statements, while the empiricists refused to make claims that went beyond observations. Realists and empiricists, however, agreed on one thing: the existence of a sharp dividing line between the observer and what is observed. Margenau claimed that the assumption of separability between the observer ('spectator') and what is observed ('spectacle'), that he dubbed 'spectatorial doctrine', is a misleading oversimplification because the experience of reality is an inextricable mixture of external stimuli that act on our senses while being acted upon by our intellect. Hence the elements of reality are not wholly external to, but partly constructed by, us, according to metaphysical principles that are not fixed, but revisable, as shown by the several conceptual re-shuffling recorded by the history of science, which acted as turning points in
our world-view. Margenau claimed that the spectatorial doctrine is one such principle, and that evidence gathered mainly, but not exclusively, in the microscopic domain suggested abandoning it, showing that quantum mechanics, far from being an anomaly within physics, represented the "culmination of methods long present in natural science" (Margenau: 1950, p. v). Notwithstanding Margenau's prestige in the academic world, his (1950) book failed to impress his contemporaries and, in fact, met with a rather cold reaction. In the British Journal for the Philosophy of Science the book was described as framed within "an antiquated and obscure theory of knowledge", and Margenau was accused of being over-preoccupied "with the pseudo-problem of Reality" (Hutton: 1951, p. 81); in the Philosophical Review the book was called "an ill-conceived project", expressed "in confused language" by an author who displayed "a rather immature philosophical technique", and, therefore, achieved "scant results" (Smart: 1951, pp. 411-13); even Philosophy of Science (of which Margenau was one of the editors) was lukewarm (Werkmeister: 1951). Accordingly, Margenau's position went largely unnoticed, and the controversy that opposed realists to empiricists has been reiterated in exactly the same terms as it had been stated before the publication of the book (see, for instance, Smart: 1968, and Boyd: 1973 for the realists, and Hesse: 1967, and Laudan: 1984 for the empiricists). In this paper I defend Margenau's position by showing that the argument he used against the spectatorial doctrine, namely the existence of relational properties, whether between things or between things and the spectator, also applies outside the boundaries of quantum mechanics, for instance, in chemistry. 
I also claim that the initial reaction to Margenau's book was utterly inadequate, as shown by the fact that subsequent developments in quantum mechanics, most notably the discovery of entanglement, brought the question of relational properties to prominence (see, for instance, Teller: 1986 and Priest: 1989). Hence the negative reaction to the publication of The nature of physical reality was, in all likelihood, to be attributed to the philosophical community of the mid-twentieth century not being prepared to comprehend and appreciate an epistemology that was much ahead of its time.

2. Margenau on the nature of physical reality

Margenau's discussion of what physical reality amounts to began with considerations of what, to his mind, it does not: reality is not identical with truth, given that reality is a material notion, truth a formal notion that can only be established within a conventional framework. Margenau also denied that reality is a set of facts given by sensory experience, and, denouncing the "lack of
analytical depth peculiar to perceptions" (Margenau: 1950, p. 103), warned against relying solely on the deliverances of the senses, irrespective of how solid the evidence they provide appears to be. Following Schilpp (1935), he criticised the empiricists' account of reality as the no-further-analysable set of data given by perception, claiming that the passage from sensation to thought is gradual, continuous, and, above all, complex, and would be best described as an "imaginative supplementation". Against Kant's dualism of noumena and phenomena, Margenau claimed that reality and the experience of it are one and the same thing, which is brought about in a single event, and represents a "unique adventure". To sum up, for Margenau, cognition is neither imposed by perception, as the empiricists claimed, nor pre-determined by the categories of the mind, and, therefore, inescapable, as Kant claimed. However, cognition is not free, but constrained by metaphysical principles or "correspondence rules" (see also Margenau: 1935). These rules have no content of their own, as shown by the fact that they cannot be properly stated except by reference to the terms they relate to; moreover, their range of applicability cannot be determined a priori, and each rule becomes clear, albeit in a circular way, in its habitual application. Margenau noted that the practical invariability with which the cognitive process is performed confers on constructs objectivity and the semblance of uniqueness. However, when even the simplest act of cognition, reification, is considered in the scientific domain, it loses the deceptively immediate character it has in the domain of everyday experience: the abstraction embedded in the rules of correspondence becomes evident, and so does the fact that different degrees of abstraction are required, for instance, to interpret colour in terms of wavelength, to posit electric and magnetic fields, or to construct the wavefunction of microscopic entities. 
The difficulty in finding "common elements between the various rules of correspondence, for instance, between the "tree rule", the "wavelength rule", and the "electron rule", led many positivists to deny their presence altogether" (Margenau: 1950, p. 81). The lack of common elements between the various rules of correspondence, however, is counterbalanced by an important similarity: the fact that all scientific constructs satisfy the same requirements of stability, fertility, consistency, and empirical verifiability, whatever the rule implied to arrive at them. Constructs that have been corroborated by testing may be labelled "verifacts"; however, their ontological status is not due to "their referring to something real: on the contrary, they denote something real because they have been found valid" (Margenau: 1950, p. 292). Accordingly, the state function "is a property of an electron just as truly as the blue colour is a property of the sky" (Margenau: 1950, p. 68). Margenau pointed out that the history of science showed that even metaphysical
principles that appeared definitive have, at times, been abandoned, and that this happened because the principle in question led to shaping scientific constructs that did not satisfy one or more of the requirements cited above. "In view of such evidence" nobody can maintain "that the method now in vogue has any likelihood of being ultimate" (Margenau: 1950, p. 76), and it should be accepted that the metaphysical principles currently used might have to be amended or abandoned in the future, and the spectatorial doctrine is one such principle. This is because, however glorified by the success of Newtonian mechanics it was, the spectatorial doctrine started to face difficulties in the nineteenth century, when classical electrodynamics described the interaction between two charged particles as the effect of the field generated by one particle on the other. In so doing, electrodynamics impressed "upon interacting charges an artificial subject-object distinction... forbidding all questions on the fate of the field-producing charge, it surrounded the subject charge with an impenetrable barrier to understanding" (Margenau: 1950, p. 37). The case of classical electrodynamics showed that the terms subject and object and, by extension, spectator and spectacle, are used in a conventional, not in an absolute, sense. Quantum mechanics proved even more detrimental for the spectatorial doctrine, because of the uncertainty in the value of the observables of quantum systems: "the space-time-matter complex degenerated into probabilities for perception. The spectacle has begun to involve the spectator" (Margenau: 1950, p. 46). To sum up, on Margenau's account, the physics of the nineteenth and twentieth centuries showed that assuming sharp separability between things, and explaining phenomena as resulting from the action of a 'subject' on an 'object', whether both are things of external experience (as in electrodynamics) or mind and matter (as in quantum mechanics), is misleading. 
On his account, the distinction between subject and object is always arbitrary: his criticism of the spectatorial doctrine refers to the question of the nature of the properties of things, not to the question of a possible role of consciousness in determining the outcome of events. 3. Things, and their properties The spectatorial doctrine is a generalisation of the assumption of separability between things, which is, in turn, predicated on the assumption that things are fully characterised by intrinsic properties. These are properties that are always displayed independently of anything else, on the basis of which things can be defined, whether as particulars, or as universals. Within philosophical discourse, the assumption that things are fully characterised by intrinsic properties has been
present from the start: it underpinned the Presocratics' search for the substance(s) from which the Cosmos originated, made Plato's Forms eternal and changeless (Ross: 1953), and provided the minimal conditions for Aristotle's essentialism (Owens: 1951). Given Plato's and Aristotle's overwhelming influences throughout the Middle Ages, it is no surprise that the assumption in question became a pillar of scholastic philosophy, to be later subsumed in the great epistemologies of the Modern Times. It was not until the 1930s that Victor Lenzen, a philosophically minded physicist, challenged the view that the properties of things are intrinsic, suggesting that, if some properties appear to be always displayed, this is because the conditions for them to be manifested obtain as part of the natural order. This is the case, for instance, for weight, colour, size, and shape, because of the existence of gravity and light on earth, and of human beings being sighted (Lenzen: 1931). Lenzen's own account of properties was that some are "given", some just "possible" (in the sense of not being directly accessible), and this is due to the fact that properties result from "unions of particular qualities, complexities and relationships" (Lenzen: 1931, p. 16), which may or may not obtain. Lenzen's point had important epistemological consequences. This is because, if things are defined only on the basis of their given properties, the related concepts are fixed; this option provides "an essential feature of the classical concept of substance ... the assumption of the self-determination and independence of substance" (Lenzen: 1931, p. 285). If, instead, things are defined on the basis of possible, as well as given, properties, the related concepts "admit of transformation" and reflect the "observed fact that the physical characters (of things) are inter-dependent" (Lenzen: 1931, p. 15). Margenau began his discussion of the nature of properties by quoting, and endorsing, Lenzen's position. 
He acknowledged that the first phase of the cognitive process, reification, assumes that things are "carriers of observable properties" (Margenau: 1950, p. 172), but stressed that this assumption actually goes against the reality of cognition, where "properties are postulated first and somehow settle upon a construct" (Margenau: 1950, p. 173). Nonetheless, as far as everyday practice is concerned, no harm is done by assigning objects intrinsic properties whose numerical value can be uniquely assessed. The same cannot be said, however, of the properties of microscopic objects, which "take on different values on different occasions and are yet in another sense unique" (Margenau: 1950, p. 175). These properties should be seen "not so much as attributes of the system but as entities determined by the physical operation performed on it" (Margenau: 1950, p. 343). Hence Margenau suggested replacing the notion of property with that of observable, and distinguishing "possessed observables", which have unique value, from "latent observables", which scatter when
repeatedly observed while having a determinate probability distribution. Notably, not all observables of microscopic things are latent: charge and mass, for instance, are not; they are often regarded as parameters rather than as proper observables. Moreover, there exist states of quantum systems, called eigenstates, in which the latent observables of microscopic objects assume sharp values as possessed observables: hence, the classical and the quantum domains cannot be sharply separated from one another on the basis of things displaying, respectively, possessed or latent observables. This fact, together with the realisation that quantum properties come into being as a consequence of the operations performed on the system, suggests that "the modern physicist can no longer countenance simple realism" (Margenau: 1950, p. 343), and that a more elaborate epistemology should be developed. Lenzen's and Margenau's discussions of the nature of properties did not receive attention in philosophical quarters, and the idea that things are endowed with intrinsic properties survived unchallenged and unaltered even the introduction of the notion of dispositional properties (see Carnap: 1936, and Goodman: 1955), which have been construed as intrinsic properties with a special relationship to subjunctive conditionals (Mackie: 1962), irrespective of whether they are attributed a non-dispositional basis (Armstrong: 1968) or not (Mellor: 1974). In scientific quarters, however, the introduction of relativity theory showed that space and time, previously thought of as intrinsic properties, independent of one another, are, in fact, interrelated, and must be jointly specified (see, for instance, Sklar: 1974). Furthermore, developments in quantum mechanics, particularly the discovery of entanglement,1 showed that the observables of quantum systems do not represent intrinsic properties and should, in fact, be viewed as relations. 
As I spell out in the next section, the most important chemical property, the reactivity of the elements, is also a relational property; this realisation is not without epistemological consequences.

4. Chemical reactivity: intrinsic or relational property?

The most important chemical property is the reactivity of the elements, which, ever since the mid-1860s, has been denoted by the term valency (Russell: 1971). In discussing what kind of property valency is, I refer to the theoretical framework provided by a textbook of general chemistry.

1 The fact that correlations exist between the observables of two separate quantum systems previously joined together, such that latent observables of one system become possessed observables because of measurements performed on the other, spatio-temporally separated, system (Aspect et al.: 1982).
a) Chemical valency is neither an intrinsic nor a relative property. I start by considering the elements belonging to Group 1 of the Periodic Table, the alkali metals: lithium, sodium, potassium, rubidium, and cesium. The behaviour of the elements of the same Group is expected to vary slightly and progressively along the Group. Upon combination with oxygen, the elements of the first Group are expected to form compounds of the generic formula Me2O (as in the case of lithium), and Me2O2 (in the case of sodium, potassium, and rubidium). Notably, the metals Me are monovalent both in the compounds of generic formula Me2O and Me2O2. However, potassium, rubidium, and cesium also form the so-called superoxides of generic formula MeO2 where the metals Me are tetravalent. Hence, with the exception of lithium and sodium, the elements of Group 1 display variable valencies. The same can be said of all other elements, and is particularly evident in the case of the halogens, which, with the exception of fluorine (always monovalent), form compounds of as many generic formulae as HX, X2, HXO, HXO2, HXO3, HXO4; in the first three compounds the element X is monovalent, and, respectively, tri-, penta-, and heptavalent in the following ones. The fact that most elements display variable valencies shows that valency is not an intrinsic property. Before discussing whether it is a relational property, I need to consider the possibility that valency is a relative property, namely a property defined with respect to a reference point.2 To explain why a given element displays variable valencies upon reaction with the same element it is necessary to introduce the notions of oxidation number and of redox reaction. The notion of oxidation number rests on the assumption that, when two elements of different electronegativity are bound together, the bonding electrons are fully transferred, as it were, to the more electronegative element. 
(Hence, the oxidation number of homopolar compounds is zero.) The acquisition or loss of electrons by an element upon reaction is expressed by the sign of the oxidation number: minus when the element acquires electrons, plus when it loses them. When the elements that take part in a reaction change their oxidation number (as in all cases listed above), they are said to undergo a redox reaction. The factor that determines whether a given redox reaction occurs or not is the difference in the redox potential of the reactants. Assuming two reactants only, as in the case of the metals of Group 1 and oxygen, the element that displays the higher redox potential is reduced while the other one is oxidised. Let us assume that the reactions take place at 273 K and 1 atm, these conditions being identified as 'standard' in chemistry textbooks (see, for instance, Lide: 2000). The fact that some elements assume variable oxidation numbers when reacting with the very same element is explained by taking into account that, although the standard redox potential $E_0$ of each element is fixed, the redox potential $E$ in non-standard conditions is not. It varies with the temperature $T$ and with the relative concentrations of the reactants $[r]$ and of the products $[p]$ according to Nernst's equation:

$$E = E_0 - \frac{RT}{nF}\ln\frac{[p]}{[r]}$$

(where $R$ is the gas constant, $n$ is the number of moles of electrons transferred, and $F$ is Faraday's constant). As Nernst's equation shows, in non-standard conditions, which hold for almost all chemical reactions, the redox potential of an element may well differ from its value in 'standard' conditions. This fact, in turn, may alter the energetic balance towards the formation of products that, on the basis of the values of the standard potentials only, should not be formed. In other words, in non-standard conditions, the redox potential depends upon parameters (temperature, and the concentrations of the reactants and products) which do not belong to the reacting elements themselves. This consideration explains why the reactivity of the elements is not a relative property. To address the problem of why not all the elements belonging to the same Group display the same valencies, it is necessary to abandon the approach followed so far, and bring in some simple considerations from quantum chemistry.

2 As in the case of younger or brighter, for instance, which reduce to intrinsic properties, such as age and refractive index.

b) Chemical valency is a relational property. The valence-bond theory represented the first application of quantum mechanics to the question of the chemical bond (Heitler and London: 1927). Like Lewis' theory, it considered the chemical bond as a localised interaction consisting in the sharing of two electrons between two atoms and, when applied to heteroatomic and unsaturated compounds, it faced problems. In the first case, it created ambiguity as to the extent to which the bond in question was to be regarded as covalent or as ionic; in the second case, it gave incorrect predictions of the number of isomeric substitution products to be expected. By contrast, another theory born out of the application of quantum mechanics to the question of the chemical bond devised at the time, the molecular-orbital theory (Hund: 1927), gave unproblematic results when applied to heteroatomic and unsaturated compounds and to those with unpaired electrons (see, for instance, Mulliken: 1932). 
This state of affairs brought about the gradual replacement of the valence-bond theory with the molecular-orbital theory, a transition that was almost completed by the 1950s. Hence, in this paper, I refer to the question of the chemical bond in the terms set by the molecular-orbital theory.
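Before turning to molecular orbitals, the dependence of the redox potential on temperature and concentrations discussed in part a) can be made concrete with a short numerical sketch. It assumes the standard textbook form of the Nernst equation, $E = E_0 - (RT/nF)\ln([p]/[r])$, with $n$ the number of moles of electrons transferred; the function name and the example values are illustrative and are not taken from the text.

```python
import math

R = 8.314462618    # molar gas constant, J/(mol K)
F = 96485.33212    # Faraday constant, C/mol


def nernst_potential(e_standard, n_electrons, temperature, quotient):
    """Redox potential E (volts) from E = E0 - (R T / (n F)) ln Q.

    `quotient` is Q = [p]/[r], the ratio of product to reactant
    concentrations (an assumed, simplified reaction quotient).
    """
    if n_electrons <= 0 or temperature <= 0 or quotient <= 0:
        raise ValueError("n, T and Q must all be positive")
    return e_standard - (R * temperature / (n_electrons * F)) * math.log(quotient)


# At Q = 1 the potential equals its standard value, whatever the temperature.
e_unit = nernst_potential(0.34, 2, 273.15, 1.0)

# A tenfold excess of products lowers a one-electron potential.
shift = nernst_potential(0.0, 1, 273.15, 10.0)
```

With these illustrative numbers `shift` is about -0.054 V, and the same tenfold excess at a higher temperature lowers the potential further. The potential therefore depends on quantities that belong to the reaction conditions rather than to the element itself.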
Recall that molecular orbitals are represented as linear combinations of atomic orbitals:

$$\psi = c_A \varphi_A + c_B \varphi_B \qquad (1)$$

where $c_A$ and $c_B$ are the coefficients of the atomic orbitals $\varphi_A$ and $\varphi_B$
principal quantum number $n$ are grouped together in the same shell. The electrons that take part in chemical reactions are those that occupy the outer shell, and they are called valence electrons; it is to these electrons that reactivity considerations apply. The molecular-orbital theory may be seen as an extension of the Aufbau process and differs from it in that it feeds electrons to molecular, rather than to atomic, orbitals. Restricting the discussion, for the sake of simplicity, to homonuclear diatomic molecules of generic formula A-B, with A being indistinguishable from B, the combination of two atomic orbitals results in two molecular orbitals, which may be written as follows:

$$\psi = N_\pm\left[\varphi_A(2s) \pm \varphi_B(2s)\right]; \qquad (2)$$

the energy of the orbitals is given by the secular equation:

$$E(\psi) = \frac{\int \psi H \psi \, d\tau}{\int \psi^2 \, d\tau} \qquad (3)$$
and this has two solutions, $E^+$ and $E^-$, the former having higher, the latter lower, energy than the original pair of degenerate atomic orbitals. In the ground state of the molecule, the lower energy orbital, occupied by the bonding electrons, provides the bonding molecular orbital; the higher energy orbital remains unoccupied and provides the antibonding molecular orbital. The energy difference between the initial pair of degenerate atomic orbitals and the bonding molecular orbital explains the stability of the molecule; its value is taken to correspond to the energy associated with the chemical bond(s) that have been formed. Consider now a heteroatomic diatomic molecule, such as, for instance, lithium hydride; this compound results from the combination of elements of respective electron configuration $1s^2 2s$ (lithium) and $1s$ (hydrogen), and is expected to display electron configuration $1\sigma^2 2\sigma^2$.3 In fact, theoretical calculations show that the compound's inner shell is virtually identical with that in the free lithium atom and that the bonding molecular orbital $2\sigma$ is given by the following expression:

$$2\sigma = 0.323\,\varphi_{2s}^{\mathrm{Li}} + 0.231\,\varphi_{2p}^{\mathrm{Li}} + 0.685\,\varphi^{\mathrm{H}} \quad \text{(Ransil: 1960).} \qquad (4)$$
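The relative size of the coefficients in Eq. (4) can be checked with a few lines of arithmetic. The sketch below uses only the three coefficients quoted from Ransil and compares their squares; it neglects the overlap between the atomic orbitals, so the resulting fractions are only a rough indication of orbital character and not a proper population analysis. The dictionary keys are illustrative labels.

```python
# Coefficients of the 2-sigma bonding orbital of LiH as quoted in Eq. (4).
coefficients = {"Li 2s": 0.323, "Li 2p": 0.231, "H 1s": 0.685}

# Squared coefficients, used as crude weights (overlap terms are neglected).
weights = {name: c ** 2 for name, c in coefficients.items()}

# Share of the lithium contribution that comes from the 2p orbital.
lithium_total = weights["Li 2s"] + weights["Li 2p"]
p_share_of_lithium = weights["Li 2p"] / lithium_total
```

On this crude measure roughly a third of the lithium contribution comes from the 2p orbital, even though the free lithium atom has no occupied 2p orbital in its ground state.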
In other words, the $2\sigma$ molecular orbital, which is expected to be simply a linear combination of the singly occupied valency orbitals $\varphi_{2s}^{\mathrm{Li}}$ and $\varphi^{\mathrm{H}}$, actually contains an appreciable amount of $2p$ character.

3 The symbols $\sigma$ and $\pi$ are used, in analogy with the atomic orbitals $s$ and $p$, to designate, respectively, molecular orbitals that are unchanged by rotation around the molecular axis, and molecular orbitals that, having positive and negative lobes separated by a single nodal plane, change sign on rotation by half a turn.

Molecular orbitals that arise
from combining atomic orbitals of different symmetry are called hybrid orbitals; the evidence in their favour is overwhelming, and comes from X-ray diffraction and other spectroscopic methods, which provide information on molecular shapes, and from thermochemical data which provide information on bond energies. Notably, hybrid orbitals result from the combination of atomic orbitals of different symmetry; the functions that describe them are stationary solutions for the Schrödinger equation of the isolated atoms with non-fixed angular momentum, and, as such, they do not represent possible orbitals for the atoms in question. In other words, hybrid orbitals require that atoms interact to come about: this fact settles the question of the reactivity of the elements being a relational property.

5. Relational properties and scientific realism

Let us take stock, and recall that, when the theory of valency was first formulated, the realisation that most elements display different valencies in their different compounds puzzled the majority of the chemists. A famous case was that of Kekulé, who maintained that, as a "fundamental property of the atom", valency should "be as constant and invariable as the atomic weight itself" (Kekulé: 1864, p. 510). In the last decades of the nineteenth century, however, the number of elements that were found to display different valencies in their different compounds grew so much that the idea of variable valencies became broadly accepted. It is important to stress that such acceptance came on the basis of evidence alone, and with no theoretical justification. This had to wait until the 1930s, when the formulation of the molecular-orbital theory (Hund: 1927) turned the previously bewildering idea of variable valencies into the distinctive trait of chemistry: "it is the existence of mutual effects between pairs of atoms that gives chemistry its intrinsic interest" (Coulson: 1952, p. 11). 
The chemistry case discussed here shows that the range of relational properties extends outside the domain of quantum mechanics, and in a way that involves no observer dependency, as some (see, for instance, Everett: 1957) claimed to be the case in the collapse of the wavefunction upon measurement. In this respect, the chemistry case may be said to resemble the electrodynamics case, in which the subject/object distinction did not involve a role for consciousness, and was introduced to have, on the one side, the 'subject'-particle creating the field, and, on the other side, the 'object'-particles affected by the field. The case of entanglement is similar to the chemistry and electrodynamics cases in its revealing properties whose relational character is strictly between things and involves no role for consciousness. Given the relational character of the properties of chemical elements, charged particles, and entangled quantum systems, the definition of those entities is context dependent. This consideration calls for a revision of the definition of scientific realism away from the naive picture of science providing the faithful mirror image of an immutable reality consisting of things endowed with intrinsic properties, towards a picture of science accounting for different interactions, which emerge in different contexts, and are bound to change as science, and technology, advance.

Notably, a view of this kind has nothing to do with social constructivism, which holds that experimental practices and natural phenomena are "bound together, so that assessment of the one is ultimately an assessment of the other" (Pickering: 1984, p. 113), and that the choice of experimental, as well as theoretical, practices, far from being 'dictated' by the phenomena, is mainly driven by opportunities for further experimental, or theoretical, investigations. On the contrary, as the chemistry case discussed here, the electrodynamics case, and entanglement in quantum mechanics show, the existence of relational properties is dictated by the phenomena and, therefore, is perfectly compatible with realism, albeit not with naive realism. The existence of relational properties points to a dynamic picture of reality, one that the progress of scientific knowledge modifies. Interestingly, versions of realism that emphasise the role of relational properties in natural phenomena, and the provisional character of the scientific description of reality, have been advanced, over the last twenty years or so, from within quantum mechanics (Teller: 1986, and Priest: 1989), chemistry (Ramsey: 2000), and the whole of science (Giere: 1999).
The reappearance of the salient features of Margenau's philosophy of science (the relational character of properties, the revisable status of the metaphysical principles that inform science) almost half a century after its formulation, and in positions that make no reference to The nature of physical reality, suggests that those positions were reached independently of Margenau. This fact should be taken as providing independent confirmation for Margenau's version of realism, which was characterised by its regarding quantum mechanics as the catalyst for conceptual and epistemological revisions, and for a great deal of re-thinking, in the whole of science.

Acknowledgments

I am grateful to Ron Giere for helpful discussion.

References

1. A. Aspect and G. Roger, Physical Review Letters 48, 91 (1982).
2. D. M. Armstrong, A materialist theory of the mind (Routledge, London, 1968).
3. R. Boyd, Nous 7, 1 (1973).
4. R. Carnap, Philosophy of Science 3, 420 (1936).
5. C. A. Coulson, Valency (Oxford University Press, Oxford, 1952).
6. H. Everett, Reviews of Modern Physics 29, 454 (1957).
7. R. Giere, Science without laws (University of Chicago Press, Chicago, 1999).
8. N. Goodman, Fact, Fiction, and Forecast (Bobbs-Merrill, Indianapolis, 1955).
9. W. Heitler and F. London, Zeitschrift für Physik 44, 475 (1927).
10. M. Hesse, in The Encyclopedia of Philosophy 4, 404, P. Edwards ed. (Macmillan, New York, 1967).
11. F. Hund, Zeitschrift für Physik 40, 724 (1927).
12. E. H. Hutton, British Journal for the Philosophy of Science 2, 81 (1951-52).
13. F. A. Kekule, Comptes Rendus 58, 510 (1864).
14. J. Kim, Philosophical Studies 41, 51 (1982).
15. L. Laudan, in Science and Reality, J. Cushing et al. eds. (Notre Dame University Press, Notre Dame, 1984).
16. V. F. Lenzen, The nature of physical theory (J. Wiley, New York, 1931).
17. D. R. Lide, Handbook of chemistry and physics (Boca Raton, London, 2000).
18. J. L. Mackie, Truth, probability, and paradox (Clarendon Press, Oxford, 1972).
19. H. Margenau, Philosophy of Science 2, 48 and 164 (1935).
20. H. Margenau, The nature of physical reality (McGraw-Hill, New York, 1950).
21. H. Mellor, Philosophical Review 83, 157 (1974).
22. R. S. Mulliken, Physical Review 40, 45 (1932).
23. J. Owens, The doctrine of being in Aristotle's metaphysics (Pontifical Institute of Mediaeval Studies, Toronto, 1951).
24. A. Pickering, Studies in History and Philosophy of Science 15, 85 (1984).
25. G. Priest, British Journal for the Philosophy of Science 40, 29 (1989).
26. R. J. Puddephatt, The Periodic Table of the elements (Clarendon Press, Oxford, 1973).
27. G. L. Ramsey, in Of minds and molecules, N. Bushan and S. Rosenfeld eds. (Oxford University Press, Oxford, 2000).
28. B. J. Ransil, Reviews of Modern Physics 32, 239 (1960).
29. W. D. Ross, Plato's theory of ideas (Clarendon Press, Oxford, 1951).
30. C. A. Russell, The history of valency (Leicester University Press, Leicester, 1971).
31. P. A. Schilpp, Philosophy of Science 2, 128 (1935).
32. L. Sklar, Space, Time, and Spacetime (University of California Press, Berkeley, 1974).
33. J. J. C. Smart, The Philosophical Review 60, 411 (1951).
34. J. J. C. Smart, Between science and philosophy (Random House, New York, 1968).
35. P. Teller, British Journal for the Philosophy of Science 37, 71 (1986).
36. W. H. Werkmeister, Philosophy of Science 18, 183 (1951).
DETECTING NONCOMPATIBLE PROPERTIES IN DOUBLE-SLIT EXPERIMENT WITHOUT ERASURE
G. NISTICO Dipartimento di Matematica, Universita della Calabria Via P. Bucci 30b - 87036 Rende (CS) Italy Istituto Nazionale di Fisica Nucleare, Italy E-mail: [email protected] In this work we show that in double-slit experiment properties incompatible with the Which Slit property can be detected without erasing the knowledge of which slit each particle passes through.
PACS numbers: 03.65.Ca, 03.65.Db, 03.65.Ta

1. Introduction

In the ideal double-slit experiment proposed by Englert, Scully and Walther (ESW) (Refs. 1-3), the detection of which slit each particle passes through is performed together with the measurement of the point of impact on the final screen. As expected, no interference appears on the final screen. A different set-up of the ESW experiment makes it possible to detect another property of the system, incompatible with the Which Slit property, again together with the final impact point; but in so doing interference is restored, so that the knowledge of which slit the particle passes through is definitively lost: this phenomenon has been called erasure. In this work we address the problem of finding properties incompatible with the Which Slit property whose detection does not erase which-slit knowledge and can be performed together with the measurement of the final impact point. We begin in section 2 by establishing the necessary theoretical apparatus. The problem at issue is expressed in theoretical terms as problem (V). In section 3 we present an ideal experiment which makes possible, for a particular state vector $\Psi$, the detection of a property incompatible with the Which Slit property without erasure and without correlation with this last property.
2. Formalization of the problem

We consider a physical system which consists of a localizable particle, which we describe according to Heisenberg's picture. Let the observable position of the centre-of-mass be represented, at time $t$, by an operator $Q^{(t)}$ of a suitable Hilbert space $\mathcal{H}_I$. Let our particle be endowed with further degrees of freedom, related to spin or similar, described in a second Hilbert space $\mathcal{H}_{II}$, in such a way that the complete Hilbert space is $\mathcal{H} = \mathcal{H}_I \otimes \mathcal{H}_{II}$. Let us suppose that the Hamiltonian operator $H$ is essentially independent of the degrees of freedom described in $\mathcal{H}_{II}$, so that we may assume the ideal case that $H = H_I \otimes \mathbf{1}_{II}$, where $H_I$ is a self-adjoint operator of $\mathcal{H}_I$. In general, if $A_I$ ($A_{II}$) denotes a linear operator of $\mathcal{H}_I$ ($\mathcal{H}_{II}$), by the same symbol without index $I$ ($II$) we denote the linear operator $A = A_I \otimes \mathbf{1}_{II}$ ($A = \mathbf{1}_I \otimes A_{II}$) acting on the whole space $\mathcal{H} = \mathcal{H}_I \otimes \mathcal{H}_{II}$.

The projection operator identifying the Which Slit (WS) property "the particle passes through slit 1" has the form $E = E_I \otimes \mathbf{1}_{II}$, where $E_I$ is the localization projection operator of $\mathcal{H}_I$ which localizes the particle in slit 1 at the time $t_1$ of the crossing of the slits' support. We may assume, without losing generality, that the property "the particle passes through slit 2" is represented by $E' = E'_I \otimes \mathbf{1}_{II}$, where $E'_I = \mathbf{1}_I - E_I$. Given any interval $\Delta$ on the final screen, the event "the particle hits $\Delta$" is represented, like $E$, by a localization projection operator $F(\Delta)$, but relative to a different time $t_2 > t_1$, so that in general $[F(\Delta), E] \neq 0$. Therefore, it is not generally possible to measure the WS property and the final impact point directly.
Figure 1. Which slit detector. (Labels in the figure: slit 1, slit 2, final screen.)
However, if for a given state vector $\Psi$ a projection operator of the kind $T = \mathbf{1}_I \otimes T_{II}$ exists such that the equation $T\Psi = E\Psi$ holds, then it is possible to detect which slit each particle hitting the final screen passed through. Indeed, since $[T, E] = 0$, the joint event $T \wedge E$ can be measured, being represented by the projection operator $TE = ET$. This allows us to compute the conditional probabilities $p(T \mid E)$ and $p(E \mid T)$ by means of the formulas

$$p(T \mid E) = \frac{p(T \wedge E)}{p(E)} = \frac{\langle \Psi \mid TE\,\Psi \rangle}{\langle \Psi \mid E\,\Psi \rangle}, \qquad p(E \mid T) = \frac{p(E \wedge T)}{p(T)} = \frac{\langle \Psi \mid TE\,\Psi \rangle}{\langle \Psi \mid T\,\Psi \rangle}.$$
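As a purely illustrative check, the two conditional probabilities can be evaluated numerically for a two-cavity entangled state of the kind considered in Example 1. The sketch below uses NumPy; restricting $\mathcal{H}_I$ to the two-dimensional span of $\psi_1$ and $\psi_2$, the variable names, and the helper `conditional` are assumptions made for the illustration and are not part of the original argument.

```python
import numpy as np

# Assumption: H_I is restricted to span{psi_1, psi_2}; H_II = span{|1>, |2>}.
psi1, psi2 = np.eye(2)   # state vectors localized in slit 1 and slit 2
ket1, ket2 = np.eye(2)   # "photon in cavity 1" and "photon in cavity 2"

# Entangled state Psi = (psi_1 (x) |1> + psi_2 (x) |2>) / sqrt(2)
Psi = (np.kron(psi1, ket1) + np.kron(psi2, ket2)) / np.sqrt(2)

E = np.kron(np.outer(psi1, psi1), np.eye(2))   # E = E_I (x) 1_II
T = np.kron(np.eye(2), np.outer(ket1, ket1))   # T = 1_I (x) |1><1|

def conditional(a, b, state):
    """p(a | b) = <state| a b state> / <state| b state> for commuting projections."""
    return (state @ a @ b @ state) / (state @ b @ state)

p_T_given_E = conditional(T, E, Psi)
p_E_given_T = conditional(E, T, Psi)
print(p_T_given_E, p_E_given_T)
```

Under these assumptions $T\Psi = E\Psi$ holds, and both conditional probabilities come out equal to 1 up to rounding.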
Since $T\Psi = E\Psi$ implies $TE\Psi = T\Psi = E\Psi$, we have to conclude that $p(T \mid E) = p(E \mid T) = 1$, so that the occurrence of outcome 1 (resp., 0) for $T$ detects the passage of the particle through slit 1 (resp., 2) (Ref. 4). Furthermore, such a WS detection by means of $T$ can be performed together with the measurement of the final impact point because $[T, F(\Delta)] = 0$. Indeed, though $[F(\Delta), E] \neq 0$, the condition $H = H_I \otimes \mathbf{1}_{II}$ ensures that $F(\Delta)$ must have the form $F(\Delta) = F_I(\Delta) \otimes \mathbf{1}_{II}$, so that $[T, F(\Delta)] = 0$. In this sense we qualify the projection $T$ as a WS detector, without entering the debate about the causes of the loss of interference (Refs. 5-10).

Example 1. In the thought experiment of ESW the physical system consists of an atom in a long-lived excited state, e.g. rubidium in state $63p_{3/2}$, whose centre-of-mass position is described in the Hilbert space $\mathcal{H}_I$. The further degrees of freedom, described in $\mathcal{H}_{II}$, concern a pair of cavities 1 and 2, placed as shown in fig. 1. The cavities are resonators for the electromagnetic field, tuned at a microwave frequency in such a way that whenever the excited atom enters cavity 1 or 2, it decays emitting a photon. The event "a photon is revealed in cavity 1 (resp., 2)" is represented by a projection operator $T_{II} = |1\rangle\langle 1|$ (resp., $T'_{II} = |2\rangle\langle 2|$) of $\mathcal{H}_{II}$. In this experimental situation the complete state vector of the particle is $\Psi = \frac{1}{\sqrt{2}}[\psi_1 \otimes |1\rangle + \psi_2 \otimes |2\rangle]$, where $\psi_1, \psi_2 \in \mathcal{H}_I$ are state vectors respectively localized in slit 1 and 2 when the particle crosses the two-slit support, i.e. $E_I\psi_1 = \psi_1$, $E_I\psi_2 = 0$. A WS detector is represented by the projection operator $T = \mathbf{1}_I \otimes |1\rangle\langle 1|$; indeed $(\mathbf{1}_I \otimes |1\rangle\langle 1|)\Psi = (E_I \otimes \mathbf{1}_{II})\Psi$, i.e. $T\Psi = E\Psi$, trivially holds.

The possibility of this kind of detection is not exclusive to the WS property. We can introduce the following more general definition.

Definition 1.
A projection operator $Y$ of $\mathcal{H}$ is called a detector of a property $G = G_I \otimes \mathbf{1}_{II}$ with respect to the state vector $\Psi$ if (i) $[Y, F(\Delta)] = 0$ for all $\Delta$; (ii) $[Y, G] = 0$ and $Y\Psi = G\Psi$. A measurement of $Y$ detects $G$ in exactly the same way a measurement of the WS detector $T$ detects the WS property $E$.

Example 2. In their thought experiment, ESW found (with respect to the state vector $\Psi$ of example 1) a detector $Y_{ESW} = \mathbf{1}_I \otimes |+\rangle\langle +|$ of another property $G_{ESW} = |\psi_+\rangle\langle\psi_+| \otimes \mathbf{1}_{II}$, where $\psi_+ = (1/\sqrt{2})(\psi_1 + \psi_2)$ and $|+\rangle = (1/\sqrt{2})(|1\rangle + |2\rangle)$. Conditions (i) and (ii) in definition 1 are trivially satisfied. Such a property $G_{ESW}$ is incompatible with the WS property $E$ because $[E, G_{ESW}] \neq 0$. As shown by ESW, if $G_{ESW}$ is detected on each particle, then the distribution of particles for which $Y_{ESW} = 1$ exhibits interference (fig. 2) on the final screen, and this forbids assigning to each particle the slit it passed through: we detect $G_{ESW}$, but the WS property is erased (Refs. 1, 2).

Figure 2. Erasure. (Labels in the figure: distribution of particles for which Y = 1; total distribution of particles; final screen.)
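The statements of Example 2 can be checked numerically in the same two-dimensional setting. The sketch below is only an illustration: the restriction of $\mathcal{H}_I$ to the span of $\psi_1, \psi_2$, the variable names, and the helper `commute` are assumptions. The last check, that $Y_{ESW}$ does not commute with the WS detector $T$ of Example 1, is an added observation; it is one way to see why the two detections cannot be carried out together in the ESW set-up.

```python
import numpy as np

# Assumption: H_I is restricted to span{psi_1, psi_2}; H_II = span{|1>, |2>}.
psi1, psi2 = np.eye(2)
ket1, ket2 = np.eye(2)
Psi = (np.kron(psi1, ket1) + np.kron(psi2, ket2)) / np.sqrt(2)

psi_plus = (psi1 + psi2) / np.sqrt(2)
ket_plus = (ket1 + ket2) / np.sqrt(2)

E = np.kron(np.outer(psi1, psi1), np.eye(2))              # WS property
T = np.kron(np.eye(2), np.outer(ket1, ket1))              # WS detector
G_esw = np.kron(np.outer(psi_plus, psi_plus), np.eye(2))  # G_ESW
Y_esw = np.kron(np.eye(2), np.outer(ket_plus, ket_plus))  # Y_ESW

def commute(a, b):
    """True when the commutator [a, b] vanishes numerically."""
    return np.allclose(a @ b, b @ a)

# Condition (ii) of Definition 1: [Y, G] = 0 and Y Psi = G Psi.
detects_G = commute(Y_esw, G_esw) and np.allclose(Y_esw @ Psi, G_esw @ Psi)
# Incompatibility of G_ESW with the WS property E.
incompatible = not commute(E, G_esw)
# Can Y_ESW and T be measured together?
jointly_measurable = commute(Y_esw, T)
print(detects_G, incompatible, jointly_measurable)
```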
In the present work we seek the possibility of detecting a property $G = G_I \otimes \mathbf{1}_{II}$ incompatible with the WS property $E$, but without erasing the WS knowledge provided by a WS detector $T$. This possibility can be realized if with respect to the same state vector $\Psi$ there exist both a WS detector $T$ of $E$ and a detector $Y$ of $G$; moreover, it must be required that $[Y, T] = 0$, so that $Y$ and $T$ can be measured together, yielding simultaneous detection of $E$ and $G$. Condition $[Y, F(\Delta)] = 0$ will be automatically satisfied if $Y$ has the form $Y = \mathbf{1}_I \otimes Y_{II}$.
(V) Given the WS property $E = E_I \otimes \mathbf{1}_{II}$, we have to find: a projection operator $G_I$ of $\mathcal{H}_I$, a projection operator $T_{II}$ of $\mathcal{H}_{II}$, a projection operator $Y_{II}$ of $\mathcal{H}_{II}$, and a state vector $\Psi \in \mathcal{H}_I \otimes \mathcal{H}_{II}$, such that the following conditions hold:
(C.1) $[E, G] \neq 0$, i.e. $[E_I, G_I] \neq 0_I$,
(C.2) $[T, Y] = 0$, i.e. $[T_{II}, Y_{II}] = 0_{II}$,
(C.3) $T\Psi = E\Psi$,
(C.4) $Y\Psi = G\Psi$,
(C.5) $0 \neq E\Psi \neq \Psi$, $0 \neq G\Psi \neq \Psi$.
Condition (C.5) excludes solutions of (C.1)-(C.4) corresponding to the uninteresting case that $\Psi$ is an eigenvector of $E$ or $G$.

3. A 'gedanken experiment' solution

In Ref. 12 it is proved that if $\dim(\mathcal{H}_I) = 2$, then no solution of (V) exists. In Ref. 13 a solution of (V) is given with $\dim(\mathcal{H}_{II}) = 2$ and $\dim(\mathcal{H}_I) = 4$. Such a solution is characterized by a direct correlation between the non-disturbing detections of $E$ and $G$. This trivial character turns out to be shared by every solution of (V) if $\dim(\mathcal{H}_I) \leq 4$, independently of the dimension of $\mathcal{H}_{II}$ (Ref. 12). Therefore, for non-correlated solutions we have to look at double-slit experiments in which $\dim(\mathcal{H}_I) \geq 6$ (odd dimensions are excluded to deal with symmetrical slits only). The following ideal experiment shows that it is sufficient to take just $\dim(\mathcal{H}_I) = 6$ to find non-correlated solutions.

Our ideal apparatus exploits the same physical principles used by ESW to design their thought experiment. Therefore, the system is the excited atom of example 1, which can travel towards the slits. For each slit we choose 3 non-overlapping regions, up (u), centre (c) and down (d), which decompose that slit (see fig. 3). The further degrees of freedom, described by means of $\mathcal{H}_{II}$, concern four (rather than 2 as in example 1) micromaser cavities A, B, C, D, located as in fig. 3. By $A_{II}$ we denote the projection operator of $\mathcal{H}_{II}$ representing the event "a photon is revealed in cavity A". Actually, we can define four projections $A_{II}$, $B_{II}$, $C_{II}$, $D_{II}$ so associated to cavities A, B, C, D, and we shall denote their respective eigenvectors relative to eigenvalue 1 by $|\alpha\rangle$, $|\beta\rangle$, $|\gamma\rangle$ and $|\delta\rangle$.
In this experimental situation there is a correlation between the presence of the emitted photon in one of the cavities and the passage through the corresponding region (cavity A with region u U c of slit 1, and so on). To describe these
correlations the state vector of the entire system must be

$$\Psi = \frac{1}{\sqrt{6}}\left\{(\psi_1^u + \psi_1^c)|\alpha\rangle + \psi_1^d|\beta\rangle + (\psi_2^u + \psi_2^c)|\delta\rangle + \psi_2^d|\gamma\rangle\right\}, \tag{1}$$

where $\psi_i^u$, $\psi_i^c$ and $\psi_i^d$ are normalized state vectors of $\mathcal{H}_I$ respectively localized in region u, c and d of slit $i$. These six vectors form an orthonormal set. Then we take the Hilbert space $\mathcal{H}_I$ for describing the centre of mass of the atom as the space generated by them. The correlations can be exploited to define a WS detector as $T = \mathbf{1}_I \otimes (A_{II} + B_{II})$, which satisfies $T\Psi = E\Psi$.

Figure 3. Ideal apparatus for detecting both E and G. (Labels in the figure: distribution of particles for which Y = 1; total distribution of particles.)

Our problem is now to find a property $G = G_I \otimes \mathbf{1}_{II}$ incompatible with $E$, which can be detected by means of a detector $Y = \mathbf{1}_I \otimes (A_{II} + C_{II})$, without renouncing the WS knowledge provided by $T$. We take $G_I$ as the projection operator whose matrix representation with respect to the orthonormal basis
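$\{\psi_i^u, \psi_i^c, \psi_i^d : i = 1, 2\}$ of $\mathcal{H}_I$ is

$$G_I = \begin{pmatrix} \ast & \ast & 0 & \ast & \ast & 0 \\ \ast & \ast & 0 & \ast & \ast & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ \ast & \ast & 0 & \ast & \ast & 0 \\ \ast & \ast & 0 & \ast & \ast & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{pmatrix}$$

(each $\ast$ marks a nonzero entry, a fraction with denominator 4, whose value and sign are not legible in this copy; the ordering of the basis vectors is likewise not legible).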
We notice that a Hilbert space $\mathcal{H}_I$ of dimension 4 would be sufficient to describe the state vector $\Psi$ in (1), but without the six dimensions obtained by splitting the regions in front of cavities A and C we could not describe the operator $G_I$. By straightforward calculations (Ref. 12) it can be verified that all conditions (C.1)-(C.5) are satisfied. Therefore, from the knowledge of which cavity the photon is revealed in, we can infer both which slit the atom comes from and whether it possesses either property $G$ or $G' = \mathbf{1} - G$, according to the following scheme.

Cavity A ⇒ slit 1 and G
Cavity B ⇒ slit 1 and G'
Cavity C ⇒ slit 2 and G
Cavity D ⇒ slit 2 and G'

Thus, the detection of property $G$ is attained without erasing WS knowledge. No correlation between the non-disturbing detections of $E$ and $G$ occurs in this thought experiment. Indeed, none of the equations such as $Y\Psi = T\Psi$, $Y\Psi = YT\Psi$, $Y'\Psi = T\Psi$, $T\Psi = TY\Psi$, ..., describing correlation holds.
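The structure of such a solution can be explored numerically. The sketch below is a hypothetical instance and not the matrix $G_I$ of the text: it assumes the state vector $\Psi = \frac{1}{\sqrt{6}}\{(\psi_1^u + \psi_1^c)|\alpha\rangle + \psi_1^d|\beta\rangle + (\psi_2^u + \psi_2^c)|\delta\rangle + \psi_2^d|\gamma\rangle\}$, orders the basis of $\mathcal{H}_I$ as $(\psi_1^u, \psi_1^c, \psi_1^d, \psi_2^u, \psi_2^c, \psi_2^d)$, and builds one projection `G_I` (from the illustrative vectors `v1`, `v2`) that happens to satisfy conditions (C.1)-(C.5). All names in the code are assumptions made for the illustration.

```python
import numpy as np

# Assumed basis order of H_I: psi1u, psi1c, psi1d, psi2u, psi2c, psi2d.
e = np.eye(6)
# Basis of H_II: alpha, beta, gamma, delta (photon in cavity A, B, C, D).
c = np.eye(4)
I6, I4 = np.eye(6), np.eye(4)

# Assumed state vector: A <-> u,c of slit 1; B <-> d of slit 1;
# D <-> u,c of slit 2; C <-> d of slit 2.
Psi = (np.kron(e[0] + e[1], c[0]) + np.kron(e[2], c[1])
       + np.kron(e[3] + e[4], c[3]) + np.kron(e[5], c[2])) / np.sqrt(6)

E_I = np.diag([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])   # "particle passes through slit 1"

# Hypothetical G_I: projection onto span{v1, v2, psi2d}.
v1 = (e[0] + e[1]) / np.sqrt(2)
v2 = (e[0] - e[1] + e[3] - e[4]) / 2
G_I = np.outer(v1, v1) + np.outer(v2, v2) + np.outer(e[5], e[5])

T_II = np.diag([1.0, 1.0, 0.0, 0.0])   # A_II + B_II
Y_II = np.diag([1.0, 0.0, 1.0, 0.0])   # A_II + C_II

E, G = np.kron(E_I, I4), np.kron(G_I, I4)
T, Y = np.kron(I6, T_II), np.kron(I6, Y_II)
Yc = np.eye(24) - Y                     # Y' = 1 - Y

def commute(a, b):
    return np.allclose(a @ b, b @ a)

def nontrivial(p, state):
    """True when 0 != p state != state."""
    return not np.allclose(p @ state, 0) and not np.allclose(p @ state, state)

checks = {
    "G_I is a projection": np.allclose(G_I @ G_I, G_I) and np.allclose(G_I, G_I.T),
    "C.1": not commute(E, G),
    "C.2": commute(T, Y),
    "C.3": np.allclose(T @ Psi, E @ Psi),
    "C.4": np.allclose(Y @ Psi, G @ Psi),
    "C.5": nontrivial(E, Psi) and nontrivial(G, Psi),
    "no correlation": not any(np.allclose(a, b) for a, b in [
        (Y @ Psi, T @ Psi), (Y @ Psi, Y @ T @ Psi),
        (Yc @ Psi, T @ Psi), (T @ Psi, T @ Y @ Psi)]),
}
for name, ok in checks.items():
    print(name, ok)
```

With this choice each of the four cavities is associated with a distinct pair of answers (slit 1 or 2, $G$ or $G'$), which is the pattern summarized in the scheme above.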
References

1. M. O. Scully, B.-G. Englert and H. Walther, Nature 351, 111 (1991).
2. M. O. Scully and H. Walther, Phys. Rev. A 39, 5229 (1989).
3. B.-G. Englert, J. Schwinger and M. O. Scully, in New frontiers in quantum electrodynamics and quantum optics, A. O. Barut ed. (Plenum, New York, 1990).
4. G. Nistico and M. C. Romania, J. Math. Phys. 35, 4534 (1994).
5. E. P. Storey, S. M. Tan, M. J. Collett and D. F. Walls, Nature 367, 626 (1994).
6. B.-G. Englert, M. O. Scully and H. Walther, Nature 375, 367 (1995).
7. E. P. Storey, S. M. Tan, M. J. Collett and D. F. Walls, Nature 375, 368 (1995).
8. H. M. Wiseman and F. E. Harrison, Nature 377, 584 (1995).
9. U. Mohrhoff, Am. J. Phys. 64, 1468 (1996).
10. B.-G. Englert, M. O. Scully and H. Walther, Am. J. Phys. 67, 325 (1999).
11. N. Bohr, in Albert Einstein: Philosopher-Scientist, P. A. Schilpp ed. (Library of Living Philosophers, Evanston, 1949).
12. G. Nistico, ArXiv, quant-ph/0409092 (2004).
13. G. Nistico and A. Sestito, J. Mod. Opt. 51, 1063 (2004).
IF YOU CAN MANIPULATE THEM, MUST THEY BE REAL?
The epistemological role of instruments in nanotechnological research

ALBERTA REBAGLIA
Ingegneria dell'Informazione, Politecnico di Torino, Corso Duca degli Abruzzi 24, Torino, Italia
«So far as I'm concerned, if you can spray them then they are real» (Hacking, 1983, p.23). This statement embodies a well-known key point in Ian Hacking's contemporary reading of scientific realism: scientific instruments assume a fundamental role in characterizing the ontological scenario to believe in. This paper focuses on the challenges of nanotechnology to this standpoint. Scanning tunnelling microscopy, as opposed to traditional microscopy (from optical to electron microscope), is not an imaging but a "touching and rearranging" technique. It requires a deep appraisal of epistemological ideas such as "representing" and "intervening", "knowing" "natural" entities and "creating" "artificial" ones.
1. Introduction

Richard Feynman, in his now famous lecture There is Plenty of Room at the Bottom. An Invitation to Enter a New Field of Physics, delivered on December 29, 1959 at a meeting of the American Physical Society at the California Institute of Technology [1], offered the prospect of exciting new discoveries if one were able to fabricate materials and devices at the atomic scale, where all phenomena are believed to be explained in quantum mechanical terms. The invention of the scanning tunnelling microscope (STM) in 1981 by Gerd Binnig and Heinrich Rohrer at IBM's Research Laboratory in Zurich [2] (footnote a) has been an epoch-making event, enabling researchers to directly observe atoms and to build structures at the nanoscale (one-billionth of a metre). As a tool for atomic image observation, the STM addresses the basic philosophical problem concerning the status of scientific theories and of experimental evidence. Discussing this issue constitutes the first aim of this paper. To achieve this purpose, we refer to the philosophy concerned with the
Footnote a. Binnig and Rohrer were awarded the Nobel Prize for Physics in 1986 for their invention of the scanning tunnelling microscope.
practices involved in the technical applications of science. Principally, the central features Ian Hacking proposed for entity realism will be appraised. As a tool for manipulating individual atoms, the STM also deals with a further momentous problem: what characterises its ability to operate in the nanoregion is the unavoidable requirement of a bridge joining the classical (macro) and quantum (sub-micro) scenarios. Shrinking the working volume of a device down to the nanometre scale is not only a process of dimensional ultra-miniaturisation, but also requires a deep change in our understanding of its operational principles, due to the rise of significant quantum effects. In this sense the STM device opens a remarkable new world where distinguishing human-made objects from natural entities could become a problematic task. This topic will be taken into account in the final part of the paper.

2. In search of a demarcation between given experience and mental creation

In his influential book Representing and Intervening [3] Ian Hacking claims «Experimenting on an entity does not commit you to believing that it exists. Only manipulating an entity, in order to experiment on something else, need do that» (p.429, emphasis added). Undoubtedly, this thesis epitomises the Canadian philosopher's most peculiar and intriguing point of view. Experimental activities, according to Hacking, indeed provide evidence for scientific realism; but not in the usual expected way, which traditionally rests on their task of testing hypotheses about physical entities. The so-called "new philosophy of science" has advanced a long list of epistemological arguments in order to disprove this widely accepted opinion: a point of view largely debated in the twentieth century (footnote b). According to Ludwig Wittgenstein's Tractatus [5], scientific theories, like any other set of linguistic statements, can only describe some "state of affairs" but they cannot represent their very correspondence to reality.
The notorious distinction between what language could say and what it might show is the basis of Wittgenstein's claim that there is no causal link delineating an outer reality: «The law of causality is not a law but the form of a law» (6.32). In physics we have laws of the causality form and we could construct the network of scientific statements out of figures of this particular (causal) kind -as well as of any other
Footnote b. The "new philosophy of science" is the conceptual horizon to which Hacking's academic life also belongs: he is the editor of Scientific Revolutions, a well-known collection in which Feyerabend, Kuhn, Popper, together with Putnam, Laudan, Shapere and Hacking himself, debate such critical items as incommensurability between scientific paradigms or scientific realism [4].
kind. We are justified in saying that a «causal» network can describe nature, but this statement asserts nothing about the world. Hacking takes the most epistemically strong conception that adequately moves within the boundaries of Wittgenstein's view: there is no reason to believe that any theoretical term depicts "the essence of the world". After all, phlogiston, caloric and ether took part in successful research programs. But if we can reliably manipulate such theoretical entities as convenient tools, this "shows" their reality. As stated by Hacking, what convinces us of the existence of physical entities does not rely on the explanatory merits of the theory that describes their dynamics. Neither does it rely on a coherent response we could obtain from our scientific instruments, since observational sentences are seriously infected by theory: as Norwood Hanson underlines, observation is theory laden [6]. According to Wittgenstein and the "new" philosophers of science, as Hacking emphasises, experimenting on entities could not be used to learn their real, objective properties; nevertheless experimenting with entities provides grounds for belief in their existence. In his intervention-based realism the key point is that entities which in principle cannot be observed can nonetheless be manipulated.

2.1. Hacking's scientific entity realism. PEGGY II

Hacking's chief example involves electrons. Their manipulability, he emphasises, is a criterion for believing in their existence, in spite of the fact that experimental practices don't give us a theory-independent access to "real" features of those entities. Scientists are persuaded that electrons must be real since well-defined, stable and repeatable causal properties, distinguishing those particles in a very direct and tangible way, can be successfully employed to achieve other results.
«We are completely convinced of the reality of electrons when we regularly set to build -and often enough succeed in building- new kinds of devices that use various well understood causal properties of electrons to interfere in other more hypothetical parts of nature» [3], p.433. PEGGY II is the peculiar apparatus that Hacking takes as an illustration of his point of view. It is a laser gun, built at the Stanford Linear Accelerator Centre in the late 1970s, used to produce a beam of circularly polarised light, which is directed on a gallium arsenide crystal. When light impinges on it, the crystal emits a large number of linearly polarised electrons, which can be employed as tools to check parity violation in weak neutral currents. Parity expresses the relation between the directions of the spin-vector of an electron and its travelling path in the beam. Parity violation was found in weak charged current and the proposed investigation hoped to find similar violation in
neutral ones. Scientists may disagree with the outcomes of the experiment, and the distinctive traits of some element involved, such as the boson particle that is assumed to carry the weak forces, may be considered controversial; however, the existence of electrons, which are employed as tools in the experiment, is in Hacking's opinion hardly contestable. Using directly unobservable entities such as electrons as instruments implies that we confer the highest possible degree of belief in their existence: if they were not real, it would be quite amazing to pretend to use them reliably.

2.2. The Scanning Tunnelling Microscope

According to Hacking's argument, we are precluded from being realists about entities that cannot yet be manipulated and exploited to investigate other entities. With regard to this conceptual approach, it is of scarce relevance that the STM, the revolutionary device developed by IBM researchers that allows for the first time structures to be imaged and rearranged at the nanoscale, gives scientists the way to observe (and not only to detect) small particles at the atomic, and sub-atomic, level (footnote c). The really significant fact is that the tunnelling of electrons through the vacuum (as predicted by quantum mechanics) is centrally involved in the STM, as a tool to be employed in producing an image on the computer screen giving a rendition of the topology of a given surface with sub-micrometer resolution. We may ask ourselves if, in spite of the peculiar challenges that "seeing" on a nanoscale may present, we can legitimately consider the STM a new exemplary case that Hacking could assume for substantiating his "entity realism". First proposed by George Gamow in 1928, the tunnel effect is one of the most spectacular, paradoxical results due to quantum mechanics.
According to quantum laws, low energy subatomic particles can tunnel through a potential barrier even if its actual gap, in classical mechanics, prevents them from jumping over: they simply disappear and then reappear on the other side. The tunnel effect explains phenomena like alpha decay: the alpha particle of the element radium can move itself away from the atomic nucleus and through the outside of the nucleus, so appearing in a place where energy cannot move it. Philosophical thinking from Wittgenstein to Hacking states that applying a theory in order to explain the behaviour of an entity does not provide us with knowledge about its
Footnote c. In 1993, Donald M. Eigler, at IBM, using the STM tried to move several dozen cobalt atoms on a copper surface into an ellipse-shaped ring (major axis of 20 nanometers, minor axis of 10 nanometers) and found that the electrons in the nanoscale range waved with an intensity just like that on the surface of water, directly demonstrating that they produce a wave pattern as predicted by quantum mechanics.
reality. The solutions of the relevant Schrödinger equation give us an estimate of the probability that particles might pass through a potential wall and base their strange conduct on the double nature (particle-wave) of any quantum entity. But we can infer their reality neither by this elusive property nor by experimenting on the tunnel effect using laboratory practices detached from our quantum theoretical knowledge. Nevertheless, inducing and controlling quantum tunnelling phenomena provides technical applications such as the STM. A probe tip, which ideally is atomically sharp, is moved over a surface, a small voltage differential having been placed between the two parts. The potential barrier isolating probe from surface should prevent any electron flow, unless a tunnel effect occurs. The tunnelling electrons give rise to a small electrical current, whose intensity can be measured with great precision. Quantum theory predicts that the tunnelling current is very sensitive to the tip/surface distance, decreasing approximately exponentially as the distance grows. If one scans the tip across the surface, the distance will vary and so will the current: a feedback control of the tip's location allows measuring this distance. «Typically, a topographic image is produced by running the tip back and forth over the sample surface such that, by means of an electronic feedback loop, the tip is moved up or down to keep the tunnelling current -and consequently the tip's distance above the surface- at a constant value», as David Baird and Ashley Shew explain (Probing the History of Scanning Tunnelling Microscopy, in [7], p.5). The correlation between tip position and current intensity can thus be used to produce the image of the sample surface. «One can compare STM to Braille reading or the way the tumblers in a lock 'read' the key shape. STM relies on the phenomenon of electron tunnelling to image surfaces» (ibidem).
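The distance sensitivity and the constant-current feedback loop described above can be made concrete with a small numerical sketch. It relies on the common textbook approximation for a one-dimensional barrier, in which the tunnelling current falls off as $I \propto \exp(-2\kappa d)$ with $\kappa = \sqrt{2 m \varphi}/\hbar$. The barrier height of 4 eV, the set-point gap of 5 ångström, the proportional feedback rule, and all function names are assumptions chosen for illustration; they do not describe any particular instrument.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
M_E = 9.1093837015e-31   # electron mass, kg
EV = 1.602176634e-19     # one electronvolt in joules

def decay_constant(phi_ev):
    """kappa = sqrt(2 m phi) / hbar, in 1/m, for a barrier height phi in eV."""
    return math.sqrt(2.0 * M_E * phi_ev * EV) / HBAR

def tunnelling_current(gap_m, phi_ev=4.0, i0=1.0):
    """Relative current I = i0 * exp(-2 kappa d) for a tip/surface gap d in metres."""
    return i0 * math.exp(-2.0 * decay_constant(phi_ev) * gap_m)

def constant_current_scan(heights_m, gap_set_m=5e-10, phi_ev=4.0, iterations=60):
    """Move the tip over a height profile, adjusting its height z at each point
    until the current matches the set-point; return the recorded tip heights."""
    kappa = decay_constant(phi_ev)
    i_set = tunnelling_current(gap_set_m, phi_ev)
    z = heights_m[0] + gap_set_m
    trace = []
    for h in heights_m:
        for _ in range(iterations):
            i = tunnelling_current(z - h, phi_ev)
            # Too much current means the gap is too small: raise the tip.
            z += 0.5 * math.log(i / i_set) / (2.0 * kappa)
        trace.append(z)
    return trace

# Current ratio when the gap grows by one angstrom (5 A -> 6 A).
ratio_per_angstrom = tunnelling_current(6e-10) / tunnelling_current(5e-10)

# A toy surface with atomic-scale steps; the recorded trace follows it.
surface = [0.0, 1e-10, 2e-10, 1e-10, 0.0]
trace = constant_current_scan(surface)
print(ratio_per_angstrom, [t - trace[0] for t in trace])
```

In this model, widening the gap by a single ångström reduces the current to a small fraction of its value, which is why holding the current constant makes the tip height reproduce the surface profile.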
It becomes possible to manipulate electrons in a way rather uncommon with respect to basic quantum mechanical laws, and to use them as tools to achieve topographic information about the surface we are "seeing" by means of the STM. Following Hacking's claim that some entities are rightly believed to exist when they have attained the status of experimental devices, we can say that exploiting the motion of a wave packet, which can pass through a potential barrier, implies an acceptance of its reality. This application of Hacking's view to the STM allows us to assign the same ontological status to macro-objects and particle-wave entities. This is an embarrassing way to characterize "reality", at least for those scientific realists who possess a notion of reality like that criticized by Paul Feyerabend: «Those of them who pay attention to the results of anthropologists and classical scholars may admit that immaterial entities did
appear and that Gods did make themselves felt; they may admit that there are divine phenomena. But, they add, such phenomena are not what they seem to be. They are 'illusions' and, therefore, do not count as indicators of reality» [8], p.246. Using the change in the intensity of the tunnelling current as a tool to obtain images with a resolving power of one atom prevents realists from asserting that there is a significant difference between theoretical entities (such as the wave function of the electron) and observable ones (like subatomic particles). And this conclusion is not reached from the realistic point of view, according to which some theoretical entities become visible, but from the instrumental line of reasoning, which stresses that a reliable exploitation of something implies supposing it is real. When we use, as a tool for "quasi-visual" inspection, the finite probability that the wave function of an electron tunnels through the potential barrier it encounters, according to Hacking we must be completely convinced of the reality of the wave function of an electron truly spread out (not just hidden or unmeasured). The wave function can no longer be regarded as describing «divine» phenomena that «do not count as an indicator of reality», according to Feyerabend's quotation. The "real world" is incredibly rich; it consists of countless kinds of things, depending on the different ways in which reality constructs its qualities responding to the plurality of our inquiries.

3. In search of a demarcation between given experience and technological manipulation

The STM gives a still deeper challenge to the idea of human beings answering to an independent authority called Nature (a point of view strictly related even to the instrumentalist position). The key problem is to distinguish "real entities" from artefacts of the observational procedures.
Hacking's argument depends on a thesis about experimental practices that endorses the notion of laboratory investigation as an attempt to create physical effects, instead of passively observing natural phenomena. According to Hacking, we see with a microscope. Our confidence in the reality of what we see does not rest on the image itself but is a consequence of the various ways in which we interact and interfere (via some causal links well defined by theories) with the specimen so observed. «Don't just peer, interfere», he urges in [9], p.308. However, thinking of scientific explanations as dependent upon "intervening" instead of relying on "representing" does not prevent us from separating attempts to learn how things really are and efforts to design and assemble artificial entities. Underlining, in Hacking's words, that we do not see
through a microscope but we see with it should in any case preserve the difference between two quite distinct kinds of "intervention" on nature. The first is intervening as arranging a suitable laboratory experiment to isolate some factual effect, taking advantage of our ever increasing technical ability to use some physical entity as a tool. The second is intervening as the practical and tangible result of any theoretical investigation that enables us to conceive and design artificial devices whose performance seems limited only by the need to avoid what physical laws forbid. Hacking devotes all his attention to experimentation as the use of something understood in nature in order to prove something that is not. Consequently, for him experimentation considered as the use of something understood in nature in order to design a new artificial device does not seem so philosophically crucial. Even if experimenting on micro-entities cannot directly establish their existence and nature, we are now gaining the means to construct them directly. And if we can construct them, they are real! Clearly the carpenter is sure that the table he has made is "real", and the same holds for the nanotechnologist manipulating atoms into new arrangements. To directly detect something is equivalent to getting information about its characteristics by observations and instruments, and both are theory laden. On the contrary, we know so completely the properties of what we ourselves have made that we can be sure even of the causal links that relate the final products to the procedures and tools we used to obtain them. This is the conceptual background from which Hacking's considerations start: experimental apparatus is nothing else than a peculiar kind of man-made equipment.^d

^d Hacking's position about the thesis advanced by those authors that regard natural sciences and their experimental apparatus as social constructs is chiefly depicted in [10]: «I try to make sense of the claim that something can be both real and a social construction», p.68. For the purposes of this paper, it is worth underlining that «Kant was the great pioneer of construction», [10], p.41, and that a deep historical background to the movement of constructivism relies on Giambattista Vico's thesis (we better understand what we ourselves have made).

3.1. The Scanning Tunnelling Microscope as a tool for manipulating individual atoms

To push resolution down to the microscale of the individual atom could seem a chimera due to the uncertainty principle, a fundamental tenet of quantum mechanics. In accord with Heisenberg's relations, no optical reflection microscope, however accurate, can be useful to observe an atom, since we have to illuminate it with a very short wavelength beam which induces unpredictable shifts of the particle, so that its position cannot be determined with certainty. This is true for free atoms, but does not apply to atoms embedded in a solid. The photon sent to determine the position of an individual atom might nudge it, according to the uncertainty principle, but if it belongs to an organised structure the neighbouring atoms will push it back into place. Thus to observe and even manipulate individual atoms in a piece of material seems possible in principle. Feynman was the first to underline, in his famous talk, that, in spite of its counterintuitiveness, manipulating atoms (which are typically a few tenths of a nanometre in size) does not need new physics. «I am not inventing anti-gravity, which is possible someday only if the laws are not what we think. I am telling you what could be done if the laws are what we think; we are not doing it simply because we haven't yet gotten around to it» [1], p.4. Nevertheless, we cannot use traditional microscopes to "see" atoms, as a consequence of several defects in their structure and optics, which further degrade the theoretical limit of their resolving power (set by the wavelength of the illuminating light). In the visible region of the electromagnetic spectrum their performance falls well short of what would be required. Scientists and technicians interested in increasing resolution in observing and studying material surfaces have introduced and developed several devices which adopt, for target illumination, beams of energised particles (like electrons or ions), aiming at shorter and shorter wavelengths. Basically, all these instruments derive from the Braggs' fundamental work [11]. We can characterize these instruments as "model based", since what they detect (a beam deflection angle, averaged on a relatively large area) is used to identify some unknown average parameter (e.g.
lattice constant) of a theoretical model (crystal arrangement) whose general structure is supposed to be in line with our previous knowledge. The desired information about the surface structure of the crystalline material observed is indirectly deduced from the identified model. The STM is a quite different kind of device in many ways. Using the vacuum tunnelling of electrons to study the surfaces of materials is quite different from using a particle beam and looking at its behaviour after deflection: it involves moving a tip over a surface to obtain local, direct topographic information about it. Producing an image of the topography of a surface by a tactile practice can be considered a perceptual ability as long as we contemplate macro-dimensions. But the image produced on a computer screen by running the scanning tip back and forth over the surface in order to keep the tunnelling current constant is the result of active manipulation, in a manner consistent with quantum mechanics. The quantum explanation of the tunnelling process, developed in 1983, shows that it is not just a question of "feeling" the topography of the
underlying surface, but rather a result of the overlap, at the closest proximity, of the electron orbitals of the apex atom of the sharp tip and of some atoms on the sample surface. In the STM case, we cannot touch without interacting. We cannot observe without interacting.

3.2. Shaping the world atom by atom

What we could simply define as "seeing with a scanning tunnelling microscope" is not appropriately described as "gathering any kind of experimental evidence". The STM allows us to manipulate unobservable entities, which opens significant new perspectives, not included in the mere interference with unobservable entities to get observable results. In his talk, Feynman already envisaged a time when atoms could be rearranged to order.^e The STM is an essential research tool in nanotechnology to pursue this goal. The tunnelling current used has enough energy to manipulate atoms.^f D. M. Eigler and E. K. Schweizer, IBM researchers, carried out an experiment in which the STM was used to position 35 individual xenon atoms on the surface of a low-temperature single crystal of nickel to spell out the letters IBM.^g Besides creating in this odd way their company's logo, they, like other research groups, have created "artificial" molecules, an atom at a time. Using nanotechnologies, scientists are abandoning the traditional assumption that they understand and explain a nature that is simply given. Instead, they embrace the project of remodelling or transforming it. In the wake of Feynman's lecture (a defining moment for nanoscience) they are taking a "bottom-up" approach to experimental research rather than the traditional "top-down" approach, which involves successive miniaturisation of macro-operations.^h

^e According to Feynman, «it would be, in principle, possible (I think) for a physicist to synthesize any chemical substance that the chemist writes down. Give the order and the physicist synthesizes it. How? Put the atoms down where the chemist says, and so you make the substance» [1], p.12.

^f By expanding the principle of the STM, Binnig and his group also developed the Atomic Force Microscope, in which the atomic force between the probe and the sample surface is used in place of the tunnelling current.

^g The now well-known image was first published in 1990 in the journal Nature.

^h «Nanotechnology should be recognized as a basic technology common to all atoms, bits and genomes (materials, data and genetic engineering) as it may result in the convergence of traditional top-down technology (miniaturization) with the newly developed bottom-up technology. The development of new properties incorporated in nanoscale structures -as well as functional materials and devices through the bottom-up approach- is now swiftly beginning to take shape and is no longer just a dream of the future. In short, a paradigm shift in advanced technologies through nanotechnology is steadily developing» [12], p.2.
Controlling matter at atomic or molecular levels means tailoring the fundamental properties belonging to phenomena. To build electronic devices using atom-by-atom engineering signifies manipulating and keeping stable the interaction between atoms and molecules; it means to explore the same scale dimension at which all natural materials and systems establish their foundation. This new methodological paradigm makes it clear that the usual divisions between "basic science" and "engineering" are no longer applicable. Also, in principle, distinguishing physical effects from artefacts becomes quite ambiguous.^i For example, a great challenge for nano-manufacturing technologies is to shape, by means of self-assembly, tailor-made products having functionally critical nanometre-scale dimensions. Manipulation of nano-structures using the STM requires a very long time, so the ultimate solution seems to be self-assembly, the most fundamental process for forming a functional and living structure.^j Conceptually, moving from small to larger size and creating new matter by combining atom with atom, one atom at a time, molecule to molecule, is an imitation of natural processes. To mimic nature is the ability that has marked several technical developments, so we might think that pursuing this goal is really nothing new. Nevertheless, science-based technology is a result of scientific abstraction and symbolism; it is far from being largely perception-based and it makes it possible to control nature on a hitherto unimaginable scale and even to take its place in governing phenomena. So, in exploring bottom-up, self-organising and self-assembling routes, nanoscience implies radical change in the philosophical analysis of what we consider technological and what we define natural. Artefacts produced by nanotechnologies and natural objects articulate analogous dynamics; they can be regarded as parts of a structurally identical whole.

^i To keep a valid criterion of demarcation between natural and artificial for what concerns nanotechnologies turns out to be a quite problematic undertaking. As Gregor Schiemann admits: «Building upon the narrow sense, I proposed an epistemic criterion according to which an object is natural if it is impossible -using all available scientific methods at a given time- to ascertain that it was produced by human action. This criterion makes it possible to distinguish -analogously to the Turing-test of artificial intelligence- between natural and artificial components of most nanotechnological processes and products. Given the multifariousness of the relationship between nanotechnology and nature, there are cases where it becomes problematic to distinguish between the two. I assume, however, that these cases are exceptions. Nanotechnological objects are mostly hybrids of nature and art; only in a few cases would they be said to be wholly natural because their artificial origin could no longer be confirmed.», Nanotechnology and Nature. On Two Criteria for Understanding Their Relationship, in [14], pp.77-96. Many interesting suggestions about the provocative challenges nanotechnology addresses to the philosophy of science and philosophy of technology can be found in the Special Issues on Nanotech Challenges jointly published by Techne and Hyle (ref. [13] and [14]).

^j Recent progress in this field is reviewed in [15]. It has been calculated that, if a device has a feature size of 5 nanometers and a scanning tip can move 10 atoms per second, it will take about 6 months to build 10^12 devices on an 8-inch wafer.

4. Conclusion

The breakthrough that technology provided with the development of the STM is closely connected with the dissolution of two conceptual guidelines that seem to be essential in our everyday thinking. The use of the tunnel effect as a tool for inspection purposes introduces the quantum wave function and the superposition of possible position states of a particle as elements of physical reality (undoubtedly, a quite disturbing intrusion). Moreover, to improve technological applications at the nanoscale with the aid of the STM, in order to realise very special processes of "material forming and machining", means to mobilise for this aim uncommon but effective self-assembly phenomena. And since this emergence of order at the atomic level applies both to natural dynamics (e.g. in molecular physics) and to small systems created by means of nanotechnologies, the difference between natural and man-made products fades away. Due to both these points, troublesome problems arise in the philosophical foundations of nanoscale research and applications. It becomes difficult to provide bases for maintaining a difference between experimenting on entities (properly interpreting the way in which the STM interacts with the sample) and constructing some artefact (manipulating the same elementary building blocks that nature employs and leaving them to self-organise and reproduce). These are upsetting questions, which venture to open further puzzling queries in the never-ending dilemma about what "reality" is.

Acknowledgments

I am extremely grateful to Thomas Nickles for his improving comments and suggestions, as well as to Riccardo Zich for his useful remarks.

References
1. R. Feynman, "There is Plenty of Room at the Bottom: An Invitation to Enter a New Field of Physics", Engineering and Science, February (1960).
2. G. Binnig, H. Rohrer, "Scanning tunnelling microscopy", IBM Journal of Research and Development 30, 355-369 (1986).
3. I. Hacking, Representing and Intervening (Cambridge University Press, 1983).
4. I. Hacking ed., Scientific Revolutions (Oxford University Press, 1981).
5. L. Wittgenstein, Tractatus Logico-Philosophicus (Routledge & Kegan Paul, London 1922, 1955).
6. N. R. Hanson, Patterns of Discovery: An Inquiry into the Conceptual Foundations of Science (Cambridge University Press, 1958).
7. D. Baird, A. Nordmann, J. Schummer and A. E. Schwarz, eds., Discovering the Nanoscale, International Conference (Darmstadt Technical University 2003).
8. P. K. Feyerabend, Conquest of Abundance (University of Chicago Press, 2000).
9. I. Hacking, "Do we see through a microscope?", Pacific Philosophical Quarterly 62, 305-322 (1981).
10. I. Hacking, The Social Construction of What? (Harvard University Press, 2000).
11. W. H. Bragg, Concerning the nature of things; six lectures delivered at the Royal Institution (G. Bell, London, 1925).
12. N. Ikezawa, "Nanotechnology: Encounters of Atoms, Bits and Genomes", Nomura Research Institute Papers 37, December 1 (2001).
13. "Nanotech Challenges", Techne Special Issue Part I, 8 (2), Winter, 2004; Part II, 8 (3), forthcoming.
14. "Nanotech Challenges", Hyle Special Issue Part I, 10 (2) (2004); Part II, 11 (1) (2005).
15. Z. L. Wang, "Self-assembled nanoarchitectures of polar nanobelts/nanowires", Journal of Materials Chemistry 15, 1021-1024 (2005).
MATHEMATICAL MODELS AND PHYSICAL REALITY FROM CLASSICAL TO QUANTUM PHYSICS

ARCANGELO ROSSI*
Dipartimento di Fisica dell'Università di Lecce, Via per Arnesano, I - 73100, Lecce, Italy

The concept of physical object in the physical theories of the twentieth century (relativity and quanta) turns from intuitive-substantial to formal-functional, that is, into a relatively invariant system of properties, well beyond the intuitive "carrier" of properties typical of classical physics (CP). However, the autonomous heuristic fecundity of this formal mathematical model was not fully shown before the twentieth century, nor is it independent of the properties it correlates. In quantum mechanics (QM) the turn to such a purely functional correlation, which is nevertheless irreducible to mere information related to measurements, is historically evidenced by the fact that the formalization of the theory as an operator calculus in Hilbert space came before its physical interpretation. Dirac, moreover, asserted the irreducibility of quantum properties to mere measurements, precisely because this simply pragmatic reduction is poor and ambiguous compared with a mathematical formulation of linguistically expressible (even though not always experimentally decidable) properties. Nor does a falsificationist interpretation of scientific theories, which makes them empirically decidable even though irreducible to mere instrumental theories, fit the purpose. It is then safer, and in better agreement with historical testimony, to admit not only the existence of properties which are measurable and reducible to instrumental operations, or at least falsifiable, but also the existence of properties which are physically inaccessible or empirically untestable, since properties of this kind are described and fruitfully implied by the mathematical model.
1. Introduction

The concept of physical object was radically modified through the shift from classical to modern physics (relativity and quantum theories).^1 The transformation of the physical object from substance to function, which was theorized by Cassirer^2 as typical of modern science in general, ever since Galileo, was instead historically determined and clearly expressed by the scientific revolutions of twentieth-century physics. The fact that this transformation has been identified with the essence of modern science by Cassirer, and even before him, led to its being misunderstood as an affirmation of vague empiricism and anti-ontological phenomenalism, supposedly characteristic of modern science in general as opposed to ancient and medieval natural philosophy.^3 On the other hand, just as the history of modern science cannot be identified with empiricism and phenomenalism in general, even less can the transformation from substance to function be identified with a positivistic stereotype: it must be more correctly traced back to the rise of relativity and quantum theories, which were, in any case, neither merely phenomenological nor anti-ontological theories. A more precise definition of that transformation is then required in the light of the actual historical process of scientific change then taking place. Only after the birth of relativity and quantum mechanics (QM) did it become possible to completely reduce the physical system to a non-empirical, mathematical invariant, no longer built up on the basis of physical analogies with pre-established models derived from concrete physical objects (for example, rigid bodies or oscillating systems), but rather as a functional structure. This structure is obtained by extending the properties of those physical objects themselves, in a generalized way, to new formal connections of functional type, previously not even conceived. These are irreducible to intuitive representations (in particular, of geometrical type) of the very physical objects from which they have nevertheless ultimately been derived. The objective reference nuclei of the new relativity and quantum laws are in fact no longer traditional intuitive "carriers" of properties, but become more or less stable functional connections of those properties.

* E-mail: [email protected].
In short, in the new physics the physical system is no longer an object independent of its properties: instead it becomes a functional correlation of properties fully expressible in mathematical terms, such as the relativistic invariants of Lorentz transformations or the invariants of continuous transformations in Hilbert space.^1 Of course, the above remarks do not mean that there was no tendency in classical physics (CP) to reduce physical systems to mere mathematical functional connections of properties. Rather, they mean that such a tendency, though present, was overcome by the opposite, prevailing tendency to interpret the physical system by tracing it back to its analogical representations of substantive character, through true processes of identification with concrete reference physical objects. The Lagrangian and Hamiltonian formalizations of CP (as Dirac later made explicit by showing that Hamiltonian mechanics was the basis for his formalization of QM^4) were no doubt strongly oriented towards a purely functional representation of physical bodies, understood as pure mathematical correlations of properties (typically, space-time position and quantity of motion): but, in fact, they nevertheless identified those correlations with intuitive realities
irreducible to mere formal correlations, such as material points, rigid bodies, oscillating systems. The point is that the physical object could not yet be reduced, notwithstanding existing tendencies in that direction, to a purely formal cluster of properties that no longer appealed, for the development of knowledge, to concrete physical analogies beyond the mathematical representation. The reason for this was the enormous heuristic efficacy of the physical analogy, which appeared, at that moment, irreplaceable. Only afterwards did the mathematical model acquire a new efficiency in suggesting new developments of knowledge and succeed in replacing the heuristic function of the old physical analogy, thus appearing as a formalism and language which was nevertheless heuristically fertile.^1

2. States and properties

The fading away of the heuristic role of substantive physical analogies, which previously appeared as a necessary presence in physics, does not imply, in modern physics, the end of the distinction between states and properties, which is instead still necessary. In fact, it is not appropriate, though the tendency is well represented,^1 to reduce those terms to one while calling into question substantive analogies and physical models in general, and with them the correlated conception of physical object. The mathematical functional connection in which the new concept of physical object or system consists (the former simply meant as an individual sample of the latter) is just a function which can change, but owes its relative invariance, stability, nomological legality and objectivity to its constituting elements. These constituting elements are the properties rather than the states of the system itself.
Indeed, if properties (such as being red, having a space-time position or a mass, a quantity of motion, a charge) are distinguished from one another by their distinct, though correlated, definition domains (quite different in any case from their actual measurements), states, according to the standard interpretation of QM, only express the quantity of information which is actually accessible to us about the properties themselves. Thus, a "pure state" is meant in QM precisely as a maximum of accessible information, which cannot be simply identified with the objective reality of properties as such, as already stressed by Einstein et al. in declaring the "incompleteness" of QM.^5 Identifying the two terms leads one to renounce that minimal realism of properties which is semantic rather than ontological and which refuses to identify reality (meant as an ensemble of properties each of which can be known, even if not all properties, although they are real, can be jointly known) with our actual knowledge of it. The latter is indeed limited to the information content, even maximal, but not necessarily exhaustive, that we can actually attain. A property is then what characterizes the system, understood as a functional, more or less variable
nucleus of properties; a state, on the contrary, expresses the quantity of information we have about the system, that is, the set of values that the properties which constitute the system in fact reveal step by step (even if a notion of state in terms of preparations can be introduced which allows one to define physical objects also as acts of preparation^6). To renounce the distinction between states and properties means to accept the point of view of the standard or "orthodox" interpretation of QM, according to which the object of knowledge is substantially identified with our actual knowledge of it (following the "philosophy of observables" that Bohr upheld against Einstein^7). A more realistic and modest view of knowledge counters this one in terms of a semantic realism of properties. This, however, also counters the ontological view that dominated CP, which conceives physical systems as in se or absolute objects, the so-called "carriers" of properties, independent of the properties themselves. These properties instead, whenever they are not identified with the states of the system, can give much more consistency than the states to the system itself, which is then simply meant as a functional, more or less stable, correlation of properties.

3. Von Neumann's formalization of Quantum Mechanics

It follows from Sec. 2 that the turn in the conception of physical system took place in terms, on the one hand, of irreducibility to elementary atomic objects, according to an idea of system as functional correlation, and, on the other hand, of irreducibility of the system's properties to the mere information obtained by measuring them. This fact is evidenced, in particular in QM, by the most important non-relativistic mathematical formulation of the theory.
This was formulated by von Neumann as an operator calculus in Hilbert space which completely preceded its physical interpretation.^8 According to Jammer,^9 it represented, from this point of view, a unique case in the history of physics, also in view of the unquestionable success it achieved. The formal structure of the mathematical correlations of the theory came before their physical interpretation and application, rather than being drawn from particular measurements, limited to the information about the accessible empirical values of the correlated properties. So, to be accepted, the formalization of QM realized by von Neumann had to respect some peculiar formal characters of quantum properties even before any concrete empirical confirmation, since it preceded them. These characters can be identified in particular with the non-commutativity of quantum properties, or with their intrinsically probabilistic character. In any case, they are introduced independently of any intuitive representation of objects constituting their imaginary "carriers" as completely separated individual objects, avoiding,
at the same time, identifying them with the actual information (in many cases, as we have seen, irremediably limited) which can be obtained about them by means of measurements.^8 True, von Neumann, in agreement with the widely shared "orthodox" interpretation to which he himself contributed, supposed, in the last instance, that the concepts of state and property collapsed into one another, introducing an operational reduction of observable properties to the knowledge of them summed up in the state.^8 Notwithstanding this, the mathematical model he adopted inevitably entailed a consequence which contradicted that reduction: the existence of "superselection rules", which call into question the so-called "projection postulate", according to which the measurement of an observable property defining the state of the system, by representing the maximum accessible information on it in certain conditions (its eigenvalue), verificationistically also exhausts the meaning of the measured property itself. In any case, those rules were derivable in Hilbert space theory well before and independently of von Neumann's controversial physical interpretation.^9 Moreover, von Neumann's interpretation left crucial concepts such as probability and measurement open to further, different interpretations. In particular, the concept of measurement was largely ambiguous and complex, since the tendency to collapse properties into states, thus totally reducing concrete physical properties to mere experimental measurement procedures, appeared most evident in it, was extremely controversial, and had still to be demonstrated in agreement with all the relations expressed by the axiomatically adopted mathematical Hilbert formalism.^9

4. Dirac's formalization beyond pragmatism and falsificationism

What, then, was the effect of the new trend of mathematical modeling in QM in the absence of substantive reference models?
It was a proliferation, a true panoply of formal models accounting for quantum properties in conceptually and formally different (though quantitatively coincident or equivalent) ways. Thus, different formal mathematical correlations were built which were, in the last instance, empirically indistinguishable. The above remark is confirmed by the fact that, soon after and independently of von Neumann, Dirac advanced a different, very compact and elegant formalization.^10 This was in turn criticized by von Neumann for its lack of rigor, in particular because it introduced functions which were considered "improper" and liable to conflicting physical interpretations, such as the famous
"delta-function" (though Schwartz afterwards reformulated its definition in a rigorous and univocal way with his theory of distributions9). Analogously, after the first, more physical and epistemological, interpretations of the theory by Heisenberg, Bohr and Schrödinger, further formulations were given, some of them more operational and formal in style, such as Jordan's, Weyl's, Wigner's and Stapp's so-called "S-matrix". Other formulations, in terms of "second quantization" or of Feynman's "path integrals",9 exalted the role of probability well beyond the previous statistical and probabilistic interpretations, considering any quantum phenomenon as a discrete probabilistic one, even if apparently purely undulatory and continuous.11 Dirac, however, contrary to von Neumann, strongly affirmed the irreducibility of quantum properties, functionally correlated, to their measurements. More specifically, he upheld the distinction between classical and quantum properties, which he called "c-numbers" and "q-numbers", respectively. The properties of both kinds could not be identified with mere measurements: rather, they were characterized by formal algebraic properties, such as the non-commutativity of the latter. Moreover, in Dirac's formalization even more sharply than in von Neumann's, the power of mathematics (just as in the case of the famous "delta-function" mentioned above) made the physical interpretation ambiguous, in its strong abstractness, as the problem of interpretation could even less be solved through concrete physical analogies and substantive models.12 Moreover, the appeal to instrumentation as a "deus ex machina" for the reduction of properties to states of the system, in order to overcome the difficulty (notwithstanding the irreducibility of properties to simple measurements, as seen above), does not look sufficient, even though it is entrusted not only to simple empirical procedures but also to true instrumental theories and theories of instrumentation.
We must in fact add other theories, beyond instrumental theories and theories of instrumentation, to give a physical meaning to phenomena: theories which do not define the system's properties as mere measurements or instrumental interventions. Only by acknowledging this new degree of freedom can one indeed avoid two converging reductive views: pragmatism and falsificationism. Pragmatism is, even in its most rigorous operationalist expressions, contradicted by the fact that instrumental intervention and measurement alone are often poor and ambiguous, at variance with the mathematical formulation, provided that this is linked to specific properties expressed by the formalism through their correlations (which are in any case not always experimentally decidable, though formally well expressible).
Popper's falsificationism, in its turn, appears contradictory. On the one hand, it admits that there are facts untranslatable into immediate experimental data, consisting in properties expressed by theories which are irreducible to instrumental theories. On the other hand, though acknowledging the speculative and creative character of these theories, it also admits that in science they are in fact always empirically decidable (falsifiable) through their experimental consequences.15 As a counterexample, one could quote the magnetic monopoles mathematically introduced by Dirac,16 whose instrumental tests did not succeed in yielding unambiguous evidence or refutation,17 notwithstanding Popper's belief in the possibility of deciding any question in science on the basis of falsificationism. There is in fact within falsificationism a kind of subtle dogmatism, since it attributes the capacity of selecting (deciding on) even the most abstract and speculative scientific theories to a unique decision method based on mere empirical consequences. Thus, we conclude that it would be better, in agreement with historical evidence though contrary to standard or "orthodox" QM, to admit that there are further properties besides those that are measurable, or reducible to experimental operations, or at least falsifiable: to be precise, properties which are described or fruitfully implied by a mathematical model taken as a starting point, although not always physically accessible and empirically or instrumentally decidable (such as properties which are not precisely measurable simultaneously with others in QM, or Dirac's magnetic monopoles). In fact, as underlined by Galison,18 it is impossible to say in general when experiments end, because there is no absolutely certain general decision rule in testing physical theories: these are not always strictly decidable or reducible to mere instruments or operations, though they may be empirically and experimentally quite fertile.

References

1. S. D'Agostino, Physis 40, 219 (2003).
2. E. Cassirer, Substance and Function and Einstein's Theory of Relativity (Dover, New York, 1953).
3. E. Mach, The Science of Mechanics (Chicago University Press, Chicago, 1893).
4. P. A. M. Dirac, Proc. Royal Soc. A109, 642 (1925).
5. A. Einstein, B. Podolsky and N. Rosen, Physical Review 47, 777 (1935).
6. C. Garola and S. Sozzo, here.
7. N. Bohr, Physical Review 48, 696 (1935).
8. J. von Neumann, Mathematical Foundations of Quantum Mechanics (Princeton University Press, Princeton, N. J., 1955).
9. M. Jammer, The Philosophy of Quantum Mechanics (John Wiley & Sons, New York, 1974).
10. P. A. M. Dirac, The Principles of Quantum Mechanics (Clarendon Press, Oxford, 1930).
11. R. P. Feynman, QED. The Strange Theory of Light and Matter (University of California Press, Berkeley, Los Angeles, 1985).
12. O. Darrigol, From c-Numbers to q-Numbers. The Classical Analogy in the History of Quantum Physics (University of California Press, Berkeley, Los Angeles, 1992).
13. S. D'Agostino, here.
14. P. W. Bridgman, The Logic of Modern Physics (The Macmillan Company, New York, 1927).
15. K. R. Popper, The Logic of Scientific Discovery (Basic Books, New York, 1959).
16. P. A. M. Dirac, Proc. Royal Soc. A133, 60 (1931).
17. H. Kragh, Studies Hist. Phil. Sci. 12 (1981).
18. P. Galison, How Experiments End (The University of Chicago Press, Chicago, 1987).
COMPLEX ENTANGLEMENT AND QUATERNIONIC SEPARABILITY*
G. SCOLARICI†
Dipartimento di Fisica dell'Università di Lecce and INFN, Sezione di Lecce, I-73100 Lecce, Italy

L. SOLOMBRINO‡
Dipartimento di Fisica dell'Università di Lecce and INFN, Sezione di Lecce, I-73100 Lecce, Italy
We consider the evolution of an entangled state of a simple compound system made up of two spin-1/2 systems, both in complex and in quaternionic quantum mechanics. We show, by using a recent remarkable result on quaternionic maps by Kossakowski, that the initial and final matrices associated with a component subsystem can be seen as the complex projections of quaternionic pure states connected by a unitary evolution operator. Furthermore, the state of the compound system looks like a separable state in quaternionic quantum mechanics.
*Partially supported by PRIN "Sintesi".
†E-mail: [email protected].
‡E-mail: [email protected]

1. Introduction

The essential difference in the concept of state in classical and quantum mechanics is clearly pointed out by the phenomenon of entanglement, which may occur whenever the product states of a compound quantum system are superposed. Entangled states play a key role in all controversial features of QM; moreover, the recent developments in quantum information theory have shown that entanglement can be considered a concrete physical resource that it is important to identify, quantify and classify. The usual techniques devised to this end rely on the concept of density matrix.1,2 The state of a quantum system $\Sigma$ whose state space has finite dimension $n$ can be represented by an $n \times n$ quantum density matrix $\rho$, or equivalently, by a Hermitian, positive (i.e., all its diagonal matrix
elements, in any basis, must be nonnegative) operator of trace class (in particular, of unit trace; for the sake of brevity we do not distinguish between operators and their representing matrices in the following). Any external action which changes the state of $\Sigma$ can be represented as a mapping $\Lambda_\Sigma$ of the state space into itself,3 hence $\Lambda_\Sigma$ must be positive (i.e., it must preserve positivity of operators). Yet positivity is a necessary but not a sufficient condition for a given map $\Lambda_\Sigma$ to describe a physical process. In the case of open system dynamics, one usually asks for a more stringent condition than positivity, namely complete positivity. Essentially, the requirement that $\Lambda_\Sigma$ is completely positive (CP) guarantees that (for any $n$) the map $\Lambda_\Sigma \otimes I_n$, where $I_n$ is the identity map acting on the states of another $n$-level system $\Sigma_n$, preserves the positivity of all states of the compound system $\Sigma + \Sigma_n$. The physical argument supporting complete positivity is that one cannot exclude that the system $\Sigma$ might have interacted in the past with another $n$-level system $\Sigma_n$. In this case one should consider the two systems together, even though only one of them has a non-trivial evolution described by the map $\Lambda_\Sigma$, while the other is dynamically inert (note that, if $\Lambda_\Sigma$ is not CP, the only states of $\Sigma + \Sigma_n$ that may develop negative eigenvalues under $\Lambda_\Sigma \otimes I_n$ are the entangled states). More generally, the completely positive maps are positive maps whose tensor product with the identity map is again positive; they can be characterized as the convex set generated by the maps of the form:2

$$\rho \to S \rho S^{\dagger}, \tag{1}$$

where $S$ is a linear operator and $\dagger$ denotes the Hermitian conjugation. However, unitary evolution of compound systems may lead in standard QM to an evolution of the density matrices associated (via partial trace) with the component subsystems that is neither unitary nor described by CP maps. The role and the physical interpretation of the maps which are positive but not CP are still under investigation.4 Some hints in this direction can be provided by a recent result on quaternionic maps.5 Following the suggestions in Ref. 5, we study in this paper an entangled state of a system made up of two two-level subsystems and discuss the evolution of this particular state in standard (complex) quantum mechanics (CQM) as well as in quaternionic quantum mechanics (QQM), where a Hilbert space over the skew-field $Q$ represents the set of states of the physical system. The plan of the paper is the following. Firstly, we collect some basic notations and results on QQM (Sec. 2). Secondly, we recall and quickly
illustrate a general result by Kossakowski5 on quaternionic maps and their complex projections (Sec. 3). Thirdly, we apply these concepts to study a simple situation, considering an entangled state of a physical system made up of two different spin-1/2 systems: we show by means of a simple exercise that the evolution of the subsystems in QQM can be suitably described by quaternionic unitary (hence, completely positive) maps, and that the complex (not completely positive) maps acting on the reduced density matrices in CQM are just the complex projections of the former (Sec. 4). Finally, we provide a quaternionic description of the state of the compound system as a separable state (Sec. 5).

2. Density matrix formalism in quaternionic Hilbert spaces

We recall here some basic notations. A (real) quaternion is usually expressed as $q = q_0 + q_1 i + q_2 j + q_3 k$, where $q_l \in \mathbb{R}$ ($l = 0, 1, 2, 3$), $i^2 = j^2 = k^2 = -1$, $ij = -ji = k$. The quaternion skew-field $Q$ is an algebra of rank 4 over $\mathbb{R}$, non-commutative and endowed with an involutory anti-automorphism (conjugation) such that

$$q \to \bar{q} = q_0 - q_1 i - q_2 j - q_3 k.$$
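The multiplication rules just recalled can be checked with a short script. The following is only a minimal sketch in Python; the class name `Quaternion` and its field names are illustrative choices and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quaternion:
    """q = q0 + q1*i + q2*j + q3*k with real components (hypothetical helper)."""
    q0: float
    q1: float
    q2: float
    q3: float

    def __mul__(self, other):
        # Hamilton product, obtained from i^2 = j^2 = k^2 = -1 and ij = -ji = k.
        a0, a1, a2, a3 = self.q0, self.q1, self.q2, self.q3
        b0, b1, b2, b3 = other.q0, other.q1, other.q2, other.q3
        return Quaternion(
            a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
            a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
            a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
            a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0,
        )

    def conj(self):
        # Conjugation flips the sign of the three imaginary components.
        return Quaternion(self.q0, -self.q1, -self.q2, -self.q3)


i = Quaternion(0, 1, 0, 0)
j = Quaternion(0, 0, 1, 0)
k = Quaternion(0, 0, 0, 1)
q = Quaternion(1, 2, 3, 4)

print(i * j == k)      # ij = k
print(j * i == k)      # False: the product is not commutative (ji = -k)
print(q * q.conj())    # real quaternion carrying the squared norm of q
```

The last line illustrates why conjugation is useful: the product of a quaternion with its conjugate has no imaginary part.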
In a (right) $n$-dimensional vector space $Q^n$ over $Q$, every linear operator is associated in a standard way with an $n \times n$ matrix acting on the left. Moreover, in analogy with the case of vector spaces over $\mathbb{C}$, one can introduce the concepts of unitarity, Hermiticity and so on. The density matrix $\rho_f$ associated with a pure state $|f\rangle$ belonging to a right quaternionic Hilbert space $\mathcal{H}^Q$ is defined by

$$\rho_f = |f\rangle\langle f| \tag{2}$$

and is the same for all (normalized) ray representatives. The definition of the density matrix associated with mixed states is given in a standard way. Denoting by $\mathrm{Re\,Tr}\,A$ the real part of the trace of the linear operator $A$ (notice that the real trace enjoys the cyclic property:6 $\mathrm{Re\,Tr}\,AB = \mathrm{Re\,Tr}\,BA$), the expectation value of a quaternionic self-adjoint operator $A$ can be expressed in terms of $\rho_f$ as follows7

$$\langle A\rangle_f = \langle f|A|f\rangle = \mathrm{Re\,Tr}(A|f\rangle\langle f|) = \mathrm{Re\,Tr}(A\rho_f). \tag{3}$$

The time evolution equation for $\rho_f$ reads7

$$\frac{d}{dt}\rho_f = -[H, \rho_f], \tag{4}$$

where $H$ is the quaternionic anti-Hermitian Hamiltonian operator. Hence, by using Eq. (4) and the cyclic property, one obtains the time evolution equation for $\langle A\rangle_f$:

$$\frac{d}{dt}\langle A\rangle_f = \mathrm{Re\,Tr}\left\{\left(\frac{\partial A}{\partial t} + [H, A]\right)\rho_f\right\} = \left\langle \frac{\partial A}{\partial t} + [H, A]\right\rangle_f. \tag{5}$$
Moreover, for every linear operator $A$ and density matrix $\rho$, let us put $A = A_c + j\tilde{A}$ and $\rho = \rho_c + j\tilde{\rho}$, where $A_c$, $\tilde{A}$, $\rho_c$, $\tilde{\rho}$ are complex matrices (hence, $A_c$ and $\rho_c$ are the complex projections of the quaternionic operators $A$ and $\rho$, respectively); from Eq. (3) it follows that the expectation value $\langle A\rangle_f$ may depend on $\tilde{A}$ or $\tilde{\rho}$ only if both $\tilde{A}$ and $\tilde{\rho}$ are different from zero. Indeed, one easily obtains

$$\langle A\rangle_f = \mathrm{Re\,Tr}(A\rho) = \mathrm{Re\,Tr}(A_c\rho_c - \tilde{A}^{*}\tilde{\rho}), \tag{6}$$

where $*$ denotes complex conjugation. As a consequence of Eq. (6) it follows that quaternionic physical effects can be revealed on quantum systems described in a quaternionic Hilbert space only if both the corresponding state and the observable are represented by genuinely quaternionic matrices. On the contrary, if an observable $O$ is represented by a complex Hermitian matrix, its expectation values cannot depend on the quaternionic part $j\tilde{\rho}$ of the state $\rho = \rho_c + j\tilde{\rho}$; moreover, the expectation value predicted in standard (complex) quantum mechanics for the state $\rho_c$ coincides in such a case with the one predicted in quaternionic quantum mechanics for the state $\rho$, since

$$\mathrm{Tr}(O\rho_c) = \mathrm{Re\,Tr}(O\rho_c) = \mathrm{Re\,Tr}(O\rho).$$
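Equation (6) lends itself to a quick numerical check. The sketch below is only an illustration under an explicit assumption: a quaternionic matrix $C + jD$ (with $C$, $D$ complex) is stored through the $2n \times 2n$ complex block matrix $\chi(C + jD)$ with rows $(C, -D^*)$ and $(D, C^*)$, a standard multiplicative representation that follows from $zj = jz^*$. The function names `chi` and `re_tr` are hypothetical helpers, and random complex matrices are used because the identity is purely algebraic.

```python
import numpy as np


def chi(C, D):
    """Complex 2n x 2n image of the quaternionic matrix C + jD (assumed representation)."""
    return np.block([[C, -D.conj()], [D, C.conj()]])


def re_tr(M):
    """Real trace of a quaternionic matrix, read off from its image chi."""
    return 0.5 * np.trace(M).real


rng = np.random.default_rng(0)


def random_complex(n):
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))


n = 3
A_c, A_t = random_complex(n), random_complex(n)        # A   = A_c + j A_t
rho_c, rho_t = random_complex(n), random_complex(n)    # rho = rho_c + j rho_t

# Left-hand side of Eq. (6): Re Tr(A rho), computed with the full quaternionic product.
lhs = re_tr(chi(A_c, A_t) @ chi(rho_c, rho_t))
# Right-hand side of Eq. (6): Re Tr(A_c rho_c - A_t^* rho_t).
rhs = np.trace(A_c @ rho_c - A_t.conj() @ rho_t).real
print(np.isclose(lhs, rhs))

# A purely complex observable (A_t = 0) does not see the quaternionic part of rho.
zero = np.zeros((n, n))
complex_only = re_tr(chi(A_c, zero) @ chi(rho_c, rho_t))
print(np.isclose(complex_only, np.trace(A_c @ rho_c).real))
```

The second check mirrors the remark that a complex observable gives the same expectation value on $\rho$ and on its complex projection $\rho_c$.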
3. Complex and quaternionic maps

In complex as well as in quaternionic quantum mechanics, the unitary evolution of a pure state $\rho$ is described by the CP map

$$\Lambda^{CP} : \rho \to \rho' = U\rho U^{\dagger}, \tag{7}$$

where $U$ is unitary, so that $\rho^2 = \rho$ implies $\rho'^2 = \rho'$. However, as we already observed in the Introduction, if $\rho$ is a state of a compound system, the evolution of the density matrices associated (via partial trace) with its subsystems is described in CQM by positive but, generally, not CP (hence, not unitary) maps, which only preserve the Hermiticity of the $\rho$'s. In particular, in the two-dimensional case, it has been proven8 that any positive map must have the form

$$\Lambda = \Lambda_1^{CP} + \Lambda_2^{CP} \circ T, \tag{8}$$

where $T$ denotes the transposition operation

$$T : \rho \to \rho^{T}$$

and $\Lambda_l^{CP}$ ($l = 1, 2$) is a CP map. Maps of the form (8) are called decomposable. Furthermore, a remarkable result, due to Kossakowski,5 states that any complex decomposable map (of a complex density matrix) can be seen as the projection of a corresponding quaternionic completely positive map. We can illustrate these results in a special case which generalizes (7), as follows. Let $U = R + jS$ be a quaternionic operator, let $\rho$ be any complex density matrix and let us associate the quaternionic CP map $\rho \to U\rho U^{\dagger}$ with $U$. By a direct computation one immediately obtains

$$U\rho U^{\dagger} = R\rho R^{\dagger} + S^{*}\rho^{T}S^{T} + j\,(S\rho R^{\dagger} - R^{*}\rho^{T}S^{T}). \tag{9}$$
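The direct computation leading to Eq. (9) can be reproduced numerically. The sketch below assumes the same $2n \times 2n$ complex representation of quaternionic matrices used as a bookkeeping device (rows $(C, -D^*)$ and $(D, C^*)$ for $C + jD$), under which the quaternionic adjoint becomes the ordinary conjugate transpose. The helper name `chi` is hypothetical, and $R$, $S$ are random complex matrices, since the identity does not rely on $U$ being unitary.

```python
import numpy as np


def chi(C, D):
    """Complex 2n x 2n image of the quaternionic matrix C + jD (assumed representation)."""
    return np.block([[C, -D.conj()], [D, C.conj()]])


rng = np.random.default_rng(1)


def random_complex(n):
    return rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))


n = 2
R, S = random_complex(n), random_complex(n)     # U = R + jS
X = random_complex(n)
rho = X @ X.conj().T
rho /= np.trace(rho).real                       # a complex density matrix

chi_U = chi(R, S)
image = chi_U @ chi(rho, np.zeros((n, n))) @ chi_U.conj().T

complex_part = image[:n, :n]    # complex projection of U rho U^dagger
j_part = image[n:, :n]          # matrix multiplying j in U rho U^dagger

# The two pieces predicted by Eq. (9).
expected_complex = R @ rho @ R.conj().T + S.conj() @ rho.T @ S.T
expected_j = S @ rho @ R.conj().T - R.conj() @ rho.T @ S.T

print(np.allclose(complex_part, expected_complex))
print(np.allclose(j_part, expected_j))
```

The complex projection is a sum of a CP term acting on $\rho$ and a CP term acting on $\rho^T$, which is the decomposable structure of Eq. (8).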
Since, trivially, $S^{T} = (S^{*})^{\dagger}$, the mapping $\rho \to R\rho R^{\dagger} + S^{*}\rho^{T}S^{T}$ has just the form (8). Conversely, given the map

$$\rho \to R\rho R^{\dagger} + S\rho^{T}S^{\dagger},$$

which has the form (8), it can be seen as the complex projection of the quaternionic CP map $\rho \to U\rho U^{\dagger}$ associated with $U = R + jS^{*}$. Our example shows that a new physical meaning can be attributed to decomposable maps if evidence in favour of QQM is provided. In the next section we apply the above results to a simple evolution pattern, obtaining some interesting consequences, even though in a very particular case.

4. Two spin-1/2 systems in $\mathcal{H}^Q$

Let us consider a compound system made up of two (different) spin-1/2 subsystems, denoted by 1 and 2, in standard QM, let $\mathcal{H}^C$ be the complex Hilbert space of the whole system, and let us suppose that the compound system evolves unitarily as follows,

$$|+,-\rangle \to \alpha|+,-\rangle + \beta|-,+\rangle,$$
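where $\alpha$ and $\beta$ are complex numbers such that $|\alpha|^2 + |\beta|^2 = 1$. The density matrices corresponding to the pure separable state $|+,-\rangle$ and to the pure entangled state $\alpha|+,-\rangle + \beta|-,+\rangle$ are given by

$$\rho_c(0) = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$

and

$$\rho_c(t) = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & |\alpha|^2 & \alpha\beta^{*} & 0 \\ 0 & \alpha^{*}\beta & |\beta|^2 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix},$$

respectively. The density matrices of the subsystems can be obtained by taking the partial traces of $\rho_c(0)$ and $\rho_c(t)$ with respect to the spin variables 2 and 1 respectively:

$$\rho_c^{(1)}(0) = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \to \rho_c^{(1)}(t) = \begin{pmatrix} |\alpha|^2 & 0 \\ 0 & |\beta|^2 \end{pmatrix} \tag{10}$$

and

$$\rho_c^{(2)}(0) = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix} \to \rho_c^{(2)}(t) = \begin{pmatrix} |\beta|^2 & 0 \\ 0 & |\alpha|^2 \end{pmatrix}. \tag{11}$$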
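The partial traces in Eqs. (10) and (11) can be reproduced with a few lines of NumPy. This is a sketch under stated assumptions: the basis ordering $(|+,+\rangle, |+,-\rangle, |-,+\rangle, |-,-\rangle)$ is taken from the matrices above, the sample values of $\alpha$ and $\beta$ are arbitrary, and `partial_trace` is a hypothetical helper name.

```python
import numpy as np


def partial_trace(rho, keep):
    """Reduced 2x2 density matrix of subsystem `keep` (1 or 2) of a two-qubit state."""
    r = rho.reshape(2, 2, 2, 2)               # indices: (a, b, a', b')
    if keep == 1:
        return np.einsum("abcb->ac", r)       # trace over subsystem 2
    return np.einsum("abac->bc", r)           # trace over subsystem 1


alpha, beta = 0.6, 0.8j                       # any pair with |alpha|^2 + |beta|^2 = 1
psi = np.array([0, alpha, beta, 0])           # alpha|+,-> + beta|-,+>
rho_t = np.outer(psi, psi.conj())

rho1 = partial_trace(rho_t, keep=1)
rho2 = partial_trace(rho_t, keep=2)
purity1 = np.trace(rho1 @ rho1).real

print(np.round(rho1.real, 4))                 # diag(|alpha|^2, |beta|^2)
print(np.round(rho2.real, 4))                 # diag(|beta|^2, |alpha|^2)
print(purity1)                                # below 1: the reduced state is mixed
```

A purity below 1 means the reduced matrix is no longer a projector, which is the starting point of the argument that follows.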
Thus, since $[\rho_c^{(l)}(t)]^2 \neq \rho_c^{(l)}(t)$ ($l = 1, 2$), we see that the dynamics inherited by each subsystem makes it evolve from a pure to a non-pure state. The dynamical evolution of the subsystems cannot be reduced to the dynamics described by a unitary evolution operator on the spaces of the two subsystems. From a purely algebraic point of view, unitary maps are indeed congruence transformations (see (7)), and no transformation of this kind (because of the Sylvester theorem9) can connect a semidefinite operator $\rho_c^{(l)}(0)$ with $\rho_c^{(l)}(t)$, which is positive definite. Now, let us come to QQM. In order to discuss the same physical system in a quaternionic Hilbert space, we denote by $\rho^{(l)}(0)$ and $\rho^{(l)}(t)$ the quaternionic density matrices of the subsystem $l$, and we put $\rho^{(l)}(0) = \rho_c^{(l)}(0)$, $\rho^{(l)}(t) = \rho_c^{(l)}(t) + j\tilde{\rho}^{(l)}(t)$, where $\rho_c^{(l)}(0)$ and $\rho_c^{(l)}(t)$ are given by Eqs. (10) and (11), while $\tilde{\rho}^{(l)}(t)$ is a still unknown complex matrix. We also observe that in $\mathcal{H}^Q$ as well as in $\mathcal{H}^C$ the spin observables are represented by the Hermitian matrices7
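Then, as we already noted in Sec. 2, it follows from Eq. (6) that the expectation values of the spin observables do not depend on the purely quaternionic part of the state. Now, let us suppose that in QQM the evolution is described by the quaternionic unitary operator $U$:

$$U = \begin{pmatrix} |\alpha| & |\beta| \\ |\beta|\,j & -|\alpha|\,j \end{pmatrix}, \tag{13}$$

whose adjoint operator is given by

$$U^{\dagger} = \begin{pmatrix} |\alpha| & -|\beta|\,j \\ |\beta| & |\alpha|\,j \end{pmatrix}.$$

It is then easy to verify by a straightforward calculation that

$$\rho^{(1)}(t) = U\rho^{(1)}(0)U^{\dagger} = U\rho_c^{(1)}(0)U^{\dagger} = \rho_c^{(1)}(t) + j\begin{pmatrix} 0 & -|\alpha\beta| \\ |\alpha\beta| & 0 \end{pmatrix} \tag{14}$$

and

$$\rho^{(2)}(t) = U\rho^{(2)}(0)U^{\dagger} = U\rho_c^{(2)}(0)U^{\dagger} = \rho_c^{(2)}(t) + j\begin{pmatrix} 0 & |\alpha\beta| \\ -|\alpha\beta| & 0 \end{pmatrix}. \tag{15}$$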
Recalling Eq. (9), we can conclude that the complex decomposable map $\rho_c^{(l)}(0) \to \rho_c^{(l)}(t)$ acting on the density matrix $\rho_c^{(l)}(0)$ associated with subsystem $l$ at $t = 0$ actually is the complex projection of a unitary (hence, CP) quaternionic map. Notice that the (right) eigenvalues of both $\rho^{(1)}(t)$ and $\rho^{(2)}(t)$ are 0 and 1,10,11 so that they are semidefinite operators, as one expects because of Eqs. (14) and (15) and of the quaternionic generalization of Sylvester's law.12 We stress once again that the quaternionic pure states $\rho^{(l)}(t)$ are physically indistinguishable from the complex ones $\rho_c^{(l)}(t)$, as long as we limit ourselves to considering spin variables only.

5. Quaternionic description of the compound system

It is well known that in the description of compound systems in quaternionic quantum mechanics the usual definition of Kronecker product of matrices does not hold, and also the standard definition of tensor product of Hilbert spaces cannot be used, owing to the non-commutativity of the skew-field $Q$ (in order to overcome this difficulty, a concept of tensor product of quaternionic Hilbert modules has been proposed,13 which allows one to describe compound systems on a mathematically well-founded basis; unfortunately, the results obtained in this way do not agree in the complex limit with those of standard quantum mechanics14). Anyway, in the particular case described in Sec. 4 all matrix elements of the quaternionic matrices $\rho^{(l)}(t)$ and $U$ commute, since they belong to $\mathbb{C}(1, j)$. This suggests resorting all the same to the usual Kronecker product to describe the compound system. Then, let us calculate the state $\rho(t) = \rho^{(1)}(t) \otimes \rho^{(2)}(t)$ and the evolution operator $\mathcal{U} = U \otimes U$. We obtain:
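$$\rho(t) = \begin{pmatrix} |\alpha\beta|^2 & 0 & 0 & |\alpha\beta|^2 \\ 0 & |\alpha|^4 & -|\alpha\beta|^2 & 0 \\ 0 & -|\alpha\beta|^2 & |\beta|^4 & 0 \\ |\alpha\beta|^2 & 0 & 0 & |\alpha\beta|^2 \end{pmatrix} + j\,|\alpha\beta| \begin{pmatrix} 0 & |\alpha|^2 & -|\beta|^2 & 0 \\ -|\alpha|^2 & 0 & 0 & -|\alpha|^2 \\ |\beta|^2 & 0 & 0 & |\beta|^2 \\ 0 & |\alpha|^2 & -|\beta|^2 & 0 \end{pmatrix} \tag{16}$$

and

$$\mathcal{U} = \begin{pmatrix} |\alpha| & |\beta| \\ |\beta|\,j & -|\alpha|\,j \end{pmatrix} \otimes \begin{pmatrix} |\alpha| & |\beta| \\ |\beta|\,j & -|\alpha|\,j \end{pmatrix} = \begin{pmatrix} |\alpha|^2 & |\alpha\beta| & |\alpha\beta| & |\beta|^2 \\ |\alpha\beta|\,j & -|\alpha|^2 j & |\beta|^2 j & -|\alpha\beta|\,j \\ |\alpha\beta|\,j & |\beta|^2 j & -|\alpha|^2 j & -|\alpha\beta|\,j \\ -|\beta|^2 & |\alpha\beta| & |\alpha\beta| & -|\alpha|^2 \end{pmatrix}. \tag{17}$$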
Of course, $\rho(t) = \rho^{(1)}(t) \otimes \rho^{(2)}(t) = (U \otimes U)\,(\rho^{(1)}(0) \otimes \rho^{(2)}(0))\,(U \otimes U)^{\dagger}$, i.e., the quaternionic density matrix in Eq. (16) has (by construction) the formal structure of a separable state. We are well aware that the above result does not hold in general, since it strongly depends on the form provided by Eq. (13) of the evolution operator $U$. Actually, it is easy to verify by a direct computation that the most general quaternionic unitary evolution operator $U'$ inducing $\rho_c^{(1)}(0) \to \rho_c^{(1)}(t)$ and $\rho_c^{(2)}(0) \to \rho_c^{(2)}(t)$ reads

$$U' = \begin{pmatrix} |\alpha|\,u_1 & |\beta|\,u_2 \\ |\beta|\,u_3 & |\alpha|\,u_4 \end{pmatrix},$$
where $u_1$, $u_2$, $u_3$, $u_4$ are unimodular quaternions satisfying $u_1\bar{u}_3 + u_2\bar{u}_4 = 0$. Then, if a different choice is made (for instance $u_1 = i = -u_2$ and $u_3 = j = u_4$) one obtains that the matrix elements of such an operator do not commute any more. Anyway, although we obtained it by a heuristic calculation, Eq. (16) provides a quaternionic density matrix that can be associated with the compound system (in the sense that the complex projections of the partial traces give the correct density matrices of the subsystems) and that enjoys an interesting property. Indeed, if we consider the partial transpose of $\rho$, that is the density matrix $\rho^{T_2}(t)$ obtained from (16) by interchanging the indices referring, say, to the second subsystem:

$$\rho^{T_2}(t) = \begin{pmatrix} |\alpha\beta|^2 & 0 & 0 & -|\alpha\beta|^2 \\ 0 & |\alpha|^4 & |\alpha\beta|^2 & 0 \\ 0 & |\alpha\beta|^2 & |\beta|^4 & 0 \\ -|\alpha\beta|^2 & 0 & 0 & |\alpha\beta|^2 \end{pmatrix} + j\,|\alpha\beta| \begin{pmatrix} 0 & -|\alpha|^2 & -|\beta|^2 & 0 \\ |\alpha|^2 & 0 & 0 & -|\alpha|^2 \\ |\beta|^2 & 0 & 0 & -|\beta|^2 \\ 0 & |\alpha|^2 & |\beta|^2 & 0 \end{pmatrix} \tag{18}$$

and we solve the corresponding quaternionic eigenvalue problem,10,11 we see that its eigenvalues are 0 and 1, hence $\rho^{T_2}(t)$ is a positive state. In the realm of complex QM, whenever the subsystems are 2-dimensional, this property is actually a necessary and sufficient condition for a density matrix to describe a separable state.2 Hence, we get a further argument which supports (together with the formal structure of $\rho(t)$) the conjecture that the density matrix (16) actually describes a separable state in QQM. In conclusion, our research has pointed out a puzzling situation, in which the same state of a physical system is entangled in CQM, while it seems to be separable in QQM. We hope that further investigations (on quaternionic maps, and more generally, on QQM) will contribute to throwing new light on this problem.
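The statements about the Kronecker product and the partial transpose can be explored numerically, because every matrix element involved lies in the commutative subfield $\mathbb{C}(1, j)$. The sketch below rests on two assumptions that are not taken from the paper: the quaternion unit $j$ is represented by Python's imaginary unit `1j` (legitimate only because no element containing $i$ or $k$ appears), and the subsystem states are taken to be $\rho^{(1)}(t)$ and $\rho^{(2)}(t)$ with diagonals $(|\alpha|^2, |\beta|^2)$ and $(|\beta|^2, |\alpha|^2)$ and purely-$j$ off-diagonal entries of modulus $|\alpha\beta|$ and opposite sign. The sample moduli are arbitrary.

```python
import numpy as np

a, b = 0.6, 0.8      # |alpha| and |beta|, with a**2 + b**2 == 1
J = 1j               # stands in for the quaternion unit j inside C(1, j)

# Assumed pure subsystem states (complex projection plus a purely-j part).
rho1 = np.array([[a**2, -a * b * J], [a * b * J, b**2]])
rho2 = np.array([[b**2, a * b * J], [-a * b * J, a**2]])

# Compound state built with the ordinary Kronecker product.
rho = np.kron(rho1, rho2)

# Partial transpose: swap the row and column indices of the second subsystem.
rho_pt = rho.reshape(2, 2, 2, 2).transpose(0, 3, 2, 1).reshape(4, 4)
eigs = np.linalg.eigvalsh(rho_pt)

# Dropping the j-part of the partial trace gives the complex projection.
reduced1 = np.einsum("abcb->ac", rho.reshape(2, 2, 2, 2))

print(np.round(eigs, 10))            # three zeros and a one
print(np.round(reduced1.real, 4))    # diag(|alpha|^2, |beta|^2)
```

In this representation the partial transpose is still a rank-one projector, so it has no negative eigenvalue; the complex matrix $\rho_c(t)$ of the same system, by contrast, is entangled.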
References

1. V. Gorini, A. Kossakowski and E. C. G. Sudarshan, J. Math. Phys. 17, 821 (1976).
2. M. Horodecki, P. Horodecki and R. Horodecki, "Mixed-state entanglement and quantum communication", arXiv: quant-ph/0109124 (2001). Also reprinted in Quantum Information: An Introduction to Basic Theoretical Concepts and Experiments by G. Alber, T. Beth, M. Horodecki, P. Horodecki, R. Horodecki, M. Rötteler, H. Weinfurter, R. Werner and A. Zeilinger (Springer Tracts in Modern Physics, 2001).
3. K. Kraus, Ann. Phys. 64, 311 (1971).
4. V. I. Man'ko, G. Marmo, E. C. G. Sudarshan and F. Zaccaria, Phys. Lett. A 327, 353 (2004).
5. A. Kossakowski, Rep. Math. Phys. 46, 393 (2000).
6. D. Finkelstein, J. M. Jauch, S. Schiminovich and D. Speiser, J. Math. Phys. 4, 136 (1963).
7. S. L. Adler, Quaternionic Quantum Mechanics and Quantum Fields (Oxford University Press, New York, 1995).
8. S. L. Woronowicz, Comm. Math. Phys. 51, 243 (1976).
9. R. A. Horn and C. R. Johnson, Matrix Analysis (Cambridge Univ. Press, 1985).
10. S. De Leo and G. Scolarici, J. Phys. A 33, 2971 (2000).
11. S. De Leo, G. Scolarici and L. Solombrino, J. Math. Phys. 43, 5815 (2002).
12. F. Zhang, Lin. Alg. Appl. 251, 21 (1997).
13. A. Razon and L. P. Horwitz, Acta Appl. Math. 24, 141 (1991).
14. S. P. Brumby, G. C. Joshi and R. Anderson, Phys. Rev. A 51, 976 (1995).
MACH-ZEHNDER INTERFEROMETER AND QUANTITATIVE COMPLEMENTARITY

CARLO TARSITANI
Department of Physics, University of Roma 'La Sapienza', Roma, Italy

FABRIZIO LOGIURATO
Department of Physics, University of Trento, 38050 Povo, Trento, Italy
The complementarity principle is often stated as the impossibility of performing an experiment by which both the effects of the wave-like behaviour and the effects of the particle-like behaviour of quantum objects can be simultaneously tested. For instance, in the Mach-Zehnder two-way interferometer, the effect of interference is destroyed whenever we insert a device that allows us to know which of the two paths has been chosen by each photon. Interference effects and "which-way" experiments are mutually exclusive. In the present paper we focus our attention on an ideal situation in order to show the validity of a quantitative representation of the complementarity principle, which has been recently introduced. On this basis, we develop a simple conceptual analysis of the intimate connection between the complementarity and uncertainty principles.
1 Introduction
According to the most common interpretation of Bohr's complementarity principle, it is impossible to check, at the same time and with the same experimental apparatus, both the effects of the wave-like behaviour and the effects of the particle-like behaviour of quantum objects. The experimental conditions that allow us to check the wave properties of an object are incompatible with the experimental conditions that allow us to check its particle properties, and vice versa [1]. For instance, in the well-known double-slit experiment we can set up the experimental apparatus either to observe the interference pattern or to ascertain which of the two slits each object passed through, but we cannot obtain both results with the same experimental arrangement. If we insert a device that lets us know the path of the object, we destroy the interference pattern, that is, the most evident effect of the object's wave-like properties. Now, the fact that the interference pattern disappears is often attributed to the unavoidable perturbation of the object's momentum that is introduced by the device by which we obtain the which-path information. From the quantitative point of view, it is commonly stated that the amount of the perturbation cannot
be smaller than what is predicted by Heisenberg's relations [2-4]. Indeed, Heisenberg's uncertainty principle is also often interpreted in terms of the "uncontrollable" perturbation involved in any measurement process. This point of view seems to be shared also by Feynman in his well-known description of the double-slit experiment with electrons [5]. If we want to ascertain the hole through which each electron passed, we can put a light source just behind the slits and observe the photons scattered by the electrons. It is easy to verify that, as a consequence of the uncertainty relations, any localization of the electrons disturbs their momentum enough to destroy the interference effects. In a few words, as Feynman says, "trying to watch the electrons we have changed their motions" [5]. However, some years ago, Scully, Englert, and Walther envisaged a thought experiment whose importance from the conceptual point of view should not be underestimated. In fact, the disappearance of the wave-like effects in this experiment cannot be explained in terms of a perturbation of the momentum [6,7]. The authors even say that the uncertainty relations have in general nothing to do with such experiments: interference disappears because the measurement process creates an entangled state for the compound system "measuring apparatus plus measured object". From this point of view, the complementarity principle would be more fundamental than the uncertainty principle. Obviously, even a collimated beam of atoms incident upon a two-slit arrangement will show an interference pattern. Now, for atomic beams, we can use two maser cavities in order to ascertain through which hole each atom is going to pass. In fact, each atom, if it is prepared in an excited state, on traversing either one of the cavities spontaneously emits a microwave photon by which we can get the "which-way" information.
Now, it is possible to show that no net momentum is transferred to the atom during its interaction with the cavity fields. Therefore, it is impossible to call into play the disturbing action of detectors in order to explain the loss of the interference effects. Some experiments that are conceptually equivalent to the two-slit experiment we have just described have actually been performed. For instance, an interferometric experiment with photons has been performed in which the role of the two slits was played by two trapped atomic ions. In this case, the which-path information is stored in the atoms' orbital state [8]. Rauch's interferometric experiments with neutrons are of the same interest. Here, the which-path information is inscribed in the spin state of the neutrons [9]. An optical analogue of such experiments is the following. One can insert along one of the arms of a Mach-Zehnder interferometer a system that rotates the photons' polarization state. If the polarization states of the photons that take different paths are orthogonal to each other, it is possible, by a polarization measurement
effected at the end of the interferometer, to trace back the path of each photon. Obviously, the interference effect is lost. The paper by Scully, Englert and Walther, and their claim that complementarity is a more general concept than uncertainty, has given rise to a long-standing debate in the literature. Several authors have attempted to reestablish an equal status for the two principles, by introducing hidden effects that Scully and his collaborators would have neglected and that would cause the required disturbance of the momentum. However, the results of these attempts are uncertain and their interpretation has never been clearly established with complete agreement [10-13]. The debate has made a really relevant step forward thanks to a contribution by Englert in which he formulates a quantitative representation of complementarity [14]. The final clarification of the question is due to Björk and his collaborators [15]. In their paper a proof can be found of the close relationship between the complementarity principle and the uncertainty relations. According to these authors, "which-way" experiments involve observables that are in general neither position nor momentum. By introducing new observables, it is still possible to attribute the loss of interference to a peculiar uncertainty relation.

2 Predictability and visibility in a Mach-Zehnder interferometer
We now describe in detail a typical Mach-Zehnder interferometer experiment (Fig. 1). Let us assume that a beam of "particles" that are initially in the momentum eigenstate $|a\rangle$ is directed towards the beam splitter BS1.
Fig. 1. The Mach-Zehnder interferometer.
The state of each particle is transformed into a superposition of the reflected state and the transmitted state. The mirrors M1 and M2 reflect the particles towards the second beam splitter BS2. The arrangement of the beam splitters and of the mirrors is such that, if the reflection and transmission coefficients of the two beam splitters are equal, all the particles are detected by the upper WWD. Let us describe the process from a quantitative point of view. The unitary evolution of the states $|a\rangle$ and $|b\rangle$, due to the 50-50 beam splitters, is the following:

$$|a\rangle \to \frac{1}{\sqrt{2}}\,(|a\rangle + i|b\rangle), \qquad |b\rangle \to \frac{1}{\sqrt{2}}\,(|b\rangle + i|a\rangle). \tag{1}$$

The evolution due to the mirrors is the following:

$$|b\rangle \to i|a\rangle, \qquad |a\rangle \to i|b\rangle. \tag{2}$$

More generally, let us assume that the transmission and reflection coefficients of the first beam splitter are different from each other. Then, the initial state evolves as follows:

$$|a\rangle \to \cos\alpha\,|a\rangle + i\sin\alpha\,|b\rangle. \tag{3}$$

Let's put $c_+ = \cos\alpha$ and $c_- = \sin\alpha$ (obviously $c_+^2 + c_-^2 = 1$). The system's total evolution, using the simplified notations $v_+ = -(c_+ + c_-)/\sqrt{2}$ and $v_- = (c_+ - c_-)/\sqrt{2}$, will be:

$$|a\rangle \to c_+|a\rangle + ic_-|b\rangle \to ic_+|b\rangle - c_-|a\rangle \to -\frac{1}{\sqrt{2}}\,(c_+ + c_-)\,|a\rangle + \frac{i}{\sqrt{2}}\,(c_+ - c_-)\,|b\rangle = v_+|a\rangle + iv_-|b\rangle. \tag{4}$$
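The chain of transformations in Eq. (4) can be followed step by step by composing $2 \times 2$ matrices acting on the amplitudes of $|a\rangle$ and $|b\rangle$. The sketch below is a hedged illustration: the second column of the first beam-splitter matrix (its action on $|b\rangle$) is not specified in the text and is an assumption chosen here so that the matrix is unitary and reduces to Eq. (1) at $\alpha = \pi/4$; the names `beam_splitter`, `MIRRORS` and `output_state` are hypothetical.

```python
import numpy as np


def beam_splitter(alpha):
    """|a> -> cos(alpha)|a> + i sin(alpha)|b>; the action on |b> is an assumed unitary completion."""
    c, s = np.cos(alpha), np.sin(alpha)
    return np.array([[c, 1j * s], [1j * s, c]])


MIRRORS = np.array([[0, 1j], [1j, 0]])      # Eq. (2): |a> -> i|b>, |b> -> i|a>
BS2 = beam_splitter(np.pi / 4)              # 50-50 beam splitter of Eq. (1)


def output_state(alpha):
    """Amplitudes of (|a>, |b>) after BS1(alpha), the mirrors and BS2, starting from |a>."""
    a_in = np.array([1, 0], dtype=complex)
    return BS2 @ MIRRORS @ beam_splitter(alpha) @ a_in


alpha = 0.3
out = output_state(alpha)

c_plus, c_minus = np.cos(alpha), np.sin(alpha)
v_plus = -(c_plus + c_minus) / np.sqrt(2)
v_minus = (c_plus - c_minus) / np.sqrt(2)

print(np.allclose(out, [v_plus, 1j * v_minus]))   # agreement with Eq. (4)
print(np.abs(output_state(np.pi / 4)) ** 2)       # balanced case: one output only
```

In the balanced case $\alpha = \pi/4$ the amplitude on $|b\rangle$ vanishes, which corresponds to all particles reaching a single detector.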
Now, following Greenberger and Yasin [16], we introduce the visibility $V$ as a function of the intensities of the maxima and the minima of the interference pattern:
$$V = \frac{I_{\max} - I_{\min}}{I_{\max} + I_{\min}}. \tag{5}$$
It is evident that $V$ can be considered as a measure of the system's wave properties and that the above intensities will be proportional to the probabilities of finding the system either along the path $a$ or the path $b$. Therefore, $V$ will be:
$$V = \frac{P_{\max} - P_{\min}}{P_{\max} + P_{\min}}. \tag{6}$$
From (6) and (4) we can deduce that, for our experimental arrangement, the visibility is:
$$V = v_+^2 - v_-^2 = \sin 2\alpha = 2c_+c_-. \tag{7}$$
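As a numerical check of Eqs. (1)-(7), the following minimal sketch propagates the amplitudes of $|a\rangle$ and $|b\rangle$ through the three stages of the interferometer. The function names and the representation of a state as a pair of complex amplitudes are illustrative assumptions, not part of the original treatment.

```python
import math

def mach_zehnder_output(alpha):
    """Return the final amplitudes (amp_a, amp_b) for the input state |a>.

    alpha is the mixing angle of the first beam splitter, Eq. (3);
    the second beam splitter is assumed to be 50-50, Eq. (1).
    """
    # First beam splitter, Eq. (3): |a> -> cos(alpha)|a> + i sin(alpha)|b>
    amp_a, amp_b = math.cos(alpha), 1j * math.sin(alpha)
    # Mirrors, Eq. (2): |a> -> i|b>, |b> -> i|a>
    amp_a, amp_b = 1j * amp_b, 1j * amp_a
    # Second beam splitter, Eq. (1)
    s = 1 / math.sqrt(2)
    amp_a, amp_b = s * (amp_a + 1j * amp_b), s * (amp_b + 1j * amp_a)
    return amp_a, amp_b

def visibility(alpha):
    """Visibility of Eq. (6), computed from the two detection probabilities."""
    amp_a, amp_b = mach_zehnder_output(alpha)
    p_a, p_b = abs(amp_a) ** 2, abs(amp_b) ** 2
    p_max, p_min = max(p_a, p_b), min(p_a, p_b)
    return (p_max - p_min) / (p_max + p_min)
```

For $0 \le \alpha \le \pi/2$ the value returned by `visibility(alpha)` should agree with $\sin 2\alpha$ up to rounding, and for $\alpha = \pi/4$ the amplitude along $b$ vanishes, in agreement with the statement that all the systems are then detected along $a$.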
Let's now introduce the predictability function $P$ [16]. We could perform a measurement in order to discover the path taken by each system, but we also have some chance of guessing the result of such a measurement in advance. If $c_+ > c_-$, the relation that quantifies our capability to predict the right result is the following:
$$P = \frac{c_+^2 - 1/2}{1/2} = \cos 2\alpha = c_+^2 - c_-^2. \tag{8}$$
Then, it's easy to show that:
$$P^2 + V^2 = 1. \tag{9}$$
It's worth noticing that, whenever we can predict the chosen path with certainty ($P = 1$), the visibility goes to zero ($V = 0$). Vice versa, if the predictability is zero ($P = 0$), we get the maximum value for the visibility of the interference pattern ($V = 1$). We can thus interpret the relation (9) as a quantitative expression of the complementarity principle. It's important to underline that for mixed initial states (in our case, for mixtures of the states $|a\rangle$ and $|b\rangle$), the more general inequality $P^2 + V^2 \le 1$ holds [14].
Let's now investigate the link between the expression (9) and the uncertainty relations. We introduce the Hermitian operator $A$, defined in a two-dimensional Hilbert space with orthonormal eigenstates $|A_+\rangle$ (eigenvalue $A_+ = 1$) and $|A_-\rangle$ (eigenvalue $A_- = -1$). Let's assume that the observable $A$ refers to the path that the system actually takes immediately after the first beam splitter. The eigenvalue $+1$ corresponds to the upper path and the eigenvalue $-1$ to the lower one. Then we get:
$$\langle A\rangle = c_+^2 - c_-^2. \tag{10}$$
So, the predictability $P$ is equal to the mean value $\langle A\rangle$. This equality can also be deduced in the following way: we put $|a\rangle = |A_+\rangle$, $|b\rangle = |A_-\rangle$ and $A = |A_+\rangle\langle A_+| - |A_-\rangle\langle A_-|$; if we define the state $|\psi_1\rangle$ as $|\psi_1\rangle = \cos\alpha\,|a\rangle + i\sin\alpha\,|b\rangle$, we can calculate $\langle\psi_1|A|\psi_1\rangle = P$. For the variance $\langle(\Delta A)^2\rangle$ we get:
$$\langle(\Delta A)^2\rangle = \langle A^2\rangle - \langle A\rangle^2 = 1 - (c_+^2 - c_-^2)^2 = 1 - P^2. \tag{11}$$
Now we introduce the operator $B$ with orthonormal eigenstates $|B_+\rangle$ (eigenvalue $B_+ = 1$) and $|B_-\rangle$ (eigenvalue $B_- = -1$). In order to connect $B$ with interference and visibility, let's assume that it refers to the path taken by the system at the end of the interferometer, just beyond the second beam splitter. We assign to the observable the value $1$ if the system is detected along $a$, and the value $-1$ if it is detected along $b$. The mean value of $B$ will be:
$$\langle B\rangle = v_+^2 - v_-^2. \tag{12}$$
Its variance will be:
$$\langle(\Delta B)^2\rangle = \langle B^2\rangle - \langle B\rangle^2 = 1 - (v_+^2 - v_-^2)^2 = 1 - V^2. \tag{13}$$
Let's discuss the form of the operator $B$. In terms of projection operators, it can be written as $B = |B_+\rangle\langle B_+| - |B_-\rangle\langle B_-|$. The observable $B$ refers to a which-path measurement at the end of the interferometer, that is, to the measurement of $A$ when the system is in the state $|\psi_2\rangle = v_+|a\rangle + i v_-|b\rangle$. Since the variances that appear in the uncertainty relations must refer to the same state, we can look for an operator $B$ such that $\langle\psi_2|A|\psi_2\rangle = \langle\psi_1|B|\psi_1\rangle$ and $\langle\psi_2|(\Delta A)^2|\psi_2\rangle = \langle\psi_1|(\Delta B)^2|\psi_1\rangle$. If $U_R$ and $U_{BS}$ indicate the evolution operators for the mirrors and for the beam splitters, respectively, we can notice that
$$\langle\psi_1|U_R^\dagger U_{BS}^\dagger A\, U_{BS} U_R|\psi_1\rangle = \langle\psi_1|B|\psi_1\rangle, \qquad |B_\pm\rangle = U_R^\dagger U_{BS}^\dagger |A_\pm\rangle.$$
Therefore we can represent $B$ in the basis of the eigenvectors of $A$. Leaving aside constant phase factors, a possible choice is the following:
$$|B_+\rangle = \frac{1}{\sqrt{2}}\left(|A_+\rangle + i|A_-\rangle\right), \qquad |B_-\rangle = \frac{1}{\sqrt{2}}\left(|A_+\rangle - i|A_-\rangle\right). \tag{14}$$
Notice that $A$ and $B$ are incompatible: if the system is in an eigenstate of $A$, we have no information about the value of $B$, and vice versa. So, for the observables $A$ and $B$, an uncertainty relation must hold. In fact, by multiplying the variances of $A$ and $B$, we obtain:
$$\langle(\Delta A)^2\rangle\langle(\Delta B)^2\rangle = (1 - P^2)(1 - V^2) = V^2P^2. \tag{15}$$
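Relations (10)-(15) can be checked numerically with a short sketch. It assumes, as in the text, $|A_+\rangle = |a\rangle$, $|A_-\rangle = |b\rangle$, the eigenvectors (14) for $B$, and the state $|\psi_1\rangle = \cos\alpha\,|a\rangle + i\sin\alpha\,|b\rangle$; the helper names are hypothetical and chosen only for illustration.

```python
import math

def inner(u, v):
    """Inner product <u|v> of two-component complex vectors."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))

def mean_and_variance(eig_plus, eig_minus, psi):
    """Mean and variance of an observable with eigenvalues +1 and -1."""
    p_plus = abs(inner(eig_plus, psi)) ** 2
    p_minus = abs(inner(eig_minus, psi)) ** 2
    mean = p_plus - p_minus
    # The observable squares to the identity, so <O^2> = p_plus + p_minus.
    return mean, (p_plus + p_minus) - mean ** 2

def path_and_interference(alpha):
    """Return (P, V, var_A, var_B) for the state after the first beam splitter."""
    s = 1 / math.sqrt(2)
    psi1 = (math.cos(alpha), 1j * math.sin(alpha))   # Eq. (3)
    a_plus, a_minus = (1, 0), (0, 1)                 # |A+> = |a>, |A-> = |b>
    b_plus, b_minus = (s, 1j * s), (s, -1j * s)      # Eq. (14)
    P, var_a = mean_and_variance(a_plus, a_minus, psi1)
    V, var_b = mean_and_variance(b_plus, b_minus, psi1)
    return P, V, var_a, var_b
```

Calling `path_and_interference(alpha)` for any real `alpha` should return values satisfying $P^2 + V^2 = 1$ and $\mathrm{var}_A \cdot \mathrm{var}_B = V^2P^2$ up to floating-point rounding.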
We can conclude that there is a close relationship between the complementarity and uncertainty principles: both can be expressed in a quantitative form by means of the functions $V$ and $P$. Moreover, it is always possible to define noncommuting operators that refer to the path or to interference, for which the uncertainty relations hold. In general the uncertainty relations do not involve the momentum observable. This is the main difference from what is suggested by Bohr's classical thought experiments.^a
^a In (15) the variances are calculated for a state that minimizes their product. For a more detailed analysis of the connection between the Robertson-Heisenberg relations and the visibility and predictability operators, see the paper by Bjork and collaborators [15].
3
Distinguishability and visibility in a Mach-Zehnder interferometer
In the above discussion, the measurements of the observables $A$ and $B$ were effected on the same set of identically prepared systems. The variances were calculated starting from separate sharp measurements of either $A$ or $B$. Let's now analyze the effect on visibility (on the interference pattern) when the path of each system is known with some uncertainty. In other terms, we imagine effecting simultaneous unsharp measurements of the noncommuting observables for each member of the set [15, 18]. For the sake of simplicity, we assume that the first beam splitter is also 50-50. Let's suppose that just behind the first beam splitter there is a device $R$ whose initial state is $|z_-\rangle$. If the system goes downwards, $R$ keeps its initial state; if the system goes upwards, $R$ takes the state $|R_+\rangle$, which need not be orthogonal to $|z_-\rangle$. The entangled evolution of the system and of the device is described as follows:
$$|a\rangle|z_-\rangle \to \frac{1}{\sqrt{2}}\left(|a\rangle + i|b\rangle\right)|z_-\rangle \to |\psi_1\rangle = \frac{1}{\sqrt{2}}\left(|a\rangle|R_+\rangle + i|b\rangle|z_-\rangle\right) \to \frac{1}{\sqrt{2}}\left(i|b\rangle|R_+\rangle - |a\rangle|z_-\rangle\right) \to |\psi_2\rangle = -\frac{1}{2}|a\rangle\left(|z_-\rangle + |R_+\rangle\right) - \frac{i}{2}|b\rangle\left(|z_-\rangle - |R_+\rangle\right). \tag{16}$$
Let $R$ be represented in a two-dimensional Hilbert space whose basis is $\{|z_+\rangle, |z_-\rangle\}$. Let's assume that $|R_+\rangle$ can be represented in a simplified form with real and positive coefficients: $|R_+\rangle = \cos\theta\,|z_+\rangle + \sin\theta\,|z_-\rangle$. By measuring the device's state we will get some information about the path of the system. For instance, the system could be a spin-1/2 "particle" whose initial state is $|z_-\rangle$, and we can imagine that, if it goes upwards, its spin is rotated by the device. We can assign to the observable $A$ the value $+1$ if the system takes the state $|z_+\rangle$, and the value $-1$ if it takes the state $|z_-\rangle$. However, if $|R_+\rangle$ and $|z_-\rangle$ are not orthogonal, the path will be known with some uncertainty. The probabilities $P_a$ and $P_b$ to find the system, at the end of the interferometer, along the upper path or along the lower one, respectively, will be:
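$$P_a = \langle\psi_2|a\rangle\langle a|\psi_2\rangle = \frac{1}{2}\left(1 + \langle z_-|R_+\rangle\right) = \frac{1}{2}(1 + \sin\theta), \tag{17}$$
$$P_b = \langle\psi_2|b\rangle\langle b|\psi_2\rangle = \frac{1}{2}\left(1 - \langle z_-|R_+\rangle\right) = \frac{1}{2}(1 - \sin\theta). \tag{18}$$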
If the chosen path is perfectly distinguishable ($\langle z_-|R_+\rangle = 0$), $P_a$ and $P_b$ are equal: the interference is not visible. On the contrary, if $\langle z_-|R_+\rangle = 1$, the device's state (the system's spin) does not change: we have no information about the path and all the systems are detected along $a$. The visibility, defined by the relation (6), now is:
$$V_s = |P_a - P_b| = \langle z_-|R_+\rangle = \sin\theta. \tag{19}$$
Let's also define a quantity $D$, in order to quantify the distinguishability of the paths [15]:
$$D = \left(1 - |\langle z_-|R_+\rangle|^2\right)^{1/2} = (1 - \sin^2\theta)^{1/2}. \tag{20}$$
So, one can easily demonstrate the following relation:
$$V_s^2 + D^2 = 1. \tag{21}$$
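The chain (16)-(21) can also be reproduced numerically. In the sketch below, which is only an illustration under the assumptions stated in the text (50-50 beam splitters, $|R_+\rangle = \cos\theta\,|z_+\rangle + \sin\theta\,|z_-\rangle$), the joint state is stored as one device vector with components $(z_+, z_-)$ for each path; the function name and this data layout are hypothetical.

```python
import math

def which_way_statistics(theta):
    """Return (P_a, P_b, V_s, D) for the interferometer with the device R."""
    s = 1 / math.sqrt(2)
    z_minus = (0.0, 1.0)                          # device components (z+, z-)
    r_plus = (math.cos(theta), math.sin(theta))   # |R+>
    # State |psi_1> of Eq. (16): (|a>|R+> + i|b>|z->)/sqrt(2)
    dev_a = [s * c for c in r_plus]
    dev_b = [1j * s * c for c in z_minus]
    # Mirrors, Eq. (2): |a> -> i|b>, |b> -> i|a>
    dev_a, dev_b = [1j * c for c in dev_b], [1j * c for c in dev_a]
    # Second beam splitter, Eq. (1), acting on the path degree of freedom only
    dev_a, dev_b = ([s * (x + 1j * y) for x, y in zip(dev_a, dev_b)],
                    [s * (y + 1j * x) for x, y in zip(dev_a, dev_b)])
    # Detection probabilities, Eqs. (17) and (18): trace over the device
    p_a = sum(abs(c) ** 2 for c in dev_a)
    p_b = sum(abs(c) ** 2 for c in dev_b)
    overlap = sum(x * y for x, y in zip(z_minus, r_plus))   # <z-|R+>, real here
    v_s = abs(p_a - p_b)                          # Eq. (19)
    d = math.sqrt(1 - abs(overlap) ** 2)          # Eq. (20)
    return p_a, p_b, v_s, d
```

For $0 \le \theta \le \pi/2$ the returned values should satisfy $V_s = \sin\theta$, $D = \cos\theta$ and hence relation (21), up to rounding.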
In this case, the detailed analysis of the relationship between simultaneous measurements and uncertainty relations is rather complicated: the reader can find it in the literature [15]. If the first beam splitter is not 50-50, optimal simultaneous measurements can be effected; the product of the uncertainties will be:
$$\langle(\Delta A)^2\rangle\langle(\Delta B)^2\rangle = (1 - W)^2. \tag{22}$$
One can see that the product only depends on the state $|\psi_1\rangle$, on the visibility (7), and on the predictability (8).
4
Conclusions
We have shown how, in the case of a Mach-Zehnder interferometer, it is possible to introduce observable magnitudes, that allow us to define quantitatively the notions of "predictability", that is the capability of predicting which path the object will choose, of "distinguishability", that is the capability of inferring which path the object has chosen after having gone through the interferometer, and the "visibility", that is the capability of recognizing the interference effects. By means of such magnitudes we can obtain a quantitative formulation of the wave-particle dualism and of Bohr's complementarity principle. The operators corresponding to predictability, distinguishability and visibility are incompatible and are linked by inequalities that resemble Heisenberg's uncertainty relations. It
is thus possible to give a mathematical expression to the close relationship between the two fundamental principles of quantum mechanics, and it is also possible to find a way to verify their equivalence. As shown by Bjork and his collaborators, complementarity is not more fundamental than uncertainty: the two principles can be considered as two sides of the same coin. A generalized complementarity relation can be formulated for any system defined in a two-dimensional Hilbert space. However, there is no need to express this relation in terms of position and momentum observables. The thought experiment by Englert, Scully and Walther, together with its various experimental realizations, shows that the effect of uncertainty relations on the interference loss does not always involve position and momentum but, in general, involves other noncommuting observables, depending on the actual experimental conditions.
References
1. M. Jammer, The Philosophy of Quantum Mechanics (Wiley, New York, 1974).
2. W. Heisenberg, Z. Phys. 43, 172 (1927).
3. W. Heisenberg, The Physical Principles of the Quantum Theory (University of Chicago Press, 1930).
4. S. Gasiorowicz, Quantum Physics (Wiley, New York, 1974).
5. R. P. Feynman, R. B. Leighton and M. Sands, The Feynman Lectures on Physics, Vol. 3 (Addison-Wesley, Reading, MA, 1989).
6. M. O. Scully, B. G. Englert and H. Walther, Nature 351, 111 (1991).
7. M. O. Scully and H. Walther, Phys. Rev. A39, 5229 (1989).
8. U. Eichmann et al., Phys. Rev. Lett. 70, 2359 (1993).
9. H. Rauch, Cont. Phys. 27, 345 (1986).
10. P. Storey et al., Nature 367, 626 (1994).
11. B. G. Englert, M. O. Scully and H. Walther, Nature 375, 367 (1995).
12. P. Storey et al., Nature 375, 368 (1995).
13. H. Wiseman and F. Harrison, Nature 377, 584 (1995).
14. B. G. Englert, Phys. Rev. Lett. 77, 2154 (1996).
15. G. Bjork et al., Phys. Rev. A60, 1874 (1999).
16. D. M. Greenberger and A. Yasin, Phys. Lett. A128, 391 (1988).
17. J. J. Sakurai, Modern Quantum Mechanics (Addison-Wesley, Reading, MA, 1985).
18. M. G. Raymer, Am. J. Phys. 62, 986 (1994).
ANTONIO GRAMSCI'S REFLECTION ON QUANTUM MECHANICS
ISABELLA TASSANI
Istituto di Filosofia, University of Urbino, Via Saffi 9, Urbino, Italy
As the first step of a wider historical reconstruction of the reception of quantum mechanics in twentieth-century philosophy, we consider Antonio Gramsci's philosophy. He asks himself about the nature of quantum objects, whether their existence depends on the act of measurement by the experimenter, and whether this kind of relationship can be interpreted as an argument in favour of an immaterialistic philosophy. We remark how an idealistic interpretation of quantum mechanics found fertile ground in Italian culture, which was characterized by an antiscientific attitude and, at the same time, by the need to find in science a term of comparison.
1. Introduction
The appearance of quantum mechanics aroused different reactions in European cultural circles at the beginning of the twentieth century, and it did not fail to provoke real hostility from physicists who were attached to the picture of the world offered by classical physics and to nineteenth-century experimental methodology; they had no inclination towards a kind of physics which seemed to them abstract, too mathematized, not visualizable. Inside the so-called "Copenhagen school", the founders of the theory did not hide the difficulty of elaborating a new way of considering nature, and sometimes they even suggested that philosophy would have to stimulate part of this conceptual renewal. Surely, neopositivists were the philosophers most inclined to accept this challenge; many of them were able to fully understand the new theory and sometimes, as in the case of Hans Reichenbach, to make an innovative and decisive contribution to the elaboration of its interpretation [31]. In the neo-Kantian movement, Ernst Cassirer suggested an interpretation of quantum mechanics in Kantian terms, which was taken up by his followers and by authors inclined to evaluate the new theory from the same perspective [11], [35]. On the other hand, outside the Kantian and neopositivist traditions, the reactions were sporadic and less incisive than those which accompanied the
appearance of the theory of relativity; in fact, its fundamental concepts immediately seemed to have philosophical pertinence [13]. Quantum mechanics, instead, showed itself to be a complex theory, not visualizable and often in full contrast with notions of common sense.^a In Italy such difficulties seemed particularly marked for philosophical reflection, because Italian scientists had neither shared in the elaboration of the formal apparatus of quantum mechanics nor carried out any work of mediation between their scientific task and cultural popularization outside the confined sphere of scientific research [2], [17], [18], [34]. The analysis of Antonio Gramsci's thought about microphysics is part of a wider plan of reconstructing the European cultural and philosophical climate in the age of the appearance of quantum mechanics; we are going to pay particular attention to the philosophers' capability of appropriating the conceptual novelty of this physical theory.
2. Gramsci's criticism of positivism and his conception of science
In the sphere of philosophical reflection on science carried out by Italian intellectuals in the first decades of the twentieth century, Gramsci was a bearer of novelty.^b In fact, as is well known, the intellectual climate of that age was dominated by the idealistic approach of the philosophy of Benedetto Croce and Giovanni Gentile, and by a widespread distrust towards science, which often became a real antiscientific culture.^c Such hostility had its
a
b
c
In [2], p. 30, Agazzi underlines "the relative poorness of epistemological discussions about quantum physics", compared to those about relativity; in fact, Italian mathematicians had prepared adequate mamematical instruments only to understand the latter. The fact that Gramsci had no part in the prevailing Italian culture in the three decades of the nineteenth-century is emphasized by Rossi, who singles out the characteristics of a similar culture in irrationalism, in the criticism of science and its union with technology, in the nostalgia towards the past ([33], p. 56). See [2] and [21], where Garin points out that the question of the role of the intellectuals in contemporary society was a subject that "was troubling" the European culture; so, Gramsci's reflection is surely not limited only to Italy's point of view (pp. 291 ff.). For the well-known critical attitudes of Croce towards science, see [12]. The stereotype that ascribes to the philosophical idealism, at the same time as the advent of fascism, the responsibility of the diffusion of an antiscientific culture in Italy, is analysed in [2]; Agazzi proves how idealism played in effect only a secondary role in causing the extinction of some phases of the previous Italian philosophy of science; after all, he points out that "such a kind of antipositivistic and, in the specified sense, also "antiscientific" reaction was diffused in those years in the whole world at philosophical level, such were the instrumentalistic, pragmatistic and conventionalistic interpretations of scientific knowledge and even the depreciations of formal logic" (p. 25); finally, Agazzi notices that inside Italian idealism we can find also more favourable attitudes towards science, for example that of Ugo Spirito, who appreciated it as a humanistic knowledge. 
The Italian panorama between the first and the second war was then more diversified than is usually known, and, according to the author, the debates occurred inside actualism, or in controversy with
origin in a misleading identification of general scientific empirical results with the representation of science offered by the nineteenth-century positivism. The idealistic criticism indeed remarked a narrowness of an activity, such as the scientific one is, which eventually limits itself merely to considering empirical data, even though translated into general laws, but however not able to embrace the large movements of thought in which subjectivity expresses itself. Positivism, after all, had offered an extremely limited image of scientific concern, either reducing it to a collection or a cataloguing of empirical datad, or linking it strictly to the idea of progress as evolution, which was badly suited to the historical and cultural context of the first decades of the new century. After all, all over Europe a criticism of positivism had been expressed not only by the strictly neoidealistic circle (Croce, Gentile) or by the "spiritualistic" one (Bergson), but also by authors that had taken an active part in elaboration of scientific theories, such as Poincare6. Gramsci's criticism of positivistic science is connected to this wider European perspective, which is developed by him on the basis of the double point of view of idealism and Marxism [21]. In fact, positivism is considered by Gramsci a kind of superstition, as it transmits a false image of science and even assumes the form of a reactionary ideology. The main mistake made by positivists lies also in a hypostatization of scientific prediction, based on mechanical causality, and in consideration of the latter as a methodological criterion, which we can apply also to different sciences from the natural ones,
d
c
it, ended in a gnosiological background which was very useful to epistemology (p. 31). For Leonetti the controversy between idealists and positivists was exhausted in the twenties, leaving its place to that part of Gramscian reflection which was more oriented in political sense [30]. That is the Gramscian criticism of positivism, which is oriented to an "abstract classifying, to methodologism and to formal logic"; see [24], p. 1467 (Q 11, § 45, 57 bis). Rossi points out that the controversy with the positivistic representation of progress had formed, in the three decades of the century, from very different ideas: "Ideas taken from Gentile and Croce, Bergson and Mach, James and Poincare, Nietzsche and Sorel acted in conjunction — even if used with different aims — as well as themes from intuitionism, conventionalism, pragmatism, historicist idealism, from the actualistic one and the magical one" ([30], p. 55). A similar point of view is supported by Garin: "We were wrong if we isolated the movement of ideas which was prevailing in Italy between the end of the nineteenth century and the first decades of the twentieth one, considering it as a "provincial" episode, and bringing it closer to some aspects of the French culture (Sorel, Bergson) or, perhaps, to the North American one (James), but separating or, at the worst, opposing it to the contemporary developments of the philosophy of life, of German historicism and even of Husserl. A criticism of science, the distinction and the antithesis between science of nature and science of life, between life and forms and so on, are themes circulating everywhere symmetrically" ([21], p. 354). As for the criticism of the formalism of logic made by Poincare\ see [2], pp. 28 f.
beginning, for example, from the historical ones. But, Gramsci concludes, "it is the concept itself of "science", [...] which requires to be critically destroyed. It is taken root and branch from the natural sciences, as if it were the only science or the science par excellence, as decreed by positivism" . In other words, even though Gramsci admits that positivistic exalting of science is the outcome of a bourgeois ideology, he does not extend his refusal to science as such; on the contrary, he realizes the importance of scientific method in the distinction between facts and ideological elements and desires a more authentic knowledge of scientific data and methods. Such a knowledge can be pursued by means of a wider appropriation of essential scientific notions, popularized by means of "scientists and serious students, and no longer by allknowing journalists and the self-opinionated self-taught of this world" ([24], p. 1459 (Q 11, § 39, 53 bis); [26], p. 295). Therefore, despite the fact that Gramsci is never completely free from his idealistic development, he still fully understands and exalts the deep value of science, not only for the general human knowledge, but also as a mean of emancipation of Italian culture8. Science continues to be a superstructure, for Gramsci, but it distinguishes itself from the others because it contains in itself the method for a critical distinction between ideological and factual terms. 3. Gramsci and microphysics To fully understand Gramsci's reflection on microphysics, we have first of all to consider the means by which he learned the novelty introduced by new physics — taking into account the years from 1925 to 1927 as those in which the main contributions on the formal structure of quantum mechanics were published11 — and to evaluate which books he had read during his imprisonment. 
Gramsci was arrested on 8 November 1926, as a result of "exceptional measures" adopted by the fascist dictatorship, and submitted to a regime of isolation; during February of 1927 he obtained a permit to receive books and
' [24], p. 1404 ( g 11, § 15, 26); [23], p. 438. The controversy with the deterministic view of economic-social structures is a topic which has been present in Gramsci's philosophy since his early works ([21], pp. 303 ft). 8 In Gramsci's opinion science not only is the bearer of theoretical values, but also of historicpolitical ones: for an analysis of the revolutionary function of culture in Gramsci, see [21], pp. 297 f.; for an examination of the crisis of the culture of the Italian middle classes, [32], pp. 235-244. We take into account only the contributions on quantum formalism and not all of Bohr's previous works on the structure of the atom, or that of Planck, going back to the beginning of the century. In spite of the publication of the results in international reviews, in Italy Enrico Fermi was the first who originally contributed to the study of the atomic structure, and only since 1923 [18].
reviews, and only later he was allowed to write inside his cell'. He began a series of irregular readings, from what he was receiving from friends and relatives; he also fulfilled some translations from German to Italian as a relaxing exercise. However reading, which at the beginning had seemed to him an antidote and a valid self-defence against the brutishness determined by prison and isolation, ended up very soon by appearing to him an empty intellectual exercise, because it had no definite aimJ; so Gramsci began the draft of Quaderni del carcere, to which he devoted himself from February 1929 to 1935, when he was forced to suspend the work due to the bad state of his health. The plan of work was at first divided into four parts, dedicated to the following subjects: a research on the history of Italian intellectuals, a study of comparative linguistics, an analysis of Pirandello's theatre and an essay on serial stories [22]. This project was successively modified many times and was not always kept to by Gramsci, because his health worsened, or because of the modification of his requirements and interests, and finally because of the impossibility to find the books he needed for his research. So Gramsci's reflection on science was placed inside a wider critical reconstruction of the Italian culture of his time. Gramsci's clear awareness to be at the beginning of a new age did not escape his attention, in spite of his imprisonment. Paradoxically, the restrictions seemed to have intensified his sensibility towards wider themes and his acuteness of mind in catching suggestions from the scientific European debate\ although they were filtered from second-hand sources; in fact they were not examined closely through a systematic enquiry. For a more detailed analysis, we can ask what books by means of which Gramsci heard of new scientific theories were. 
In numbers 8, 9 and 11 of his Quaderni del carcere, he cites many times The nature of the physical world by Eddington [15] (in the French edition of 1929 [16]), and then the "booklet" by Giuseppe Antonio Borgese [6], defined as a "small book in which G. A. Borgese ' Gerratana specifies that Gramsci obtains a permit to read newspapers, starts a double subscription to the prison library and has the right to eight volumes a week; besides he receives books and reviews from the outside and may write two letters a week ([22], p. XV e LXI). ' In a letter of December, 11, 1926, sent to Piero Sraffa, Gramsci asks his friend to send him some books, to face the boredome generated by the reclusion ([25], 1, p. 44). So he decides to extend his research to an analysis of Italian intellectuals, from an impartial point of view (fiir ewig). k In contrast with the presumed Gramscian provincialism, Garin objects that Gramsci is really "a man who shares the drama of the post-war period and of the Russian revolution, curious of any kind of books and teachings, who lived between Moscow and Vienna during the crucial years, and not in peripheral backgrounds, but in contact with the main characters of the world-history" ([21], p. 344).
speaks about the new trends of scientific opinion (Eddington) and announces that they dealt a blow to historical materialism" ([24], p. 985 (Q 8, § 77, 25 bis, March 1932)).^l Borgese was a prominent figure on the Italian intellectual scene at the beginning of the twentieth century;^m his small volume, Escursione in terre nuove [6] (published in 1500 copies in 1931), was a report on a journey to Oxford, made on the occasion of the 7th International Congress of Philosophy, during the month of September 1930; it therefore reads as a medley of scientific and philosophical arguments, on the one hand, and of descriptions of landscapes and personal impressions on the other. Borgese makes many relevant comments on European and Italian philosophical culture, with a disenchanted view, but able to target some elements of novelty, above all in science. On this matter, it is worth noting that Gramsci's scientific knowledge is the outcome of the report of a literary man, with whom he shares the Crocian development, however completely independent of the provinciality of a large part of contemporary Italian culture. Borgese recognizes "the most peculiar sign of the modern era" in the "aspiration after discontinuity", and cannot fail to observe how contemporary philosophical culture seems a sort of Scholasticism, which perpetuates the past, completely unaware of the novelties occurring in other spheres of knowledge:
For three or four centuries people have spoken without respite of revolutions: there is neither form of thought which does not aspire to subvert the past, or almost any man who does not hold in him the myth of a conversion; indeed, this aspiration after discontinuity is the most peculiar sign of the modern era. However, a lot of talk about revolutions does not exclude that there have been and there are revolutions.
1
The volume [16] is part of the books deposited in "Gramsci's Fund" with prison-marks of the gaol of Turi (dating back to the period November 1930-March 1933). In a letter of August, 31, 1931, in [25], 2, p. 62, Gramsci asks his sister-in-law, Tatiana Schucht, to send him "a book on physics by a well-known English writer, [...] I think it was Eddington", and another by Sir James Jeans, The universe around us [27], published in Italian translation in 1931 [28]. So Gramsci comments: "Jeans is a pure physicist; Eddington, however, accepts idealism in science". For an exact dating of the Notebook 8 — in which the physicists are quoted — we refer to [24], pp. 2365 f, as well as to [20], in which Francioni suggests an interesting reconstruction of the Gramscian working method and a more precise dating from that realized by Gerratana; in particular, the Notebook 8 was written between November 1931 and May 1932; inside, we recommend the paragraphs: 170 (Scientific ideologies) and 176 (The new science), written in November 1931, and finally the paragraph 177 (The "objective" reality), dateable between November and December 1931 (pp. 140-146). Both Eddington and Borgese are mentioned again in Notebook 11, § 36, written between August and the end of 1932.
m
G. A. Borgese (1882-1952) was literary critic, writer, contributor to newspapers, university teacher of German literature. Because he refused to take the oath required from the fascist regime, he was forced to move to the U.S.A., where he lived from 1931 to 1949. He wrote many travel books, in which naturalistic descriptions are interlaced with the historical-political dimension.
What happened in physics and in natural sciences in the last few years is certainly one of the most important events of the century; and it will not pass without profound consequences on any moral science, philosophy and religion. At Oxford I thought it was not sufficiently spoken about, or, as usual, without sufficient discernment; on the contrary so many reports or debates, in different philosophical fields, seemed to me faint, tired and scholastic things, without any interest for the soul. In the news I gave, I tried to point out only what has real value, what makes us imagine a tomorrow. This revolution or turning point is very recent and its fame did not become common even in the nearby fields. One of the most renowned philosophers still used to say at Oxford, in September 1930, that "tout dans la nature est déterminé: il n'y a pas d'effets sans cause". And shortly before, a philosopher of ours, one of the most recent and subtlest, had written that everything in the physical world happens according to an inexorable causal chain. Many people still talk about earth or sky as if nothing important had happened after Galileo and Newton, after the Angel had spread such a large wing ([6], pp. 10 ff.).
Borgese has the impression of failing the philosophers' pride, of the diffusion of a possibilism by which the construction of systems is improbable, of the disappearance of great philosophical personalities who embody the spirit of the ages: "In this world of problematic and descriptive philosophy, in which nature is physics and soul is history, it becomes more and more improbable to face a violent formula which opens the mind, in a heroic thinker who embodies it" ([6], p. 26). Nevertheless, Borgese remarks how at the philosophical congress of Oxford nothing seemed to him so interesting as the question posed in the first plenary session: "Is the recent progress in physics metaphysically important?" ([6],pp.35ff). Borgese expressly quotes Eddington, understanding the aspects which have revolutionized physics, that is the questioning of the spatio-temporal concepts by Einstein and Minkowski and the representation of matter offered by Rutherford: "Between 1905 and 1908 Einstein and Minkowski introduced fundamental changes in our ideas of time and space. In 1911 Rutherford brought about the greatest change in the idea of matter from Democrito's time". However, Eddington continues, while Einstein's ideas seemed immediately revolutionary, those of Rutherford did not make a stir [...]. They were instead ahead of the time of the great subversion. Rutherford, discovering the vacuum inside the atom, pulverizes — if there were a sufficient word for such radical demolition — the solidity of nature and shakes up the traditional assumption that things are more or less as they appear to the senses. The atom is as discontinuous as the solar system is. "If we eliminated all the unfilled space in a man's body and collected all his protons and electrons into one mass, the man would be reduced to speck just visible with a magnifying glass" ([15], pp. 1 f.). [...] 
Here, men in the street, theosophers, spiritualists, emanators of ectoplasm, yogis, fakirs and even conjurers [...], all a multicoloured group surrounds the new physics and dares to ask: but are we really sure that, all things considered, that little fragment exists?
Should one not imagine that a further analysis, a more penetrating inquiry, will dissolve even this last leftover of definite existence into empty space? "Matter leaves universe" ([6], pp. 40 f.).
About the consequences of the new physics, Borgese concludes: Men will have to find new words, that is new sentiments, for the new things that they now have in mind. [...] As for microphysical phenomena, the infinitely small ones to which nowadays many people so passionately pay attention, someone could say that "they cannot be considered to exist independently of the subject that observes them". So all reality passes to thought. Jorgen Jorgensen, a participant at the Congress of Oxford, could say that "once all the ideas existing until now on the physical world have been dashed, we have only a magnificently well working symbolism left, but the exact meaning of which, if we suppose that it has one, nobody has yet revealed" ([6], pp. 47 f.).
Borgese then expressly recalls Eddington's discussion of Heisenberg's principle of indeterminacy and the rejection of causality introduced by it, emphasizing the year 1927 as the beginning of the new era of physics and of the downfall of the positive and mechanical sciences ([6], pp. 55 ff.). Gramsci takes up again Borgese's references to Eddington, commenting on them in the light of his own philosophy; as an example, Gramsci points out how in the new image of matter, in which the space between protons and electrons is much wider than we would expect (a thing that made a deep impression on Borgese), "there is no meaning", because "there would be no change in ratios and relationships, things would stay just as they are" ([24], p. 1451 (Q 11, § 36, 49), p. 1043 (Q 8, § 170, 52 bis); [26], p. 286) (note n). For Gramsci, in these elucubrations "we are dealing with mere word-play, with science fiction, not with a scientific or philosophical thought. It is a way of posing the question that is fit only for creating fantasies in empty heads" ([24], p. 1451 (Q 11, § 36, 49 bis); [26], p. 286). In other words, while Borgese had interpreted Eddington and modern physics in a merely immaterialistic and subjectivistic way (note o), Gramsci regards the problems of modern physics as questions of language; so, on the one hand he plays down the difficulties in the representation of microscopic reality, but on the other he fully grasps the linguistic dimension, among others, of the matter. In fact, he does not hesitate to conclude (in 1932):
Note n. Eddington's sentence, to which Gramsci refers, is drawn from [16], p. 20: "L'atome est aussi poreux que le système solaire. Si dans le corps d'un homme nous éliminions tout l'espace dépourvu de matière et que nous réunissions ses protons et électrons en une seule masse, l'homme serait réduit à un corpuscule à peine visible à la loupe". The Italian translation in the Notebooks is made by Gramsci; Borgese quotes the original edition, with a slightly different translation ([6], p. 41).
Note o. An idealistic interpretation of physics, conceived under Eddington's guidance, can also be found in [14], p. 55, in which De Giuli states: "Idealism finds in science the best confirmation of itself".
In Eddington's physics and in many other manifestations of modern science the surprise of the ingenuous reader depends on the fact that the words used to indicate certain facts are modified to denote arbitrarily quite different facts. A body remains "massive" in the traditional sense even if the "new" physics demonstrates that it is comprised of one million parts of matter and 999,999 parts vacuum. A body is "porous" in the traditional sense and does not become so in the sense of the "new" physics even after Eddington's claim. [...] The glosses of the various Borgeses in the long run will serve only to reduce the subjectivistic conceptions that allow trivial playing around with words in this way to a state of ridicule ([24], pp. 1451 f. (Q 11, § 36, 49 bis); [26], pp. 286 f.).
The universe around us, by James Jeans [27] (which appeared in Italian translation in 1931 [28]), helps to suggest Gramsci's correct interpretation of Eddington's words; in it the author explains the experiments and the model of the atom proposed by Rutherford, emphasizing that the whole universe is filled by vacuum (note p). However, the immaterialistic conclusions are an outcome only of Borgese's personal interpretation, which is exposed by Gramsci. In other words, with an approach akin to a neopositivistic one, Gramsci fully realizes the need to prevent philosophical speculation from following the suggestions of wordplay, and to maintain precise terms and logical and conceptual rigour. But he notices above all that common language, moulded according to macroscopic reality, very soon shows itself to be inadequate to describe the microscopic one ([24], p. 1454 (Q 11, § 36, 50)). A similar linguistic mistake is made, in Gramsci's opinion, by Borgese, when he quotes the expression of Jorgensen, the participant at the Congress of Oxford, according to whom the infinitely small phenomena "cannot be considered independent of the subject that observes them"; on this matter Mario Camis also writes in turn that "these are words which give rise to quite a number of reflections and, from completely new standpoints, bring back into play the great problems of the subjective existence of the universe and the meaning of sensorial information in scientific thought" ([8], p. 131; quoted in [24], p. 1452, p. 1048; [26], p. 287). If we had to interpret these observations in a merely literal way and not metaphorically, Gramsci explains, physical phenomena would in fact not even have been observed, but "created"; they would be a product of the experimenter's subjective experience, like works of art
Note p. As Gerratana specifies, in [24], p. 2901 (note 6, § 36 of Q 11), Gramsci wants to read the work by Jeans after the recommendation made by Mirskij; the volume [28] is part of the "Gramsci's Fund", with prison-marks (Turi III). Experiments made by Rutherford are described in [28], pp. 116-119; [29], pp. 112-119, where Jeans states: "As we pass the whole structure of the universe under review, from the giant nebulae and the vast interstellar and internebular spaces down to the tiny structure of the atom, little but vacant space passes before our mental gaze. We live in a gossamer universe; pattern, plan and design are there in abundance, but solid substance is rare" (p. 114).
([24], p. 1454 (Q 11, § 36, 51)). Camis himself, instead, even though he was aligned with "a way of thinking about the 'new' physics" prevalent among British scientists in particular, implicitly ends up explaining that the expression quoted by Borgese should be understood in a merely metaphorical sense (ibid., p. 1452, p. 1456; [26], pp. 287 f.) (note q). Again Gramsci fully realizes the problem of subjectivism and of solipsism implied by a possible interpretation of quantum theory, but he tends to reduce their importance, ascribing these ill-fated results to a crude misunderstanding by Borgese and his sources: If it were true that the infinitely small phenomena in question cannot be considered as existing independently of the subject who observes them, they would in fact not even be "observed", but "created" and would fall into the same domain as the pure imaginative intuition of the individual. The question of whether the same individual can create (observe) the same fact "twice" would also have to be posed. One would not even be dealing with "solipsism" but with witchcraft, with demiurgic powers. It would not be these (non-existent) phenomena but rather these imaginative intuitions that would, like works of art, be the subject of science. [...] But if, on the other hand, despite all the practical difficulties inherent in different individual sensitivities, the phenomenon did repeat itself and could be objectively observed by various scientists independently of one another, what would the assertion quoted by Borgese mean except that a metaphor was being used to indicate the difficulties inherent in giving a description and an objective representation of observed phenomena?
It does not seem difficult to explain this difficulty: 1) with the lack of literary ability of scientists who, up to now, have been didactically trained to describe and represent only macroscopic phenomena; 2) with the insufficiency of common language, which has also been fashioned for macroscopic phenomena; 3) with the relatively slight development of these sub-microscopic sciences, which are awaiting a further development of their methods and criteria in order to be understood by many people through the channels of literary communication [...]; 4) one must always bear in mind that many sub-microscopic experiments are indirect, chain ones whose result "is seen" in the results and not in the act itself (as in the experiments of Rutherford). One is, in any case, dealing with the initial and transitory phase of a new scientific era, which, together with a great intellectual and moral crisis, has produced a new form of "sophistry" ([24], p. 1454 (Q 11, § 36, 51 and 51 bis); [26], pp. 289 f.).
In other terms, then, Gramsci realizes that solipsism and subjectivism are implicit in quantum theory, at least as it has been conceived and interpreted, even if he thinks that these results are due to the literal use and meaning of expressions which are instead to be understood in a merely metaphorical sense; equally aptly he underlines the need to develop concepts and forms of
Note q. Camis, having taken note of the difficulties imposed on experimentation by the "minuteness" of the methods of inquiry required by the physical and medical sciences, comments only: "We feel like asking what are the objective conclusions which in fact can be drawn from such subjective intuitions as are those here mentioned" ([8], pp. 131 f.).
language able to describe microscopic reality, in the same way as Bohr used to invite people to devise a new theory of knowledge. So Gramsci, far from yielding to the temptation of accepting immaterialism, finds other reasons able to guarantee objectivity to a scientific knowledge not founded on an indubitable observation of phenomena: intersubjectivity, or communication between scientists on the results of their independent experiments. This is the only criterion that can guarantee that observed phenomena do not emerge from individual speculation. Finally, he emphasizes another interesting aspect, which will be developed by later historiography. After his assertion that quantum theory is in an initial phase and is fated to be improved, the author draws attention to the subjectivistic and "irrationalist" results to which it seems to lead; he thinks that they are products of "the intellectual and moral crisis" of the society in which this scientific theory has been conceived (note r). Such a kind of sociological model in the approach to the history of science (Kuhn is a forerunner of it) has been used by Paul Forman to reconstruct the cultural climate that gave rise to quantum mechanics, with remarks similar to those already made in short by Gramsci (note s). Another very meaningful remark developed by Gramsci concerns the age-old controversy about the existence of an external world; on this matter the neopositivists, who in philosophy were following an approach similar to that of the empirical sciences, maintained that it was impossible to make assertions endowed with an empirical meaning on the question of realism. On this topic all of them were true to Carnap who, since 1928, in Der logische Aufbau der Welt, had considered both realism and idealism as pseudoquestions [9], [10]. A few years later, Gramsci independently draws similar conclusions, asking himself if science can give any "certainty" on the objective existence of an external world. His clearness of thought is revealed when he states:
Note r. The attention to the socio-historical dimension in the history of science, and the idea that the origin of the crisis of modern physics, like that of other individual fields of science, is to be found in the historical development of capitalist society, were themes that Gramsci and Bukharin shared; Bukharin's paper [7] is quoted and discussed in many passages of the Notebooks.
Note s. Forman considers the birth of quantum mechanics and of its subjectivistic and non-causal interpretation as an outcome of the decadence of a precise historical period, namely Germany between the two wars; during the Weimar Republic, in fact, different forms of irrationalism triumphed and a kind of neoromantic philosophy (Lebensphilosophie) spread which, in the name of life and of its individuality, and out of hostility to mechanistic theory, to the exact sciences and their technical applications, denied in the first place the validity of the causal law [19]. Gramsci anticipates many topics which would subsequently be developed by historians of science not directly influenced by his thought [4] [5].
One may maintain it is an error to ask of science as such the proof of the objectivity of reality, since this objectivity is a conception of the world, a philosophy and thus cannot be a scientific datum. [...] "Objective" means this and only this: that one asserts to be objective, to be objective reality, that reality which is ascertained by all, which is independent of any merely particular or group standpoint. But, basically, this too is a particular conception of the world, an ideology. [...] But if scientific truths themselves are not conclusive and unchangeable, then science too is a historical category, a movement in continual development. Only that science does not lay down any form of metaphysical "unknowable", but reduces what humanity does not know to an empirical "not knowledge" which does not exclude the possibility of its being known, but makes it conditional on the development of physical instrumental elements and on the development of the historical understanding of single scientists. If it is so, what is of interest to science is then not so much the objectivity of the real, but humanity forging its methods of research, continually correcting those of its material instruments [...], in other words culture, the conception of the world, the relationship between humanity and reality as mediated by technology. In science, too, to seek reality outside of humanity, understood in a religious or metaphysical sense, seems nothing other than paradoxical. Without humanity what would the reality of the universe mean? ([24], pp. 1455 ff. (Q 11, § 37, 51 bis, 52, 52 bis); [26], pp. 291 f.).
Gramsci's considerations conclude, then, not only with the assertion that science cannot establish the reality of an external world, but also with the reduction of scientific knowledge to an argument about man: "Science too is a superstructure, an ideology"; this is demonstrated by the fact that in some periods it did not have the priority role that we assign to it nowadays as a result of a real "infatuation" (note t). On this matter Gramsci also criticizes the approach followed by Bukharin, delegate of the Soviet Union at the Second International Congress of the History of Science and Technology (held in London in 1931), who is a source for him but also a critical target (note u). In fact, the Soviet delegate maintains that the subjectivistic conception prevailing in modern philosophy and science has a religious origin that can be ascribed to Bishop Berkeley and to his idea of esse est percipi. On the contrary, Gramsci criticizes Bukharin's conception, asserting that the whole question of external reality is misleadingly formulated (note v).
Note t. For the criticism of the "abstract and impersonal fetishism of science" and the "deification of the corresponding categories", see [7], p. 28, 20; [24], p. 1458 (Q 11, § 39); [26], p. 295.
Note u. According to what is specified by Gerratana, [24], p. 2765, Gramsci had received in prison, at the end of August 1931, the proceedings of the Congress of History of Science and Technology (London, 29 June-3 July 1931) [7] ([25], 2, p. 62).
Note v. In [7], pp. 11 f., Bukharin writes: "Nearly all the schools of philosophy, from theologising metaphysics to the Avenarian-Machist philosophy of "pure description" and renovated "pragmatism", with the exception of dialectical materialism (Marxism), start from the thesis, considered irrefutable, that "I" have been "given" only "my" own "sensations". This statement, the most brilliant exponent of which was Bishop Berkeley, is quite unnecessarily exalted into a new gospel of epistemology". Gramsci points out that religion cannot wander from the idea of an independent reality; in fact, in Italy positivism was absorbed by the religious culture to refute subjective idealism [24], p. 894 (Q 7, § 47, 73 bis); [3].
It is true that believing in the objective existence of an external world is by now a kind of metaphysical credo that positivistic science shares with common sense; but it remains right to address the issue, not as an object of scorn, but as an opportunity for a correct historicist interpretation: Man knows objectively in so far as knowledge is real for the human race historically unified in a single unitary cultural system. [...] There exists therefore a struggle for objectivity (to free oneself from partial and fallacious ideologies) and this struggle is the same as the struggle for the cultural unification of the human race. [...] Up to now experimental science has provided a basis on which a cultural unity of this kind has reached its furthest extension. This has been the element of knowledge that has contributed most to unifying the "spirit" and making it more universal. It is the most objectified and concretely universalized subjectivity ([24], pp. 1416 f. (Q 11, § 17, 31 bis); ibid., pp. 1075 ff., pp. 1455 ff., [23], pp. 445 f.).
However, positivism gave a misleading image of science, not only exalting its value excessively, but above all not realizing that it cannot be identified with a mere collection of empirical data; more appropriately, instead, it has to be defined as a historical category, which arises from the union of empirical data and hypotheses: Science never appears as a bare objective notion, it always appears in the trappings of an ideology; in concrete terms, science is the union of the objective fact with a hypothesis or system of hypotheses which go beyond the mere objective fact. It is true however that in this field it is relatively easy to distinguish the objective notion from the system of hypotheses by means of a process of abstraction that is inherent in scientific methodology itself ([24], p. 1458 (Q 11, § 38, 53); [26], p. 293).
If then it is true that science is a historical category and that scientific objectivity is historically "dependent on the theoretical activity" of man ([3], p. 401), this does not mean that scientific knowledge is completely determined by the subject, because the scientific method and the possibility of rectifying our knowledge, by submitting it to a continuous empirical check, safeguard it from an absolute levelling down to subjectivity. The positivistic debate on the role of hypotheses, on the relationship between theoretical and observational terms and on the so-called theory-ladenness would subsequently be examined more closely, but in agreement with many clear Gramscian remarks (note w). In conclusion, Gramsci distinguishes himself from the intellectual climate of his time and, even in isolation and having read few books on the new physics, he is able to free himself from a merely idealistic and subjectivistic interpretation of quantum mechanics, which would have found a fertile field in
Note w. Boothman blames Gramsci because he was not entirely aware of how theory-laden observations are [4]; but, in our opinion, Boothman's judgement is compromised by a kind of anachronism.
which to grow. He can do this, on the other hand, by grasping through intuition and great clearness of thought many meaningful aspects that will characterize the epistemological debate of the following decades.
References
1. E. Agazzi (ed.), La filosofia della scienza in Italia nel '900, Franco Angeli, Milano (1987).
2. E. Agazzi, "Fasi e forme della filosofia della scienza italiana nel '900", in [1], 15-41 (1987).
3. M. Aloisi, "Gramsci, la scienza e la natura come storia", Società, VI, 3, 385-410 (1950).
4. D. Boothman, "Gramsci, Croce e la scienza", in R. Giacomini, D. Losurdo and M. Martelli, Gramsci e l'Italia. Atti del convegno internazionale di Urbino, 24-25 gennaio 1992, La città del sole, Napoli, 165-186 (1994).
5. D. Boothman, "General Introduction", in [26], XIII-LXXXVII (1995).
6. G. A. Borgese, Escursione in terre nuove, Meschina, Milano (1931).
7. N. Bukharin, "Theory and practice from the standpoint of dialectical materialism", in Science at the Cross Roads, Kniga, London, 1-23 (1931); now in J. Needham (ed.), Science at the Cross Roads, Frank Cass, London, 9-33 (1971).
8. M. Camis, "Scienze biologiche e mediche: Gösta Ekehorn, On the principles of renal function, Stockholm, 1931", Nuova Antologia, 1 novembre, 128-133 (1931).
9. R. Carnap, Der logische Aufbau der Welt, Weltkreis, Berlin (1928).
10. R. Carnap, Scheinprobleme in der Philosophie, Weltkreis, Berlin (1928).
11. E. Cassirer, Determinismus und Indeterminismus in der modernen Physik, Göteborgs Högskolas Årsskrift 42 (1937).
12. B. Croce, Logica come scienza del concetto puro, Laterza, Bari (1909).
13. S. D'Agostino, "La relatività generale nel dibattito degli anni venti fra neokantiani ed empiristi logici. Annotazioni su recenti studi einsteiniani", Physis. Rivista internazionale di storia della scienza, XXXIV, 3, 643-658 (1998).
14. G. De Giuli, "Scienza e idealismo", Rivista di Filosofia, XXII, 1, 53-56 (1931).
15. A. S. Eddington, The Nature of the Physical World, Cambridge University Press, Cambridge (1928).
16. A. S. Eddington, La nature du monde physique, Payot, Paris (1929).
17. V. Fano, "How Italian Philosophy Reacted to the Advent of Quantum Mechanics in the Thirties", in G. Tarozzi and A. van der Merwe (eds.), The Nature of Quantum Paradoxes, Kluwer, Dordrecht, 385-401 (1988).
18. V. Fano, "La riflessione degli scienziati sulla meccanica quantistica in Italia fra le due guerre", in G. Cattaneo and A. Rossi, I fondamenti della meccanica quantistica. Analisi storica e problemi aperti, EditEl, Commenda di Rende, 105-118 (1991).
19. P. Forman, "Weimar Culture, Causality and Quantum Theory, 1918-1927: Adaptation by German Physicists and Mathematicians to a Hostile Intellectual Environment", Historical Studies in the Physical Sciences, 3, 1-115 (1971).
20. G. Francioni, L'officina gramsciana. Ipotesi sulla struttura dei Quaderni del carcere, Bibliopolis, Napoli (1984).
21. E. Garin, Intellettuali italiani del XX secolo, Editori Riuniti, Roma (1974).
22. V. Gerratana, "Prefazione", in [24], XI-XLII (1975).
23. A. Gramsci, Selections from The Prison Notebooks, Lawrence and Wishart, London (1971).
24. A. Gramsci, Quaderni del carcere, Einaudi, Torino (1975).
25. A. Gramsci, Letters from Prison, Columbia University Press, New York, vol. 1 and vol. 2 (1994).
26. A. Gramsci, Further Selections from The Prison Notebooks, Lawrence and Wishart, London (1995).
27. J. Jeans, The universe around us, Cambridge University Press, London-New York (1929).
28. J. Jeans, L'universo intorno a noi, Laterza, Bari (1931).
29. J. Jeans, The universe around us, Cambridge University Press, London-New York (1960).
30. A. Leonetti, Note su Gramsci, Argalia, Urbino (1970).
31. H. Reichenbach, Philosophical Foundations of Quantum Mechanics, University of California Press, Berkeley (1944).
32. C. Riechers, Antonio Gramsci. Marxismus in Italien, Europäische Verlagsanstalt, Frankfurt (1970).
33. P. Rossi, "Antonio Gramsci sulla scienza moderna", Critica marxista, 2, 14, 41-60 (1976).
34. G. Tarozzi, "Introduction: The Italian Debate on Quantum Paradoxes", in G. Tarozzi and A. van der Merwe (eds.), The Nature of Quantum Paradoxes, Kluwer, Dordrecht, 1-50 (1988).
35. C. F. Weizsäcker, Zum Weltbild der Physik, Hirzel Verlag, Stuttgart (1960).
THE ROLE OF LOGIC AND MATHEMATICS IN THE HEISENBERG FORMULATION OF QUANTUM MECHANICS
ANTONIO VENEZIA
Gruppo di Storia della fisica, Dipartimento di Scienze Fisiche, Università Federico II, Napoli
In this paper, by means of a logical and linguistic analysis of Heisenberg's work, the properties of a logical model suitable for quantum mechanics are obtained. This model is an alternative to traditional quantum logic, because it uses an intuitionist negation. It is able to justify the passage from the problem of conjugate variables measurement to the mathematical formalization (commutation rules) of matrix mechanics.
1. Two Quantum Mechanics Formulations
In 1925 Heisenberg [1] developed a first coherent "method to treat quantum theoretical data". Starting from the methodological premise that only the observable variables (for example atomic spectra) are to be considered, he solved some special problems, such as the one-dimensional harmonic oscillator (energy calculation and comparison with the Kramers-Born method) and the rotator (electron rotating around the nucleus, and comparison with the Goudsmit-Kronig-Hönl formulas). In the same year, Max Born and Pascual Jordan [2] proposed the first mathematical formulation of the Heisenberg method, showing how the observable variables could be represented by matrices of a non-commutative algebra. For this reason, the new mechanics was called matrix mechanics. In 1927 Heisenberg [3], in publishing his uncertainty relations, explained physically why it is impossible to make simultaneous measurements of conjugate variables, i.e. variables represented by non-commuting matrices (for example position and momentum of the electron). In 1926 Erwin Schrödinger [4] published four papers proposing an alternative theory to Heisenberg's mechanics, known as wave mechanics. He derived the whole theory starting from a differential equation considered as an axiomatic principle. His formulation was closer to the traditional mathematics of the continuum used in classical mechanics, and was the expression of the same ideal,
according to which it is sufficient to know the initial conditions to predict the evolution of a physical system without any limitation (the ideal opposite to that expressed by the uncertainty principle). To this end, Schrödinger introduced a complex wave function ψ, which started a long debate (note a) about its physical interpretation. At the end of the 1920s, there were several attempts to unify the two formulations, made by Schrödinger, Weyl, Dirac and von Neumann. From a historical standpoint, some authors, for example F. A. Muller [5], consider the supposedly demonstrated equivalence between matrix mechanics and wave mechanics a "myth". They pointed out that Schrödinger's first demonstration [6] was at least incomplete. Moreover, in 1975, Heisenberg [7], even while admitting the mathematical equivalence, considered the physical interpretations of the two formulations to be different. Actually, from both a mathematical and a logical standpoint, the subsequent reformulations maintained some differences already seen in the two early ones. From a mathematical point of view, for example, Dirac [8] unified the two theories by means of a more abstract formalism (Hilbert space), having in common with Schrödinger's theory the axiomatic organization and the choice of the mathematical continuum. Heisenberg, Born and Jordan's matrix mechanics was considered a particular application (the so-called Heisenberg representation) of a more general theory. On the contrary, Weyl [9], in reformulating the theory, started from the uncertainty principle and showed how the Schrödinger equation was a consequence of the Heisenberg commutation rules when a mathematics alternative to differential equations (i.e. group theory) was used. From a logical point of view too, it is possible to recognize two distinct paths linked to the two early formulations.
First, in 1936, Birkhoff and von Neumann [10] proposed a non-classical logic for quantum mechanics, characterizing it as a non-distributive logic. They described the quantum properties of a physical system in terms of lattices of projectors (which admit only the eigenvalues 1 and 0) and the logical operations of conjunction (∧), disjunction (∨) and negation (¬) in terms of, respectively, intersection, direct sum and orthocomplement among these lattices. In this way, they could conclude their paper by showing that the distributive law of conjunction over disjunction did not hold true, and they proposed a weaker form called the modular law. In the 1960s this approach was the object of several studies and it has remained the predominant one until now. But it was not the only one.
Note a. From Born's probabilistic interpretation of ψ (1926) to von Neumann's projection postulate (1932), essential to solve the problem of the collapse of ψ.
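The failure of distributivity mentioned above can be checked on a toy case. The following is only a minimal sketch, not the construction of Birkhoff and von Neumann's paper: it assumes the smallest non-trivial setting, the subspaces of the real plane, with conjunction read as intersection and disjunction as linear span. The names `line`, `meet` and `join` are hypothetical helpers introduced here for illustration.

```python
# Subspaces of the real plane: the zero subspace, lines through the origin,
# and the whole plane. A line is stored by an integer direction vector.
ZERO = ("zero",)
PLANE = ("plane",)


def line(x, y):
    """Line through the origin with direction (x, y)."""
    if x == 0 and y == 0:
        raise ValueError("a line needs a non-zero direction")
    return ("line", (x, y))


def same_line(a, b):
    """Two directions span the same line iff their cross product vanishes."""
    (x1, y1), (x2, y2) = a[1], b[1]
    return x1 * y2 - x2 * y1 == 0


def equal(a, b):
    """Equality of subspaces (lines are compared up to rescaling)."""
    if a[0] != b[0]:
        return False
    return a[0] != "line" or same_line(a, b)


def meet(a, b):
    """Intersection of two subspaces (the 'and' of the lattice)."""
    if a == ZERO or b == ZERO:
        return ZERO
    if a == PLANE:
        return b
    if b == PLANE:
        return a
    return a if same_line(a, b) else ZERO


def join(a, b):
    """Smallest subspace containing both (the 'or' of the lattice)."""
    if a == PLANE or b == PLANE:
        return PLANE
    if a == ZERO:
        return b
    if b == ZERO:
        return a
    return a if same_line(a, b) else PLANE


# Three distinct lines: the two axes and the diagonal.
a, b, c = line(1, 0), line(0, 1), line(1, 1)

lhs = meet(a, join(b, c))            # a and (b or c)
rhs = join(meet(a, b), meet(a, c))   # (a and b) or (a and c)
print(lhs, rhs, equal(lhs, rhs))
```

Here the left-hand side is the line `a` itself, because `b` and `c` together span the plane, while the right-hand side is the zero subspace, because `a` meets each of the other lines only at the origin. The two sides differ, so the distributive law fails in this lattice.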
In the 1970s a real alternative to this approach was built by authors such as A. Fine [11], J. Bell and M. Hallett [12], Y. Gauthier [13], J. V. Corbett and M. Adelman [14], according to which quantum logic is a non-classical logic not because the distributive law fails, but because the law of the excluded third doesn't hold true: an intuitionist characterization of quantum logic started. The main criticism against Birkhoff and von Neumann projectors algebra concerns the role of negation and the problem of defining the orthocomplement for infinite-dimensional Hilbert subspaces or for finite but open subspaces. In this last case, a Heyting algebra is required to generalize the concept of the projectors to operators (called effects) with eigenvalues included between 0 and 1. In the detailed analysis of these proposals, we showed that the formal problems of intuitionist quantum logic remain unsolved and we pointed out the lack of a clear reference to the theory's basis. In the next section, we'll show how it is possible, by means of a logicallinguistic analysis of Heisenberg's papers, to link the intuitionist approaches to matrix mechanics, as the Birkhoff and von Neumann approach is linked to Schrodinger's mechanics. 2. Premises to a New Approach to Quantum Logic Starting from the study [16] of the several approaches to quantum logic and from the A. Drago papers [17] on the bases of classical physics, it is possible to review the intuitionist program for quantum logic by means of the following assumptions. In a physical theory we can distinguish three parts: (a) experimental laws, (b) mathematical formalism (differential equations, symmetries etc.) (c) principles. Logic regards only point (c), i.e. the organization of theory's principles. 
Therefore, in contrast to other authors that try to characterize quantum connectives only by means of experimental examples, we'll consider the linguistic analysis of these principles sufficient to characterize primitive logical connectives and to define the syntax. Glyvenko [18] showed that this is enough to distinguish the classical logic, in which the double negation affirms, from the intuitionist logic without the law of the excluded third. In contrast to the traditional approaches to quantum logic that started from the mathematics of the Hilbert space to derive a logical calculus, here we'll show that the inverse path, in which logic precedes mathematics, is possible.
338 Finally, by agreeing with A. Drago [17], we'll consider the revision of some classical physical theories in terms of a non-classical logic, in which the double negation doesn't affirmb. 3. The Logic of the Heisenberg Formulation In order to reconstruct the birth of matrix mechanics, as it was later called, we are going to consider some Heisenberg papers or letters written between 1925 and 1928. In the 1925 paper, Heisenberg built his quantum theoretical kinematics and mechanics starting from a methodological principle about the operational meaning of physical quantities. He identified measurement as the main problem of the new theory and he looked for a method to solve this problem. His idea was to explain atomic emission spectra not in terms of orbits (as in the old quantum theory), but only by means of the experimental data, such as frequencies and intensity of light emitted or adsorbed by matter. Heisenberg writes ([1], p. 879) that the rules of the old quantum theory are not supported by physical evidence, unless we accept to found the theory on the hope that non-observable quantities could become observable in the future. I have underlined a double negation included in the author's reasoning. Heisenberg doesn't affirm, by collapsing this double negation in an affirmative statement, that "the rules of old quantum theory are supported by physical evidence on the basis of the hope ...". This affirmative statement links the evaluation of a physical theory to a future event, which Heisenberg cannot definitively prove or disprove. Neither can he say that the rules of the old quantum theory have physical evidence, because, for example, position and revolution time of the electron are not observable. Nor can he say the contrary, because the energy is observable and other quantities could be observable in the future. 
Thus, the double negation is essential to express this semantic ambiguity, and it is not reducible to the corresponding affirmative statement as in classical logic or in traditional quantum logic. At the end of his reasoning, Heisenberg can only suggest a method of research: to formulate a quantum mechanics with only observable quantities. After defining the field of application of the theory, Heisenberg gives it a formal representation. He clearly chooses discrete variables.
Footnote b: By considering non-classical logic also in classical physics, it is possible to avoid the problem of stating whether Quantum Logic is empirical or not, an intrinsic issue of the Birkhoff and von Neumann approach.
On November 23, 1926, in a letter to Pauli ([19], p. 357), Heisenberg writes that "It is not possible that the world is not discrete ... if space-time is discrete, the velocity in a point has no significance because in order to define the velocity in that point, we need a latter point infinitely near the former: this is impossible in a discrete world". Here the negation of continuous space-time is evident. Moreover, we again have a reasoning by double negation. Heisenberg starts from the statement "it is not possible that the world is not discrete". At this point of the argument he doesn't consider this statement equivalent to "the world is discrete", because he has no physical evidence of it. Then, he admits by hypothesis the possibility of space-time discreteness and obtains an operative and verifiable consequence (the impossibility of defining the velocity at a point). Only with this conclusion does the initial choice of discrete variables have meaning. It is then clear that the formal representation adopted by Heisenberg is an alternative to classical analysis (used by Schrödinger), because Heisenberg declares explicitly that it is impossible to take the limit of the incremental ratio that defines velocity. Already in the 1925 paper, by reformulating the classical theory of the radiating electron, Heisenberg argued that the concept of electron orbit has no physical meaning. In order to represent the emitted radiation in terms of a Fourier series expansion, he writes ([20], p. 125) that "it is always possible to find the quantum theoretical equivalent for a quantity x(t) ... while an essential difficulty arises when we consider two quantities x(t) and y(t) and we try to represent the product x(t)y(t); in the classical theory x(t)y(t) is always equal to y(t)x(t), but in quantum theory this is not true". The failure of the commutative rule was already known to mathematicians for some types of matrix algebras.
So it was natural to use this mathematical formalism to develop the new quantum theory. We are thus left to consider what Heisenberg called the quasi-equality (the inequalities were introduced by Weyl following Pauli's suggestion):

$$XP - PX = \frac{ih}{2\pi} \qquad (1)$$

where the matrices X and P represent respectively the position x and the momentum p of a particle. Let us note that, from a formal point of view, the role of the imaginary unit is essential to obtain this quasi-equality. However, even if Heisenberg now has a general rule for the complex matrices, he has still not solved his starting problem of the measurement of physical observables. In order to do that, he needs real numbers to compare with the measurement results. This last fundamental achievement is
obtained by introducing the concept of measurement uncertainty (in the statement of the uncertainty principle) (footnote c). Historically there are at least two distinct statements of this principle by Heisenberg. The first is in paper [3]:

"The more precisely the position is determined, the less precisely the momentum is known at the same time, and vice versa" (A)

The second statement is in paper [26]:

"It seems a general law that it is not possible to determine position and velocity simultaneously by absolute precision" (B)

These statements were considered by Heisenberg ([27], pp. 15-19) a "semi-quantitative" argument with the same mathematical content as Kennard's relations (footnote d). Thus, the statements (A) and (B) introduce the mathematical formulation of quantum mechanics and are suitable for a logical analysis. A detailed analysis of the proposition (A) in terms of intuitionist logic was given in a previous paper [28]. Here we want to discuss the content of the second statement (B). The proposition (B) expresses an impossibility by means of a double negation (footnote e). The first negation is clearly "not possible". The second is included in the word "absolute" = "not relative". In fact, according to Heisenberg's insistence on referring the physical theory only to operational quantities, we have to consider the relative precision instead of the ideal "absolute precision". We want to show that this double negation doesn't semantically coincide with the corresponding affirmative proposition, and thus that the law of the excluded third fails in its underlying logic. Let R be the predicate "measurable on the physical system S by relative precision". Then R(x) means that "position x is measurable on the physical system S by relative precision". The same formalism holds for the momentum p. By means of these definitions, (B) becomes:
Footnote c: Heisenberg and Bohr spoke of uncertainty relations. A. E. Ruark [22] first introduced the expression "uncertainty principle". However, Heisenberg [3] used the word principle several times in his 1927 paper, with the meaning of "methodological principle" and not "axiomatic principle" as Schrödinger intended his equation. On the other hand, Heisenberg didn't have an incontrovertible experimental proof to sustain his principle. For this experimental proof we had to wait for the work of Kaiser, Werner and George, 1983 [23], Uffink, 1985 [24], and Nairz, Arndt and Zeilinger, 2001 [25].

Footnote d: Kennard's relations (1927) translated (A) and (B) into the mathematical formula for the product ($\Delta x\,\Delta p$) of position and momentum uncertainties.

Footnote e: Even if Heisenberg didn't formally recognize this double negation, he fully expressed its meaning in a letter to Pauli (Feb 23, 1927; [21], pp. 376), by comparing quantum limitations to thermodynamical ones. In fact, thermodynamical limitations are formulated by the inequality for the thermal efficiency of machines and are expressed by the double negation of the impossibility of perpetual motion (see A. Drago [17]).
$$\neg\neg\,(R(x) \wedge R(p)) \qquad (2)$$

where the symbols "$\neg$" and "$\wedge$" denote respectively the negation and the conjunction (footnote f). The corresponding affirmative proposition of (2) is:

$$R(x) \wedge R(p) \qquad (3)$$

Statement (3) says that "it is true that we can measure x and p by relative precision". This statement, without further specifications, is neither true nor false. It is not true because when $\Delta x\,\Delta p < h/\pi$ the measurement is not possible; it is not false because when $\Delta x\,\Delta p > h/\pi$ the measurement is possible. The same consideration holds for the negation of (3); in fact, the statement

$$\neg\,(R(x) \wedge R(p)) \qquad (4)$$

means that it is not possible to measure x and p by relative precision; as for statement (3), without further specifications, it is neither true nor false. In conclusion, statements (3) and (4) are semantically indeterminate, while (2) is true. Thus, in order to express the uncertainty principle as Heisenberg formulated it, a logic without the law of the excluded third is needed.

4. The "Synthetic" Aspect of Heisenberg's Argument

Up to now the attempts made to organize quantum theory axiomatically starting from the uncertainty principle have failed. One of the last proposals, made by Bub [29] in 2000, showed the necessity of starting from more general principles to derive the whole theory. In an indirect way this conclusion highlights that Heisenberg, who instead started from the uncertainty principle, used an alternative to the axiomatic organization. In order to clarify this alternative, let us sum up the logical structure of Heisenberg's work as illustrated in the previous section:

1. the measurement problem of physical observables is formulated by means of a double negation in order to criticize the old quantum theory;
2. the formal representation of quantum observables as discrete variables and the complex matrix formalism follow from the observation of spectra;
3. the quasi-equality (expressing the non-commuting property of conjugate variables) is derived by means of the matrix rules;
4. by means of a double negation, a semi-quantitative statement (the uncertainty principle) solves the starting problem and introduces the mathematical formulas of the uncertainty relations.
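The logical point behind statements (2)-(4) of Sec. 3 can be illustrated with a small sketch. It uses the three-element Heyting chain (values 0, 1/2, 1), which is a standard toy model of intuitionist logic and is an assumption introduced here, not the semantics used in the text. In this model a statement with the intermediate value has a true double negation, while neither the statement itself nor the excluded third is valid. The model only mirrors the failure of double-negation elimination; it assigns the value "false" to the negation of an undetermined statement, so it does not reproduce the reading of statement (4) given above.

```python
from fractions import Fraction

# Truth values of the three-element Heyting chain (toy model, assumed here):
# 0 = false, 1/2 = undetermined, 1 = true.
FALSE, HALF, TRUE = Fraction(0), Fraction(1, 2), Fraction(1)
VALUES = (FALSE, HALF, TRUE)


def neg(a):
    """Intuitionist negation: true only when a is outright false."""
    return TRUE if a == FALSE else FALSE


def disj(a, b):
    """Disjunction is the maximum on a chain."""
    return max(a, b)


def implies(a, b):
    """Heyting implication on a chain: true when a <= b, otherwise b."""
    return TRUE if a <= b else b


# Tabulate A, not-not-A, (not-not-A -> A) and (A or not-A) for every value.
for a in VALUES:
    print(a, neg(neg(a)), implies(neg(neg(a)), a), disj(a, neg(a)))
```

For the undetermined value the double negation evaluates to true, whereas the implication from the double negation back to the statement and the excluded third both stay at the intermediate value.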
Footnote f: Let us note the crucial role of negation and conjunction in formalizing the principles of the two important scientific revolutions of 1900. The reformulation of simultaneity (both in relativity and in quantum mechanics) has required a rigorous redefinition of the logical conjunction.
It's possible to show [28] that this logical organization is a general feature that other authors have adopted both in classical physics (see Drago [17]) and in quantum physics (see T. F. Jordan [30]). In particular, this kind of organization was first proposed and formalized by L. Carnot [31] as an alternative to Newton's analytical and deductive method, and for this reason was called the "synthetic method". In his mechanics and analysis, L. Carnot started from an operationally founded problem expressed by a double negation. He defined a formal system by introducing an auxiliary variable in order to solve the starting problem. Once he obtained a general rule, he suppressed the auxiliary variable. Thus, he reconsidered the starting system and applied the new rule to the main problem. In Heisenberg's and T. F. Jordan's case, the auxiliary variable is the imaginary unit, necessary to formalize the calculus of complex matrices. This variable disappears by means of the square modulus in the uncertainty formula (defined as a standard deviation) in order to express the result of a measurement. Thus, L. Carnot's mechanics, rather than Newton's mechanics (linked to Schrödinger's formulation by the axiomatic organization and the mathematical continuum), is the real classical counterpart of Heisenberg's theory. L. Carnot's theory lay between geometry and mechanics, and introduced a mathematical technique less powerful but more operational than differential equations. This technique was founded upon the concept of geometrical motions (footnote g) ("virtual invertible displacements"), which can be considered the first symmetry group in classical physics. Following a similar path, Heisenberg formulated in 1926 [33] the first non-geometrical symmetry of modern physics, by applying the methods of group theory (permutations). Wigner [34] extended this technique to geometrical symmetries (rotations). This link between the theories of Heisenberg and L. Carnot suggests the possibility of reformulating the whole of matrix mechanics using the mathematical technique of symmetries; this is a path not yet well explored, except for the attempts of Wigner, Weyl and T. F. Jordan in the last century.

Conclusions

The historical analysis of Heisenberg's formulation of the uncertainty principle can be considered the starting point of a new approach to quantum logic. The logical-linguistic analysis of his papers can justify the modern intuitionist approach to quantum logic in terms of the theory's foundations. In order
Footnote g: C. Gillispie [32] pointed out that the concept of geometrical motion is included in the idea of the thermal cycle developed by S. Carnot.
to achieve these results, the present analysis has made some assumptions about the basis of quantum mechanics (the relationship among principles, mathematics and experiments) and about two kinds of organization of a physical theory. Suggestions were made for a complete reformulation of the theory.

References

1. W. Heisenberg, Zeit. für Phys. 33, pp. 879-893 (1925); It. trans. in [20].
2. M. Born, P. Jordan, Zeit. für Phys. 34, p. 858 (1925).
3. W. Heisenberg, Zeit. für Phys. 43, pp. 172-198 (1927).
4. E. Schrödinger, Ann. Phys. 79, 361-376; 79, 489-527; 80, 437-490; 81, 109-139 (1926).
5. F. A. Muller, Stud. Hist. Phil. Mod. Phys. 28, 35-61; 28, 219-247 (1997).
6. E. Schrödinger, Ann. Phys. 79, 734-756 (1926).
7. W. Heisenberg, Bemerkungen über die Entstehung der Unbestimmtheitsrelation (1975); It. trans. in [20], p. 105.
8. P. A. M. Dirac, The Principles of Quantum Mechanics, Oxford (1930).
9. H. Weyl, The Theory of Groups and Quantum Mechanics, Dover Pub. (1931).
10. G. Birkhoff, J. von Neumann, Ann. of Math. 37, 823-843 (1936).
11. A. Fine, "Some conceptual problems of Quantum Theory", in R. S. Colodny (ed.), Paradigms and Paradoxes, Pittsburgh, pp. 3-31 (1972).
12. J. Bell, M. Hallett, Philosophy of Science 49, 355-379 (1982).
13. Y. Gauthier, Int. Jour. of Theor. Phys. 22, n. 12, 1141-1152 (1983).
14. M. Adelman, J. V. Corbett, Appl. Categ. Struct. 3, n. 1, 79-104 (1995).
15. A. Venezia, La logica della Meccanica Quantistica: analisi storico-critica, Tesi di Laurea in Fisica, Università Federico II, A. A. 1999-2000, Napoli.
16. A. Venezia, "I diversi approcci alla Logica Quantistica", in E. Schettino (ed.), Atti XX Congr. Naz. St. Fis. e Astr., CUEN, pp. 423-450 (2001).
17. A. Drago, Le due opzioni, La Meridiana, Molfetta (1991).
18. V. I. Glyvenko, Acad. Roy. Belg. Bull. Sci. (5) 15, 183-188 (1929).
19. W. Pauli, Wissenschaftlicher Briefwechsel mit Bohr, Einstein, Heisenberg u.a. I-II-III, K. von Meyenn (ed.), Springer-Verlag, Berlin (1979).
20. W. Heisenberg, Lo sfondo filosofico della fisica moderna, a cura di G. Gembillo e E. Giannetto, Sellerio, Palermo (1998).
21. W. Pauli, Quantentheorie, in Handbuch der Physik, H. Geiger, K. Scheel (eds.), 23, Springer-Verlag, Berlin (1926).
22. A. E. Ruark, Bulletin of the American Physical Society 2, p. 16 (1927).
23. H. Kaiser, S. A. Werner and E. A. George, Phys. Rev. Lett. 50, 560 (1983).
24. J. Uffink, Physics Letters 108 A, 59-62 (1985).
25. O. Nairz, M. Arndt, A. Zeilinger, quant-ph/0105061 (2001).
26. W. Heisenberg, Forschungen und Fortschritte 3, 83 (1927).
27. W. Heisenberg, Physical Principles of Quantum Theory, Chicago (1930).
28. A. Drago, A. Venezia, in C. Mataix and A. Rivadulla (eds.), Quantum Physics and Reality, Ed. Complutense, pp. 249-266, Madrid (2002).
29. J. Bub, Stud. in Hist. and Phil. of Mod. Phys. 31B, 75-94 (2000).
30. T. F. Jordan, Quantum Mechanics in Simple Matrix Form, Wiley (1985).
31. L. Carnot, Réflexions sur la métaphysique du calcul infinitésimal (1813).
32. C. Gillispie, Lazare Carnot Savant, Princeton U. P. (1971).
33. W. Heisenberg, Zeit. für Phys. 38, 411-426 (1926); 41, 239-267 (1927).
34. E. Wigner, Zeit. für Phys. 40, 883-892 (1927).
SPACE-TIME AT THE PLANCK SCALE: THE QUANTUM COMPUTER VIEW

PAOLA A. ZIZZI
Dipartimento di Matematica Pura ed Applicata, Università di Padova, via Belloni 7, 35131 Padova, Italy
We assume that space-time at the Planck scale is discrete, quantised in Planck units and "qubitised" (each pixel of Planck area encodes one qubit), that is, quantum space-time can be viewed as a quantum computer. Within this model, one finds that quantum spacetime itself is entangled, and can quantum-evaluate Boolean functions which are the laws of Physics in their discrete and fundamental form.
1. Introduction

What is "space-time" at the Planck scale? Once we understand that, we will be able to formulate the theory of Quantum Gravity, the theory which should reconcile General Relativity and Quantum Mechanics. In fact, it is widely believed that at the Planck scale the quantum aspects of gravity become relevant. Moreover, it is generally assumed that at the Planck scale space-time is no longer a smooth manifold, but has a discrete structure. There are two main approaches to quantum gravity that assume quantum space-time to be discrete: Loop Quantum Gravity [1,2] (and spin foams [3]), and String (and M) Theory [4,5]. Other interesting approaches are non-commutative geometry [6], Causal Set Theory [7] and various discrete models of space-time at the Planck scale, like lattice versions of loop quantum gravity [8,9,10] and Cellular Networks [11,12]. In our particular approach to quantum gravity, we assume discreteness of space-time at the Planck scale, and we also include the issue of information (more precisely, quantum information [13,14,15,16]). In fact, as was suggested by Wheeler (the "It from bit" proposal) [17], information theory must play a relevant role in understanding the foundations of Quantum Mechanics. Wheeler's view is shared, in particular, by Zeilinger (who associates bits with elementary systems, i.e. two-level systems, and claims that the world appears quantised because information is quantised) [18].
As was first realized by Feynman, a quantum computer can be exponentially more powerful than a classical one in simulating a quantum system. This line of thought is what we call here the "Quantum Computer View" (QCV). We believe that the QCV is universal, and thus can be extended to the "description" of quantum space-time itself. Approaches similar to ours, still encompassing the QCV, are those of Lloyd [19] and Jaroszkiewicz [20]. Our approach is closely related to Loop Quantum Gravity and spin networks. Spin networks are relevant for quantum geometry. They were invented by Penrose [21] in order to approach a drastic change in the concept of space-time, going from that of a smooth manifold to that of a discrete, purely combinatorial structure. Then, spin networks were re-discovered by Rovelli and Smolin [22] in the context of Loop Quantum Gravity. Basically, spin networks are graphs embedded in 3-space, with edges labelled by spins and vertices labelled by intertwining operators. In loop quantum gravity, spin networks are eigenstates of the area and volume operators [23]. We interpret spin networks as qubits when their edges are labelled by the spin-1/2 representation of SU(2). In this context, we use the quantum version [24,25] of the Holographic Principle [26,27,28]. In our model, quantum space-time is discrete, quantised in Planck units, and each pixel of Planck area encodes a qubit. This is a quantum memory register. To process the quantum information stored in the memory, a network of quantum logic gates (which are unitary operators) is required. The network must be part of quantum space-time itself, as it describes its dynamical evolution. The quantum memory plus the quantum network form a quantum computer. In the QCV, some new features of quantum space-time emerge:

i) The dynamical evolution of quantum space-time is a reversible process, as it is described by a network of unitary operators.
ii) During a quantum computational process, quantum space-time can be in an entangled state, which leads to non-locality of space-time itself at the Planck scale (all pixels are in a non-separable state, and each pixel loses its own identity).
iii) As entanglement is a particular case of superposition, quantum space-time is in a superposed state, which is reminiscent of the Many-Worlds interpretation of Quantum Mechanics [29].
iv) Due to superposition and entanglement, quantum space-time can compute a Boolean function for all inputs simultaneously (massive quantum parallelism). We argue that the functions which are quantum-evaluated by quantum space-time are the laws of Physics in their most fundamental, discrete and abstract form.
v) By scratch space management, we find that at the Planck scale it is possible to compute composed recursive functions of maximal depth.

The paper is organized as follows. In Sec. 2, we discuss the new concept of event in quantum space-time, and its quantum information nature. In Sec. 3, we introduce the Quantum Computer View of quantum space-time at the fundamental level. In Sec. 4, we analyze the possibility of quantum space-time being in a superposed/entangled state. In Sec. 5, we investigate a possible quantum network chosen by Nature. In Sec. 6, we illustrate how space-time can quantum-evaluate Boolean functions at the Planck scale. In Sec. 7, we investigate a unitary evolution of quantum space-time. Sec. 8 is devoted to the conclusions.

2. Qubitisation of quantum space-time

The very concept of event should be revised in the context of quantum space-time. In fact, the definition of event as a point in a four-dimensional smooth manifold becomes meaningless once space-time is assumed to be discrete and quantised in Planck units. If the minimal length is assumed to be the Planck length, $l_P \approx 10^{-33}$ cm, and the minimal time interval is assumed to be the Planck time, $t_P \approx 10^{-43}$ sec, it follows that an event in quantum space-time is an extended object without structure. In the QCV, the quantum event encodes quantum information. The (classical) holographic principle [26,27,28] claims that it must be possible to describe all phenomena within the bulk of a region of space of volume V by a set of degrees of freedom which reside on the boundary, and that the number of these degrees of freedom should not be larger than one binary degree of freedom per Planck area. All this can be interpreted as follows: each unit of Planck area (a pixel) is associated with a classical bit of information.
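As a rough numerical illustration of the "one binary degree of freedom per Planck area" counting, the following sketch estimates how many pixels tile a spherical boundary. It assumes, for illustration only, that a pixel has area exactly $l_P^2$, that the order-of-magnitude value $l_P \approx 10^{-33}$ cm can be used, and that numerical factors of order one may be ignored; the function name `pixels_on_sphere` is hypothetical.

```python
import math

# Order-of-magnitude Planck length in cm (assumed value for this sketch).
PLANCK_LENGTH_CM = 1.0e-33


def pixels_on_sphere(radius_cm, planck_length_cm=PLANCK_LENGTH_CM):
    """Number of Planck-area pixels on a sphere of the given radius.

    Assumes one pixel has area l_P^2, so the count equals the upper bound
    on the number of classical bits under the counting described above.
    """
    area = 4.0 * math.pi * radius_cm ** 2
    return area / planck_length_cm ** 2


# Boundary of a ball of radius 1 cm.
n_bits = pixels_on_sphere(1.0)
print(f"{n_bits:.2e} pixels")
```

Under these assumptions even a centimetre-sized boundary carries a number of pixels of order $10^{67}$, which gives a feeling for the size of the registers discussed in the QCV.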
At the Planck scale, however, where quantum gravity takes place, we argue that the encoded information should be quantum, and the holographic principle should be replaced by its quantum version [24,25]. In the quantum version of the holographic principle, a pixel encodes one quantum bit (qubit) of information. (A qubit is a linear superposition of the logical states 0 and 1, namely $|Q\rangle = a|0\rangle + b|1\rangle$, where $a$ and $b$ are complex numbers called probability amplitudes, such that $|a|^2 + |b|^2 = 1$.) The necessity of the quantum version of the holographic principle follows directly from loop quantum gravity. In loop quantum gravity, non-perturbative techniques have led to a quantum theory of geometry in which operators corresponding to lengths, areas and volumes have discrete spectra. Of particular interest are the spin network states associated with graphs embedded in 3-space, with edges labelled by spins $j = 0, \tfrac{1}{2}, 1, \tfrac{3}{2}, \ldots$ and vertices labelled by intertwining operators. If a single edge punctures a 2-surface transversely, it contributes an area proportional to [23]:

$$l_P^2 \sqrt{j(j+1)}.$$

Let us consider the edges of spin networks in the spin-1/2 representation of SU(2): they are 2-level systems, and can be thought of as qubits. In mathematical terms, the group manifold of SU(2) can be parameterized by a 3-sphere with unit radius. In fact, the most general form of 2 x 2 unitary matrices of unit determinant is:

$$U = \begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix}, \qquad |a|^2 + |b|^2 = 1,$$

where $a$ and $b$ are complex numbers. For example, the action of the unitary SU(2) matrix

$$U = \frac{1}{\sqrt{2}}\,(1 + i\sigma_2),$$

where $\sigma_2$ is the Pauli matrix

$$\sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix},$$

on the edge states $|-\tfrac{1}{2}\rangle$ and $|+\tfrac{1}{2}\rangle$ respectively, gives the equally superposed states

$$\frac{1}{\sqrt{2}}\left(|+\tfrac{1}{2}\rangle \pm |-\tfrac{1}{2}\rangle\right).$$
When a surface is punctured by such a superposed state, a pixel of area is created, which encodes one qubit. The elementary pixel can then be viewed as the surface of a unit (in Planck units) sphere in three dimensions. The pixel is punctured (simultaneously) at the poles by an edge in the superposed state of spin down and spin up. Equivalently, a qubit corresponds to the surface of the 3-dimensional unit sphere, where the logic states 0 and 1 correspond to the poles. This is the so-called Bloch sphere. There is clearly an analogy between the spin networks approach to quantum gravity and our Quantum Computer View of quantum space-time.

3. Quantum space-time: is it a quantum computer?

Having assumed that space-time at the Planck scale encodes quantum information, the latter must be processed to give rise, as an output, to the universe as we know it. If so, quantum space-time is not just a quantum memory register of n qubits: it is the whole thing, a quantum memory register plus a network of quantum logic gates. In other words, space-time at the Planck scale must be in a quantum state such that it is able to evaluate those discrete functions which are the laws of Physics in their discrete and most fundamental form. We may interpret that quantum state as the state of a quantum computer which is computing Boolean functions. In doing so, however, we must assume that at the Planck scale space-time is in a superposed/entangled state. In fact, any efficient quantum algorithm relies on superposition and entanglement of qubits. In quantum computation, superposition and entanglement are very important, because they allow quantum parallelism: the possibility to compute exponentially many values of a function in polynomial time.

4. Is quantum space-time in a superposed/entangled state?

If the qubits encoded by pixels were superposed, the surface embedding a region of space would "exist" in many different states simultaneously.
This would be a quite weird wave-like aspect of quantum space-time itself. Superposition is one characteristic feature of quantum mechanics, but we should be aware of the fact
that, once applied to quantum space-time, it strips the latter of its usual attributes. We think that the idea of a superposed state of qubits associated with pixels fits quite well with the Many-Worlds interpretation of Quantum Mechanics, obviously restricted to the micro-domain of space-time itself, more precisely at the fundamental level. Do the pixels of Planck area encode qubits which are entangled with each other or not? If so, space-time itself would be deprived of locality at the Planck scale. In other words, two quantum events might be described by a single quantum state, each event losing its own identity. This would be a quite weird feature of quantum space-time, but it cannot be discarded a priori, because entanglement is a very peculiar feature of the quantum world. Let us consider a finite number N of pixels $p_i$ ($i = 1, 2, \ldots, N$), each one encoding one qubit $|Q\rangle_i$ (notice that the number of pixels of area of a certain surface S is equal to the number of punctures made by spin network edges in the 1/2-representation of SU(2) onto S). The N qubits span a Hilbert space of dimension $2^N$. The standard basis for one qubit is $|0\rangle$, $|1\rangle$. The dual basis for one qubit is:

$$\frac{1}{\sqrt{2}}\,(|0\rangle \pm |1\rangle).$$

The most general one-qubit state is $|Q\rangle = a|0\rangle + b|1\rangle$, where $a$ and $b$ are complex numbers such that $|a|^2 + |b|^2 = 1$. A 2-qubit state can be either non-entangled (a product state of two qubits) or entangled (a non-separable state). The non-entangled basis for two qubits is $|00\rangle, |01\rangle, |10\rangle, |11\rangle$. An example of a non-entangled two-qubit state is the product of one dual basis vector and the qubit $|0\rangle$:

$$\frac{1}{\sqrt{2}}\,(|0\rangle + |1\rangle)\,|0\rangle = \frac{1}{\sqrt{2}}\,(|00\rangle + |10\rangle).$$

The entangled basis for two qubits (Bell states, maximally entangled) is:

$$|\Psi^{\pm}\rangle = \frac{1}{\sqrt{2}}\,(|10\rangle \pm |01\rangle), \qquad |\Phi^{\pm}\rangle = \frac{1}{\sqrt{2}}\,(|11\rangle \pm |00\rangle).$$
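The distinction between product and entangled two-qubit states can be checked numerically. The sketch below relies on a standard criterion that is not stated in the text and is introduced here as an assumption: a pure state with amplitudes $c_{00}, c_{01}, c_{10}, c_{11}$ on the basis $|00\rangle, |01\rangle, |10\rangle, |11\rangle$ factorizes into two one-qubit states if and only if $c_{00}c_{11} - c_{01}c_{10} = 0$. The function name `is_product_state` is illustrative.

```python
import math


def is_product_state(amplitudes, tol=1e-12):
    """Return True if a pure two-qubit state is a product state.

    amplitudes = (c00, c01, c10, c11) on the basis |00>, |01>, |10>, |11>.
    The state factorizes iff the 2x2 coefficient matrix has zero determinant.
    """
    c00, c01, c10, c11 = amplitudes
    return abs(c00 * c11 - c01 * c10) < tol


s = 1 / math.sqrt(2)

# (|0> + |1>)|0> / sqrt(2) = (|00> + |10>) / sqrt(2): the product example.
product_example = (s, 0.0, s, 0.0)
# Bell state (|11> + |00>) / sqrt(2).
bell_phi_plus = (s, 0.0, 0.0, s)
# Bell state (|10> - |01>) / sqrt(2).
bell_psi_minus = (0.0, -s, s, 0.0)

for name, state in [("product", product_example),
                    ("Phi+", bell_phi_plus),
                    ("Psi-", bell_psi_minus)]:
    print(name, "is a product state:", is_product_state(state))
```

The product of a dual basis vector with $|0\rangle$ passes the test, while the Bell states do not, in line with their description as non-separable.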
5. The quantum network of Nature

Let us suppose that all the N qubits encoded by N pixels are initially in the state $|00\ldots0\rangle$. They form a quantum register of size N, but that is just storage of quantum information. To be able to perform quantum computation, the qubits of the memory must be manipulated by some unitary transformations performed by quantum logic gates (the number of the gates is called the size of the network). Now, to put a qubit into a superposition of the two basis states, one needs the Hadamard gate:

$$H = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix},$$

and to entangle two qubits, one needs the controlled-NOT (or XOR) gate:

$$\mathrm{XOR} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}.$$

In the case of n qubits, we need the Walsh-Hadamard transformation $H_n = H^{\otimes n}$. Let us see how it works in the case of two qubits. Let us write the standard basis in vector notation:

$$|0\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |1\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$

The action of the Hadamard gate on the ket $|0\rangle$ is:

$$H|0\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix} = \frac{1}{\sqrt{2}}\,(|0\rangle + |1\rangle),$$

and on the ket $|1\rangle$ is:

$$H|1\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix} = \frac{1}{\sqrt{2}}\,(|0\rangle - |1\rangle).$$

Consider a quantum register of size two in the state $|00\rangle$. The action of the Hadamard gate H on the first qubit gives the superposed state:

$$\frac{1}{\sqrt{2}}\,(|0\rangle + |1\rangle)\,|0\rangle.$$

If we take the superposed state as the control qubit (c), and the second qubit of the memory as the target qubit (t), the action of the XOR gate is:

$$\mathrm{XOR}:\ \frac{1}{\sqrt{2}}\,(|0\rangle + |1\rangle)_{(c)}\,|0\rangle_{(t)} \to \frac{1}{\sqrt{2}}\,(|00\rangle + |11\rangle),$$

which is an entangled state of two qubits.

A quantum memory register of size n is a collection of n qubits. Information is stored in the quantum register in binary form. The state of n qubits is a unit vector in the $2^n$-dimensional complex Hilbert space $C^2 \otimes C^2 \otimes \ldots \otimes C^2$ (n times). As a natural basis, we take the computational basis, consisting of $2^n$ vectors, which correspond to the $2^n$ classical strings of length n:

$$|0\rangle \otimes |0\rangle \otimes \ldots \otimes |0\rangle \equiv |00\ldots0\rangle$$
$$|0\rangle \otimes |0\rangle \otimes \ldots \otimes |1\rangle \equiv |00\ldots1\rangle$$
$$\vdots$$
$$|1\rangle \otimes |1\rangle \otimes \ldots \otimes |1\rangle \equiv |11\ldots1\rangle.$$

In general, we will denote one basis vector of the state of n qubits as:

$$|x_1\rangle \otimes |x_2\rangle \otimes \ldots \otimes |x_n\rangle = |x_1 x_2 \ldots x_n\rangle = |x\rangle,$$

where $x_1, x_2, \ldots, x_n$ is the binary representation of the integer x, a number between 0 and $2^n - 1$. The general state is a complex unit vector in the Hilbert space, which is a linear superposition of the basis states:

$$\sum_i c_i\,|i\rangle,$$

where the $c_i$ are the complex amplitudes of the basis states $|i\rangle$, with the condition $\sum_i |c_i|^2 = 1$.

To perform computation with n qubits, we have to use quantum logic gates. A quantum logic gate on n qubits is a $2^n \times 2^n$ unitary matrix U. Initially, all the qubits of a quantum register are set to $|0\rangle$. By the action of the Walsh-Hadamard transform, the n input qubits are set into an equal superposition:

$$\frac{1}{\sqrt{2^n}} \sum_{x=0}^{2^n-1} |x\rangle.$$

At this point the actual computation can start.
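The gate actions described in this section can be reproduced with a few lines of linear algebra. The following is a minimal sketch in plain Python (lists as vectors and matrices, no external libraries); the helper names `kron`, `apply` and `walsh_hadamard` are illustrative and not taken from the text. It applies the Hadamard gate to the first qubit of $|00\rangle$, then the XOR gate, and finally builds the equal superposition of n = 3 qubits.

```python
import math

S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]                       # Hadamard gate
I2 = [[1, 0], [0, 1]]                       # one-qubit identity
XOR = [[1, 0, 0, 0],                        # controlled-NOT, first qubit
       [0, 1, 0, 0],                        # is the control
       [0, 0, 0, 1],
       [0, 0, 1, 0]]


def kron(a, b):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [[x * y for x in ra for y in rb] for ra in a for rb in b]


def apply(m, v):
    """Matrix-vector product."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]


def walsh_hadamard(n):
    """The n-qubit Walsh-Hadamard transform, H tensored with itself n times."""
    m = [[1]]
    for _ in range(n):
        m = kron(m, H)
    return m


# Amplitudes are ordered as |00>, |01>, |10>, |11>.
ket00 = [1, 0, 0, 0]
superposed = apply(kron(H, I2), ket00)      # (|00> + |10>) / sqrt(2)
bell = apply(XOR, superposed)               # (|00> + |11>) / sqrt(2)

n = 3
uniform = apply(walsh_hadamard(n), [1] + [0] * (2 ** n - 1))

print("after H on first qubit:", superposed)
print("after XOR:", bell)
print("equal superposition of", n, "qubits:", uniform)
```

Every amplitude of the final vector equals $1/\sqrt{2^n}$, which is the equal superposition from which the computation starts.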
6. Quantum function evaluation at the Planck scale

The quantum computation of Boolean functions f is implemented by unitary operators $U_f$. In the case of bijective functions $f: \{0,1\}^n \to \{0,1\}^n$, which are reversible, there always exists a unitary operator $U_f$ such that:

$$U_f: |x\rangle \to |f(x)\rangle,$$

where $|x\rangle$ stands (for brevity) for the input register, namely

$$\frac{1}{\sqrt{2^n}} \sum_{x=0}^{2^n-1} |x\rangle.$$

The quantum computation of non-bijective functions $f: \{0,1\}^n \to \{0,1\}^m$ (which are non-reversible) requires (at least) two registers, in order to guarantee the unitarity of $U_f$ (reversibility of the computation): a register of size n to keep a copy of the arguments of f, and a second register of size m to store the values of f:

$$U_f\,|x\rangle|y\rangle = |x\rangle|y \oplus f(x)\rangle,$$

where $\oplus$ stands for addition mod $2^m$. Notice that in general (for non-trivial functions) the states $|x\rangle|y \oplus f(x)\rangle$ are entangled. Moreover, the quantum computation of f on a superposition of different inputs produces f(x) for all x in a single run (quantum parallelism):

$$U_f: \sum_x |x\rangle|0\rangle \to \sum_x |x\rangle|f(x)\rangle.$$
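The two-register construction can be made concrete with a small sketch. On basis states, $U_f$ acts as a relabelling $(x, y) \to (x, y \oplus f(x))$, and it is unitary exactly when this relabelling is a permutation of the basis. The code below checks this for a hypothetical non-bijective example, $f(x) = x \bmod 2$ with n = 2 and m = 1, which is chosen here for illustration and does not appear in the text; following the text, $\oplus$ is implemented as addition mod $2^m$ (bitwise XOR is the other common convention).

```python
def make_uf(f, n, m):
    """Action of U_f on basis labels: (x, y) -> (x, (y + f(x)) mod 2^m)."""
    return {(x, y): (x, (y + f(x)) % 2 ** m)
            for x in range(2 ** n)
            for y in range(2 ** m)}


def f(x):
    """Hypothetical non-bijective Boolean function used for illustration."""
    return x % 2


n, m = 2, 1
uf = make_uf(f, n, m)

# U_f is a permutation of the basis states, hence a unitary operator.
is_permutation = sorted(uf.values()) == sorted(uf.keys())

# Acting on the equal superposition sum_x |x>|0>, every branch |x>|0>
# becomes |x>|f(x)>: all values of f appear in a single application.
branches = [uf[(x, 0)] for x in range(2 ** n)]

print("U_f permutes the basis:", is_permutation)
print("branches after U_f:", branches)
```

Although f itself is not invertible, keeping the argument in the first register makes the overall map reversible, which is the point of the construction.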
But we cannot get all values of f(x) from the entangled state $\sum_x |x\rangle|f(x)\rangle$, as any measurement on the first register will yield one particular value x', and the second register will then be found with the value f(x'). It is possible, however, to compute some global properties of f(x) in a single run.

As we already said, both superposition and entanglement are necessary for quantum computation. But it is not obvious that quantum information stored in quantum space-time is exploited to perform quantum computation. It depends on which kind of quantum network (if any) Nature has chosen. The question is: what should be computed by quantum space-time? The answer is: the global properties of Boolean functions, as in a quantum computer. In our case, we argue that the output of quantum computation would be the global structure of the Laws of Physics.

Some extra registers (called scratch space) are also needed to store intermediate results. In longer calculations (for example in computing composite functions) this leads to a large amount of "garbage" (or "junk") qubits, which are not relevant to the final result. In order not to waste space, these "junk" qubits must be reset to $|0\rangle$, and the scratch space can then be "recycled" for further computations. Scratch space management was proposed by Bennett [30,31]. Let us suppose we have to calculate a composite function of depth d. Without scratch space management, the computation would need d operations, and would consume d-1 junk registers. With scratch space management, the computation will need 2d-1 operations, and d-1 scratch registers. For example, the computation of a composite function of depth d=2, f(x)=h(g(x)), would need 3 operations and one scratch register, which can be reused in further computation:
$$|x,0,0\rangle \xrightarrow{U_{g12}} |x,g(x),0\rangle \xrightarrow{U_{h23}} |x,g(x),h(g(x))\rangle \xrightarrow{U_{g12}^{-1}} |x,0,f(x)\rangle,$$
where $U_g$, $U_h$ are the unitary operators implementing the quantum computations of functions g and h respectively, and the suffix numbers refer to the registers operated on. The last step of the computation is just the inversion of the first step and un-computes the intermediate result. The second register can then be reused for further computations. As we have seen, the number of required scratch registers increases linearly with the depth of the composite function which has to be quantum computed. This fact will be very useful to our purpose.

We can imagine the boundary surface S enclosing a volume V of space as a collection of N pixels of Planck area, each encoding a qubit. Thus S is a quantum memory register of N qubits. If all N qubits are initially set to $|0\rangle$, as always before any computation, the original register can be thought of as the product of several registers, $|0\rangle_x |0\rangle_y |0\rangle_z \ldots |0\rangle_w$, where registers x, y, z, ..., w have sizes n, m, k, ..., r respectively, such that n + m + k + ... + r = N. The initial quantum state $|\Psi\rangle \in C^{2^N}$ of S is then:
$$|\Psi\rangle = |0\rangle_x |0\rangle_y |0\rangle_z \ldots |0\rangle_w.$$
Suppose that register $|0\rangle_x$ has the smallest size, for example n=2. This size is very close to the Planck scale, as for n=2 it is $\sim l_P$. The register $|0\rangle_x$ can be set to an equal superposition of basis states by the action of the Walsh-Hadamard transform $H^{\otimes n}$, which acts locally on it.
Now the quantum state of S is:
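$$|\Psi\rangle = |\bar{x}\rangle_x |0\rangle_y |0\rangle_z \ldots |0\rangle_w,$$
where $|\bar{x}\rangle$ stands for
$$\frac{1}{2}\sum_{x=0}^{3}|x\rangle.$$
In our case, the quantum computation of a function $f: \{0,1\}^n \rightarrow \{0,1\}^{N-n}$ can be implemented by a unitary operator $U_f$ such that: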
$$U_f:\ \sum_x |x\rangle|0\rangle_y \rightarrow \sum_x |x, f(x)\rangle$$
only if the second register y has the right size to accommodate f, i.e., m = N-n, and there are no other registers available. However, if the computation of f produces n' junk bits which fill a scratch register of size n', a second register of size n' has to be provided. The best way to solve this problem is to take a smaller first register x to enable scratch space management. Moreover, if f is a composite function f(h(g(l(...(x))))) of depth d, the original register of size N must be partitioned in such a way that there are d-1 scratch registers available. So, in order to compute highly composite functions, the first register (storing the argument) must have the smallest possible size, to leave room for the needed number of scratch registers. In particular, if n=1 (the Planck scale), the available scratch space has size N-1, and the highest level of composition for f is d=N when d-1 scratch registers, of one qubit each, sum up to the original register of size N. Thus, the quantum computation of highly composite functions must be performed close to the Planck scale, and the output (some global property of f) is obtained at macroscopic scales.

According to inflationary cosmological theories, the cosmological horizon has at present a radius $R = 10^{60}\, l_P$, thus its surface area is $A \sim 10^{120}\, l_P^2$, that is, an area of $10^{120}$ pixels, each one encoding one qubit. In the QCV, the cosmological horizon's surface can be interpreted as a quantum memory register of $N = 10^{120}$ qubits. Thus, space-time at the Planck scale can compute a composite function of maximal depth $d = 10^{120}$.
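The scratch-space bookkeeping described above (d-1 scratch registers and 2d-1 operations for a composite function of depth d) can be illustrated with a classical reversible analogue. This is only a sketch under stated assumptions: integer registers and XOR updates stand in for the unitary operators of the text, and the function name `compute_with_scratch` is hypothetical.

```python
def compute_with_scratch(x, functions):
    """Reversibly evaluate functions[-1](...functions[0](x)) and clean up.

    Registers: regs[0] holds the argument x, regs[1..d-1] are scratch
    registers, regs[d] receives the final output. Each step XORs a function
    value into a register, so repeating the same step undoes it.
    Returns the final register contents and the number of operations used.
    """
    d = len(functions)
    regs = [x] + [0] * d
    ops = 0

    # Forward pass: d operations, each writing into a fresh register.
    for i, f in enumerate(functions, start=1):
        regs[i] ^= f(regs[i - 1])
        ops += 1

    # Uncompute pass: d-1 operations, resetting scratch registers to 0
    # in reverse order so that each input is still available when needed.
    for i in range(d - 1, 0, -1):
        regs[i] ^= functions[i - 1](regs[i - 1])
        ops += 1

    return regs, ops


if __name__ == "__main__":
    g = lambda v: (v + 1) % 4   # illustrative functions only
    h = lambda v: (3 * v) % 4
    print(compute_with_scratch(2, [g, h]))
```

For depth d = 2 the sketch reproduces the three-step pattern of the text: the scratch register is filled with g(x), used to produce h(g(x)), and then reset to 0 so that it can be reused.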
It follows that, in the quantum computer view, the dynamical evolution of quantum space-time itself is a reversible process. This sounds like a paradox if we think of quantum space-time as a pre-space-time with almost all the characteristics of classical space-time, which is the seat of irreversibility. Irreversibility might be just an emergent feature at larger scales. One should be able, however, to figure out what reversibility of quantum space-time itself means. The simplest answer leads us back to Wheeler's "space-time foam" [35], made up of virtual black holes (and wormholes). Like all virtual processes, this one takes place by virtue of the time-energy uncertainty relation (which is saturated at the Planck scale). A quantum black hole of Planck mass comes into existence out of the vacuum and then evaporates in a Planck time, releasing a quantum of Planck energy back to the vacuum. As this "virtual" process is due to quantum fluctuations of the vacuum, which are non-dissipative [36], it can be considered a reversible process, unless a measurement takes place. But virtual particles cannot be probed.

8. Conclusions

The QCV of space-time at the Planck scale relies on linear concepts like superposition and entanglement. Thus, this view cannot be extended to the macroscopic domain, where space-time is described by the non-linear equations of General Relativity. To understand how the non-linearity of the classical macroscopic level arises from the linearity of the Planck-scale level, it might be useful to consider self-organizing models and related technicalities. This is what we call emergence of classicality and complexity (our classical world emerges as one which is complex). As we have seen, in the QCV, quantum space-time appears to have a reversible dynamical evolution. But what does it mean that space-(time) evolves in time, and moreover in a reversible manner?
As we have seen, this paradox can be solved by assuming Wheeler's picture of "space-time foam", which however excludes time flow at the Planck scale. Thus, both non-linearity and irreversibility, which have no home in the QCV, should be emergent features of space-time. In the QCV, locality is also lost: "space-time" itself is non-local at the Planck scale, due to the entanglement of pixels/qubits. This is very much in line with Penrose's argument, stating that the theory emergent from spin networks should have a fundamentally non-local character [37].
Causality is a more subtle point. However, we believe that, because of the non-locality due to the entanglement of pixels, microcausality is missing at the Planck scale, at least in its usual form. Finally, despite all these weird features, space-time at the Planck scale seems to be able to compute its own dynamical evolution, by quantum evaluating recursive functions.

Acknowledgments

I wish to thank G. Peruzzi and G. Sambin for useful discussions.

References

1. C. Rovelli, "Loop Quantum Gravity", gr-qc/9710008 (1997).
2. C. Rovelli and L. Smolin, "Loop representation of quantum general relativity", Nucl. Phys. B331, 80 (1990).
3. J. C. Baez, "Spin Foam Models", Class. Quant. Grav. 15, 1827 (1998).
4. J. H. Schwarz, "Introduction to Superstring Theory", hep-th/0008017 (2000).
5. M. J. Duff, "M-Theory (The Theory Formerly Known as Strings)", hep-th/9608117 (1996).
6. A. Connes, Non Commutative Geometry (Academic Press, San Diego, 1994).
7. L. Bombelli, J. Lee, D. Meyer and R. Sorkin, "Space-time as a causal set", Phys. Rev. Lett. 59, 521 (1987).
8. R. Gambini and J. Pullin, "A rigorous solution of the quantum Einstein equations", Phys. Rev. D54, 5935 (1996).
9. R. Loll, "Nonperturbative solutions for lattice quantum gravity", Nucl. Phys. B444, 619 (1995).
10. M. Reisenberger, "A Left-Handed Simplicial Action for Euclidean General Relativity", Class. Quant. Grav. 14, 1730 (1997).
11. M. Requardt, "Cellular Networks as Models for Planck-Scale Physics", J. Phys. A31, 7997 (1998).
12. M. Requardt and S. Roy, "(Quantum) Space-Time as a Statistical Geometry of Fuzzy Lumps and the Connection with Random Metric Spaces", Class. Quant. Grav. 18, 3039 (2001).
13. J. Preskill, "Quantum Information and Computation", Lecture Notes for Physics 229 (California Institute of Technology, 1998).
14. M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information (Cambridge University Press, UK, 2000).
15. A. Ekert, P. Hayden, and H. Inamori, "Basic Concepts in Quantum Computation", quant-ph/0011013 (2000).
16. A. Steane, "Quantum Computing", Rept. Prog. Phys. 61, 117 (1998).
17. J. A. Wheeler, "It from Bit", in Sakharov Memorial Lectures on Physics, Vol. 2, L. Keldysh and V. Feinberg eds. (Nova Science, New York, 1992).
18. A. Zeilinger, "A Foundational Principle for Quantum Mechanics", Found. Phys. 29, 631 (1999).
19. S. Lloyd, "Universe as quantum computer", Complexity 3(1), 32 (1997).
20. G. Jaroszkiewicz, "The running of the universe and the quantum structure of time", quant-ph/0203020 (2002).
21. R. Penrose, "Theory of quantised directions", in Quantum Theory and Beyond, T. Bastin ed. (Cambridge University Press, 1971).
22. C. Rovelli and L. Smolin, "Spin networks and quantum gravity", Phys. Rev. D52, 5743 (1995).
23. C. Rovelli and L. Smolin, "Discreteness of area and volume in quantum gravity", Nucl. Phys. B442, 593 (1995).
24. P. A. Zizzi, "Holography, Quantum Geometry, and Quantum Information Theory", Entropy 2, 39 (2000).
25. P. A. Zizzi, "Quantum Computation toward Quantum Gravity", Gen. Rel. Grav. 33, 1305 (2001).
26. G. 't Hooft, "Dimensional reduction in quantum gravity", gr-qc/9310026 (1993).
27. G. 't Hooft, "The Holographic Principle", hep-th/0003004 (2000).
28. L. Susskind, "The world as a hologram", hep-th/9409089 (1994).
29. H. Everett III, "Relative State Formulation of Quantum Mechanics", Rev. Mod. Phys. 29, 454 (1957).
30. C. H. Bennett, IBM J. Res. Develop. 17, 525 (1973).
31. C. H. Bennett, SIAM J. Comput. 18, 766 (1989).
32. P. A. Zizzi, "The Early Universe as a Quantum Growing Network", gr-qc/0103002 (2001).
33. P. A. Zizzi, "Ultimate Internets", gr-qc/0110122 (2001).
34. S. Lloyd, "Computational capacity of the universe", quant-ph/0110141 (2001).
35. J. A. Wheeler, Geometrodynamics (Academic Press, New York, 1962).
36. E. Nelson, "Quantum Fluctuations", Princeton Series in Physics (Princeton University Press, 1985).
37. R. Penrose, "Afterword", in The Geometric Universe, Science, Geometry, and the Work of Roger Penrose, S. A. Huggett et al. eds. (Oxford University Press, 1998).
THREE-DIMENSIONAL WAVE BEHAVIOUR OF LIGHT

FABRIZIO LOGIURATO, BENIAMINO DANESE, LUIGI M. GRATTON, STEFANO OSS
Department of Physics, University of Trento, 38050 Povo, Trento, Italy

We describe a simple experimental apparatus which allows one to observe the wave properties of light in a new way. This apparatus makes it possible to introduce and illustrate, in a very suggestive way, some fundamental principles of quantum theory.
1. Introduction
Quantum theory is introduced in many books by means of an example widely recognized as paradigmatic: the double-slit experiment (see, e.g., Refs. 1-4). Light, after travelling and behaving as a wave, manifests itself on the detection screen as a stream of corpuscles. According to Feynman, it is absolutely impossible to explain this phenomenon in any classical way. In his opinion this is the "heart" of quantum mechanics: "in reality, it contains the only mystery" [1]. However, in experiments emphasizing the wave nature of light, diffraction and interference patterns are shown only in the last part of the light path. What is going on in the space between the slits and the detection screen is only sketched in the figures. We developed a simple apparatus where diffraction and interference patterns are observed not only at the final screen position, as in traditional experiments, but also in a three-dimensional environment [5, 6]. In this paper we give a few examples of how our apparatus may be used to illustrate the wave properties of light.
2. Experimental setup and results
Many simple techniques have been adopted in the past to visualize light rays, for instance light diffusion from chalk powder or from smoke. The technique we adopt here is based on light diffusion from water droplets produced by an ultrasonic mist-maker^a immersed in water. Vibrations at ultrasonic frequencies of a ceramic electrode inside the mist-maker generate ultrasound that breaks the surface of the liquid and nebulizes the water. This technique produces a continuous and homogeneous fog which allows very stable luminous patterns to form throughout its whole volume. To minimize turbulence and to ensure high homogeneity of the fog along the light path, the mist-maker is placed in a box with transparent walls, such as an aquarium. A black piece of fabric covers the walls of the box that are not used for viewing, to avoid disturbing reflections. As a coherent light source we use a 10 mW HeNe laser. The laser wavelength is λ = 0.6328 µm. The slits belong to the Pasco optical kit OS-9165. The photographs are taken with a digital D70 reflex camera. The equivalent sensitivity is set to 200 ISO. Exposure times range from 1/30 s to 1/2 s, with various f-values.

In Figure 1 we see various images. In each of them HeNe laser light impinges on a single slit, and the slits in different images have different widths. The dependence of the extent of the diffracted beam on the width of the slit is clear: the narrower the slit, the broader the intense (0th order) central beam.

^a Further information about mist-makers may be found, for instance, at: http://www.phvslink.com/estore/cart/UltrasonicMistMaker.cfm
Figure 1. The experiments of diffraction from a single slit. The images correspond to slits of decreasing width (from left to right): 80 µm, 40 µm, 20 µm.

We think that these images are a really beautiful illustration of the famous single-slit experiment by which Heisenberg introduces the uncertainty relations [7, 8]. In fact, if we regard light as a stream of corpuscles (the photons), it follows that the narrower the slit, the higher the space localization of each photon in the light beam, and the larger the uncertainty of the momentum acquired by it.
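The trade-off between slit width and beam spread can be checked with a short calculation. The sketch below is an order-of-magnitude estimate, not the authors' analysis: it assumes the Fraunhofer condition sin θ = λ/a for the first diffraction minimum, takes the slit width a as the position uncertainty, and assumes the standard HeNe red line at 632.8 nm. The function names are hypothetical.

```python
import math

PLANCK_H = 6.62607015e-34  # Planck constant in J s


def first_minimum_angle(wavelength, slit_width):
    """Half-angle (rad) of the central diffraction lobe: sin(theta) = lambda / a."""
    return math.asin(wavelength / slit_width)


def momentum_spread(wavelength, slit_width):
    """Transverse momentum spread (kg m/s) estimated as p * sin(theta),
    with the photon momentum p = h / lambda."""
    p = PLANCK_H / wavelength
    return p * math.sin(first_minimum_angle(wavelength, slit_width))


if __name__ == "__main__":
    wavelength = 632.8e-9  # assumed HeNe wavelength in metres
    for width in (80e-6, 40e-6, 20e-6):  # slit widths quoted for Figure 1
        theta = first_minimum_angle(wavelength, width)
        dp = momentum_spread(wavelength, width)
        print(f"a = {width * 1e6:.0f} um: theta = {math.degrees(theta):.2f} deg, "
              f"a * dp / h = {width * dp / PLANCK_H:.3f}")
```

Halving the slit width roughly doubles the opening angle of the central beam, while the product of slit width and estimated momentum spread stays equal to h in this simple estimate.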
In Figure 2 two further interesting images are compared. In the left one, the light beam impinges on a screen with two slits. The beam that passes through the screen forms, in space, the well-known interference fringes of the classic Young experiment. This experiment provided the definitive demonstration of the existence of wave properties of light in 1802. In the resulting series of maxima and minima we can distinguish two patterns: the enveloping pattern due to the diffraction of light through each slit and, inside the envelope, the interference pattern of the light coming from the two slits [9]. We may compare the Young double-slit experiment with the diffraction from the single slit (right image). In the latter, the slit has the same width as each slit in the Young experiment. It can be noted immediately that the interference pattern from two slits is not the sum of two diffraction patterns from single slits. This phenomenon cannot be explained if one adopts only the classic corpuscular model of light. Hence, the photos in Figure 2 effectively support the undulatory counterpart of Feynman's two-slit experiment, by which this author introduces the wave-corpuscle dualism and Bohr's complementarity principle [1].
Figure 2. Left: Young experiment of two slits. The width of the slits is 40 µm, and their separation is 125 µm. Right: diffraction from a single slit of the same width.
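The statement that the two-slit pattern is not the sum of two single-slit patterns can be made quantitative with the textbook far-field formulas. The sketch below is an illustration under assumptions not stated in the paper: Fraunhofer (far-field) diffraction, a separation measured between slit centres, a HeNe wavelength of 632.8 nm, and intensities normalized so that one slit alone gives 1 on the axis. The function names are hypothetical.

```python
import math


def sinc(u):
    """sin(u) / u with the limiting value 1 at u = 0."""
    return 1.0 if u == 0 else math.sin(u) / u


def single_slit_intensity(sin_theta, wavelength, width):
    """Fraunhofer pattern of one slit, equal to 1 on the axis."""
    beta = math.pi * width * sin_theta / wavelength
    return sinc(beta) ** 2


def double_slit_intensity(sin_theta, wavelength, width, separation):
    """Two identical slits: interference fringes inside the single-slit envelope."""
    delta = math.pi * separation * sin_theta / wavelength
    envelope = single_slit_intensity(sin_theta, wavelength, width)
    return 4.0 * math.cos(delta) ** 2 * envelope


if __name__ == "__main__":
    wavelength = 632.8e-9          # assumed HeNe wavelength in metres
    width, separation = 40e-6, 125e-6  # values quoted for Figure 2
    # Compare on the axis and at the first interference minimum.
    for s in (0.0, wavelength / (2 * separation)):
        both = double_slit_intensity(s, wavelength, width, separation)
        summed = 2.0 * single_slit_intensity(s, wavelength, width)
        print(f"sin(theta) = {s:.5f}: two slits = {both:.3f}, "
              f"sum of singles = {summed:.3f}")
```

On the axis the two-slit intensity is four times that of one slit rather than twice, and at the first interference minimum it vanishes although each slit alone would still send light there.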
References

1. R. P. Feynman, R. B. Leighton and M. Sands, The Feynman Lectures on Physics, Vol. 3 (Addison-Wesley, Reading, MA, 1989).
2. T. Hey and P. Walters, The New Quantum Universe (Cambridge University Press, Cambridge, 2003).
3. E. H. Wichmann, Berkeley Physics Course, Vol. 4, Quantum Physics (Wiley, New York, 1971).
4. B. D'Espagnat, Conceptual Foundations of Quantum Mechanics (W. A. Benjamin, Reading, MA, 1976).
5. F. Logiurato, L. M. Gratton and S. Oss, The Physics Teacher, in print (2005).
6. F. Logiurato, L. M. Gratton and S. Oss, submitted to Physics Education (2005).
7. W. Heisenberg, The Physical Principles of the Quantum Theory (University of Chicago Press, Chicago, 1930).
8. M. Alonso and E. J. Finn, Fundamental University Physics, Vol. 3 (Addison-Wesley, Reading, MA, 1972).
9. F. A. Jenkins and H. E. White, Fundamentals of Optics (McGraw-Hill, New York, 1957).
This volume provides a unique overview of recent Italian studies on the foundations of quantum mechanics and related historical, philosophical and epistemological topics. A gathering of scholars from diverse cultural backgrounds, the conference provided a forum for a fascinating exchange of ideas and perspectives on a range of open questions in quantum mechanics. The varied nature of the papers in this volume attests to the achievement of that aim with many contributions providing original solutions to established problems by taking into account recommendations from different disciplines.
The Foundations of Quantum Mechanics
ISBN 981-256-852-2
www.worldscientific.com