Monday, 15 September 2014

Particle Physics Software and Financial Analysis Mechanics

Many people ask what the benefits are of massive particle physics experiments such as the Large Hadron Collider at CERN. Some go back to basics and say that fundamental research inevitably generates spin-offs in technology. After all, hospital equipment such as PET scanners and MRI machines would never have found its way into hospitals without the modern detector and superconducting magnet technology used in the massive particle detectors.

However, it is important to note that the technologies used in modern accelerators, such as superconducting magnets and silicon detectors, were not invented by particle physicists; rather, they were developed by materials scientists and engineers.

Particle physics has benefited from materials science in much the same way as medical physics has benefited from particle physics, simply in a different direction. However, particle physics research, often labelled "blue sky", not only tests the proof of principle of many important technologies but also creates applications in the here and now, the most important of which is new software technology.

So when talking about the benefits of particle physics, and more importantly why people should fund it, what are the benefits?
One real benefit of particle physics research has been the vast amount of software that has been developed for these projects. Grid computing and data mining systems have been integral to the whole process from the beginning, as has the ability to develop faster computer programs that analyse vast numbers of variables.

Physicists and engineers have had to examine how they can program computers to perform these tasks as quickly as possible, and this has led them to develop new programming tools using existing techniques. One such example is CERN's ROOT framework, which uses the more familiar C++ language as its foundation.

CERN based ROOT on C++ because, even though it is a bit harder than most other languages, it is very fast and powerful. Programming languages are chosen in the same way you would choose a car: either for safety from crashing or for speed. If Fortran and Visual Basic are stable yet slow languages, equivalent to a family car say, then a C++ program would be like a Ferrari: fast, but it crashes easily.

C++ is the computer language for mathematical models where you need speed. For models with closed form solutions you are naturally doing fine in almost any language, but when it comes to large scale Monte Carlo C++ is really a plus.

Therefore it is not surprising that CERN uses C++ in ROOT, as Monte Carlo simulations are used all the time in theoretical particle physics models. Previously, CERN had used Fortran to run its simulations in its GEANT3 particle physics platform. CERN adopted C++ for Geant4, which was the first such program to use object-oriented programming. CERN ROOT truly is an amazing piece of physics software with far greater potential than people give it credit for. ROOT works on virtually all operating systems; however, for optimum performance I have found that Red Hat's Fedora is the OS of choice for CERN ROOT, at least in my experience.

Windows is useful for quick jobs and modifications; however, I find it crashes far too easily and is not good if you are planning to work with ROOT for hours on end. Moreover, since the advent of the infamous Windows 8, most would consider Windows to be a generally bad operating system for any form of high-level work.

Apart from particle physics, Monte Carlo simulations are also an important tool in modern finance, where they are used in predictive models, namely to integrate certain formulas while incorporating random fluctuations as well as boundary conditions.

An elementary example of using the Monte Carlo method is to find the value of π.

Whereas humans can solve integrals symbolically, which is more mathematically complete, computer programs can only solve integrals numerically. To understand how a computer program does this, we need to think back to the basic definition of an integral from fundamental calculus:

The following example generates a solution, using standard integration, for the area under the curve drawn from the polynomial f(x).

An integral can also be solved by approximation using a series of boxes drawn under the curve (see figure below); as the number of boxes increases, the approximation gets better and better. This is how a computer program can calculate integrals - it calculates the area of a large number of these boxes and sums them up.

In a similar fashion, a value for the mathematical constant π can be represented or "modeled" by a definite integral, such as the area of the unit circle:

\pi = \int_{-1}^{1} 2\sqrt{1 - x^2}\, dx

We can simplify this with a geometric observation: a circle of radius R = 1 inscribed in a square of side 2R = 2 has area π, while the square has area 4, so the ratio of the circle's area to the square's area is π/4.

This forms our new representation or "model" for π: four times the fraction of the square's area occupied by the circle.

As a simplifying approximation we can use a Monte Carlo method where we choose points randomly inside the square. The points should be uniformly distributed, that is, each location inside the square should occur with the same probability. Then the simplifying approximation which allows us to compute π is

\pi \approx 4 \times \frac{N_\text{inside}}{N_\text{total}}

A simple algorithm to implement this Monte Carlo calculation is:

• Choose the radius of the circle R = 1, so the enclosing square has side 2R = 2 and area (2R)^2 = 4

• Generate a random point inside the square.

–> Use a random number generator to generate a uniform deviate, that is, a real number r in the range 0 < r < 1. The x coordinate of the random point is x = 2r−1.
–> Repeat to find the y coordinate.
–> If x^2 + y^2 < 1 then the point is inside the circle of radius 1, so increment the number of inside points.
–> Increment the total number of points.

• Repeat until the desired number of points have been generated.

• Compute π ≈ 4 × (number of inside points) / (total number of points)

Implemented in CERN ROOT, we get the following distribution:

All well and good, but what does this have to do with anything beyond abstract math? It turns out that the notion of a fixed constant whose value is computed from random external counting is very similar to the concept of a priced stock option: a fixed value under the influence of random fluctuations caused by the externalities that every system of transaction inherently has.

SPX Corporation is a Fortune 500 multi-industry manufacturing firm. SPX's business segments serve developing and emerging end markets, such as global infrastructure, process equipment, and diagnostic tool industries.

How do we know this company is lucrative? Simply calling it a "Fortune 500" Company should not be the only index we use in measuring a company's worth. If we really wanted to invest in a company we should see how its worth changes over time.

From this we can learn a lot about what happened, particularly how the company was affected by the 2008 banking crisis; it suffered a minor collapse and returned to stability, but this did not last, and it went on to suffer an even larger crash over the course of the following months. It is noteworthy, however, how quickly it recovered and began to flourish again.

However, we should remember that the banking bailouts promised a large trickling down of money into companies like these, so was this a result of actual worth, or was the company benefiting in some way from a bailout? President Obama's stimulus package also went into major Fortune 500 corporations. Hence the overall speed of recovery should not be a de facto litmus test of the company's competitiveness, as state intervention in any corporation, whether by bailout or stimulus, is a violation of the capitalist market principles by which the system is supposed to be, at least in theory, self-correcting: the companies which are in fact doing better are allowed to survive, whereas the companies doing worst must die out. Therefore measuring the price over time is not a clear indicator of a company's value in a true capitalist market system.

With smaller businesses, however, we can usually check the profits a business makes in a year and use this as an accurate measure of competitiveness: the profit of such a business lets us measure its trading power, and a competitive market system independent from state intervention is often more apparent in smaller businesses. With bigger businesses, in which states often intervene, we should use a marker that lets us monitor trading power more closely: on a daily, hourly, or even minute-by-minute basis.

One way to do this is to see how much the company's stock was traded over time.

Volume–price trend (VPT) (sometimes price–volume trend) is a technical analysis indicator intended to relate price and volume in the stock market. VPT is based on a running cumulative volume that adds or subtracts a multiple of the percentage change in share price trend and current volume, depending upon their upward or downward movements.

\text{VPT} = \text{VPT}_\text{prev} + \text{volume} \times { \text{close}_\text{today} - \text{close}_\text{prev} \over \text{close}_\text{prev} }
The starting point of the cumulative VPT total, i.e. the zero point, is arbitrary. Only the shape of the resulting indicator is used, not the actual level of the total.

The VPT is similar to On Balance Volume (OBV) in that it is a cumulative, momentum-style indicator which ties volume together with price action. The key difference, however, is that the amount of volume added to the total depends on the relationship of today's close to yesterday's close.
VPT is interpreted in similar ways to OBV. Generally, the idea is that volume is higher on days with a price move in the dominant direction; for example, in a strong uptrend there is more volume on up days than on down days. So, when prices are going up, VPT should be going up too, and when prices make a new rally high, VPT should too. If VPT fails to go past its previous rally high, then this is a negative divergence, suggesting a weak move.

The VPT can be considered more accurate than OBV in that the amount of volume apportioned each day is directly proportional to the underlying price action. Large moves account for large moves in the index, and small moves account for small moves in the index. This way the VPT can be seen to almost mirror the underlying market action; however, as shown above, divergence can occur, and it is this divergence that is an indicator of possible future price action.

VPT is used to measure the "enthusiasm" of the market. In other words, it is an index that shows how much a stock was traded.

Using ROOT we can make animations showing the Price vs Volume movement of the SPX Corp over 3 years. A 50-day moving average line is also drawn over the scatter plot. (note: if animation stops and you want to see it again, open it in a new window)

We can see from this that the motion is truly random and looks very similar to the concept of Brownian Motion from physics. Using Monte Carlo Simulations physicists have been able to create Brownian Motion Simulations on computers to predict the possible random paths of particles.

Stock prices are often modeled as the sum of a deterministic drift, or growth, rate and a random term with a mean of 0 and a variance proportional to dt. This is known as Geometric Brownian Motion, and it is commonly used to model stock price paths. It is defined by the following stochastic differential equation:

dS_t = \mu S_t \, dt + \sigma S_t \, dW_t

S_t is the stock price at time t, dt is the time step, μ is the drift, σ is the volatility, W_t is a Wiener process, and ε (used in the discretized form below) is drawn from a normal distribution with a mean of zero and a standard deviation of one.

Hence  dSt is the sum of a general trend, and a term that represents uncertainty.

We can convert this equation into finite difference form to perform a computer simulation, which gives

\Delta S_t = S_t \left( \mu \, \Delta t + \sigma \varepsilon \sqrt{\Delta t} \right)

Bear in mind that ε is a normal distribution with a mean of zero and standard deviation of one.

We can use ROOT to perform such a Geometric Brownian Motion simulation.

This ROOT script reads in a data file which contains 32 days of closing prices.
The script then takes all 32 days, produces a log-normal histogram which is fitted with a Gaussian to get the volatility (σ) and drift (μ), assuming Geometric Brownian Motion (GBM). Once these two parameters are obtained from the data, a simple Monte Carlo model is run to produce 5%, 50%, and 95% CL limits on future price action. On top of the CL contours, 10 world lines of possible future price histories are also drawn.

The future price histories can be "hacked" in a new code to examine the world line histories to compute the probability for a given line to cross a specified price threshold. The next plot here shows 1000 world lines.

Red lines are those which never exceed $350 closing price and green are those which do exceed the $350 price threshold at least one time. For 1000 world lines, the result is that 161 trials had prices exceeding $350 at least one time, thus implying a 16% chance for the closing price to exceed a $350 price threshold.

A computer simulation is not a proof-positive way to predict how the trading power of a business will increase or decrease over the course of a year. But considering that it quantifies the possibility of growth and decay, it is at least a more reasonable way of prediction than the apparently instinctual way people invest in business and trade, sometimes depending on betting schemes that they don't fully understand and which carry externalities such as systemic risk.

The management of risk, especially systemic risk, in the financial world was evidently deeply flawed in the 2008 financial crisis. An important part of the problem was that core financial institutions had used a shadowy secondary banking system to hide much of their exposure. Citigroup, Merrill Lynch, HSBC, Barclays Capital and Deutsche Bank had taken on a lot of debt and lent other people's money against desperately poor collateral. Prior to the US bank deregulation and UK privatizations of the 1990s, the Glass–Steagall Act of 1933 would have barred the institutions taking these exotic forms of risk, all across the Anglo-American system, from dabbling in retail finance. Banks, credit lenders and building societies would have been less exotic and venture-capitalist, more boring in the eyes of neoliberalization, but would nevertheless have remained stable and solid institutions.

Hence, paying attention to the risks involved in all financial tools is of the utmost importance. At the end of the day, all our beautiful graphs, fancy theorems and newest computer models are no more than decorative pieces if the economy is going completely out of whack, as it did in the early 2000s, with essentially $8 trillion in the US alone existing out of thin air, supporting the construction and housing boom. Considering the possible hits and misses helps remind us that no form of trade is ever too big to fail, an important lesson for avoiding future crises.

It is ironic to think that just before the great banking deregulation between the mid-1990s and early 2000s, in 1990, the grand old man of modern economics, Harry Markowitz, was finally awarded the Nobel Prize:

                                                         Professor Harry Markowitz

Markowitz' work provided new tools for weighing the risks and rewards of different investments and for valuing corporate stocks and bonds.

In plain English, he developed the tools to balance greed and fear: we want the maximum return with the minimum amount of risk. Our stock portfolio should lie on the "Efficient Frontier", a concept in modern portfolio theory introduced by Markowitz himself and others.

A combination of assets, i.e. a portfolio, is referred to as "efficient" if it has the best possible expected level of return for its level of risk (usually proxied by the standard deviation of the portfolio's return).

Here, every possible combination of risky assets, without including any holdings of the risk-free asset, can be plotted in risk-expected return space, and the collection of all such possible portfolios defines a region in this space. The upward-sloped (positively-sloped) part of the left boundary of this region, a hyperbola, is then called the "efficient frontier". The efficient frontier is then the portion of the opportunity set that offers the highest expected return for a given level of risk, and lies at the top of the opportunity set (the feasible set).

To better quantify the risk we are willing to take, we define a utility function U(x). It describes, as a function of our total assets x, our "satisfaction". A common choice is U(x) = 1 − exp(−k·x) (the reason for the exponential will become clear later).

The parameter k is the risk-aversion factor. For small values of k, the satisfaction is small for small values of x; by increasing x, the satisfaction can still be increased significantly. For large values of k, U(x) increases rapidly to 1: there is no increase in satisfaction for additional dollars earned.

In summary:
small k ==> risk-loving investor
large k ==> risk-averse investor

Suppose we have, for each of nrStocks stocks, the historical daily returns r = closing_price(n) − closing_price(n−1).
Define a vector x of length nrStocks which contains the fraction of our money invested in each stock. We can calculate the average daily return z of our portfolio and its variance using the portfolio covariance matrix Covar:

z = r^T x   and var = x^T Covar x

Assuming that the daily returns have a normal distribution, so will z, with mean r^T x and variance x^T Covar x.

The expected value of the utility function is :

E(U(x)) = Int (1 − exp(−k·z)) N(z) dz = 1 − exp( −k (r^T x − 0.5 k x^T Covar x) )

Its value is maximized by maximizing r^T x − 0.5 k x^T Covar x under the conditions sum(x_i) = 1, meaning we want all our money invested, and x_i >= 0, meaning we cannot "short" a stock.

How can we do this? We need to use a technique called quadratic programming.

Let's first review what exactly we mean by "quadratic programming":

We want to minimize the following objective function :

c^T x + ( 1/2 ) x^T Q x    wrt. the vector x

where c is a vector and Q a symmetric positive definite matrix

You might wonder what is so special about this objective which is quadratic in the unknowns.

Well, we have in addition the following boundary conditions on x:

A x =  b
clo <=  C x <= cup
xlo <=    x <= xup  ,

where A and C are arbitrary matrices and the rest are vectors.

Not all these constraints have to be defined. Our example will only use xlo, A and b. Still, this could be handled by a general non-linear minimizer like Minuit by introducing so-called "slack" variables. However, quadp is tailored to objective functions no more complex than quadratic. This allows the use of solving techniques which are stable even for problems involving, for instance, 500 variables, 100 inequality conditions and 50 equality conditions.

What the quadratic programming package in our computer program will do is:

minimize    c^T x + ( 1/2 ) x^T Q x    
subject to                A x  = b
                  clo <=  C x <= cup
                  xlo <=    x <= xup

what we want :

  maximize    c^T x - k ( 1/2 ) x^T Q x
  subject to        sum_x x_i = 1
                   0 <= x_i

We have nrStocks weights to determine, with 1 equality and 0 inequality equations (the simple square boundary condition (xlo <= x <= xup) does not count).

For 10 stocks we got the historical daily data for Sep-2000 to Jun-2004:

GE   : General Electric Co
SUNW : Sun Microsystems Inc
QCOM : Qualcomm Inc
BRCM : Broadcom Corp
TYC  : Tyco International Ltd
IBM  : International Business Machines Corp
AMAT : Applied Materials Inc
C    : Citigroup Inc
PFE  : Pfizer Inc
HD   : Home Depot Inc

We calculate the optimal portfolio for risk-aversion factors k = 2.0 and 10.0.

Food for thought :

- We assumed that the stock returns have a normal distribution. Check this assumption by histogramming the stock returns!

- For the expected return in the objective function, we used the flat average over a time period. Investment firms put significant resources into improving the return prediction.

- If you want to trade a significant number of shares, several other considerations have to be taken into account:

+  If you are going to buy, you will drive the price up (so-called "slippage"). This can be taken into account by adding terms to the objective (Google "slippage optimization").

+  FTC regulations might have to be added to the inequality constraints

- Investment firms do not want to be exposed to the "market" as defined by a broad index like the S&P, and "hedge" this exposure away. A perfect hedge can be added as an equality constraint; otherwise, add an inequality constraint.

This was just a brief taste of some of the overlapping fields of study that exist between fundamental experimental and theoretical physics research and the world of finance and trade, demonstrating that the two fields, both very different and abstract, can nevertheless be unified to some degree.

I understand that finance is never a universally popular subject among scientists, as most science is critically underfunded worldwide. However, I believe that by cosying up to finance in the same way that science has cosied up to industry, we will have more and more chances to "sing for our supper" and get more funding for fundamental science in the future. Moreover, it may also help us to trade with businesses that are shown, statistically, to be the best partners to trade with, thereby bringing down the cost of big projects and helping science get done at the least cost.

(If you want a copy of any of the CERN ROOT codes used to develop the images and graphs above, please contact me by leaving a comment below)

Monday, 8 September 2014

Overview of Quantum Entanglement - Einstein Versus Bohr

Quantum Entanglement

It's a popular myth that identical twins can sometimes sense when one of the pair is in danger, even if they're oceans apart. Tales of telepathy abound. Scientists cast a skeptical eye over such claims, largely because it isn't clear how these weird connections could possibly work. Yet they've had to come to terms with something that's no less strange in the world of physics: an instantaneous link between particles that remains strong, secure, and undiluted no matter how far apart the particles may be – even if they're on opposite sides of the universe. It's a link that Einstein went to his grave denying, yet its existence is now beyond dispute. This quantum equivalent of telepathy is demonstrated daily in laboratories around the world. It holds the key to future hyperspeed computing and underpins the science of teleportation. Its name is entanglement.

The discovery of entanglement

The concept, though not the name, of entanglement was first put under the scientific spotlight on May 15, 1935, when a paper by Einstein and two younger associates, Boris Podolsky and Nathan Rosen, appeared in the journal Physical Review.[1]

Its title – "Can a Quantum-Mechanical Description of Physical Reality Be Considered Complete?" – leaves no doubt that the paper was a challenge to Niels Bohr and his vision of the subatomic world. On June 7, Erwin Schrödinger, himself no lover of quantum weirdness, wrote to Einstein, congratulating him on the paper and using in his letter the word entanglement – or, rather, its German equivalent, Verschränkung – for the first time. This new term soon found its way into print in an article, sent to the Cambridge Philosophical Society on August 14, that was published a couple of months later.[2]

In it he wrote:

When two systems ... enter into temporary physical interaction ... and when after a time of mutual influence the systems separate again, then they can no longer be described in the same way as before, viz. by endowing each of them with a representative of its own. I would not call that one but rather the characteristic trait of quantum mechanics, the one that enforces its entire departure from classical lines of thought. By the interaction the two representatives [the quantum states] have become entangled.
The characteristic trait of quantum mechanics ... the one that enforces its entire departure from classical lines of thought – here was an early sign of the importance attached to this remarkable effect.

Entanglement lay at the very heart of quantum reality – its most startling and defining feature and Einstein would have none of it.

For the best part of a decade, the man who revealed the particle nature of light (see Einstein and the photoelectric effect) had been trying to undermine Bohr's interpretation of quantum theory. Einstein couldn't stomach the notion that particles didn't have properties, such as momentum and position, with real, determinable (if only we knew how), preexisting values. Yet that notion was spelled out in a relationship discovered in 1927 by Werner Heisenberg. 

Known as the uncertainty principle, it stems from the rule that the result of multiplying together two matrices representing certain pairs of quantum properties, such as position and momentum, depends on the order of multiplication. The same oddball math that says X times Y doesn't have to equal Y times X implies that we can never know simultaneously the exact values of both position and momentum. Heisenberg proved that the product of the uncertainties in position and momentum can never be smaller than a particular number that involves Planck's constant.

In one sense, this relationship quantifies wave-particle duality. Momentum is a property that waves can have (related to their wavelength); position is a particlelike property because it refers to a localization in space. Heisenberg's formula reveals the extent to which one of these aspects fades out as the other becomes the focus of attention.

In a different but related sense, the uncertainty principle tells how much the complementary descriptions of a quantum object overlap. Position and momentum are complementary properties because to pin down one is to lose track of the other; they coexist but are mutually exclusive, like the opposite sides of the same object. Heisenberg's formula quantifies the extent to which knowledge of one limits knowledge of the other. (For more, see my article on the life and work of Werner Heisenberg, which has a more detailed description of the uncertainty principle and the context of wave-particle duality.)

Einstein didn't buy this. He believed that a particle does have a definite position and momentum all the time, whether we're watching it or not, despite what quantum theory says. From his point of view, the Heisenberg uncertainty principle isn't a basic rule of nature; it's just an artifact of our inadequate understanding of the subatomic realm. In the same way, he thought, wave-particle duality isn't grounded in reality but instead arises from a statistical description of how large numbers of particles behave. Given a better theory, there'd be no wave-particle duality or uncertainty principle to worry about. The problem was, as Einstein saw it, that quantum mechanics wasn't telling the whole story: it was incomplete.

Einstein versus Bohr

Intent on exposing this fact to the world and championing a return to a more classical pragmatic view of nature, Einstein devised several thought experiments between the late 1920s and the mid-1930s. Targeted specifically at the idea of complementarity, these experiments were designed to point out ways to simultaneously measure a particle's position and momentum, or its precise energy at a precise time (another complementary pair), thus pulling the rug from under the uncertainty principle and wave-particle duality.

The first of these experiments was talked about informally in 1927, in hallway discussions at the fifth Solvay Conference in Brussels. Einstein put to Bohr a modified version of the famous double-slit experiment in which quantum objects – electrons, say – emerging from the twin slits are observed by shining light onto them. Photons bouncing off a particle would have their momenta changed by an amount that would reveal the particle's trajectory and, therefore, which slit it had passed through. The particle would then go on to strike the detector screen and contribute to the buildup of an interference pattern. Wave-particle duality would be circumvented, Einstein argued, because we would have simultaneously measured particlelike behavior (the trajectory the particle took) and wavelike behavior (the interference pattern on the screen).

But Bohr spotted something about this thought experiment that Einstein had overlooked. To be able to tell which slit a particle went through, you'd have to fix its position with an accuracy better than the distance between the slits. Bohr then applied Heisenberg's uncertainty principle, which demands that if you pin down the particle's position to such and such a precision, you have to give up a corresponding amount of knowledge of its momentum. Bohr said that this happens because the photons deliver random kicks as they bounce off the particle. The result of these kicks is to inject uncertainty into the whereabouts of the particle when it strikes the screen. And here's the crucial caveat: the uncertainty turns out to be roughly as large as the spacing between the interference bands. The pattern is smeared out and lost as the quantum mechanical wavefunction becomes decoherent. With this, Einstein's hoped-for contradiction disappears.

On several other occasions, Einstein confronted Bohr with thought experiments cunningly contrived to blow duality out of the water. Each time, Bohr used the uncertainty principle to exploit a loophole and win the day against his arch-rival (and, incidentally, good friend). In the battle for the future of quantum physics, Bohr defeated Einstein and, in the process, showed just how important Heisenberg's little formula was in the quantum scheme of things.

These arguments between Bohr and Einstein were never truly resolved and became ever more technical. At the sixth Solvay Conference, in 1930, the indeterminacy relation was Einstein's target of criticism. His idea contemplated an experimental apparatus, subsequently sketched by Bohr in such a way as to emphasize the essential elements and the key points he would use in his response.

In this, Einstein considers a box, sometimes called "Einstein's Box" or "Einstein's Box of Light". With this thought experiment, Einstein supposed he could prove a violation of the indeterminacy relation between time and energy. The schematic of Einstein and Bohr's apparatus is shown below:

                 "Einstein's Box of Light" - Einstein's secret weapon to destroy quantum mechanics?

Einstein described a box full of light and said that it was possible to measure both the energy E of a single photon and the time t when it was emitted. This is not allowed by a variant of Heisenberg's uncertainty principle, namely ΔE·Δt ≥ ħ/2.

Einstein said that the box could be weighed at first and then a single photon be allowed to escape through a shutter controlled by a clock inside the box. The box would then be weighed again and the mass difference 'm' determined. The energy of the photon 'E' is simply E = mc^2.

It appeared that both the photon's energy and its time of emission could be determined! This came as a bit of a shock to Bohr when he first saw it; he genuinely did not see the solution at once, and Einstein seemed at first sight to have won the battle this time, meaning that the uncertainty of quantum mechanics was going to be finally wiped out!

Bohr, after sleeping on the problem, finally realized that there was a flaw in Einstein's reasoning. When the photon is released, the box will recoil (to conserve momentum) and the position of the box in the earth's gravitational field will be uncertain. Einstein's very own general theory of relativity said that this would cause a corresponding uncertainty in the time recorded.

An illustration of this, in the context of Einstein's light box, is shown on the left. Critically, the argument depends on the clock inside the device that times the release of the photon: clocks are affected by gravity in general relativity and run slower in a gravitational field than in zero gravity. The recoil from the escaping photon leaves the box's height in the field, and hence the clock's rate, uncertain. Gravitation therefore affects the timing measurement and induces exactly the kind of uncertainty that quantum mechanics predicts. *

*(Although the light box may not be a successful tool for disproving quantum mechanics, as Einstein intended, it may nevertheless be an important way to measure certain weakly interacting effects of gravity predicted by general relativity, such as local gravitational waves and instances of spatial warping.)
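Bohr's resolution can be condensed into a short chain of relations. This is a standard textbook reconstruction rather than Bohr's own notation: weighing the box to accuracy Δm over a time T means reading a pointer position to accuracy Δx, and gravitational time dilation then does the rest.

```latex
\Delta p \gtrsim \frac{\hbar}{\Delta x}, \qquad
\Delta p \lesssim g\,\Delta m\,T, \qquad
\frac{\Delta T}{T} = \frac{g\,\Delta x}{c^{2}}
\quad\Longrightarrow\quad
\Delta T \;\gtrsim\; \frac{\hbar}{\Delta m\,c^{2}} \;=\; \frac{\hbar}{\Delta E}
```

Eliminating Δp and Δx between the first three relations (the pointer's quantum uncertainty, the impulse the weighing must resolve, and the clock's gravitational rate shift) recovers precisely the energy-time uncertainty relation Einstein had set out to beat.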

So Bohr had been saved from defeat by Einstein's forgetting his own general theory of relativity!

This was to be the last serious assault. Some thirty years after its inception at the hands of Planck, the foundations of quantum mechanics seemed to be complete, resting wholly on this uncertainty business, which Einstein never accepted; he always saw it as a kind of magician's curtain, hiding the true mechanism behind what appears to be a trick of nature itself.

Such is the version of this clash of 20th-century titans that's been dutifully repeated in textbooks and spoon-fed to physics students for many years. But evidence has recently come to light that Bohr had unwittingly hoodwinked Einstein with arguments that were fundamentally unsound. This disclosure doesn't throw quantum mechanics back into the melting pot, but it does mean that the record needs setting straight, and that the effect that really invalidates Einstein's position should be given proper credit.

The revisionist picture of the Bohr-Einstein debates stems partly from a suggestion made in 1991 by Marlan Scully, Berthold-Georg Englert, and Herbert Walther of the Max Planck Institute for Quantum Optics in Garching, Germany.[3] These researchers proposed using atoms as quantum objects in a version of Young's two-slit experiment.

Atoms have an important advantage over simpler particles, such as photons or electrons: they have a variety of internal states, including a coherent ground state (lowest energy state) and a series of decoherent excited states. These different states, the German team reckoned, could be used to track the atom's path.

This two-state example of coherence and decoherence is what allowed this formulation of a quantum version of the famous Two-Slit Experiment.

We can label the probability-amplitude wave function passing through the left hand slit in the figure ψleft and the waves passing through the right-hand slit ψright. These are coherent and show the characteristic quantum interference fringes on the detector screen (a photographic plate or CCD array). This is the case even if the intensity of particles is so low that only one particle at a time arrives at the screen.
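This superposition of path amplitudes is easy to play with numerically. The toy model below is an idealised far-field sketch (the wavenumber and slit-offset values are my own illustrative choices, not parameters of any real experiment): it adds two plane-wave amplitudes and shows that the coherent sum produces fringes, while adding intensities, as you would if the path were known, gives a flat distribution.

```python
import numpy as np

# Two-slit toy model: each slit contributes a plane wave with a
# position-dependent phase on the screen; the screen intensity is
# |psi_left + psi_right|^2 when the paths are coherent.

x = np.linspace(-5, 5, 1001)          # position on the screen (arb. units)
k, d = 2.0, 1.5                       # wavenumber and effective slit offset

psi_left  = np.exp(1j * k * d * x)    # phase accumulated via the left path
psi_right = np.exp(-1j * k * d * x)   # phase accumulated via the right path

coherent   = np.abs(psi_left + psi_right)**2             # fringes: 4 cos^2(kdx)
incoherent = np.abs(psi_left)**2 + np.abs(psi_right)**2  # path known: flat

print(coherent.min(), coherent.max())      # swings between ~0 and 4
print(incoherent.min(), incoherent.max())  # constant 2: no interference
```

The cross term between the two amplitudes is the entire interference pattern; destroy the coherence between the paths and you are left with the flat, classical sum of intensities.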

In a dramatic experimental proof of decoherence, the physicist Gerhard Rempe sent matter waves of heavy rubidium atoms through two slits. He then irradiated the left slit with microwaves that could excite the hyperfine structure in Rb atoms passing through that slit. As he turned up the intensity, the interference fringes diminished in proportion to the number of photons falling on the left slit. The photons decohere the otherwise coherent wave functions.[4]

The crucial factor in this version of the double-slit experiment is that the microwaves have hardly any momentum of their own, so they can cause virtually no change to the atom's momentum – nowhere near enough to smear out the interference pattern.

Heisenberg's uncertainty principle can't possibly play a significant hand in the outcome. Yet with the microwaves turned on so that we can tell which way the atoms went, the interference pattern suddenly vanishes. Bohr had argued that when such a pattern is lost, it happens because a measuring device gives random kicks to the particles. But there aren't any random kicks to speak of in the rubidium atom experiment; at most, the microwaves deliver momentum taps ten thousand times too small to destroy the interference bands. Yet, destroyed the bands are. It isn't that the uncertainty principle is proved wrong, but there's no way it can account for the results.

The only reason momentum kicks seemed to explain the classic double-slit experiment discussed by Bohr and Einstein turns out to be a fortunate conspiracy of numbers. There's a mechanism at work far deeper than random jolts and uncertainty. What destroys the interference pattern is the very act of trying to get information about which path is followed. The effect at work is entanglement.


Ordinarily, we think of separate objects as being independent of one another. They live on their own terms, and anything tying them together has to be forged by some tangible link. Consider two particles, A and B, which have come into contact, interacted for a brief while, and then flown apart. Each particle is described by (among other properties) its own position and momentum. The uncertainty principle insists that one of these can't be measured precisely without destroying knowledge of the other. However, because A and B have interacted and, in the eyes of quantum physics, have effectively merged to become one interconnected system, it turns out that the momentum of both particles taken together and the distance between them can be measured as precisely as we like. Suppose we measure the momentum of A, which we'll assume has remained behind in the lab where we can keep an eye on it. We can then immediately deduce the momentum of B without having to do any measurement on it at all. Alternatively, if we choose to observe the position of A, we would know, again without having to measure it, the position of B. This is true whether B is in the same room or a great distance away.

From Heisenberg's relationship, we know that measuring the position of, say, A will lead to an uncertainty in its momentum. Einstein, Podolsky, and Rosen pointed out, however, that by measuring the position of A, we gain precise knowledge of the position of B. Therefore, if we take quantum mechanics at face value, by gaining precise knowledge of its position, an uncertainty in momentum has been introduced for B. In other words, the state of B depends on what we choose to do with A in our lab. And, again, this is true whatever the separation distance may be. EPR considered such a result patently absurd. How could B possibly know whether it should have a precisely defined position or momentum? The fact that quantum mechanics led to such an unreasonable conclusion, they argued, showed that it was flawed – or, at best, that it was only a halfway house toward some more complete theory.

At the core of EPR's challenge is the notion of locality: the common sense idea that things can only be affected directly if they're nearby. To change something that's far away, there's a simple choice: you can either go there yourself or send some kind of signal. Either way, information or energy has to pass through the intervening space to the remote site in order to affect it. The fastest this can happen, according to Einstein's special theory of relativity, is the speed of light.

The trouble with entanglement is that it seems to ride roughshod over this important principle. It's fundamentally nonlocal. A measurement of particle A affects its entangled partner B instantaneously, whatever the separation distance, and without signal or influence passing between the two locations. This bizarre quantum connection isn't mediated by fields of force, like gravity or electromagnetism. It doesn't weaken as the particles move apart, because it doesn't actually stretch across space. As far as entanglement is concerned, it's as if the particles were right next to one another: the effect is as potent at a million light-years as it is at a millimeter. And because the link operates outside space, it also operates outside time. What happens at A is immediately known at B. No wonder Einstein used words such as "spook" and "telepathic" to describe – and deride – it. No wonder that as the author of relativity he argued that the tie that binds entangled particles is a physical absurdity. Any claim that an effect could work at faster-than-light speeds, that it could somehow serve to connect otherwise causally isolated objects, was to Einstein an intellectual outrage.

A close look at the EPR scenario reveals that it doesn't actually violate causality, because no information passes between the entangled particles. The information is already, as it were, built into the combined system, and no measurement can add to it. But entanglement certainly does throw locality out the window, and that development is powerfully counterintuitive. It was far too much for Einstein and his colleagues to accept, and they were firmly convinced that quantum mechanics, as it stood, couldn't be the final word. It was, they suggested, a mere approximation of some as yet undiscovered description of nature. This description would involve variables that contain missing information about a system that quantum mechanics doesn't reveal, and that tell particles how to behave before a measurement is carried out. A theory along these lines – a theory of so-called local hidden variables – would restore determinism and mark a return to the principle of locality.

The shock waves from the EPR paper quickly reached the shores of Europe. In Copenhagen, Bohr was once again cast into a fever of excitement and concern as he always was by Einstein's attacks on his beloved quantum worldview. He suspended all other work in order to prepare a counterstrike. Three months later, Bohr's rebuttal was published in the same American journal that had run the EPR paper. Basically, it argued that the nonlocality objection to the standard interpretation of quantum theory didn't represent a practical challenge. It wasn't yet possible to test it, and so physicists should just get on with using the mathematics of the subject, which worked so well, and not fret about the more obscure implications.

David Bohm's View on The EPR Paradox

                                                       David Joseph Bohm, FRS, London

Most scientists, whose interest was simply in using quantum tools to probe the structure of atoms and molecules, were happy to follow Bohr's advice. But a few theorists continued to dig away at the philosophical roots. In 1952, the American physicist David Bohm, at Birkbeck College, London, who had been hounded out of his homeland during the McCarthy "Red Scare" inquisitions, came up with a variation on the EPR experiment that paved the way for further progress in the matter.[5] Instead of using two properties, position and momentum, as in the original version, Bohm focused on just one: the property known as spin.

The spin of subatomic particles, such as electrons, is analogous to spin in the everyday world but with a few important differences. Crudely speaking, an electron can be thought of as spinning around the way a basketball does on top of an athlete's finger. But whereas spinning basketballs eventually slow down, all electrons in the universe, whatever their circumstances, spin all the time and at exactly the same rate. What's more, they can only spin in one of two directions, clockwise or counterclockwise, referred to as spin-up and spin-down.

Bohm's revised EPR thought experiment starts with the creation, in a single event, of two particles with opposite spin. This means that if we measure particle A and find that it's spin-up, then, from that point on, B must be spin-down. The only other possible result is that A is measured to be spin-down, which forces B to be spin-up. Taking this second case as an example, we're not to infer, says quantum mechanics, that A was spin-down before we measured it and therefore that B was spin-up, in a manner similar to a coin being heads or tails. Quantum interactions always produce superpositions. The state of each particle in Bohm's revised EPR scenario is a mixed superposition that we can write as: ψ = (A spin-up and B spin-down) + (A spin-down and B spin-up). A measurement to determine A's spin causes this wave function to collapse and a random choice to be made of spin-up or spin-down. At that very same moment, B also ceases to be in a superposition of states and assumes the opposite spin.
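The measurement statistics in this paragraph can be caricatured in a few lines of code. This is only a toy of the collapse rule along a single shared axis; nothing in it models the nonlocality itself.

```python
import random

# Toy sketch of Bohm's pair: measuring A along the shared axis gives
# up/down with probability 1/2 (Born rule); the collapse then forces
# B to the opposite value, run after run.

def measure_pair(rng):
    a = rng.choice(["up", "down"])       # random outcome for A
    b = "down" if a == "up" else "up"    # B assumes the opposite spin
    return a, b

rng = random.Random(0)
pairs = [measure_pair(rng) for _ in range(10_000)]

assert all(a != b for a, b in pairs)     # perfect anticorrelation
frac_up = sum(a == "up" for a, _ in pairs) / len(pairs)
print(f"A measured spin-up in {frac_up:.1%} of runs")  # ~50%
```

Each individual outcome is random, yet the pair is perfectly anticorrelated on every run, which is exactly the combination of randomness and correlation that the hidden-variable and orthodox camps explain so differently.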

This is the standard quantum mechanical view of the situation and it leads to the same kind of weird conclusion that troubled Einstein and friends. No matter how widely separated the spinning pair of particles may be, measuring the spin of one causes the wave function of the combined system to collapse instantaneously so that the unmeasured twin assumes a definite (opposite) spin state, too. The mixed superposition of states, which is the hallmark of entanglement, ensures nonlocality. Set against this is the Einsteinian view that "spooky action at a distance" stems not from limitations about what the universe is able to tell us but instead from limitations in our current knowledge of science. At a deeper, more basic level than that of wave functions and complementary properties, are hidden variables that will restore determinism and locality to physics.

John Bell's inequality

John Stewart Bell at CERN

Bohm's new version of the EPR paradox didn't in itself offer a way to test these radically different worldviews, but it set the scene for another conceptual breakthrough that did eventually lead to a practical experiment. This breakthrough came in 1964 from the Northern Irish physicist John S. Bell, who worked at CERN, the European center for high-energy particle research in Switzerland. Colleagues considered Bell to be the only physicist of his generation to rank with the pioneers of quantum mechanics, such as Niels Bohr and Max Born, in the depth of his philosophical understanding of the implications of the theory. What Bell found is that it makes an experimentally observable difference whether the particles described in the EPR experiment have definite properties before measurement, or whether they're entangled in a ghostlike hybrid reality that transcends normal ideas of space and time.

Bell's test hinges on the fact that a particle's spin can be measured independently in three directions, conventionally called x, y, and z, at right angles to one another. If you measure the spin of particle A along the x direction, for example, this measurement also affects the spin of entangled particle B in the x direction, but not in the y and z directions. In the same way, you can measure the spin of B in, say, the y direction without affecting A's spin along x or z. Because of these independent readings, it's possible to build up a picture of the complementary spin states of both particles. Being a statistical effect, lots of measurements are needed in order to reach a definite conclusion. What Bell showed is that measurements of the spin states in the x, y, and z directions on large numbers of real particles could in principle distinguish between the local hidden variable hypothesis championed by the Einstein-Bohm camp and the standard nonlocal interpretation of quantum mechanics.

If Einstein was right and particles really did always have a predetermined spin, then, said Bell, a Bohm-type EPR experiment ought to produce a certain result. If the experiment were carried out on many pairs of particles, the number of pairs of particles in which both are measured to be spin-up, in both the x and y directions ("xy up"), is always less than the combined total of measurements showing xz up and yz up. This statement became known as Bell's inequality. Standard quantum theory, on the other hand, in which entanglement and nonlocality are facts of life, would be upheld if the inequality worked the other way around. The decisive factor is the degree of correlation between the particles, which is significantly higher if quantum mechanics rules.
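Bell's counting argument is simple enough to check by simulation. The sketch below uses the Wigner/d'Espagnat counting form of the inequality (one common way of writing the "xy up" bookkeeping described above): each pair is assigned random but definite local-hidden-variable spins along x, y, and z, with the partner carrying the exact opposites.

```python
import random

# Monte Carlo check of a Bell-type inequality for local hidden
# variables: N(x+, y+) <= N(x+, z+) + N(z+, y+), where N(a+, b+)
# counts pairs with A spin-up along a and B spin-up along b.

rng = random.Random(42)
n_xy = n_xz = n_zy = 0

for _ in range(100_000):
    sx, sy, sz = (rng.choice([+1, -1]) for _ in range(3))  # particle A
    # Particle B carries the predetermined opposites: -sx, -sy, -sz.
    if sx == +1 and -sy == +1:     # A up along x, B up along y
        n_xy += 1
    if sx == +1 and -sz == +1:     # A up along x, B up along z
        n_xz += 1
    if sz == +1 and -sy == +1:     # A up along z, B up along y
        n_zy += 1

print(n_xy, "<=", n_xz + n_zy, ":", n_xy <= n_xz + n_zy)
```

The inequality holds on every run of such a model, because each count on the right covers one of the two ways (sz up or sz down) that the left-hand count can occur. Quantum correlations at suitably chosen analyser angles exceed this bound, and that excess is what the experiments discussed below went looking for.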

This was big news. Bell's inequality, although the subject of a modest little paper and hardly a popular rival to the first Beatles tour of America going on at the same time, provided a way to tell by actual experiment which of the two major, opposing visions of subatomic reality was closer to the truth.[6] Bell made no bones about what his analysis revealed: Einstein's ideas about locality and determinism were incompatible with the predictions of orthodox quantum mechanics. Bell's paper offered a clear choice between the EPR/Bohmian local hidden variables viewpoint and Bohrian, nonlocal weirdness. The way that Bell's inequality was set up, its violation would mean that the universe was inherently nonlocal*, allowing particles to form and maintain mysterious connections with each other no matter how far apart they were. All that was needed now was for someone to come along and set up an experiment to see for whom Bell's inequality tolled.

But that was easier said than done. Creating, maintaining, and measuring individual entangled particles is a delicate craft, and any imperfection in the laboratory setup masks the subtle statistical correlations being sought. Several attempts were made in the 1970s to measure Bell's inequality but none was completely successful. Then a young French graduate student, Alain Aspect, at the Institute of Optics in Orsay, took up the challenge for his doctoral research.

*  If we want to study the applicability of Classical Probability Theory, i.e. Bayes theorem, to all of quantum mechanics, beyond the Copenhagen interpretation, we have to recognize that quantum entanglement states, such as basic singlet states, are nonfactorizable and therefore will not follow the simple factorizable relations used in proving the consistency of Bayes theorem with quantum probability functions.

This is again a result of the paradox that entanglement is a non-local effect, and therefore there is no possibility of decomposing or factorizing the density of states of such an entangled quantum state locally.

Even by applying a locality condition to the Bell inequalities, in the stochastic Clauser-Horne model say, it can be shown that this (local) model, as applied to the singlet state and without using the quantum mechanical formalism, is not completely stochastic (i.e. there are possible configurations for which the model is deterministic). However, as soon as you apply the quantum mechanical formalism it becomes non-local again.

Even in experiment, the so-called Clauser-Horne inequalities, which correspond to fixed conditional probabilities for the orientations of light polarisers applied to entangled photon ensembles, show that unless the orientations were exactly identical, the different conditional probability values of the photons themselves could not be factorized.

So the paradox is based on non-locality, which is in direct contradiction to the purely local cause-and-effect framework of special relativity. Any theory combined with quantum mechanics becomes non-local; it has to, whenever any form of quantum formalism is used. EPR is purely a paradox of relativity, not of quantum mechanics.

First Indirect Evidence of EPR Entanglement 

The first efforts to relate theory and thought experiment with actual experiment came from the pioneering work of the British-Australian physicist John Clive Ward, working with the British physicist Maurice Pryce, along with that of one of the greatest experimental physicists of the 20th century, the Chinese-American physicist Chien-Shiung Wu.

Their work on formulating and experimentally verifying the probability amplitude for quantum entanglement was the first attempt to develop a way to detect such "spooky actions" in an EPR apparatus.

In a 1947 paper, published in Nature[7], Ward and Pryce were the first to calculate, and use, the probability amplitudes for the polarisation of two entangled photons moving in opposite directions.

For polarisations x and y, Ward derived this probability amplitude to be

ψ ∝ x₁y₂ − y₁x₂

which once normalised can be expressed as

ψ = (1/√2)(x₁y₂ − y₁x₂)

where 1 and 2 refer to the two quanta propagating in different directions. Ward's probability amplitude is then applied to derive the correlation of the quantum polarisations of the two photons propagating in opposite directions.

This prediction was experimentally confirmed by Wu and Shaknov in 1950.[8] In current terminology, this result corresponds to a pair of entangled photons and is directly relevant to a typical Einstein-Podolsky-Rosen (EPR) paradox.

Chien-Shiung Wu – often referred to as Madame Wu or the First Lady of Physics – from Columbia University was the first to give indirect evidence of entanglement in the laboratory.

She showed an Einstein-type correlation between the polarisation of two well-separated photons, which are tiny localised particles of light.

                                Chien-Shiung Wu at her Columbia University Physics Lab, 1963.

Madame Wu's work led to the confirmation of the Pryce and Ward calculations on the correlation of the quantum polarizations of two photons propagating in opposite directions. This was the first experimental confirmation of quantum results relevant to a pair of entangled photons as applicable to the Einstein-Podolsky-Rosen (EPR) paradox.

However, direct evidence of the EPR paradox, one which would include a complete isolation of local effects and test the nonlocality of quantum phenomena, would require a few more decades, until the laser was invented, which allowed the French physicist Alain Aspect to perform an experiment that we would recognize today as a quantum entanglement circuit.

Alain Aspect's experiment

                                                        Experimental Physicist Alain Aspect

Aspect was set upon his way by his supervising professor, Bernard d'Espagnat, whose career centered on gathering experimental evidence to uncover the deep nature of reality. "I had the luck," said d'Espagnat, "to discover in my university a young French physicist, Alain Aspect, who was looking for a thesis subject, and I suggested that testing the Bell inequalities might be a good idea. I also suggested that he go and talk to Bell, who convinced him it was a good idea, and the outcome of this was that quantum mechanics won."

Aspect's experiment used particles of light – photons – rather than material particles such as electrons or protons. Then, as now, photons are by far the easiest quantum objects from which to produce entangled pairs. There is, however, a minor complication concerning the property that is actually recorded in photon-measuring experiments such as Aspect's or those of other researchers we'll be talking about later. Both Bell and Bohm presented their theoretical arguments in terms of particle spin. Photons do have a spin (they're technically known as spin-1 particles), but because they travel at the speed of light, their spin axes always lie exactly along their direction of motion, like that of a spinning bullet shot from a rifle barrel. You can imagine photons to be right-handed or left-handed depending on which way they rotate as you look along their path of approach. What's actually measured in the lab isn't spin, however, but the very closely related property of polarization.

Effectively, polarization is the wavelike property of light that corresponds to the particlelike property of spin. Think of polarization in terms of Maxwell's equations, which tell us that the electric and magnetic fields of a light wave oscillate at right angles to each other and also to the direction in which the light is traveling. The polarization of a photon is the direction of the oscillation of its electric field: up and down, side to side, or any orientation in between. Ordinarily, light consists of photons polarized every which way. But if light is passed through a polarizing filter, like that used in Polaroid sunglasses, only photons with a particular polarization – the one that matches the slant of the filter – can get through. (The same happens if two people make waves by flicking one end of a rope held between them. If they do this through a gap between iron railings only waves that vibrate in the direction of the railings can slip through to the other side.)
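The rope-through-railings picture can be made quantitative with Malus's law: a photon polarized at angle θ passes a filter oriented at angle φ with probability cos²(θ − φ) (for classical light, this is the transmitted intensity fraction). A minimal sketch:

```python
import math

# Malus's law: transmission probability for a photon polarized at
# theta meeting a polarizing filter oriented at phi.

def pass_probability(theta_deg, phi_deg):
    delta = math.radians(theta_deg - phi_deg)
    return math.cos(delta) ** 2

print(pass_probability(0, 0))    # aligned filters: always passes
print(pass_probability(0, 90))   # crossed filters: essentially never
print(pass_probability(0, 45))   # at 45 degrees: half the time
```

It is precisely these angle-dependent pass probabilities, compared across pairs of analysers, that carry the correlations Bell's inequality puts a bound on.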

Aspect designed his experiment to examine correlations in the polarization of photons produced by calcium atoms – a technique that had already been used by other researchers. He shone laser light onto the calcium atoms, which caused the electrons to jump from the ground state to a higher energy level. As the electrons tumbled back down to the ground state, they cascaded through two different energy states, like a two-step waterfall, emitting a pair of entangled photons – one photon per step – in the process.

                                                  Illustration of the Aspect Experiment

The photons passed through a slit, known as a collimator, designed to reduce and guide the light beam. Then they fell into an automatic switching device that randomly sent them in one of two directions before arriving, in each case, at a polarization analyzer – a device that recorded their polarization state.

An important consideration in Aspect's setup was the possibility, however small, that information might leak from one photon to its partner. It was important to rule out a scenario in which a photon arrived at a polarization analyzer, found that polarization was being measured along say the vertical direction, and then somehow communicated this information to the other photon. (How this might happen doesn't matter: the important thing was to exclude it as an option.) By carefully setting up the distances through which the photons traveled and randomly assigning the direction in which the polarization would be measured while the photons were in flight, Aspect ensured that under no circumstances could such a communicating signal be sent between photons. The switches operated within 10 nanoseconds, while the photons took 20 nanoseconds to travel the 6.6 meters to the analyzers. Any signal crossing from one analyzer to the other at the speed of light would have taken 40 nanoseconds to complete the journey – much too long to have any effect on the measurement.
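The timing argument is just arithmetic with the speed of light, and it is worth checking against the figures quoted above (which are rounded; the exact values come out at roughly 22 ns and 44 ns):

```python
# Arithmetic behind Aspect's locality argument, using the distances
# quoted in the text: photons fly 6.6 m from source to each analyzer,
# and any light-speed signal between the two analyzers would have to
# cross twice that distance.

c = 299_792_458.0              # speed of light, m/s

photon_flight = 6.6 / c        # source -> analyzer
cross_talk    = 2 * 6.6 / c    # analyzer -> analyzer

print(f"photon flight time : {photon_flight / 1e-9:.0f} ns")  # ~22 ns
print(f"signal crossing    : {cross_talk / 1e-9:.0f} ns")     # ~44 ns
```

Since the switches reassign the measurement direction in about 10 ns, faster than any analyzer-to-analyzer signal could arrive, the choice of measurement basis is safely outside the light cone of the other detection event.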

In a series of these experiments in the 1980s, Aspect's team showed what most quantum theorists expected all along: Bell's inequality was violated.[9] The result agreed completely with the predictions of standard quantum mechanics and discredited any theories based on local hidden variables. More recent work has backed up this conclusion. What's more, these newer experiments have included additional refinements designed to plug any remaining loopholes in the test. For example, special crystals have enabled experimenters to produce entangled photons that are indistinguishable, because each member of the pair has the same wavelength. Such improvements have allowed more accurate measurements of the correlation between the photons. In all cases, however, the outcomes have upheld Aspect's original discovery. Entanglement and nonlocality are indisputable facts of the world in which we live, even if we may find them uncomfortable or "spooky", as Einstein himself did.

Einstein found it "spooky" because the experiment is a paradox of special relativity, which is based on the idea of local causality, i.e. for a given event in space-time, cause and effect must be separated by a time lapse that depends, through the speed of light, on the distance between them. In quantum entanglement, a measurement of, say, the spin of one particle allows you to know the spin of the other particle without disturbing it, such that the measurement of one particle imposes an equal but opposite property, i.e. spin, on the other.

This is a paradox of special relativity because no event can affect another in a non-local way; influences must be mediated by signals. However, the entangled particles, separated in space, are not separated in time. They both share a common time, although mathematically the time of one particle is real and the other is complex (i.e. expressed using complex or imaginary numbers). So in unifying quantum mechanics with special relativity we get scenarios where, instead of the states being predetermined, one state measures time in a negative frame relative to the other, and so they cancel each other out.

This is why, in drawing Feynman diagrams in space-time, it appears that antimatter particles move backwards in time relative to the matter particles, in pair production events for example. The particles do not travel backwards in time; this is just a consequence of unifying a local theory, special relativity, with a non-local theory, quantum mechanics. The paradox lies within this: why should the local nature vanish? We know "how" to interpret it but we cannot really know the "why".

Niels Bohr felt that we have no real right to know the "why", which displeased Einstein, as all of his theories relied on a local space-time, and to accept quantum correlations means having to violate the cosmic speed limit. Setting up the quantum systems, however, is determined by local events: the particles must, after all, be carried about under local cause-and-effect events limited by the speed of light. So the paradox can be partly ironed out by what sets up the system, i.e. me bringing one entangled partner to the Moon and leaving the other on Earth. But as for the determination of the states of the particles themselves, in special relativity alone the paradox remains, so we must interpret it using quantum mechanical non-locality.

It's kind of like saying "we see the world is flat around us, in a local frame, but by travelling around it, in a non-local frame, we know it is round". Experimental determination of this effect, which has now been done for almost 30 years, is our equivalent of determining the roundness of the world despite it locally appearing flat.

Applications of Quantum Entanglement

The phenomenon of entanglement has already begun to be exploited for practical purposes. In the late 1980s, theoreticians started to see entanglement not just as a puzzle and a way to penetrate more deeply into the mysteries of the quantum world, but also as a resource. Entanglement could be exploited to yield new forms of communication and computing. It was a vital missing link between quantum mechanics and another field of explosive growth: information theory. The proof of nonlocality and the quickly evolving ability to work with entangled particles in the laboratory were important factors in the birth of a new science. Out of the union of quantum mechanics and information theory sprang quantum information science – the fast-developing field whose most important fields of development are quantum cryptography, quantum teleportation, and quantum computers.

To see some of the applications of quantum entanglement in the context of quantum computer technology, see my article on quantum computer physics and architecture.

To see a nice example of merging quantum information theory with game theory see my short article here


Einstein, A., B. Podolsky, and N. Rosen. "Can quantum-mechanical description of physical reality be considered complete?" Physical Review 47 (1935): 777-80.

Schrödinger, E. "Discussion of probability relations between separated systems." Proceedings of the Cambridge Philosophical Society 31 (1935): 555-63.

Scully, M. O., B. G. Englert, and H. Walther. "Quantum optical tests of complementarity." Nature 351 (1991): 111-16.

Dürr, S., T. Nonn, and G. Rempe. "Origin of quantum-mechanical complementarity probed by a 'which-way' experiment in an atom interferometer." Nature 395 (1998): 33.

Bohm, D. "A suggested reinterpretation of quantum theory in terms of hidden variables." Physical Review 85 (1952): 611-23.

Bell, J. S. "On the Einstein-Podolsky-Rosen paradox." Physics 1 (1964): 195-200.

Pryce, M. H. L., and J. C. Ward. "Angular correlation effects with annihilation radiation." Nature 160 (1947): 435.

Wu, C. S., and I. Shaknov. "The angular correlation of scattered annihilation radiation." Physical Review 77 (1950): 136.

Aspect, A., P. Grangier, and G. Roger. "Experimental tests of realistic local theories via Bell's theorem." Physical Review Letters 47 (1981): 460.

Quantum Entanglement Documentary Film, on which this article is based: