I revisit the original proof of Bell’s theorem and give a thorough discussion of the assumptions that have been made. This shows that Bell’s inequality is violated in general not only by quantum systems but also by simple classical systems with correlation. To get a grasp on the topic I will start with the general notion of entanglement.

Entanglement isn’t about interaction or information transfer between entangled particles.

Consider spin entanglement of two spin-$\frac{1}{2}$ particles.

Let them be in a singlet state relative to an arbitrary axis (say the z-axis):

$$ |\Psi \rangle = \frac{1}{\sqrt{2}} (\ |\uparrow_z, \downarrow_z \rangle - |\downarrow_z,\uparrow_z\rangle \ ) $$

The **joint** probability $P$ of measuring both particles in a state $|i,j \rangle$ with $i,j \in \{ \uparrow, \downarrow \}$, where the axes of the two measurements enclose the angle $\theta$, is given by

$$ P_{i,j} = | \langle i,j | \Psi \rangle |^2 = \frac{1}{4} (1 - i \cdot j \cdot \cos \theta )$$

if we take $i,j$ to be 1 and -1 for $\uparrow$ and $\downarrow$, respectively.
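The closed form can be checked against the singlet amplitudes directly. The following is a minimal sketch under stated assumptions, not part of the original argument: the first axis is taken to be the z-axis, the second axis is tilted by $\theta$ in the x-z plane, and the helper names `spin_state` and `joint_probability` are made up for this illustration.

```python
import math

def spin_state(sign, theta):
    """Spin-1/2 eigenstate (z-basis components) along an axis tilted by theta
    in the x-z plane; sign = +1 means 'up', sign = -1 means 'down'."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return (c, s) if sign == 1 else (-s, c)

def joint_probability(i, j, theta):
    """|<i, j | Psi>|^2 for the singlet; axis 1 is z, axis 2 is tilted by theta."""
    u = spin_state(i, 0.0)    # particle 1, measured along z
    v = spin_state(j, theta)  # particle 2, measured along the tilted axis
    # Singlet: (|up,down> - |down,up>) / sqrt(2); all amplitudes are real here.
    amplitude = (u[0] * v[1] - u[1] * v[0]) / math.sqrt(2)
    return amplitude ** 2

theta = 1.1  # arbitrary test angle in radians
for i in (1, -1):
    for j in (1, -1):
        closed_form = 0.25 * (1 - i * j * math.cos(theta))
        print(i, j, joint_probability(i, j, theta), closed_form)
```

The two printed columns should agree for every pair $(i,j)$, which is just the statement $\frac{1}{2}\sin^2\frac{\theta}{2} = \frac{1}{4}(1-\cos\theta)$ and $\frac{1}{2}\cos^2\frac{\theta}{2} = \frac{1}{4}(1+\cos\theta)$.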

The **reduced** probability $p_i$ of measuring only one particle (e.g. if we don’t care about the other) is given by:

$$ p_i = \sum_{j \in \{1,-1\}} P_{i,j} = \frac{1}{2} $$

The **conditional** probability of measuring the other particle (**after** we already know the result of the first measurement) is given by:

$$ \tilde{p}_ {j|i} = \frac{P_{i,j}}{p_i} = \frac{1}{2} (1 - i \cdot j \cdot \cos \theta ) $$

This does involve the angle $\theta$, and usually one starts here to argue about non-locality and instantaneous actions changing the outcome of the experiment when we change the angle $\theta$ at the first measurement apparatus.

This is, however, not valid. If we are talking about **conditional** probabilities, we have already performed a measurement and thereby fixed the axis of the first measurement. Changing this axis afterwards will not affect the probability, because the angle $\theta$ is relative to the **measured** axis. Changing the axis of the second measurement only changes the probability with which the first observer predicts the outcome of the later measurement, because he has that extra knowledge.

The probability for the second observer stays the same, because this is the **reduced** probability (he doesn’t know about the first measurement):

$$ p_j = \sum_{i \in \{1,-1\}} P_{i,j} = \frac{1}{2} $$

In short: Without the extra knowledge of the first measurement, entanglement is not important for the second observer. To gain that extra knowledge there must be an additional information transfer to the second observer and this is restricted by means of relativity-causality ($v\le c$ etc.). So entanglement neither breaks causality nor can it transfer any information.
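The distinction between the three probabilities can be made concrete with a few lines of Python. This is only an illustrative sketch; the function names `joint`, `reduced_second` and `conditional` are hypothetical and simply mirror $P_{i,j}$, $p_j$ and $\tilde{p}_{j|i}$ from above.

```python
import math

def joint(i, j, theta):
    """Joint probability P_{i,j} from the text, with i, j in {+1, -1}."""
    return 0.25 * (1 - i * j * math.cos(theta))

def reduced_second(j, theta):
    """Reduced probability for observer 2: sum over the unknown outcome i."""
    return sum(joint(i, j, theta) for i in (1, -1))

def conditional(j, i, theta):
    """Probability of outcome j given that outcome i is already known."""
    p_i = sum(joint(i, k, theta) for k in (1, -1))
    return joint(i, j, theta) / p_i

# The reduced probability ignores theta; the conditional one does not.
for theta in (0.0, math.pi / 3, math.pi / 2, 2.5):
    print(theta, reduced_second(1, theta), conditional(1, 1, theta))
```

The second column stays at $\frac{1}{2}$ for every angle, while the third column follows $\frac{1}{2}(1-\cos\theta)$. Only an observer who has been told the first result can make use of the third column.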

Sometimes one comes across the argument that the violation of Bell’s inequality shows that entanglement is still something more than classical perception would allow. So let’s have a look at a certain expectation value. The axes for the spin measurements shall be labeled by normalized vectors $\vec{a}$ and $\vec{b}$ such that $\vec{a}\cdot\vec{b} = \cos\theta$. Consider

\begin{equation}

\langle \Psi|(\vec{a}\cdot\vec{S}_1) \, \, (\vec{b} \cdot \vec{S}_2) | \Psi \rangle = -\frac{\hbar^2}{4}\vec{a}\cdot\vec{b} = -\frac{\hbar^2}{4} \cos\theta

\tag{1}

\end{equation}

which is the expectation value of the product of both measurement results. Here we have $\vec{S} = \frac{\hbar}{2}(\sigma_x, \sigma_y, \sigma_z)^T$ with $\sigma_x, \sigma_y, \sigma_z$ the Pauli matrices. We now follow the reasoning of John Bell in his original work since other, similar inequalities are based on the same problem.
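Equation **(1)** can be verified with explicit Pauli matrices. The sketch below is a hedged illustration that uses only the Python standard library; the helper names (`dot_sigma`, `kron`, `singlet_expectation`), the basis ordering and the default $\hbar = 1$ are assumptions made for this example.

```python
import math

def dot_sigma(n):
    """2x2 matrix n . sigma for a real 3-vector n = (nx, ny, nz)."""
    nx, ny, nz = n
    return [[nz, complex(nx, -ny)],
            [complex(nx, ny), -nz]]

def kron(A, B):
    """Kronecker product of two 2x2 matrices, giving a 4x4 matrix."""
    return [[A[i][j] * B[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]

def singlet_expectation(a, b, hbar=1.0):
    """<Psi| (a.S_1)(b.S_2) |Psi> for the singlet, with S = hbar/2 * sigma."""
    r = 1 / math.sqrt(2)
    psi = [0.0, r, -r, 0.0]  # basis order: up-up, up-down, down-up, down-down
    M = kron(dot_sigma(a), dot_sigma(b))
    value = sum(psi[m].conjugate() * M[m][n] * psi[n]
                for m in range(4) for n in range(4))
    return (hbar ** 2 / 4) * value.real

theta = 0.8  # arbitrary angle between the two axes
a = (0.0, 0.0, 1.0)
b = (math.sin(theta), 0.0, math.cos(theta))
print(singlet_expectation(a, b), -0.25 * math.cos(theta))
```

Both printed numbers should coincide, and the same holds for axes that do not lie in the x-z plane, since the singlet state is rotationally invariant.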

The argument goes like this: Assume a classical, statistical system with non-hidden and hidden variables all labeled by $\vec{\lambda} = (\lambda_1, \dots, \lambda_n)$ for some $n\in\mathbb N$. Furthermore, there are two functions $A(\vec{a},\vec{\lambda})$ and $B(\vec{b},\vec{\lambda})$, which give the results of spin measurements on particle 1 and 2, respectively. They can only yield $\pm\frac{\hbar}{2}$, since that is the only outcome of the experiment. Those functions depend on one measurement axis **only**, because there shall be no action between measurement apparatus 1 and measurement apparatus 2 (this is the assumed **locality**).

Because the system is studied on a statistical basis, there exists a probability density $ \varrho(\vec{\lambda}) $ which is a function of the system parameters $\vec{\lambda}$. It allows for the calculation of the expectation value

$$ E(\vec{a},\vec{b}) = \int \varrho(\vec{\lambda}) \cdot A(\vec{a},\vec{\lambda}) B(\vec{b},\vec{\lambda}) \ d^n\lambda $$

which should equal the one from above **(1)** if it is to be interpreted on a classical and local basis (note: one can incorporate discrete statistical variables $c_j$ by terms like $\sum_j \alpha_j \cdot \delta(c_j-\lambda_m)$). The problematic assumption here is that $\varrho$ is not a function of the axis vectors $\vec{a}$ and $\vec{b}$. Such a dependence is, however, quite natural for classical systems with correlation. The point is: by allowing $\varrho(\vec{\lambda}, \vec{a}, \vec{b})$ or even just $\varrho(\vec{\lambda}, \vec{a} \cdot \vec{b})$, Bell’s inequality **cannot** be derived! Such probability densities **can** cause a violation of the inequality. To understand this, I will now derive the inequality and point out the steps which are not possible with the modified density.

Assume

$$ E(\vec{a},\vec{b}) = -\frac{\hbar^2}{4} \vec{a} \cdot \vec{b} \tag2 $$

so that the quantum mechanical description is in agreement with the classical one. For $\vec{a} = \vec{b}$:

\begin{equation}

\begin{aligned}

-\frac{\hbar^2}{4} & = \int \underbrace{\varrho(\vec{\lambda})}_ {\ge 0} \cdot \underbrace{A(\vec{a},\vec{\lambda}) B(\vec{a},\vec{\lambda})}_ {\ge -\frac{\hbar^2}{4}} \, d^n\lambda \newline & \Leftrightarrow \newline

0 & = \int \underbrace{\varrho(\vec{\lambda})}_ {\ge 0} \cdot \left( \underbrace{A(\vec{a},\vec{\lambda}) B(\vec{a},\vec{\lambda}) + \frac{\hbar^2}{4}}_{\ge 0} \right) \, d^n\lambda

\end{aligned}

\end{equation}

because $\varrho$ is a normalized probability density. It follows that

\begin{equation}

\begin{aligned}

A(\vec{a},\vec{\lambda}) B(\vec{a},\vec{\lambda}) = -\frac{\hbar^2}{4}

\end{aligned}

\end{equation}

is a valid identity when multiplied by $\varrho$, since a non-negative integrand whose integral vanishes must itself vanish. This can only hold if

\begin{equation}

\begin{aligned}

B(\vec{a},\vec{\lambda}) = \, - A(\vec{a},\vec{\lambda})

\end{aligned}

\tag3

\end{equation}

Do note that this holds for **any** vector $\vec{a}$. Now take another normalized vector $\vec{c}$ and do the following calculations:

\begin{align}

\frac{\hbar^2}{4}|(-\vec{a}\cdot\vec{b}) - (-\vec{a}\cdot\vec{c})| & = |E(\vec{a},\vec{b}) - E(\vec{a},\vec{c}) | \newline

& = \left| - \int \varrho(\vec{\lambda}) \cdot (A(\vec{a},\vec{\lambda}) A(\vec{b},\vec{\lambda}) - A(\vec{a},\vec{\lambda}) A(\vec{c},\vec{\lambda})) \, d^n\lambda \right| \newline

& = \left| \int \varrho(\vec{\lambda}) \cdot A(\vec{a},\vec{\lambda}) A(\vec{b},\vec{\lambda}) \cdot (1 - \frac{4}{\hbar^2}A(\vec{b},\vec{\lambda}) A(\vec{c},\vec{\lambda})) \, d^n\lambda \right| \newline

& \le \int | \varrho(\vec{\lambda}) | \cdot | A(\vec{a},\vec{\lambda}) A(\vec{b},\vec{\lambda}) | \cdot |1 - \frac{4}{\hbar^2}A(\vec{b},\vec{\lambda}) A(\vec{c},\vec{\lambda})| \, d^n\lambda \newline

& = \int \varrho(\vec{\lambda}) \cdot (\frac{\hbar^2}{4} - A(\vec{b},\vec{\lambda}) A(\vec{c},\vec{\lambda})) \, d^n\lambda \newline

& = \frac{\hbar^2}{4} + E(\vec{b},\vec{c}) = \frac{\hbar^2}{4} - \frac{\hbar^2}{4}\vec{b}\cdot\vec{c} \tag4

\end{align}

In the first equality we used **(2)**. In the second we used **(3)**. In the third we used $A(\vec{b},\vec{\lambda})^2 = \frac{\hbar^2}{4}$. The fourth step is the triangle inequality for integrals. In the fifth step we used $A(\vec{a},\vec{\lambda}) A(\vec{b},\vec{\lambda}) = \pm \frac{\hbar^2}{4}$ and $\varrho(\vec{\lambda}) \ge 0$. In the last step we used **(2)** and the fact that $\varrho$ is normalized.
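The middle part of **(4)**, i.e. $|E(\vec{a},\vec{b}) - E(\vec{a},\vec{c})| \le \frac{\hbar^2}{4} + E(\vec{b},\vec{c})$, uses only **(3)**, $A^2 = \frac{\hbar^2}{4}$ and a density that does not depend on the axes. It can therefore be tested on any toy model of that kind. The sketch below uses a common textbook-style model that is **not** taken from this text and does not reproduce **(2)**: $\lambda$ is an angle with uniform density on a circle, approximated by a grid, and $A(\vec{a},\lambda)$ is the sign of $\cos(\lambda - \alpha)$ in units of $\frac{\hbar}{2}$. All names are hypothetical.

```python
import math

N = 3600  # grid points approximating a uniform, axis-independent density

def A(alpha, lam):
    """Outcome for particle 1 in units of hbar/2: always +1 or -1."""
    return 1 if math.cos(lam - alpha) >= 0 else -1

def E(alpha, beta):
    """Expectation of A * B with B = -A (property (3)) and a fixed density."""
    lams = (2 * math.pi * (k + 0.5) / N for k in range(N))
    return -sum(A(alpha, lam) * A(beta, lam) for lam in lams) / N

def bound_holds(alpha, beta, gamma):
    """Check |E(a,b) - E(a,c)| <= 1 + E(b,c), in units where hbar^2/4 = 1."""
    lhs = abs(E(alpha, beta) - E(alpha, gamma))
    return lhs <= 1 + E(beta, gamma) + 1e-12  # small slack for rounding

# Axis directions given as angles in a plane (radians).
print(bound_holds(0.0, math.pi / 3, 2 * math.pi / 3))
```

For this model the bound holds for every choice of the three angles, because $|A_b - A_c| = 1 - A_b A_c$ holds pointwise for values $\pm 1$ and the same weight multiplies both terms. This is exactly the step that fails once the weight is allowed to depend on the axes.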

So we finally have Bell’s inequality

\begin{equation}

\begin{aligned}

|\vec{a}\cdot\vec{b} - \vec{a}\cdot\vec{c}| + \vec{b}\cdot \vec{c} \le 1

\end{aligned}

\tag5

\end{equation}

which can be violated for some choice of $\vec{a},\vec{b},\vec{c}$. This usually shows that our first assumption **(2)** is false. Therefore, no classical, local system should be able to describe the expectation value **(1)**.
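A concrete violating choice is easy to write down. As a small illustration (the angles are picked for this example and are not taken from the text), let $\vec{a},\vec{b},\vec{c}$ lie in one plane at $0^\circ$, $60^\circ$ and $120^\circ$:

```python
import math

def unit(angle_deg):
    """Unit vector in the plane at the given angle (degrees)."""
    t = math.radians(angle_deg)
    return (math.cos(t), math.sin(t))

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def bell_lhs(a, b, c):
    """Left-hand side of inequality (5): |a.b - a.c| + b.c."""
    return abs(dot(a, b) - dot(a, c)) + dot(b, c)

a, b, c = unit(0), unit(60), unit(120)
print(bell_lhs(a, b, c), "exceeds 1:", bell_lhs(a, b, c) > 1)
```

Here $\vec{a}\cdot\vec{b} = \frac{1}{2}$, $\vec{a}\cdot\vec{c} = -\frac{1}{2}$ and $\vec{b}\cdot\vec{c} = \frac{1}{2}$, so the left-hand side of **(5)** is $\frac{3}{2}$ up to rounding.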

With the modified probability density the steps in **(4)** look like this:

\begin{align}

\frac{\hbar^2}{4}|(-\vec{a}\cdot\vec{b}) - (-\vec{a}\cdot\vec{c})| & = |E(\vec{a},\vec{b}) - E(\vec{a},\vec{c}) | \notag \newline

& = \left| - \int \left( \varrho(\vec{\lambda}, \vec{a}, \vec{b}) \cdot A(\vec{a},\vec{\lambda}) A(\vec{b},\vec{\lambda}) - \varrho(\vec{\lambda}, \vec{a}, \vec{c}) \cdot A(\vec{a},\vec{\lambda}) A(\vec{c},\vec{\lambda}) \right) \, d^n\lambda \right| \notag \newline

& = \left| \int A(\vec{a},\vec{\lambda}) A(\vec{b},\vec{\lambda}) (\varrho(\vec{\lambda}, \vec{a}, \vec{b}) - \varrho(\vec{\lambda}, \vec{a}, \vec{c}) \frac{4}{\hbar^2}A(\vec{b},\vec{\lambda}) A(\vec{c},\vec{\lambda})) \, d^n\lambda \right| \notag \newline

& \le \int \frac{\hbar^2}{4} \cdot \left| \varrho(\vec{\lambda}, \vec{a}, \vec{b}) - \varrho(\vec{\lambda}, \vec{a}, \vec{c}) \frac{4}{\hbar^2}A(\vec{b},\vec{\lambda}) A(\vec{c},\vec{\lambda}) \right| \, d^n\lambda

\end{align}

Note that one **cannot** proceed from here, since in general $\varrho(\vec{\lambda}, \vec{a},\vec{b}) \ne \varrho(\vec{\lambda}, \vec{a},\vec{c})$. Moreover, the second equality is not justified here anyway, since **(3)** is only valid when multiplied by $\varrho(\vec{\lambda},\vec{a},\vec{a})$ and **not** by $\varrho(\vec{\lambda},\vec{a},\vec{c})$. For instance, wherever $\varrho(\vec{\lambda},\vec{a},\vec{a}) = 0$, equation **(3)** can be violated. Nevertheless, the only thing left to try is another triangle inequality on the term $|\dots|$, leaving us finally with the inequality

\begin{equation}

\begin{aligned}

|\vec{a}\cdot\vec{b} - \vec{a}\cdot\vec{c}| \le 2 \, ,

\end{aligned}

\end{equation}

which cannot be violated by any choice of $\vec{a},\vec{b},\vec{c}$.

In summary: If one allows probability densities $\varrho(\vec{\lambda}, \vec{a}, \vec{b})$ that depend on some parameters of the measurement, the derivation of an inequality which is violated by quantum mechanical expectation values is not possible in the usual way. Above I already argued that the dependence on $\vec{a}, \vec{b}$ is in general no cause for non-local behavior as long as the **reduced** probability of a subsystem depends only on its own parameters. It is natural to assume that the outcome of an experiment may change when the apparatus is altered. Hence, the probability of that outcome can in principle depend on the measurement parameters. This problem is inherent to inequalities that are based on the same arguments and assumptions as Bell’s inequality: see for example the CHSH-inequality on page 527 equation 2, which is frequently used in experiments!

So if we find some functions $A$ and $B$ that satisfy our locality conditions from above, there is no reason to think of the expectation value **(1)** as a non-local one. Take

\begin{align}

p_{i,j}(\vec{a},\vec{b}) & = \frac{1}{4} (1 - i \cdot j \cdot \vec{a}\cdot\vec{b}) \newline

A(i,\vec{a}) & = \frac{\hbar}{2} \ i \newline

B(j,\vec{b}) & = \frac{\hbar}{2} \ j

\end{align}

Then we have

$$ E(\vec{a}, \vec{b}) = \sum_{i,j \in \{1,-1 \}} p_{i,j}(\vec{a},\vec{b}) \cdot A(i,\vec{a}) B(j,\vec{b}) = ~ - \frac{\hbar^2}{4} \ \vec{a}\cdot\vec{b} = ~ - \frac{\hbar^2}{4} \ \cos\theta$$

which equals **(1)** on a purely local and classical basis.
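The sum can be evaluated explicitly as a check. The sketch below is a minimal illustration with hypothetical function names; it takes $p_{i,j}$, $A$ and $B$ exactly as defined above, with $\hbar = 1$ as an assumed default, and also confirms that the reduced probability of one subsystem does not depend on the other axis.

```python
import math

def p(i, j, a, b):
    """Axis-dependent weight p_{i,j}(a, b) = (1 - i*j*a.b) / 4."""
    d = sum(x * y for x, y in zip(a, b))
    return 0.25 * (1 - i * j * d)

def E(a, b, hbar=1.0):
    """Sum over i, j of p_{i,j}(a, b) * A(i) * B(j), with A = hbar/2 * i, B = hbar/2 * j."""
    return sum(p(i, j, a, b) * (hbar / 2 * i) * (hbar / 2 * j)
               for i in (1, -1) for j in (1, -1))

def reduced_first(i, a, b):
    """Reduced probability of outcome i for subsystem 1 (sum over j)."""
    return sum(p(i, j, a, b) for j in (1, -1))

theta = 1.3  # arbitrary angle between the axes
a = (0.0, 0.0, 1.0)
b = (math.sin(theta), 0.0, math.cos(theta))
print(E(a, b), -0.25 * math.cos(theta), reduced_first(1, a, b))
```

The first two numbers agree, and the third equals $\frac{1}{2}$ regardless of $\vec{b}$.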