Kepler's Laws Derived from Newton's Laws

In this appendix we will derive Kepler's laws from Newton's laws of motion and his law of universal gravitation. Below are the three laws that were derived empirically by Kepler.

Mathematical Preliminaries

Consider a Cartesian coordinate system with the sun at the origin. Let $(x,y,z)$ denote the position of a planet. Clearly $x$, $y$, and $z$ are functions of the time $t$. We define the position vector $\boldsymbol{r}$, the velocity vector $\boldsymbol{v}$, and the acceleration vector $\boldsymbol{a}$ by $$\boldsymbol{r}=x\boldsymbol{i}+y\boldsymbol{j}+z\boldsymbol{k},$$ $$\boldsymbol{v}=\dot{x}\boldsymbol{i}+\dot{y}\boldsymbol{j}+\dot{z}\boldsymbol{k},$$ $$\boldsymbol{a}=\ddot{x}\boldsymbol{i}+\ddot{y}\boldsymbol{j}+\ddot{z}\boldsymbol{k}.$$

Here the dots represent differentiation with respect to time and $\boldsymbol{i}$, $\boldsymbol{j}$, $\boldsymbol{k}$ are the unit vectors in the $x$, $y$, $z$ directions respectively. Newton's law of motion can be written \begin{equation} \boldsymbol{F}=m\boldsymbol{a} \label{newtlaw} \end{equation}

where $m$ is the mass of the planet and $\boldsymbol{F}$ is the force on the planet. Let $\hat{\boldsymbol{r}}$ be a unit vector in the $\boldsymbol{r}$ direction. Then Newton's law of gravitation applied to the earth and sun is given by \begin{align} \boldsymbol{F}&=-\frac{GMm}{r^2}\,\hat{\boldsymbol{r}}\nonumber\\&=-\frac{GMm}{r^3}\,\boldsymbol{r}\label{newtgrav} \end{align}

where $G$ is a constant, $M$ is the mass of the sun, and $r$ is the magnitude of $\boldsymbol{r}$. Combining equations \eqref{newtlaw} and \eqref{newtgrav}, we get \begin{equation} \boldsymbol{a}=\ddot{\boldsymbol{r}}=-\frac{GM}{r^3}\,\boldsymbol{r}. \label{accel} \end{equation}

Planet moves in a plane

By the product rule for differentiation \begin{equation*} \frac{d}{dt}\,(\boldsymbol{r}\times\boldsymbol{v})=\boldsymbol{v}\times\boldsymbol{v}+\boldsymbol{r}\times\boldsymbol{a}=0 \end{equation*}

since $\boldsymbol{a}$ is in the same direction as $\boldsymbol{r}$ by equation \eqref{accel}. Here the symbol $\times$ represents the vector cross-product. Thus, the vector \begin{equation*} \boldsymbol{h}=\boldsymbol{r}\times\boldsymbol{v} \end{equation*}

is a constant. It follows that $\boldsymbol{r}$ and $\boldsymbol{v}$ lie in the plane orthogonal to $\boldsymbol{h}$. We will choose our coordinate system so that $\boldsymbol{k}$ is in the direction $\boldsymbol{h}$. Thus, \begin{equation} \boldsymbol{h}=\boldsymbol{r}\times\boldsymbol{v}=h\boldsymbol{k}\qquad h\gt 0. \label{hdef} \end{equation}

Kepler's Second Law

Figure 10 shows the area swept out by the position vector in a small increment of time $\Delta t$. $\Delta\theta$ is the small change of angle.

The area OAB is approximately equal to the area of the right triangle OAC for small $\Delta\theta$. Since the length of the line AC is approximately $r\Delta\theta$ and the length of the line OC is approximately $r$, we have \begin{equation*} \Delta A\doteq \tfrac{1}{2}\,r^2\Delta\theta. \end{equation*}

Dividing both sides by $\Delta t$ and letting $\Delta t$ approach zero, we see that \begin{equation} \dot{A}=\tfrac{1}{2}\,r^2\dot{\theta}. \label{Adot} \end{equation}

Since the planet moves in the $xy$ plane, we have \begin{align} \boldsymbol{r}&=x\boldsymbol{i}+y\boldsymbol{j}\nonumber\\&=(r\cos \theta)\boldsymbol{i}+(r\sin\theta)\boldsymbol{j} \label{rij} \end{align}

where the polar coordinates $r$ and $\theta$ are functions of $t$. The time derivative of $\boldsymbol{r}$ is given by \begin{equation} \boldsymbol{v}=(\dot{r}\cos\theta-r\sin\theta\;\dot{\theta})\boldsymbol{i}+(\dot{r}\sin\theta+r\cos\theta\;\dot{\theta})\boldsymbol{j}.\label{vij} \end{equation}

Substituting equations \eqref{rij} and \eqref{vij} into equation \eqref{hdef}, we obtain \begin{align} h&=r\cos\theta(\dot{r}\sin\theta+r\cos\theta\,\dot{\theta})- r\sin\theta(\dot{r}\cos\theta-r\sin\theta\,\dot{\theta})\nonumber\\ &=r^2\dot{\theta}. \label{hthetadot} \end{align}

Here we have used the fact that $\boldsymbol{i}\times\boldsymbol{j}=\boldsymbol{k}$ and $\boldsymbol{j}\times\boldsymbol{i}=-\boldsymbol{k}$. It follows from equations \eqref{Adot} and \eqref{hthetadot} that \begin{equation} \dot{A}=\tfrac{1}{2}\,r^2\dot{\theta}=h/2 =\text{constant}.\label{Aconst} \end{equation}

This is Kepler's second law.

Definition and properties of an ellipse

Before we look at the derivation of Kepler's first law, we need to define what we mean by an ellipse, and look at some of its properties. One common way of drawing an ellipse is to pin the two ends of a string, place a pencil in the loop, and trace a curve while keeping the string taught. Clearly the resulting curve has the property that the sum of the distances from any point on the curve to the two fixed points is a constant (the length of the string). The resulting curve is called an ellipse and the two fixed points are called the foci of the ellipse. Figure 11 shows an ellipse in which the foci are at $(-c,0)$ and $(c,0)$, and $2a$ corresponds to the length of the string.

The construction of the ellipse can be represented mathematically as follows \begin{equation} \sqrt{(x+c)^2+y^2}+\sqrt{(x-c)^2+y^2}=2a \label{ellipdef} \end{equation}

where $a\gt c\gt 0$.

This equation can be rearranged as follows \begin{equation*} \sqrt{(x+c)^2+y^2}=2a-\sqrt{(x-c)^2+y^2}. \end{equation*}

Squaring both sides, we get \begin{equation*} (x+c)^2+y^2=4a^2-4a\sqrt{(x-c)^2+y^2}+(x-c)^2+y^2. \end{equation*}

Solving for the square root term, we obtain \begin{align*} \sqrt{(x-c)^2+y^2}&=\frac{1}{4a}\,[4a^2+(x-c)^2-(x+c)^2]\\ &=a-\frac{c}{a}\,x. \end{align*}

Squaring again, we obtain \begin{equation*} x^2-2cx+c^2+y^2=a^2-2cx+\frac{c^2}{a^2}\,x^2 \end{equation*}

or equivalently \begin{align*} \left(1-\frac{c^2}{a^2}\,\right)x^2+y^2&=(a^2-c^2)\frac{x^2}{a^2}+y^2\\ &=a^2-c^2. \end{align*}

Dividing through by $a^2-c^2$, we obtain \begin{equation} \frac{x^2}{a^2}+\frac{y^2}{a^2-c^2}=1. \label{ellipstd} \end{equation}

We define the eccentricity $e$ of the ellipse by $e=c/a$. The eccentricity is a measure of the elongation of the ellipse. The eccentricity of the earth's orbit is small (.0167). Thus, its orbit is nearly circular. Venus has an even smaller eccentricity (.007) and Mars has a larger eccentricity (.0934). The planet with the largest eccentricity is Mercury (.2056). Let us define $b$ by \begin{equation} b=a\sqrt{\vphantom{(}1-e^2}=\sqrt{\vphantom{(}a^2-c^2}. \label{bdef} \end{equation}

Then, equation \eqref{ellipstd} can be written in the standard form \begin{equation} \frac{x^2}{a^2}+\frac{y^2}{b^2}=1. \label{std}\end{equation}

This is the form that is usually specified for an ellipse. It is easy to see that $a$ is one-half the length of the ellipse's major axis and $b$ is one-half the length of the ellipse's minor axis.

An ellipse also has a simple form in polar coordinates if we take our origin to be one of the foci. This situation is pictured in Figure 12.

Using the definition of an ellipse in terms of the sum of the distances from the two foci being constant, we can write \begin{equation} r+\sqrt{(r\cos\theta+2c)^2+r^2\sin^2\theta}=2a. \label{r2a} \end{equation}

Solving for the square root term and expanding the square terms, we get \begin{equation*} \sqrt{\vphantom{(}r^2+4rc\cos\theta+4c^2}=2a-r. \end{equation*}

Squaring this equation gives \begin{equation*} r^2+4rc\cos\theta+4c^2=4a^2-4ar+r^2 \end{equation*}

or equivalently \begin{equation*} (a+c\cos\theta)r=a^2-c^2. \end{equation*}

Solving for $r$, we obtain \begin{align} r&=\frac{a^2-c^2}{a+c\cos\theta}\nonumber\\&=\frac{a^2(1-\frac{c^2}{a^2}\,)}{a(1+\frac{c}{a}\;\cos\theta)}\nonumber\\&=\frac{a(1-e^2)}{1+e\cos\theta}\nonumber\\&=\frac{k}{1+e\cos\theta}\label{ellippolar} \end{align}

where \begin{equation} k=a(1-e^2).\label{ka} \end{equation}

Equation \eqref{ellippolar} is the desired representation of the ellipse in polar coordinates.

We can also derive our original definition of an ellipse from the polar form. Suppose $r$ and $\theta$ satisfy \begin{equation} r=\frac{k}{1+e\cos\theta}\label{ellippolar1} \end{equation}

where $k\gt 0$ and $0\lt e\lt 1$.

We define $a$ and $c$ by \begin{equation} a=\frac{k}{1-e^2}=\frac{k}{(1-e)(1+e)}\label{adef} \end{equation}

and

\begin{equation*} c=ae. \label{cdef} \end{equation*}

It follows from equation \eqref{ellippolar1} that $r$ has a maximum value of $\frac{k}{1-e}$ at $\theta=\pi$. Thus, using equation \eqref{adef} we see that \begin{equation*} r\leq \frac{k}{1-e}=a(1+e)\lt 2a. \end{equation*}

Using equation \eqref{adef}, equation \eqref{ellippolar1} can be rearranged as follows \begin{equation*} (1+e\cos\theta)r=k=a(1-e^2). \end{equation*}

Since $e=c/a$, this equation can be written \begin{align*} \Bigl(1+\frac{c}{a}\,\cos\theta\Bigr)r&=a\Bigl(1-\frac{c^2}{a^2}\,\Bigr)\\&=\frac{a^2-c^2}{a}. \end{align*}

Multiplying both sides by $a$, we obtain \begin{equation*} (a+c\cos\theta)r=a^2-c^2. \end{equation*}

Multiplying this equation by four and adding $r^2=(\sin^2\theta+\cos^2\theta)r^2$ to both sides, we obtain \begin{equation*} (\cos^2\theta+\sin^2\theta)r^2+4ar+4cr\cos\theta=4a^2-4c^2+r^2. \end{equation*}

This equation can be rearranged as \begin{equation*} r^2\cos^2\theta+4cr\cos\theta+4c^2+r^2\sin^2\theta=r^2-4ar+4a^2 \end{equation*}

or equivalently \begin{equation*} (r\cos\theta+2c)^2+r^2\sin^2\theta=(2a-r)^2. \end{equation*}

Taking the square root of both sides, we obtain \begin{equation*} r+\sqrt{(r\cos\theta+2c)^2+r^2\sin^2\theta}=2a \end{equation*}

which is the defining equation for the ellipse pictured in Figure 12 [see equation \eqref{r2a}]. Thus, equation \eqref{ellippolar1} defines an ellipse with the origin at one focus. Let $b=a\sqrt{\vphantom{(}1-e^2}$. Then it follows from equation \eqref{adef} that \begin{equation} b=\frac{k}{\sqrt{\vphantom{/}1-e^2}}. \label{bk} \end{equation}

Hamilton's Theorem

In this section we will show that the velocity vector $\boldsymbol{v}$ moves on a circle. Since $r=|\boldsymbol{r}|$, it follows from equation \eqref{rij} that equation \eqref{accel} can be written \begin{align} \dot{\boldsymbol{v}}&=\boldsymbol{a}\nonumber\\&=-\frac{GM}{r^2}\,(\cos\theta\,\boldsymbol{i}+\sin\theta\,\boldsymbol{j}). \label{vdotij} \end{align}

Combining equations \eqref{hthetadot} and \eqref{vdotij}, we obtain \begin{equation} \dot{\boldsymbol{v}}=-\frac{GM}{h}\,\dot{\theta}\,(\cos\theta\,\boldsymbol{i}+\sin\theta\,\boldsymbol{j}).\label{vdotij1} \end{equation}

By the chain rule for differentiation \begin{equation} \dot{\boldsymbol{v}}=\frac{d\boldsymbol{v}}{d\theta}\,\dot{\theta}.\label{vdotthetadot} \end{equation}

It follows from equations \eqref{vdotij1} and \eqref{vdotthetadot} that \begin{equation*} \frac{d\boldsymbol{v}}{d\theta}=-\frac{GM}{h}\,(\cos\theta\,\boldsymbol{i}+\sin\theta\,\boldsymbol{j}). \end{equation*}

Integrating this equation, we obtain \begin{equation} \boldsymbol{v}=\frac{GM}{h}\,(-\sin\theta\,\boldsymbol{i}+\cos\theta\,\boldsymbol{j})+\boldsymbol{v}_0 \label{vvzero} \end{equation}

where $\boldsymbol{v}_0$ is a constant. It follows that $|\boldsymbol{v}-\boldsymbol{v}_0|=GM/h$, i.e., $\boldsymbol{v}$ moves on the circle centered at $\boldsymbol{v}_0$ with radius $GM/h$.

Kepler's first law

We choose our coordinate system so that $\boldsymbol{j}$ is in the direction $\boldsymbol{v}_0$, i.e., \begin{equation} \boldsymbol{v}_0=v_0\boldsymbol{j}\qquad\text{where $v_0\gt 0$}.\label{vzeroj} \end{equation}

Thus, equation \eqref{vvzero} becomes \begin{equation} \boldsymbol{v}=\frac{GM}{h}\,[-\sin\theta\,\boldsymbol{i}+(\cos\theta+e)\,\boldsymbol{j}] \label{vij1} \end{equation}

where $e=v_0h/GM$. Substituting equation \eqref{vij1} into equation \eqref{hdef} and using equation \eqref{rij}, we get \begin{align*} h\boldsymbol{k}&=\boldsymbol{r}\times\boldsymbol{v}\\ &=\frac{GMr}{h}\,[\sin^2\theta+(\cos^2\theta+e\cos\theta)]\,\boldsymbol{k}\\ &=\frac{GMr}{h}\,(1+e\cos\theta)\,\boldsymbol{k} \end{align*}

and hence \begin{align} r&=\frac{h^2}{GM}\,\frac{1}{1+e\cos\theta}\nonumber\\&=\frac{k}{1+e\cos\theta} \label{ellippolar2} \end{align}

where \begin{equation} k=h^2/GM.\label{khgm} \end{equation}

In order for $r$ to remain finite for all $\theta$, we must have $0\leq e\lt 1$. Equation \eqref{ellippolar2} is the equation of an ellipse in polar coordinates with the origin at one focus. This completes the proof of Kepler's first law.

Kepler's third law

Since the rate that area is swept out by the position vector is the constant $h/2$ [see equation \eqref{Aconst}], it follows that \begin{equation} A=hT/2 \label{AT} \end{equation}

where $T$ is the period of the motion and $A$ is the area of the ellipse. Since translation doesn't change the area, we can consider the area of the ellipse \begin{equation} \frac{x^2}{a^2}+\frac{y^2}{b^2}=1.\label{ellipab} \end{equation}

We will calculate the area of the first quadrant ($x\geq 0$, $y\geq 0$) and multiply by four. Solving for $y$ as a function of $x$ from equation \eqref{ellipab}, we obtain \begin{equation*} y=b\sqrt{1-x^2/a^2},\qquad 0\leq x \leq a. \end{equation*}

Thus, the area $A$ is given by \begin{equation} A=4b\int_0^a \sqrt{1-x^2/a^2}\,dx.\label{Aint} \end{equation}

If we make the change of variables $x=a\sin\phi$ ($dx=a\cos\phi\,d\phi$) in the integral, we obtain \begin{align} A&=4ab\int_0^{\pi/2} cos^2\phi\,d\phi\nonumber\\ &=4ab\int_0^{\pi/2}\frac{1+\cos 2\phi}{2}\,d\phi\nonumber\\ &=\pi ab.\label{Aab} \end{align}

Substituting this value for $A$ into equation \eqref{AT}, we obtain \begin{equation*} T=\frac{2\pi ab}{h} \end{equation*}

and hence \begin{equation} T^2=\frac{4\pi^2a^2b^2}{h^2}.\label{Tsq} \end{equation}

Using equations \eqref{adef}, \eqref{bk}, and \eqref{khgm}, we can write the expression for $T^2$ in equation \eqref{Tsq} as follows \begin{align} T^2&=\frac{4\pi^2k^4}{(1-e^2)^3h^2}\nonumber\\ &=\frac{4\pi^2ka^3}{h^2}=\frac{4\pi^2a^3}{GM}.\label{K3} \end{align}

Equation \eqref{K3} will establish Kepler's third law if we can show that $a$ is the average distance between a point on the ellipse and the focus where the sun is located. The distance $D$ of a point $(x,y)$ on the ellipse to the focus $(c,0)$ is given by \begin{equation} D=\sqrt{(x-c)^2+y^2}. \label{distD}\end{equation}

It follows from equation \eqref{std} that \begin{equation*}y^2=b^2\left(1-\frac{x^2}{a^2}\right).\end{equation*}

Combining this equation with equation \eqref{distD}, we get \begin{align} D&=\sqrt{(x-c)^2+b^2\left(1-\frac{x^2}{a^2}\right)}\nonumber\\&=\sqrt{x^2\left(1-\frac{b^2}{a^2}\right)-2cx+c^2+b^2}.\label{D1}\end{align}

It follows from equation \eqref{bdef} that \begin{equation*} 1-\frac{b^2}{a^2}=\frac{c^2}{a^2}\qquad\text{and}\qquad c^2+b^2=a^2. \end{equation*}

Using these relations, equation \eqref{D1} becomes \begin{align} D&=\sqrt{\frac{c^2}{a^2}x^2-2cx+a^2}\nonumber\\ &=\sqrt{\frac{(cx-a^2)^2}{a^2}}. \label{D2}\end{align}

Since $x\leq a$ and $c\lt a$, it follows that $cx\lt a^2$. Thus \begin{equation} D=\frac{a^2-cx}{a}. \label{Dfinal}\end{equation}

For the upper half of the ellipse the average distance $D_\text{av}$ is given by \begin{align} D_\text{av}&= \frac{1}{2a}\int_{-a}^a \frac{a^2-cx}{a}\,dx\nonumber\\ &=\frac{1}{2}\int_{-a}^a\left(1-\frac{c}{a^2}x\right)dx\nonumber\\ &=a\label{dav} \end{align} since \begin{equation*}\int_{-a}^ax\,dx=0.\end{equation*}

The average distance over the lower half of the ellipse is the same; therefore, equation \eqref{dav} represents the average distance over the ellipse. Equations \eqref{K3} and \eqref{dav} combine to give Keplers third law.