Why the day is 24 hours long: The history of Earth’s atmospheric thermal tide, composition, and mean temperature

The Sun drives a semidiurnal (12-hour) thermal tide in Earth’s atmosphere. Zahnle and Walker suggested that an atmospheric oscillation with period Pres ≈ 10.5 hours resonated with the Solar driving ≈600 million years ago (Ma), when the length of day (lod) was ≈21 hours. They argued that the enhanced torque balanced the Lunar tidal torque, fixing the lod. We explore this hypothesis using two different global circulation models (GCMs), finding Pres = 11.4 and 11.5 hours today, in excellent agreement with a recent measurement. We quantify the relation between Pres, mean surface temperature T¯, composition, and Solar luminosity. We use geologic data, a dynamical model, and a Monte Carlo sampler to find possible histories for the Earth-Moon system. In the most likely model, the lod was fixed at ≈19.5 hours between 2200 and 600 Ma ago, with sustained high T¯ and an increase in the angular momentum LEM of the Earth-Moon system of ≈5%.


S1 Simplified Gravitational Tide Model
As stated in the main text, we use a simple tidal model which neglects the obliquity of Earth as well as the eccentricity and inclination of the Moon's orbit, in which the the Lunar tidal torque is (35,36) . (S1) The tidal dissipation on Earth occurs primarily in shallow seas, with an approximately 30% contribution from the deep ocean (91). It is known that the shape and even the number of ocean basins have changed over time. Reconstructions of the past disposition of the continents and ocean basins have been made. These have then been used to calculate the ocean tides and dissipation, e.g., (3) who did so over the last 250 Myr.
However, we require a calculation of the tides going back more than four billion years, so we settle for a much simpler tidal model (6,92). The model consists of a single hemispheric ocean (similar to the Pacific ocean) with a constant depth. Laplace's tidal equations, with a linear friction term, are solved, with the boundary condition that there is no flow across the coastline. The tides are driven at both diurnal and simi-diurnal periods. The model exhibits resonances in the tidal response. The power dissipated in the tide varies strongly with the period of the forcing, with a general decrease in dissipated power with increasing tidal frequency (or decreasing tidal period). This result, expressed in terms of the effective tidal quality factor Q(ω), is depicted in Fig. S1, which shows that Q(ω), which is inversely proportional to dissipated power, tends to increase with increasing tidal frequency ω although Q(ω) is not monotonic; the features are associated with resonances, and are very model dependent.
The Solar tidal torque, T , is usually taken to be proportional to T L , a practice we will follow: T L + T = T L (1 + β), where β = 1 4.7 where a ,0 is the present day value of the Lunar semimajor axis, while ω t ≡ 2(Ω ⊕ − n ⊕ ), with n ⊕ = G(M + M ⊕ )/a 3 ⊕ being the Earth's mean motion. The factor 1/4.7 is the present day ratio of T /T ; since a /a ,0 was less than one in the past, the Solar tide was relatively less important then.
The Earth also exhibits solid body tides. We adopt a solid body tidal Q s = 250, and use the minimum of Q s and Q(ω). This affects only the very early evolution of the system. Tidal Q Interpolation Q today Webb 82 ω today ω at start Fig. S1: The tidal dissipation factor Q(ω), from the ocean model of reference (6). The tidal Q increases rapidly with increasing frequency, i.e., with decreasing length of day. This effect tends to decrease the torque and power dissipated by tides in the past, when the length of day was shorter than 24 hours. This tendency is opposed by the higher tides produced by the Moon in the past, when it was closer to Earth. The frequency corresponding to 24 hours is indicated by the vertical dashed line. The vertical dotted line shows the tidal frequency corresponding to the situation at the start of our integration, for the best fit full thermal and gravitational tide model. Recall that for the Lunar tide, ω = ω t = 2(Ω ⊕ − n ).

S2 Simplified Thermal Tide Model
We present a derivation of a simple one-dimensional thermal tide model, which captures the relevant resonant wave behavior in a one dimensional model along the Earth's equator. The model neglects the rotation of Earth (the Coriolis and centrifugal forces), and the vertical variation of temperature and deposition of the rate of energy deposition (q). We start with first law of thermodynamics, where e is the internal energy per gram of air, s is the entropy per gram, T is the temperature, P the pressure, and ρ the mass density. Air is adequately described as an ideal gas with five degrees of freedom, so that the specific heat at constant volume where dq = T ds is the change in heat (thermal energy) per gram, µ = 28.96m p is the molecular weight of air, (with m p = 1.67 × 10 −24 g the proton mass), and k b = 1.38 × 10 −16 erg K −1 is Boltzman's constant. The equation of state for an ideal gas is For later reference ρ ≈ 1.2 × 10 −3 g cm −3 and P ≈ 1.013 × 10 6 dynes cm −2 (93) at the surface of Earth. It follows that e = c V T . The magnitude of the thermal tide is easily measured using the surface pressure, so we will use pressure as the dependent variable. Later we will introduce the momentum and mass conservation equations, which depend on P and ρ, and not on T , so we use the equation of state to change variables from T to ρ: Using this in equation (S3), we obtain where we have used the result (following from the ideal gas law) that the specific heat at constant pressure The energy equation follows: where γ ≡ c p /c v andq ≡ dq/dt. For dry air at 20 C, γ = 1.4, consistent with our choice of The continuity and momentum equations are (S12) where u is the fluid velocity. We have introduced a term −Γu, representing drag, into the momentum equation. For future use, we note the relation between the drag coefficient Γ and the quality factor of the atmospheric resonant cavity Q th , where ω th is the resonant frequency of the relevant atmospheric mode (see below). Noting that we can use the continuity equation in the energy equation At this point we will make a number of assumptions to simplify the problem. First, we note that the observed pressure fluctuations have an amplitude of about 1 millimeter of mercury, or one millibar, i.e., about one part in a thousand. Thus we write P = P 0 + P 1 (S16) with P 1 << P 0 , and similarly for ρ. We also assume that the background state consists of the atmosphere at rest, so that u = u 1 . We treatq as a first order quantity as well. We keep terms only up to first order in P 1 , ρ 1 and u 1 .
To lowest order, the momentum equation becomes We next use the fact that the vertical extent of the atmosphere is much smaller than the radius of Earth to make the assumption that the atmosphere is plane parallel, with gravity acting in the z direction, and u = (u, v, w) in Cartesian coordinates. The x-axis is taken parallel to the Earth's equator, and runs from −πR to πR , and the atmosphere is invariant in the y direction, i.e., the y-component v of u is zero. We remind the reader that we have also neglected the rotation of Earth, ignoring the associated Coriolis and centrifugal accelerations. Finally, we anticipate that we are looking for a Lamb wave, so that vertical velocity w = 0. This precludes an accurate solution for the vertical structure of the mode, with the result that we give up some accuracy in the estimate of the resonant period. This is not as bad as it might seem since our GCM results, when compared with more accurate analytic solutions, show that the latter also do poorly in this regard. The momentum and energy equations become where we have dropped the subscript 1 from all the perturbed quantities. Taking the partial time derivative of the first equation, then using the second equation to eliminate u, we find The sound speed c s ≡ γP 0 /ρ 0 ≈ 3.4 × 10 4 cm s −1 .
To estimate the heating rate per gram of air (q), we first relate it to the mean heating rate per square centimeterF , which is the quantity that is usually reported, where we used eqn. (S17). Note that we are interested in the flux absorbed by the atmosphere, not the flux at the top of the atmosphere; the latter isS = 340 W m −2 averaged over the surface of the Earth (corresponding to the usual solar constant S = 1, 360 W m −2 . The mean absorbed flux is estimated by (94) to be 75 + 24 + 88 = 187 W m −2 ; the first term corresponds to direct absorption of sunlight, the second to sensible heating from the ground, and the third to latent heating (related to water evaporation). In cgs units the totalF = 1.87 × 10 5 erg cm −2 s −1 . Similar values for each term are given in (95) (78 + 17 + 80 = 175 W m −2 ) and in (96) (79 + 20 + 84 = 183 W m −2 ).
The heating due to direct absorption of sunlight in the atmosphere is zero from sunset to sunrise. During the day the direct absorption term is proportional to cos(Ω ⊕ t), whereΩ ⊕ = Ω ⊕ − n ⊕ = 2π/P ⊕ andP ⊕ is the length of a Solar day, currently 24 hr or 86400 seconds. We take the zero of time to be at noon. For simplicity, in this section we will assume that the other two fluxes have the same temporal structure, a rather crude approximation. This assumption is relaxed in our GCM modeling.
Using the notation of reference (44), we expand the pressure and the heating rate in Fourier components, and pick out the component that produces the leading term in the torque associated with the thermal tide, namely the semidiurnal (or 12hr period term, currently) with k = 2π/(2πR ⊕ ); note that this is half the wave number associated with the semidiurnal tide, i.e., k corresponds to a wavelength equal to the circumference of Earth, while the wavelength associated with the semidiurnal tide is πR ⊕ , half the circumference; the factor of 2 in the exponents in eqns. S22 and S23 accounts for the difference. Inserting these relations into eqn. (S20), some algebra yields with the factor 2/(3π) coming from the Fourier transform of the time-dependent heating. We have defined ω res ≡ c s k = c s /(R ⊕ ); the frequency in the atmosphere is twice this, ω = c s /(R ⊕ /2) = 2c s /R ⊕ , corresponding to a resonant period about P res = 16.4 hours for our over-simplified model, compared to P res = 11.4 or P res = 11.5 hr for PlaSim and LMD respectively. The perturbed pressure is then In the main text, and below, we follow the general practice in the thermal tide literature and use the quality factor of the resonant cavity (the Earth's atmosphere) Q th = ω res /Γ, as noted above.
Using P res = 11.4 h for the present day Earth, and taking Q th = 30, eqn. (S25) predicts a pressure perturbation of within a factor of two of the observed value at Earth's equator, δP s ≈ 0.8 millibar (39,40). Fig. S2 shows the quadrupolar pressure perturbation calculated by PlaSim (the dots, see section S4 below) as well as a fit of the form given by eqn. (S25). We have adjusted the normalization of the fit to minimize the χ 2 between the fit and the numerical results. The amplitude of the quadrupolar pressure perturbation δP s due to solar heating as a function of the length of day, as calculated by PlaSim (colored points). Each point corresponds to a single simulation, with the length of day P ⊕ given by the value on the x-axis. The solid lines are fits using eqn. (S25). The maximum in δP s marks the value of P res , twice the atmospheric resonant period. The blue and red points and curves correspond to a 1 bar atmosphere, with P CO 2 = 1 mb and 100 mb respectively, while the black and green points and curves correspond to a 2 bar atmosphere and the same P CO 2 values. Consider the blue and black curves, which both have P CO 2 = 1 mbar. When P s is doubled, the peak in δP s moves to shorter period, from P res ≈ 24 hr to P res ≈ 23 hr. This is the result of two effects; first, the mean surface temperature increases with increasing P s , and second, the mean molecular weight µ(Z) decreases; both changes lead to an increase in the sound speed, and hence a lower P res . The same two effects explain the decrease in P res between the red and green curves. The height of the peak, corresponding to an increase in the normalization N of the thermal torque, increases with increasing P s at fixed P CO 2 (blue to black, and red to green).
The relation between the perturbed pressure and the torque is (44) where δ th is the phase where the pressure maximum occurs; for the present-day Earth, this corresponds to a local Solar time of ≈ 10 : 30 am and pm. Using this expression, we find where which represents the response of a damped driven harmonic oscillator. The f function has units of inverse frequency, or time.
The torque is maximized whenΩ ⊕ ≈ ω res = c s k, or when the rotation period isP ⊕ = 1/2 · 2πR ⊕ /c s . Using the present day surface sound speed c s = 3.4 × 10 4 cm s −1 , the resonant rotation period is about lod res ≈ 33 hours, compared to a period just under 23 hours found by our GCM models, so our crude approximations provide estimates good to about 40%, similar to the error in the estimate for δP s found above.
Recall that we employ Z to denote the variable composition of the atmosphere, and note that µ = µ(Z).
It is unclear what sets Q th , but candidates include radiative damping, ion damping at altitude, and surface friction (61). The GCMs that we employ both apply simple boundary conditions at the top of the model atmosphere, which may also introduce an effective dissipation, e.g., by artificially scattering the traveling sound wave into gravity waves. We will treat Q th as a free parameter, but assume that it does not vary with epoch.
The resonant period P res , or equivalently, ω res depends on the sound speed, which almost certainly varied with epoch, so we treat P res as a time-dependent free parameter when fitting the model to data.
The Solar luminosity L changes by some 30% over the age of Earth. Following (48) we use the approximation with t = 4, 560 My the age of the sun, and 0 ≤ t ≤ t ; the sun formed at t = 0, while t = 4, 560 My corresponds to the present epoch.
Similarly, the composition and the mean surface pressureP s (or mass) of the atmosphere alter the opacity, and both are believed to change with epoch. All three affectF . For example, simulations using both LMD-G and PlaSim show that increasing the amount of N 2 while keeping the other components fixed increases the surface pressure perturbation (see below). Increasing both P CO 2 and P N 2 increases the pressure perturbation even more. Increasing the Solar luminosity does as well.
We will treatF /H as a time-dependent free parameter in our fits to data. We do so by introducing a dimensionless normalization factor A(t) to describe the torque T th : where we use T th (0) = 4.14 × 10 22 dyne cm for the value of the present day thermal torque. A(t) can change on geologic time scales, and may vary by factors of several. Fig. S3 shows this torque with A(t) = A(0) = 1 and P res = 22.8 hr (the solid line) and for A = 2, P res = 19 hr (the dot-dash line); in both cases we set Q th = 100.
We employ a simple piece-wise linear model for A(t): This introduces three parameters, two that we vary as part of the Monte Carlo calculation (A 1 and A 2 ), and one that we consider fixed, the age at which the normalization begins to decrease, denoted t 0 . Note that t runs from 4, 560 Ma to the present (0). We have tried various values 1, 000 Ma < t 0 < 2, 500 Ma, so that A(t) = A 2 in the Hadean and Archean. The results do not depend strongly on the exact value, so we set t 0 = 2, 000 Ma. The dynamical model then prefers A 1 < A 2 , i.e., the thermal torque was larger in the past than at present, by a factor of several.
In Fig. S4 we compare the torque from eqn. (S28) against that calculated by PlaSim for present day values of Λ ≡ (F , Z,P s ). The fit from eqn. (S28) is fairly good away from resonance, and for day lengthsP ⊕ > P res . The peak torque forP ⊕ < P res (where the torque is negative) is overestimated by eqn. (S28) by about 40% compared to the result from PlaSim (results from LMD-G are similar to those of PlaSim).
We use eqns. S28 and S29 in our dynamical model rather than the GCM estimate of T th for two reasons. The first is practical; we need to find the torque for hundreds of thousands of dynamical models, so running a GCM is not feasible. The second reason is related to the value of Q th . The values predicted by the GCMs are around Q th ≈ 10, while the analytic predictions range from 20 to 100. The geologic data appear to favor larger values.
It is worth stressing that the GCM result shown in Fig. S4 demonstrates that when P ⊕ < P res , the thermal torque tends to increase the length of day, but when P ⊕ > P res (as it is today) the thermal torque tends to decrease the length of day. This is in agreement with the linear theory.

S2.1 Reduced torque due to isostatic adjustment in oceans
Equation (S28) gives the torque produced by the Sun's gravity on the mass quadruple of the atmospheric tide. However, if the atmospheric bulge lies over the ocean, the ocean will respond to the excess pressure (over the mean atmospheric pressure); the excess pressure in the body of the ocean will tend to produce horizontal motions away from the maximum pressure, much like a barometer reacts to increases in atmospheric pressure. In fact, the tendency of the ocean to act like an inverse barometer in response to pressure loading is well known, see, e.g., chapters 6 and 7 in (37). Reference (90) provides a nice review. As an extreme limit, eqn. (4.13) in (97) shows that an atmospheric mass (and hence pressure) perturbation over the ocean can be exactly compensated for by an opposite perturbation in the ocean. Thus the torque from the thermal tide is reduced by about 75% from that calculated here, see his eqn. (4.16).
Recently the effect of the semidiurnal thermal tide on the pressure at the bottom of the ocean has been directly detected using both satellite altimetry (98) and satellite gravimetry (99). The first paper compares satellite measurements of the height of the ocean to direct measurements of the pressure at 39 barometric pressure recorders on the open ocean floor in the tropics (since the thermal tide is small outside the tropics). The effects of both the diurnal and semidiurnal tides were detected. The results show that at 12 hours and over the scale of the tide, the ocean acts as an imperfect inverse barometer, with an ocean floor pressure reduction of about 70% of that of a perfect inverse barometer. The second paper uses gravimetry from the GRACE and GRACE-FO missions, combined with tide gauge and the ocean floor data just discussed, and shows that accounting for the diurnal and semidiurnal atmospheric tides (as well as higher frequency tides) improves the fits. Both papers are able to separate out the effects of the thermal tides from the gravitational tides.
For an example of the opposite limit, meaning a very small inverse barometer effect, consider a planet with oceans confined (by continents) to regions between meridional lines running from pole to pole. The early Atlantic ocean provides an example. Suppose the ocean's widths are all less than one quarter the planet's circumference. In that case, the response to the atmospheric pressure loading maximum will be that water will move north and south away from the equator, since the atmospheric pressure is high near the equator. Water underneath the atmospheric pressure minimum will move toward the equator. In neither case will the total mass between a pair of meridians change, although the mean moment arm will be reduced (for a pressure maximum) or enhanced (for a minimum) as water moves towards or away from the poles. The total (air plus water) mass quadruple will be reduced, but not eliminated.
To account for the possibility that the ocean might act like an inverse barometer, we allow A(t) in eqn. (S32) to be less than unity; for most Monte Carlo runs we choose a lower limit of 0.1. The best fit model from our MCMC calculations is shown in Fig. 4 in the main text.
The fixed LOD ≈ 19.5 hours implied by the data, as late as 1, 300 Ma, requires a thermal torque that is much larger than that today, and hence a large A 2 = 3.51 (note that the median value from the MCMC calculation is somewhat larger, A 2 = 3.58, see Table 1). This declines toward the present, so that at 1, 300 Ma, N (1, 300) ≈ 2.35. At the current epoch, the best fit N (0) = A 1 = 0.17. We point out in the main text that this low value might be a result of the assumption, made in current cyclostratigraphic work, that L EM is constant.

S3 Atmospheric composition over time
There is good evidence that the composition of the atmosphere changed over geologic time; see for example, reference (54), in particular, their Figs. 2, 3, and 5. The oxygen content was about a factor of 100 lower during the Proterozoic than today, and about a factor of 10,000,000 lower in the Archean. With somewhat less confidence, the CO 2 content was about 100 times higher at the beginning of the Proterozoic, while the total mass of the atmosphere was probably similar to that today. Our dynamical modeling favors a short atmospheric resonant period, which implies a combination of a high mean surface temperature and a low mean molecular mass. Fig. 7 in the main text shows an example atmospheric composition history. The values of P O 2 (orange dashed line) and P CO 2 (solid black line) in that figure are similar to those in (54). The total mean surface pressureP s in this example is 2 bar. We show our inferred P CO 2 for several epochs, t th = 700, 1, 000, 1, 500, 1, 800, and 2, 000 Ma, denoted by green squares with error bars. These CO 2 levels yield sound speeds, and hence resonant periods P res (t) that agree with the values of P res found in the Monte Carlo calculations we use to find a best fit dynamical model (see section 6, subsection "Resonant period versus age", below, for more discussion).

S4 Global Circulation Models
We are interested in a description of how the semi-diurnal atmospheric tide responds to changes in atmospheric properties and solar luminosity over geological times. The thermal tide behaves roughly as a Lamb wave, see (61) or (100), in essence a vertically-global sound-wave. We go beyond previous idealized setups, which employ a static 1D atmosphere (101) and (2) by using GCMs to describe this atmospheric mode. This allows us to capture the effects of a moist atmospheric circulation in a 3D inhomogeneous atmosphere on the thermal tide forcing and response.
We utilize two well tested and complementary GCMs, PlaSim and LMD-G, and find that they exhibit consistent behaviours for the Lamb mode of interest here, with only minor quantitative discrepancies on the mode resonant frequency.
PlaSim is an intermediate-complexity Earth GCM that includes a slab ocean, sea ice, land surface hydrology, snow, and a coupled atmosphere that includes moist processes and a simple radiation scheme (41). PlaSim allows for a simplified topography; for most of our runs this option was turned off. We performed some runs to check whether the topography affected either the mode frequency or Q th , but the differences were very small, within the variation found by running for different lengths of time. PlaSim solves the primitive equations using a spectral core, i.e., they are solved in Fourier space; to maintain numerical stability, numerical hyperviscosity, of the form ∝ ∇ 8 , is applied. PlaSim is typically run with an effective spatial resolution corresponding to 32 latitude cells by 64 longitude cells (denoted T21) or 64 latitude and 128 longitude cells (T42), with 5 or 10 vertical layers. Most of our runs used the T21 resolution with 10 vertical layers. The radiation scheme uses a two-band shortwave parameterization, with gray water and ozone absorption in the red and blue bands respectively, gray cloud scattering, and a single-band longwave radiation scheme with gray absorption from water, CO2, and clouds. PlaSim has been used to study modern Earth climate (102)(103)(104)(105), paleoclimate and snowball climate dynamics (106)(107)(108)(109), and tidally-locked and slow-rotating planets (109)(110)(111).
LMD-G (or LMD Generic) is a 3D GCM derived from the LMDz three-dimensional Earth (112) and Mars (113) GCMs. LDM-G solves the primitive equations using a finite difference method on an Arakawa C grid. LMD-G employs flexible radiative transfer based on the correlated-k method, (114), and thermodynamics/cloud microphysics prescriptions to simulate a broad range of atmospheric gas compositions. The code has been used in many climate studies of solar system planets (43,44,54) and (115,116), and exoplanets in a wide range of conditions (114,117,118) Fig . S5 shows the global mean temperatureT of Earth calculated using LMD-G, as a function of age for several different atmospheric compositions. Consistent with previous work (53,119), we find that, for a fixed composition,T increases with time; this increase is driven by the increase in solar luminosity. As discussed in the previous section, the composition of the atmosphere has changed rather dramatically over geologic time; in particular, the amount of CO 2 in the atmosphere has decreased. The short resonant period P res we infer from the geologic data suggests that the decrease of P CO 2 may have started as late as ≈ 1, 5000 Mya, much later than was previously believed. Alternately, P CO 2 may have varied non-monotonically, with a peak around 1, 500 Mya.
The results we obtain for the thermal tide using these two GCMs are qualitatively consistent with previous 1D treatments of the thermal tide response, as a damped, driven Lamb mode, e.g., Figs. S2 and S4, with the notable exception of the value of Q th . The 1D models find Q th ≈ 40 − 100, while both our GCM models return Q th ≈ 12. We have investigated the nature of the thermal tidal mode dissipation, building on the understanding developed by (61,100). Through a parameter exploration, we have established that the mode dissipation in PlaSim is only weakly affected by the magnitude of surface friction or hyperviscosity. The dissipation is somewhat more sensitive to model resolution (both horizontal and vertical), but only at the ten percent level, so that the strong dissipation in GCMs is robust. Understanding the origin of this (possibly numerical) overdissipation is left to future work. We note that the vertically-limited description in our GCM models, focused on the lower tropospheric models, and the impact of the upper boundary condition adopted in this class of models, could play a role, given that the Lamb mode perturbation velocity grows exponentially with height (120).    − t ). For a given (Z, P S ), the global mean temperatures found by PlaSim are several (∼ 9 C) degrees, or about three percent, higher than those returned by LMDZ for high values of P CO 2 , as can also be seen in Fig. 5 in the main text.

S5 Choosing the initial Lunar semimajor axis
A number of mechanisms operating early in the history of the Earth-Moon system could have altered L EM . The evection resonance (121,122) provides an example. The resonance occurs when the precession period of the Lunar apsidal axis matches the length of the year. The torque exerted by the Sun on the Lunar orbit transfers angular momentum from the Lunar orbit to the Earth's orbit around the Sun. The same torque may result in a limit cycle rather than libration around a fixed point (123); both result in a reduction of L EM . The evection resonance occurs when a ≈ 5R ⊕ . The change in L EM can be as large as a factor of two, much larger than the change effected by the thermal tide. Dissipation of Lunar obliquity tides, associated with the variation of the Laplace plane with distance from Earth, can reduce L EM by a similar amount (124); this mechanism is most effective for a 30R ⊕ . Other mechanisms that have been proposed to alter L EM include the scattering of roughly a Lunar mass worth of planetesimals (125), and interactions between the newly formed Moon and the remnants of the disk out of which it formed (126). The former occurs when a 30R ⊕ , the latter when a 5 − 7R ⊕ . We have used a = 20R ⊕ as our initial condition, since the time to reach this distance is short (≈ 30 M yr) while the time to reach 30R ⊕ is about 150 Myr.

S6 Monte Carlo parameter estimates
If we pick a set of parameters and initial conditions for eqns. (3)(4)(5) in the main text, and then integrate, we can generate predictions for the number of months per year, the number of days per month, and the length of day, at any time in Earth's history. These predictions can then be compared to the geologic data D ≡ {x i , σ i }, where x i denotes the number of months per year, days per month, or hour per day, depending on the index i; the σ i 's are the associated errors. Thus we have a generative model (see, e.g., ref (127), section 2.3).
We use a Bayesian approach, combined with Monte Carlo calculations, to use the geologic data to infer the probability distribution functions for L EM , P res (t), and other quantities of interest.
We start by describing the parameter set (denoted by θ) sampled by the Monte Carlo calculations. We then describe the likelihood function, and finally the prior probabilities we use for each parameter.
There are a number of quantities, listed below, that need to be specified in order to perform the integration. Several of these quantities will be of broad interest, including the initial value of the angular momentum of the Earth-Moon system L EM , which will inform Lunar formation theories, the quality factor of the thermal tide Q th , and the resonant period of the Earth's atmosphere P res (t).
Given P res (t), we can use the relation between P res andT from our GCMs, shown in Fig. 5 in the main text, to findT (t), which is of interest to climate modelers. The initial conditions required by eqns. (3)(4)(5) (in the main text) are L ⊕ , S ⊕ , and L . The start time of the integration (about 30 Myrs later than the formation time) τ of the Moon, also needs to be specified. We treat L ⊕ (τ ) and L (τ ) as fixed numbers. We treat L EM (τ ) = S ⊕ (τ ) + L (τ ) and τ as parameters whose probability distribution functions we wish to find, using radio-isotope dates to constrain τ .
To evaluate the gravitational tidal torque using eqn. (S1), we need the function Q(ω t ), for which we use the model of (6), shown in Fig. S1. We introduce a dimensionless number Q 1 to scale the tidal torque, a freedom needed to meet the constraints on the formation time of the Moon.
To evaluate the thermal torque T th using eqn. (S31), we need to specify the thermal torque normalization A(t), the atmospheric resonant period P res (t), and the thermal resonance quality factor Q th . As described above, we use the piece-wise model given in eqn. (S32) for A(t), thereby introducing two parameters A 1 and A 2 into our Monte Carlo modeling. Equation (S32) also introduces t 0 , but we regard this as a fixed parameter (we use t 0 = 2, 000 Ma), and do not include it in the Monte Carlo parameter set. As noted in the main text, the quantity Q th may well vary with time, but for simplicity we ignore any time dependence.
We use a piece-wise linear model for P res (t); we choose a vector P res = (P res,1 , . . . , P res,n th ) specifying the value of the resonant period of the Earth's atmosphere at various epochs. The length of this vector is denoted by n th , and chosen by the user, as is the corresponding vector of epochs t th,i at which to specify P res . For the plots presented here, we use n th = 10, with epochs given by t th = (0, 0.7, 1.0, 1.5, 1.8, 2.0, 2.2, 2.5, 3.0, 4.6) Ga. Neither n th nor t th,i are treated as parameters for the purposes of the Monte Carlo calculation.
To summarize, the parameter set θ we utilize for our Monte Carlo calculations includes the initial value of L EM , the start time of the integration (related to the Lunar age) τ , the normalization of the Ocean tide model quality factor, denoted Q 1 , the quality factor Q th of the atmospheric resonance, the two parameters A 1 , A 2 in eqn. (S32), and P res .
We regularize the parameter set by taking the natural logarithm of each parameter, so that θ = ln L EM , τ , Q 1 , Q th , A 1 , A 2 , P res .
As part of the regularization, we specify L EM in units of Earth's breakup angular momentum L s ≡ C GM ⊕ /R 3 ⊕ (where C is the largest moment of inertia of Earth, see table 1), the age τ in Myr, and P res in hours.
Given this parameter set, we integrate eqns. (3)(4)(5). We then interpolate to the epochs where we have data, and calculate the appropriate observable. Next, we calculate the χ 2 of the model given the data values x i and errors σ i . Assuming that the data points are independent of each other, we employ the likelihood where X i (θ) is the prediction of the model for the ith data point.
We use the Monte Carlo code emcee (62) to sample the posterior probability density function P (θ|D) of the parameters θ, given the data D. The posterior probability is related to the data and the predictions of eqns. (3)(4)(5), by Bayes' theorem The probability P (θ) is called the prior, and needs to be specified. We take the denominator P (D), often called the evidence, to be a normalization constant, and ignore it.
We use truncated Gaussian priors for all variables except A 1 and A 2 , for which we use truncated Rayleigh distributions. The priors for the first six parameters are forced to be zero outside the following limits: the initial angular momentum of the Earth-Moon system satisfies 0.33 ≤ L EM /L s ≤ 0.37, the integration start time time 4, 530 ≤ τ /M a ≤ 4, 300, the tidal Q normalization 0.5 ≤ Q 1 ≤ 2, the atmospheric resonance quality factor 10. ≤ Q th ≤ 120, while the torque normalization constants are in the ranges 0.1 ≤ A 1 ≤ 1.5, and 1 ≤ A 2 ≤ 10.
We use Gaussian priors for P th with mean µ = 22.8 hr, except for the present epoch, for which we use 22.9 hr. The Gaussian widths σ = 0.3 hr, except for those at the epoch of the Minoan/Sturtian glaciations (700 Ma) and the Huronian glaciations (at ≈ 2, 300 Ma), for which σ = 0.1 hr; we discuss this further below. We have also explored priors with shorter resonant periods between 1000 Ma and 2000 Ma, roughly the "boring billion", but we don't present the results here, except to say that they are similar to our default case. The Gaussians are truncated (set to zero) outside (18.5, 23.5) hours, except for the present epoch, zeroed outside (22., 23.5) hours, and the previous epoch (at 700 Ma) which is zeroed outside the interval (18.0, 23.5) hours.  Figure 6 reveals a strong (anti) correlation between L EM /L s and Q 1 ; small values of L EM /L s are associated with large values of Q 1 . Our integrations start with a fixed value of L , and effectively end with the present day value, so ∆L , which equals the integrated Lunar torque, is very nearly fixed, independent of θ. The Figure shows that the correlation between Q 1 and τ is weak, so varying Q 1 does not greatly affect the time over which we integrate. However, the spin frequency of Earth strongly affects the Lunar tidal torque, as can be inferred from Fig. S1; higher spin frequencies produce higher tidal Q, and hence lower tidal torques. Very roughly we have Q(ω t ) ∼ ω ⊕ ∼ S ⊕ ∼ L EM . The spin frequency of Earth (or the inverse LOD) is strongly affected by varying L EM , since we fix the initial a = 30R ⊕ . This yields L = 1.66 × 10 41 dyne cm and S ⊕ = 1.64 × 10 41 dyne cm. An increase in L EM (τ ) at fixed L increases the initial S ⊕ , the initial tidal frequency, and hence the initial Lunar tidal torque.     Fig. S6: Corner plot for the thermal tide model (128). There is a clear anti-correlation between L EM and the normalization of the ocean tide model: lower values of L EM are associated with higher values of Q 1 (and Q) and hence lower gravitational tidal dissipation rates. There is also a strong anti-correlation between Q th and the maximum thermal torque normalization A 2 ; this is consistent with the notion that the thermal torque has to exceed some critical value (associated with T at the time of resonance capture). Table 2 gives the median values and error estimates (corresponding to the 16th and 84th percentile values) for all parameters.

S6.1 Posterior parameter distributions, median values and ranges
The data, in the form of the present day LOD, fix the final value of the tidal frequency. It follows that an increase in L EM , by producing an increase ω t and hence Q(ω t ), forces a decrease in Q 1 , the normalization of T : where we approximate ω t ∼ 2ω ⊕ . Recall that L EM = S ⊕ + L , so ω ⊕ ∼ L EM − L . It follows that S6 also reveals a fairly strong anti-correlation between Q th and A 2 . Increasing Q th increases the peak T th , as does increasing A 2 , so the peak T th increases with the product Q th A 2 . The peak value of T th has to exceed T around 1, 500 Ma if the LOD is to be constant or decreasing. Hence, Q th A 2 > const. × T , leading to the inverse correlation we find. Table 2 gives the median parameter values and the values corresponding to the 16th and 84th percentiles of the distribution functions (the super-and subscripts on the median values).

S6.2 Resonant period and mean temperature versus age
In the main text we use the posteriors for P res to infer the global mean temperature as a function of geologic age. Figure S7 shows the posteriors for P res as a function of age (the median is given by the blue line; the shading indicates the 16th and 84th percentiles). The figure also shows the means and standard deviations of the gaussian priors for P res (the points and error bars); the values of the epochs t th,i are given in the figure caption.
The best fit posterior mean, together with the 16th and 84th percentiles, are listed in Table  2, where P 1 corresponds to the resonant period at present, and P 10 corresponds to the resonant period at the formation time of the solar system.
We note that P 10 is unconstrained by the data, so it is set by our choice of prior. In fact, the data do not constrain P res for ages older than 2, 500 Ma, as indicated by the agreement between the priors and the posterior at those early epochs. This is expected, since the length of day was 17.5 hours or less, shorter than any plausible P res . Under these conditions, the thermal torque is non-resonant and thus small, so that it leaves a signature on the spin evolution that is not detectable using our technique.
The low value of P res we infer around 1, 500 Mya is driven by the length of day between 2, 000 Ma and 1, 400 Ma. We note that the strongest constraint is that given by the short LOD (and small error bars) measured by reference (17); we have already noted that this data point relies on the assumption that L EM is constant, an assumption which our results call into question.
Since we know the dependence of P res (T ) on the global mean temperature, we can invert that relation to find the global meanT as a function of age. This is how we convert the results for P res as a function of age, shown in Fig. S7, to the run of mean global temperature with age shown in Fig. 7 in the main text. The period of the atmospheric resonance P res versus age, as inferred from our Monte Carlo calculation. The period is estimated at ten epochs, with ages of t res = (0., 0.7, 1.0, 1.5, 1.8, 2.0, 2.2, 2.5, 3.0, 4.6) Ga; the epochs at 0.7 and 2.2 Ga are chosen to allow for priors at the ages of known global scale glaciations (the Minoan/Sturtian at 0.7 and the Huronian at ≈ 2.2 − 2.5 Ga). We use Gaussian priors with a mean of µ i = 22.8 hr, except for the present, for which the mean is µ i = 22.9 hr. The standard deviations are set to σ i = 0.3 hr, except for the two glacial epochs, for which we choose a standard deviation of σ i = 0.1 hr, to ensure thatT is low enough to allow for global glaciations. The black squares and error bars show the mean and standard deviations for our priors. The blue solid line shows the mean P res , while the blue shading shows the 16th and 84th percentile (correspond to one sigma for Gaussian errors) values for the inferred P th . For ages between ≈ 2.0 Ga and ≈ 1 Ga, the data strongly favors P res ≈ 19−21 h. For more recent epochs, the data is consistent with P res ≈ 22.0 hr, about an hour shorter than at present. For ages older than ≈ 2.5 Ga, the percentile ranges correspond roughly to the standard deviations of the prior, indicating that the data does not provide strong constraints on P res at these epochs.
Using the posteriors for P res for epochs between 700 and 2, 000 Ma, we can infer the corresponding P CO 2 levels using Fig. 5 in the main text; we use the PlaSim results for each epoch, interpolating between the simulated values of P CO 2 = [1, 10, 100, 200, 300] mbar. The results are shown as the green squares with error bars in Fig. 8 in the main text. Our inferred values of P CO 2 are at or above the 95% confidence level reported in (54). However, we note that there are no data points reported or referenced in that paper between 1, 200 and 1, 800 Ma.