The Total Carbon Column Observing Network’s GGG2014 Data Version
Debra Wunch$^1$, Geoffrey C. Toon$^{1,2}$, Vanessa Sherlock$^3$, Nicholas M. Deutscher$^4$, Cate Liu$^1$, Dietrich G. Feist$^5$, Paul O. Wennberg$^1$
October 15, 2015
Abstract
This paper describes the updates to the Total Carbon Column Observing Network (TCCON) data analysis to generate the GGG2014 data version, which is a significant improvement over the GGG2012 data version. Laser sampling errors (a.k.a. “ghosts”) have been corrected, improving the network-wide consistency in the retrieved column-averaged dry-air mole fractions. The a priori profiles used in the retrievals have been improved using airborne and balloon-borne in situ measurements. The spectroscopic line lists are improved for H$_2$O, CO, $^{13}$CH$_4$, and the solar Fraunhofer absorptions. We have updated the airmass-dependence corrections and scaling factors required to tie the TCCON to the currently-accepted World Meteorological Organization gas standard scales. Finally, an error budget is presented. The GGG2014 TCCON data are available from the Carbon Dioxide Information Analysis Center (CDIAC) at http://tccon.ornl.gov.
$^1$ California Institute of Technology, Pasadena, CA, USA.
$^2$ Jet Propulsion Laboratory, California Institute of Technology, Pasadena, CA, USA.
$^3$ Wellington, New Zealand.
$^4$ University of Wollongong, Wollongong, Australia.
$^5$ Max Planck Institute for Biogeochemistry, Jena, Germany.
To cite this document, please use:
Wunch, D., G. C. Toon, V. Sherlock, N. M. Deutscher, X. Liu, D. G. Feist, and P. O. Wennberg. The Total Carbon Column Observing Network’s GGG2014 Data Version. doi:10.14291/tccon.ggg2014.documentation.R0/1221662, 2015.
1 Introduction
The Total Carbon Column Observing Network (TCCON) provides atmospheric column-averaged dry-air mole fractions of CO$_2$, CH$_4$, CO, N$_2$O, H$_2$O, HDO, and HF to the scientific and satellite validation communities. These data have been used in scientific investigations of the carbon cycle [Sussmann et al., 2012; Keppel-Aleks et al., 2012, 2011; Chevallier et al., 2011; Guerlet et al., 2013; Wunch et al., 2013; Deutscher et al., 2014], in the development of improved spectroscopic models and line lists [Thompson et al., 2012; Scheepmaker et al., 2013; Reuter et al., 2011; Tran et al., 2010; Tran and Hartmann, 2008; Reuter et al., 2012; Miller and Wunch, 2012; Long and Hodges, 2012; Hartmann et al., 2009; Gordon et al., 2011, 2010; Galli et al., 2012], in the validation of satellite measurements and satellite algorithm development [Morino et al., 2011; Wunch et al., 2011a; Butz et al., 2011; Schneising et al., 2012; Schepers et al., 2012; Reuter et al., 2013; Parker et al., 2011; Oshchepkov et al., 2013; Deng et al., 2014; Frankenberg et al., 2013; Boesch et al., 2013], and in the evaluation of carbon cycle models [Basu et al., 2011; Houweling et al., 2010; Keppel-Aleks et al., 2013; Mu et al., 2011; Messerschmidt et al., 2013; Fraser et al., 2013]. The TCCON instrumentation and previous versions of the software are described in detail in Washenfelder et al. [2006] and Wunch et al. [2011b]. The network consists of ground-based Fourier transform spectrometers that measure absorption of the direct solar beam in the near infrared region of the spectrum. The TCCON is over a decade old: its first dedicated instrument, located at Park Falls, WI, USA, was installed in May 2004. The TCCON has since expanded to 23 operational sites; the site locations are plotted in Figure 1, and the site list with latitude, longitude, and altitude information is in Table 1. Information about sites that were previously part of the network is given in Table 2.

Figure 1: A map showing the locations of the TCCON stations. The background image is the Blue Marble: Next Generation, produced by Reto Stöckli, NASA Earth Observatory (NASA Goddard Space Flight Center).
This paper describes the updates and improvements to the TCCON data for the GGG2014 data release, hosted at the Carbon Dioxide Information Analysis Center (CDIAC, http://tccon.ornl.gov). Previous versions of the TCCON data (GGG2009 and GGG2012) are also archived at CDIAC. Each TCCON dataset has a unique digital object identifier (DOI) that can be used to cite the data in scientific articles. The dataset citations are listed in Tables 1 and 2.
The TCCON data processing software, called “GGG”, is centrally maintained at the California Institute of Technology. Each TCCON site uses the same version of the software, and the processing procedure is consistent from site to site. The first step is to process the raw data (interferograms) into spectra, using a subroutine called “I2S” (interferogram-to-spectrum). A priori profiles of pressure, temperature, geopotential height, and water vapour from the National Centers for Environmental Prediction and National Center for Atmospheric Research (NCEP/NCAR) reanalysis [Kalnay et al., 1996] are generated for the days on which spectra are acquired. A subroutine called “GSETUP” generates empirical models of other trace gas profiles using the pressure and temperature profiles from the NCEP/NCAR reanalysis. The spectra are then passed into the main nonlinear least squares spectral fitting subroutine “GFIT” that iteratively scales the a priori atmospheric amounts to generate forward-modeled spectra that best fit the data. The retrieved total column amounts of the gases are in units of molecules cm$^{-2}$ and tend to be strongly influenced by surface pressure (and hence topography). The total column amount, or $VC_{\mathrm{gas}}$, is defined as the integral of the mole fraction of the gas ($f_{\mathrm{gas}}(z)$), multiplied by the total number density ($n(z)$), from the surface altitude ($z_s$) to the top of the atmosphere:
$$VC_{\mathrm{gas}} = \int_{z_s}^{\infty} f_{\mathrm{gas}}(z) \cdot n(z) \cdot dz \qquad (1)$$
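As a rough numerical illustration of equation 1, the sketch below integrates $f_{\mathrm{gas}}(z) \cdot n(z)$ with the trapezoidal rule. The isothermal exponential number-density profile, the constant 400 ppm mole fraction, and all names (`total_column`, `N0_CM3`, `SCALE_HEIGHT_CM`) are illustrative assumptions and are not taken from GGG.

```python
import math

N0_CM3 = 2.5e19          # assumed surface number density (molecules cm^-3)
SCALE_HEIGHT_CM = 8.0e5  # assumed scale height (8 km, in cm)


def total_column(mole_fraction, number_density, z_surface_cm, z_top_cm, n_steps=2000):
    """Trapezoidal estimate of VC_gas = integral of f(z) * n(z) dz (molecules cm^-2)."""
    dz = (z_top_cm - z_surface_cm) / n_steps
    total = 0.0
    for i in range(n_steps + 1):
        z = z_surface_cm + i * dz
        weight = 0.5 if i in (0, n_steps) else 1.0  # end points count half
        total += weight * mole_fraction(z) * number_density(z)
    return total * dz


def f_gas(z_cm):
    """Assumed well-mixed gas: constant mole fraction of 400 ppm."""
    return 400e-6


def n_air(z_cm):
    """Assumed isothermal atmosphere: exponentially decaying number density."""
    return N0_CM3 * math.exp(-z_cm / SCALE_HEIGHT_CM)


# Integrate from the surface (0 km) to 120 km, standing in for "infinity".
vc_numeric = total_column(f_gas, n_air, 0.0, 120.0e5)
# Closed form for this toy profile, integrated to infinity: f * n0 * H.
vc_analytic = 400e-6 * N0_CM3 * SCALE_HEIGHT_CM
print(f"numeric: {vc_numeric:.4e}  analytic: {vc_analytic:.4e} molecules cm^-2")
```

Raising the surface altitude `z_surface_cm` in this toy model lowers the column directly, which is the topography sensitivity that motivates the column-averaged quantities introduced next.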
Column-averaged dry-air mole fractions (DMFs; denoted X$_{\mathrm{gas}}$) are less sensitive to variations in surface pressure and atmospheric water vapour than the retrieved total column amounts. This characteristic is advantageous for carbon cycle studies because it permits direct comparisons of the trace-gas measurements during different seasons, between sites, and with in situ measurements. To calculate DMFs, the total column amount of the gas of interest is divided by the total column amount of dry air, which we obtain from the co-retrieved oxygen (O$_2$) column divided by an assumed dry-air mole fraction of O$_2$ (0.2095):
$$X_{\mathrm{gas}} = \frac{VC_{\mathrm{gas}}}{VC_{\mathrm{O_2}}} \times 0.2095 \qquad (2)$$
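A minimal sketch of equation 2 follows. The column values are invented for illustration, and the function name `dry_air_mole_fraction` is hypothetical. The second call applies the same 1% multiplicative error to both columns to show that an error common to the gas and O$_2$ drops out of the ratio.

```python
O2_DRY_AIR_MOLE_FRACTION = 0.2095  # assumed dry-air mole fraction of O2 (from the text)


def dry_air_mole_fraction(vc_gas, vc_o2):
    """Equation 2: X_gas = VC_gas / VC_O2 * 0.2095 (columns in the same units)."""
    return vc_gas / vc_o2 * O2_DRY_AIR_MOLE_FRACTION


# Illustrative (made-up) column amounts in molecules cm^-2.
vc_co2 = 8.0e21
vc_o2 = 4.19e24

x_co2 = dry_air_mole_fraction(vc_co2, vc_o2)

# A systematic error common to both retrievals (here +1%) cancels in the ratio.
x_co2_common_error = dry_air_mole_fraction(1.01 * vc_co2, 1.01 * vc_o2)

print(f"X_CO2 = {x_co2 * 1e6:.2f} ppm")
print(f"X_CO2 with common 1% error = {x_co2_common_error * 1e6:.2f} ppm")
```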
By ratioing the column amounts, systematic errors that are common to the gas of interest and O$_2$ cancel. The column-averaged amount of dry air (X$_{\mathrm{air}}$) is a special case, and a useful quantity we use to examine station-to-station biases, as it depends only on the surface pressure measurement ($P_s$), oxygen measurement ($VC_{\mathrm{O_2}}$), and water column (X$_{\mathrm{H_2O}}$). The X$_{\mathrm{air}}$ value should therefore be identical at all sites. It is defined as:
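$$X_{\mathrm{air}} = \frac{VC_{\mathrm{air}}}{VC_{\mathrm{O_2}}} \times 0.2095 - X_{\mathrm{H_2O}} \times \frac{m_{\mathrm{H_2O}}}{m_{\mathrm{air}}^{\mathrm{dry}}} \qquad (3)$$

$$VC_{\mathrm{air}} = \frac{P_s}{\{g\}_{\mathrm{air}} \cdot m_{\mathrm{air}}^{\mathrm{dry}} / N_a}$$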
The parameters $m_{\mathrm{H_2O}}$ (18.02 g mol$^{-1}$) and $m_{\mathrm{air}}^{\mathrm{dry}}$ (28.964 g mol$^{-1}$) are the mean molecular masses of water and dry air, respectively, $N_a$ is Avogadro’s constant (6.022 × 10$^{23}$ molecules mol$^{-1}$), and $\{g\}_{\mathrm{air}}$ is the column-averaged gravitational acceleration. X$_{\mathrm{air}}$ is explicitly corrected for the influence of water (second term on the right hand side of equation 3). This correction is required because the surface pressure is enhanced by the atmospheric water content. For an O$_2$ measurement with accurate spectroscopy, surface pressure, and H$_2$O retrievals, X$_{\mathrm{air}}$ would have a value of 1.0. However, the typical X$_{\mathrm{air}}$ value for TCCON measurements is ∼0.98 and exhibits a small diurnal variation because of a ∼2% bias in the O$_2$ spectroscopy that is airmass dependent. Large (∼1%) deviations from 0.98 at any site indicate serious problems such as an error in surface pressure, spectra with ghosts, spectra with a poor instrument optical alignment, or an error in the time assigned to the spectrum causing the software to calculate the incorrect atmospheric path.
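The behaviour of equation 3 can be checked with a small synthetic case. In the sketch below, the dry-air column, water mole fraction, and gravity value are invented, the surface pressure is built from them so that the inputs are self-consistent, and the 2% O$_2$ bias is modelled (as an assumption about its sign) as a retrieved O$_2$ column that is 2% too high. The function names are hypothetical; SI units are used throughout.

```python
N_A = 6.022e23         # Avogadro's constant (molecules mol^-1)
M_H2O = 18.02e-3       # mean molecular mass of water (kg mol^-1)
M_DRY_AIR = 28.964e-3  # mean molecular mass of dry air (kg mol^-1)
O2_DMF = 0.2095        # assumed dry-air mole fraction of O2


def column_air(p_surface_pa, g_column):
    """VC_air = P_s / ({g}_air * m_dry_air / N_a), in molecules m^-2."""
    return p_surface_pa / (g_column * M_DRY_AIR / N_A)


def x_air(p_surface_pa, g_column, vc_o2, x_h2o):
    """Equation 3, with vc_o2 in molecules m^-2 and x_h2o as a mole fraction."""
    dry_term = column_air(p_surface_pa, g_column) / vc_o2 * O2_DMF
    water_term = x_h2o * M_H2O / M_DRY_AIR
    return dry_term - water_term


# Synthetic, self-consistent atmosphere (all values are illustrative).
g = 9.8                   # column-averaged gravity (m s^-2)
vc_dry = 2.1e29           # dry-air column (molecules m^-2)
x_h2o = 0.003             # column-averaged H2O dry-air mole fraction
vc_h2o = x_h2o * vc_dry   # water column (molecules m^-2)

# Surface pressure is the weight per unit area of dry air plus water.
p_s = g * (vc_dry * M_DRY_AIR + vc_h2o * M_H2O) / N_A

x_air_ideal = x_air(p_s, g, O2_DMF * vc_dry, x_h2o)          # perfect O2 retrieval
x_air_biased = x_air(p_s, g, 1.02 * O2_DMF * vc_dry, x_h2o)  # O2 column 2% high

print(f"ideal X_air  = {x_air_ideal:.4f}")
print(f"biased X_air = {x_air_biased:.4f}")
```

With consistent inputs the water term exactly removes the water contribution to surface pressure, and the biased case lands near the ∼0.98 quoted above.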
The dry-air mole fractions are passed through a set of post-processing routines, which include an airmass dependence correction, and a bias correction which ties the TCCON data to the currently-accepted World Meteorological Organization (WMO) scale through comparisons with WMO-calibrated in situ profile measurements obtained from aircraft or balloons.
There have been several important updates to the TCCON data processing software since the GGG2012 data version was released. These are: an interferogram resampling algorithm to correct for laser sampling errors (§2); improvements in the spectroscopy for CO, CH$_4$ and its isotopologues, H$_2$O, and N$_2$O (§3); improvements in the CH$_4$, HF, and N$_2$O a priori profiles (§4); and the ability to fit curvature in a spectrum’s continuum (§5). As with all new versions of the TCCON data, an updated airmass dependence correction is calculated (§6) and the data are tied to the WMO scale through comparisons with co-located profiles measured by WMO-traceable instrumentation flown on aircraft and balloon platforms (§7). A GGG2014 error budget is presented in §8.
2 Laser Sampling Error Correction
The interferogram-to-spectrum (I2S) subroutine of GGG reads the raw data (the “interferogram”), applies a source intensity brightness correction [Keppel-Aleks et al., 2007], a laser sampling error correction (new in GGG2014), a phase correction [Mertz, 1967], and a fast Fourier transform [Bergland, 1969] to compute the spectrum. The cause of the laser sampling error and its new correction method are described in this section.
The main instrument at each TCCON site is a Bruker 125HR Fourier transform spectrometer (FTS). There is a small subset of TCCON sites with an earlier model (the 120HR). Most of the sites possessing a 120HR FTS have upgraded electronics to 125HR-equivalents, and all the subsequent discussion is relevant to those sites. Sites with the 120HR without the upgrade include Lauder prior to 2011 (referred to as lauder01), Tsukuba prior to 2011 (tsukuba01), and Ny-Ålesund prior to 2012.
In all TCCON FTS instruments, a HeNe metrology laser with wavelength 632.8 nm (15798 cm$^{-1}$) is passed through the interferometer simultaneously with the solar (IR) beam and produces a sine wave output, which is used to sample the IR signal at precise and evenly spaced optical path differences (OPD). The resulting digitized IR signal is the interferogram. At most TCCON stations, data are recorded simultaneously on two detectors: an InGaAs detector, which is optically sensitive from 3800 cm$^{-1}$ to 11000 cm$^{-1}$, and a Si detector, spanning 11000 cm$^{-1}$ to 15000 cm$^{-1}$ (Figure 2). All TCCON sites make measurements with an InGaAs detector, but not all are equipped with a Si detector. Because the InGaAs detector’s spectral range spans both halves of the Nyquist range (i.e., the detector measures both above and below half the metrology laser wavenumber, 7899 cm$^{-1}$), the metrology laser must be sampled twice per laser cycle to uniquely resolve all frequencies. This is done by sampling the IR signal on both the rising and falling zero-crossings of the DC-coupled laser signal.

Figure 2: The spectral range covered by most TCCON stations. The InGaAs detector (green) covers 3800 cm$^{-1}$–11000 cm$^{-1}$, and the Si detector (blue) covers 11000 cm$^{-1}$–15000 cm$^{-1}$. The folding frequency at 7899 cm$^{-1}$ is marked by the grey vertical line. Note that the InGaAs spectral region spans above and below the folding frequency (half the laser wavenumber), requiring that the metrology laser is sampled twice per laser fringe. The Si spectral region is entirely above the folding frequency. Absorption features of the main gases of interest are marked by arrows. The O$_2$ band used in equation 2 is the band in the InGaAs region. H$_2$O and HDO are not shown because they absorb in many regions of the spectrum. The blacked-out region of the spectrum used in the Dohe et al. [2013] laser sampling error correction is marked in black (7290–7360 cm$^{-1}$).
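The sampling argument above can be checked with simple arithmetic. The sketch below assumes that sampling $k$ times per laser fringe gives an unambiguous band of width $k \times 15798 / 2$ cm$^{-1}$, and that a detector band is resolved uniquely only if it lies within a single such band. The helper names `alias_width` and `fits_in_one_alias` are hypothetical.

```python
import math

LASER_WAVENUMBER = 15798.0  # HeNe metrology laser (cm^-1), from the text


def alias_width(samples_per_fringe):
    """Width (cm^-1) of one alias when sampling this many times per laser fringe."""
    return LASER_WAVENUMBER * samples_per_fringe / 2.0


def fits_in_one_alias(band_lo, band_hi, samples_per_fringe):
    """True if the band [band_lo, band_hi] lies inside a single alias."""
    width = alias_width(samples_per_fringe)
    return math.floor(band_lo / width) == math.floor(band_hi / width)


INGAAS_BAND = (3800.0, 11000.0)  # cm^-1
SI_BAND = (11000.0, 15000.0)     # cm^-1

# Once per fringe: folding frequency at 7899 cm^-1, which InGaAs straddles.
print("InGaAs, 1 sample/fringe :", fits_in_one_alias(*INGAAS_BAND, 1))
# Twice per fringe: the whole 0-15798 cm^-1 range is unambiguous.
print("InGaAs, 2 samples/fringe:", fits_in_one_alias(*INGAAS_BAND, 2))
# The Si band sits entirely above 7899 cm^-1, so one sample per fringe suffices.
print("Si, 1 sample/fringe     :", fits_in_one_alias(*SI_BAND, 1))
```

The last line is the property that a Si-based correction can exploit: the Si band is over-sampled when the laser is sampled twice per fringe.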
Messerschmidt et al. [2010] found that a faulty electronics board in Bruker 125HR Fourier transform spectrometers caused an error in the sampling of the metrology laser. In a correctly-sampled interferometer, samples on rising and falling zero-crossings of the DC-coupled laser signal are evenly spaced in OPD. In the faulty boards, however, the zero level was incorrect (it was not halfway between the peak and trough of the laser signal), and so the spacing of the samples was not even in optical path difference. This causes a small fraction of the spectral information above the Nyquist frequency (7899 cm$^{-1}$) to be folded back onto the spectrum below 7899 cm$^{-1}$, and vice versa. This creates spurious signal called “ghosts” [Brault, 1996] that interfere with the spectral fitting. The retrievals of O$_2$ are most impacted by ghosts, as its absorption band straddles the 7899 cm$^{-1}$ folding frequency (Figure 2). The magnitude of the ghosts relative to the desired (“parent”) spectrum is proportional to the magnitude of the laser sampling error. Figure 3 shows a parent region (8438–8508 cm$^{-1}$) that creates visible ghosts in a typically blacked-out region of the main spectrum (7290–7360 cm$^{-1}$).

Figure 3: The top panel (“Parent Spectrum”) shows the parent region (8438–8508 cm$^{-1}$) with reversed x-axis. The middle panel (“Original Spectrum”) shows its ghost in the blacked-out region of the spectrum (7290–7360 cm$^{-1}$). The bottom panel (“Ghost-Corrected Spectrum”) shows the same blacked-out region of the spectrum after the I2S correction has been applied. These plots were created using averaged Lamont spectra on 2008/09/20, before the new laser sampling board was installed.
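The folding geometry and the proportionality between ghost size and sampling error can be reproduced with a toy signal. The sketch below is not the I2S code: it samples a single cosine twice per fringe, displaces every odd sample by a fraction `delta` of the sample spacing, and measures the amplitudes at the parent and mirrored frequencies with a direct DFT. All names and the chosen bin numbers are illustrative assumptions.

```python
import cmath
import math

LASER_WAVENUMBER = 15798.0  # cm^-1; the spectrum folds about half this value


def ghost_wavenumber(parent_wavenumber):
    """Mirror image of a parent feature about the 7899 cm^-1 folding frequency."""
    return LASER_WAVENUMBER - parent_wavenumber


def dft_amplitude(samples, k):
    """Amplitude of the cosine component in DFT bin k (direct sum, no FFT)."""
    n = len(samples)
    acc = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
    return 2.0 * abs(acc) / n


def ghost_to_parent_ratio(delta, n_points=512, k_parent=137):
    """Ghost/parent amplitude when odd samples are displaced by delta sample spacings."""
    f = k_parent / n_points  # cycles per sample; 0.5 corresponds to 15798 cm^-1
    samples = [
        math.cos(2.0 * math.pi * f * (i + (delta if i % 2 else 0.0)))
        for i in range(n_points)
    ]
    k_ghost = n_points // 2 - k_parent  # bin mirrored about the folding frequency
    return dft_amplitude(samples, k_ghost) / dft_amplitude(samples, k_parent)


# The parent region quoted in the text maps onto the blacked-out ghost region.
print(ghost_wavenumber(8438.0), ghost_wavenumber(8508.0))

ratio_small = ghost_to_parent_ratio(0.01)
ratio_large = ghost_to_parent_ratio(0.02)
print(f"ghost/parent: {ratio_small:.5f} (delta=0.01), {ratio_large:.5f} (delta=0.02)")
```

In this toy model, doubling the sampling error doubles the ghost-to-parent ratio to first order, and a correctly sampled signal (`delta = 0`) produces no ghost.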
These ghosts can cause biases in X$_{\mathrm{CO_2}}$ of more than 1 ppm, but most TCCON instruments suffered from biases that were smaller. These errors, however, are site-dependent and can be time-dependent, as metrology lasers age and are replaced. As a consequence of the work of Messerschmidt et al. [2010], Bruker provided all TCCON partners with replacement laser sampling electronics boards to minimize this problem. Most TCCON sites installed the new boards in 2011, and since that time the laser sampling errors have been negligible. However, data recorded prior to the board replacement require correction.
There are two methods of correcting the TCCON spectra, the first of which is already described by Dohe et al. [2013]. This method uses spectra throughout the TCCON site’s time series that are sufficiently wet (or of high enough airmass) to fully absorb a particular region of the spectrum (typically 7290–7360 cm$^{-1}$, see Figure 2). The interferograms corresponding to the “wet” spectra are averaged over short time periods and passed through an algorithm that minimizes any spurious signal in the blacked-out region by iteratively adjusting the resampled spectral spacing. The amount by which the interferogram must be resampled to minimize the spurious signal is then recorded and applied to all spectra. This method is intuitive and successful at minimizing ghosts, but is sensitive to detector non-linearities, requires a priori knowledge of the atmospheric H$_2$O amount, and cannot be performed on individual spectra due to insufficient signal-to-noise.

Figure 4: Retrieved X$_{\mathrm{air}}$ values, plotted against UTC time, during a day at Lauder (2010-10-05) where measurements were recorded while deliberately alternating the magnitude of the ghost (triangles) by modifying the scanning speed of the instrument. There are three sets of uncorrected measurements: the 40 kHz set had the largest laser sampling error, the 10 kHz set had a smaller but non-zero laser sampling error, and the 20 kHz set had minimal laser sampling errors. The measurements corrected for ghosts with the I2S method (squares) show good agreement with the measurements with minimal laser sampling errors.
The second method of correcting the TCCON spectra makes use of the simultaneously-measured Si detector and will be referred to here as the “I2S” correction method. Because the Si detector spectral range is wholly contained in the upper half of the alias, but still sampled twice per laser wavelength, even and odd points in the interferogram each fully describe the resulting Si spectrum. Therefore, calculating the angle between the phase curves of the even-only and odd-only points of the Si interferogram provides a direct measure of the laser sampling error. This method does not require any a priori knowledge of the spectrum itself (i.e., whether it has blacked-out regions), can be performed without iteration on every Si spectrum irrespective of the H$_2$O amount, and is insensitive to detector non-linearities. Because the Si and InGaAs spectra are recorded simultaneously and are digitized using the same laser signal, we can apply the laser sampling error calculated from the Si data directly to the InGaAs data. Figure 3 shows the results of the I2S method on the spectrum itself, and Figure 4 shows X$_{\mathrm{air}}$ retrievals from the Lauder 125HR (lauder02) instrument. The original Lauder X$_{\mathrm{air}}$ measurements cycled through three different values of the laser sampling error, induced by changing the scanning speed of the FTS. The X$_{\mathrm{air}}$ values, which should vary only slowly throughout the day, have marked discontinuities when the scan speed is changed. After applying the I2S correction, the retrieved X$_{\mathrm{air}}$ differences are negligible when the scan speed changes.

Figure 5: Retrieved values of (a) X$_{\mathrm{air}}$ and (b) X$_{\mathrm{CO_2}}$ at Lauder from measurements recorded at 10 kHz, which had small but non-zero laser sampling errors. The red circles show the mean biases caused by the ghosts. The black circles show the mean differences between the I2S and Dohe et al. correction methods, which are an order of magnitude smaller than the biases caused by the ghosts. Shown in (a), the ghosts cause a bias in X$_{\mathrm{air}}$ of 0.0026; the difference between the I2S and Dohe et al. methods is -0.0002. In (b), for X$_{\mathrm{CO_2}}$: 0.77 ppm, and -0.07 ppm.
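The idea behind the even/odd phase comparison can be illustrated with a single-frequency toy signal. This is a sketch under strong simplifying assumptions, not the I2S implementation: the real method compares phase curves across the Si band, whereas here one cosine in the upper half of the alias stands in for the Si spectrum, and the odd samples are displaced by `delta` sample spacings. The function names are hypothetical.

```python
import cmath
import math


def sample_signal(n_points, k, delta):
    """Cosine at k/n_points cycles per sample; odd samples are displaced by delta."""
    f = k / n_points
    return [
        math.cos(2.0 * math.pi * f * (i + (delta if i % 2 else 0.0)))
        for i in range(n_points)
    ]


def bin_value(samples, k):
    """Complex DFT value of `samples` at bin k (direct sum)."""
    m = len(samples)
    return sum(x * cmath.exp(-2j * math.pi * k * j / m) for j, x in enumerate(samples))


def estimate_sampling_error(samples, k):
    """Recover delta from the phase angle between odd-only and even-only points."""
    f = k / len(samples)
    even_points = samples[0::2]
    odd_points = samples[1::2]
    # Each half-rate subsequence still resolves the signal because it occupies
    # only the upper half of the alias; the odd points lag by 2*pi*f*(1 + delta).
    angle = cmath.phase(bin_value(odd_points, k) / bin_value(even_points, k))
    return angle / (2.0 * math.pi * f) - 1.0


# Bin 200 of 512 corresponds to roughly 12340 cm^-1, inside the Si band.
N_POINTS, K_SI = 512, 200
recovered = estimate_sampling_error(sample_signal(N_POINTS, K_SI, 0.01), K_SI)
recovered_clean = estimate_sampling_error(sample_signal(N_POINTS, K_SI, 0.0), K_SI)
print(f"recovered delta = {recovered:.6f}, clean case = {recovered_clean:.2e}")
```

No iteration and no knowledge of the spectral content is needed in this toy case, which mirrors the advantages claimed for the method above.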
The main limitation of the I2S method is that the interferogram measured simultaneously with the InGaAs interferogram must be sufficiently over-sampled that the spectral signal occupies less than half the alias. At most TCCON stations, a Si diode detector is measured simultaneously with InGaAs and provides this over-sampled measurement. Karlsruhe is the exception, which simultaneously measures InGaAs and InSb, where the InSb range is limited by an optical filter to wavenumbers less than 7899 cm$^{-1}$. Over 90% of the TCCON data recorded have been corrected by the I2S method in the GGG2014 data version. Sites that do not possess the InGaAs/Si detector pair are: Eureka (eureka01), Ny-Ålesund (nyalesund01), Bremen (bremen01), early Tsukuba (tsukuba01), early Lauder (lauder01), and Izaña (izana01). The Eureka instrument was installed with the new electronics board and does not suffer from ghosts. The tsukuba01, lauder01 and early Ny-Ålesund 120HR instruments have different electronics from the Bruker 125HR, and likely do not suffer from ghosts. Bremen and Izaña use the Dohe et al. method to correct their ghosts. Figures 5(a)–(e) compare the Dohe et al. and I2S methods of correcting the laser sampling errors: the differences between the two methods are negligibly small and an order of magnitude smaller than the errors induced by the uncorrected ghosts.

Figure 5 (continued): (c) X$_{\mathrm{CH_4}}$, (d) X$_{\mathrm{N_2O}}$, and (e) X$_{\mathrm{CO}}$. In (c), for X$_{\mathrm{CH_4}}$: 3.62 ppb, and -0.34 ppb. In (d), for X$_{\mathrm{N_2O}}$: 0.74 ppb, and -0.07 ppb. In (e), for X$_{\mathrm{CO}}$: 0.12 ppb, and -0.01 ppb.
3 Updated Spectroscopy
GGG uses several types of spectroscopic line lists: atmospheric (telluric), solar, collision-induced absorption, self-induced absorption, and pseudo line lists. This section will discuss the updates to the atmospheric line list and the solar line list, which are most relevant to TCCON. The atmospheric line list [atm.101, Toon, 2014a] is available from http://dx.doi.org/10.14291/tccon.ggg2014.atm.R0/1221656. The solar line list [solar_merged.108, Toon, 2014b] is available from http://dx.doi.org/10.14291/tccon.ggg2014.solar.R0/1221658.
3.1 Atmospheric Line List
The main updates to the atmospheric line list for the GGG2014 software release are in the H$_2$O, CO, and $^{13}$CH$_4$ spectroscopy. An additional (10th) CO$_2$ isotopologue was added from the latest HITRAN database [HITRAN2012, Rothman et al., 2013]. There were no other changes made to the CO$_2$ and O$_2$ spectroscopy. The windows fitted in GGG2014 are listed in Table 3. Below is an itemized list of changes to the spectroscopic line list.
H$_2$O: There were many changes made throughout the 4000–6000 cm$^{-1}$ region based on fits made to Kitt Peak laboratory spectra. These changes not only affect the retrievals of H$_2$O and HDO, but also CO, N$_2$O, CH$_4$, and HF due to the water absorption in their windows.
CH$_4$: A line list corresponding to $^{13}$CH$_4$ from HITRAN2012 (and tweaks) was added, which impacts the CO and CH$_4$ windows. Several $^{13}$CH$_4$ lines were mis-identified as $^{12}$CH$_4$ in the GGG2012 line list; these mis-identified lines were removed.
CO: We have adopted the HITRAN2012 line list, which has narrower widths than the HITRAN2008 line list used in GGG2012. After the combined changes from H$_2$O, $^{13}$CH$_4$, and CO, the two CO windows produce more consistent retrieval results. For GGG2012, there was a 3–5% difference in the retrieved CO columns between windows. For GGG2014, the difference has reduced by about an order of magnitude, to 0.1–0.6%.
N$_2$O: In addition to the improved H$_2$O line list in the region, we have added an additional retrieval window (centred at 4719 cm$^{-1}$) to improve the robustness of the N$_2$O retrievals.
3.2 Solar Line List
Absorption and emission lines from the sun itself are modeled in GGG using a solar line list that contains more than 40,000 lines covering 600–25,000 cm$^{-1}$. Combined with a simple empirical line shape model, the line list can be used to generate a solar pseudo-transmittance spectrum at any spectral grid, for disk-center, disk-integrated, and intermediate cases. The line list was calculated from high-resolution spectra from Kitt Peak [Wallace and Livingston, 2003], MkIV balloon [Toon, 1991], TCCON, and ATMOS [Irion et al., 2002], and was subsequently validated by ACE-FTS [Bernath, 2005], TCCON, GOSAT [Hamazaki, 2005; Kuze et al., 2009], Kitt Peak [Kurucz, 2005, 2008], and SolSpec [Thuillier et al., 2003] measurements. The GGG2014 solar line list is very similar to the GGG2012 solar line list in the TCCON frequency range.