GISTEMP: Verify Operation and Establish Baseline

March 5, 2009
This is the second in a series of articles about GISTEMP, a software system created by the NASA Goddard Institute for Space Studies (GISS) that is used to produce the GISS Surface Temperature Analysis.  In the previous article, we gave you an overview of what to expect if you want to install the package on your system and get it operational.  This article will explain how we verified that our installation was functioning properly before moving on to our real objective, which is to evaluate what effect various changes to some of the assumptions, decisions, judgement calls, parameters, and functionality built into the software have on the temperature analysis.  The modifications and their effects will be the topics of future articles.

After spending a considerable amount of time gathering the input data files, getting the system running, and walking through each step trying to understand what it was doing and how, we finally did a full end-to-end run, proceeding sequentially through all of the steps, and produced a set of output files.  When you have successfully completed a run, you end up with a set of four files in the STEP3/results directory that represent the land-only temperature anomalies for the globe, the Northern Hemisphere, the Southern Hemisphere, and specific latitudinal zones.  You also have another set of four files in the STEP4_5/results directory that represent the land plus ocean temperature anomalies for the same areas.  Other files are produced as well, but they are not relevant for the purpose of this article.

In order to verify that the system was indeed functioning as intended, the output we produced was compared to the output available on the GISS web site (scroll down near the bottom of the page to find the four land and four land plus ocean text files).  We ran the "diff" command on each of our land-only files against the corresponding GISS files and inspected all lines identified as different by hand to see which values did not match.  In less than 4% of the total monthly and annual temperature anomaly values generated, minor variations of .01C were detected.  In the majority of cases our value was .01C lower than the GISS value, although there were several places where our value was .01C higher.  Performing the same check on the land plus ocean files generated significantly more output, too much to inspect by hand as we had previously done. 
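The line-level check described above can be approximated in a few lines of Python.  This is only a sketch of the idea, not what we actually ran (that was the standard diff command).  The function name changed_lines and the sample rows are made up for illustration and are not real GISTEMP output.

```python
import difflib


def changed_lines(giss_lines, our_lines):
    """Return (giss_block, our_block) pairs for every run of lines that differs."""
    matcher = difflib.SequenceMatcher(a=giss_lines, b=our_lines, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changes.append((giss_lines[i1:i2], our_lines[j1:j2]))
    return changes


# Hypothetical sample rows, not real GISTEMP output.
giss = ["1880 -28 -19", "1881 -10  -5", "1882   3   4"]
ours = ["1880 -28 -19", "1881 -11  -5", "1882   3   4"]

for giss_block, our_block in changed_lines(giss, ours):
    print("GISS:", giss_block)
    print("ours:", our_block)
```

Like diff, this only reports which lines differ.  Finding the individual values that do not match within a line still has to be done by eye or by a further script.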

The original temperature anomaly files from the GISS web site that were used to do the comparison to our output were saved as part of the baseline and are available for review:

GISS Land-only anomaly files:  Global, Northern Hemisphere, Southern Hemisphere, Zonal
GISS Land and Sea anomaly files:  Global, Northern Hemisphere, Southern Hemisphere, Zonal

To eliminate the tedium and automate the process, we developed some scripts.  The first script just does a full run of GISTEMP.  The information that was displayed on the screen during the initial run was captured and stored as part of the baseline. Each of the eight temperature anomaly files produced is also stored in our baseline to be used for comparison purposes during future testing.  Those files are available for review:
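Capturing the screen output of a run for the baseline might look something like the following sketch.  It is not our actual script.  The helper name run_and_capture, the log file name, and the stand-in command are all assumptions for illustration; a real run would pass whatever command starts GISTEMP on your system.

```python
import os
import subprocess
import sys
import tempfile


def run_and_capture(command, log_path):
    """Run a command, write its combined stdout/stderr to log_path, return the exit code."""
    with open(log_path, "w") as log:
        result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
    return result.returncode


# Demonstration with a stand-in command (hypothetical); substitute the
# command that performs a full run on your system.
log_path = os.path.join(tempfile.gettempdir(), "gistemp_run_demo.log")
exit_code = run_and_capture([sys.executable, "-c", "print('step output')"], log_path)
print("exit code:", exit_code)
```

Note that this version sends the output only to the log file rather than also echoing it to the screen.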

YVM Land-only anomaly files:  Global, Northern Hemisphere, Southern Hemisphere, Zonal
YVM Land and Sea anomaly files:  Global, Northern Hemisphere, Southern Hemisphere, Zonal

The next script we wrote, compare_output, compares two temperature anomaly files and prints out details about each of the differences detected.  Another script, high_low_count, reads the discrepancies identified by compare_output and counts how often the differences are higher and lower than the GISS data.  The last script we wrote, run_compare, invokes compare_output and high_low_count for each of the eight files we are interested in to produce a summary or detail report about the differences between our output and the corresponding files from GISS.  The detail report was captured and stored as part of the baseline.  The summary output from run_compare when using the GISS files as the baseline for comparison is as follows:

		STEP3 Global anomalies - Found 69 total differences, 0 >.01C
		Higher than baseline 56 times, lower 13 times

		STEP3 NH anomalies - Found 142 total differences, 0 >.01C
		Higher than baseline 116 times, lower 26 times

		STEP3 SH anomalies - Found 33 total differences, 0 >.01C
		Higher than baseline 12 times, lower 21 times

		STEP3 Zonal anomalies - Found 72 total differences, 0 >.01C
		Higher than baseline 45 times, lower 27 times

		STEP4_5 Global anomalies - Found 510 total differences, 28 >.01C
		Higher than baseline 179 times, lower 331 times
		Big differences higher 14 times, lower 14 times

		STEP4_5 NH anomalies - Found 755 total differences, 126 >.01C
		Higher than baseline 330 times, lower 425 times
		Big differences higher 53 times, lower 73 times

		STEP4_5 SH anomalies - Found 342 total differences, 38 >.01C
		Higher than baseline 91 times, lower 251 times
		Big differences higher 6 times, lower 32 times

		STEP4_5 Zonal anomalies - Found 396 total differences, 49 >.01C
		Higher than baseline 164 times, lower 232 times
		Big differences higher 26 times, lower 23 times
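The counting behind a report like this can be sketched as follows.  This is not the code of our scripts, only a minimal illustration under stated assumptions: both files have the same layout, values are whitespace-separated integers in hundredths of a degree C, non-numeric tokens (header text, missing-value markers) are skipped, and "higher" means our value is above the baseline value.  The function names compare_text and summarize and the sample data are hypothetical.

```python
def compare_text(our_text, baseline_text):
    """Count value differences between two anomaly tables with the same layout."""
    stats = {"total": 0, "higher": 0, "lower": 0,
             "big": 0, "big_higher": 0, "big_lower": 0}
    line_pairs = zip(our_text.splitlines(), baseline_text.splitlines())
    for our_line, base_line in line_pairs:
        for our_tok, base_tok in zip(our_line.split(), base_line.split()):
            try:
                delta = int(our_tok) - int(base_tok)
            except ValueError:
                continue  # header text or a missing-value marker
            if delta == 0:
                continue
            side = "higher" if delta > 0 else "lower"
            stats["total"] += 1
            stats[side] += 1
            if abs(delta) > 1:  # more than .01C when values are hundredths
                stats["big"] += 1
                stats["big_" + side] += 1
    return stats


def summarize(label, stats):
    """Format the counts in the style of the summary report."""
    lines = [
        f"{label} - Found {stats['total']} total differences, {stats['big']} >.01C",
        f"Higher than baseline {stats['higher']} times, lower {stats['lower']} times",
    ]
    if stats["big"]:
        lines.append(
            f"Big differences higher {stats['big_higher']} times, "
            f"lower {stats['big_lower']} times"
        )
    return "\n".join(lines)


# Made-up sample data, not real GISTEMP output.
ours = "1880 -28 -19\n1881 -10 **** 7\n"
baseline = "1880 -27 -19\n1881 -13 **** 6\n"
sample_stats = compare_text(ours, baseline)
print(summarize("Sample anomalies", sample_stats))
```

Comparing token by token within each line keeps a missing-value marker in one row from shifting the alignment of every value after it, as long as the two files mark the same entries as missing.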

There are a total of 17,504 temperature anomaly values in the eight files.  The comparison identifies 2,319 temperature anomaly values as being different between our files and the GISS files, 13.25% of the total number of values.  Of those differences, 2,078 (89.6% of the anomaly values that did not match) were off by .01C and 241 (1.38% of the total temperature anomaly values) were off by .02C or more.  The land-only files were the closest match, with just 316 values that differed from the temperature anomalies computed by GISS, all of them by .01C.  The land plus sea files contained 2,003 values that differed from those computed by GISS, including all of the 241 values that varied by .02C or more.  A further analysis of the land plus sea temperature anomaly discrepancies identified 187 that were off by .02C, 30 off by .03C, 15 off by .04C, 5 off by .05C, 1 off by .06C, 1 off by .07C, and 2 that were off by .08C.
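The totals in the preceding paragraph follow directly from the per-file summary.  The short Python check below recomputes them from the counts in the report and the stated total of 17,504 values.  The variable names are ours for illustration only.

```python
# (total differences, differences larger than .01C) per file, from the summary report
summary = {
    "STEP3 Global": (69, 0),
    "STEP3 NH": (142, 0),
    "STEP3 SH": (33, 0),
    "STEP3 Zonal": (72, 0),
    "STEP4_5 Global": (510, 28),
    "STEP4_5 NH": (755, 126),
    "STEP4_5 SH": (342, 38),
    "STEP4_5 Zonal": (396, 49),
}
TOTAL_VALUES = 17504  # anomaly values across all eight files

land_only = sum(t for name, (t, _) in summary.items() if name.startswith("STEP3 "))
land_sea = sum(t for name, (t, _) in summary.items() if name.startswith("STEP4_5 "))
total_diffs = land_only + land_sea
big_diffs = sum(big for _, big in summary.values())
small_diffs = total_diffs - big_diffs

# Breakdown of the larger land plus sea discrepancies, keyed by hundredths of a degree C
breakdown = {2: 187, 3: 30, 4: 15, 5: 5, 6: 1, 7: 1, 8: 2}

print("land-only differences:", land_only)
print("land plus sea differences:", land_sea)
print("total differences:", total_diffs)
print("off by .01C:", small_diffs)
print("off by .02C or more:", big_diffs)
print("breakdown matches:", sum(breakdown.values()) == big_diffs)
print(f"share of all values that differ: {100 * total_diffs / TOTAL_VALUES:.2f}%")
print(f"share of differences that are .01C: {100 * small_diffs / total_diffs:.1f}%")
print(f"share of all values off by .02C or more: {100 * big_diffs / TOTAL_VALUES:.2f}%")
```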

We determined that all of the discrepancies could be attributed to rounding errors, most likely caused by differences between the floating point hardware in the system we used and the system used by GISS.  It is also possible that GISS used a slightly different version of the SBBX.HadR2 input data file than the one we used, which may have been the cause of some of the larger discrepancies.  In any event, we were satisfied that our installation was operating properly and that we had established a suitable baseline environment for further research into GISTEMP.  The original output files from the GISS web site are archived in our baseline directory for historical purposes along with the original input data files used.  All future research will be conducted using the same input data files, with the output files produced during our initial run, as described and available above, serving as the baseline for comparison after modifications are made to the software or other files.

Update
After publishing our article on base period selection and writing scripts to extract data and import it into spreadsheets in order to create charts, we received a request to produce charts displaying the differences between our baseline and the original GISS data.  The following global charts graphically portray the difference between our baseline and the files originally downloaded from the GISS web site.

As can be seen in the global land-only chart, the differences are barely blips.

[Chart: USHCN raw global changes]

The global land and sea chart also demonstrates that the differences between our baseline and the GISS data are merely inconsequential blips.

[Chart: USHCN raw global changes]

An Excel file containing the data and all of the charts was created and incorporated into our baseline.