I have been playing around with the top 100 (cherry picked) Hockey Stick Index (HSI) pseudo-proxies, which are all that McIntyre and McKitrick supply in the supplementary data for their 2005 paper in GRL. In doing so, I noticed certain defects in the Hockey Stick Index they used. Of these, the most glaring is that for any straight line with any slope other than zero (flat) or infinite (vertical), it indicates that the straight line is a hockey stick. Even with white noise added, so long as the signal-to-noise ratio is at least one, the line will probably (>50% chance) be given an HSI greater than 1, the conventional benchmark used by McIntyre to indicate something is a hockey stick.

Here is an example of a straight line "hockey stick":

In this case, the HSI is less than that for MBH98 or 99, but the mean of the 132 realizations is greater. That is,

**according to the M&M05 HSI, a straight line with white noise and a S/N ratio of 1.25 or more is more like a hockey stick than are MBH98 and 99**.

This fact does not depend in any way on the slope (provided it is neither flat nor vertical). Negative slopes will yield negative HSIs, but M&M05 (correctly) regard negative HSIs as equivalent to positive values, in that the MBH98 reconstruction method flips the sign on proxies if that yields a better fit to the temperature data (which is not an error).
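The slope independence can be checked numerically. The following is a minimal sketch, not M&M05's code. It assumes the HSI is the 1902-1980 mean minus the 1400-1980 mean, divided by the 1400-1980 standard deviation, and it assumes "S/N" means the ratio of the line's standard deviation to the noise standard deviation. The function names `hsi` and `noisy_line` are made up for this illustration.

```python
import numpy as np

YEARS = np.arange(1400, 1981)   # 581 annual values, 1400-1980
CALIB = YEARS >= 1902           # calibration period, 1902-1980

def hsi(series):
    """Assumed HSI: (calibration mean - full-period mean) / full-period std."""
    series = np.asarray(series, dtype=float)
    return (series[CALIB].mean() - series.mean()) / series.std()

def noisy_line(slope, snr, rng):
    """Straight line plus white noise; snr = std of line / std of noise (assumption)."""
    line = slope * (YEARS - YEARS[0])
    noise = rng.normal(0.0, line.std() / snr, size=YEARS.size)
    return line + noise

rng = np.random.default_rng(0)
for slope in (0.001, 1.0, -1.0):
    pure = hsi(slope * (YEARS - YEARS[0]))
    noisy = np.mean([hsi(noisy_line(slope, 1.0, rng)) for _ in range(200)])
    print(f"slope={slope:+.3f}  pure HSI={pure:+.3f}  mean noisy HSI={noisy:+.3f}")
```

Under these assumptions a noiseless line scores about 1.50 in magnitude whatever its slope, because the slope cancels between numerator and denominator; only the sign follows the slope.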

From this it follows that the HSI developed by M&M cannot consistently distinguish between a straight line and a hockey stick shape. I suspect there are other shapes that it cannot distinguish either, but for now we need only consider the straight line.

**That means that, from the M&M05 HSI, we are unable to determine whether or not half of the 10,000 pseudo proxies are distinguishable from a straight line. Nor, using that index, are we able to distinguish MBH98 from a straight line.** As a statistical test of the tendency of short centered PCA to generate shapes similar to that of MBH98, the test is therefore totally without power.

**It tells you absolutely nothing.**

The total statistical power of the first part of M&M05, it turns out, comes from the visual comparison between MBH98 (fig 2) and the MBH first Principal Component of the North American Tree Ring Network (fig 3). That's it. And as everybody should know, eyeball Mark 1 has very little statistical power as well.

Not being content with finding a flaw in M&M05's statistical test, I looked to see if they could have done better. In the end I developed five variant Hockey Stick Indexes (vHSI) that were superior as a statistical test of a hockey stick shape (although not necessarily under all circumstances). These were,

- The ratio of the standard deviation of the calibration period (1902-1980) to the standard deviation of the non-calibration period. This tests for flatness in the "handle" vs noisiness or a high relative slope in the "blade". Like the M&M HSI, it will only work well when the "handle" is flat, but will work better in that circumstance.
- The angle formed by the slope in the calibration period relative to the slope of the non-calibration period if the two are displaced to intersect at the first year of the calibration period. This tests merely for the angle between "blade" and "shaft" and will work well regardless of orientation. It will not tell you how flat the "handle" is, however, and so can be confused by "hockey sticks" with very crooked "handles".
- The closeness of the largest inflection point in the period 1850-1900 to the start of the calibration period. The inflection point is defined as the start year of the largest 50 year trend starting in that period. The index is defined as the difference between the inflection point and 1850 divided by the square root of the difference between the start of the calibration period and 1850. (not shown)
- The angle formed by the slope to 1850 and the fifty year trend from the inflection point. This again works best with a flat "handle".
- The inflection point angle weighted by the inflection point index.
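As a rough illustration of the first three definitions, here is a minimal sketch. It is one reading of the descriptions above, not the code behind the results reported in this post. All function names are hypothetical, "largest trend" is taken to mean largest in magnitude, and the angle depends on how the series is scaled, which is not specified here.

```python
import numpy as np

YEARS = np.arange(1400, 1981)
CAL_START = 1902

def trend(years, values):
    """Least-squares slope of values against (centred) years."""
    return np.polyfit(years - years.mean(), values, 1)[0]

def sd_ratio_index(series):
    """Std of the calibration period divided by std of the earlier period."""
    cal = YEARS >= CAL_START
    return series[cal].std() / series[~cal].std()

def angle_index(series):
    """Angle in degrees between the calibration trend and the earlier trend."""
    cal = YEARS >= CAL_START
    blade = trend(YEARS[cal], series[cal])
    shaft = trend(YEARS[~cal], series[~cal])
    return np.degrees(np.arctan(blade) - np.arctan(shaft))

def inflection_year(series, window=50, first=1850, last=1900):
    """Start year in [first, last] of the largest-magnitude `window`-year trend."""
    best_year, best_slope = first, 0.0
    for start in range(first, last + 1):
        sel = (YEARS >= start) & (YEARS < start + window)
        slope = trend(YEARS[sel], series[sel])
        if abs(slope) > abs(best_slope):
            best_year, best_slope = start, slope
    return best_year

def inflection_index(series):
    """(inflection year - 1850) / sqrt(calibration start - 1850)."""
    return (inflection_year(series) - 1850) / np.sqrt(CAL_START - 1850)

# Synthetic test shapes (illustrative only, not real proxy data).
rng = np.random.default_rng(0)
line = 0.05 * (YEARS - YEARS[0])
stick = np.where(YEARS >= CAL_START, 0.05 * (YEARS - CAL_START), 0.0)
stick = stick + rng.normal(0.0, 0.1, YEARS.size)
for name, s in (("line", line), ("stick", stick)):
    print(name, sd_ratio_index(s), angle_index(s), inflection_index(s))
```

On these synthetic shapes the straight line gets a standard deviation ratio below one and an angle of essentially zero, while the noisy flat-handle stick scores high on both.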

The twelve point mean is the average of the 12 pseudo proxies used by McIntyre (and Wegman) in various illustrations of M&M's results. MBH98 PC1 is the first principal component of the reconstruction of 1450-1400 temperatures from MBH98.

As can be seen, MBH98 and 99 are statistically distinguishable from even the cherry picked top 1% of pseudo proxies, with differences in index values never less than 2 standard deviations above the mean, and for one index nearly 10 standard deviations above the mean. MBH98 PC1 does not perform as well, but can still be statistically distinguished from the cherry picked top 1% in three of the five tests. (The Inflection point vHSI shows MBH98 PC1 to be just over two standard deviations above the mean.)

This is still a work in progress. I think I need to improve my vHSIs by making comparisons with the instrumental record rather than the calibration period, and a combination of angle based and standard deviation based vHSI would probably be superior. Further, I should make comparisons with the first principal component of the North American Tree Ring database.

Nevertheless, even at this stage the results show that

**you can devise variant Hockey Stick Indexes that are better able to determine a hockey stick shape**, and that

**if you use those vHSIs MBH98 and 99 stand out as easily statistically distinguishable from PC1s generated from red noise using short centered PCA**. Further, those vHSIs are demonstrably superior to that of M&M05 in that at least none of them will mistake a straight line for a hockey stick (except the pure inversion method, which is why it was not shown ;))

**So not only did M&M05 use a test with no statistical power, without validating the test; but alternative tests exist which would have refuted their thesis.**

The take home is that the first part of M&M05 is simply scientific garbage. It has no scientific merit whatsoever.

When I get around to it, I am going to see if I can develop even better vHSIs, but probably will wait at least till I have a copy of the NOAMER PC1, and ideally until I (or a collaborator) can generate a full set of pseudo proxies without the cherry picks for statistical comparisons. (Help with either would be appreciated.)
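For anyone wanting to attempt the second of these, the general shape of such an experiment can be sketched as follows. This is a minimal sketch under loud assumptions: plain AR(1) red noise with an arbitrary lag-one autocorrelation of 0.9, 70 series per run, scaling by the standard deviation of the centring period, and the same assumed HSI definition as M&M05's. None of it reproduces M&M05's actual noise model or code, and all function names are hypothetical.

```python
import numpy as np

N_YEARS, N_CAL = 581, 79   # 1400-1980, with 1902-1980 as the calibration period

def ar1_noise(n_series, rho, rng):
    """Columns of AR(1) red noise with lag-one autocorrelation rho (assumed model)."""
    shocks = rng.normal(size=(N_YEARS, n_series))
    x = np.zeros_like(shocks)
    x[0] = shocks[0]
    for t in range(1, N_YEARS):
        x[t] = rho * x[t - 1] + shocks[t]
    return x

def pc1(data, rows):
    """PC1 time series after centring and scaling on the rows given by `rows`."""
    sub = data[rows]
    scaled = (data - sub.mean(axis=0)) / sub.std(axis=0)
    u, s, _ = np.linalg.svd(scaled, full_matrices=False)
    return u[:, 0] * s[0]

def hsi(series):
    """Assumed HSI: (calibration mean - full mean) / full std."""
    return (series[-N_CAL:].mean() - series.mean()) / series.std()

def mean_abs_hsi(short, n_runs=20, n_series=70, rho=0.9, seed=0):
    """Mean |HSI| of PC1 over n_runs, short centred (calibration rows) or fully centred."""
    rng = np.random.default_rng(seed)
    rows = slice(-N_CAL, None) if short else slice(None)
    vals = [abs(hsi(pc1(ar1_noise(n_series, rho, rng), rows))) for _ in range(n_runs)]
    return float(np.mean(vals))

print("short centred:", mean_abs_hsi(short=True))
print("full centred: ", mean_abs_hsi(short=False))
```

Because both calls use the same seed, they see the same noise, so the comparison is paired. Keeping every PC1 from every run, rather than only the top-scoring ones, is what would give an unselected set for comparison.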

Tom: I'm encouraging people to develop some kind of notation to disambiguate the (very useful to MM) use of phrases like "almost always generates a hockey stick", since:

HS = HSUp + HSDown

but in this domain, to most people, HS means HSUp ...

but if someone says "but half are HSDowns" the comeback is "we said there were equal numbers of Up and Down and it didn't matter".

but of course, no Down was ever shown.

John, with a normal tree ring proxy, increased tree ring width means increased growth which for tree rings near the tree line normally means increased warmth. But suppose you had tree rings from an arid region which became more arid with increased GMST. Then increased GMST means less water, which means less growth which means smaller tree ring width. With this proxy, a HS down would in fact mean increasing GMST. MBH did not look at the correlation between individual proxies and GMST. Rather they regressed them against GMST and those with a strong negative correlation (HSdown) were inverted to generate a proxy of GMST. It is because of this feature of MBH98 that HSup and HSdown is irrelevant to the analysis by M&M05. MBH98 quite correctly will reorientate proxy records to maximize correlation with GMST, and therefore HSdown pseudo-proxies as generated by M&M05 would have been reorientated by the MBH98 procedure to become proxies of rising global temperatures in the twentieth century.

At least, most would have been. About 30% of M&M05's cherry picked top 100 HSI pseudo-proxies have negative trends over the calibration period and would in fact have been inverted by the MBH98 procedure as a result. An analysis of the difference that limited influence would have had would probably be interesting, but also beyond me.
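The reorientation idea can be sketched in a few lines. This is a simplified illustration that uses the sign of the calibration-period correlation, not MBH98's actual regression code. The function name `orient` and the synthetic data are hypothetical.

```python
import numpy as np

def orient(proxy, temperature):
    """Flip the proxy's sign if it correlates negatively with temperature.

    Both arrays are assumed to cover the calibration period only.
    """
    r = np.corrcoef(proxy, temperature)[0, 1]
    return proxy if r >= 0 else -proxy

# Synthetic calibration-period data (illustrative only).
temperature = np.linspace(0.0, 1.0, 79)                        # rising temperature
arid_proxy = -2.0 * temperature + 0.1 * np.sin(np.arange(79))  # "HSdown" proxy

flipped = orient(arid_proxy, temperature)
print(np.corrcoef(flipped, temperature)[0, 1] > 0)
```

After the flip, the downward-trending proxy correlates positively with temperature, which is why the sign of the HSI does not matter to the reconstruction.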

Tom: yes, I understand all that, I've studied Bradley(1999), especially Chapter 10 Dendrochronology and I know Ray and have met Hughes a few times, who has some experience with trees in arid regions and heard him talk at AGU on this.

Let me try again, this is about marketing, not science:

Imagine that the panels shown by MM and WR had actually been random selections from the 10,000 runs. Would that have been effective in conveying the marketing message that was the whole point?

The message was: decentering turns (variously described) red noise into graphs that look like MBH99.

I think you've seen: backgrounder, Jan 27 2005, p. 2:

"The error causes their PC method to nearly always identify hockey stick shaped series as the “dominant pattern” in a data set (the so-called “first Principal Component” or PC1), even when the data are just random numbers. We carried out 10,000 simulations in which we fed “red noise”, a form of trendless random numbers, into the MBH98 algorithm. In over 99% of the cases it produced hockey stick shaped PC1 series. The figure below shows 3 simulated PC1s and the MBH98 reconstruction: can you pick out the reconstruction?"