From: Edward Cook
To: Tim Osborn
Subject: Re: N(eff) and practicality
Date: Tue, 24 Jul 2001 08:49:14 -0400
Cc: Phil Jones, Keith Briffa

Hi Tim,

Thanks for the remarks. We can certainly spend some time talking through some of the points raised. I guess I am still finding it difficult to believe that an rbar of 0.05 has any operational significance in estimating Neff. It is kind of like doing correlations between tree rings and climate: a correlation of 0.10 may be statistically significant, but have no practical value at all for reconstruction. The same goes for an rbar of 0.05 in my mind.

I agree that what I suggested (i.e. testing the individual correlations for significance and only using those above some significance level for estimating rbar) is somewhat ad hoc and not theoretically pleasing. However, it is also true that correlations below the chosen significance threshold are "not significantly different from zero" and could be ignored in principle, just as we would do in testing variables for entry into a regression model. This would clearly muddy (a nice choice of words!) the rbar waters, I admit.

In terms of the problem I am working on (computing bootstrap confidence limits on annual values of 1205 RCS-detrended tree-ring series from 14 sites), it is hard to know what to do. Certainly, using Neff will result in almost none of the annual means being statistically significant over the past 1200 years. I don't believe that this is "true". Other highly conservative methods of testing significance produce a very high frequency of similarly negative results, e.g. the test of significance in spectral analysis that takes into account the multiplicity effect of testing all frequencies in an a posteriori way (see Mitchell et al. 1966, Climatic Change, pg. 41). If you use this correction, virtually no "significant" band-limited signals will ever be identified in paleoclimatological spectra. So, this test has very low statistical power.
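[Editor's note: Ed's point that a correlation of 0.10 can be statistically significant yet practically useless is easy to make concrete, because the critical value of r at the 5% level depends only on sample size, not on explanatory power. A minimal sketch using the large-sample normal approximation (z = 1.96 in place of Student's t); the sample sizes are illustrative, not taken from any tree-ring data:]

```python
# Smallest |r| that clears a two-sided ~5% significance test, using the
# large-sample normal approximation z = 1.96 in place of Student's t.
# Sample sizes below are purely illustrative.
from math import sqrt

def r_crit(n, z=1.96):
    """Approximate critical correlation for a sample of n pairs."""
    return z / sqrt(z * z + n - 2)

for n in (50, 100, 400):
    r = r_crit(n)
    print(f"n={n:4d}  r_crit={r:.3f}  variance explained={r*r:.1%}")
```

[With n = 400, a correlation of about 0.10 is already "significant" while explaining only about 1% of the variance, which is exactly the gap between statistical and practical significance Ed is pointing at.]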
I think that this is the crux issue: Type-1 vs. Type-2 error in statistical hypothesis testing. The Neff correction greatly increases the probability of Type-2 error, while virtually eliminating Type-1 error. So, truth or dare.

Consider one last "thought experiment". Suppose you came to Earth from another planet to study its climate. You put out 1,000 randomly distributed recording thermometers and measure daily temperatures for 1 Earth year. You then pick up the thermometers and return to your planet, where you estimate the mean annual temperature of the Earth for that one year. How many degrees of freedom do you have? Presumably, 999. Now, suppose that you leave those same recording thermometers in place for 20 years and calculate 20 annual means. From these 20-year records, you also calculate an rbar of 0.10. How many degrees of freedom per year do you have now? 999 or 9.9? What has changed? Certainly not the observation network. Does this mean that we can just as accurately measure the Earth's mean annual temperature with only 10 randomly placed thermometers, if they provide temperature records with an rbar of 0.00 over a 20-year period? I wouldn't bet on it, but your theory implies it to be so. Surely, one would have more confidence (i.e. smaller confidence intervals) in mean annual temperatures estimated from a 1000-station network.

Cheers,

Ed

>Ed,
>
>re. your recent questions about Neff and rbar etc...
>
>I've thought a bit about these kinds of questions over the past few years, but have never completely got my head around it all in a satisfactory way. I agree with what Phil said in his reply to you. Also, your idea of subsampling 40% of the cores at a time sounds reasonable, though I don't think it would be possible to write a very elegant statistical justification! Anyway, I just wanted to add a couple of points to what Phil said:
>
>(1) Even for very low rbar, the formula certainly works for idealised/synthetic cases (i.e.
>with similar standard deviations and inter-series correlations etc.). For example, I just generated 1000 random time series (each 500 elements long) with a very weak common signal, resulting in rbar=0.047. n=1000 was the closest I could get to n=infinity without waiting for ages for the correlation matrix to be computed! The formula:
>
>neff = n / ( 1 + [n-1]rbar )
>
>which reduces to neff = 1/rbar for n=infinity, gives neff = 20.83. For such a low rbar, neff seems rather few? The mean of the variances of the 1000 series was 1.04677. If I took the "global-mean" time series (i.e. the mean of the 1000 series), then its variance was 0.05041. The ratio of these variances is 20.77 - almost the same as neff! If our expectation that neff should be higher than 20.83 were true, then the variance of the mean series should have been much lower than it was. It should be easy to try out similar synthetic tests with various options (e.g. shorter time series, sets of series with differing variances, subsets with higher common signal (within-site) combined with subsets with weaker common signal (distant sites) etc.) to test the formula further.
>
>(2) I agree that rbar is computed from sample correlations rather than true (population) correlations.
>
>(a) For short overlaps, the individual correlations will rarely be significant. But the true correlations could be higher as well as lower, so rbar could be an underestimate and neff could be an overestimate! Maybe you have even fewer than 20 degrees of freedom!
>
>(b) I did wonder whether the sample rbar might be a biased estimate of the population rbar, given that the uncertainty ranges surrounding individual correlations are asymmetric (with a wider range on the lower side than the higher side). But I've checked this out with synthetic data and the rbar computed from short samples is uncertain but not biased.
>(c) Just because rbar is only 0.05 does not mean that you need series 1500 elements long to be significant - that would be the case for testing a single correlation coefficient. But rbar is the mean of many coefficients (not all independent though!) so it is much easier to obtain significance. Not sure how you'd test for this theoretically, but a Monte Carlo test would work, given some assumptions about the core data. For 100 cores, each just 20 years long, a quick Monte Carlo test indicates that an rbar of 0.05 is indeed significant - therefore rbar=0.05 in your case, with >100 cores, many of which will be >20 years long, should certainly be significant.
>
>Looking forward to your visit! We can discuss this some more.
>
>Tim
>
>
>Dr Timothy J Osborn              | phone: +44 1603 592089
>Senior Research Associate        | fax: +44 1603 507784
>Climatic Research Unit           | e-mail: t.osborn@uea.ac.uk
>School of Environmental Sciences | web-site:
>University of East Anglia _______| http://www.cru.uea.ac.uk/~timo/
>Norwich NR4 7TJ                  | sunclock:
>UK                               | http://www.cru.uea.ac.uk/~timo/sunclock.htm

==================================
Dr. Edward R. Cook
Doherty Senior Scholar
Tree-Ring Laboratory
Lamont-Doherty Earth Observatory
Palisades, New York 10964 USA
Phone: 1-845-365-8618
Fax: 1-845-365-8152
Email: drdendro@ldeo.columbia.edu
==================================
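[Editor's note: the arithmetic behind Ed's thought experiment comes straight out of the formula Tim quotes, neff = n / (1 + [n-1]·rbar). A minimal sketch of the thought-experiment numbers (purely illustrative; no actual station data involved):]

```python
def neff(n, rbar):
    """Effective number of independent series among n series
    whose mean inter-series correlation is rbar."""
    return n / (1 + (n - 1) * rbar)

# 1000 thermometers whose 20-year annual records give rbar = 0.10:
print(neff(1000, 0.10))   # ~9.91 "effective" stations

# Ed's counter-case: 10 thermometers with rbar = 0.00 ...
print(neff(10, 0.00))     # 10.0
# ... carry essentially the same nominal information content as the
# 1000-station network above -- the implication Ed finds hard to accept.
```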
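[Editor's note: Tim's two numerical checks - the synthetic verification of the neff formula in point (1), and the Monte Carlo significance test for rbar in point (c) - can be reproduced along the following lines. This is a sketch, not either author's actual code: it assumes unit-variance Gaussian noise plus a weak common Gaussian signal, with the signal amplitude chosen to land near the quoted rbar = 0.047.]

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_interseries_r(x):
    """rbar: mean of the off-diagonal inter-series correlations.
    x has shape (n_series, n_time)."""
    c = np.corrcoef(x)
    n = c.shape[0]
    return (c.sum() - n) / (n * (n - 1))

# --- Point (1): does neff = n/(1+[n-1]rbar) match the variance ratio? ---
n, length = 1000, 500
signal = rng.standard_normal(length)                 # weak common signal
x = 0.22 * signal + rng.standard_normal((n, length)) # amplitude is a guess
rbar = mean_interseries_r(x)
neff = n / (1 + (n - 1) * rbar)
# Ratio of the mean single-series variance to the variance of the
# grand-mean series -- per Tim's check, this should be close to neff.
ratio = x.var(axis=1, ddof=1).mean() / x.mean(axis=0).var(ddof=1)
print(f"rbar={rbar:.3f}  neff={neff:.1f}  variance ratio={ratio:.1f}")

# --- Point (c): is rbar = 0.05 significant for 100 cores, 20 yr long? ---
# Null distribution of rbar when there is NO common signal at all:
null_rbars = [mean_interseries_r(rng.standard_normal((100, 20)))
              for _ in range(500)]
print(f"95% null threshold for rbar: {np.quantile(null_rbars, 0.95):.4f}")
```

[Because rbar averages 4,950 pairwise coefficients, its null distribution is far tighter than that of a single correlation, so the 95% threshold comes out well below 0.05 - which is Tim's point (c) in a nutshell.]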