More on the statistical dispute between Scafetta and Schmidt

By Andy May

The argument about the proper way to estimate error in the European Centre for Medium-Range Weather Forecast (ECMWF) ERA5 weather reanalysis dataset between Nicola Scafetta and Gavin Schmidt has finally been published by Geophysical Research Letters. Schmidt, Jones, and Kennedy’s comment is here (Schmidt, Jones, & Kennedy, 2023), and Scafetta’s response is here (Scafetta N., 2023a).

I first wrote about this dispute earlier in the year here. Nothing much has changed in the final versions.

Schmidt, Jones, and Kennedy’s assessment of the error in the ERA5 surface temperature dataset average still (incorrectly) assumes that, during such a period, the global surface temperature was constant from 2011-2021 and that its yearly variability is due to random noise. This is clearly a nonphysical interpretation of Earth’s climate, since there are real systematic changes in the climate from year to year, whether one assumes they are due to natural or man-made forces, or both.

By conflating natural and man-made climatic forces with random noise Schmidt, Jones, and Kennedy inflate the real error of the temperature mean by 5–10 times. In fact, a proper analysis of the ensemble of observed global surface temperature members yields a decadal-scale error of about 0.01–0.02°C, as reported in published records. BEST (Berkeley Earth Land/Ocean Temperature record) derives an error of +/- 0.018- 0.020 °C for the 11-year period 2011-2021 (1951-1980 anomalies and the April 2023 version of the BEST dataset). Instead, Schmidt, Jones, and Kennedy assessed the error using the standard deviation of the mean (see Chapter 3 here) from the period 2011-2021. The equation they use is an equation that can only be used when there are multiple measurements of the same quantity, not eleven annual estimates for eleven different years. It cannot be used to properly estimate the error of a quantity, in this case the average surface temperature of the Earth, that changes naturally and possibly due to human emissions, from year to year.

Scafetta’s original paper, the reason for the dispute can be downloaded here. In the paper Scafetta shows that all IPCC/CMIP6 climate models that result in an ECS^[1] that is greater than 3°C warming per doubling of CO₂ overestimate observed global warming at a statistically significant level. How to determine what is statistically significant is at the heart of the dispute. But statistics or not, Scafetta’s point is apparent in figure 1. When in doubt look at the data.

Figure 1. IPCC/CMIP6 climate-modeled temperatures (in red) compared to observations (blue, ERA5 2 meter temperatures). Source: (Scafetta N. , 2022a).

In figure 1, the observations are from ECMWF ERA5. Clearly, if CO₂ and other greenhouse gases are causing all the recent warming, as the IPCC AR6 report claims (IPCC, 2021, pp. 425 & 961-962), the climate sensitivity we are observing is lower than 3°C. Scafetta’s analysis of ECS is very compelling, but there is still more evidence that the higher AR6 ECS estimates are incorrect. For more on this subject, see my four part series on the mysterious AR6 ECS: Part 1, Part 2, Part 3, and Part 4. There is also a very good summary of observational estimates of ECS, and a critique of the AR6 methods of determining ECS in Chapter 7 of the Clintel volume on AR6, here.

Works Cited

Crok, M., & May, A. (2023). The Frozen Climate Views of the IPCC, An Analysis of AR6.

IPCC. (2021). Climate Change 2021: The Physical Science Basis. Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change. In V. Masson-Delmotte, P. Zhai, A. Pirani, S. L. Connors, C. Péan, S. Berger, . . . B. Zhou (Ed.)., WG1. Retrieved from https://www.ipcc.ch/report/ar6/wg1/

Scafetta, N. (2022a). Advanced Testing of Low, Medium, and High ECS CMIP6 GCM Simulations Versus ERA5-T2m. Geophysical Research Letters, 49. doi:10.1029/2022GL097716

Scafetta, N. (2023a). Reply to “Comment on ‘Advanced testing of low, medium, and high ECS CMIP6 GCM simulations versus ERA5-T2m’. Geophysical Research Letters, 50. doi:10.1029/2023GL104960

Schmidt, G. A., Jones, G. S., & Kennedy, J. J. (2023). Comment on “Advanced testing of low, medium, and high ECS CMIP6 GCM simulations versus ERA5-T2m”. Geophysical Research Letters, 50. doi:10.1029/2022GL102530

Taylor, J. (1997). An Introduction to Error Analysis, second edition. University Science Books. Retrieved from https://www.amazon.com/Introduction-Error-Analysis-Uncertainties-Measurements/dp/093570275X/ref=monarch_sidesheet

ECS is the equilibrium climate sensitivity, or the ultimate change in global average surface temperature after an instantaneous doubling of CO₂. See here for more details. ↑

4.7 16 votes

Article Rating

392 Comments

Oldest

Newest Most Voted

Inline Feedbacks

View all comments

Nick Stokes

September 23, 2023 10:30 am

The issue here is deciding whether the difference between the earth observations and the model results is significant, for various classes of models. Well, there is the eyeball test, as Andy applies. But Scafetta claimed to use advanced methods. Here is an expanded version of that plot:

It’s unbelievably primitive. He just averages the models over the time period, and the observations, to give the result on the right, and then applies the eyeball test to that. There is a lot wrong with that; the variable that he is averaging is clearly not stationary, so it would indeed be hard to find a statistical test. But the eyeball cannot make up for that.

But the error that should be obvious here is that he allows no uncertainty in the observations. None at all. Now Andy objects that Schmidt et al have allowed too much, but zero has to be wrong. This matters, because the outcome is a claim that observations couldn’t have been in the group of model results. If you allow no uncertainty, of course that is more unlikely.

Normally the uncertainty of the mean would be derived from the stantard deviation of yearly results, which as you can see is substantial.

The uncertainty is different from the one for records, as with Best etc. That is uncertainty relative to given weather. But for this test, you also have to include weather uncertainty.

Suppose you had an ideal model, which would be a planet B (Earth is A), similar in all respects, including rising GHG. You want to know if the climate is different, so you watch both for a decade. In what ways might the results differ and still have the same climate? Well, measurement, for one, and also sampling error (different choice of locations give different results). But also they experiene different weather. The ENSOs will happen at different times, for example. All aspects of weather will be different, and you have to allow for all that before you can say nthe climate is different. That is what Schmidt et all correctly did.

-41

Nick Stokes

Reply to Nick Stokes

September 23, 2023 10:35 am

Correction, the numbers on the right are not the mean, but the “warming”, as determined by the trend. But the same objections apply, even more so. The models can be said to have uncertainty estimated by scatter, but the trend of observations (blue) most certainly has uncertainty too, as in any regression.

-32

Joseph Zorzin

Reply to Nick Stokes

September 23, 2023 10:56 am

I’m shocked- there’s any kind of uncertainty in climate science? But… but… I thought “the science is settled”.

karlomonte

Reply to Nick Stokes

September 23, 2023 3:11 pm

Stokes has zero clues about what measurement uncertainty is, just like his noisy disciples have zero clues.

Nick Stokes

Reply to karlomonte

September 23, 2023 3:48 pm

So do you think it is zero, as Scafetta used?

-9

karlomonte

Reply to Nick Stokes

September 23, 2023 5:21 pm

Uncertainty is not error, they are both wrong.

Sunsettommy

Reply to Nick Stokes

September 24, 2023 8:42 am

Both Models are lacking forecast skill and based on the bogus AGW conjecture which has long been shown to be crap.

1) No Hot Spot exists.

2) No Positive Feedback Loop exist and never existed in the last Billion years.

3) The Sun/Ocean effect is routinely ignored which is stupid.

All because of the stupid insistence of slavishly hanging onto a dead AGW conjecture.

This is why I don’t take these climate models seriously.

Gerald Browning

Reply to Nick Stokes

September 23, 2023 10:57 am

Nick never responded to the mistakes he made on the thread that shows that climate models are based on the wrong dynamical system of equations.

scvblwxq

Reply to Gerald Browning

September 23, 2023 11:46 am

Ignoring the Sun’s cycles of solar output in the models is a big mistake

gezza1298

Reply to scvblwxq

September 24, 2023 6:30 am

As is getting clouds wrong.

Bill Powers

Reply to Gerald Browning

September 24, 2023 10:34 am

Whenever you box Nick in he goes silent running. Then he pops up somewhere else and launches torpedoes, “Fire!, range, mark” in his ready shoot aim fashion. Soros funds him by the post and not the word

Andy May

Author

Reply to Nick Stokes

September 23, 2023 10:59 am

Hi Nick,
Scafetta is not assuming zero error in the observations, he is assuming the error recommended by BEST:

“Estimated Jan 1951-Dec 1980 global mean temperature (C)
Using air temperature above sea ice: 14.105 +/- 0.018
Using water temperature below sea ice: 14.701 +/- 0.018″

Check the link behind “published records.” This is not the exact dataset that Scafetta used, he downloaded in April, and it has been revised, but it is very close. Do your own analysis of the error, you will come up with Scafetta’s value, not Schmidt’s, as BEST did.

Estimating error properly means not including the quantity you want to estimate, which is what Schmidt did when he assumed all the variation over 11 years was random and not real climate variability.

Nick Stokes

Reply to Andy May

September 23, 2023 11:59 am

“he is assuming the error recommended by BEST”

Where? You may have assumed that, but I can’t see any mention in Scafetta’s paper. As you see from Fig 1 that I posted, he just says for observed warming
ΔT=0.56C
No uncertainty quoted. Just a single point on the graph.

And as I said, the BEST uncertainty is inappropriate anyway, because it is for fixed weather. To decide if the climate is different, you have to allow for weather uncertainty as well.

“Do your own analysis of the error, you will come up with Scafetta’s value”

Scafetta’s value is 0.00000

-20

Andy May

Author

Reply to Nick Stokes

September 23, 2023 1:22 pm

Nick,
He explains the computation of error in Appendix 1 of his paper. He actually evaluated the error of a number of products, because ERA5 does not supply error for their products.

Scafetta’s error is not zero, as stated it is between 0.01 and 0.02, and conforms to many other estimates. I bet you cannot find one that matches Schmidt’s.

Nick Stokes

Reply to Andy May

September 23, 2023 1:50 pm

Andy,
Scafetta’s paper is here. There is no appendix. No error is considered in the paper.

As I said, the BEST estimate for a year would not be right anyway, because it is for fixed weather. Since he is talking about warming, based on regression trend, the first error to consider would be the normal uncertainty of the trend. For BEST, for warming from 1980 to 2022, that would be ±0.16C.

-12

Nick Stokes

Reply to Nick Stokes

September 23, 2023 1:59 pm

that would be ±0.16C.
Oops, should be ±0.08C. The range is 0.16.

-11

Andy May

Author

Reply to Nick Stokes

September 23, 2023 3:51 pm

Nick you have the wrong paper. The error analysis is in this one:

“CMIP6 GCM ensemble members versus global surface temperatures”
CMIP6 GCM ensemble members versus global surface temperatures | Climate Dynamics (springer.com)

Nick Stokes

Reply to Andy May

September 23, 2023 5:29 pm

I’m talking about the paper Gavin et al commented on (your link, in your first para):

and to which Scafetta replied – again your link, first para. It is the one listed in your works cited

Scafetta, N. (2022a). Advanced Testing of Low, Medium, and High ECS CMIP6 GCM Simulations Versus ERA5-T2m. Geophysical Research Letters, 49. doi:10.1029/2022GL097716

This other paper is not cited in your article.

-5

TimTheToolMan

Reply to Nick Stokes

September 23, 2023 6:05 pm

Nick, you appear to be commenting on a research letter published in March 2022. Scarfetta’s paper appears to have actually been published in September of 2022 and looks a little different to the letter that Schmidt et al commented on.

I could be very confused here as its difficult to follow and I haven’t followed it closely but it looks like

Research letter by Scarfetta in March 2022
https://agupubs.onlinelibrary.wiley.com/doi/full/10.1029/2022GL097716

Reply to that letter also September 2023 by Gavin et al
https://agupubs.onlinelibrary.wiley.com/doi/10.1029/2022GL102530

Actual paper published by Scarfetta in September 2022
https://link.springer.com/article/10.1007/s00382-022-06493-w

It now has a different name so that doubles how confusing it is but you can see from the abstract that they’re still grouping the GCMs for analysis.

eg from the letter

“Scafetta (2021a) tested several CMIP6 GCMs against some temperature records and found that the data-model agreement improves for the models with lower ECS. Herein, we provide a complementary and more robust statistical approach by grouping the same models into three sub-ensembles according to their ECS: low-ECS, 1.8–3.0°C; medium-ECS, 3.01–4.50°C; and high-ECS, 4.51–6.0°C. “

and from the paper

“The Coupled Model Intercomparison Project (phase 6) (CMIP6) global circulation models (GCMs) predict equilibrium climate sensitivity (ECS) values ranging between 1.8 and 5.7°C. To narrow this range, we group 38 GCMs into low, medium and high ECS subgroups and test their accuracy and precision in hindcasting the mean global surface warming observed from 1980–1990 to 2011–2021 in the ERA5-T2m, HadCRUT5, GISTEMP v4, and NOAAGlobTemp v5 global surface temperature records.”

I could be wrong but it looks like Gavin et al are commenting on an earlier version of the analysis.

bnice2000

Reply to TimTheToolMan

September 23, 2023 8:17 pm

“accuracy and precision in hindcasting ” ???

With so many parameter and fudge factors…

… surely they can get a hindcast to agenda-fabricated data correct !!

Nick Stokes

Reply to TimTheToolMan

September 23, 2023 10:10 pm

You have the links and sequence correct, and they are also the links given by Andy. It’s very clear what Gavin commented on; it is spelt out in the title. The paper Andy mentions has not previously been referenced here.

-1

TimTheToolMan

Reply to Nick Stokes

September 23, 2023 10:34 pm

Nick writes “It’s very clear what Gavin commented on; it is spelt out in the title.”

Yes, but why? Why comment on a research letter now which I’m taking to be a preliminary analysis from well over a year ago when the actual paper has been available for a year too?

Andy May

Author

Reply to TimTheToolMan

September 24, 2023 2:11 am

Good point. To me it is very clear that Schmidt, Jones and Kennedy are being disingenuous.

Nick Stokes

Reply to TimTheToolMan

September 24, 2023 1:15 pm

“Why comment on a research letter”

A long story, set out here. They actually wrote and submitted their response within a few days. The holdup was that the journal policy was in some confusion, and when they sorted that out, it required that the comment and author’s response be published together. But Scafetta’s response ran into trouble with the reviewers.

Andy May

Author

Reply to Nick Stokes

September 24, 2023 2:09 am

Yes, it has, both in my original post and in Scafetta’s reply.

Andy May

Author

Reply to TimTheToolMan

September 24, 2023 2:08 am

Tim,
This argument has been going on for a very long time. But, the most recent comment and reply articles cited in the post have been reviewed and updated. They should include all of the various papers on the subject.

Andy May

Author

Reply to Nick Stokes

September 24, 2023 2:04 am

Nick,
The paper on the statistics and the appendix you seem so concerned about are mentioned in the Abstract of Scafetta’s reply and in my original post on this dispute. This is from my original post:

“Scafetta’s original Geophysical Research Letters paper was later followed by a more extended paper in Climate Dynamics (Scafetta N., 2022b) where the issue is discussed in detail in the first and second appendix.”

This short blog post was just an update and summary of a very long discussion on this subject. Schmidt, Jones, and Kennedy’s high school mistake in statistics should be obvious to everyone but caught-they now have dug in their heels. You shouldn’t even need the Appendix. And if you are going to argue this rather obvious and trivial point to such an extreme, you should have at least read Scafetta’s reply or my original post that went into much more detail.

Nick Stokes

Reply to Andy May

September 24, 2023 2:31 am

Andy,
this is a total switcheroo. Gavin etc described the errors in the paper you cited. They are glaring, especially the total absence of uncertainty for observations. Gavin et al did not say anthing about this other paper. It is not the subject of the dispute you mention. Whether it is wrong or not, I am not going to chase up now. The fact is that the paper you cited, about which the comments were written, was wrong.

-5

TimTheToolMan

Reply to Nick Stokes

September 24, 2023 3:36 am

Nick writes

Gavin et al did not say anthing about this other paper. It is not the subject of the dispute you mention.

But why? If the “other paper” is the most recent analysis on the subject by Scarfetta, why comment on the previous analysis over a year later?

This whole thing looks suspicious to me.

If the newer paper is statistically sound then they would appear to have pulled a “Streisand effect” on it.

Andy May

Author

Reply to Nick Stokes

September 24, 2023 3:52 am

Now, you decide what papers to include in the argument and which to ignore. You are no better than Gavin. The paper I cited is correct. Schmidt had ample time to consider Scafetta’s reply and his other papers and chose to ignore everything, just like you. Childish.

Nick Stokes

Reply to Andy May

September 24, 2023 1:44 pm

“Now, you decide what papers to include in the argument and which to ignore.”

No, you decided. You listed them explicitly in your “works cited”.

Scafetta, N. (2022a). Advanced Testing of Low, Medium, and High ECS CMIP6 GCM Simulations Versus ERA5-T2m. Geophysical Research Letters, 49. doi:10.1029/2022GL097716
Scafetta, N. (2023a). Reply to “Comment on ‘Advanced testing of low, medium, and high ECS CMIP6 GCM simulations versus ERA5-T2m’. Geophysical Research Letters, 50. doi:10.1029/2023GL104960
Schmidt, G. A., Jones, G. S., & Kennedy, J. J. (2023). Comment on “Advanced testing of low, medium, and high ECS CMIP6 GCM simulations versus ERA5-T2m”. Geophysical Research Letters, 50. doi:10.10

You even said, explicitly
“Scafetta’s original paper, the reason for the dispute can be downloaded here”
linking to that paper. Now, when the faults in that paper are very evident, you want to talk about some paper that the article didn’t mention or allude to anywhere.

“ Schmidt had ample time to consider Scafetta’s reply”

No, they wrote and sent in their comment within a few days. The rest of the time was journal processes and waiting for Scafetta to produce a response that would pass refereeing. But anyway, it stands as a valid comment on a published paper. It was wrong.

-1

Nick Stokes

Reply to Andy May

September 24, 2023 3:11 am

“Schmidt, Jones, and Kennedy’s high school mistake in statistics should be obvious to everyone but caught-they now have in their heels.”

Schmidt et al are using high school (well, 101) correctly. You and Scafetta are talking nonsense. I see now that Scafetta is measuring warming by subtracting the mean of the last decade from the mean of the first. Regression would be better, but no matter. SJK say that the uncertainty of each mean is just the standard error (SEM). That is basic, as is the way of combining in a difference (by quadrature).

It’s no use putting up smokescreens about Best, sources of error etc. The SEM derives from the observed variability, as measured by the sd. You can’t get less than that. You might get more if there is autocorrelation.

-4

Andy May

Author

Reply to Nick Stokes

September 24, 2023 3:55 am

None are so blind as those who will not see. I’ve nothing more to add.

Sunsettommy

Reply to Andy May

September 24, 2023 8:47 am

He is thread fogging again because he is commonly in a state of confusion himself.

Andy May

Author

Reply to Sunsettommy

September 24, 2023 5:57 pm

True, Nick doesn’t know which end is up. You always include all the relevant papers, ignoring them only makes you look a fool. I know for a fact that Scafetta’s second paper was emailed directly to Schmidt well in advance, and he deliberately ignored it. Totally high school.

Sunsettommy

Reply to Andy May

September 24, 2023 6:31 pm

Yeah, I see that a lot over the years when they don’t provide all the relevant papers.

They did that to Bob Tisdale and John McIntire too.

bnice2000

Reply to Nick Stokes

September 24, 2023 4:09 am

Oh dear, seems Nick is only partially educated.

And quite unable to learn what he doesn’t want to understand.

Andy May

Author

Reply to bnice2000

September 24, 2023 8:53 am

It is sad, but I don’t think Nick, Schmidt, Jones, or Kennedy realize what they are doing. Climate science is now so corrupt, everyone in it thinks it is OK to ignore data, and just choose the data they want, simply because it supports their personal agenda. Everyone is just AR6 these days.

bdgwx

Reply to Andy May

September 24, 2023 6:29 pm

It is sad, but I don’t think Nick, Schmidt, Jones, or Kennedy realize what they are doing.

If Nick, Schmidt, Jones, Kennedy, etc. don’t realize what they are doing then neither does NIST, JCGM, UKAS, and every other standards body and statistical texts.

-2

bnice2000

Reply to bdgwx

September 25, 2023 4:18 am

Wrong, it is you and Nick that don’t understand what those standards bodies do. Or basic statistics texts.

Jim Gorman

Reply to bdgwx

September 25, 2023 7:51 am

You do not know what uncertainty really means.

The GUM defines experimental standard uncertainty as the dispersion of the values that could reasonably be attributed to the measurand. The center of the uncertainty interval is the expected value, that is, the mean of the data. This makes the Standard Deviation the proper measure of uncertainty.

The experimental standard uncertainly of the mean is a measure of where the mean may lay taking into account sampling error. It is not a measure of uncertainty in the observed data.

The experimental standard uncertainly of the mean is a measure of the distribution around q̅, the estimated mean. In other words, the distribution of the sample means but not the data. (See GUM B.2.17 and B.2.18)

Lastly, anomalies are the result of subtracting two random variables, a monthly value and a baseline value. Common statistical practice allows that the anomaly inherits the sum of the variances of the two parent random variables. This variance must be treated further than just finding the variance of of the much smaller numbers representing anomalies. Don’t forget these are not temperatures, they are ΔT values that have their own variance.

karlomonte

Reply to Nick Stokes

September 24, 2023 8:48 am

Trendology at its nadir, this.

bdgwx

Reply to Andy May

September 24, 2023 9:45 am

Andy May: Schmidt, Jones, and Kennedy’s high school mistake in statistics

What mistake?

-2

Andy May

Author

Reply to bdgwx

September 24, 2023 5:59 pm

The mistake is using a 10-year standard deviation to calculate measurement error. 10 year includes actual climate change, not just measurement error.

bdgwx

Reply to Andy May

September 24, 2023 6:27 pm

They use the standard deviation divided by the sqrt of N which is how Taylor, Bevington, NIST, JCGM, UKAS, etc. all say to do it. And if there is any question there is an example provided by NIST that is pretty close to the use case in question here.

-2

Jim Masterson

Reply to bdgwx

September 24, 2023 7:33 pm

If N is very large, then there’s little difference between N and N-1. However, you’re supposed to do the math correctly. One is the SD of a population and the other is the SD of a sample–not exactly the same thing.

bdgwx

Reply to Jim Masterson

September 25, 2023 6:19 am

Nobody is saying you shouldn’t use the sample SD nor is anyone saying you should assume N equals DOF. The point Nick, Bellman, and I are making, which is supported by Taylor, Bevington, NIST, JCGM, UKAS, etc. is that you apply the law of propagation of uncertainty which results in the division by a square root when the measurement model is an average. A lot of people here erroneously assume the division by a square root is optional. That’s all that is being said here. We can’t get into the minutia of details until everyone accepts the law of propagation of uncertainty first.

karlomonte

Reply to bdgwx

September 25, 2023 6:33 am

the point Nick, Bellman, and I are making, which is supported by Taylor, Bevington, NIST, JCGM, UKAS, etc. is that you apply the law of propagation of uncertainty

No, the real truth is that this lot of trendologists you list here ABUSE all the metrology texts and pound them into your square hole.

-1

Tim Gorman

Reply to bdgwx

September 26, 2023 1:48 pm

The division by the square root gives you the SEM. That is only applicable when the distribution is random and Gaussian and you can assume that all measurement uncertainty is random, Gaussian, and cancels.

If the distribution is skewed, which the temperature data set surely is then the SEN is not the proper measure of uncertainty. If measurement uncertainty is not random and Gaussian, and the temperature data set is assuredly not random and Gaussian, then the SEM is not the proper measure of the uncertainty.

Why does everyone in climate science want to assume that combining winter temps with summer temps does not produce a multi-modal distribution? They have different average temps and different variances. It’s like saying the average value of Shetland ponies and quarter horses gives you a meaningful average and the uncertainty is the SEM!

AndersV

Reply to bdgwx

September 25, 2023 1:05 am

Again, you have to know what the prerequisites for using this formula is. In the case of standard deviation divided by the square root of the number of samples is for “…N measurements of the same quantity….” to use the words of Taylor. The meaning of “same quantity” is what you need to understand.

For temperature measurements of free open space air with varying humidity and pressure the meaning of this prerequisite is that you need N measurements at the same place and time in order to use the formula. In weather measurements you clearly do not fulfil that requirement.

Taylor also states quite clearly that systematic errors are not affected by the number of measurements.

Nick Stokes

Reply to AndersV

September 25, 2023 2:15 am

Just not true. You need to quote the full context of Taylor. Here is an example from Possolo at NIST doing just what you say they can’t do. They used exactly that formula to get uncertainty of a monthly average of daily thermometer readings. Key section:

karlomonte

Reply to Nick Stokes

September 25, 2023 6:35 am

Wrong, Nitpick.

An air temperature measurement has a sample size of exactly ONE, your holy and precious SEM is a red herring that DOES NOT APPLY,

-1

Jim Gorman

Reply to Nick Stokes

September 25, 2023 9:02 am

Nick,

You have no clue.

Same location, same device, same month. You’ll notice that only Tmax is involved. Have you ever heard of repeatability conditions?

GUM
“””B.2.15
repeatability (of results of measurements) “””

“””closeness of the agreement between the results of successive measurements of the same measurand carried out under the same conditions of measurement

NOTE 1 These conditions are called repeatability conditions.

NOTE 2 Repeatability conditions include:
— the same measurement procedure
— the same observer
— the same measuring instrument, used under the same conditions
— the same location
— repetition over a short period of time.

NOTE 3 Repeatability may be expressed quantitatively in terms of the dispersion characteristics of the results. “””

Nick Stokes

Reply to Jim Gorman

September 25, 2023 1:14 pm

“Same location, same device, same month.”

So OK, now it doesn’t have to be a measurement of the same thing, just taken in the same month. Where is that in GUM?

How is measuring the max temperatures on successive days a repetition?

Jim Gorman

Reply to Nick Stokes

September 25, 2023 3:41 pm

It is the same thing. Can you not read

The measurand is declared as the average Tmax for one month, at one station. Notice it is not Tavg because that involves two things.

From NIST 1900:

“””Questions are often asked about whether it is meaningful to qualify uncertainty evaluations with uncertainties of a higher order, or whether uncertainty evaluations already incorporate all levels of uncertainty. A typical example concerns the average of n observations obtained under CONDITIONS OF REPEATABILITY and modeled as outcomes of independent random variables with the SAME MEAN µ and the SAME STANDARD DEVIATION σ, both unknown a priori. “””

“””EXAMPLES: Examples E2, E20, and E14 involve multiple observations made under conditions of repeatability.”””

“””variance (squared standard deviation) of a sum or difference of uncorrelated random variables is equal to the sum of the variances of these random variables. Assuming that the weighings are uncorrelated, we have u2(mP) = u2(cP) + u2(cE) + u2(cR 2 − cR,1) exactly.”””

B.2.15 repeatability (of results of measurements)

closeness of the agreement between the results of successive measurements of the same measurand carried out under the same conditions of measurement

NOTE 1 These conditions are called repeatability conditions.

NOTE 3 Repeatability may be expressed quantitatively in terms of the dispersion characteristics of the results. [VIM:1993, definition 3.6]

Please note that TN 1900 was created very carefully to meet these conditions.

Nick Stokes

Reply to Jim Gorman

September 25, 2023 4:37 pm

”EXAMPLES: Examples E2, E20, and E14 involve multiple observations made under conditions of repeatability.”

You give just half the para. It goes on:

“EXAMPLES: Examples E2, E20, and E14 involve multiple observations made under conditions of repeatability. In Examples E12, E10, and E21, the same measurand has been measured by different laboratories or by different methods.”

karlomonte

Reply to Nick Stokes

September 25, 2023 5:13 pm

BFD, what’s your point, Nitpick?

Jim Gorman

Reply to Nick Stokes

September 25, 2023 5:33 pm

I didn’t leave anything out! We were discussing E2. E12, E10, and E21 do not apply.

As a word of caution, are you familiar with inter-laboratory testing procedures? I’ll bet not. I have nothing but a passing knowledge of how it is done so I am not an expert on that.

karlomonte

Reply to Jim Gorman

September 25, 2023 9:08 pm

Yep! Like his disciple Bellman, he just poked around and found a random quote to generate another Stokes Patented Red Herring.

Clyde Spencer

Reply to Nick Stokes

September 25, 2023 11:53 am

It appears that the example that you are using is for one station, one thermometer, not thousands of stations and thermometers. It therefore meets the requirement of one measurand being measured by the same instrument. One month is short enough any trend is probably negligible. Therefore, we can define the time-series as also meeting the requirement of stationarity:

https://www.itl.nist.gov/div898/handbook/pmc/section4/pmc442.htm

[I consider the above URL to be a citation. If you don’t agree, please state why.]
However, longer temperature time-series will not meet the requirement of stationarity without processing that needs to be explicitly noted, which I don’t recollect ever seeing in anything that you have posted.

bdgwx

Reply to Clyde Spencer

September 25, 2023 1:45 pm

It therefore meets the requirement of one measurand being measured by the same instrument.

What requirement? I want a link to text that says the inputs into a measurement model must be of the same thing. I then want you to reconcile that with the fact that every single example in JCGM 100:2008 and NIST 1900 are of measurement models that not only accept inputs of different things, but usually of things with completely different units measured by completely different instruments.

BTW keep in mind that NIST TN 1900 E2 takes 28 different temperature measurements and uses them as inputs into the measurement model that computes the monthly average. NIST then does a type A evaluation of uncertainty of that measurement model using those 28 different temperature measurements. Schmidt did something very similar.

Therefore, we can define the time-series as also meeting the requirement of stationarity:

The time series in NIST TN 1900 E2 is not stationary.

Clyde Spencer

Reply to bdgwx

September 25, 2023 3:17 pm

If one is dealing with non-stationary data, it reflects that both the mean and standard deviation are changing with time and there is no single value for them or any parameter derived from them. One cannot make comparisons and determine statistical significance and make inferences about whether they represent the same population.

bdgwx

Reply to Clyde Spencer

September 25, 2023 5:00 pm

NIST computed the average of a non-stationary time series of temperatures and it’s corresponding uncertainty so I’m not understanding the relevance of stationarity here.

Jim Gorman

Reply to bdgwx

September 25, 2023 5:22 pm

Did you not read what Clyde said?

One month is short enough any trend is probably negligible. Therefore, we can define the time-series as also meeting the requirement of stationarity:

Also,

However, longer temperature time-series will not meet the requirement of stationarity without processing that needs to be explicitly noted, which I don’t recollect ever seeing in anything that you have posted.

From TN 1900:

The adequacy of this choice is contingent on the definition of r and on a model that explains the relationship between the thermometer readings
and r.
The daily maximum temperature r in the month of May, 2012, in this Stevenson shelter, may be defined as the mean of the thirty-one true daily maxima of that month in that shelter.
The equation, ti = r+Ei, that links the data to the measurand, together with the assumptions
made about the quantities that figure in it, is the observation equation.

Look at that measurement equation closely. It links the data to the measuranc. It is what lets NIST determine the monthly average Tmax.

To use several months one must meet the short time repeatability condition. As Clyde says, longer periods cause means and uncertainty to vary. Maybe you have a way to get around that. If so, state it.

Nick Stokes

Reply to Clyde Spencer

September 25, 2023 3:21 pm

“It therefore meets the requirement of one measurand being measured by the same instrument”

So what is the one measurand? The temperature on the 1st, or the temperature on the 21st? Do you expect them to be the same?

That is the giveaway. When you measure on the 2nd, no-one ses that as reducing the uncertainty of what you measured on the 1st. It is not a repetition.

I just cannot see the point of your link to the definition of stationarity. It isn’t even about measurements.

Jim Gorman

Reply to Nick Stokes

September 25, 2023 4:44 pm

The one measurand is what TN 1900 said, the monthly average of Tmax.

It is a measurement of different experiments, one for each day recorded.

Do you understand what experimental means?

The measurand is what you define it to be! It may be the result of 10 different chemical reactions involving 5 different reactants. You may weigh everything carefully, you may use micropipettes, but there will be differences, the GUM calls them influences, such that each experiment will result in a different value. The SEM is nice to know, but the detail but it is very unlikely that anyone trying the same experiment will ever get the “value ± SEM. The important fact for someone duplicating the experiment is to know the range of values they may get, i.e., the Standard Deviation.

Nick Stokes

Reply to Jim Gorman

September 25, 2023 6:46 pm

“The one measurand is what TN 1900 said, the monthly average of Tmax.”
So it has to be measured with the same instrument. What instrument measures the monthly average?
How do you have repetitions?

-3

Jim Gorman

Reply to Nick Stokes

September 26, 2023 5:43 am

You truly don’t know much about measurements do you. I’m sure that to you numbers are exact.

From TN 1900:

“””The daily maximum temperature r in the month of May, 2012, in this Stevenson shelter, may be defined as the mean of the thirty-one true daily maxima of that month in that shelter.”””

“””This so-called measurement error model (Freedman et al., 2007) may be specialized further by assuming that ε1, …, εm are modeled independent random m variables with the same Gaussian distribution with mean 0 and standard deviation σ. In these circumstances, the {ti} will be like a sample from a Gaussian distribution with mean r and standard deviation ( (both unknown).”””

These are declaring the measurand to be the mean of Tmax for the days in the month and the standard deviation to be the dispersion in the experimental measurements. Because the distribution is ASSUMED to be Gaussian and of the same thing, an expanded experimental standard deviation of the mean can be used as a measure of the uncertainty in the mean estimate.

I don’t necessarily agree with the use of the expanded experimental standard deviation of the mean as the appropriate uncertainty to use, it is very well defined as to what is being used and is a choice the scientist can make.

Whether an interval of 1.8 °C is used or 4.1 °C is immaterial. They are both far above what climate scientists use to justify anomalies with one one-thousandths of a degree.

You need to answer why you think running multiple experiments are not repetitive measurements of the same thing.

karlomonte

Reply to Jim Gorman

September 26, 2023 7:06 am

Bellman calls any uncertainty other than the little wiggles in an anomaly graph “hypothetical”.

This says everything one needs to know about this lotl.

Bellman

Reply to karlomonte

September 28, 2023 4:14 am

You really are obsessed with me. Even when I’m taking no part in the discussion you still insist on dragging my name into it, without a link or quote to what I actually said.

The fact is any uncertainty estimate will be hypothetical. That’s not a bad thing, it’s how science works. You have a hypothesis and then try to find evidence to support or falsify it.

In this case what I think I said, was that your and Tim Gormans, accusation that UAH had uncertainties of multiple degrees was contradicted by the evidence that across the entire 500+ monthly anomalies, the standard deviation was only about 0.25°C. I may have been commenting on Tim quoting the old “a beautiful hypothesis killed by an ugly fact” meme.

You, of course, argue that empirical evidence is irrelevant as your calculations cannot be wrong. You will claim that all the uncertainty is caused by a hypothesized systematic error, which can not be detected so is unfalsifiable. Which is convenient, but also difficult to accept when we are talking about anomalies.

Jim Gorman

Reply to Bellman

September 28, 2023 11:16 am

I never comment on UAH uncertainty. They use a totally different system to make measurements and calculations. I am not knowledgeable of all the satellite uncertainties from the measuring devices to orbital variances and any judgment would be pure conjecture. I doubt you are either.

I am familiar with terrestrial physical measurements using typical SI units.

Nick Stokes

Reply to Jim Gorman

September 26, 2023 12:44 pm

So do you think measuring successive daily maxima are repetitive measurements of the same thing? They certainly are not repetitive mesurements of what you now say is the measurand, the monthly average.

Jim Gorman

Reply to Nick Stokes

September 26, 2023 1:18 pm

Nick,

“””They certainly are not repetitive mesurements of what you now say is the measurand, the monthly average.”””

Geez, have you ever done repetitive experiments in a basic physics or chemistry lab to determine something like the force of gravity or the mass of a product isolated by filtering?

There isn’t a gravity meter you can use or a product mass meter you can use to weigh a product while in solution.

You take repetitive measurements of measurable quantities that can be averaged to obtain a mean μ and a standard deviation σ that is used to state a value and the dispersion of data around the mean.

If you want the average Tmax temperature for a period of time, i.e., a month, you take daily Tmax measurements to determine the measurand you desire.

I really don’t know why you are asking these basic questions of measurement while questioning the analysis of experimental data.

Clyde Spencer

Reply to Jim Gorman

September 26, 2023 9:21 pm

There isn’t a gravity meter you can use or a product mass meter you can use to weigh a product while in solution.

The unstated assumption is that the gravitational acceleration for a specific location doesn’t change with time. Thus, while one may not actually be measuring the ‘same’ force every time, being unchanging, it is equivalent to measuring the same force, for all practical purposes. This speaks to the issue of stationarity. It doesn’t matter when one measures a physical constant, but it does matter when one measures a temperature. That is why the Tobs (time of observation) became an issue in correcting temperatures derived from satellites with degrading orbits.

-1

Clyde Spencer

Reply to Nick Stokes

September 26, 2023 9:10 pm

The measurand is implicitly defined as the daily samples of Tmax during a single month, where the unstated assumption is that a single month has a negligible trend, and the mean for the month represents the average monthly high temperature, with the uncertainty represented by random variations caused by clouds, changing wind directions, and low/high pressure systems moving through the station site.

As I have pointed out before, one isn’t strictly measuring the same thing multiple times, but it is a good approximation for the impossible. It reflects on the carelessness of researchers that they don’t bother to explicitly state all their assumptions and compromises. But then, I suspect they haven’t given a lot of thought to what they are doing and why.

I just cannot see the point of your link to the definition of stationarity. It isn’t even about measurements.

It is always about measurements, because that is how a time-series is obtained.

Tim Gorman

Reply to Clyde Spencer

September 27, 2023 4:10 am

“It reflects on the carelessness of researchers that they don’t bother to explicitly state all their assumptions and compromises. But then, I suspect they haven’t given a lot of thought to what they are doing and why.”

You pretty much nailed it. You would think that the realization that different Tmax and Tmin values can result in the same mid-range value would be an indication that the mid-range value is not a good index for climate.

That is the ultimate in actual measurement uncertainty. If climate is what you are trying to measure and you can’t separate one climate from another based on your measurement then exactly what do you really know about the climates you are measuring? Ans: Nothing.

bdgwx

Reply to Clyde Spencer

September 27, 2023 8:13 am

The measurand is implicitly defined as the daily samples of Tmax during a single month, where the unstated assumption is that a single month has a negligible trend

The trend in NIST TN 1900 E2 is 0.22 C/day. That is hardly what I’d call negligible.

with the uncertainty represented by random variations caused by clouds, changing wind directions, and low/high pressure systems moving through the station site.

Yep. Exactly as Nick, Bellman, and I have pointed out numerous times. Not that we should have needed to since NIST explicitly states it right there in the example. BTW…there is a 3rd source of uncertainty included in the example as well…the time of day of the measurement.

Likewise, Schmidt’s uncertainty includes the variation caused by ENSO, solar cycles, and other heat flux oscillations into and out of the atmosphere plus a bunch of other components. Not only does Scafetta not account for any of that, but he doesn’t even account for measurement uncertainty.

Jim Gorman

Reply to bdgwx

September 27, 2023 12:29 pm

“””BTW…there is a 3rd source of uncertainty included in the example as well…the time of day of the measurement.”””

NIST captures this in the following:

“””The {εᵢ} capture three sources of uncertainty: natural variability of temperature from day to day, variability attributable to differences in the time of day when the thermometer was read, and the components of uncertainty associated with the calibration of the thermometer and with reading the scale inscribed on the thermometer.”””

Funny how you bring up the trend but not other statistical items. How about the distribution? See the images I have attached. Is this skewed and does a Students T match it well enough?

These are the things textbooks in statistics don’t teach you about the real world.

old cocky

Reply to bdgwx

September 27, 2023 3:43 pm

The trend in NIST TN 1900 E2 is 0.22 C/day. That is hardly what I’d call negligible.

That should be greatest in March and September, and least in June and December.

TimTheToolMan

Reply to Nick Stokes

September 26, 2023 1:38 am

Nick writes

They used exactly that formula to get uncertainty of a monthly average of daily thermometer readings.

And then they combine all those averages areal, weighted and call that a global average.

But every day the first of those readings at say 10pm at 0deg longitude happens nearly 24 hours from the last of the readings at say 10pm at 345deg longitude and there is a lot of weather between those two.

Where does that error live in the calculation?

Nick Stokes

Reply to TimTheToolMan

September 26, 2023 2:23 am

That’s part of the simplicity of max/min. There is only one a day.
But also phase issues fade when you take a monthly average.

TimTheToolMan

Reply to Nick Stokes

September 26, 2023 2:43 am

Nick writes

That’s part of the simplicity of max/min. There is only one a day.

But temperatures aren’t independent like that. For example the same cold front can influence and be measured multiple times at multiple locations if its moving in the right direction.

This isn’t about “fading” its about uncertainty and error.

karlomonte

Reply to Nick Stokes

September 26, 2023 7:08 am

So throwing away information increases your knowledge.

Huh? This is climate “science”.

Jim Gorman

Reply to Nick Stokes

September 26, 2023 7:21 am

All errors are Gaussian and cancel with averaging, right?

Nick Stokes

Reply to Jim Gorman

September 26, 2023 4:57 pm

They )part) cancel because some are positive nd some are negative. That has nothing to do with being gaussian.

-2

Jim Gorman

Reply to Nick Stokes

September 26, 2023 6:00 pm

So a skewed distribution will have errors completely cancel when calculating a mean?

Nick Stokes

Reply to Jim Gorman

September 27, 2023 2:34 am

For any distribution, a finite number of numbers drawn from it won’t have a mean equal to the population mean. But as the sample grows, the mean will converge to the population mean. Law of large numbers, and there is no requirement that the distribution be symmetric.

Tim Gorman

Reply to Nick Stokes

September 27, 2023 3:51 am

You totally missed the point! If the errors don’t totally cancel then how precisely you have located the population mean doesn’t tell you what the measurement uncertainty is.

Why do so many in climate science ALWAYS assume that all measurement uncertainty is random, Gaussian, and cancels? Because that’s the only way you can justify using the SEM instead of the propagated measurement uncertainty from the component data elements onto the averge.

karlomonte

Reply to Tim Gorman

September 27, 2023 5:55 am

There is something overwhelmingly ironic about their constant yammering about the SEM, I will it save for next week (see if you can guess what it is!).

-1

Jim Gorman

Reply to karlomonte

September 27, 2023 6:19 am

I find it ironic that a sampling distribution, i.e., temperatures, are considered a population so you can divide σ by the √N to get an SEM. Technically, the standard deviation of a sample means IS the SEM. I see each temperature as a sample whose size is 1 (one) and the mean of each sample IS the temperature. That makes the √N=√1=1.

karlomonte

Reply to Jim Gorman

September 27, 2023 7:11 am

Exactly, and they run away from this problem every time it is raised.

-1

Tim Gorman

Reply to karlomonte

September 27, 2023 7:31 am

Not a single one of them, not bdgwx, Stokes, bellman, mosher, etc can give a cogent, coherent explanation of how they compensate for the undoubted existence of systematic bias in temperature measurements taken from multiple measurement stations.

They can’t even explain how they compensate for the microclimate systematic bias changes caused by grass below the station changing from green in the summer to brown in the winter.

They just assume that somehow systematic bias always cancels but they can’t explain how. And they simply aren’t willing to even admit that systematic bias exists in the temp measurement data – for if they admit that then they would also have to admit that the SEM is not a valid way to specify the measurement uncertainty of the average.

It’s telling that none of them are even willing to use the term “measurement uncertainty”, preferring instead to use the words “uncertainty of the mean”. It’s an argumentative fallacy known as Equivocation – changing the meaning of a words that are vague hoping no one will notice.

karlomonte

Reply to Tim Gorman

September 27, 2023 7:57 am

The truth can only be that they really don’t care about measurement uncertainty, given the amount of verbal gymnastics they generate trying to support these bizarre claims. Uncertainty is a roadblock along the golden road to CAGW.

old cocky

Reply to karlomonte

September 27, 2023 8:25 pm

Does it involve anomaly baselines?

Jim Gorman

Reply to Nick Stokes

September 27, 2023 5:44 am

Nick,
That is a stock sampling assumption.

The problem is that the temperatures being read ARE the samples and form the sample means distribution.

Have you ever checked to see what that sample means distribution looks like? Even Tmax and Tmin are taken from different shaped funtions. Is the SEM of Tavg very small?

It is why TN 1900 ASSUMES a Students T distribution will work for a monthly average. It is why they say that other methods may have a wider interval.

Remember the CLT you are quoting does say the sample means distribution should be Gaussian regardless of the population distribution. I would advise checking to see if monthly, or 30 year baselines are normal distributions to verify that the SEM is an appropriate statistic.

TimTheToolMan

Reply to Nick Stokes

September 27, 2023 2:54 am

Nick writes

They )part) cancel because some are positive nd some are negative. That has nothing to do with being gaussian.

Not for weather if the weather has a built in bias. Like the polar vortex. Or roaring 40s. Or any number of features that cause a commonly expected result that can change with no reason to expect that change to be randomly distributed in timeframes we care about.

Tim Gorman

Reply to Nick Stokes

September 27, 2023 3:48 am

If they PARTLY cancel then you add them in quadrature. You don’t assume total cancellation! Adding in quadrature still means the measurement uncertainty increases with every data element added to the data set. The larger the number of stations with measurement uncertainty the larger the measurement uncertainty becomes.

You simply cannot assume total cancellation and substitute the SEM for the measurement uncertainty.

bdgwx

Reply to AndersV

September 25, 2023 6:13 am

AndersV: Again, you have to know what the prerequisites for using this formula is. In the case of standard deviation divided by the square root of the number of samples is for “…N measurements of the same quantity….” to use the words of Taylor. The meaning of “same quantity” is what you need to understand.

That is patently false. Taylor does not say that. Neither does Bevington, NIST, JCGM, UKAS, etc.

And all examples in JCGM 100:2008 or NIST TN 1900 are of different things. Every single one of them. And yet the law of propagation of uncertainty or one of its derivatives is applied equally to each one.

And literally E2 in NIST TN 1900 is of measurements of different temperatures.

karlomonte

Reply to bdgwx

September 25, 2023 6:36 am

All the usual lies from bozo-x.

Clyde Spencer

Reply to bdgwx

September 25, 2023 12:10 pm

Turning the angle on a transit sighting multiple times to increase precision has been SOP for land surveyors for a very long time. Note however, that averaging all the readings for every line gives a number, although it is a meaningless number.

One has to be very careful to define just what the measurand is, as noted in the link provided by Stokes (Possolo, 2015):

(1) Measurand & Measurement Model. Define the measurand (property intended to be measured, §2), and formulate the measurement model (§4) that relates the value of the measurand (output) to the values of inputs (quantitative or qualitative) that determine or influence its value. Measurement models may be:
∙ Measurement equations (§6) that express the measurand as a function of inputs for which estimates and uncertainty evaluations are available (Example E3);
∙ Observation equations (§7) that express the measurand as a function of the parameters of the probability distributions of the inputs (Examples E2 and E14).

bdgwx

Reply to Clyde Spencer

September 25, 2023 1:24 pm

I know. I provided the link to that exact text in the post you just responded to. Now can you help Nick and I convince AndersV and the other WUWT participants here who think the inputs to a measurement model must be of the same thing?

Clyde Spencer

Reply to bdgwx

September 25, 2023 3:11 pm

Obtaining the average of a set of numbers is a mechanistic process that can be done for any set of numbers. The issue, as I remarked above, is does the measurand make logical sense? Averaging all the telephone numbers in the world will provide a result, but to what end? You could use it as a proxy for God’s telephone number, but I doubt He/She/It will pick up.

The other issue is the precision of the resulting calculation(s). One could define a measurand as being the weight of all the dogs in the world and all the dog fleas in the world. There is such a large difference in the weight of the two that different methods would have to be used to weigh them, and if a couple fleas jumped off a dog during the weighing process, it might be observed, but could probably not be measured. So, it is important to define a measurand as something that is practical, useful, and measurable. The error in dog weights for a dog shedding hair or drooling would be larger than the weight of the fleas!

What many of us are arguing about/for, is known errors be handled properly so that one knows whether the fleas make a difference or not, rather than just ignoring them. That is, we want a good estimate of what the real global average temperature is, taking into account all known uncertainties so that statistical tests can be performed to determine if historical differences are statistically significant.

So, let me ask you a practical question regarding whether measurements have to be of the same thing. If you are trying to measure the angle turned by a transit to improve the precision, what good does it do to conflate two angles? If measuring the diameter of a ball bearing to determine if a particular machine is producing ball bearings within tolerance, why would you also measure ball bearings from a second machine producing different size ball bearings? It all comes down to the definition of one’s measurand and the purpose to which it will be applied.

Nick Stokes

Reply to Clyde Spencer

September 25, 2023 3:35 pm

“ If measuring the diameter of a ball bearing to determine if a particular machine is producing ball bearings within tolerance, why would you also measure ball bearings from a second machine producing different size ball bearings?”

It would be a very reasonable thing to measure the diameter of the product of a certain firm with same nominal size, and get the mean and sd, and SEM, even if produced by many different machines.

Jim Gorman

Reply to Nick Stokes

September 25, 2023 4:15 pm

And exactly what would you look for in these?

The SD or the SEM?

What does each one tell you about the product?

Would you buy the product if the SEM was 1″ ±0.001″ (SEM) and the SD was 1″ ±0.2″ (SD)?

You have obviously never dealt with real world measurements and informing people that pay money what they are actually paying for. The SEM may have a sacred place for statisticians dealing with numbers, but for those of us who deal with this with some accountability, the Standard Deviation is what matters.

If I am a contractor buying a truck load of 8′ 2×4’s I don’t want to know how accurately the mean was calculated, I want to know how many won’t work for 8′ walls and how many I’ll have to work over because some are too long. That means I want to know the SD and NOT the SEM.

If I’m using a voltmeter I want to know the interval I can expect around the voltage I am currently reading, i.e., the SD. The SEM means little to me since I would need to take 100 readings under repeatable conditions to calculate the SEM to see if it matched your specification.

karlomonte

Reply to Jim Gorman

September 25, 2023 9:10 pm

Neither of them will even try to answer, because they can’t.

Nick Stokes

Reply to Jim Gorman

September 25, 2023 9:18 pm

“Would you buy the product if the SEM was 1″ ±0.001″ (SEM) and the SD was 1″ ±0.2″ (SD)?”
All that tells you is that you looked at 40000 samples. It doesn’t tell you anything new about the product.

Jim Gorman

Reply to Nick Stokes

September 26, 2023 8:58 am

You really skipped over the issue. So let’s try again.

If a salesman offered you buy a truck load of 2×4’s with specs of 8′ ±0.03125″ would you buy them for making 8′ walls?

Clyde Spencer

Reply to Jim Gorman

September 27, 2023 7:45 pm

You really skipped over the issue.

He does that quite frequently.

Clyde Spencer

Reply to Nick Stokes

September 27, 2023 7:44 pm

Note that the question I asked was “why would you also measure ball bearings from a second machine producing different size ball bearings?” You changed that to, “It would be a very reasonable thing to measure the diameter of the product of a certain firm with same nominal size?” Basically, your response is a non sequitur. Not untypical of you, which is why I have accused you of being a sophist.

Tim Gorman

Reply to Clyde Spencer

September 28, 2023 3:30 am

Most of the SEM clique have reading comprehension problems.

bdgwx

Reply to Clyde Spencer

September 28, 2023 11:04 am

Note that the question I asked was “why would you also measure ball bearings from a second machine producing different size ball bearings?”

Because you may be interested doing an analysis on two different types of ball bearings.

This would not be unlike measuring two different types of temperatures: one at the surface and one at the TLT layer. Like the ball bearings one is larger (surface) and one is smaller (TLT layer). Yet we are still interested in doing an analysis on both.

It is intuitive to hypothesize that a common effect, like the planetary energy imbalance, could influence both temperatures in a similar way even though they are of different magnitude. Likewise, it is intuitive to hypothesize that a common effect, like the composition of the alloy, could influence both ball bearings in a similar way even though they are of different magnitude.

-2

Jim Gorman

Reply to bdgwx

September 28, 2023 11:24 am

Your word salad has little to do with measurement uncertainty and more to do with pushing numbers into some analysis.

Clyde Spencer

Reply to bdgwx

September 28, 2023 7:43 pm

Before you get involved with a second experiment to determine the temperature at a different elevation, you should be certain that you are doing it correctly and getting the right answer. Your suggestion is a distraction to the main point, which is whether one obtains the precision of a measurement by dividing by the sq rt of the number of measurements, or adds the uncertainty of measurements in quadrature when it isn’t the same thing measured multiple times with the same measuring device.

You and the others are basically saying that measuring all ~~air parcels~~ chickens is the same as measuring a single ~~air parcel~~ chicken the same number of times as the size of the flock. In the first case, one gets the variance of a measurand of a flock of chickens, while in the second case one gets the precision of the measurand of a particular chicken. Why is that so hard to grasp?

Tim Gorman

Reply to Clyde Spencer

September 29, 2023 3:30 am

It’s not hard to grasp. The problem is that grasping it would also mean recognizing that the uncertainty of the GAT is so large that you can’t distinguish differences in the hundredths digit and likely in the tenths and unit digits. Meaning that climate science would have to find a different way to support claims of global warming.

Jim Gorman

Reply to Clyde Spencer

September 29, 2023 5:51 am

You can measure one chicken 1000 times, find the SEM, AND ASSUME THAT ALL CKICKENS ARE THE SAME. That is, the same μ and the same σ.

Dr. Taylor covers this in his spring example of finding the “k” factor. Once you measure a spring that is outside the interval of the test function, you can no longer rely on the statistics you have assumed apply to everything. That is, all the springs are not similar.

His proof of dividing by √n requires all “samples” have the same μ and σ.

Bellman

Reply to Jim Gorman

September 29, 2023 7:52 am

“You can measure one chicken 1000 times, find the SEM, AND ASSUME THAT ALL CKICKENS ARE THE SAME. That is, the same μ and the same σ.”

Make your mind up. Normally you are insisting that measuring the same thing hundreds of times is the only time you can use statistics.

On reality of course measuring the same chicken 1000 times will tell you no more about all chickens, than measuring one chicken once. That’s because it is not a random sample. All measuring it multiple time will do is allow you to determine the measurement uncertainty.

Tim Gorman

Reply to Bellman

September 29, 2023 4:26 pm

“Make your mind up. Normally you are insisting that measuring the same thing hundreds of times is the only time you can use statistics.” (bolding mine, tpg)

Your lack of reading comprehension skills is showing again.

Assuming the measurand you measured 1000 times represents *all* measurands is simply not logical – yet this is what climate science does.

“On reality of course measuring the same chicken 1000 times will tell you no more about all chickens, than measuring one chicken once. That’s because it is not a random sample. All measuring it multiple time will do is allow you to determine the measurement uncertainty.”

Why then does climate science assume that their GAT represents all climates?

-1

Bellman

Reply to Tim Gorman

September 29, 2023 4:48 pm

“Why then does climate science assume that their GAT represents all climates?”

It doesn’t.

Tim Gorman

Reply to Bellman

September 30, 2023 4:23 am

Your lack of reading comprehension is showing again.

The term “global” has a meaning. You can’t just ignore it like you do measurement uncertainty.

-1

Bellman

Reply to Tim Gorman

September 30, 2023 7:41 am

As does the word “average”.

bdgwx

Reply to Clyde Spencer

September 29, 2023 10:31 am

Before you get involved with a second experiment to determine the temperature at a different elevation, you should be certain that you are doing it correctly and getting the right answer.

No one is saying otherwise.

Your suggestion is a distraction to the main point

You were the one that asked why someone would want to take measurements of different sized ball bearings. I gave you a plausible reason and even related it back to the topic of temperature.

which is whether one obtains the precision of a measurement by dividing by the sq rt of the number of measurements, or adds the uncertainty of measurements in quadrature when it isn’t the same thing measured multiple times with the same measuring device.

Let me repeat…again. You cannot increase the precision of an individual measurement by taking more measurements. It is only the precision of the average that improves as you increase the number of measurements that went into that average and only if the measurements have a correlation coefficient r < 1. This is true regardless of whether the measurements are of the same thing or different things like in the case of NIST TN 1900 E2. And at no time is summation in quadrature a valid procedure for assessing the uncertainty of the average.

You and the others are basically saying that measuring all ~~air parcels~~ chickens is the same as measuring a single ~~air parcel~~ chicken the same number of times as the size of the flock.

No we are not.

Why is that so hard to grasp?

If you’re trying to convince me that the uncertainty of the average is not less than the uncertainty of the individual elements that went into it then reason that is hard to grasp is because it is wrong.

It’s not unlike how some of the contrarians here try to convince me that averages and sums are interchangeable and are then incredulous when I don’t grasp that either. Not only is it wrong to suggest that averages are the same thing as sums, but it is absurdly wrong. That’s why I don’t grasp it.

Tim Gorman

Reply to bdgwx

September 29, 2023 4:30 pm

“Let me repeat…again. You cannot increase the precision of an individual measurement by taking more measurements. It is only the precision of the average that improves as you increase the number of measurements that went into that average and only if the measurements have a correlation coefficient r < 1. “

But it does nothing for estimating the accuracy of the mean you calculate unless systematic uncertainty is either insignificant or zero.

“This is true regardless of whether the measurements are of the same thing or different things like in the case of NIST TN 1900 E2. And at no time is summation in quadrature a valid procedure for assessing the uncertainty of the average.”

You are equivocating again. The issue at hand is the MEASURMENT uncertainty of the average, not how many digits your calculator uses in calculating the average.

Why do you never use the term “MEASUREMENT” uncertainty of the average?

-1

Jim Gorman

Reply to bdgwx

September 29, 2023 4:42 pm

“”””This is true regardless of whether the measurements are of the same thing or different things like in the case of NIST TN 1900 E2.””””

How many times does it need repeating that this example IS measuring the same thing. NIST declares the measurand to be the monthly average. The daily Tmax temps are experimental measurements of this measurand.

Ultimately the experimental standard deviation and the expanded standard deviation of the mean is not terribly different. 2 vs 4 for experimental uncertainty.

The real question is how do you propagate this uncertainty

bdgwx

Reply to Clyde Spencer

September 26, 2023 9:49 am

If you are trying to measure the angle turned by a transit to improve the precision, what good does it do to conflate two angles?

I don’t know. I’m having a hard time visualizing this scenario.

If measuring the diameter of a ball bearing to determine if a particular machine is producing ball bearings within tolerance, why would you also measure ball bearings from a second machine producing different size ball bearings?

Each widget (ball bearing or whatever) is a different thing. It is still perfectly reasonable to compute an average of them and assess the uncertainty of that average.

karlomonte

Reply to bdgwx

September 26, 2023 10:17 am

And this number tells you absolutely nothing.

Jim Gorman

Reply to bdgwx

September 26, 2023 10:37 am

Put some numbers to this example.

Does the experimental standard deviation of the mean provide you any information about the variance of the ball bearings?

Remember it uses √N to reduce the value, so the more you measure, the smaller the numbe because σ doesn’t change.

So, does the experimental standard deviation of the mean tell you more about the mean than it does about the dispersion of possible values surrounding μ? Or, does σ tell you more about the values that ball bearings may have?

Clyde Spencer

Reply to bdgwx

September 26, 2023 9:48 pm

But, you are not justified in conflating them and increasing the claimed precision of all of them just because you measured more of them!

Tim Gorman

Reply to Clyde Spencer

September 27, 2023 4:22 am

They simply can’t accept the fact that if there is *any* systematic bias in the measurements that the measurement uncertainty of the mean will grow with every data element you add to the data set.

Even if the systematic bias is in the thousandths digit, by the time you add 1000 data elements the measurement uncertainty will have grown into the hundredths digit.

The systematic bias in field temperature measurements is most assuredly greater than the thousandths digit once the measurement device has been in place for any period of time. Even the newest PTC sensors are only precise to the thousandths digit and precision is not accuracy.

Nick Stokes

Reply to Tim Gorman

September 27, 2023 2:26 pm

“They simply can’t accept the fact that if there is *any* systematic bias in the measurements that the measurement uncertainty of the mean will grow with every data element you add to the data set.”

No. If a thermometer is reading 1C too high, then the mean of many readings will be 1C too high.

Jim Gorman

Reply to Nick Stokes

September 27, 2023 2:59 pm

And just where do you find this applied in any of the traditional surface temperature data sets? Show us what these data sets use for combined standard uncertainty that has all the components like systematic error.

Tim Gorman

Reply to Nick Stokes

September 27, 2023 3:35 pm

“No. If a thermometer is reading 1C too high, then the mean of many readings will be 1C too high.”

Exactly! And when you average that with a different station whose systematic uncertainty is different what happens to the measurement uncertainty of that average?

And what if a year from now the systematic uncertainty of that station is +1.1C? How do you discern a temperature difference in the hundredths digit?

You *really* don’t expect all measurement station calibration to always be the same from month to month or year to year do you?

If the calibration drifts then anomalies less than the drift can’t be discerned – they would fall in the UNKNOWN.

karlomonte

Reply to Nick Stokes

September 27, 2023 5:05 pm

How do you KNOW it is 1C too high?

TimTheToolMan

Reply to Nick Stokes

September 28, 2023 3:02 am

Nick writes

No. If a thermometer is reading 1C too high, then the mean of many readings will be 1C too high.

If instead of a thermometer, the readings are from a compass with a systematic bias…after lots of readings, what is the uncertainty in your heading?

And for bonus points, why is this different?

Tim Gorman

Reply to TimTheToolMan

September 28, 2023 3:51 am

Oh WOW! This is a good one! Hope you don’t mind if I use it!

karlomonte

Reply to Tim Gorman

September 28, 2023 7:47 am

Yes it is; consider another example, sending a probe to another planet, which requires mid-course corrections: do the mission controllers have the probe make 100 different position measurements and average them to get a higher “accuracy”?

Absolutely not, they design the probe to have the necessary measurement resolution and precision built-in.

bdgwx

Reply to TimTheToolMan

September 28, 2023 7:15 am

If instead of a thermometer, the readings are from a compass with a systematic bias…after lots of readings, what is the uncertainty in your heading?

The uncertainty in each measurement will be the combination of the components of uncertainty arising from both systematic and random effects. The uncertainty of the average will depend on how much of the combined uncertainty is systematic vs random or the correlation coefficient r between any two measurements.

For example, if r = 0.5 and the combined uncertainty of systematic and random effects is u then when y = (a+b) / 2 the general solution is u(y) = sqrt[ 3/4*u(a)*u(b) ]. Note that r = 0.5 is a statement that the systematic and random effects are in equal magnitude.

Let’s put some hard values on it. Let’s say the random effect has an uncertainty u_r = 5 degrees and the systematic effect has an uncertainty u_s = 5 degrees. The combined uncertainty for a single measurement is is thus u = sqrt[ u_r^2 + u_s^2 ] = sqrt[ 5^2 + 5^2 ] = 7.1 degrees. And since u_r = u_s then r = 0.5. Now let’s say we have two measurements a = 90 degrees and b = 120 degrees. The average is y = (a + b) / 2 = (90 + 120) / 2 = 105. The uncertainty in that value is u(y) = sqrt[ 3/4*u(a)*u(b) ] = sqrt[ 3/4 * 7.1 * 7.1 ] = 6.1 degrees. That’s your answer…105 ± 6.1 degrees.

Refer to JCGM 100:2008 equation 16 for details on how I did this. The derivation of u(y) = sqrt[ 3/4*u^2 ] when y = a+b is a bit tricky, but I can walk you through it step-by-step if you like.

You can verify the solution with the NIST uncertainty machine.

And for bonus points, why is this different?

Other than being a compass and the measurements being in different units there is nothing else different. It is all handled the same way.

So for a compass that has a systematic bias of x degrees then all measurements from it are biased by x degrees which means that an average of many measurements will be biased by x degrees as well.

karlomonte

Reply to bdgwx

September 28, 2023 7:48 am

Oh Lord, please have mercy, not the NIST uncertainty machine spam AGAIN.

Jim Gorman

Reply to bdgwx

September 28, 2023 8:24 am

Do you not find it amazing that the two numbers you chose could also be two temps on the globe and end up with an uncertainty of 6°?

Tim Gorman

Reply to bdgwx

September 28, 2023 8:40 am

“ Note that r = 0.5 is a statement that the systematic and random effects are in equal magnitude.”

Do *YOU* know what the u(random) and u(systematic) are for the measurement station at Forbes Air Force Base in Topeka, KS?

If you don’t then you have no way to separate them to do the calculation you just went through.

If you *do* know then show us the math to determine the uncertainty in the mid-range value for yesterday, 9/27/23.

If you don’t know then show us the math to determine the uncertainty in the mid-range value for yesterday, 9/27/23.l

Jim Gorman

Reply to Tim Gorman

September 28, 2023 10:53 am

You are getting old, you just pulled a Biden and repeated yourself!

-1

Jim Gorman

Reply to bdgwx

September 28, 2023 10:12 am

Damn, hit the wrong button.

Let’s see what the GUM says about some of this.

Your functional description requires two measurements “a” and “b” to obtain a Y value!

If your functional relationship is
y = (a + b)/2
then you have only one measurement to use with the assumption of input quantities of 90 and 120.

Your functional description is a simple addition of similar SI units, so a simple RSS calculation will give a value of 7.1 as you indicated.Your measurement IS 105. That makes your value 105 ±7.1°.

You need two additional measurements “c” and “d” obtain a measurement quantity for another measurand. Only then can you find a mean and begin a statistical analysis of two measurements made under repeatable conditions.

Let me add that a measurement quantity of a measurand that consists of an AVERAGE of two physical measurements is probably not a good relationship. I can’t think of where this might occur in the real world. You might enlighten the folks here just what your example might be.

Lastly, when you add additional measurements from new experiments to your data, you need to use Section 4. Section 5 is for input quantities that determine a single measurement

Clyde Spencer

Reply to TimTheToolMan

September 28, 2023 7:47 pm

Yes, if one hasn’t dialed in the local magnetic declination, no amount of readings is going to give one the correct heading.

Tim Gorman

Reply to Clyde Spencer

September 29, 2023 3:35 am

If you are walking 400 miles across the Kansas prairie you would totally miss the town you are looking for. Of if you are in the CO mountains you could totally miss the valley you are looking for with only a 10 mile walk!

There *are* physical consequences that go along with measurement uncertainty. Climate science, at least as advocated for by the likes of bdgwx, refuses to accept this.

-1

bdgwx

Reply to Clyde Spencer

September 27, 2023 6:31 am

But, you are not justified in conflating them and increasing the claimed precision of all of them just because you measured more of them!

Strawman. Nick, Bellman, myself, NIST, JCGM, etc. never said you could. I’m going to tell you what I tell everyone else. That is your argument and yours alone. Don’t try to pin that on us. And don’t expect any of us to defend your arguments especially when they absurd.

What we have been saying is that the uncertainty of the average (not the uncertainty of the individual elements) decreases as the number of elements increases when those elements are correlated at r < 1.

Read that statement carefully. Read it multiple times and burn it into your brain.

-1

karlomonte

Reply to bdgwx

September 27, 2023 7:04 am

What we have been saying is that the uncertainty of the average (not the uncertainty of the individual elements) decreases as the number of elements increases when those elements are correlated at r < 1.

Still bullshit, and a distinction without a difference.

Tim Gorman

Reply to bdgwx

September 27, 2023 7:13 am

What *can* be pinned on you is your use of the unjustified assumption that the temperature measurements have no systematic uncertainty so you can do a statistical analysis of the data and use the SEM as the measurement uncertainty of the data.

No one cares how precisely you locate the average if the average has measurement uncertainty that overwhelms the differences you are trying to identify. If you precisely calculate the average to 100.000001 but the measurement uncertainty of the average is 100.0 +/- .1 you are only fooling yourself that you can tell the difference between 100.000001 and 100.000002. Both of those values in the millionth digit is part of the great UNKNOWN. It’s the location of the diamond in a fish tank full of milk.

It’s why those of us who understand metrology see climate scientists trying to identify temp differences in the hundredths digit as no different than a circus fortune teller gazing into a cloudy crystal ball.

karlomonte

Reply to Tim Gorman

September 27, 2023 7:58 am

Cue Johnny Carson and Carnak.

Tim Gorman

Reply to karlomonte

September 27, 2023 8:05 am

Your age is showing!

karlomonte

Reply to Tim Gorman

September 27, 2023 11:11 am

Heh.

Clyde Spencer

Reply to bdgwx

September 27, 2023 11:23 am

never said you could.

When you say the measurements don’t have to be of the same thing, then that is exactly what you are saying.

bdgwx

Reply to Clyde Spencer

September 27, 2023 1:57 pm

When you say the measurements don’t have to be of the same thing, then that is exactly what you are saying.

Saying that measurements do not have to be of the same thing to apply the procedure in NIST TN 1297 and JCGM 100:2008 is not a statement that the uncertainty of the individual measurements decreases as the number of measurements increases. The first is true. The second is not. Those are two completely different concepts. Neither Nick, Bellman, or myself are conflating them. And it is not unreasonable to demand that you not conflate them either.

And don’t hear what I didn’t say. I did not say that the average of the individual uncertainties is same as the uncertainty of the average. It isn’t. I did not say that error is the same thing as uncertainty. It isn’t. I did not say that the uncertainty of individual measurements decreases as the number of measurements increases. It doesn’t. There is likely to be countless strawmen that some here are going to want to pin on NIST, JCGM, Nick, Bellman, and I that none of us said or advocated for. I repeat again…we are not going to defend the arguments made by others especially when they are absurd.

The only thing that is being said is that the law of propagation holds in all cases. That necessarily means that for a measurement model that computes the average of inputs whose correlation coefficient r is less than 1 the uncertainty of the output (not the inputs) decreases as the number of inputs increases. And it works regardless of whether the inputs are themselves repetitions of a single measurand or different measurands all together measured by different instruments. That’s it. Nothing else is being said.

Tim Gorman

Reply to bdgwx

September 27, 2023 3:21 pm

The first part of the law of propagation is that you cannot assume all measurement uncertainty is random, Gaussian, and cancels. You have to *prove* that the data is random, Gaussian, and cancels.

“That necessarily means that for a measurement model that computes the average of inputs whose correlation coefficient r is less than 1 the uncertainty of the output (not the inputs) decreases as the number of inputs increases.”

You keep forgetting to state the assumptions under which this is true. It is *ONLY* true for random, Gaussian distributions. Primarily where you have multiple measurements of the same thing using the same calibrated instrument and under the same environmental conditions.

“And it works regardless of whether the inputs are themselves repetitions of a single measurand or different measurands all together measured by different instruments.”

Using *your* logic, you can combine the heights of Shetland ponies with the heights of the quarter horses and the measurement uncertainty becomes the SEM – and *NOT* the propagated measurement uncertainty from the individual elements.

In fact, such a thing would give you a multi-modal distribution which is *not* adequately described by the population mean – no matter how precisely you locate that mean by adding more and more height measurements of the different breeds. It’s exactly the same for temperature measurements.

Why do you continue to use the words “uncertainty of the output” or “uncertainty of the mean” when it is the measurement uncertainty of the mean that is at issue?

Jim Gorman

Reply to bdgwx

September 27, 2023 4:10 pm

I’ll ask you again. If your functional relationship is an average of all the inputs, how do you calculate a mean of experimental standard deviation or even experimental standard deviation of the mean. You have one experiment (one sample) with one mean as your only data point.

See the image. Where do you get multiple “k” experiments? If you average all your data into one data point, you can’t even find a mean from multiple experiments because you only have one data point calculated from X₁,₁, …, Xₙ,₁, you won’t have any X₁,₂, … , Xₙ,₂ or other experiments because you have already used the information you have available.

karlomonte

Reply to bdgwx

September 27, 2023 6:15 pm

NIST, JCGM, Nick, Bellman, and I

That you lump your gang of flat earth trendology nutters in with the GUM is beyond ironic.

Jim Gorman

Reply to bdgwx

September 25, 2023 5:00 pm

Now can you help Nick and I convince AndersV and the other WUWT participants here who think the inputs to a measurement model must be of the same thing?

You don’t even know what the “same thing” really means do you?

Define the measurand (property intended to be measured, §2), and formulate the measurement model (§4) that relates the value of the measurand (output) to the values of inputs (quantitative or qualitative) that determine or influence its value.

Define the measurand like in TN 1900. The monthly average of Tmax for a single month at a given station.

“formulate the measurement that relates the values of the output to the input values, i.e., the daily Tmax’s recorded for that month.

Tell us which one of the following conditions of repeatability conditions were violated?

GUM

B.2.15
repeatability (of results of measurements)
closeness of the agreement between the results of successive measurements of the same measurand carried out under the same conditions of measurement
NOTE 1 These conditions are called repeatability conditions.
NOTE 2 Repeatability conditions include:
— the same measurement procedure
— the same observer
— the same measuring instrument, used under the same conditions
— the same location
— repetition over a short period of time.
NOTE 3 Repeatability may be expressed quantitatively in terms of the dispersion characteristics of the results.

Experimental measurements don’t work like the old “single thing measured with the same device multiple times.” Under those conditions the “true value” plus or minus the SEM may be applicable. That is why the GUM includes an experimental standard deviation of the mean in their text. Under some conditions that may be an applicable statement of a value. It is up to the experimentor to publish an adequate statement of value, i.e. the experimental standard deviation to let readers understand that the is a dispersion of values around the mean.

Jim Gorman

Reply to bdgwx

September 26, 2023 8:44 am

Here is what Dr. Taylor says in Section 5.7.

“””Because x1, … , xn are all measurements of the same quantity x, their widths are all the same and are are equal to σₓ,
σₓ₁ = ••• = σₓₙ = σₓ”””

“””We imagined a large number of experiments, in each of which we make N measurements of x and then computed the average x̅ of those N measurements. We have shown that after repeating this experiment many times, our many answers will be normally distributed, that they will be centered on the true value of X, and that the width of their distribution is
σₓ̅ = σₓ/√n.”””

In case you don’t notice, this is exactly describing a sampling distribution where the sample means estimates the “true value” made up of the many x̅ values from multiple sample.

Since temperatures are single measurements of temperature, N = 1, even when averaging a months worth experimental data points. In essence, σₓ̅ = σₓ/√n = σₓ/√1 = σₓ

TN 1900 gets around this by making assumptions about a Students T distribution. TN 1900 also says that:

“””A coverage interval may also be built that does not depend on the assumption that the data are like a sample from a Gaussian distribution. The procedure developed by Frank Wilcoxon in 1945 produces an interval ranging from 23.6 ◦C to 27.6 ◦C (Wilcoxon, 1945; Hollander and Wolfe, 1999). The wider interval is the price one pays for no longer relying on any specific assumption about the distribution of the data. “””

Note the last sentence. NIST notices that the assumptions may not be correct. They admit that another test provides an interval of ±2.

Clyde Spencer

Reply to bdgwx

September 26, 2023 9:43 pm

It is all about definitions. If you repeatedly measure the diameter of a single ball bearing, you have the means of describing the random variation resulting from measurement error (assuming perfect sphericity or at least negligible ellipticity). Thus, you can state an estimate of the probable diameter, +/- a measurement uncertainty. You are justified in increasing the precision estimate by a factor of the sq rt of the number of measurements: The same thing, measured multiple times, with the same instrument.

If you measure the diameter of a sample of 100 ball bearings from a particular batch, you have the means of describing the average diameter of the ball bearings from that batch as a distribution, plus the inherent measurement error from the single ball bearing experiment, above. You aren’t justified in claiming an increase in precision of the individual bearings by a factor of ten: Different things, measured once each, with the same instrument.

Do you understand how that relates to measuring air masses?

Tim Gorman

Reply to Clyde Spencer

September 27, 2023 4:16 am

They don’t understand anything about real world measurements. They have their Stat 101 for non-math majors that describes the SEM as the error of the mean and by Jiminy they are going to stick to that! Anything else is just a unnecessary distraction so it’s easier to just assume it all cancels.

(Note: (population average)/sqrt(n) IS the SEM, it is *not* the measurement uncertainty of the population mean.)

bdgwx

Reply to Clyde Spencer

September 27, 2023 6:25 am

You aren’t justified in claiming an increase in precision of the individual bearings by a factor of ten

No one is saying otherwise. And if you think Nick, Bellman, and I are saying that then you haven’t been reading our posts because we have clear and unequivocal on the point that it is only the uncertainty of average of elements correlated at r < 1 that is lower with a higher element count and not the uncertainty of the individual elements themselves. We have stated that repeated and concisely numerous times.

Do you understand how that relates to measuring air masses?

Yes I do. And I see no reason to challenge NIST’s understanding on the matter either.

-2

Tim Gorman

Reply to bdgwx

September 27, 2023 7:06 am

The SEM, which is what you are calling the “uncertainty of average”, is *NOT* the measurement uncertainty of the average. The measurement uncertainty of the average is the important factor determining what you know and what you *can’t* know.

Calculating the average to the thousandths digit, which is what the SEM tells you, tells you NOTHING about how accurate that mean is. The mean could be off by 100% while the SEM is zero. You simply can’t tell anything about accuracy of the mean from the SEM.

Why climate science refuses to accept that measurement uncertainty is so large in the temperature data that you simply cannot identify differences in the hundredths digit is beyond me.

It *is* assuming that if you can measure a crankshaft journal enough times using a yardstick marked in 1/8″ increments that you can determine the diameter of the crankshaft journal to the .001″. It *is* assuming that you can identify the location of a diamond in a fish tank of milk if you just stare at the tank long enough!

Clyde Spencer

Reply to Tim Gorman

September 27, 2023 11:39 am

Why climate science refuses to accept that measurement uncertainty is so large in the temperature data that you simply cannot identify differences in the hundredths digit is beyond me.

Their religious dogma is that the Earth is warming dangerously from human activities, and the ONLY way they can even make a case for any warming is to make claims such as:

“Further analysis also indicates that if the surface temperature in the last five months of 2023 approaches the average level of the past five years, the annual average surface temperature anomaly in 2023 of approximately 1.26°C will break the previous highest surface temperature, which was recorded in 2016 of approximately 1.25°C …”
DOI: 10.1007/s00376-023-3200-9

If they showed the justifiable significant figures and the correct uncertainty, they would have to conclude that there is no statistically significant difference between the two measurements, and no basis for the recent warming claim — Game Over!

Jim Gorman

Reply to Clyde Spencer

September 27, 2023 4:19 pm

My pet peeve is that after seeing numerous individual stations with little to no warming, not one warmest has ever had the temerity to post some stations that could average out to 1.25°C with a station that has 0°C warming.

One has to wonder what it takes to have some evidence of stations that are warming at a 2.5°C or better that isn’t affected by UHI!

Bellman

Reply to Jim Gorman

September 28, 2023 5:12 am

“not one warmest has ever had the temerity to post some stations that could average out to 1.25°C with a station that has 0°C warming.”

Try looking through GHCN. Here are few from the unadjusted data, using annual averages.

SIE00115076 – POSTOJNA
1962 – 2022. Warming rate 0.48°C / decade. Total warming 2.9°C.

IR000407660 – KERMANSHAH
1951 – 2022. Warming rate 0.42°C / decade. Total warming 3.0°C.

SWE00139498 – HOLJES
1961 – 2022. Warming rate 0.49°C / decade. Total warming 3.0°C.

Jim Gorman

Reply to Bellman

September 28, 2023 7:58 am

Did you do any research on these? I realize you answered the question, but maybe these aren’t the appropriate stations.

I have included an image of a graph for Slovenia. Your graph seems to be anomalous, at least for the country.

The Tmax temps for KERMANSHAH appear to have some changes. ~1980 and ~1994 both show the possibility of a station change of some kind with relatively flat temps afterward.

I guess you should realize that the graphs on WUWT in the past were pretty well researched and not just random finds.

Bellman

Reply to Jim Gorman

September 28, 2023 9:23 am

And there go the goal posts. First you complain nobody shows you stations that show anomalous warming trends, then complain that the trends you are shown are anomalous.

As I said, I deliberately used unadjusted data, as I knew the whining if when it’s adjusted. But that inevitably means there will be all sorts of reasons why a particular stations will show more or less warming than the rest of the area. I just looked at the trend for every station, and randomly selected stations that looked plausible. Rejecting any which show obvious problems or discontinuities.

“I guess you should realize that the graphs on WUWT in the past were pretty well researched and not just random finds.”

Which ones would those be? The one Tim insisted was perfect despite a big chink showing mean temperatures rather than maximums? The one in Tokyo that keeps being used to prove there’s been no warming despite the fact it was moved to a cooler location in several years ago?

Jim Gorman

Reply to Bellman

September 28, 2023 10:19 am

I’m not going to find the actual graphs. A few are from Japan, Australia, U.S. CRN, Great Britain and other places.

I didn’t move the goalposts. I provided a Google located graph for Slovenia. It doesn’t show the warming you have. Do you think this might be indicative of the problem with the temperature database?

Bellman

Reply to Jim Gorman

September 28, 2023 11:03 am

“I provided a Google located graph for Slovenia.”

For 10 years, ending in 2021.

Here are the same years for the GHCN Postojna. Not that different.

karlomonte

Reply to bdgwx

September 27, 2023 7:06 am

You are lying every time you claim the SEM is the “uncertainty of the average”.

And I see no reason to challenge NIST’s understanding on the matter either.

The understanding problem is on YOUR end, not NIST’s.

Tim Gorman

Reply to karlomonte

September 27, 2023 7:23 am

Bevington: “The accuracy of an experiment, as we have defined it, is generally dependent on how well we can control or compensate for systematic errors, errors that will make our results different from the “true” values with reproducible discrepancies. Errors of this type are not easy to detect and not easily studied by statistical analysis.”

Taylor: “As noted before, not all types of experimental uncertainty can be assessed by statistical analysis based on repeated measurements. For this reason, uncertainties are classified into two groups: the random uncertainties, which can be treated statistically, and the systematic uncertainties, which cannot.”

The SEM is a *statistical” treatment – AND IT IS NOT JUSTIFIED WHEN SYTEMATIC BIAS EXISTS IN THE MEASUREMENTS.

These guys keep trying to justify using the SEM as a measurement uncertainty of the average when it simply doesn’t apply at all to temperature measurements.

You are correct, the understanding problem is *NOT* in the NIST, Taylor, Bevington, or Possolo. It is an understanding problem of those who think you can statistically analyze data that has systematic bias (and time varying systematic bias at that) in the face of metrology experts saying that you can *NOT* do so.

Nick Stokes

Reply to karlomonte

September 27, 2023 2:22 pm

“You are lying every time you claim the SEM is the “uncertainty of the average”.”

Here’s NIST’s E2 again

Jim Gorman

Reply to Nick Stokes

September 27, 2023 3:21 pm

One potential source of uncertainty is model selection: in fact, and as already mentioned, a
model that allows for temporal correlations between the observations may very well afford a more faithful representation of the variability in the data than the model above. However, with as few observations as are available in this case, it would be difficult to justify adopting such a model.

The {εi} capture three sources of uncertainty: natural variability of temperature from day to day, variability attributable to differences in the time of day when the thermometer was read, and the components of uncertainty associated with the calibration of the thermometer and with reading the scale inscribed on the thermometer.

Assuming that the calibration uncertainty is negligible by comparison with the other uncertainty components, and that no other significant sources of uncertainty are in play, then the common end-point of several alternative analyses is a scaled and shifted Student’s t distribution as full characterization of the uncertainty associated with r.

A lot of assumptions about uncertainty here. You can wave them away as NIST did in the EXAMPLE, but for scientific work, you can’t just do this.

Again, this example has been set up to allow one to use the SEM in terms of the same thing, multiple times with the same device, and in the same location. Do you think NIST is unaware of the requirement of repeatability conditions?

Do you understand what the difference is between the same thing, multiple times with the same device, and in the same location and single measurements of experimental standard deviation?

If you are using this example as an exemplary way of calculating uncertainty, what do you have to say about the end result of said uncertainty being ±1.8°C? Exactly what causes this large uncertainty to not propagate throughout following calculations?

Tim Gorman

Reply to Nick Stokes

September 27, 2023 3:26 pm

Why don’t you list out the assumptions in E2?

Such as:

No systematic uncertainty
The same measurand
The same instrument each time
the same environmental conditions each time
stationarity of the measurements

Then tell us how these assumptions apply to the global temperature data set.

karlomonte

Reply to Nick Stokes

September 27, 2023 5:24 pm

Here is GUM E.4:

E.4 Standard deviations as measures of uncertainty

E.4.1 Equation (E.3) requires that no matter how the uncertainty of the estimate of an input quantity is obtained, it must be evaluated as a standard uncertainty, that is, as an estimated standard deviation. If some “safe” alternative is evaluated instead, it cannot be used in Equation (E.3). In particular, if the “maximum error bound” (the largest conceivable deviation from the putative best estimate) is used in Equation (E.3), the resulting uncertainty will have an ill-defined meaning and will be unusable by anyone wishing to incorporate it into subsequent calculations of the uncertainties of other quantities (see E.3.3).

E.4.2 When the standard uncertainty of an input quantity cannot be evaluated by an analysis of the results of an adequate number of repeated observations, a probability distribution must be adopted based on knowledge that is much less extensive than might be desirable. That does not, however, make the distribution invalid or unreal; like all probability distributions, it is an expression of what knowledge exists.

E.4.3 Evaluations based on repeated observations are not necessarily superior to those obtained by other means. Consider s(q), the experimental standard deviation of the mean of n independent observations qk of a normally distributed random variable q [see Equation (5) in 4.2.3]. The quantity s(q) is a statistic (see C.2.23) that estimates σ(q), the standard deviation of the probability distribution of q, that is, the standard deviation of the distribution of the values of q that would be obtained if the measurement were repeated an infinite number of times.

It standard deviation that quantifies uncertainty, not your precious SEM.

And where are the repeated observations in air temperature measurements? THERE AREN”T ANY.

And as Jim pointed out, why did you not include the assumptions in the NIST example? More of your sophistry.

Jim Gorman

Reply to Clyde Spencer

September 27, 2023 6:59 am

Dr. Taylor covers this in his spring factor “k” example. You may use a single ball bearing to calculate an uncertainty for 99 remaining ball bearings. However, if you measure a ball bearing that exceeds the first μ+SEM, you must measure it multiple times to find the uncettainty.

Why? Because the SEM is an interval around the mean describing where the mean may lay. If your measurement exceeds this interval, you have a problem. This is one reason to use an expanded SEM.

Tim Gorman

Reply to bdgwx

September 27, 2023 4:05 am

In TN 1900 Possolo SPECIFICALLY states the assumption that the measurements are of the same thing. He also specifically states the assumption that systematic bias is insignificant. Meaning he assumed all error was random, Gaussian, and canceled thus leaving the SEM as the measurement uncertainty.

You simply cannot do that in reality. Neither assumption holds in reality, especially when you are looking at a mid-range value.

The very fact that different Tmax and Tmin values can result in the same Tmid-range should be a clue that the mid-range value is *NOT* a good index for assessing climate!

The law of propagation requires the addition of all measurement uncertainty, in quadrature if there is partial cancellation, in all cases. The addition of the measurement uncertainties cannot be substituted for by using the SEM – except in one specific case. And that one specific case is why Possolo made the assumptions he did in TN1900.

It’s the same for the JCGM and TN1900. You are wanting to shoehorn the reality of the temperature data into the same assumptions that Possolo made – all measurement uncertainty is random, Gaussian, and totally cancels leaving the SEM as the measurement uncertainty.

The SEM and the law of large numbers is ONLY good for determining how precisely you have located the population average. Except in the one specific case the SEM tells you nothing about the accuracy of the population mean. The population mean could be off by 100% while the SEM is zero all because of systematic bias in the individual measurements. That is why you HAVE to account for systematic bias in the real world. It never goes away. It can partially cancel which is why you add in quadrature – but it *NEVER* goes away in real world measurements and that is why the SEM can’t be used as the measurement uncertainty in the real world of temperature measurement.

Tim Gorman

Reply to bdgwx

September 27, 2023 6:58 am

Bevington specifically says that measurements containing systematic bias are not amenable to statistical analysis. He then goes on to analyze distributions, like Gaussian, assuming no systematic bias. His entire book is on distributions whose measurements are assumed to have no systematic bias, only random error.

Almost the same thing applies to Taylor. His Chapter 4 is even titled “Statistical Analysis of Random Uncertainties”. Note carefully the words “Random Uncertainties”.

In the text he says: “As noted before, not all types of experimental uncertainty can be assessed by statistical analysis based on repeated measurements. For this reason, uncertainties are classified into two groups: the random uncertainties, which can be treated statistically, and the systematic uncertainties, which cannot.” (italics are in the text, tpg)

In TN1900, Possolo specifically states that systematic uncertainty has to be assumed to be insignificant in order to use the methodology he proceeds with.

Global temperature data HAS widespread systematic uncertainty. A priori that means the data is *NOT* amenable to statistical analysis. The way it should be treated is how Taylor treats measurement uncertainty in his Chapter 1-3 – i.e. measurement uncertainty ADDS, either directly or in quadrature.

The use of the SEM *is* based on assuming that all measurement uncertainty is random, Gaussian, and cancels. NO SYSTEMATIC UNCERTAINTY.

It’s an assumption that is simply not justified for the real world of temperature measurements using field measuring devices whose calibration is not guaranteed.

It’s why Hubbard and Lin found in 2002 that regional adjustments to measurement devices is simply wrong. Local microclimates are so different that the random error/systematic bias has to be done on a station-by-station basis. That alone should have been a warning to climate science that assuming all measurement uncertainty is random and Gaussian is wrong. But it seems to have made no impression at all – especially on you, Stokes, and bellman.

karlomonte

Reply to Tim Gorman

September 27, 2023 7:09 am

Remember that bgw is a big proponent of fraudulent data mannipulations.

karlomonte

Reply to AndersV

September 25, 2023 6:34 am

They been told this very point many, many times yet refuse to acknowledge the truth.

-1

Clyde Spencer

Reply to karlomonte

September 27, 2023 11:56 am

Did you seen the remark that Stokes made over on Climate Etc. on the 24th about “That’s a common mantra among the uncertainty cranks at WUWT, who never quote any authority for it.?”

bigoilbob

Reply to Clyde Spencer

September 27, 2023 2:15 pm

You can follow that down to his facts on why, and then Andy May’s subsequent silence.

-1

karlomonte

Reply to bigoilbob

September 27, 2023 5:28 pm

Still waiting for Bellcurveman to “school” me, blob.

Bellman

Reply to karlomonte

September 28, 2023 5:17 am

I think you are incapable of being schooled.

Jim Gorman

Reply to Clyde Spencer

September 27, 2023 4:12 pm

Which thread Clyde?

Clyde Spencer

Reply to Jim Gorman

September 27, 2023 8:01 pm

Judith Curry’s blog, Comment and Reply to GRL on evaluation of CMIP6 simulations by Scafetta, https://judithcurry.com/2023/09/24/comment-and-reply-to-grl-on-evaluation-of-cmip6-simulations/#comment-993573

Jim Gorman

Reply to Clyde Spencer

September 28, 2023 6:03 am

Thank you.

Nick doesn’t understand measurement uncertainty and shouldn’t be commenting on it.

As I have pointed out to Nick, as have you, the assumptions in TN 1900 are entirely set up to follow the traditional same thing, multiple measurements, same device, repeatable conditions whereby the SEM is an appropriate statistic.

The GUM is definite about this. Experimental measurements of different stations do not meet repeatable conditions requirement.

karlomonte

Reply to Jim Gorman

September 28, 2023 7:55 am

“Together with Gareth Jones and John Kennedy, he [Gavin] wrote a letter to the Editorial Board of GRL asking them to retract my paper.”

The #1 tactic of leftists — censorship.

karlomonte

Reply to Clyde Spencer

September 28, 2023 7:52 am

“The topic of discussion in this sub-thread is not the correct way to handle data, but instead, your willful lying.” — CS

Amen!

karlomonte

Reply to Clyde Spencer

September 27, 2023 5:27 pm

No I missed it, don’t read CE. Typical Stokes, who thinks he is the world’s foremost expert on absolutely everything.

Jim Gorman

Reply to AndersV

September 25, 2023 12:35 pm

Exactly!

Tim Gorman

Reply to AndersV

September 26, 2023 1:48 pm

Bingo! Someone that understand metrology!

Jim Gorman

Reply to bdgwx

September 25, 2023 8:48 am

The JCGM does not say that. It says;

“””B.2.18 uncertainty (of measurement)”””

“””parameter, associated with the result of a measurement, that characterizes the dispersion of the values that could reasonably be attributed to the measurand. “””

“””C.3.2 Variance”””

“””The variance of the arithmetic mean or average of the observations, rather than the variance of the individual observations, is the proper measure of the uncertainty of a measurement result.”””

The variance of the average of the observations is the proper measure of the uncertainty in a measurement. That is the statistical parameter that characterizes the dispersion of the values that could reasonably be attributed to the measurand.

I have yet to see anyone define what the measurand is, the procedure for determining the observation of the measurand, or the measurment model. I refer you to NIST TN 1900 for a tutorial on how to define an observation equation and a measurement error model.

I’ll say it again, anomalies are not measurements of temperatures they are a delta T.

One more criticism is that all this work should be done Kelvin.

Martin Brumby

Reply to Nick Stokes

September 23, 2023 11:12 am

No Stokes.

The real issue is that Schmidt proved many years ago (15?), that he is either a completely incompetent or, happy to write anything for a price.

Witness his laughable paper on ‘warming’ Antarctica with stations in wildly incorrect locations, others long buried under meters thickness of snow, in addition to smearing warmth from the volcanic peninsula as far around as he dare.

Exposed in detail by Steve McIntyre and others.

Nick Stokes

Reply to Martin Brumby

September 23, 2023 12:00 pm

I think you are thinking of Steig (and have it all wrong).

-6

sherro01

Reply to Nick Stokes

September 23, 2023 4:30 pm

Nick,
You are correct, it was Steig’s work.
Geoff S

TimTheToolMan

Reply to Nick Stokes

September 23, 2023 7:10 pm

Agreed. It was Steig’s work and IIRC Nic Lewis criticised the method and effectively proved Steig’s analysis wrong.

sherro01

Reply to TimTheToolMan

September 24, 2023 12:18 am

Tim,
You are correct about the response(s) to Eris Steig about 2009-8, by Ryan O’Donnell, Nick Lewis, Stephen McIntyre and Jeff Condon. Geoff S

Gary Pearse

Reply to Nick Stokes

September 23, 2023 1:53 pm

This is the Nick of old that I remember! At least, in any serious reply to his comment, some thoughtfulness is warranted. I myself don’t dig down to the minutae of temperature readings because as I speak an algorithm is busy adjusting T down before 1940 and up after 1980. And that was after the Father of Global Warming Hysteria, pushed the 1930s-40s 20th century highhstand down over half a degree C. In doing so he got rid of century’s high and the deep 40yr cooling period late 1940s -1979 that followed, both of which falsified the CO2 control knob hypothesis. Jim Jumanji Climate then retired, as is the wont of the climate changers (remember T. Karl and his Karlization of ocean surface T on the eve of his taking his pension).

So how many angels are dancing on the head of the T algorithm stylis doesn’t interest me. Were I a scientist from the Dark Side, I would be delighted that sceptic scientists had legitimized the Big T jiggering by arguing about tenths of a degree on a bogus record.

Clyde Spencer

Reply to Gary Pearse

September 25, 2023 12:22 pm

Gary, I think it important that when an alarmist says the following:

…, if the surface temperature in the last five months of 2023 approaches the average level of the past five years, the annual average surface temperature anomaly in 2023 of APPROXIMATELY 1.26°C will break the previous highest surface temperature, which was recorded in 2016 of APPROXIMATELY 1.25°C …
[ DOI: 10.1007/s00376-023-3200-9 ]

that we hold their feet to the fire and make them defend their claims because people who vote read the above and are impressed by their credentials.

bnice2000

Reply to Nick Stokes

September 23, 2023 2:03 pm

“Suppose you had an ideal model, which would be a planet B (Earth is A), similar in all respects, including rising GHG.”

So use two FAKE and ERRONEOUS models and compare the output…

…. and assume the difference is real.

Seriously , Nick. !!!

If that is the sort of anti-science you need to rely on….

YOU GOT NOTHING !!!

Javier Vinós

Reply to Nick Stokes

September 23, 2023 2:31 pm

It is actually worse than Niccola Scafetta paints it.

This is a comparison between HadCRUT5 and CMIP6 from:
climexp.knmi.nl/CMIP6/Tglobal/global_tas_mon_ens_ssp245_192_ave.dat
Baseline: 1961-1990

Models are getting wronger by the day. There’s no fix for their problem. And the AMO hasn’t even started going down. This is going to be fun.

Nick Stokes

Reply to Javier Vinós

September 23, 2023 3:28 pm

That data says it is from 2019. It’s from an early time when almost all the data came from CanESM5, which did indeed run rather hot. A more complete sample will show a different picture.

-7

bnice2000

Reply to Nick Stokes

September 23, 2023 4:08 pm

So you are saying that all earlier models RAN HOT

And now you say they don’t (which is of course BS)

Trouble is, the whole AGW scam is built around those earlier models.

Now take you left foot out, and put your right foot in !

Javier Vinós

Reply to Nick Stokes

September 23, 2023 4:15 pm

That is certainly not correct. CanESM5 has 28 members out of 175. Since when is 16% of the data “almost all the data”?

That CMIP6 runs hotter than CMIP5 has made it to Science:
Voosen, P., 2021. Science, 373 (6554) pp.474–475. doi.org/10.1126/science.373.6554.474

Since when are you a Science denier? Since about the same time when 16% of the data became almost all the data?

Javier Vinós

Reply to Javier Vinós

September 23, 2023 4:21 pm

In any case, that the IPCC has gone from highlighting the scariest scenarios and the models that produced the most warming to doing the opposite speaks volumes about this supposed climate emergency and the skill of models in predicting it.

Nick Stokes

Reply to Javier Vinós

September 23, 2023 5:33 pm

“That is certainly not correct. CanESM5 has 28 members out of 175. Since when is 16% of the data “almost all the data”?”

It is certainly enough to create a large bias. But why go back to the very early days in 2019, when only a few results were in?

-12

bnice2000

Reply to Nick Stokes

September 23, 2023 6:15 pm

So.. 28 WRONG MODELS

Hilarious that you even pretend any are accurate. 🙂

Nick Stokes

Reply to bnice2000

September 23, 2023 11:33 pm

28 runs from 1 model

-6

bnice2000

Reply to Nick Stokes

September 24, 2023 2:03 am

“28 runs from 1 model”

So the model was WRONG “at least 27 times out of 28”

WOW how reassuring is that !!

Change feet again, Nick !!

bnice2000

Reply to Nick Stokes

September 23, 2023 7:16 pm

So.. wait for the URBAN data to be fabricated,

Then say, “these models are close to the urban adjusted temperature fabrication”

When all the rest of the models are wayyyyy off !!

And pretend it is nothing more than an accident.

That is “climate science™” for you

Javier Vinós

Reply to Nick Stokes

September 24, 2023 12:06 am

enough to create a large bias

No. If I remove CanESM5 the two curves would approach less than 16%, you wouldn’t be able to tell the difference.

why go back to the very early days in 2019

That’s what’s available in knmi, and it was before the IPCC started cherry-picking the coolest models and changing the baseline to the 21st century to hide the issue.

doonman

Reply to Javier Vinós

September 24, 2023 11:38 am

I Missed the important part. Since when is climate model output data as Nick asserts?

sherro01

Reply to Javier Vinós

September 23, 2023 4:34 pm

Javier,
Some annotation on the graphs would help. What are the red lines, the black li

Andy May

Author

Reply to sherro01

September 23, 2023 4:55 pm

In both graphs the black are observations and the red are models. The top graph is the anomalies and the bottom is rate of change.

sherro01

Reply to Andy May

September 24, 2023 12:39 am

Thanks, Andy & Javier.
Geoff S

Javier Vinós

Reply to sherro01

September 24, 2023 12:10 am

Yes, thank you, Andy. The black thin line is HadCRUT5 13-month running average. The black thick line is a Gaussian smoothing. The dashed red line is the average of the 175 ensemble members (40 models) for CMIP6 available at KNMI Explorer. All curves in anomaly with respect to the 1961-1990 baseline.

Javier Vinós

Reply to Javier Vinós

September 24, 2023 12:14 am

The second graph is the 15-year rate of change of the curves in the first one, expressed as ºC per decade.

About half of the time the models appear clueless about what is going on in the real climate. That indicates more chance than skill.

Sunsettommy

Reply to Javier Vinós

September 24, 2023 8:52 am

Thank you for the clarification.

The inevitable coming cooling trend will utterly destroy their forecast skill of their models which will never predict the cooling at all.

Jim Gorman

Reply to Sunsettommy

September 25, 2023 9:14 am

The Internet now has too many pictures of what is currently being assumed. It will eliminate too much changing. Which means any cooling is going to look outlandish and at the same time destroy CO2 beeping the boogyman.

karlomonte

Reply to Nick Stokes

September 23, 2023 3:10 pm

Nice spaghetti, Stokes.

Nick Stokes

Reply to karlomonte

September 23, 2023 3:12 pm

Not mine, Scafetta’s.

-5

karlomonte

Reply to Nick Stokes

September 23, 2023 5:22 pm

You posted it.

ferdberple

Reply to Nick Stokes

September 23, 2023 3:41 pm

Nick, you are wrong about the eye. Millions of years of evolution have worked wonders. See my remarks from years ago about the uni prof and his mark one eyeball.

steve_showmethedata

Reply to Nick Stokes

September 24, 2023 3:04 am

I agree when you say “But the error that should be obvious here is that he allows no uncertainty in the observations. None at all. Now Andy objects that Schmidt et al have allowed too much, but zero has to be wrong.”

Scafetta (2023, “Reply to “Comment on…”) seems to “want his cake and eat it too” by assuming the time series 2011-2021 is completely deterministic (“the 2011–2021 ERA5-T2m interannual variability—which represents the actual climatic chronology that occurred—cannot be replaced by random data”) so he can handwave-away any stochasticity in each series by grid cell BUT in contrast he assumes (by the statistical method he applies of t-tests by grid cell) that the set of CGMs applied are a simple random sample of some superset of CGMs. You cannot have it both ways. You either have both deterministic (i.e. no stochasticity and thus no variance estimation and hypothesis testing) or consider both (part) stochastic.

My suggestion would be to fit a thin-plate regression spline to each grid cell’s time series of observed ERA5-T2m records by fitting the appropriate linear mixed model (incorporating the thin-plate spline as linear plus random effect terms) to the ERA5-T2m records minus the corresponding prediction of each CGM as an 11 x N catenated vector representing the response variable and including a random effect for each CGM, a random effect for each year in the series, and finally the residual error. The test of no difference would then be based on the support interval for the intercept parameter which under the null hypothesis of zero difference between population-level mean of observations and population-level mean of model predictions (assuming both observations and predictions are sets of random samples within each grid cell). This support interval would be based on the random CGM variance component and the residual variance about the fitted splines for the ERA5-T2m series and the residual representing the interaction of CGM_factor and the times series factor (adjusting for other terms). These last two variances would be greater than zero but less than that obtained by not fitting the spline (i.e. the equivalent of the SJK2023 approach in this last case).

eg in R using MCMCglmm for each grid cell

prior1 <- list(G = list(G1=list(V =1, nu = 0.002), G2 = list(V =1, nu = 0.002), G3 = list(V =1, nu = 0.002), R = list(V=1, nu = 0.002))
>
> m5d.1 <- MCMCglmm(T2m_minus_CGM_pred ~ 1+Years_centred, random=~ spl(Years_centred) + Year_factor + CGM_factor, data = data,
nitt=130000, thin=100, burnin=30000, prior = prior1, family = “gaussian”, pr=TRUE, verbose = FALSE)
>
> summary(m5d.1)

Joseph Zorzin

September 23, 2023 10:52 am

Is there such a thing as “random noise” in climate science? I have no clue.

Andy May

Author

Reply to Joseph Zorzin

September 23, 2023 11:05 am

There is random noise in all measurements, each measurement of any particular thing will be slightly different, which is why we repeat measurements and take an average.

Estimating an average annual temperature for one year twice and getting a different value is truly an estimate of error. Measuring the average temperature for two (or 11) consecutive years, then averaging those is not error, since the underlying climate trends are also in the differences. This is the problem with what Schmidt, Jones, and Kennedy did. They mixed real climate change variations with random error. That is a no-no.

sherro01

Reply to Andy May

September 23, 2023 4:45 pm

Andy,
I disagree that there is random noise in all measurement.
We could argue what random means.
But every measurement variation has a cause and effect. We get into trouble when we cannot or will not discern the direction, magnitude and abundance of all of the causes of variation. It is simply not good enough, but seductively appealing, to hit the “too hard basket” and say it is “noise”. It is worse to next assume that positive and negative noises cancel out. It is easy to find comfort by fiddling with strings of synthetic numbers that appear to support your assumptions, especially when it makes it so much easier. Geoff S

Pat Frank

Reply to sherro01

September 23, 2023 11:20 pm

On point, Geoff.

Every single field calibration of a surface air temperature sensor has revealed considerable systematic measurement error. The error wasn’t merely random in any one of them.

The pervasive assumption of strictly random temperature measurement error is the crack cocaine of AGW climatology.

And the fake equation of model precision as an accuracy metric is their operating fantasy..

karlomonte

Reply to Pat Frank

September 24, 2023 8:46 am

Yet the climate science warriors rant on and on about how non-random error transmogrify into random error which they then ignore.

Sunsettommy

Reply to Pat Frank

September 24, 2023 8:55 am

Yeah, they used temperature data that themselves have been adjusted to climate models that fails to show valid forecast skill thus the whole thing is a waste of time.

Thus, the errors are magnified while they pretend to themselves, they are too small to matter.

Jim Gorman

Reply to Andy May

September 25, 2023 10:04 am

Andy,

Noise is a term that is too casually bandied about. Noise is a signal that is detected along with and emulates the intended signal.

Do temperatures have noise, of course. Wind, humidity, surface types, clouds, or any of a number of environmental conditions can externally change measurements. However, this is what uncertainty intervals are designed to account for.

However, too many people call variations in temperature from day to day. Month to month, year to year at any given station noise also. The variations ARE the signal! The fact that they don’t lie on a simple trend doesn’t mean noise.

Consider each measurement as one from repeating experiments. Those experimental measurements will vary and their variance will be a measure of the spread in uncertainty. Repeatability conditions are important and are discussed in the GUM. One is measurements occurring in a short time. NIST in TN 1900 uses a month of data. To me longer periods begin to begin to add in seasonal changes that increase variance.

The uncertainty I see mentioned here are far to small to represent a dispersion of measurementsp values around a mean.

Lastly, the experimental standard uncertainty of the mean is a statistic describing an interval surrounding a mean where tells one where the mean may lay. It is not a measurement uncertainty informing one of the variance (spread) in the measurements themselves.

I would be very interested to know what the experimental standard uncertainty in this data.

bdgwx

Reply to Joseph Zorzin

September 23, 2023 2:07 pm

It is thing in all disciplines of science. They all utilize measurements that contain a component of uncertainty arising from a random effect.

Jim Gorman

Reply to bdgwx

September 25, 2023 10:17 am

“”D.5.2 Uncertainty of measurement is thus an expression of the fact that, for a given measurand and a given result of measurement of it, there is not one value but an infinite number of values dispersed about the result that are consistent with all of the observations and data and one’s knowledge of the physical world, and that with varying degrees of credibility can be attributed to the measurand. “””

INFINITE NUMBER OF VALUES DISPURSED ABOUT THE RESULT THAT ARE CONSISTENT WITH ALL THE OBSERVATIONS AND DATA.

Tell us how an experimental standard uncertainty of the mean tell anyone about the dispersion of observations and data.

Clyde Spencer

Reply to bdgwx

September 25, 2023 12:26 pm

There can also be systematic variations that have to be identified and removed. One cannot assume that all variation is random.

bdgwx

Reply to Clyde Spencer

September 25, 2023 1:10 pm

I’m addressing Joseph Zorin’s question “Is there such a thing as “random noise” in climate science?” The say the answer is yes. I extend my answer to all disciplines of science. Are you are challenging the answer I have given?

Jim Gorman

Reply to bdgwx

September 25, 2023 6:02 pm

From the GUM.

F.1.1.3 Second, it must be asked whether all of the influences that are assumed to be random really are random. Are the means and variances of their distributions constant, or is there perhaps a drift in the value of an unmeasured influence quantity during the period of repeated observations? If there is a sufficient number of observations, the arithmetic means of the results of the first and second halves of the period and their experimental standard deviations may be calculated and the two means compared with each other in order to judge whether the difference between them is statistically significant and thus if there is an effect varying with time.
B.2.10
influence quantity
quantity that is not the measurand but that affects the result of the measurement
EXAMPLE 1 Temperature of a micrometer used to measure length.
EXAMPLE 2 Frequency in the measurement of the amplitude of an alternating electric potential difference.
EXAMPLE 3 Bilirubin concentration in the measurement of haemoglobin concentration in a sample of human blood plasma.
[VIM:1993, definition 2.7]
Guide Comment: The definition of influence quantity is understood to include values associated with measurement standards, reference materials, and reference data upon which the result of a measurement may depend, as well as phenomena such as short-term measuring instrument fluctuations and quantities such as ambient temperature, barometric pressure and humidity.

Of course there are influences that can cause measurement uncertainty. Is that considered noise? Not really, wind, clouds, etc. are not temperature. Their influence may change a temperature reading, but they are really part of the weather that temperature is measuring. You might as well ask if nighttime temperatures are noise.

Gerald Browning

September 23, 2023 10:54 am

It looks like Schmidt cherry picked which models to include. Also reanalysis data includes model error as it uses models to create it.

Gerald Browning

Reply to Gerald Browning

September 23, 2023 11:16 am

So given reanalysis data is generated by models doesn’t it stand to reason that it would be closer to climate model forecasts?

Andy May

Author

Reply to Gerald Browning

September 23, 2023 11:31 am

The fact that they don’t match suggests one is wrong. I strongly suspect it is the climate models that are wrong, both BEST and ERA5 incorporate as much data as possible. One can argue about the models used and the processing, as I have in the past, but they are based on data and much closer to it than climate models.

bnice2000

Reply to Andy May

September 23, 2023 1:48 pm

So they are showing the effect of massive surface urbanisation.

There is no possible way that using urban tainted surface data can give you a true representation of the Earth’s warming.

bdgwx

Reply to Andy May

September 23, 2023 2:37 pm

I can say confidently that prognostic models (CMIP6) are less correct regarding the global average temperature than diagnostic models (BEST, GISTEMP, ERA5, etc.). The obvious reason is because the prognostic models have to make a prediction of a future state from measurements of a past state whereas the diagnostic models only have to assess the present state using present measurements.

bnice2000

Reply to bdgwx

September 23, 2023 4:10 pm

BEST, GISS et al are built using mostly URBAN and AIRPORT temperatures.

They cannot possibly give a correct global view of the temperature.

Pat Frank

Reply to bnice2000

September 23, 2023 11:27 pm

They’re built on air temperature measurements riddled with massive amounts of systematic error.

bnice2000

Reply to Pat Frank

September 23, 2023 11:40 pm

Yep, well aware of the measurement issues.. and how they have changed with different equipment.

… but apart from that, there is also a lot of spurious local urban and other warming over time.

There is absolutely zero possibility that the surface station data fabrications can give even a remotely true representation of any planetary warming.

sherro01

Reply to Pat Frank

September 24, 2023 12:37 am

Pat,
I have nearly finished an article on UHI using 45 “pristine” Aussie stations and a comparison set of 44 “urban” stations. Like a preview copy?
It is difficult even to select a matching set of stations because anything above about 10 stations of each departs Goldilocks territory and I end up rejecting station after station because of errors and noise (undefined).
My initial expectation was dashed, that Aust would have numerous pristine stations that group into regions like Koppen and provide an estimate of real climate trend without UHI. Multi-year trends for pristine stations over comparable periods like 1970 to 2020 range in Tmax and Tmin from 1 degC negative to above 4 degC per century equivalent with no apparent clustering around a plausible pristine value.
Yet the “experts” allege they can calculate global averages for a year to numbers like +/- 0.1 degC since 1910 or whatever.
By any definition, junk science. Geoff S

bnice2000

Reply to sherro01

September 24, 2023 2:06 am

Unfortunately, even rural stations are very often corrupted by local factors that may not be apparent until visited physically.

And of course, Australia is a VAST country, with many different EVER-changing weather patterns.

Sunsettommy

Reply to Pat Frank

September 24, 2023 8:57 am

Then apply them to climate models and think they are doing good science.

HAW HAW HAW what a pathetic pile of crap they promote are they really that stupid?

Jim Gorman

Reply to Pat Frank

September 25, 2023 10:32 am

The uncertainty of the mean is also the wrong statistics to be using to express uncertainty of measurement. It is in essence a measure of sampling error. That is, how different the samples are in μ and σ. If all the samples equaled μ and σ/√N, the standard error of the mean would be zero. The data itself could have a standard deviation of 200.

It is wrong to throw away the variance of the data.

Pat Frank

Reply to Andy May

September 23, 2023 11:25 pm

“The fact that they don’t match suggests one is wrong.”

Or that they’re both wrong.

bnice2000

Reply to Pat Frank

September 23, 2023 11:43 pm

The best wording would be “at least one is wrong”. 😉

But which one?

And if you have say 10 models that give different results..

.. then “at least 9 are wrong”

and you don’t know which 9.

bdgwx

Reply to Gerald Browning

September 23, 2023 2:18 pm

No. Reanalysis is a type of measurement or diagnostic model. GCMs are a type of predictive or prognostic model. It is true that reanalysis is a more complex model than say the model UAH or Berkeley Earth uses though.

-3

Pat Frank

Reply to Gerald Browning

September 23, 2023 11:24 pm

I’m with you, Jerry.

Reanalysis just constrains models with error-ridden measurement data. The models then interpolate between the points. There’s no reason to think the interpolations are correct. Reanalysis is merely a way to generate fake data.

Gerald Browning

Reply to Pat Frank

September 24, 2023 6:50 am

Hey hi Pat. Can you believe this nonsense? Look at my other post on this site that proves that all the global models are based on the wrong set of dynamical equations.
Minor problem.

Pat Frank

Reply to Gerald Browning

September 24, 2023 7:13 am

I read through your post on models, Jerry, and all the conversation below it. I was very glad to see your corrective publicly applied to the models. The more the better.

The whole present thing is a horrid abuse by science so-called of climate modeling.

Gerald Browning

Reply to Gerald Browning

September 23, 2023 11:24 am

This is climate science at its worst. These are not pure observations but model generated observations compared to those models.

Nick Stokes

Reply to Gerald Browning

September 23, 2023 12:15 pm

“This is climate science at its worst.”

It was Scafetta who made that choice.

-12

sherro01

Reply to Nick Stokes

September 23, 2023 3:37 pm

Nick,
Might I please ask you again, what does the text book say about the optimum statistical method to determine the uncertainty of synthetic measurements?
By synthetic, I mean more than imputed infilling of missing measurements . It also means measurements where one or more factors are knowingly ignored or downplayed in uncertainty estimates. Geoff S

Nick Stokes

Reply to sherro01

September 23, 2023 3:47 pm

You’ll need to look in the text book.

-10

bnice2000

Reply to Nick Stokes

September 23, 2023 4:12 pm

Which one did you use NIck.

Goldilocks and the Three Bears? or Hansel and Gretel?

sherro01

Reply to Nick Stokes

September 24, 2023 12:48 am

Nick,
I have looked for textbook examples without success for years. I have concluded that meaningful statistics should never be used on made-up numbers if a serious use is to be made of the results. The reason is that I can make up another lot of numbers if my first made-ups do not fit my ideals.
By serious, I mean for example political decisions to dictate what type of car I can drive. Geoff S

bdgwx

Reply to sherro01

September 24, 2023 9:37 am

Geoff, have you researched bootstrapping or jackknife resampling? They are top-down techniques that can be used to assess uncertainty even when the sources of error are hard to identify.

-1

Geoff Sherrington

Reply to bdgwx

September 24, 2023 8:33 pm

bdgwx,
Please explain how they incorporare uncertainty into their estimates.
Geoff

bdgwx

Reply to Geoff Sherrington

September 25, 2023 6:05 am

Geoff: Please explain how they incorporare uncertainty into their estimates.

It is probably best to let them explain.

[Rohde et al. 2013]

Jim Gorman

Reply to Geoff Sherrington

September 25, 2023 10:38 am

They can’t because their assumption is that each data point is true and exact. Uncertainty be damned, it isn’t useful.

karlomonte

Reply to Jim Gorman

September 25, 2023 12:03 pm

This is exactly what they believe—a single data point has no “sampling” and therefore can have no uncertainty.

bdgwx

Reply to sherro01

September 23, 2023 5:39 pm

There are multiple methods for doing this. Berkeley Earth, for example, uses jackknife resampling. ERA uses the spread of an ensemble of stochastic perturbations that match the observations.

bnice2000

Reply to bdgwx

September 23, 2023 6:16 pm

And they all use huge amounts of URBAN and AIRPORT data.

Totally inadmissible for “climate change” calculations.

Sunsettommy

Reply to bdgwx

September 24, 2023 9:27 am

Did they get them out of a textbook?

bdgwx

Reply to Sunsettommy

September 24, 2023 10:07 am

Did they get them out of a textbook?

I don’t know. Do academic publications qualify as “textbook”? I ask because Berkeley Earth cites Miller 1974, Tukey 1958, and Quenouille 1949 in their methods paper.

Sunsettommy

Reply to bdgwx

September 24, 2023 2:09 pm

Those are old citations are you sure it hasn’t been changed or improved on since 1949?

bdgwx

Reply to Sunsettommy

September 24, 2023 5:22 pm

I’ve not see anything that would indicate that BEST has changed their method.

Sunsettommy

Reply to bdgwx

September 24, 2023 6:38 pm

I was talking about since 1949……. and before BEST came around.

bdgwx

Reply to Sunsettommy

September 25, 2023 6:04 am

I guess I’m not understanding the question then. Jackknife resampling was documented long before Berkeley Earth was a thing.

karlomonte

Reply to bdgwx

September 25, 2023 6:37 am

Just like you don’t understand propagation of uncertainty.

-1

Jim Gorman

Reply to bdgwx

September 25, 2023 10:43 am

Show how they did any sensitivity analysis using multiple variations that fit inside the experimental standard uncertainty of each data point.

I’ve looked and can find nothing. They all assume that each data point has zero uncertainty, just like every statistical textbook I have examined.

Remember, from Tavg of a day, you are no longer dealing with measurements, but with distributions with uncertainty.

karlomonte

Reply to Jim Gorman

September 25, 2023 12:05 pm

Remember, from Tavg of a day, you are no longer dealing with measurements, but with distributions with uncertainty.

They don’t care—they truly do not care.

bnice2000

Reply to Nick Stokes

September 23, 2023 4:30 pm

Showing how badly the models fail, even using agenda-drive not-data?

MarkW

Reply to Gerald Browning

September 23, 2023 12:27 pm

Is there a climate science at it’s best?

Mr.

Reply to MarkW

September 23, 2023 3:16 pm

Well, there’s not a lot to work with in this field, imo.

But for mine, I reckon among the most honest science produced would be that presented by practising meteorologist Prof. Cliff Mass of University of Washington (state).

He (like quite a few others) fully accepts that human activities over the past century or so have affected the naturally warming climate(s) to a barely discernable degree, but in the broader view of historical climate behaviors, we haven’t got anything to worry about for the foreseeable.

He writes of ~ 1 to 2F detectable warming in the Pacific Northwest over the last 100 years or so. That’s records & observations, not models.

In line with Willis’ observations about the remarkable overall stability of most climates over millenia.

Sunsettommy

Reply to MarkW

September 24, 2023 9:01 am

Yeah before 1980 then the science suddenly improves to a higher level.

Mark BLR

Reply to MarkW

September 25, 2023 7:00 am

Is there a climate science at it’s best?

Yes.

It can be found in the main body of the AR6 WG-I assessment report, where discussion of “uncertainty” and the limitations of models are located.

Even the heavy “filtering / gatekeeping” applied to the SPM cannot keep out all mentions of these issues …

In the “Box SPM.1.2” paragraph inside “Box SPM.1 : Scenarios, Climate Models and Projections”, on page 12 :

Some differences from observations remain, for example in regional precipitation patterns.
…
However, some CMIP6 models simulate a warming that is either above or below the assessed very likely range of observed warming.

Section 1.5.4, “Modelling techniques, comparisons and performance assessments”, page 221 :

Numerical models, however complex, cannot be a perfect representation of the real world.

In the context of the ATL article, “perfect = zero error” (?) …

Section 4.2.5, “Quantifying Various Sources of Uncertainty”, page 566 :

… fitness-for-purpose of the climate models used for long-term projections is fundamentally difficult to ascertain and remains an epistemological challenge (Parker, 2009; Frisch, 2015; Baumberger et al., 2017).
…
However, the long-term perspective to the end of the 21st century or even out to 2300 takes us beyond what can be observed in time for a standard evaluation of model projections, and in this sense the assessment of long-term projections will remain fundamentally limited.

NB : Section 4.3 = “Projected Changes in Global Climate Indices in the 21st Century” and section 4.3.1 = “Atmosphere”.

In sub-section 4.3.1.1, “Surface Air Temperature”, on page 572, the IPCC is quite open about the changes to “projections” when going from the CMIP5 (RCP) model outputs to the CMIP6 (SSP) ones :

In summary, the CMIP6 models show a general tendency toward larger long-term globally averaged surface warming than did the CMIP5 models, for nominally comparable scenarios (very high confidence). In SSP1‐2.6 and SSP2‐4.5, the 5–95% ranges have remained similar to the ranges in RCP2.6 and RCP4.5, respectively, but the distributions have shifted upward by about 0.3°C (high confidence). For SSP5‐8.5 compared to RCP8.5, the 5% bound of the distribution has hardly changed, but the 95% bound and the range have increased by about 20% and 40%, respectively (high confidence). About half of the warming increase has occurred because of more models with higher climate sensitivity in CMIP6, compared to CMIP5; the other half of the warming increase arises from higher effective radiative forcing in nominally comparable scenarios (medium confidence, see Section 4.6.2).

This discrepancy is also analysed in section 4.6.2.2, “Consistency Between Shared Socio-economic Pathways and Representative Concentration Pathways”, on page 618 :

MAGICC7.5 in its WGIII-calibrated setup (see Cross-Chapter Box 7.1) projects differences in 2081–2100 mean warming between the RCP2.6 and SSP1‐2.6 scenarios of around 0.2°C, between RCP4.5 and SSP2‐4.5 of around 0.3°C and between RCP8.5 and SSP5‐8.5 of around 0.3°C (Figure 4.35b). The SSP scenarios also have a wider 5–95% range simulated by MAGICC7.5 explaining about half of the increased range seen when comparing CMIP5 and CMIP6 models. Higher climate sensitivity is, though, the primary reason behind the upper end of the warming for SSP5‐8.5 reaching 1.5°C higher than the CMIP5 results. Compared with the differences between the CMIP5 and CMIP6 multi-model ensembles for the same scenario pairs (Table A6 in Tebaldi et al., 2021), the higher ERFs of the SSP scenarios contribute approximately half of the warmer CMIP6 SSP outcomes (medium confidence).

NB : The title of Chapter 7 of the AR6 WG-I report is The Earth’s Energy Budget, Climate Feedbacks and Climate Sensitivity.

In section 7.1, “Introduction, conceptual framework, and advances since AR5”, on page 929, the IPCC admits that :

The top-of-atmosphere (TOA) energy budget determines the net amount of energy entering or leaving the climate system.
…
It also provides a fundamental test of climate models and their projections.

One of the specific limitations of “reanalyses” products is mentioned in section 7.2.2.3, “Changes in Earth’s surface energy budget”, on page 939 …

ESMs and reanalyses often do not reproduce the full extent of observed dimming and brightening (Wild and Schmucki, 2011; Allen et al., 2013; Zhou et al., 2017a; Storelvmo et al., 2018; Moseid et al., 2020; Wohland et al., 2020), potentially pointing to inadequacies in the representation of aerosol mediated effects or related emission data.

… but they would like the funding to keep flowing regardless.

In section 7.3.1, “Methodologies and representation in models; overview of adjustments”, on page 942 :

While there is currently insufficient corroborating evidence to recommend including tropospheric temperature and water vapour corrections in this assessment, it is noted that the science is progressing rapidly on this topic.

In the Executive Summary to Chapter 8, “Water Cycle Changes”, sub-section “Confidence in Projections, Non-linear Responses and the Potential for Abrupt Changes” on page 1058 addresses another specific issue with “the climate models” …

Representation of key physical processes has improved in global climate models, but they are still limited in their ability to simulate all aspects of the present-day water cycle and to agree on future changes (high confidence).

… and section 11.7.1.3, “Model evaluation”, on page 1588, provides another related to tropical cyclones (“TCs”, AKA “hurricanes”) :

In summary, various types of models are useful to study climate changes of TCs, and there is no unique solution for choosing a model type. However, higher-resolution models generally capture TC properties more realistically (high confidence). In particular, models with horizontal resolutions of 10-60 km are capable of reproducing strong TCs with Category 4-5 and those of 1-10km are capable of the eyewall structure of TCs. Uncertainties in TC simulations come from details of the model configuration of both dynamical and physical processes. Models with realistic atmosphere-ocean interactions are generally better than atmosphere-only models at reproducing realistic TC evolutions (high confidence).

Whoda thunk it ?!?

“Higher resolution” climate models that are “more realistic” perform better than the alternatives …
_ _ _ _ _

When it comes to how the “but scientists are only following the data” discourse has changed since AR6 (2013), as usual a picture paints 1000 words.

The following (if I’ve copied the URL correctly …) is a copy of “FAQ 7.3, Figure 1” from page 1025 (out of 2409) of the AR6 WG-I report.

Note especially how for “future projections” the IPCC scientists compiling AR6 “do not solely rely on models”.

Note also how the AR6 “as judged by the experts” range compares to the “old / obsolete” CMIP5 model ensemble, as opposed to the “new and improved” CMIP6 numbers …

Mark BLR

Reply to Mark BLR

September 25, 2023 7:06 am

As usual, the delay between hitting the “Post Comment” button and spotting an error can be measured in femto-seconds.

It was AR-5 that came out in 2013.

bdgwx

Reply to Gerald Browning

September 23, 2023 2:22 pm

All observations require a measurement model of some kind. Even something as simple as a single spot temperature measurement requires complex thermodynamic, electromagnetic, and material science understanding. A single ASOS temperature report is obtained using a collection of models built on top of more models. If you don’t like model generated observations then you aren’t going to be satisfied with any temperature measurement or science in general for that matter.

-11

Mr.

Reply to bdgwx

September 23, 2023 3:25 pm

Well you’ve convinced me.

Next time I’m in hospital and the care team on my case goes into a huddle to figure out what to do about my raging temperature and erratic pulse, I’ll demand that they revisit the models that they are relying on for their conclusions about my condition.

You may have just saved many lives BDGWX.

Medical science will no doubt ensure that you receive the coveted Fauci Award for “safe and effective” medical treatment practices.

bnice2000

Reply to bdgwx

September 23, 2023 4:32 pm

“model generated observations”

ROFLMAO

There is NO SUCH THING !!

Sunsettommy

Reply to bdgwx

September 24, 2023 9:05 am

Your dependence on irrelevant climate models is why you are not credible here,

Climate models are junk while engineering based models (Designing airplanes, Cars, ships etc.) are good for a reason YOU probably don’t understand.

bnice2000

Reply to bdgwx

September 25, 2023 4:47 am

“requires complex thermodynamic, electromagnetic, and material science understanding”

Which basically ZERO climate scientist have. !!

Jim Gorman

Reply to bdgwx

September 25, 2023 10:52 am

Look at the image and tell us how averaging temperatures reduce uncertainty.

A ±1.8° F max error? Or an RMSE of 0.9° F?

Tell us how averaging reduces that by 3 orders of magnitude!

Andy May

Author

Reply to Gerald Browning

September 23, 2023 11:27 am

Any average global temperature involves a model, there is no way to take a snapshot of global temperature since it changes everywhere instantaneously. ERA5 and BEST both involve models, but at least the models are forced to match what observations we do have, unlike climate models.

Gerald Browning

Reply to Andy May

September 23, 2023 11:38 am

Andy,

Comparing a model with a model especially when both models use the same wrong dynamical system and similar parameterizations (possible tuned the same or different) is nonsensical science.

Jerry

Andy May

Author

Reply to Gerald Browning

September 23, 2023 12:17 pm

Jerry,
That is a bit harsh. Global Average Surface (T2M) Temperature is not a real thing, it must be modeled. At every point on Earth, temperatures constantly change. I’d have some sympathy with what you are saying if we were talking about heat content, but the numbers are so large we would never see a change.

The difference between climate models and reanalysis (ERA5) or composites, like BEST, is that the latter can be tied to data and measurement error can be approximated.

It is this difference that Scafetta is using to test the validity of the climate models. Regardless of the fact both models may be flawed, his comparison has validity.

Pat Frank

Reply to Andy May

September 23, 2023 11:32 pm

“latter can be tied to data and measurement error can be approximated.”

Given our conversations, Andy, it is incredible to me that you would hold that view.

Or tell me you disbelieve all my work.

Andy May

Author

Reply to Pat Frank

September 24, 2023 2:41 am

Oh come on Pat, measurement error can be approximated. Nothing wrong with that statement.

Pat Frank

Reply to Andy May

September 24, 2023 7:15 am

No it can’t. Not air temperature measurement error. It varies unpredictably in time and space. And it’s not random.

karlomonte

Reply to Pat Frank

September 24, 2023 8:42 am

And why does climate science throw away all the standard deviations from all the averaging?

Andy May

Author

Reply to karlomonte

September 24, 2023 8:58 am

Hilarious! That got me laughing. Judging from the argument between Schmidt and Scafetta and many of the comments in this thread, I suspect it is because no one knows how to calculate it, or if they do, they don’t know what it means!

karlomonte

Reply to Andy May

September 24, 2023 9:46 am

You are right on the money, Andy.

Sunsettommy

Reply to Pat Frank

September 24, 2023 9:09 am

Yeah, since air temperature varies all over the world between the measuring stations which only measure in a few feet of space of the air around the measuring device itself most of the planet’s surface isn’t measured at all thus a poor sampling to draw from.

Clyde Spencer

Reply to Pat Frank

September 25, 2023 3:42 pm

My observations suggest that a time-series of still air is highly correlated. The autocorrelation will decrease as windiness increases, particularly if the winds are gusting from different directions. A large step decrease can be expected if a cold front moves through.

Pat Frank

Reply to Clyde Spencer

September 25, 2023 4:10 pm

Lebedeff and Hansen published on correlated air temperatures back in 1987. They found correlation r = 0.5 at about 1000 km separation.

But of course the scatter about the mean was very large.

The large scatter didn’t stop Hansen from extrapolating air temperatures 1000 km into the arctic, though, where no measurements are ever taken.

Jim Gorman

Reply to Andy May

September 25, 2023 10:58 am

Andy,

It isn’t just measurement error. It is the uncertainty introduced via averaging vastly different temperatures from vastly different places.

The uncertainty in the data distribution must be acknowledged too.

Gerald Browning

Reply to Gerald Browning

September 23, 2023 12:49 pm

The ERA can be tied to the obs using a model that can only be kept on track by inserting obs data every 6-12 hours. This does not speak highly of the model accuracy or it’s possible negative impact on the obs data.

bdgwx

Reply to Gerald Browning

September 23, 2023 2:31 pm

ERA assimilates data continuously as it arrives. It is 4D-VAR’d into hourly grids. They provide grid products at the traditional “synoptic” times every 6 hours as well. ERA and other reanalysis datasets have advantages, but they aren’t perfect. In fact, based on a type A evaluation of uncertainty of ERA it could be argued that the traditional datasets (like BEST, HadCRUT, GISTEMP, etc.) are better at measuring the global average temperature.

-3

bnice2000

Reply to bdgwx

September 23, 2023 4:34 pm

WRONG..

GISS, BEST etc do not measure global anything.

They measure mostly urban and airport temperatures

These are not remotely representative of global anything.

Gerald Browning

Reply to Andy May

September 23, 2023 11:51 am

At a minimum of transparency the graphs should have been labeled model generated observations. This is exactly the type of deception used by climate “scientists”.

Nick Stokes

Reply to Gerald Browning

September 23, 2023 12:43 pm

You can’t seem to get it into your thick head that these are Scafetta’s graphs. Scafetta is the local favourite here.

-10

TimTheToolMan

Reply to Nick Stokes

September 23, 2023 2:06 pm

I’d suggest Scarfetta is less favourite here than Gavin is favourite over at RealClimate. Many of us here are sceptical of the science and how its derived and interpreted rather than the people behind it.

Nick Stokes

Reply to TimTheToolMan

September 23, 2023 3:29 pm

Then don’t get your science from Scafetta.

-6

bnice2000

Reply to Nick Stokes

September 23, 2023 4:22 pm

Certainly, NO-ONE is ever going to get any science from Schmidt..

… or from Nick, for that matter. !

TimTheToolMan

Reply to Nick Stokes

September 23, 2023 10:07 pm

Nick says “Then don’t get your science from Scafetta.”

Would you say the same about Steig? He got his trash analysis graphic on the cover of Time Magazine IIRC.

Andy May

Author

Reply to Nick Stokes

September 24, 2023 2:44 am

I’ll take Scafetta over Schmidt, Jones and Kennedy any day of the week after covering this dispute for almost two years.

bdgwx

Reply to Andy May

September 24, 2023 10:02 am

I’ll take Scafetta over Schmidt, Jones and Kennedy any day of the week after covering this dispute for almost two years.

Do you believe that ERA5 has measured the global average temperature with no uncertainty?

Sunsettommy

Reply to Nick Stokes

September 24, 2023 9:24 am

Then don’t get your science from Schmidt.

Meanwhile I lost interest in them years ago as they never stop making changes to them in every new IPCC report which by itself shows they do not have an established forecast skill to run on thus worthless.

bnice2000

Reply to Nick Stokes

September 23, 2023 2:36 pm

Scafetta is mostly correct and is an actual scientist.

Schmidt is mostly wrong and is a paid stooge of the AGW agenda….

Which would you choose to look at first !

bdgwx

Reply to Gerald Browning

September 23, 2023 2:24 pm

All global average temperature datasets are model generated. Even a single spot temperature measurement is model generated. And like Nick mentioned Scafetta is one that choose that particular model.

-2

karlomonte

Reply to bdgwx

September 23, 2023 3:14 pm

You just admitted they are all bogus and fraudulent.

Good job.

Gerald Browning

Reply to karlomonte

September 23, 2023 5:19 pm

Certainly if the model has large errors ( which is the case) mixing the obs and the model data produces a questionable result.

Gerald Browning

Reply to karlomonte

September 23, 2023 5:23 pm

Just for fun it might be interesting to compute the difference between the NCAR reanalysis data and the ECMWF reanalysis data to get a feel for the “observational” error?

Gerald Browning

Reply to Gerald Browning

September 23, 2023 5:27 pm

Anyone up for this?

bdgwx

Reply to Gerald Browning

September 23, 2023 5:33 pm

I’ve done this with MERRA, ERA, and JRA. The type A evaluation of uncertainty is about ±0.15 C.

-2

Gerald Browning

Reply to bdgwx

September 23, 2023 5:53 pm

I am sorry but I am having a hard time believing this about the accuracy of all variables just from the two articles I have just found.

bdgwx

Reply to Gerald Browning

September 24, 2023 5:14 am

It’s not the accuracy of all variables. It is the type A evaluation of uncertainty of just the monthly global average temperature.

Jim Gorman

Reply to bdgwx

September 25, 2023 11:23 am

It is an experimental standard uncertainty of the mean. It basically tells you the interval where the the mean may lay. It is not the uncertainty of the data surrounding the mean.

To be honest, the standard uncertainty of the mean is a sampling statistic. In many papers I’ve read this is determined from the number of stations, i.e., √N.

Gerald Browning

Reply to bdgwx

September 25, 2023 8:46 pm

So the other variables can be in error by 100% between the two reanalysis sets, but their global average temperatures are the same?
Well that just shows how ridiculous that result is. The surface temperature is determined from the state of the remaining variables,
e.g., cloudiness and precipitation. So if those vary in large areas
between the two sets please explain how the daily global (and thus the average) surface temperatures can be close. It must hiding the local inaccuracy of the data sets. Because you have the data, please compute the relative l_2 difference between the other variables and provide them to us.

karlomonte

Reply to Gerald Browning

September 25, 2023 9:14 pm

Climatology is an gigantic exercise in hiding dirty laundry.

Jim Gorman

Reply to Gerald Browning

September 26, 2023 9:28 am

It is very much like some studies in medicine. See here.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1255808/

or here

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2959222/#

From the last article above.

“””Unlike SD, SEM is not a descriptive statistics and should not be used as such. However, many authors incorrectly use the SEM as a descriptive statistics to summarize the variability in their data because it is less than the SD, implying incorrectly that their measurements are more precise.”””

Does this sound like climate science?

Gerald Browning

Reply to bdgwx

September 23, 2023 9:48 pm

Reference,please.

bdgwx

Reply to Gerald Browning

September 24, 2023 5:13 am

MERRA, JRA, and ERA can be downloaded here. Then use the procedure JCGM 100:2008 section 4.2.

I just completed procedure and here are the exact values at 2σ for the monthly values.

MERRA, ERA, JRA: ±0.17 C

UAH, RSS, STAR: ±0.15 C.

GISTEMP, HadCRUT, BEST: ±0.06 C

And sorry I misspoke. It was UAH, RSS, and STAR that was ±0.15 C. MERRA, ERA, and JRA was actually slightly higher at ±0.17 C.

As always I encourage you or anyone to double check my work.

old cocky

Reply to bdgwx

September 24, 2023 3:26 pm

So Scafetta would be correct if he used GISSTEMP instead of ERA?

old cocky

Reply to old cocky

September 24, 2023 4:14 pm

D’oh! ERA was in the L.A. Times thread. That serves me right for flicking between threads 🙁

Carry on.

Nick Stokes

Reply to old cocky

September 24, 2023 4:32 pm

No, because of this basic difference between measurement error and weather variability. These numbers are measurement error (including spatial sampling error, the main one).

But if you are trying to characterise warming over 40 years, why not just subtract 1980 from 2021. Suppose you had zero measurement error, would that not give a perfect answer?

No, because you’d get a different answer if you subtracted 1980 from 2020. That is just due to weather. That is why he chooses a sample of 10 years at each end. It reduces the weather variability effect. And it is that reduced weather uncertainty in a 10 year mean that Gavin et al are calculating. It’s what you want to know.

old cocky

Reply to Nick Stokes

September 24, 2023 5:43 pm

Yeah, bdgwx’s 0.17 and 0.06 figures triggered that particular faux pas.

Actually, it appears he did use ERA, but with BEST uncertainties, which seems incongruous.

What I think Scafetta is trying to do, whether he states it or not, is effectively a treatment trial using the observations as the control. That is indeed quite sensitive to sample selection. The non-stationarity doesn’t help, either. Realistically, all the samples should be compared to determine if the belong to different populations.

Schmidt is taking a different approach, comparing slopes.

It’s an apples and tennis balls comparison.
I don’t think either approach can eliminate the 3 – 4.5 models belonging to the same population as the observations, more because of the model spread than observational uncertainty. The model spread of the high ECS models is high enough to not eliminate them, either. The same would apply to models with an ECS of 0 – 1.5 if they existed and didn’t show cooling. It’s just too noisy.

Nick Stokes

Reply to old cocky

September 24, 2023 8:41 pm

“Schmidt is taking a different approach, comparing slopes.”

No, I think they are the same. The difference between 1980-1990 and 2011-2021 is used as a measure of warming. It isn’t a very good measure, but it’s mainly being used as a statistic to test for difference. The key thing, of course, is that you need the uncertainty of both obs and models.

old cocky

Reply to Nick Stokes

September 24, 2023 9:20 pm

The difference between 1980-1990 and 2011-2021 is used as a measure of warming. It isn’t a very good measure, but it’s mainly being used as a statistic to test for difference.

That’s why it looks like a treatment trial. He’s comparing the end-points – well, fairly broad end-points.

Schmidt is looking at how the observations and models got from point A to point B. That isn’t necessarily a linear progression, although it’s a reasonable approximation in this case.

The key thing, of course, is that you need the uncertainty of both obs and models.

Individual model runs don’t have uncertainties, but there will be a spread of outputs between models and model runs. The inter-model differences should be systematic, and intra-model differences should be pseudo-random due to intentional introduction of noise.

The 2011-21 variance is reasonably high by eye, so that would have been a more relevant critique than using trends. There’s too wide an uncertainty band around the average to reject the hypothesis that both “samples” came from the same population. The use of regression should have been a second line of argument.

Steve’s approach appears superior to either Scafetta’s or Schmidt’s.

bdgwx

Reply to old cocky

September 25, 2023 8:18 am

Just to be clear I was asked to compute the “observation error”. I took that to mean the monthly uncertainty and Gerald Browning didn’t balk at it so I think I did what was requested.

That doesn’t mean it directly applies to Scaffetta and Schmidt debate here. And like Nick is saying one of the components of uncertainty we are focused on here is that caused by short term fluctuations in the global average temperature caused by factors closer to the weather end of the spectrum (eg. ENSO). Schmidt’s method is consistent with the method used in NIST TN 1900 E2. In that example they’re assessment of the uncertainty of a monthly temperature includes a component resulting from weather.

old cocky

Reply to bdgwx

September 25, 2023 4:40 pm

Just to be clear I was asked to compute the “observation error”. I took that to mean the monthly uncertainty

Yeah, I made that comment before having my metaphorical morning coffee.

That doesn’t mean it directly applies to Scaffetta and Schmidt debate here. And like Nick is saying one of the components of uncertainty we are focused on here is that caused by short term fluctuations in the global average temperature caused by factors closer to the weather end of the spectrum

Agreed. as noted in earlier comment above.

Schmidt’s method is consistent with the method used in NIST TN 1900 E2.

Schmidt’s approach to uncertainty, or his overall approach?

The object of the exercise with hypothesis testing is to be able to distinguish between the hypotheses. Hypothesis formulation and experimental design are quite critical there.
Scafetta’s approach is to test whether the model outputs for 2011 – 2021 could have come from the same population as the observations (his H0). He wants to reject that null.
Schmidt didn’t address that, but proposed another pair of hypotheses, using a superset of the data and the null hypothesis that the slopes of the linear regressions are the same. That is a more difficult null to reject.

It all comes down to the variance in both cases.

Jim Gorman

Reply to old cocky

September 25, 2023 5:07 pm

Much of what is done in both is assumes that the values being used have no uncertainty. The only thing being examined uses stated values with no sensitivity analysis done to resolve any uncertainty in those values.

It’s like a physicist saying the speed of light is known exactly and marching ahead ignoring the fact that it is not.

old cocky

Reply to Jim Gorman

September 25, 2023 6:46 pm

Even then, the variance looks high enough to be unable to reject the null. Measurement uncertainty increases the spread.

When hypothesis testing, do the easy bits first.
If you can’t reject H0 on the first pass with lower bounds of variance and uncertainty, there’s no need to look harder.

It appears Scafetta was using claimed measurement uncertainties for his first pass. Even then, he could barely reject one of his nulls because the model spreads are too high, which is why I asked about models with an ECS < 1.5

A more correct approach would have used an appropriate confidence interval around the observation mean (oh, alright, two-tailed t-tests) and compared against individual models/runs. That would have found quite a few of them didn’t belong to the same population as the observations.

karlomonte

Reply to Nick Stokes

September 24, 2023 8:33 pm

Total nonsense, uncertainty is not error.

From where does climate science dig out the true values?

Jim Gorman

Reply to Nick Stokes

September 25, 2023 12:06 pm

Nick,

10 years is 1/3 of the period for determining climate change due to changes in “weather”. Any change in weather over that 10 year period IS significant.

Hiding variability of the signal you are using by averaging is not sçientific. It is akin to p-hacking. In effect it is covering up natural variation that is important to determining the effect of variables.

Nick Stokes

Reply to Jim Gorman

September 25, 2023 1:38 pm

Well, it was Scafetta who chose to average over 10 yars to ge t a measure of warming over 40 years. It is the statistic he used to claim that models were running too hot. You’ve pretty much trashed his paper.

karlomonte

Reply to Nick Stokes

September 25, 2023 5:20 pm

Posted above, I clearly stated they are both wrong.

Jim Gorman

Reply to Nick Stokes

September 25, 2023 5:42 pm

If it is incorrect, then that is your conclusion. As others and myself have pointed out, hiding things like variance through averaging is not a good way to get at most things.

Would you mind if your phone provider averaged out variations in your speech to where no one could tell what you were saying? It is the same thing. I can pretty much do that with the DSP in my radio receiver, but you reach the point where speech becomes unintelligible.

bdgwx

Reply to old cocky

September 24, 2023 5:19 pm

Maybe not. Scafetta’s analysis depends more on the trend uncertainty than the monthly temperature uncertainty.

old cocky

Reply to bdgwx

September 25, 2023 5:01 pm

Thinking about it, Scafetta was trying to bypass the trend by just looking at the (rather broad) endpoints. This is a rather standard approach in other fields (treatment trials). Variances of both samples do need to be handled correctly.

Jim Gorman

Reply to bdgwx

September 25, 2023 11:58 am

Tell everyone what you used for the functional relationship as defined in Equation (1).

If it’s an average as you’ve said before, then tell how you obtain a second value for determining “y” in Section
4.1.4.

4.2.1
In most cases, the best available estimate of the expectation or expected value µq of a quantity q that varies randomly [a random variable (C.2.2)], and for which n independent observations qk have been obtained under the same conditions of measurement (see B.2.15), …”

You might also explain how you reconciled this statement of same conditions of measurement with global temps from all over the globe.

B.2.15 repeatability (of results of measurements)

closeness of the agreement between the results of successive measurements of the same measurand carried out under the same conditions of measurement

NOTE 1 These conditions are called repeatability conditions.

NOTE 3 Repeatability may be expressed quantitatively in terms of the dispersion characteristics of the results.

Look at note 3. See that term again DISPERSION CHARACTERISTICS OF THE RESULTS? What do you think that means.

Then tell us what Taylor has for assumptions when proving the division of σ by √N. There are two at least.

Gerald Browning

Reply to bdgwx

September 23, 2023 5:50 pm

I have found a few articles (one fairly recent one over Brazil) and the ERA precipitation data was the worst. I will get the abstract and post it. There was another earlier one over the polar regions.

Gerald Browning

Reply to Gerald Browning

September 23, 2023 9:05 pm

Here is the first one:

de Lima, J.A.G., Alcântara, C.R. Comparison between ERA Interim/ECMWF, CFSR, NCEP/NCAR reanalysis, and observational datasets over the eastern part of the Brazilian Northeast Region. Theor Appl Climatol 138, 2021–2041 (2019). https://doi.org/10.1007/s00704-019-02921-w

I will post the abstract next

bdgwx

Reply to Gerald Browning

September 24, 2023 9:05 am

Thanks. Unfortunately I’m having a problem opening the full publication right now. I’ll try again later. In the meantime what does it say is the uncertainty of the global average temperature is?

Gerald Browning

Reply to Gerald Browning

September 23, 2023 9:07 pm

Abstract
Many studies have tried to determine the more accurate gridded datasets for a specific region of the world. This subject is complex given all available datasets, modeling approaches, and spatial and temporal resolutions. This study aimed to compare the results of reanalysis derived from climate indexes over eastern part of the Brazilian Northeast Region for temperature extremes and annual accumulated precipitation. Indexes from the Expert Team on Climate Change Detection and Indices were employed to compare the Climate Forecast System Reanalysis, ERA Interim, and National Centers for Environmental Prediction/National Center for Atmospheric Research gridded data with data of 36 stations from the Instituto Brasileiro de Meteorologia network. The results showed that the ERA Interim reanalysis had the lowest root mean square errors when compared to observe accumulated precipitation data and temperature indexes. In addition, this study provided an overview of the geographical characteristics of the error variation for each station studied with the aim of supporting future works.

Gerald Browning

Reply to Gerald Browning

September 23, 2023 9:16 pm

And here is the second one:

ADVANCES IN ATMOSPHERIC SCIENCES, VOL. 27, NO. 5, 2010, 1151–1168
Validation of ECMWF and NCEP–NCAR Reanalysis Data in Antarctica
YU Lejiang1 (), ZHANG Zhanhai1 (), ZHOU Mingyu∗1 (), Shiyuan ZHONG2, Donald LENSCHOW3, Hsiaoming HSU3, WU Huiding4 (), and SUN Bo1 ( )
1Polar Research Institute of China, Shanghai 200136
2Department of Geography, Michigan State University, East Lansing, MI 48824-111, USA 3National Center for Atmospheric Research, Boulder, CO 80307, USA 4National Marine Environmental Forecast Center, Beijing 100081
(Received 8 September 2009; revised 18 December 2009)
ABSTRACT
The European Center for Medium-Range Weather Forecast (ECMWF) Re-Analysis (ERA-40) and the National Centers for Environmental Prediction-National Center for Atmospheric Research (NCEP-NCAR) ECMWF (ERA-40) and NCEP–NCAR reanalysis data were compared with Antarctic station observa- tions, including surface-layer and upper-layer atmospheric observations, on intraseasonal and interannual timescales. At the interannual timescale, atmospheric pressure at different height levels in the ERA-40 data are in better agreement with observed pressure than that in the NCEP–NCAR reanalysis data. ERA-40 re- analysis also outperforms NCEP–NCAR reanalysis in atmospheric temperature, except in the surface layer where the biases are somewhat larger. The wind velocity fields in both datasets do not agree well with surface- and upper-layer atmospheric observations. At intraseasonal timescales, both datasets capture the observed intraseasonal variability in pressure and temperature during austral winter.

-1

Gerald Browning

Reply to Gerald Browning

September 23, 2023 9:24 pm

So here it clearly states that the reanalysis data did not agree well with reality. This is no surprise as the model is being used to interpolate sparse data but the interpolator has serious errors.
At this point let us plot out a few days of both at the same time and their differences.

Gerald Browning

Reply to Gerald Browning

September 23, 2023 9:46 pm

How about vertical component of vorticity at 500 mb?

bdgwx

Reply to Gerald Browning

September 24, 2023 9:26 am

I don’t know about vorticity at 500 mb. Scaffetta is only interested in the global average 2m temperature so I limited my type A evaluation to that and only that. I’d have to process and spatially average the grids manually. Doable. Just a lot of work. I’m also not sure how a global average vorticity would be useful anyway. I don’t even know if we are expecting it to increase, decrease, or stay about the same.

Sunsettommy

Reply to Gerald Browning

September 24, 2023 9:15 am

Even the surface temperature measuring stations around the world produces sparse data since it only measures a few feet of air space around itself thus most of the air remains unmeasured, thus maybe just 2% of the air just above the planet’s surface is actually sampled?

bdgwx

Reply to Gerald Browning

September 24, 2023 9:15 am

I was able to read the second publication. Unfortunately it only has the RMSE for a handful of stations and only for Antarctica at that. It doesn’t quantify the RMSE for the global average temperature. And it is for ERA40; not ERA5.

karlomonte

Reply to bdgwx

September 24, 2023 9:49 am

How do you calculate error without a true value?

Jim Gorman

Reply to karlomonte

September 25, 2023 12:14 pm

You can’t!

Every temperature measurement is a measurement of a new experiment. The only valid uncertainty value is the experimental standard uncertainty of the data.

Nick Stokes

Reply to Jim Gorman

September 25, 2023 9:13 pm

Which is what Schmidt used.

Gerald Browning

Reply to Gerald Browning

September 23, 2023 5:12 pm

I don’t care whose graphs they are. It is deceptive labeling and poor science.

Nick Stokes

Reply to Gerald Browning

September 23, 2023 11:30 pm

But it is Scafetta, not climate science.

bnice2000

Reply to Nick Stokes

September 24, 2023 4:16 am

No, it is Scafetti attempting to show how poor climate science is.

And succeeding.

E. Schaffer

September 23, 2023 10:59 am

Sorry if you think this is off topic, but it is not. There is little use in looking for the correlation between CO2 (forcing) and global temperature, when we already know the latter is driven by another force.

The crucial issue is this here..

Long before 2020 I suggested to cancel air traffic just for the sake of science. My assumption was that the skies would clear up substantially, similar to what happened during 9/11 shutdowns. The expirement was finally done, although because of covid, not for the science. Anyway. In spring 2020 we saw the clearest skies on record. And here is the context..

P. Minnis / NASA
“For a 1% change in absolute cirrus coverage with τ = 0.33, the GCM yielded surface temperature changes (DTs ) of 0.438 and 0.588C over the globe and Northern Hemisphere, respectively”

IPCC (special report on aviation)
“The potential effects of contrails on global climate were simulated with a GCM that introduced additional cirrus cover with the same optical properties as natural cirrus in air traffic regions with large fuel consumption (Ponater et al., 1996). The induced temperature change was more than 1 K at the Earth’s surface in Northern mid-latitudes for 5% additional cirrus cloud cover in the main traffic regions.”

The chart above is from Heerwaarden et al 2021. That paper concludes this..
The coinciding irradiance extreme and a reduction in anthropogenic pollution due to COVID-19 measures triggered the hypothesis that cleaner-than-usual air contributed to the record.
Our analyses show that the reduced aerosols and contrails due to the COVID-19 measures are far less important in the irradiance record than the dry and particularly cloud-free weather.

This is a very simple thing! If aviation does indeed whiten the sky to the extent the data suggest, then they are WITHOUT A DOUBT the main driver of global warming. It will also perfectly fit the pattern of warming (starting in 1970s, predominantly in the NH, excempting the the very south >45°S). That is why Heerwaarden et al had to reject the obvious causality. If not, CO2 driven global warming is busted, for good.

Now the irony is, there are plenty “climate scientists” not even aware of the issue. This is from Leon Simmons on Twitter (I do not really know him, but he is followed by Greta, J. Hansen and A Dessler, so..):

Everyone noticed how the skies changed when planes stopped putting aerosols high in the atmosphere during the pandemic. Emission reductions from the surface happened more gradually, but the effects are more profound and permanent.

https://twitter.com/LeonSimons8/status/1704396513873408293

Peta of Newark

Reply to E. Schaffer

September 23, 2023 11:58 am

That they got any sort of result from contrails is abject nonsense and demonstrates a complete lack of knowledge of climate:

Why: This is something I was told. I had the opportunity to verify on a daily basis.
How: While being a farmer in Cumbria, I was directly under the main flightpath linking London (all airports) with North America.
(Had the folks on Flight 103 Lockerbie looked out the starboard windows, the last thing they’d have seen was me and my farm below them)

At any given time I could see 5 or 6 aircraft in the sky (going both ways) and at night you saw even more thanks to their lights.

The Knowledge was that one could forecast whether it would rain within the next 24 to 36 hours by looking for Cirrus clouds, Mares Tails or: Contrails
All = the same thing really and it was surprising accurate.
Especially useful was the length of the contrail from any individual aircraft – a short trail indicated rain more that 36 hours away and a long, all across the sky trail, meant rain within 12 hours

What that meant was that The Future Weather was already ‘set in motion‘
i.e. The presence, absence or length of a contrail (cirrus cloud) was a consequence of Climate/Weather and NOT THE CAUSE

Frank from NoVA

September 23, 2023 11:01 am

Nick,

So let’s use the eyeball test on the spaghetti plots. Ok, the models stink, and possibly even more than the graphs would indicate if the ‘observations’ are net of the usual NASA-GISS data tampering.

Nick Stokes

Reply to Frank from NoVA

September 23, 2023 12:21 pm

Really? Here is his subplot of models with an ECS in the range 1.5 to 3 C/doubling, basically in the lower half of the IPCC range. The observations are dead centre (see the warming on the right)

-5

bnice2000

Reply to Nick Stokes

September 23, 2023 1:59 pm

Urban temperature are warming.. that is only natural with the majority of surface sites being urban.

There is also no reason whatsoever to assume the warming has been because of human released CO2. UAH shows the only warming has come at El Nino events, so that takes CO2 out of the picture as the cause of the warming.

The base of the AMO was around 1979. That was the period of the great Global Cooling scare.

There was a period of strong solar cycles in the latter half of last century, the trailing average shows solar TSI still very high.

It looks like there is a solar minimum on its way. That will destroy the models completely !

old cocky

Reply to Nick Stokes

September 23, 2023 4:00 pm

Are there any models with an ECS below 1.5?

bnice2000

Reply to old cocky

September 23, 2023 4:42 pm

There will need to be once the coming dip in solar activity starts to bite.

Why do you think the climate stooges and far-left totalitarians are so anxious to get all their anti-western society agenda in place. !

bnice2000

Reply to Nick Stokes

September 23, 2023 4:15 pm

The green in the graph below is a more accurate representation of temperature trends.

It is not as badly affected by urban warming.

It does over-respond to El Nino spikes though.

bnice2000

Reply to bnice2000

September 23, 2023 4:20 pm

note , this was done graphically by scaling to match both axes, then positioned to start at 1980 in line with the Chimps and ERorr5. graphs.

bnice2000

Reply to Nick Stokes

September 23, 2023 4:17 pm

There is actually ZERO EVIDENCE of any CO2 warming in the atmospheric temperature data.

Just ocean effects, mainly ENSO.

Andy May

Author

Reply to Nick Stokes

September 23, 2023 5:05 pm

I’m good with an ECS between 1.5 and 3.0, but I’m with Scafetta on above 3, can’t be that high. It must be below 3 and likely below 2. AR6 says 2.5-4.0, even 2.5 is probably too high.

bnice2000

Reply to Andy May

September 23, 2023 11:48 pm

I’d bet, that when all atmospheric processes are properly considered,..

… CO2 would be responsible for around 0.04% of any warming.

Frank from NoVA

Reply to Nick Stokes

September 23, 2023 5:47 pm

My goodness, Nick! The graphic in your comment used 4 panels to display 38+14+11+13=76 runs, of which only 13 runs are even close to reality. If there was any honor among the modelers, the good ones would be calling BS on the bad ones, but then, that would be the end of the so-called consensus, wouldn’t it?

Rud Istvan

September 23, 2023 11:49 am

Scafetta is right and Schmidt is wrong. But the point is narrow and almost irrelevant to the bigger picture. More broadly:

The CMIP6 models with ECS >= 3 deviate on average by a factor of ~2 from the lower observational EBM ECS estimates of about 1.7.
All CMIP6 models (except INM CM5) produce a tropical troposphere hotspot that does not exist in reality. They run hot where it counts most. INM CM5 has an ECS of 1.8, close to EBM estimates.
Use of hindcast anomalies hides the fact that in actual temperature (C) terms, even after parameter tuning to best Hindcast 30 years, the CMIP6 hindcasts disagree with each other by about +/-3C. That is huge actual hindcast disagreement, when we are supposed to worry about a 1.5C forecast future rise.

Dave Fair

Reply to Rud Istvan

September 23, 2023 1:30 pm

Thanks, Rud, for cutting through the bullshit. Nick and the boys argue minutia while the big stuff flies right over their heads.

bnice2000

Reply to Rud Istvan

September 23, 2023 4:44 pm

Do any of the models show the Nino34 region having zero warming trend (or slight cooling) since 1980?

Jim Masterson

Reply to Rud Istvan

September 23, 2023 4:48 pm

“All CMIP6 models (except INM CM5) produce a tropical troposphere hotspot that does not exist in reality.”

The hotspot is a requirement of the GHE. The fact that it doesn’t actually exist indicates that the science behind the GHE is faulty.

Rud Istvan

Reply to Jim Masterson

September 23, 2023 7:55 pm

Yah. And the reason is telling. INM CM5 parameterized ocean rainfall according to ARGO observations. Much more than previously thought. So much less positive water vapor feedback, since it rains out. So no false Tropical troposphere hotspot, and a much lower ECS. Simple observational lower positive water vapor feedback than in all other CMIP6 models.

Jonny5

Reply to Rud Istvan

September 24, 2023 7:33 am

I appreciate this information, sometimes as a newbie to all this I can get a bit lost in the more technical posts. This is so straightforward and makes a lot of sense, CO2 isn’t scary without feedbacks and new observations show water vapor feedback is less than predicted. When modelled the predictions become much closer to reality showing “dangerous warming” won’t happen, at least not from human caused CO2 increases.

Sunsettommy

Reply to Jim Masterson

September 24, 2023 9:20 am

BINGO!

Some of the two regions that was supposed to show the “hot spot” actually cooled a little bit…….., which is why warmist/alarmists stopped talking about it.

HAW HAW HAW

Bob

September 23, 2023 1:45 pm

Very nice.

ferdberple

September 23, 2023 3:38 pm

Years ago we had a performance problem and our boss brought in the prof that taught him analysis in uni.
I compiled all the data and graphed it out. I then did a statistical analysis on the data and proudly presented my work to the prof.
I was quite surprised when he put aside the analysis and picked up the graph. “This I trust” he remarked. “The other not so much”.

karlomonte

September 23, 2023 5:20 pm

Once again, uncertainty is not error.

Pat Frank

September 23, 2023 11:03 pm

“In fact, a proper analysis of the ensemble of observed global surface temperature members yields a decadal-scale error of about 0.01–0.02°C, as reported in published records. BEST (Berkeley Earth Land/Ocean Temperature record) derives an error of +/- 0.018- 0.020 °C for the 11-year period 2011-2021 (1951-1980 anomalies and the April 2023 version of the BEST dataset).”
Fat farking chance.

Even the aspirated USCRN PRTs have poorer resolution than that — typically ±0.05 C. How is it even possible to claim an accuracy below the instrumental detection limit?

bnice2000

Reply to Pat Frank

September 23, 2023 11:12 pm

“How is it even possible to claim an accuracy below the instrumental detection limit?”

The unicorns do it !

They are very good at maths, you know 😉

karlomonte

Reply to Pat Frank

September 24, 2023 8:38 am

And where did all the standard deviations run away to?

Clyde Spencer

Reply to karlomonte

September 25, 2023 3:59 pm

They were playing with the unicorns and the Arc launched without them.

David Wojick

September 24, 2023 12:56 pm

This now up from Scafetta:
https://judithcurry.com/2023/09/24/comment-and-reply-to-grl-on-evaluation-of-cmip6-simulations/

DMacKenzie

September 24, 2023 3:37 pm

Why do “observations” only go to 2021 ? Truncated series are generally used to hide something.

Jim Gorman

September 25, 2023 12:30 pm

Honestly, much of the measurement discussion is interesting but the fact that averaging hides information shouldn’t be overlooked.

We have all seen numerous locations from all over the globe posted here with little to no warming. I can’t continue ignoring the fact that NO ONE posts stations that have sufficient warming to offset these stations to such an extent as to cause 1.5 – 2° C warming since 1850. If the warmists here want to have some credibility they should start showing some of these stations.

bigoilbob

September 30, 2023 6:30 am

Yes, I’ve been lurking. Sorry/not sorry that this comment thread didn’t work out for Mr. May. As with Dr. Frank, Mr. Eschenbach and others, he ducked out when things went sideways for him. Usually then, there’s a quiet stew period, and the claims will reappear in some other apparition, within a few weeks…

wpDiscuz

Watts Up With That?

More on the statistical dispute between Scafetta and Schmidt

Works Cited

Like this:

Related

Works Cited

Share this:

Like this:

Related