September 2011

$e^{\pi i} = -1$

Thanks, JetPack!

NASA’s Solar Dynamics Observatory captured last week’s X1.4-class solar flare on camera. Higher resolution video and more information here.

The possible existence of heteroscedasticity is a major concern in the application of regression analysis, including the analysis of variance, because the presence of heteroscedasticity can invalidate statistical tests of significance that assume the effect and residual (error) variances are uncorrelated and normally distributed. —Wikipedia

Perhaps I’m overeager to use one of my favorite words, but the more I look at Figure 11 of The Neutrino Preprint, the more I think I see a hint of heteroscedasticity in the residuals. If present, it would support the possibility that the model used for the best fit analysis (a one-parameter family of time-shifted scaled copies of the summed proton waveform) was not appropriate. See my previous post for some background.

[Image: bottom half of Figure 11 from the preprint]

The figure above (which is the bottom half of Figure 11) shows the best fit of the complete summed proton waveform (red) vs. the observed neutrino counts (black), summarized using 150 nanosecond bins. For both extractions (left and right), the residuals of the fit (the distances from the red curve to each black dot) appear possibly heteroscedastic in two ways.

First, they seem to be slightly (negatively) correlated with the time scale — positive residuals are more likely towards the beginning of the pulse, negative residuals towards the end. Second, there may be a slight negative correlation of the variance of the residuals with the time scale as well. The residuals seem to become more consistent — vary less in either direction from zero — from left to right. [I didn’t pull out a ruler and calculate any real statistics.]
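As a rough sketch of how the two impressions above could be put into numbers: correlate the residuals with time (a drift in the mean), and correlate the absolute residuals with time (a drift in the spread). The residual values below are invented for illustration and are not read off Figure 11; `pearson` is a helper written just for this sketch. A formal test (Breusch-Pagan, for instance) would be the next step with real data.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical residuals (observed minus fitted), one per 150 ns bin.
# ASSUMPTION: these values are made up; they are NOT taken from Figure 11.
times = [150 * i for i in range(10)]            # bin start times in ns
residuals = [9, -6, 8, -4, 5, -5, 2, -3, 1, -2]

# Negative value: residuals tend to go from positive to negative over time.
trend_in_mean = pearson(times, residuals)

# Negative value: residuals tend to shrink in size over time.
trend_in_spread = pearson(times, [abs(r) for r in residuals])

print(f"residual vs. time:   {trend_in_mean:+.2f}")
print(f"|residual| vs. time: {trend_in_spread:+.2f}")
```

With these made-up numbers both correlations come out negative, which is the pattern described above. Whether the real residuals behave this way is exactly what would need measuring.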

To be fair, there is little evidence of heteroscedastic residuals in Figure 12 (below), which shows a zoomed-in detail of the beginning and end of each extraction, summarized into 50 nanosecond bins. In all, only about a sixth of the waveform is shown at this resolution. (A data point appears to have been omitted from this figure; between the first two displayed bins in the second extraction, there should probably be a black point to indicate that zero neutrinos were observed in that 50 ns interval.)

[Image: Figure 12 from the preprint]

The authors report some tests of robustness; for example, they analyzed daytime and nighttime data separately and found no discrepancy. They also calculated and report a reduced chi-square statistic that indicates a good model fit. They may also have measured the heteroscedasticity of the residuals, but they don’t mention it.

They do say a fair bit about how they obtained the summed proton waveform (the red line) used for the fit, but so far I don’t see any indication that they considered the possibility of a systematic process occurring over the length of each proton pulse that caused the ratio of protons to observed neutrinos to vary.

Then again, I don’t understand every sentence in the paper that might be relevant, such as this one: “The way the PDF [the probability density functions for the proton waveform] are built automatically accounts for the beam conditions corresponding to the neutrino interactions detected by OPERA.” And I’m not a physicist or a statistician.

[I’ve posted a follow-up here: Heteroscedasticity in the Residuals?]

When applying statistics to find a “best fit” between your observation and reality, always ask yourself “best among what?”

The CERN result about faster-than-light neutrinos is based on a best fit. If the authors were too restrictive in their meaning of “among what,” they might have missed figuring out what really happened. And what might have really happened was that the neutrinos they detected had not traveled faster than light.

The data for this experiment was, as usual, a bunch of numbers. These numbers were precisely-measured (by portable atomic clocks and other very cool techniques) arrival times of neutrinos at a detector. The neutrinos were created by shooting a beam of protons into a long tube of graphite. This produced neutrinos, some of which were subsequently observed by a detector hundreds of miles away.

Over the course of a few years, the folks at CERN shot a total of about 100,000,000,000,000,000,000 protons into the tube; they observed about 15,000 neutrinos. The protons were fired in pulses, each pulse lasting about 10 microseconds.

A careful statistical analysis of the data, the authors report, indicates that the neutrinos traveled about 0.0025% faster than the speed of light. Whooooooosh! Furthermore, because the experiment looked at a lot of neutrinos and the results were consistent, the experiment indicates that in all likelihood the true speed of neutrinos was very close to 0.0025% faster than the speed of light, and almost without doubt at least somewhat faster.
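To put 0.0025% into units of time, here is a back-of-the-envelope sketch. The post only says the detector was "hundreds of miles away", so the baseline of roughly 730 km used below is an assumption (the commonly quoted CERN-to-Gran Sasso distance), not a number taken from this text.

```python
C = 299_792_458.0        # speed of light in m/s (exact, by definition)
BASELINE_M = 730_000.0   # ASSUMPTION: ~730 km baseline; not stated in the post
EXCESS = 0.0025 / 100    # "0.0025% faster", expressed as a fraction

# Travel time at light speed, and at a speed 0.0025% higher.
light_time_s = BASELINE_M / C
neutrino_time_s = BASELINE_M / (C * (1 + EXCESS))

# How much earlier than light the neutrinos would arrive, in nanoseconds.
early_ns = (light_time_s - neutrino_time_s) * 1e9

print(f"light travel time: {light_time_s * 1e3:.3f} ms")
print(f"early arrival:     {early_ns:.0f} ns")
```

Under that assumption the head start is on the order of 60 nanoseconds, a small fraction of a pulse that lasts about 10 microseconds. That is why the result rests on a statistical fit rather than on timing any single neutrino.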

If the experimental design and statistical analysis are correct (and the authors are aware they might not be, though they worked hard to make them correct), this is one of the great experiments of all time.

So far, I haven’t read much scrutiny of the statistical analysis pertaining to the question of “among what?” But Jon Butterworth of The Guardian raised one issue, and I have a similar one.

Look at the graph below, from the preprint.

[Image: graph from the preprint]

The statistical analysis of the data was designed to measure how far to slide the red curve (the summed proton waveform) left or right so that the black data points (the neutrino observation data) fit it most closely.

The experiment didn’t detect individual neutrinos at the beginning of the trip. The neutrinos were produced by 10-microsecond proton bursts, and neutrinos were expected to appear in 10-microsecond bursts at the other end. The time between the bursts, then, should indicate how fast the individual neutrinos traveled.

To get the time between the bursts, slide the graphs back and forth until they align as closely as they can, and then compare the (atomic) clock times at the beginnings and ends of the bursts.
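The sliding step can be sketched in a few lines. This is a toy, not the procedure in the preprint: the pulse shape (`trapezoid`), the bin range, and the least-squares criterion are all assumptions chosen for simplicity, and the "observed" data are just the template moved by a known amount.

```python
def trapezoid(t, start=20, rise=5, flat=60):
    """Toy pulse shape: ramps up, stays flat, ramps down. Units are bins."""
    end = start + 2 * rise + flat
    if t < start or t > end:
        return 0.0
    if t < start + rise:
        return (t - start) / rise
    if t > start + rise + flat:
        return (end - t) / rise
    return 1.0

def best_shift(template, observed, max_shift):
    """Return the integer shift of `template` that minimizes the summed
    squared difference to `observed` (both are functions of bin index)."""
    bins = range(120)

    def cost(shift):
        return sum((observed(t) - template(t - shift)) ** 2 for t in bins)

    # Try every candidate shift and keep the one with the smallest cost.
    return min(range(-max_shift, max_shift + 1), key=cost)

TRUE_SHIFT = 3  # hypothetical delay, in bins, built into the fake data

def observed(t):
    """Fake observations: the template itself, delayed by TRUE_SHIFT bins."""
    return trapezoid(t - TRUE_SHIFT)

print(best_shift(trapezoid, observed, max_shift=10))
```

When the observed burst really is a shifted copy of the template, the search recovers the shift that was put in. The interesting question is what happens when it is not an exact copy.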

For this to give the right travel time, and more importantly, to be able to evaluate the statistical uncertainty, the researchers appear to have assumed that the shape of the proton burst upstream of the graphite rod exactly matched the shape of the neutrino burst at the detector (once adjusted for the fact that the detector sees about one neutrino for each 10 million billion or so protons in the initial burst).
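The "10 million billion or so" figure is a rounding of the totals quoted earlier (about 100,000,000,000,000,000,000 protons and about 15,000 neutrinos). A quick check of the arithmetic, using those rounded totals:

```python
protons = 10 ** 20       # "about 100,000,000,000,000,000,000 protons"
neutrinos = 15_000       # "about 15,000 neutrinos"

# Average number of protons fired per neutrino actually observed.
protons_per_neutrino = protons / neutrinos

print(f"{protons_per_neutrino:.1e} protons per detected neutrino")
```

That works out to a little under 7 million billion protons per detected neutrino, the same order of magnitude as the round figure above.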

Why should the shapes match exactly? If God jiggled the detector right when the neutrinos arrived, for example, the shapes might not match. More scientifically plausibly, though, at least to this somewhat-naïve-about-particle-physics mathematician, what if the protons at the beginning of the burst were more likely to create detectable neutrinos than those at the end of the burst? Maybe the graphite changes properties slightly during the burst. [Update: It does, but whether that might affect the result, I don’t know.] Or maybe the protons are less energetic at the end of the bursts because there’s more proton traffic.

The authors don’t tell us why they assume the shapes match exactly. There might be good theory and previous experimental results to support the assumption, but if so, it’s not mentioned in the paper. The authors do remark that a given “neutrino detected by OPERA” might have been produced by “any proton in the 10.5 microsecond extraction time.” But they don’t say “equally likely by any proton.”

If protons generated early in the burst were slightly more likely to yield detectable neutrinos, then the data points at the left of the figure should be scaled down and those at the right scaled up, if the observational data is expected to indicate the actual proton count across the burst.

If that’s the case, then the adjusted data might not have to be shifted quite so far to best match the red curve. And the calculated speed would be different.
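Here is a toy illustration of that worry, under loudly stated assumptions. The pulse shape, the linear fall-off in detection efficiency (`tilt`), and the least-squares fit with a free scale factor are all invented for this sketch. The tilt is deliberately exaggerated, and nothing here says how large any real effect would be.

```python
def trapezoid(t, start=20, rise=30, flat=20):
    """Toy pulse shape, symmetric about bin 60. Units are bins."""
    end = start + 2 * rise + flat
    if t < start or t > end:
        return 0.0
    if t < start + rise:
        return (t - start) / rise
    if t > start + rise + flat:
        return (end - t) / rise
    return 1.0

BINS = range(120)

def observe(tilt):
    """Template times an efficiency that falls linearly across the pulse.
    No time shift is applied: the 'true' shift is zero by construction."""
    centre = 60
    return [trapezoid(t) * (1 - tilt * (t - centre)) for t in BINS]

def best_shift(observed, max_shift=10):
    """Integer shift of the template that best fits `observed`, allowing a
    free overall scale factor (least squares)."""
    def cost(shift):
        model = [trapezoid(t - shift) for t in BINS]
        # Best scale factor for this shift, from the least-squares formula.
        scale = (sum(o * m for o, m in zip(observed, model))
                 / sum(m * m for m in model))
        return sum((o - scale * m) ** 2 for o, m in zip(observed, model))

    return min(range(-max_shift, max_shift + 1), key=cost)

print(best_shift(observe(tilt=0.0)))     # same-shape burst
print(best_shift(observe(tilt=0.0075)))  # early protons count for more
```

In this toy setup the same-shape burst is best fit with no shift at all. The tilted burst is best fit by sliding the template toward earlier times, even though no shift was put into the data. A negative shift here corresponds to neutrinos appearing to arrive early.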

Whether this would make enough of a difference to bring the speed below light-speed, I don’t know and can’t guess from what’s in the preprint. And of course, there may be good reasons for same-shape bursts to be a sound assumption.

[Disclaimer: I’m a mathematician, not a statistician or a physicist.]

Come hear the Dessoff Chamber Choir on tour with Ray Davies
performing The Kinks Choral Collection

Dessoff blew the roof off New York’s Town Hall and the Late Show with David Letterman the last time rock legend Ray Davies was in town; this year, we’re taking it on the road!

The Dessoff Chamber Choir backing up Ray Davies at Town Hall in 2010. (Victoria)


Tickets for Montclair and Boston are on sale now. Those for the Beacon go on sale this Saturday, September 17th, at 10:00 am. Philly, a week after that.


FRIDAY, NOVEMBER 18  8:00 pm  (On sale now!)
The Wellmont Theatre, Montclair, NJ
Tickets: $40, $60, $80

SATURDAY, NOVEMBER 19  8:00 pm  (Sales begin 10 am, Saturday, September 24)
Temple Performing Arts Center, Philadelphia, PA
Tickets: $55

SUNDAY, NOVEMBER 20  7:30 pm  (Sales begin 10 am, Saturday, September 17)
The Beacon Theatre, New York, NY
Tickets: $49.59-$114.50

WEDNESDAY, NOVEMBER 23  8:00 pm  (On sale now!)
The Wilbur Theatre, Boston, MA
Tickets: $66.15, $89.15…

In support of his latest album, See My Friends, Kinks front man Ray Davies plays a four-night East Coast tour, live with full chorus. Davies is best known as the lead singer and songwriter for the classic British rock band, and his 50-year career has yielded some of the most iconic rock songs in history. Performing solo since the demise of The Kinks in 1996, he has released five albums of his own and received numerous awards for his talent.

Bill Shanley on guitar, Dick Nolan on bass, Damon Wilson on drums, Ian Gibbons and Gunnar Frick on keyboards, and The Dessoff Chamber Choir join Ray for an unforgettable show.

Ten years have passed since 9/11. The New York Times put the passage of time into days, and hours, and minutes, and seconds in today’s paper. [A Day That Stands Alone]

Three-thousand six-hundred fifty-two days have now passed. At 8:46 a.m. — the time when the first plane slammed into the north tower of the World Trade Center — 87,648 hours had gone by. Another [*]  5,258,880 minutes. Another [†] 315,532,800 seconds.

For the record, 315,532,802 seconds passed between 8:46 a.m. on September 11, 2001, and 8:46 a.m. today, September 11, 2011. The missing seconds were inserted into our collective timeline by the authority of the International Earth Rotation and Reference Systems Service. One of them passed (largely unnoticed, no doubt) at 6:59:60 p.m. on December 31, 2005 (in New York City), and the other occurred at the end of 2008.
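The arithmetic is easy to check. Python's `datetime` knows about leap years but not leap seconds, so it reproduces the Times's figure. The two leap seconds described above are added by hand, which is an assumption built into this sketch rather than something the library supplies.

```python
from datetime import datetime

start = datetime(2001, 9, 11, 8, 46)
end = datetime(2011, 9, 11, 8, 46)

# datetime arithmetic counts leap days but ignores leap seconds.
elapsed = end - start
days = elapsed.days
naive_seconds = days * 24 * 60 * 60

# Added by hand: one leap second at the end of 2005, one at the end of 2008.
LEAP_SECONDS = 2
actual_seconds = naive_seconds + LEAP_SECONDS

print(days, naive_seconds, actual_seconds)
```

The day count and the naive seconds match the numbers in the Times, and the hand-corrected total is two seconds longer.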

As decades go, this one was as short as they come for us, even with its two leap seconds. Many decades include not two, but three occurrences of February 29th, and all decades beginning between 1972 and 1997 have contained more than two leap seconds in addition to the minimum‡ number of two leap year days per decade.

Nothing is simple.

Steve on the Hoboken waterfront, September 1, 2011.

* Do not be distracted in search of the anaphor. It’s missing, and the issue is not addressed here.

† Another Another is missing its anaphor. Press on, dear reader.

‡ The minimum during our lifetime. The last decade to contain only a single leap year (which was the leap year 1896) ended early in 1904, because 1900 was not a leap year, despite its divisibility by four. The next single-leap-year decade will not begin until the year 2096.