Data porn

Making Data Work
Researchers pursue analogy between statistical evidence and thermodynamics
http://www.sciencenews.org/view/feature/id/343207/title/Making_Data_Work

I am far, far from schooled in statistics. I took one class in my MBA program, and much of the lecture sounded like static and jumbled words. However, one thing I did get out of the class was the incredible fact that the vast majority of experimental results rest on false claims of statistical significance, and that the vast majority of people reporting these results incorrectly are totally unaware that they are, in effect, lying. Looking back over many of the experiments I did in the lab, I was able to conclude that the magnitude of the differences between the test group and the control group was large enough that I was comfortable that my conclusions were correct, but I wasn’t able to do that in all cases. Lies, damn lies and statistics: people wield statistics with abandon. As I discussed earlier regarding radiation, it is very simple to make an assumption and produce ‘statistically significant’ data to support it. The cause could be as trivial as selection bias, where the researcher unknowingly ignores data that conflict with his or her assumption.

Stated so baldly, one would be pardoned for thinking that all scientists are frauds, but designing experiments to produce useful data is a challenging thing and something that is (at least in my experience) not taught but learned through mentoring. Sometimes, until one has performed a certain number of inconclusive experiments, one simply lacks enough information to design an appropriate experiment. Unfortunately, deadlines sometimes refuse to leave enough time for people to be wrong enough to learn to be right, and they publish what are really bogus results in desperation. Once a certain critical mass of this has become mainstream, the entire mentoring system has been contaminated by people who were poorly mentored yet are now mentors themselves.

I really like the approach the subject of the article is taking, but I am not sure that it will really lead to the rigor we need. Thermodynamics is based on huge numbers of individual entities (billions on up), so any measurement is an average over a very large population. Indeed, lots of interesting science has been discovered recently because of the often dramatic change in behavior when you go from the macro to the micro scale. Since most scientific data is at the micro scale, it seems very plausible to me that drawing conclusions can be incredibly challenging unless you have a huge signal-to-noise ratio. In many of the experiments I did I measured radioactivity, so I was literally looking at signal compared to background. I tended to throw out my stock compounds when the signal dropped to less than 10x background because the results would start to get too noisy.
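To put rough numbers on that 10x rule of thumb, here is a minimal sketch in Python. It assumes that radioactive counts follow Poisson statistics, that "signal" means the gross count (sample plus background), and that sample and background are counted for the same length of time. The function name and the background level of 100 counts are made up for illustration; they are not from my lab notebooks.

```python
import math

def net_relative_uncertainty(gross_counts, background_counts):
    """Relative 1-sigma uncertainty of (gross - background).

    Assumes both values are Poisson counts collected over equal
    counting times, so each variance equals its count.
    """
    net = gross_counts - background_counts
    if net <= 0:
        raise ValueError("gross counts must exceed background counts")
    # Variances add when two independent counts are subtracted.
    sigma_net = math.sqrt(gross_counts + background_counts)
    return sigma_net / net

background = 100  # hypothetical background counts per counting interval
for ratio in (100, 10, 2, 1.2):
    gross = ratio * background
    rel = net_relative_uncertainty(gross, background)
    print(f"gross = {ratio:>5}x background: relative uncertainty = {rel:.1%}")
```

Under these assumptions the uncertainty on the net count grows quickly as the gross count approaches background, which is the noisiness I was trying to avoid. Longer counting times would shrink every one of these figures, so the exact cutoff depends on how long you are willing to count.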

People who engage in experimental research should be required to take at least a year of statistics, plus courses on experimental design. I am sure that would drive out a lot of students, but I am not sure that a lot of the ‘data’ collected over the last few decades was worth the resources spent on it, so the world might not lose anything if these people turn away from science.

I wonder if non-science people (members of the sheeple class) can feel these statistical lies at an unconscious level. It might help explain the rampant anti-science attitude that is permeating our society today.

Author: Tfoui

He who spews forth data that could be construed as information...

One thought on “Data porn”

  1. I have known professional QA people who, when they didn’t like their failure rate, continued to sample until they got a pass.

    Decades ago it was shown (I don’t know whether it’s perpetually true) that only the top two and bottom two teams in one of the baseball leagues had records that differed from chance by a statistically significant margin. The rest might as well have met at the beginning of the season and tossed coins for their places.
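The "keep sampling until you get a pass" habit described in the comment above can be simulated. The sketch below is only an illustration under made-up assumptions: a true defect rate of 6%, a pass limit of 5%, a first look after 100 items, and a cap of 500 items. All function names and numbers are hypothetical and do not describe any real QA procedure.

```python
import random

def fixed_sample(rng, true_rate, limit, n):
    """Inspect exactly n items once; pass if the observed defect rate <= limit."""
    defects = sum(1 for _ in range(n) if rng.random() < true_rate)
    return defects / n <= limit

def sample_until_pass(rng, true_rate, limit, start_n, max_n):
    """Keep inspecting items one at a time; stop as soon as the running
    defect rate is within the limit (checked from start_n items onward)."""
    defects = 0
    for n in range(1, max_n + 1):
        if rng.random() < true_rate:
            defects += 1
        if n >= start_n and defects / n <= limit:
            return True
    return False

def pass_rate(inspect, trials=4000, seed=1, **kwargs):
    """Fraction of simulated lots that pass under the given inspection rule."""
    rng = random.Random(seed)
    return sum(1 for _ in range(trials) if inspect(rng, **kwargs)) / trials

# The lot is truly out of spec: 6% defective against a 5% limit.
fixed = pass_rate(fixed_sample, true_rate=0.06, limit=0.05, n=100)
greedy = pass_rate(sample_until_pass, true_rate=0.06, limit=0.05,
                   start_n=100, max_n=500)
print(f"pass rate with one fixed sample of 100: {fixed:.2f}")
print(f"pass rate when sampling until a pass:   {greedy:.2f}")
```

The second rule gets every chance the first rule gets plus many more, so the out-of-spec lot passes more often even though nothing about the lot has changed.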
