Joho the Blog » science

April 29, 2012

[2b2k] Pyramid-shaped publishing model results in cheating on science?

Carl Zimmer has a fascinating article in the NYTimes, which is worth 1/10th of your NYT allotment. (Thank you for ironically illustrating the problem with trying to maintain knowledge as a scarce resource, NYT!)

Carl reports on what may be a growing phenomenon (or perhaps, as the article suggests, the bugs of the old system may just now be more apparent) of scientists fudging results in order to get published in the top journals. From my perspective the article provides yet another illustration of how the old paper-based strictures on scientific knowledge, caused by the scarcity of publishing outlets, result not only in a reduction in the flow of knowledge, but also in a degradation of the quality of knowledge.

Unfortunately, the availability of online journals (many of which are peer-reviewed) may not reduce the problem much even though they open up the ol’ knowledge nozzle to 11 on the firehose dial. As we saw when the blogosphere first emerged, there is something like a natural tendency for networked ecosystems to create hubs with a lot of traffic, along with a very long tail. So, even with higher capacity hubs, there may still be some pressure to fudge results in order to get noticed by these hubs, especially since tenure decisions continue to place such high value on a narrow understanding of “impact.”

But: 1. With a larger aperture, there may be less pressure. 2. When readers are also commentators and raters, bad science may be uncovered faster and more often. Or so we can hope.

(There are the very beginnings of a Reddit discussion of Carl’s article here.)

Categories: science, too big to know Tagged with: 2b2k • publishing • science Date: April 29th, 2012 dw

April 23, 2012

[2b2k] Structure of Scientific Revolutions, 50 years later

The Chronicle of Higher Ed asked me to write a perspective on Thomas Kuhn’s The Structure of Scientific Revolutions since this is the 50th year since it was published. It’s now posted.

Categories: science, too big to know Tagged with: 2b2k • paradigms • science Date: April 23rd, 2012 dw

April 22, 2012

[2b2k] Astounding two-minute video edit from NASA’s Cassini and Voyager missions – Only if you love Saturn, Jupiter, and, you know, the Universe

Outer Space from Sander van den Berg on Vimeo.

Categories: science, too big to know Tagged with: 2b2k • space • video Date: April 22nd, 2012 dw

March 2, 2012

[2b2k] TIL: Edward Jenner’s smallpox paper was rejected by the Royal Society

Edward Jenner is credited as the discoverer (or perhaps inventor would be the more apt word) of vaccination as a technique to prevent smallpox. That’s pretty much all that I knew, except for the story about milkmaids who got cowpox not getting smallpox. But I just read a really interesting article about the history of smallpox at the National Institutes of Health, by Stefan Riedel.

“TIL” is Reddit-speak for “Today I learned.” And today I also learned that “As early as 430 BC, survivors of smallpox were called upon to nurse the afflicted” in order to protect them. Today I also learned that “Inoculation…was likely practiced in Africa, India, and China long before the 18th century, when it was introduced to Europe.” And today I also learned that “It was the continued advocacy of the English aristocrat Lady Mary Wortley Montague that was responsible for the introduction of variolation [inoculation] in England.”

Categories: science, too big to know Tagged with: 2b2k • science • smallpox • til Date: March 2nd, 2012 dw

February 29, 2012

[2b2k] The next Darwin is a we

Sebastian Benthall has a fervent post about the need for open networks in science, inspired by an awesome talk by the awesome Victoria Stodden.

Along the way, he offers a correction (or extension, perhaps) of a point that I make in 2b2k: the next Darwin is likely to develop her work within an open network that adds value to her work. In some real sense the knowledge lives in that network. Sebastian responds:

He’s right, except maybe for one thing, which is that this digital dialectic (or pluralectic) implies that “the next Darwin” isn’t just one dude, Darwin, with his own ‘-ism’ and pernicious Social adherents. Rather, it means that the next great theory of the origin of species is going to be built by a massive collaborative effort in which lots of people will take an active part. The historical record will show their contributions not just with the clumsy granularity of conference publications and citations, but with minute granularity of thousands of traced conversations. The theory itself will probably be too complicated for any one person to understand, but that’s OK, because it will be well architected and there will be plenty of domain experts to go to if anyone has problems with any particular part of it. And it will be growing all the time and maybe competing with a few other theories.

I love the point.

(Nit: I want to clarify, however, that I wasn’t saying that this next Darwin’s web would consist only of “pernicious Social adherents.” Throughout 2b2k I try to make the point that networked knowledge has value mainly because it includes difference and disagreement. When it does not, it fulfills the nightmare of the echo chamber.)

Categories: science, social media, too big to know Tagged with: 2b2k • darwin • open science • science • victoria stodden Date: February 29th, 2012 dw

February 20, 2012

Request for help: Structure of Sci Revs, 50 years later

I may be agreeing to write a relatively short article — 1,500-2,500 words — on the fiftieth anniversary of Thomas Kuhn’s Structure of Scientific Revolutions. What sources and effects should an article about that book’s legacy simply not miss?

Thanks for whatever help you can give me in avoiding missing something obvious.

Categories: science Date: February 20th, 2012 dw

February 4, 2012

[2b2k] The corruption of impact

According to a survey published in Science [abstract][Slashdot], scientists are routinely pressured to include superfluous references in their papers in order to boost the Impact Factor of the journal publishing their paper. The Impact Factor is (roughly) a measure of the importance/influence of a journal, based on a two-year average of how often its papers are cited. Careers are made by publishing in high Impact Factor journals.

This sort of corruption (which I talk about a bit in Too Big to Know) might seem like an inevitable imprecision in how we gauge something as vague as “influence” if alternatives were not becoming available. Services like Mendeley can provide real-time readouts of which articles are being read and commented on. Google likewise can see how often articles are being linked to. Facebook can see how articles are being passed around social networks, some of which are quite expert. It would of course be good to have measures not gated by commercial entities. In any case, institutions of knowledge are currently relying upon an instrument that was always too blunt and is now known to be corrupt.
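
For anyone who wants the “roughly” made concrete, here is a minimal sketch of the usual two-year calculation: citations received this year to what a journal published in the previous two years, divided by the number of citable items it published in those two years. The function name and the journal numbers are invented for illustration; they are not taken from the survey.

```python
def impact_factor(citations_this_year: int, citable_items_prior_two_years: int) -> float:
    """Two-year Impact Factor for a hypothetical journal.

    citations_this_year: citations received this year to items the journal
        published in the previous two years.
    citable_items_prior_two_years: number of citable items the journal
        published in those two years.
    """
    if citable_items_prior_two_years <= 0:
        raise ValueError("need at least one citable item")
    return citations_this_year / citable_items_prior_two_years


# Invented numbers: 200 papers over two years, cited 600 times this year.
honest = impact_factor(600, 200)

# Same journal if editors coax authors into adding 50 superfluous
# references to its recent papers.
padded = impact_factor(650, 200)

print(honest, padded)  # 3.0 3.25
```

The point of the toy numbers is only that a modest amount of coerced citing moves the figure noticeably, since the denominator stays fixed.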

Categories: science, social media, too big to know Tagged with: 2b2k • impact factor Date: February 4th, 2012 dw

January 3, 2012

[2b2k] Moi moi moi

Because it’s book launch day, there’s more about me to post than usual or than I’m comfortable with. Nevertheless:

The Atlantic is running a substantial excerpt of the chapter on scientific knowledge.

And I had a really fun hour on Colin McEnroe’s show on WNPR in Connecticut this afternoon. They’ve already posted it. Colin’s a great interviewer, and I appreciate having the full hour with him.

Categories: science, too big to know Tagged with: 2b2k Date: January 3rd, 2012 dw

January 2, 2012

[2b2k] Correlation’s diminishing returns

Jonah Lehrer has a terrific article at Wired about the limitations of the causality model as data scales up and becomes more complex. (I’m over-simplifying to the point of inaccuracy.)

Categories: science, too big to know Tagged with: 2b2k Date: January 2nd, 2012 dw

October 25, 2011

[berkman] [2b2k] Michael Nielsen on the networking of science

Michael Nielsen is giving a Berkman talk on the networking of science. (It’s his first talk after his book Reinventing Discovery was published.)

NOTE: Live-blogging. Getting things wrong. Missing points. Omitting key information. Introducing artificial choppiness. Over-emphasizing small matters. Paraphrasing badly. Not running a spellpchecker. Mangling other people’s ideas and words. You are warned, people.

He begins by telling the story of Tim Gowers, a Fields Medal winner and blogger. (Four of the 42 living Fields winners have started blogs; two of them are still blogging.) In January 2009, Gowers started posting a difficult problem on his blog and working on it in the open. Plus he invited the public to post ideas in the comments. He called this the Polymath Project. 170,000 words in the comments later, ideas had been proposed and rapidly improved or discarded. A few weeks later, the problem had been solved at an even higher level of generalization.

Michael asks: Why isn’t this more common? He gives an example of the failure of an interesting idea. It was proposed by a grad student in 2005. Qwiki was supposed to be a super-textbook about Quantum Mechanics. The site was well built and well marketed. “But science is littered with examples of wikis like this…They are not attracting regular contributors.” Likewise many scientific social networks are ghost towns. “The fundamental problem is one of opportunity costs. If you’re a young scientist, the way you build your career is through the publication of scientific papers…One mediocre crappy paper is going to do more for your career than a series of brilliant contributions to a wiki.”

Why then is the Polymath Project succeeding? It just used an unconventional means to a conventional end: they published two papers out of it. Sites like Qwiki that are an end in themselves are not being exploited. We need a “change in norms in scientific culture” so that when people are making decisions about grants and jobs, people who contribute to unconventional formats are rewarded.

How do you achieve a change in the culture? It’s hard. Take the Human Genome Project. In the 1990s, there wasn’t a lot of advantage to individual scientists in sharing their data. In 1996, the Wellcome Trust held a meeting in Bermuda and agreed on principles that said that if you sequenced more than a thousand base pairs, you needed to release them to a public database, where they would be put into the public domain. The funding agencies baked those principles into policy. In April 2000, Clinton and Blair urged all countries to adopt similar principles.

For this to work, you need enthusiastic acceptance, not just a stick beating scientists into submission. You need scientists to internalize it. Why? Because you need all sorts of correlative data to make lab data useful. E.g., Sloan Digital Sky Survey: a huge part of the project was establishing the calibration lines for the data to have meaning to anyone else.

Many scientists are pessimistic about this change occurring. But there are some hopeful precedents. In 1610 Galileo pointed his telescope at Saturn. He was expecting to see a small disk. But he saw a disk with small knobs on either side (the rings, although he couldn’t resolve the image further). He sent letters to four colleagues, including Kepler, that scrambled his discovery into an anagram. This way, if someone else made the discovery, Galileo could unscramble the letters and prove that he had made the discovery first. Leonardo, Newton, Hooke, Huygens all did this. Scientific journals helped end this practice. The editors of the first journals had trouble convincing scientists to reveal their info because there was no link between publication and career. The editor of the first scientific journal (Philosophical Transactions of the Royal Society) goaded scientists into publishing by writing to them suggesting other scientists were about to disclose what the recipients of the letter were working on. As Paul David [Davis? Couldn’t find it via Google] says, the change to the modern system was due to “patron pressure.”

Michael points out that Galileo immediately announced the discovery of four moons of Jupiter in order to get patronage bucks from the Medicis for the right to name them. [Or, as we would do today, The Comcast Moon, the Staples Moon, and the Gosh Honey Your Hair Smells Great Moon.]

Some new ideas: The Journal of Visualized Experiments videotapes lab work, thus revealing tacit knowledge. GigaScience (from Springer) publishes data sets as first-class objects. Open Research Computation makes code into a first-class object. And blog posts are beginning to show up on Google Scholar (possibly because they’re paying attention to tags?). So, if your post is being cited by lots of articles, your post will show up at Scholar.

[in response to a question] A researcher claimed to have solved the P vs. NP problem. One of the serious mathematicians (Cook) said it was a serious solution. Mathematicians and others tore it apart on the Web to see if it was right. About a week later, the consensus was that there was a serious obstruction, although they salvaged a small lemma. The process leveraged expertise in many different areas, including statistical physics and logic.

Q: [me] Science has been a type of publishing. How does scientific knowledge change when it becomes a type of networking?
A: You can see this beginning to happen in various fields. E.g., people at Google talk about their software as an ecology. [Afterwards, Michael explained that Google developers use a complex ecology of libraries and services with huge numbers of dependencies.] What will it mean when someone says that the Higgs Boson has been found at the LHC? There are millions of lines of code, huge data sets. It will be an example of using networked knowledge to draw a conclusion where no single person has more than a tiny understanding of the chain of inferences that led to this result. How do you do peer review of that paper? Peer review can’t mean that it’s been checked because no one person can check it. No one has all the capability. How do you validate this knowledge? The methods used to validate are completely ad hoc. E.g., the Intergovernmental Panel on Climate Change has more data than any one person can evaluate. And they don’t have a method. It’s ad hoc. They do a good job, but it’s ad hoc.

Q: The Classification of Finite Groups was the same. A series of papers.
A: Followed by a 1200-word appendix addressing errors.

Q: It varies by science, of course. For practical work, people need access to the data. For theoretical work, the person who makes the single step that solves it should get 98% of the credit. E.g., Newton v. Leibniz on calculus. E.g., Perelman’s approach to the Poincaré conjecture.
A: Yes. Perelman published three papers on a preprint server. Afterward, someone published a paper that filled in the gaps, but Perelman’s was the crucial contribution. This is the normal bickering in science. I would like to see many approaches and gradual consensus. You’ll never have perfect agreement. With transparency, you can go back and see how people came to those ideas.

Q: What is validation? There is a fundamental need for change in the statistical algorithms that many data sets are built on. You have to look at those limitations as well as at the data sets.
A: There are lots of interesting things happening. But I think this is a transient problem. Best practices are still emerging. There are a lot of statisticians on the case. A move toward more reproducible research and more open sharing of code would help. E.g., many random number generators are broken, as is well known. Having the random number generator code in an open repository makes life much easier.

Q: The P vs. NP episode left a sense that it was a sprint in response to a crisis, but how can it be done in a more scalable way?
A: People go for the most interesting claims.

Q: You mentioned the Bermuda Principles, and NIH requires open-access publication within one year of a paper’s publication. But you don’t see that elsewhere. What are the sociological reasons?
Peter Suber: There’s a more urgent need for medical research. The campaign for open access at NSF is not as large, and the counter-lobby (publishers of scientific journals) is bigger. But Pres. Obama has said he’s willing to do it by executive order if there’s sufficient public support. No sign of action yet.

Q: [peter suber] I want to see researchers enthusiastic about making their research public. How do we construct a link between OA and career?
A: It’s really interesting what’s going on. A lot of discussion about supporting gold OA (publishing in OA journals, as opposed to putting it into an OA repository). Fundamentally, it comes down to a question of values. Can you create a culture in science that views publishing in gold OA journals as better than publishing in prestigious toll journals? The best way perhaps is to make it a public issue. Make it embarrassing for scientists to lock their work away. The Aaron Swartz case has sparked a public discussion of the role of publishers, especially when they’re making 30% profits.
Q: Peter: Whenever you raise the idea of tweaking tenure criteria, you unleash a tsunami of academic conservatism, even if you make clear that this would still support the same rigorous standards. Can we change the reward system without waiting for it to evolve?
A: There was a proposal a few years ago that it be done purely algorithmically: produce a number based on the citation index. If that had been done, a simple tweak to the algorithm would have been one way to do it: “You get a 10% premium for being in a gold OA journal, etc.”
Q: [peter] One idea was that your work wouldn’t be noticed by the tenure committee if it wasn’t in an OA repository.
A: Spiers [??] lets you measure the impact of your preprint articles, which has made it easier for people to assess the effect of OA publishing. You see people looking up the Spiers number of a scientist they just met. You see scientists bragging about the number of times their slides have been downloaded via Mendeley.

Q: How can we accelerate by an order of magnitude in the short term?
A: Any tool that becomes widely used to measure impact affects how science is done. E.g., the h-index. But I’d like to see a proliferation of measures because when you only have one, it reduces cognitive diversity.

Q: Before the Web, Erdős was the moving collaborator. He’d go from place to place and force collaboration. Let’s duplicate that on the Net!
A: He worked 18 hours a day, 365 days/year, high on amphetamines. Not sure that’s the model :) He did lots of small projects. When you have a large project, you bring in the expertise you need. Open collaboration has the unpredictable spread of expertise that participates, and that’s often crucial. E.g., Einstein never thought that understanding gravity required understanding non-standard geometries. He learned that from someone else [missed who]. That’s the sort of thing you get in open collaborations.

Q: You have to have a strong ego to put your out-there paper out there to let everyone pick it apart.
A: Yes. I once asked a friend of mine how he consistently writes edgy blog posts. He replied that it’s because there are some posts he genuinely regrets writing. That takes a particular personality type. But the same is true for publishing papers.
Q: But at least you can blame the editors or peer reviewers.
A: But that’s very modern, dating from around the 1960s. Of Einstein’s 300 papers, only one was peer reviewed … and that one was rejected. Newton was terribly anguished by the criticism of his papers. Networked science may exacerbate it, but it’s always been risky to put your ideas out there.

[Loved this talk.]

Categories: berkman, open access, science, too big to know Tagged with: 2b2k • berkman • open access • science Date: October 25th, 2011 dw
