Posts Tagged ‘measurement’

Mismeasuring scientific quality (and an argument in favour of diversity of measurement systems)

December 27, 2010 2 comments

There was a short piece here recently on the misuse of impact factors to measure scientific quality, and how this in turn leads to dependence on drugs like Sciagra™ and other dangerous variants such as Psyagra™ and Genagra™.

Here’s an interesting and important post from Michael Nielsen on the mismeasurement of science. The essence of his argument is straightforward: unidimensional reduction of a multidimensional variable set is going to lead to significant loss of important information (or at least that’s how I read it):

My argument … is essentially an argument against homogeneity in the evaluation of science: it’s not the use of metrics I’m objecting to, per se, rather it’s the idea that a relatively small number of metrics may become broadly influential. I shall argue that it’s much better if the system is very diverse, with all sorts of different ways being used to evaluate science. Crucially, my argument is independent of the details of what metrics are being broadly adopted: no matter how well-designed a particular metric may be, we shall see that it would be better to use a more heterogeneous system.

Nielsen notes three problems with centralised metrics (whether that means relying solely on an h-index, citations, publication counts, or whatever else you fancy):

Centralized metrics suppress cognitive diversity: Over the past decade the complexity theorist Scott Page and his collaborators have proved some remarkable results about the use of metrics to identify the “best” people to solve a problem (ref,ref).

Centralized metrics create perverse incentives: Imagine, for the sake of argument, that the US National Science Foundation (NSF) wanted to encourage scientists to use YouTube videos as a way of sharing scientific results. The videos could, for example, be used as a way of explaining crucial-but-hard-to-verbally-describe details of experiments. To encourage the use of videos, the NSF announces that from now on they’d like grant applications to include viewing statistics for YouTube videos as a metric for the impact of prior research. Now, this proposal obviously has many problems, but for the sake of argument please just imagine it was being done. Suppose also that after this policy was implemented a new video service came online that was far better than YouTube. If the new service was good enough then people in the general consumer market would quickly switch to the new service. But even if the new service was far better than YouTube, most scientists – at least those with any interest in NSF funding – wouldn’t switch until the NSF changed its policy. Meanwhile, the NSF would have little reason to change their policy, until lots of scientists were using the new service. In short, this centralized metric would incentivize scientists to use inferior systems, and so inhibit them from using the best tools.

Centralized metrics misallocate resources: One of the causes of the financial crash of 2008 was a serious mistake made by rating agencies such as Moody’s, S&P, and Fitch. The mistake was to systematically underestimate the risk of investing in financial instruments derived from housing mortgages. Because so many investors relied on the rating agencies to make investment decisions, the erroneous ratings caused an enormous misallocation of capital, which propped up a bubble in the housing market. It was only after homeowners began to default on their mortgages in unusually large numbers that the market realized that the ratings agencies were mistaken, and the bubble collapsed. It’s easy to blame the rating agencies for this collapse, but this kind of misallocation of resources is inevitable in any system which relies on centralized decision-making. The reason is that any mistakes made at the central point, no matter how small, then spread and affect the entire system.

What is breathtaking, of course, is that scientists, who spend so much time devising sensitive measurements of complex phenomena, can sometimes suffer a bizarre cognitive pathology when it comes to how the quality of science itself should be measured. The sudden rise of the h-index is surely proof of that. Nothing can substitute for the hard work of actually reading the papers and judging their quality and creativity. Grillner and colleagues recommend: “Minimally, we must forego using impact factors as a proxy for excellence and replace them with in-depth analyses of the science produced by candidates for positions and grants. This requires more time and effort from senior scientists and cooperation from international communities, because not every country has the necessary expertise in all areas of science.” Nielsen makes a similar recommendation.

From the latest Federation of European Neuroscience Societies (FENS) Newsletter – an article from PNAS on ‘The Boon and Bane of the Impact Factor’ (and abuse of the drug ‘Sciagra’)

December 23, 2010 1 comment

A very hard-hitting piece on the abuses of impact factors and their pernicious effects on how science is done. Sten Grillner is a Kavli Prize winner who recently gave a lecture at Trinity College Institute of Neuroscience. It is worth musing on whether or not the widespread use and abuse of impact factors is science’s very own special version of grade inflation.

From FENS: The editorial “Impacting our Young” by Eve Marder (Past President of the American Society for Neuroscience), Helmut Kettenmann (Past President of FENS) and Sten Grillner (President of FENS) has been published in the most recent issue of PNAS (PNAS 2010 107 (50) 21233).

A quote:

It is our contention that overreliance on the impact factor is a corrupting force on our young scientists (and also on more senior scientists) and that we would be well-served to divest ourselves of its influence.

And another:

The hypocrisy inherent in choosing a journal because of its impact factor, rather than the science it publishes, undermines the ideals by which science should be done.

And their advice:

Minimally, we must forego using impact factors as a proxy for excellence and replace them with in-depth analyses of the science produced by candidates for positions and grants. This requires more time and effort from senior scientists and cooperation from international communities, because not every country has the necessary expertise in all areas of science.

It reminds me of a piece lampooning impact factors by Uinseonn O’Breathnach (me too) in Current Biology a few years ago, entitled ‘Sciagra’:

What is it? Sciagra™ is a psychologically self-administered drug that acts on grammar and vocabulary in scientific papers with the aim of improving performance, or at least convincing the user that it does.

How widespread is its use? It’s almost impossible to avoid in impact factor zones above 8. Some disciplines even have their own compounds. Psyagra™ and Genagra™ are particularly dangerous new ‘society’ versions, especially potent and unfortunately accessible to journalists who have to write “It’s the Brain wot does it!” or “Scientists produce creature that is half human, half grant reviewer” stories to tight deadlines.

How do I recognise its use by others? The symptoms are easy to spot. A user will always tell you the impact factor of the journal rather than what the paper is about. They will display an intensity unrelated to the importance of the finding and an inability to cite anything published before 1999. They frequently meet rejection of a paper with a complaint to the editor, and seasoned users may even make unsolicited phone calls to editors to make their complaint.

It seems to be available on open access.

The World Of Big Data – The Daily Dish | By Andrew Sullivan

December 20, 2010 Leave a comment

Great post on ‘The World Of Big Data’ by Andrew Sullivan, reproduced in full below.

In passing, a Government truly interested in developing the smart economy would engage in massive data dumps, with the presumption that just about every piece of data it holds (excluding the most sensitive pieces of information), from ministerial diaries to fuel consumption records for Garda cars to activity logs for mobile phones to numbers of toilet rolls used in Government Departments, would be dumped in real time onto externally interrogable databases. This would be geek heaven and would generate new technological applications beyond prediction. And the activity would be local – could an analyst sitting in Taiwan really make sense of local nuances? The applications would be universal, portable and saleable, however. They would seed a local high-tech industry – maybe even a local Irish Google. Can’t see the Civil Service going for it, though…

Elizabeth Pisani explains (pdf) why large amounts of data collected by organizations like Google and Facebook could change science for the better, and how it already has. Here she recounts the work of John Graunt from the 17th century:

Graunt collected mortality rolls and other parish records and, in effect, threw them at the wall, looking for patterns in births, deaths, weather and commerce. … He scraped parish rolls for insights in the same way as today’s data miners transmute the dross of our Twitter feeds into gold for marketing departments. Graunt made observations on everything from polygamy to traffic congestion in London, concluding: “That the old Streets are unfit for the present frequency of Coaches… That the opinions of Plagues accompanying the Entrance of Kings, is false and seditious; That London, the Metropolis of England, is perhaps a Head too big for the Body, and possibly too strong.”

She concludes:

A big advantage of Big Data research is that algorithms, scraping, mining and mashing are usually low cost, once you’ve paid the nerds’ salaries. And the data itself is often droppings produced by an existing activity. “You may as well just let the boffins go at it. They’re not going to hurt anyone, and they may just come up with something useful,” said [Joe] Cain.

We still measure impact and dole out funding on the basis of papers published in peer-reviewed journals. It’s a system which works well for thought-bubble experiments but is ill-suited to the Big Data world. We need new ways of sorting the wheat from the chaff, and of rewarding collaborative, speculative science.

[UPDATE] Something I noticed in The Irish Times:

PUBLIC SECTOR: It’s ‘plus ca change’ in the public service sector, as senior civil servants cling to cronyism and outdated attitudes, writes GERALD FLYNN:

…it seems now that it was just more empty promises – repeating similar pledges given in 2008. As we come to the end of yet another year, there is still no new senior public service structure; no chief information officer for e-government has been appointed; no reconstitution of top-level appointments has taken place; and no new public service board has been appointed [emphasis added].

So nothing will happen.

Self-experimentation – Scientists treating themselves as guinea pigs [from Oscillatory Thoughts: Sir Henry Head’s self-experimentation]

September 12, 2010 1 comment

Oscillatory Thoughts: Sir Henry Head’s self-experimentation: a great post on a long-standing but little-known tradition in science – especially physiology and psychology – of experimenting on oneself, usually to do unpleasant and excruciating things that might not pass an ethics committee!

The great evolutionary theorist, JBS Haldane, was famous for this sort of thing. From a New Scientist story:

JBS Haldane’s smoking ear

One self-experimenter whose work had long-term personal consequences was the polymath JBS Haldane.

Haldane wanted to build on work done by his father, John Scott Haldane, on the physiology of working Navy divers in the early 20th century. But whereas Haldane senior restricted himself to observation and measurement, his son took a more direct approach, repeatedly putting himself in a decompression chamber to investigate the physiological effects of various levels of gases.

Haldane was motivated by concern for the welfare of sailors in disabled submarines, and his work led to a greatly improved understanding of nitrogen narcosis, as well as the safe use of various gases in breathing equipment. But he paid a high price, regularly experiencing seizures as a result of oxygen poisoning – one resulting in several crushed vertebrae.

He also suffered from burst eardrums, but he was sanguine about the damage. “The drum generally heals up,” he said, adding, “if a hole remains in it, although one is somewhat deaf, one can blow tobacco smoke out of the ear in question, which is a social accomplishment.”

From the blogpost cited at top:

Following in this fine scientific tradition is the brilliant and influential neurologist (not to mention appropriately named) Sir Henry Head. If that’s not a proper 1960s punny, alliterative, Stan Lee name for a neurologist, I don’t know what is. Anyway, the good Dr. Head published quite a ground-breaking article with his collaborator WHR Rivers in the journal Brain in 1908 titled A Human Experiment in Nerve Division. In this article, Head and Rivers sought to examine the course of recovery of somatosensation after peripheral nerve damage. It was known from observing patients with such damage that the touch senses often recover after peripheral nerve damage, but because the patients weren’t properly trained, they couldn’t give an adequate account of their own recovery. As they say:

“It soon became obvious that many observed facts would remain inexplicable without experimentation carried out more carefully and for a longer period than was possible with a patient, however willing, whose ultimate object in submitting himself to observation is the cure of his disease.”

So Head’s solution? Cut open his arm and sever some nerves! Dr. Head enlisted the assistance of another doctor to surgically sever some of the peripheral nerves in his left arm and hand.

I recall reading a case report by the late (and great) OJ Grusser where he injected the dissociative anaesthetic and glutamatergic antagonist ketamine into his own eyeball to study its effects on optokinetic nystagmus (the tracking of moving objects when the head is stationary). The ketamine would inactivate transmission in the extra-ocular muscles, reducing reflexive tracking eye movements to moving objects. (I can’t locate the reference at the moment – my recollection is that it was in a book chapter.) Not an experiment to be undertaken lightly!

Another University Ranking System – The Washington Monthly

Check out another university ranking system – this is for the USA only.

Below are the Washington Monthly‘s 2009 national university college rankings. We rate schools based on their contribution to the public good in three broad categories: Social Mobility (recruiting and graduating low-income students), Research (producing cutting-edge scholarship and PhDs), and Service (encouraging students to give something back to their country). For an explanation of each category, click here. For more information about the overall goals of the rankings, click here. To learn more about our methodology, click here.

UC Berkeley is number one on this system, and Harvard (usually number one on other systems) falls to 11th place. As ever, this shows the critical importance of the categories and weightings used in any ranking system. UCSD, UCLA, Stanford and Texas A&M make up places 2, 3, 4 & 5, respectively.

Full rankings here.
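The point about categories and weightings can be made concrete with a small sketch. Everything in it is an assumption for illustration: the institutions and their scores are made up, the three category names simply echo the Washington Monthly headings quoted above, and the weighted average is a generic composite, not the magazine's actual methodology.

```python
# Hypothetical 0-100 category scores for three made-up institutions (not real data).
scores = {
    "Univ A": {"social_mobility": 90, "research": 60, "service": 80},
    "Univ B": {"social_mobility": 50, "research": 95, "service": 55},
    "Univ C": {"social_mobility": 70, "research": 75, "service": 70},
}


def rank(scores, weights):
    """Return institution names ordered from best to worst weighted-average score."""
    total_weight = sum(weights.values())
    composite = {
        name: sum(weights[cat] * cats[cat] for cat in weights) / total_weight
        for name, cats in scores.items()
    }
    return sorted(composite, key=composite.get, reverse=True)


# Two illustrative weighting schemes applied to the same underlying scores.
equal_weights = {"social_mobility": 1, "research": 1, "service": 1}
research_heavy = {"social_mobility": 1, "research": 8, "service": 1}

ranking_equal = rank(scores, equal_weights)
ranking_research = rank(scores, research_heavy)

print(ranking_equal)     # ['Univ A', 'Univ C', 'Univ B']
print(ranking_research)  # ['Univ B', 'Univ C', 'Univ A']
```

With identical underlying scores, the first and last places swap purely because the weights changed.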

Roll on the QS and THE rankings: it will be very interesting to see the strength of correlation between these two ranking systems.
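One common way to quantify agreement between two rankings is Spearman's rank correlation. The sketch below is a minimal version that assumes both systems rank exactly the same institutions and that there are no tied ranks; the institution names and rank positions are invented for illustration and are not taken from QS, THE or any other published table.

```python
def spearman_rho(rank_a, rank_b):
    """Spearman's rho for two rankings of the same items (assumes no tied ranks)."""
    if set(rank_a) != set(rank_b):
        raise ValueError("both rankings must cover the same items")
    n = len(rank_a)
    if n < 2:
        raise ValueError("need at least two items")
    # Sum of squared differences between the two rank positions of each item.
    d_squared = sum((rank_a[item] - rank_b[item]) ** 2 for item in rank_a)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical rank positions from two made-up ranking systems.
system_one = {"Univ A": 1, "Univ B": 2, "Univ C": 3, "Univ D": 4}
system_two = {"Univ A": 2, "Univ B": 1, "Univ C": 3, "Univ D": 4}

rho = spearman_rho(system_one, system_two)
print(round(rho, 3))  # 0.8
```

A value near 1 means the two systems order institutions almost identically, a value near 0 means little relationship, and a value near -1 means one ordering is roughly the reverse of the other.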

Science & Technology | Data | The World Bank

April 20, 2010 Leave a comment


Technological innovation, often fueled by governments, drives industrial growth and helps raise living standards. Data here aims to shed light on countries’ technology base: research and development, scientific and technical journal articles, high-technology exports, royalty and license fees, and patents and trademarks. Sources include the UNESCO Institute for Statistics, the U.S. National Science Board, the UN Statistics Division, the International Monetary Fund, and the World Intellectual Property Organization.

A huge body of comparative data is available for free; the ‘dive-in’ maps are particularly arresting.

Comparative indicators presented include (and Ireland looks to be at about a consistent 50% or so of Finland’s level of performance on both the input and output side):

How Professors spend their time

March 15, 2010 Leave a comment