Keeping Stalin Out of Science
January 20, 2009
Back during the summer, we gave a talk at BioBarCamp, an “un-conference” about Open Science here in Palo Alto. The title of the talk was the same as that of this post, and we originally chose it because we thought that sounding provocative (and maybe even a little bizarre) was a good strategy for getting people to attend. You can find the video of the talk here. (Click On Demand, then BioBarCamp, then the talk with Mark and Jeremy, then the thumbnail. I know, it’s like YouTube hybridized with a Rubik’s cube.)
Nevertheless, I think given what we talked about (which is also what I’m about to write about here), the title is actually fairly appropriate. Stalin was many things (a lover, a mass-murderer, a gardener), but for our purposes here he was someone who thought the system needed to change, and that the best way of bringing that about was suddenly and radically, violently forcing new ways of doing things on people from the top down. That turned out not to work very well for Russians, and we think there is an analogous problem with trying to rapidly change how science is funded and published.
Now, obviously, no one in science is trying to collectivize labs at gunpoint. There are, however, a lot of people who recognize that the recent emergence of information technologies like computers and the World Wide Web is ultimately going to have a significant impact on the way a lot of things are done in science (e.g., who funds research, how they get the funds to the people who plan the experiments, who carries out the experiments, where the data are published and reviewed, etc.). We’re looking forward to talking about all these different issues in detail in future posts, but, for the moment, suffice it to say that there’s a sense shared by many that grand things are afoot in the research world, and that twenty years from now Science is going to be Open. For example, people will publish first, and get peer-reviewed afterwards.
The point is, it’s one thing to have a vision for where things are (or ought to be) headed. It’s quite another to realize the path for getting there. There is so much excitement about Open Science, such a heightened consciousness of the flaws in the current system, that sometimes it is very tempting to wish that one could just make everyone adopt a radically new method for publishing their research tomorrow. Our contention is that this can’t work for science. Here’s why:
First of all, science is based on trust. We all know how easy it is to get hoaxes into Nature and Science. The reason is that peer-review relies, to a large degree, on the assumption that everyone involved is making a good-faith effort to contribute to knowledge. A referee might pan your submission because he thinks you are mistaken, sloppy, or (unfortunately) a competitor who should be held back, but he is very unlikely to catch you in an outright lie if you craft it carefully enough. The whole point of science is to discover things we don’t know yet, and it’s often challenging to distinguish between a calculated falsehood and a new truth. As a result, we in the scientific community can only make progress together because we can, to some degree, trust that we’re all well-intentioned and engaged collectively in the pursuit of knowledge.
Trust, however, is a fragile thing. It thrives in familiar environments, where the rules of the game aren’t changing every day. Put another way, trust is difficult to maintain if the way things are done gets altered too much too rapidly. Today, I know what to expect when I go to Cell and see what’s in this month’s issue. I don’t know with certainty that everything in Cell is true, but I have a good intuition for the ways in which the system can fail. If tomorrow every journal closed up shop, and all of my peers posted their new discoveries on blogs and then expected people to comment, all my gut instincts about how to tell what’s correct, what needs further refinement, what’s erroneous, and what’s fabricated would be useless. Past experience would not be able to tell me what it would be reasonable to expect from this new system; I wouldn’t have the first notion where to root my feet in solid ground.
I think a lot of scientists feel similarly, and as a result, it is extremely difficult to get a new model for Open publication off the ground if it makes too large a leap away from the current system. You wind up with a chicken-and-egg problem, where people won’t consider new formats good for sharing research until they’re already established and familiar, which they never will be if no one starts to use them!
Stalin’s solution to this dilemma would be to force everyone to make the leap at gunpoint. Ours is to figure out ways of getting individuals to act voluntarily, inching towards the destination bit by bit, always maintaining and protecting their sense of what is “normal” for a scientist to use the internet to do. We are starting right now by building tools to help people organize their papers, recommend articles to their colleagues, and share documents with their lab mates. Labmeeting is already beginning to succeed at gathering life scientists from around the world into a single point of consensus on the web. Once everyone gets together, and gets used to interacting with their peers in this setting, things are going to get really interesting. To find out how, stay tuned for future posts.
Which countries do the most research?
January 13, 2009
Labmeeting gets visits from people all over the world who are searching for answers to questions about biomedical research subjects. The fascinating thing about this is that it means we can look at our Google Analytics board and tell which countries are most interested in the life sciences.
I just measured traffic for a couple weeks in December and ranked countries by the total number of hits. The result was:
Country Normalized traffic
1. USA 28
2. UK 5.5
3. India 4.0
4. Germany 3.9
5. Canada 3.1
6. France 2.5
7. China 2.4
8. Japan 2.4
9. Italy 2.3
10. Spain 1.7
11. Israel 1.5
12. Netherlands 1.4
13. Australia 1.3
14. Taiwan 1.2
15. Brazil 1
I don’t think this should raise too many eyebrows. It’s the usual suspects: countries that are highly developed, highly populous, or both. But now, let’s re-normalize these hits by the populations I got from the table here. We get:
1. Israel 130
2. USA 50
3. Canada 50
4. UK 49
5. Netherlands 47
6. Australia 36
7. Taiwan 27
8. Germany 26
9. France 22
10. Spain 22
11. Italy 21
12. Japan 10
13. Brazil 3
14. India 2
15. China 1
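For anyone who wants to redo this kind of re-normalization, here is a minimal sketch in Python. The traffic values are copied from the first list above, but the populations are rough 2008 figures that I am assuming for illustration (they are not the ones from the linked table), and the function names are made up for this example. I rescale so that China comes out at 1, as in the list above.

```python
# Normalized traffic values copied from the first list in this post.
TRAFFIC = {
    "USA": 28.0,
    "India": 4.0,
    "China": 2.4,
    "Israel": 1.5,
}

# ASSUMPTION: rough 2008 populations in millions, for illustration only.
# These are not the figures from the table the post links to.
POPULATION_MILLIONS = {
    "USA": 304.0,
    "India": 1150.0,
    "China": 1330.0,
    "Israel": 7.3,
}


def per_capita(traffic, population):
    """Divide each country's traffic by its population (arbitrary units)."""
    return {country: traffic[country] / population[country] for country in traffic}


def rescale(values, reference):
    """Rescale all values so that the reference country is exactly 1."""
    base = values[reference]
    return {country: value / base for country, value in values.items()}


scores = rescale(per_capita(TRAFFIC, POPULATION_MILLIONS), reference="China")

# Rank countries from highest to lowest per-capita traffic.
ranking = sorted(scores, key=scores.get, reverse=True)
for rank, country in enumerate(ranking, start=1):
    print(f"{rank}. {country} {scores[country]:.0f}")
```

With these assumed populations the order comes out Israel, USA, India, China, the same relative order as in the list above. The exact values shift with whichever population table you plug in, so don't expect them to match my numbers digit for digit.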
A few comments, starting at the bottom. It is probably safe to say that sheer population put China, India, and Brazil into the top 15 in raw traffic numbers; they are countries on the rise in R&D, but they’ve got some catching up to do.
Numbers 8-12 are certainly doing much better than the bottom three; Japan’s per capita Labmeeting traffic is a factor of three higher than Brazil’s. That being said, they are all countries that we think of as having very strong research science, and yet as a fraction of their populations, they are getting edged out by Taiwan! Admittedly, size does matter here: Japan obviously has a lot more to throw into R&D in absolute terms than Taiwan does. Still, Australia and Japan have roughly equal income per capita and yet there the contest is not even close.
There is, of course, another explanatory variable here. What do the USA, Canada, the UK, Australia, and Holland all have in common? English, of course! (Basically everyone in Holland speaks English, right?) Labmeeting is not multilingual, and it is not surprising that countries where English is spoken by virtually everyone tend to ask more questions about the life sciences in English. That being said, this is not a trivial finding: English is the lingua franca of the biological sciences, and it is quite interesting to discover that countries like Germany and France, which have traditionally been strong in science, are conducting far fewer searches of the flagship biomedical literature per person. And kudos to Taiwan for beating them despite a higher language barrier and a lower GDP per capita!
I’ll close by saying that while I doubt anyone is shocked to find the US, UK, and Canada in the top four, the dramatic success of Israel here is quite an upset. We are surely all aware of the great research that comes out of places like the Weizmann Institute and Technion, but we rarely stop to think just how much good science is coming out of such a tiny country (there are twice as many people in Tokyo alone). The one caveat here is that these are December numbers, which means countries that celebrate Christmas may have momentarily slackened a bit in their search for knowledge. But the ratio between absolute traffic from Israel and the US was nearly identical last week, so this seems robust. In any case, the moral of the story is: I’m investing in Taiwan and Israel!
Sharing is Searching
January 4, 2009
One of the things we hope Labmeeting will help people do more effectively is find papers that are relevant to their research. This is a challenging problem that needs to be attacked from multiple directions. There are a lot of interesting things one can do through active recommendation, analysis of citations, and human-generated or automated categorization. The approach I want to focus on today, though, is quite basic, but nevertheless extremely powerful. Specifically, I want to talk about collection sharing.
All of us in biomedical research have some colleagues who work on research subjects closely related to our own area of interest. This is especially the case because most of us work in labs that have broad overarching goals and house overlapping projects that tackle different aspects of the same system. As a result many of us find out about papers that are very important to our own research through our lab mates. I’ve got a friend in lab who is really great about always emailing me a copy of any paper he comes across that he thinks might be useful to me. I have to guiltily admit that I am much less useful to him in that respect, though that may perhaps partly be because I don’t browse nearly as many abstracts as he does.
In any case, the question is: why should you need to rely on some friend or co-worker of yours going above and beyond to bring a paper she has discovered to your attention? What if you and she had a way of passively and automatically sharing the papers you each discovered with each other, so that you could each benefit from the other’s efforts to stay up on the literature?
Labmeeting makes this new method of paper discovery possible through collection sharing. Every user of Labmeeting gets to build his own private, searchable collection of PDFs and bibliographic citations. If he wants to, though, he can mutually share collections with a colleague, or he can join a lab and share his collection with everyone else in the lab.
Collection sharing radically alters the way one searches the literature. Suddenly, every individual has dozens of other people toiling away on his behalf to collect papers that might be relevant to him (because they’re actually working to collect the papers for themselves). All he has to do is navigate over to their collections, and he can perform searches for keywords or authors that matter to him in groupings of articles that have already been selected by human beings for their relevance to research related to his.
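To make the idea concrete, here is a toy sketch in Python of what searching across shared collections might look like. To be clear, this is not how Labmeeting is actually implemented: the `Paper` and `Collection` classes, the `search_shared` function, and the example titles and authors are all hypothetical placeholders invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Paper:
    title: str
    authors: List[str]


@dataclass
class Collection:
    owner: str
    papers: List[Paper] = field(default_factory=list)


def search_shared(collections: List[Collection], keyword: str) -> List[Tuple[str, Paper]]:
    """Return (owner, paper) pairs whose title or author list contains the keyword."""
    needle = keyword.lower()
    hits = []
    for collection in collections:
        for paper in collection.papers:
            # Case-insensitive substring match over title and authors.
            haystack = " ".join([paper.title] + paper.authors).lower()
            if needle in haystack:
                hits.append((collection.owner, paper))
    return hits


# Hypothetical collections shared by two lab mates (placeholder data).
labmates = [
    Collection("colleague_a", [
        Paper("Protein misfolding in yeast", ["A. Example"]),
        Paper("Chaperones and aggregation", ["B. Example"]),
    ]),
    Collection("colleague_b", [
        Paper("Sequencing malaria strains", ["C. Example"]),
    ]),
]

hits = search_shared(labmates, "misfolding")
for owner, paper in hits:
    print(owner, "->", paper.title)
```

The point of the sketch is that the search itself is trivial. The selectivity comes from the fact that the only papers being searched are ones your colleagues already chose to keep.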
This approach to paper search has the potential to work far better than the smartest algorithm one could devise for searching the entirety of the literature. Even the best general search engine would have a very hard time replicating the selectivity of your dozens of friends and colleagues, who are real human beings making decisions about which papers are interesting enough to be added to their collections. Very often we aren’t even sure what search terms would be best for finding the result we want, and in that case it’s especially important to be able to lean on the vetting provided by other human beings, rather than relying on a machine to figure out what we mean when our language is imprecise.
I think the most interesting thing about this is that there’s a stark difference between searching the web and searching the academic literature. It would be really hard to make a search tool for the whole internet based on this model of friends and bookmarking, because there are just too many different kinds of things that I might want to look for on the internet. I go on Google to find the phone numbers of pizzerias, explanations of black hole physics, and the addresses of distant relatives. It’s too much of a mess.
In contrast, researchers frequently are interested in searching a fairly coherent subspace of the academic literature for articles related to narrowly defined subject areas. I study protein folding theory. I have a colleague who studies protein misfolding experimentally in yeast, and I am 100% certain that his paper collection is a better place for me to find the articles that matter most to me than the collection of someone else who is working on sequencing the genomes of different strains of malaria.
The other thing I like about collection sharing is that it works locally. None of this nonsense of waiting until there are a gazillion people using the site before the “wisdom of the crowds” can be unleashed. If you can get a half dozen of your colleagues to collect papers on Labmeeting and share what they collect with you, you will already be sitting in the driver’s seat of the most powerful search engine for scientific literature related to your own work.
We’ve got a lot of other ideas here at Labmeeting for how to help people search the literature better, but we started by building collection sharing because it’s simple, easy, and has the potential to be much more powerful than any other clever thing we might come up with. Build a paper collection for yourself on Labmeeting, and invite your colleagues and lab mates to do the same. It’s going to make science a lot easier.