June 29, 2009 at 2:05 pm · Filed under The future of science
The advancement of Web 2.0 social technology such as blogs has resulted in traditional businesses coming under attack, with a reduction in ‘barriers to entry’ of certain industries such as News Media. This article states that blogs will result in scientists able to easily publish and review their colleagues work via the internet, possibly resulting in the disruption of the scientific publishing community. This could result in the destruction of traditional venues of publishing such as print based journals.
Part I: How Industries Fail
Until three years ago, the oldest company in the world was the construction company Kongo Gumi, headquartered in Osaka, Japan. It was founded in 578 CE to help construct the first Buddhist temple in Japan, the Shitenno-ji. The Kongo Gumi continued in the construction trade for almost one and a half thousand years. In 2005, the company had more than 100 employees, and 70 million dollars in revenue. But in 2006, Kongo Gumi went into liquidation, and as an independent entity no longer exists.
- How is it that large, powerful organizations, can simply disappear?
There are two common explanations for the disruption of industries. The first explanation is essentially that the people in charge of the failing industries are stupid. Polite critics phrase their explanations less bluntly, but nonetheless many explanations boil down to a presumption of stupidity. The second common explanation for the failure of an entire industry is that the people in charge are malevolent. In that explanation, evil record company and newspaper executives have been screwing over their customers for years, simply to preserve a status quo that they personally find comfortable.
Even smart and good organizations can fail in the face of disruptive change, and that there are common underlying structural reasons why that’s the case. If you think the newspapers and record companies are stupid or malevolent, then you can reassure yourself that provided you’re smart and good, you don’t have anything to worry about. But if disruption can destroy even the smart and the good, then it can destroy anybody. Scientific publishing is in the early days of a major disruption, with similar underlying causes, and will change radically over the next few years.
- Why online news is killing the newspapers
Some people explain the slow death of newspapers by saying that blogs and other online sources  are news parasites, feeding off the original reporting done by the newspapers. That’s false. Many of the top blogs do excellent original reporting. A good example is the popular technology blog TechCrunch, started by Michael Arrington in 2005, TechCrunch has rapidly grown, and now employs a large staff. Part of the reason it’s grown is because TechCrunch’s reporting is some of the best in the technology industry. Yet whereas the New York Times is wilting financially , TechCrunch is thriving, because TechCrunch’s operating costs are far lower, per word, than the New York Times. The result is that not only is the audience for technology news moving away from the technology section of newspapers and toward blogs like TechCrunch, the blogs can undercut the newspaper’s advertising rates. This depresses the price of advertising and causes the advertisers to move away from the newspapers.
There’s little they can do to make themselves cheaper to run. For example; let’s focus on one aspect of newspapers: photography. Newspapers employ photographers to take photos for articles, whereas when TechCrunch or a similar blog needs a photo for a post, they’ll use a stock photo, or ask their subject to send them a snap. The average cost is probably tens of dollars.
TechCrunch isn’t being any smarter than the newspapers. It’s not as though no-one at the newspapers ever thought “Hey, why don’t we ask interviewees to send us a polaroid, and save some money?” Newspapers employ photographers for an excellent business reason: good quality photography is a distinguishing feature that can help establish a superior newspaper brand. For a high-end paper, it’s probably historically been worth millions of dollars to get stunning. It makes complete business sense to spend a few hundred dollars per photo.
What can you do, as a newspaper editor? You could fire your staff photographers. But if you do that, you’ll destroy the morale not just of the photographers, but of all your staff. You’ll stir up the Unions. You’ll give a competitive advantage to your newspaper competitors. And, at the end of the day, you’ll still be paying far more per word for news than TechCrunch, and the quality of your product will be no more competitive.
- Local Optimum
The problem is that your newspaper has an organizational architecture which is, to use the physicists’ phrase, a local optimum. Relatively small changes to that architecture – like firing your photographers – don’t make your situation better, they make it worse. So you’re stuck gazing over at TechCrunch, who is at an even better local optimum, a local optimum that could not have existed twenty years ago:
Unfortunately for you, there’s no way you can get to that new optimum without attempting passage through a deep and unfriendly valley. The result is that the newspapers are locked into producing a product that’s of comparable quality (from an advertiser’s point of view) to the top blogs, but at far greater cost. And yet all their decisions – like the decision to spend a lot on photography – are entirely sensible business decisions. Even if they’re smart and good, they’re caught on the horns of a cruel dilemma.
- Organizational immune systems
Organizations are large, complex structures, and to survive and prosper they must contain a sort of organizational immune system dedicated to preserving that structure. If they didn’t have such an immune system, they’d fall apart in the ordinary course of events. Most of the time the immune system is a good thing, a way of preserving what’s good about an organization, and at the same time allowing healthy gradual change. But when an organization needs catastrophic gut-wrenching change to stay alive, the immune system becomes a liability.
Imagine someone at the New York Times had tried to start a service like Google News, prior to Google News. Even before the product launched they would have been constantly attacked from within the organization for promoting competitors’ products. They would likely have been forced to water down and distort the service, probably to the point where it was nearly useless for potential customers. And even if they’d managed to win the internal fight and launched a product that wasn’t watered down, they would then have been attacked viciously by the New York Times’ competitors, who would suspect a ploy to steal business. Only someone outside the industry could have launched a service like Google News.
- What are the signs of impending disruption?
New technologies often don’t look very good in their early stages, and that means a straightup comparison of new to old is little help in recognizing impending dispruption. That’s a problem, though, because the best time to recognize disruption is in its early stages. The journalists and newspaper editors who’ve only recognized their problems in the last three to four years are sunk. They needed to recognize the impending disruption back before blogs looked like serious competitors, when evaluated in conventional terms.
- Part II: Is scientific publishing about to be disrupted?
Today, scientific publishers are production companies, specializing in services like editorial, copyediting, and, in some cases, sales and marketing. My claim is that in ten to twenty years, scientific publishers will be technology companies . I mean they’ll be technology-driven companies in a similar way to, say, Google or Apple. That is, their foundation will be technological innovation, and most key decision-makers will be people with deep technological expertise. Those publishers that don’t become technology driven will die off.
Predictions that scientific publishing is about to be disrupted are not new. What’s new today is the flourishing of an ecosystem of startups that are experimenting with new ways of communicating research, some radically different to conventional journals. Consider Chemspider, the excellent online database of more than 20 million molecules, recently acquired by the Royal Society of Chemistry. And then there are companies like WordPress, Friendfeed, and Wikimedia, that weren’t started with science in mind, but which are increasingly helping scientists communicate their research. This flourishing ecosystem is not too dissimilar from the sudden flourishing of online news services we saw over the period 2000 to 2005.
- The gradual rise of science blogs as a serious medium for research.
It’s easy to miss the impact of blogs on research, because most science blogs focus on outreach. But more and more blogs contain high quality research content. Look at Terry Tao’s wonderful series of posts explaining one of the biggest breakthroughs in recent mathematical history, the proof of the Poincare conjecture. Or Richard Lipton’s excellent series of posts exploring his ideas for solving a major problem in computer science, namely, finding a fast algorithm for factoring large numbers. Scientific publishers should be terrified that some of the world’s best scientists, people at or near their research peak, people whose time is at a premium, are spending hundreds of hours each year creating original research content for their blogs, content that in many cases would be difficult or impossible to publish in a conventional journal.
This flourishing ecosystem of startups is just one sign that scientific publishing is moving from being a production industry to a technology industry. A second sign of this move is that the nature of information is changing. The natural way for publishers in all media to add value was through production and distribution, and so they employed people skilled in those tasks, and in supporting tasks like sales and marketing. But the cost of distributing information has now dropped almost to zero, and production and content costs have also dropped radically . At the same time, the world’s information is now rapidly being put into a single, active network, where it can wake up and come alive. The result is that the people who add the most value to information are no longer the people who do production and distribution. Instead, it’s the technology people, the programmers.
If you doubt this, look at where the profits are migrating in other media industries. In music, they’re migrating to organizations like Apple. In books, they’re migrating to organizations like Amazon, with the Kindle. In many other areas of media, they’re migrating to Google: Google is becoming the world’s largest media company. They don’t describe themselves that way (see also here), but the media industry’s profits are certainly moving to Google. All these organizations are run by people with deep technical expertise. How many scientific publishers are run by people who know the difference between an INNER JOIN and an OUTER JOIN? Or who know what an A/B test is? Or who know how to set up a Hadoop cluster? Without technical knowledge of this type it’s impossible to run a technology-driven organization. How many scientific publishers are as knowledgeable about technology as Steve Jobs, Sergey Brin, or Larry Page?
I expect few scientific publishers will believe and act on predictions of disruption. It’s also easy to vent standard immune responses: “but what about peer review”, “what about quality control”, “how will scientists know what to read”. These questions express important values, but to get hung up on them suggests a lack of imagination much like Andrew Rosenthal’s defense of the New York Times editorial page. (I sometimes wonder how many journal editors still use Yahoo!’s human curated topic directory instead of Google?) In conversations with editors I repeatedly encounter the same pattern: “But idea X won’t work / shouldn’t be allowed / is bad because of Y.” Well, okay. So what? If you’re right, you’ll be intellectually vindicated, and can take a bow. If you’re wrong, your company may not exist in ten years. Whether you’re right or not is not the point. When new technologies are being developed, the organizations that win are those that aggressively take risks, put visionary technologists in key decision-making positions, attain a deep organizational mastery of the relevant technologies, and, in most cases, make a lot of mistakes. Being wrong is a feature, not a bug, if it helps you evolve a model that works: you start out with an idea that’s just plain wrong, but that contains the seed of a better idea. You improve it, and you’re only somewhat wrong. You improve it again, and you end up the only game in town. Unfortunately, few scientific publishers are attempting to become technology-driven in this way. The only major examples I know of are Nature Publishing Group (with Nature.com) and the Public Library of Science. Many other publishers are experimenting with technology, but those experiments remain under the control of people whose core expertise is in others areas.
These opportunities can still be grasped by scientific publishers who are willing to let go and become technology-driven, even when that threatens to extinguish their old way of doing things. And, as we’ve seen, these opportunites are and will be grasped by bold entrepreneurs. Here’s a list of services I expect to see developed over the next few years. A few of these ideas are already under development, mostly by startups, but have yet to reach the quality level needed to become ubiquitous. The list could easily be continued ad nauseum – these are just a few of the more obvious things to do.
- Personalized paper recommendations
Amazon.com has had this for books since the late 1990s. You go to the site and rate your favourite books. The system identifies people with similar taste, and automatically constructs a list of recommendations for you. Google doesn’t actually use the personalized algorithm, because it’s far more computationally intensive than ordinary PageRank, and even for Google it’s hard to scale to tens of billions of webpages. But if all you’re trying to rank is (say) the physics literature – a few million papers – then it turns out that with a little ingenuity you can implement personalized PageRank on a small cluster of computers. It’s possible this can be used to build a system even better than Amazon or Netflix.
- A great search engine for science
ISI’s Web of Knowledge, Elsevier’s Scopus and Google Scholar are remarkable tools, but there’s still huge scope to extend and improve scientific search engines . With a few exceptions, they don’t do even basic things like automatic spelling correction, good relevancy ranking of papers (preferably personalized), automated translation, or decent alerting services. They certainly don’t do more advanced things, like providing social features, or strong automated tools for data mining. Why not have a public API  so people can build their own applications to extract value out of the scientific literature? Imagine using techniques from machine learning to automatically identify underappreciated papers, or to identify emerging areas of study.
- High-quality tools for real-time collaboration by scientists
Etherpad, lets multiple people edit a document, in real time, through the browser. They’re even developing a feature allowing you to play back the editing process. A similar service from Google, Google Docs, also offers shared spreadsheets and presentations. These are just a few of hundreds of general purpose collaborative tools that are lightyears beyond what scientists use. They’re not widely adopted by scientists yet, in part for superficial reasons: they don’t integrate with things like LaTeX and standard bibliographical tools. Yet achieving that kind of integration is trivial compared with the problems these tools do solve. Looking beyond, services like Google Wave may be a platform for startups to build a suite of collaboration clients that every scientist in the world will eventually use.
- Scientific blogging and wiki platforms
Why arent scientific publishers developing high-quality scientific blogging and wiki platforms? It would be easy to build upon the open source WordPress platform, for example, setting up a hosting service that makes it easy for scientists to set up a blog, and adds important features not present in a standard WordPress installation, like reliable signing of posts, timestamping, human-readable URLs, and support for multiple post versions, with the ability to see (and cite) a full revision history. Perhaps most importantly, blog posts could be made fully citable.
Publishers could also help preserve some of the important work now being done on scientific blogs and wikis.
- The data web
Data needs to be organized and searchable, so people can find and use it. The data needs to be linked, as the utility of data sets grows in proportion to the connections between them. It needs to be citable. And there needs to be simple, easy-to-use infrastructure and expertise to extract value from that data. On every single one of these issues, publishers are at risk of being leapfrogged by companies like Metaweb, who are building platforms for the data web.
- Why many services will fail
Development projects are often led by senior editors or senior scientists whose hands-on technical knowledge is minimal, and whose day-to-day involvement is sporadic. Implementation is instead delegated to IT-underlings with little power. It should surprise no one that the results are often mediocre. Developing high-quality web services requires deep knowledge and drive. The people who succeed at doing it are usually brilliant and deeply technically knowledgeable. Yet it’s surprisingly common to find projects being led by senior scientists or senior editors whose main claim to “expertise” is that they wrote a few programs while a grad student or postdoc, and who now think they can get a high-quality result with minimal extra technical knowledge. That’s not what it means to be technology-driven.
I’ve presented a pessimistic view of the future of current scientific publishers. Yet I hope it’s also clear that there are enormous opportunities to innovate, for those willing to master new technologies, and to experiment boldly with new ways of doing things. The result will be a great wave of innovation that changes not just how scientific discoveries are communicated, but also accelerates the way scientific discoveries are made.