Wikipedia Plateau?

There has been anecdotal evidence that Wikipedia’s prospects have shifted recently: deteriorating article quality, a backslide of featured articles, and pages unnecessarily put into the deletion process.

But this graph provides definitive, sobering proof of something gone awry:

(UPDATE: n.b. the chart is specifically for the English Wikipedia)

Sometime in September/October of 2006, the growth rate of Wikipedia dropped dramatically. It crossed over from overperforming to underperforming in that time. And it’s been mired in that slump ever since.

People have recognized the community has been facing issues of quality and growth, but it has never been as stark as it is here. Wikipedia seems to be hitting that top part of the S-curve, and it’s something the community has been worried about for a while.

What could explain this? The beginnings of a virtual colony collapse disorder, or the natural course of a mature community?

Is it governance issues? A new board was put in place at almost exactly the same time, while multiple staff reshuffles have taken place. Certainly a new style of oversight and leadership has taken hold. The board is larger than it’s ever been, and is very much an operational, hands-on entity. Gone are the days of grassroots informality. Elected folks are now entrusted with authority and a six-figure budget. Formal “chapters” with leaders dominate the community organizing efforts. There is still lingering resentment over the selection of an Asian city over a European city for the Wikimania 2007 conference. The stakes are higher. Are the tensions, too?

Is a “tragedy of the commons” affecting Wikipedia? As a community grows, and its members become more anonymous and unfamiliar with each other, the original grassroots underpinnings start to fade. It’s the difference between a town and a city. The difference between Ann Arbor and Detroit. In big-town, concrete-jungle Wikipedia, unfamiliarity with others fosters incivility. This was something that was more often kept under control in small-town, picket-fence-lined Wikipedia.

Is it the natural consequence of a nearly complete project? The “low hanging fruit are all gone” according to one Wikipedian. The thrill of starting the article on [[Monkey]] or [[Theodore Roosevelt]], and the associated feeling of empowerment, no longer exists. More and more, newbies are met with “don’t touch that,” or “we don’t do that here.” As more veterans leave, their ranks are not being filled with those inspired by the early exuberance and “aha” factor that Wikipedians felt even one year ago. That has a significant effect on whether quality editors can be replaced. New article writing, map creation and forming new features all thrilled the first generation of Wikipedians. Today, the mundane and boring tasks remain — copyediting, fact checking and vandalism fighting.

Increasingly, Wikipedia admins today find themselves fending off pointless “Trivia” sections tacked onto every article. (If they’re lucky, they can shunt these factoids to pages such as [[Eiffel_Tower_in_pop_culture]], but you are still left with cringeworthy examples like [[Hitler_in_popular_culture]].)

Perhaps the only virgin areas for Wikipedia are ones related to “newsmakers” or sudden celebrity. News is constantly streaming out new facts, stories and personalities, and Wikipedia’s strength has been to capture it all. Another area of potential growth involves “second level” articles that go up the information pyramid — comparative analyses, “Impact of…” articles, or large sweeping [[The Eighties]] roundups. The problem is they start to drift closer to a current no-no within the community — original research. Those attitudes may have to change.

A community built on passion and interest can do great things. For six years, it has done great things in Wikipedia. But what happens when the fuel is exhausted? The low hanging fruit has been plucked. The soil not as rich. Has the golden age of Wikipedia passed? And how will it be recorded in [[Wikipedia]]?

48 thoughts on “Wikipedia Plateau?”

  1. Actually, I find this data somewhat reassuring. I hate the focus on number of articles as a measure of success; it places the emphasis in the wrong places. We don’t need to focus on creating more articles; we need to focus on improving the ones that are there.

    And I seriously doubt that anything to do with governance affects article count. Most of the editing community doesn’t have any interest in staff crises or governance woes: which is as it should be.

  2. Hoi,
    Your graph is probably for the English Wikipedia. What would it look like for other Wikipedias ?

  3. The German-language Wikipedia has moved from an exponential article growth towards a linear growth in May 2007. July 2004: 451 new articles per day. April 2007: 461 new articles per day (and many ups and downs in between).

    Three years later, we now have about six times as many articles (600k) and about 250 million words, 8 times as many as the largest encyclopedia on the commercial German market (Brockhaus Enzyklopädie, 21st edition).

    Given these numbers, I am almost proud to say that the focus has shifted so much towards quality, improving the existing articles, providing sources and so on.

    It will be a huge task to at least maintain the articles we have right now. If you apply the old statistics to it, you might see a flatline where it actually is an improvement.

  4. I do think that the English Wikipedia is experiencing difficulties in leveraging the motivations of Wikipedians toward getting done the tasks that really need doing. Getting new articles written is not a task that requires a lot of directed motivational effort; people naturally want to do that and so it just happens. Improving existing articles doesn’t happen quite so organically, and Wikipedia currently provides insufficient motivation to get enough people improving existing articles.

    In general, the leadership of the Wikipedia community — if there is any — has a hard task ahead of it in coming up with ways to incent its membership to do the tasks that need to be done in order to improve, instead of merely grow, the encyclopedia.

  5. Yes, you have a good point in that this also exactly coincides with when Jimmy Wales decreed we should shift from growth to quality. It’s harder to quantify quality. And in retrospect, the terms “overperform” and “underperform” should be qualified as applying only to article count, which is not a great yardstick in itself.

    However, I’m still somewhat skeptical that the quality has been upped in the meantime. I see vandalism not caught as quickly or thoroughly as before, and I see more obvious typos in high profile articles.

    Gerard, you are correct in that this is for en: only. Would be interesting to plot it for other languages. Some trends can be seen in the charts/graphs here:

    Mathias, yes, de: is famous for its much more stringent article inclusion standards, so perhaps it is not surprising. Have you seen the same dynamic of people leaving, and wondering who’s replacing them? Has the community morphed from growth types to quality types accordingly? I suspect your “Trivia” buffs are more in check than on en:.

    For the readers out there, de: is quite distinct from en: — it does not have an article about every hurricane, controversy, phenomenon and current event. While en: could be almost seen as an extension of the newspaper, de: is most certainly not.

  6. Andrew, I think your analysis speaks more than the evidence.

    I still stand by my interpretation from a few months back when Ross Sage supplied the first evidence of this levelling-off: all of the low-hanging fruit has been picked, & new articles will require increasing amounts of work to create in terms of research and organization. To furnish one example, I have gathered the materials for biographical articles on three Ethiopian figures, but haven’t started work on them because I’m still not certain about how to present the material: an individual’s life is unique unto itself. However, I have written in the same time about a dozen articles on Ethiopian towns because I have a structure for the information in my head: one can write an acceptably good stub on a town by recording its location, its population statistics, & any anecdotal information like history or local points of interest.

    If I may attempt to read your mind (& I apologize in advance for this invasion of your privacy), I suspect that your analysis that this drop-off is due to “something wrong” with Wikipedia is based on following the latest cycle of Wikidrama. All communities have their ongoing dramatic narratives, & Wikipedia is no exception, & you are a mature & thoughtful adult, so normally I would not expect you to let this color your thinking.

    However, as communities grow in size, their centrifugal forces grow closer to parity with, if not superiority to, the centripetal forces. In simple words, larger groups of people are more likely to break into smaller groups hostile to each other than smaller ones. I have been wondering when this would happen to Wikipedia in its current form (& whether it might happen before a new version of the community that can handle more people arises), so if this thought lies behind your analysis, then you are not alone.

    I simply believe this fall-off of article creation is not the direct symptom of the problem of managing a community that is outgrowing the principles of a Wiki.


  7. This is almost certainly nothing more than a side effect of new article creation being restricted to logged in users, which was put in place in early 2006. It is a change in the software, not in the community.

  8. Walt, I’m not sure that’s a plausible explanation by itself. The restriction was put in place in December 2005, a good 7-8 months beforehand. I’d be interested to know why you think it’s “almost certain” that’s the cause of the drop.

  9. Geoff thanks for the comment. It’s interesting that we see nearly the same conditions, but have rather different conclusions.

    Wikipedia has a problem that is similar to urban population management. It’s easy to tell when people arrive — an edit history, IRC chatter, email list activity, request for adminship, etc. It’s harder to tell when people leave. They don’t tell why, there is no exit interview, and unless it’s a LoveBomb, the community lacks any feedback mechanism about departures. Wikipedia:Missing Wikipedians is perhaps the only indicator of such, and even that is incomplete and a guesstimate as to why folks leave.

    Your previous March 2007 post had an interesting comment:

    “If we are correct, the principle of least work (the easiest tasks will almost always be completed first) would predict that the quality of Wikipedia’s articles will start to gradually improve, because that is becoming the easiest task on Wikipedia to do now. Even if this first takes the form of automated edits, running bots to make large numbers of repetitive changes. Eventually, someone will have to acknowledge the countless requests for sources that dot so many Wikipedia articles, and begin the long, tedious task of researching the issue and meeting that demand.”

    I don’t quite share your optimism, because as sageross pointed out: “A crucial question is, to what extent does low-hanging fruit entice new editors into the fold who later tackle higher-value targets? Might rapid growth via the easy stuff hamper Wikipedia’s long-term growth?”

    The principle of least work does not adequately explain the motivations and actions of Wikipedians. If you apply a “uses and gratifications” model and the “pleasure principle” to Wikipedia, people tend to work on things that interest them or that they are passionate about. And they are encouraged a lot by the “piranha effect” of one’s edits inspiring or spawning the work of others.

    The task of article improvement (copyediting, fact checking, grammar usage and cross-article consistency) has historically been less popular than the immediate gratification of content creation and feature addition. The former activities are likely to be less social, and pertain to certain personal pet peeves and hangups. They also seem less likely to get noticed by others, as they are not as high profile a contribution as creating the page about [[NAFTA]]. Another problem with the “least work” hypothesis is that it takes a lot of work to find that “least work.” What that means is you need to be quite adept at navigating to and interpreting the Community Portal to access the queues of pending requests and outstanding tasks that now make up the easiest and most requested ones from the community.

    These are all theories though. What’s necessary is more Wikipedia anthropology — there’s rich insight hidden in the user contribution histories. But no one’s really gotten around to tracking long term user behavior, and evaluating this aspect of Wikipedia’s emergent behavior.

  10. I would like to see more action on the part of the Wikipedia community to study different aspects of its growth: what works and what does not. We can speculate all day as to why this trend has popped up, but rather than speculate, why not research the question? Survey people in the community and devise better ways to understand the trends. Perhaps it’s even time for Wikipedia to reconsider some of the principles on which it was founded that have been taken for granted for so long. Openness is a wonderful thing and something I love to see, but at the same time it is a shame to see talented contributors being scared off by vandalism, edit wars and less than courteous co-authors. I’m not saying Wikipedia should close the doors and simply become another Citizendium, but it wouldn’t hurt to at least look into more ways of preserving the quality of articles, and preventing contributions from people who think of Wikipedia as a playground.

  11. Pingback: …My heart’s in Accra » links for 2007-06-30

  12. I think the one factor you are missing is that Wikipedia is no longer being discovered by new people. Everybody knows about us already. And everybody has made up their mind about whether they are the kind of person who wants to edit Wikipedia or not. There are no, and I mean no, people who have never heard of Wikipedia, and just by chance come upon it via a Google search, a Slashdot story or whatever.

  13. Hi Cimon, even if that were true, how would that translate into a dramatic drop at that point in time? That might explain a gradual tailing off.

  14. Jussi: Well, if you “believe” in the ComScore numbers, new people are constantly discovering Wikipedia. Right now, only one third of the US online population is using Wikipedia, so there is still room to grow.

    Of course, you are also right: There are no more “early adopters” and saturation effects should kick in.

  15. I’m on enwiki, but have waded a little bit into the Arabic Wikipedia. With only ~30,000 articles, there’s still a long way ahead. I recently created the article on the State of Washington there, and surely numerous other basic topics are lacking. There is still tremendous room for growth, to increase the number and quality of articles on arwiki, in general.

    arwiki also has especially much to offer in developing material on Arabic-related topics, places, etc, which often requires consulting Arabic-language sources. There’s a clear bias on enwiki towards the U.S., Europe, Australia, and such places, with other places covered more sparsely on enwiki and the whole of arwiki needing lots of work. There’s also a bias on enwiki towards pop culture, with some academic topics still needing much work.

    Don’t know how much I’ll do on arwiki, as I’m not a native speaker. What I can do is help develop featured quality material on enwiki, and help translate featured articles from enwiki to arwiki. Someday, I would like to translate high quality material from arwiki to enwiki.

    In all, what’s needed is:
    (1) increasing participation on arwiki (and other wikis)
    (2) reducing the bias on enwiki
    (3) increasing number of featured articles on enwiki

    These aren’t simple, easy things to do.

  16. Andrew,

    When I first read your post, I had a similar reaction to Geoff: there’s no one thing that has changed dramatically in terms of governance or community, while the logistical (low-hanging fruit) explanation has been predictable since even before the first signs started showing.

    However, the (at this point very limited) data shows too sharp a drop-off for most reasonable “low-hanging fruit” models; I also suspect there’s more to it.

    Jussi-Ville Heiskanen may be onto something; the reduced supply of low-hanging fruit could be compounding the learning curve for new editors at the same time that Wikipedia is nearing saturation in terms of exposure to likely editors. So the earlier generations of Wikipedians are fading away, with only smaller pools of new talent replacing them. (This brings up an important question: what is the half-life of a Wikipedian? My guess: about 2 years.)

    Another contribution might be the recent (and to some extent ongoing) notability reforms. Though the changes were not large, and actually eased the notability requirements, they also brought a lot more Wikipedians on board from the “inclusionist” end of the spectrum, meaning that more editors are willing to enforce (and encourage others to abide by) the notability guidelines.

    The contention over BLP policy and its interpretation has pushed away some users, but I suspect that could only account for a tiny portion of the dropoff.

    You suggest that “Perhaps the only virgin areas for Wikipedia are ones related to ‘newsmakers’ or sudden celebrity.” Even in terms of news, there is plenty of notable material that does not get covered. For example, I’m perpetually disappointed in the Wikipedia coverage of pending and recently enacted U.S. legislation, and of litigation of national significance. For most scholarly topics, even relatively well-covered areas like the sciences, the current Wikipedia coverage (in terms of number of potential articles) is still only scratching the surface. It’s just that the number of people both capable and willing to write any given missing article is small.

    I agree with Kat that, prima facie, the dropoff in new article creation is not particularly worrying. The sooner there are no notable entertainment topics (besides the brand new ones) to start new articles on, the better off the Wikipedia community will be. The next step is to push forward with quality control mechanisms, to make Wikipedia more attractive to more of the kinds of editors who can pick the higher fruit.

  17. In Who Writes Wikipedia?, Aaron Swartz hypothesized that the larger an encyclopedia grows, the more new content creation is shifted to new users and anons, because the knowledge required is more and more diverse. If he is right, we can expect the same with new article creation: after a while, you need specialized knowledge to even know what new articles would be required, and eventually the number of people needed to cover all subjects outgrows the number of editors. Thus, the negative effects of disallowing anonymous article creation might show months or years later, when the wiki outgrows the obvious subjects. This still does not explain the sharp drop, but it might be a factor.

    To translate this and some earlier comments to research questions:
    - what is the distribution of edit count in new content creation and new article creation? How much of that is done by anons? How does this distribution correlate with the size of a wiki?
    - why do people leave? Maybe the software could send a mail to all editors with sufficiently high edit count who didn’t edit for, say, three months, and ask them to participate in a survey.
    - in which wikis did the growth in articles change from exponential to slower, and when? How does that correlate with the number of people speaking that language? (Even better if internet/broadband access statistics can also be factored in somehow.) This might also be interesting to see for the number of editors and number of edits.
    - was the smaller rate of new articles accompanied by a growth in quality? GA and assessments are probably too new to be useful, and FA is too much of a bottleneck, but mean article size might be a useful metric.
    - what is the half-life of editors? how does it correlate with wiki size and with edit frequency? Is there a burnout effect (shorter half-life for very high-frequency editors)?
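
    As a rough illustration of the half-life question, here is a minimal sketch of how one might estimate it from a single cohort measurement. It assumes editors drop out at a constant rate (simple exponential decay), which is only an assumption and may well not hold; the function name `half_life` and both of its inputs are hypothetical, not an existing tool or real data.

```python
import math


def half_life(fraction_remaining: float, elapsed_months: float) -> float:
    """Estimate the half-life (in months) of a cohort of editors.

    Assumes constant-rate (exponential) attrition. fraction_remaining is
    the hypothetical share of the cohort still editing after
    elapsed_months.
    """
    if not 0.0 < fraction_remaining < 1.0:
        raise ValueError("fraction_remaining must be strictly between 0 and 1")
    if elapsed_months <= 0:
        raise ValueError("elapsed_months must be positive")
    # Under exponential decay: fraction = 0.5 ** (elapsed / half_life),
    # so half_life = elapsed * ln(2) / -ln(fraction).
    return elapsed_months * math.log(2) / -math.log(fraction_remaining)
```

    For instance, if a quarter of a hypothetical cohort were still editing after 24 months, `half_life(0.25, 24)` would put the half-life at about 12 months. Comparing the estimate for high-frequency and low-frequency editors would be one way to look for a burnout effect.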

    It would be really nice to see research done in these areas, maybe even funded by Wikimedia.

  18. Andrew;

    Your comments and others have brought to mind some ideas I’ve been kicking around regarding web 2.0 material.

    1. What about The Wikilacrum, or the similar wiki sites that spring up that consolidate some of the more niche-oriented material that demands greater specialization and a more familiar knowledge of the material? I think Tgr is right about one thing, that specialized knowledge bases seek different tools and have different reasons for writing information.

    2. Brandipedia. I am assuming that most people are like me. They may not be. But all I know of Wikipedia is that it is a brand that challenges conventional thinking on resource material. If I was a resource creator by trade, I don’t think I would seek to access Wikipedia as a contributor. My resources and livelihood are tied up in other things, usually more “stable” and traditional. I know that’s assuming a lot, like, that means other than Wikipedia are traditional. :)

    3. In my opinion, it would be ridiculous to assume that Wikipedia could ever be finished and reach a plateau. That would go contrary to web 2.0 thinking. Although, who’s to say that all web 2.0 sites act like other web 2.0 sites in terms of user behavior?

    The final point I’d like to make is simply about what is natural intelligence and what is consumer-generated, or consumer-guided intelligence? Jimmy Wales had to sell the idea of Wikipedia before it became a useful alternative to encyclopedic discourse. Encyclopedic knowledge building has always been a revolutionary thing. Usually, manic and marginal individuals set out on journeys of mental discovery by cataloguing everything they can in order to, well, find some kind of notoriety or success.

    Is Wikipedia really a natural system? Or is it a successful project in non-profit consumerism?

  19. Nice post, and a lot of interesting responses too. There are clearly many variables at work here, so the reasons are complex. I’m always an optimist, so I tend to support the idea that much of this is a natural maturing of Wikipedia. A lot IMHO has to do with a switch from quantity to quality; there are always those who can point to a familiar article and deplore its decline, but there are many more great articles popping up all the time. Also, our standards have risen over the years – look at a new featured article today compared to 2004.

    One “variable” that has gone unmentioned has been the explosion (on en, at least) of the number of WikiProjects. To extend your Ann Arbor/Detroit analogy, I believe these provide “neighbourhoods” in which people can still feel a sense of community. They also provide (hopefully!) a welcoming place for newbies, and a place to organise quality improvement work.

    Although a newbie may not get to create the NAFTA article, there is still a great sense of pride in turning that into a Featured article. Wikipedia may be maturing, but there is still lots to do and it’s still an exciting place to work!

