Archive for the 'Wikipedia' Category

Telegraph UK on Wikipedia Inclusionism/Deletionism

Wednesday, October 10th, 2007

I usually talk to at least a reporter a week on background concerning Wikipedia’s community and associated shenanigans. Writing a book on the subject will attract that attention.

This week, however, Telegraph UK’s article about inclusionism/deletionism put the Pownce issue up front and center again. I talked to reporter Ian Douglas about a lot of concerns, and he pretty much came out with the right set of facts. But this was perhaps worded too strongly:

Submission of new articles is slowing to a trickle where in previous years it was flood, and the discussion pages are increasingly filled with arguments and cryptic references to policy documents.

Creation of new articles is hardly a trickle; it’s just down from previous highs, so we are perhaps seeing the latter part of the S-curve of growth. We got into the lack of dumps, missing statistics and other tools that might help diagnose this phenomenon, but that did not make it into the article. My mention is but a single data point in the debate, so it would have been good to have better stats on this.

Andrew Lih was a well-known deletionist until recently when he became embroiled in the row over the entry for Pownce, a messaging and bookmarking website from Kevin Rose, the founder of the popular site Digg.com. The entry for Pownce, which had been written up in Business Week, was deleted as advertising until Lih resurrected it. He wrote about the row on his blog and has become a de facto spokesman for the inclusionists, and says he feels like an old hand.

“The old timers remember the early days when we used to say ‘ignore all rules’ and ‘assume good faith’, but people tend not to emphasise that now. The third or fourth generation of Wikipedians has only heard Jimmy Wales talk about the problems.

“So now, mixed in with the euphoria and positive energy it’s a lot of cutting, fighting, referencing, cutting back while leaving the good stuff in. New priorities are arriving. Newer folks feel like they’re wielding a machete, not planting new trees.

“A lot of the veterans see established articles nominated for deletion. They try not to be arrogant, try to be inclusive, but it’s tedious after six, seven or eight times.”

Wikimania 2008 in Alexandria, Egypt

Tuesday, October 9th, 2007

A group of Wikimedia Foundation staff and volunteers served on the jury to decide on where to hold the Wikimania conference in 2008, and selected Alexandria, Egypt. (Full disclosure: I served nominally as one of the three moderators of the process, but did not take part in the decisionmaking itself.)

I have to say it’s an exciting choice. The New Library will be the host, and will be sponsoring much of the conference for this historic homecoming to the most significant repository of human knowledge in ancient times. The public relations value and the seeding of interest in the Middle East and Africa is a great opportunity.

However, the choice has not been without controversy. David Strauss was one of the more vocal critics, alluding to Egypt’s treatment of women, LGBT and dissidents. His initial post:

I’m offended that the desire to have Wikimania hop around the globe (rotation) trumps the egregious history Egypt has with LGBT and other civil rights (local laws). While visitors to Egypt are certainly not at the same risk, I refuse to spend any money in a country that — as recently as 2004 — sentenced someone to 17 years of prison and two years of hard labor for posting a personal ad on a gay website[1]. A blogger was imprisoned in 2007 for four years for “insulting Islam and defaming the President of Egypt.”[2] Jimmy Wales even attended the Amnesty conference denouncing the censorship. No legal or cultural reforms since give me confidence that the situation has improved.

After a followup thread, there were some folks who agreed with him, but there has been no large outcry. Most folks seem excited and in support of the historic context and chance to engage a new community.

The choice of whether to boycott or engage has been a tough one. It happens with the Olympics, on trade, on technology transfer, and choosing conference venues. Given the international makeup of the Wikipedia community you’re not going to get consensus. When we chose Boston two years ago, there were folks who were upset because of the US’s foray into Iraq and the harsh requirements for visas.

Jimmy Wales has noted this, and has chosen “engagement” as his stance.

In honor of David’s concerns, I have decided to make the title of my own talk at Wikimania 2008 “Free knowledge and human rights” and I will use this opportunity to speak out against censorship and other violations of human rights around the world, including examples from Egypt.

Phoebe Ayers, who served on the jury, mentioned that the process is not perfect and that the team is willing to re-evaluate the criteria for future cycles. But this particular decision is final, and it seems on balance rather well supported.

She perhaps summed it up best.

Wikimania and Wikimedia are both global in scope, which means that while we can condemn censorship and loss of human rights everywhere we must also take into account a global range of values. Our projects focus specifically on free knowledge, and I expect that will be highlighted at the conference.

LA Times and the Deletion Roundup

Sunday, September 30th, 2007

I talked at length with the reporter from the LA Times about the entire universe of contemporary Wikipedia issues, before jetting off for China’s October holidays. The circle of Wikipedia bloggers has done a fine job of summarizing the article by David Sarno about deletion/inclusionism, sparked by the Mzoli Meats controversy, so I will simply link to them.

Overall, it was a good treatment. I’m glad to see Sarno took the time to talk at length to understand the complexity of the issue, rather than doing a simplistic “parachute journalism” article.

Erik Moeller on Wikipedia 2.0

Sunday, September 23rd, 2007

This is a response from Erik Moeller to me concerning the New Scientist article I talked about earlier.

To be clear, movement towards usable stable versions is a good thing. However, this is one of the first major technical- and content-oriented initiatives being handled with money and oversight by the Wikimedia Foundation board of trustees, or a delegate thereof. And for that reason, to channel Dwight Eisenhower, this “is new in the Wikipedia experience.”

Andrew, I attempted to post the following, but I got an error message “Error: This file cannot be used on its own.” when trying to post a comment on your blog.

Luca de Alfaro gave a presentation at Wikimania 2007, and has been in touch with both Sue Gardner and our Technical Staff. Tim Starling commented about his work here:

http://www.nabble.com/Re%3A-%22Software-Weighs-Wikipedians%27-Trustworthiness%22-p12011354.html

Please be sure to read Luca’s actual paper before commenting on any of the potential problems with his approach:

http://www.soe.ucsc.edu/~luca/papers/07/wikiwww2007.pdf

We have provided Luca with the kind of live feed that we normally only give to companies to do his research in real time, and right now he’s working to process a full dump of the English Wikipedia. I have suggested that we could then offer a MediaWiki “tab” that could show the articles with trust coloring overlay.

Initially this could be something that editors add by modifying their user JavaScript, like navigation popups and countless other tools. The trust coloring itself would run on Luca’s servers (but inside a MonoBook skin).

After my conversations with Jim Giles this was condensed into “incorporated into Wikipedia” in the New Scientist article, which is an error (we’re going to send a correction on Monday). It’s not Jim’s mistake, though, as he sent me my quotes for approval, and I overlooked that particular part.

In essence, anything we do with Luca’s work will be done in stages, and with plenty of time for community feedback and so forth.

That said, I personally think that the kind of “overlay” functionality that Luca could provide (trust coloring for Wikipedia articles) is one of many overlays that could be useful. Wikipedia is a treasure for data miners, and in my opinion, it would be neat to think of a way to integrate recent research directly into the site, similar to the way Google Earth integrates content overlays.

Wikimedia Foundation moving to California

Friday, September 21st, 2007

Sue Gardner of the Wikimedia Foundation announced today the move of the foundation’s offices from St. Petersburg, Florida to San Francisco, California. It makes sense from a technology point of view and may be advantageous for Asia and Australia interaction, though it may not be so good for European cross-collaboration. Some of the announcement details:

In making this decision, we assessed five major cities: Boston, London, New York, San Francisco and Washington, DC - as well as St. Petersburg itself. The upshot: after a fairly detailed analysis, I recommended to the board that the Foundation relocate to San Francisco, and the board accepted that recommendation.

[…]

Here is what’s planned at this point:

- The new office will open sometime this winter. We’ll probably start out in downtown San Francisco, until we get our bearings and choose a permanent location.
- The St. Petersburg office will close late this winter, probably at the end of January.
- We know that many people’s personal circumstances will make it impossible for them to move, but we are hoping that some of the current staff will be able to come with us.
- The servers will remain in Tampa indefinitely. If we do choose to move them, that would be a separate, subsequent decision. At this point, it’s not under active consideration.

Wikipedia 2.0? Hold on now…

Friday, September 21st, 2007

New Scientist has a new article out about Wikipedia’s “stable versions” proposal as a way to address criticism about how to trust articles that are constantly in flux. The idea is that there will be some type of rating system and selection of a presentable version for ordinary passersby.

Jimmy Wales announced the push for this initiative in August 2006, and the German Wikipedians have been working on implementing this as a pilot. While it’s being implemented later than expected, the New Scientist piece does a decent job explaining the impetus for it, and some of its features.

But then things in the article get oh-so-strange, and it’s caused a bit of a firestorm behind the scenes.

The article goes on to describe “trust ratings” for users, based on the work of Luca de Alfaro at UC Santa Cruz and the color coding system. This was a shock to me when I read it, and I consider myself moderately in-the-know.

Specifically in the article, they mention the feature as described by Erik Moeller, the Wikimedia Board of Trustees member most involved with this:

As well as relying on trusted editors, Wikipedia’s upgrade will involve automatically awarding trust ratings to chunks of text within a certain article. Moeller says the new system is due to be incorporated into Wikipedia within the next two months, as an option for the different language communities.

The software that will do this, created by Luca de Alfaro and colleagues at the University of California, Santa Cruz, starts by assigning each Wikipedia contributor a trust rating using the encyclopedia’s vast log of edits, which records every change to every article and the editor involved. Contributors whose edits tend to remain in place are awarded high trust ratings; those whose changes are quickly altered get a low score. The rationale is that if a change is useful and accurate, it is likely to remain intact during subsequent edits, but if it is inaccurate or malicious, it is likely to be changed. Therefore, users who make long-lasting edits are likely to be trustworthy. New users automatically start with a low rating. [Emphasis mine]
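To make the quoted idea concrete, here is a toy sketch in Python. It is not de Alfaro's actual algorithm, which I have not seen in code form; it only illustrates the stated rationale that editors whose changes survive earn trust, and that new users start low. The names `trust_ratings`, `INITIAL_TRUST` and `PRIOR_WEIGHT`, the 0-to-1 scale, and the simplified edit log of (editor, survived) pairs are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical constants, chosen only for illustration.
INITIAL_TRUST = 0.1   # assumed low starting score for a new user
PRIOR_WEIGHT = 5.0    # how many "virtual" edits the starting score counts for


def trust_ratings(edit_log):
    """Return a trust score between 0 and 1 for each editor.

    edit_log is an iterable of (editor, survived) pairs, where survived
    is True if the edit was still in place after later revisions.
    This simplified log format is an assumption, not Wikipedia's real one.
    """
    kept = defaultdict(int)
    total = defaultdict(int)
    for editor, survived in edit_log:
        total[editor] += 1
        if survived:
            kept[editor] += 1

    # Blend the low starting score with the observed survival rate, so
    # an editor needs a track record before the score can climb.
    return {
        editor: (INITIAL_TRUST * PRIOR_WEIGHT + kept[editor])
        / (PRIOR_WEIGHT + total[editor])
        for editor in total
    }


if __name__ == "__main__":
    log = (
        [("alice", True)] * 10   # long-lasting edits
        + [("bob", False)] * 10  # edits that were quickly changed
        + [("carol", True)]      # newcomer with a single surviving edit
    )
    for editor, score in sorted(trust_ratings(log).items()):
        print(editor, round(score, 3))
```

Even in this toy version, the newcomer with one good edit ranks below the veteran with ten, which is the part of the proposal that worries me most.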

When asked about this, Erik referred me to this page on meta that explains part of this rationale: Wikiquality.

What raises my concern is that this wiki page, created for “brainstorming”, was made available just days before the New Scientist article was published, and it seems the publication has taken it as gospel as to what will happen. I’m not aware of how many people have seen or vetted this idea.

I’ll leave my comments at that.

I’m eager to hear the response from the community about this proposal. Let’s just say we had a lively commentary on the WikipediaWeekly podcast just a few hours ago about this, and I’m sure a vigorous conversation will follow.

TechCrunch 40 Results

Thursday, September 20th, 2007

Last night, I was shooting the breeze with Kaiser Kuo about the tech scene in China, and TechCrunch40 came up as something that would be interesting to do this side of the Pacific. For those not familiar with the concept:

The format is simple: Forty of the hottest new startups from around the world will announce and demo their products over a two day period at TechCrunch40. And they don’t pay a cent to do this. They will be selected to participate based on merit alone. In fact, we’re even offering a $50,000 cash award and lining up other in-kind services and awards from a generous group of corporate sponsors. [ref]

As co-sponsor of the conference, Jason Calacanis announced the site mint.com won top honors. Their slogan is “refreshing money management,” while TechCrunch describes them as, “a personal finance application that lets users track and monitor their financials in one place without the need of routine maintenance or accounting knowledge. Their application tracks bank, credit union and credit card transactions and alerts users to upcoming bills, low balances or unusual spending.”

That’s pretty slick, but not entirely new. Back in the dot-com era (I recall 1999 or so, but I’m not entirely sure) I used the site Yodlee.com as an account aggregator that tracked investments and balances for you. It’s pretty scary giving one site all your bank and credit card passwords to manage. For Yodlee, I started by entering one every few weeks until I was sure they weren’t going to fleece me and run off to the Caymans. Yodlee is still around, and I log in occasionally to check my “net worth” in their display.

In this industry it often takes a few generations for something to stick. YouTube wasn’t the first to share video, and Flickr was not the first to share photos. But they’re certainly the big ones getting attention now. It’s quite curious what makes these services stick when others fail.

Also of note, Kaltura.com took the People’s Choice award at the conference. I had the pleasure of meeting Shay David, co-founder of Kaltura at Wikimania 2007, where he demoed for me their new video editing in a Web page feature using Adobe Flash. It was really slick, and supported a wiki-like video application. This is perhaps the holy grail of the video production world — supporting meaningful video editing collaboration, and Kaltura really impressed me with what they could do within a Web browser.

Can’t wait to play with both of these when I get more time, and when their sites are not completely swamped with traffic (like mint.com right now).

Yahoo! mash, their SNS site

Saturday, September 15th, 2007

In keeping with its trailing-edge tendencies, Yahoo! has had to play catch-up again in the Web 2.0 space. This time it’s social networking.

Let’s do a quick stroll down Yahoo’s road of broken dreams. Yahoo! didn’t do much with GeoCities when it acquired it in 1999, even though sites like MySpace and Xanga cashed in on user-gen sites much later. For audio and video, even though it acquired Broadcast.com early on (making Mark Cuban a billionaire) it doesn’t have a real video product like YouTube or an audio success like iTunes Music Store. Yahoo had to buy its way into photos with Flickr, even though it seems activity there has flattened out. And it only recently upgraded its Yahoo! Mail to become AJAX and Web 2.0 savvy. You can imagine why investors are disheartened by Yahoo! and its strategic direction.

So on that record, Yahoo! has put its entry into the social networking arena — Yahoo! mash.

One tech web site described it best — Xanga + Facebook + Wiki = mash.

It has the sparse but customizable HTML-ability of Xanga/MySpace, the modular components of Facebook, and for an interesting twist, a wiki-like ability for any of your friends (or anybody at all, if you like) to edit your profile page.

That last “wiki” feature is perhaps the only thing that will turn heads. It’s invite only for now, which makes it a pain to find folks you know. Isn’t the whole point to be able to search and find folks? And once you’ve used Facebook’s clean slick interface, using Yahoo! mash seems like going from Prada to Walmart.

Especially amateurish is their “Mash Pet” feature which appears to be what an engineer scribbled on a whiteboard with five seconds of thought. Iconic and cute it is not.

But they might have something with the wiki idea. It’s got the right potential signal/noise ratio to make it interesting if it can handle rich media.

Send me a note if you want to be invited “in”.

Two Million English Wikipedia articles! Celebrate?

Monday, September 10th, 2007

This weekend, the English Language Wikipedia surpassed two million articles with the creation of [[El Hormiguero]], an article about a Spanish-language television show.

Interestingly, in the weeks preceding this event, there were many on the internal Wikimedia mailing lists who thought this milestone should be downplayed. Article count in itself does not mean much, but it’s still an important achievement in the history of Wikipedia. What it says to me, though, is that the core Wikipedia community is feeling somewhat empty and lost for direction.

The “community” doesn’t really know what motivates and excites them anymore. The thrill of new article creation is gone. There was a time when every article count, mention of Wikipedia in the newspaper or television story about the community was enthusiastically heralded on the lists. There were virtual slaps on the back, digital high fives and a hurrah in the community. But the euphoria of being a revolutionary and disruptive project enabled by the Internet is now simply nostalgia. Wikipedia articles routinely show up in Google searches, while the public’s expectations of quality have gone higher and higher. Wikipedia has become an indispensable part of the Internet landscape. It is no longer just a cool free novelty.

As a result, Wikipedia’s volunteer culture has shifted dramatically from being rogue and revolutionary to being staid and conventional, both in content and in policy.

Instead of two million articles being a time to celebrate, El Hormiguero shows the challenges Wikipedia faces. If you’ve seen my recent blog posts or my Wikimania 2007 presentation, you can probably guess what happened to our dear article. Yes, it was promptly listed on Articles for Deletion by User:Alkivar within 24 hours of creation with the note:

Subject is a non notable tv show from Spain. It fails Wikipedia:Television_episodes content guidelines. There are no google news results once you do a google search and strip out blogs, youtube/google video, wiki clones, and the tv network cuatro who hosts it, you find more references to Nicaragua than you do the TV show. It has not “received significant coverage in reliable sources that are independent of the subject.”

The deletion didn’t get any traction. Eight folks voted right away to keep the article, rightly pointing out that this is a successful show. Depending on English-language sources for a Spanish show is not the best tactic for research. So it appears it will be kept, if for no other reason than that it would be embarrassing to tell the world the two millionth article in English Wikipedia didn’t survive 24 hours.

I’ve mentioned what I’ve seen become the main activities in Wikipedia — deleting, pruning, citing and challenging contributions, which makes it a very different atmosphere than the original “anyone can edit” culture that brought the first waves of contributors. Today, there is more concern about keeping the bozos out, and making Wikipedia more “respectable.”

It’s clear now that Wikipedia’s growth curve is starting to get clipped. The latest graphs in [[Wikipedia:Modeling Wikipedia’s growth]] show that the top of the S-curve is in sight, and that “slope” is starting to decrease. This is not unique to the English edition, as German has seen the same phenomenon. The best estimates show that English Wikipedia may not double in size anytime soon.
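For readers who want to see what "the top of the S-curve" means in numbers, here is a minimal Python sketch of a logistic growth curve. The parameters are hypothetical placeholders picked only to show the shape; they are not fitted to real Wikipedia statistics, and the function names `logistic` and `slope` are my own.

```python
import math


def logistic(t, capacity, rate, midpoint):
    """Article count at time t under a logistic (S-curve) model."""
    return capacity / (1.0 + math.exp(-rate * (t - midpoint)))


def slope(t, capacity, rate, midpoint):
    """Growth rate dN/dt = rate * N * (1 - N / capacity)."""
    n = logistic(t, capacity, rate, midpoint)
    return rate * n * (1.0 - n / capacity)


if __name__ == "__main__":
    # Hypothetical parameters, NOT estimates from real data.
    CAPACITY = 4_000_000   # assumed eventual plateau in articles
    RATE = 1.0             # assumed growth constant per year
    MIDPOINT = 0.0         # year of fastest growth, taken as "now"

    for year in range(-3, 4):
        print(
            year,
            round(logistic(year, CAPACITY, RATE, MIDPOINT)),
            round(slope(year, CAPACITY, RATE, MIDPOINT)),
        )
```

In this model new articles arrive fastest at the midpoint, when the count is half the plateau; after that the slope shrinks every year even though the total keeps rising, which is the pattern the growth graphs appear to show.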

The lack of community clarity on “what’s next” is because of Wikipedia’s coming phase — quality. At Wikimania 2006, Jimmy Wales proclaimed the next challenge was “quality” rather than “growth.” A feature called “stable” or “checked” versions was put forth by members of the German Wikipedia as a way for vetting versions of an article, so they could be checked or marked as non-vandalized, accurate, or some other criteria. A set of authorized users would be able to “mark” or “rate” versions of articles. The Flagged Revisions feature was discussed at the conference and online afterwards with hopes of implementing it within the next year. But one year later, at Wikimania 2007, there was no news as to when the feature would go live, or even have a public test.

To be fair, implementing this feature is pretty hard, functionally and culturally. It drastically changes the nature of the community. By adding this un-wiki feature of encapsulating quality in a metric value, it goes against a wiki culture that always encouraged critical thinking and careful individual evaluation of articles. By giving the public a thumbs up or down, Wikipedians would be vouching for the article and giving it some type of certification. That would be an entirely new role for the community.

Also, the user interface for this feature will be challenging for average users and administrators of the site.

For public users, do you show latest version as we do now, or do you show the last non-vandalized version, or the non-vandalized and fact-checked version? How do you toggle among them, and prevent user confusion about what is being displayed?

For administrators, each rating or action is stored in the database, but it’s another vector for vandalism and trolling. How do you monitor all these actions effectively? The rating feature would dramatically affect the bulk of Wikipedia’s operations when doing diffs, reviewing logs, etc. The rating feature is a very generic one, and there has been no large scale community discussion or consensus on what would be appropriate to rate, or even what the ratings would mean. So Flagged Revisions is in the “cooking” stage, ready to be tested by a small circle of folks by the end of 2007, but getting this feature into the mainline Wikipedia will no doubt be controversial.

It’s a tough road ahead for the community. They have to realize that Wikipedia will not be ever increasing, and this is a special time in its history before it inevitably goes into “maintenance mode.” There have been warnings for years, and people don’t want to see Wikipedia turn into “another DMOZ.”

I don’t think Wikipedia will collapse into a DMOZ state of affairs. A directory of links and sites like DMOZ can get stale faster than a New York bagel, but human knowledge crafted by Wikipedia’s community has a much longer shelf life. However, the current drive-by culture of overly bureaucratic rules and regulations could turn Wikipedia into a positively dreadful place to hang out, and could stunt its growth and quality if we are not careful.

While doing research for the book about Wikipedia, I’ve found Jane Jacobs’ book The Death and Life of Great American Cities extremely relevant. Jacobs, an urban activist, was not just fighting to preserve New York from the bulldozer of uber-developer Robert Moses, she prescribed ways to keep a big city feeling intimate and personal. The book is about how to keep citizens in contact with each other, to always make sure sidewalks provide individuals with interaction and to maintain a sense of humanity in what could easily become a faceless impersonal jungle of concrete and steel.

Wikipedia needs to learn from this.

It needs to remain people-centered and willing to suffer some inefficiencies in order to keep the community in control, and not at the mercy of a multitude of incoherent policies. Some of the community norms being adopted now are like Moses’ expressways, ripping an institutional path of destruction and uprooting communities at work. Criteria for Speedy Deletion and Requests for Adminship are perhaps the worst of these in the Wikipedia universe.

Regardless of whether it’s 3 million or 5 million as the final “sweet spot” of English Wikipedia, it will be interesting seeing how the community survives while getting there.

Barcamp Beijing 2007

Monday, September 3rd, 2007

This past Sunday marked the first-ever Barcamp held in Beijing, which turned out to be an upbeat gathering showcasing the potential of a grassroots tech community here.

Some of the themes discussed at the “unconference” included business planning, startup advice, translation, Web 2.0 applications (like twitter), China’s economic position, Wikipedia (yours truly), the Great Firewall, T-shirts 2.0, gaming industry in China and Creative Commons in China.

What exactly is Barcamp? Even those attending may not know the origins, so here’s the 30 second summary. Publisher Tim O’Reilly has an exclusive “Foo Camp” for Friends Of O’Reilly in Northern California each year, where he invites a select techno elite to meet and create a conference agenda on the spot and hang out. He calls it the “wiki of conferences.” After the second year of FOO Camp, some tech folks were annoyed that it was so closed a group. Even invitees from one year were not always invited to the next year’s event, which caused some angst. So geeks in the San Francisco area decided to have an alternative “Barcamp” at the same time (see [[Foobar]] in Wikipedia for the techie cultural significance of this) where anyone could come and have an unconference of their own. The idea became viral, and now there are Barcamps around the world, as an ad hoc gathering of techies with the common interest of sharing know-how and ideas.

Typically how Barcamp works is folks arrive ready to discuss, present or demonstrate something. You write your idea on a yellow PostIt note, and stick it on the board. After all interested folks have put up their proposals, they are either voted on or just organized by the conveners into 30 minute time slots throughout the day. At Barcamp Beijing there were slightly fewer proposals than slots, so each one got a slot.

In reality, many presentations are really just an excuse to get conversation going as the most useful learning happens in the hallways and side discussions.

In China, Shanghai has always been the more progressive city for business and technology, so last year they hosted the first Barcamp in China. This year marked the first one in Beijing, and there was an average of 60 or so people at any one time, with a total attendance of around 100 in all. Held at the slick facility of Orange Labs/France Telecom in Haidian, northwest Beijing, it’s right in the heart of the university and technology park district. This is where you’ll find Tsinghua, Peking, Renmin and other universities and the offices of Microsoft, Google and other tech companies.

Hopefully this marks the advent of more ad hoc gatherings in the tech community here.

The grassroots, unpredictable nature of these plan-on-the-spot unconferences makes them uncomfortable for the authorities here, but perhaps they’ll see these do much more good than harm. The free flow of ideas and contacts is absolutely necessary if China wants to be competitive in the software world with Bangalore, if not Silicon Valley. Otherwise, the PRC will continue to be stuck at the bottom of the value chain, simply being a cheap source for hardware manufacturing and assembly.

Great job by Kris Krug, Robert Scales, Orange Labs and the rest of the folks who helped out. I hope this can be replicated more.

I’ll post shortly with some session summaries and reflections, but in the meantime you can find some good summaries with MeMedia and Jodi Xu.