Horst is guilty of preventing me from doing university stuff. He wrote a grumpy rant about the shortcomings of Wikipedia.
I wrote several long comments trying to correct his picture of Wikipedia, among them this one:
What happens if you read a “random hit” from Google and notice a bug? Often it is not easy to find the author’s contact address, and even if there is contact information, there is no guarantee that the bug will ever be fixed.
There is no easy way to tell the world that this information is wrong.
The Wikipedia contributors believe that n+1 eyes find more bugs than n eyes.
I 100% agree with you that an encyclopedia is about trust. To help you decide how much to trust a Wikipedia article, you can also look at its whole history. If a fact has survived more than 100 edits, I would trust it more than one that was added only recently. (One feature I am still missing in Wikipedia is a “cvs annotate”-like functionality.)
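To make that wish concrete, here is a minimal toy sketch of what such a “wiki annotate” could look like: it attributes each line of the newest revision to the revision that introduced it. This is my own illustration, not an existing Wikipedia feature; the three-revision history at the end is made up.

```python
import difflib

def annotate(revisions):
    """Toy "wiki annotate": attribute every line of the newest revision
    to the revision (index into `revisions`, oldest first) that introduced it."""
    if not revisions:
        return []
    lines = revisions[0].splitlines()
    blame = [0] * len(lines)                  # every line starts in revision 0
    for rev, text in enumerate(revisions[1:], start=1):
        new_lines = text.splitlines()
        new_blame = [rev] * len(new_lines)    # default: line introduced now
        matcher = difflib.SequenceMatcher(None, lines, new_lines)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":                # unchanged lines keep their origin
                new_blame[j1:j2] = blame[i1:i2]
        lines, blame = new_lines, new_blame
    return list(zip(blame, lines))

# Made-up three-revision history of an article:
revs = [
    "Foo is a metasyntactic variable.",
    "Foo is a metasyntactic variable.\nIt is often paired with bar.",
    "Foo is a metasyntactic variable.\nIt is usually paired with bar.",
]
for rev, line in annotate(revs):
    print("r%d | %s" % (rev, line))
# r0 | Foo is a metasyntactic variable.
# r2 | It is usually paired with bar.
```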
Another comment from myself, about linking in Wikipedia.
I really should do more research on this topic and write a longer article in the summer holidays.
The good thing about it — giving everybody who visits the opportunity to share their knowledge — turned into a bad thing because there are just too many people out there who are feeling compelled to share their lack of knowledge.
This could be generalized to apply to the Internet or to publishing in general. And I really hope that this is not the future.
On the German Wikipedia mailing list there is currently a similar discussion about how to ensure quality.
If you are a Wikipedia contributor, I would really like to know whether you can confirm Horst’s claim that the quality of your contributions was decreased by follow-up edits.
Update 2004-06-03: Horst’s article is making a lot of noise in the German blogosphere. Some articles and comments are interesting to read:
- Mathias Schindler – Why wikipedia sucks less than TROTW*
- Der Schockwellenreiter – Scheffredakteur!? We don’t need no stinky Scheffredakteur!
- IT&W – Das Web im Web
(*Wow*, although I don’t understand the popularity and the general philosophy behind Schockwellenreiter and IT&W, this is the first time I link to them. The advantage of popular blogs is the many high-quality comments.)
Yes, I can confirm that the quality of some articles decreases if you do not watch them. But the next edit can improve them, too. A problem is the amount of work you have to spend correcting bad contributions and discussing with people who are not as well informed as you are.
But are there really many articles that are unwatched and decrease continuously? A lot of people use watchlists, and a lot of people use the Recent Changes page to watch changes. My theory is that bad articles have always been bad and that bad edits are reverted within a short timeframe.
Arved: Well, at least you seem to be one of the few people who understood my article as what it was: a grumpy rant, full of sarcasm, not to be taken too literally. Which makes your reaction more readable than some other people’s.
Yes, bad articles will (hopefully) eventually be corrected. But until they are dealt with, they are bad articles. You can’t guarantee that any article I’m looking at is correct at any given time. In an (admittedly unlikely) worst-case scenario, every single article I look at could be incorrect.
Wikipedia obviously ranks the social/technical/communal principle of the wiki higher than the correctness of its content. Which is, I guess, okay from a technical or even communicative point of view, but it’s not what a reference librarian like myself expects from reference material.
When engineers make encyclopedias
I find it interesting that the response to my anti-Wikipedia rant yesterday came mostly from two groups of people: People not involved in the Wikipedia project, who were mostly agreeing with me, or even expressing what I wanted to say more eloquently s…
People dealing with IT probably know about the difficulties of correctness proofs in software. It is nearly impossible to prove that a complex program fulfills a specification. The progress of IT is mostly based on “best effort”: it is possible that a computer program crashes every time you use it, but software developers try to limit the worst cases. This strategy has been successful in software development over the last decades.
The development of Wikipedia is similar. The worst cases get fixed first. All other bugs may get fixed in the future. But most people are able to use it, although they know it has bugs.
I think this is a very reasonable approach. There are edge cases where correctness is a must (think atomic power plants), but for the average user best effort is OK; they can live with a small error probability.
I agree that there are areas where people should not use Wikipedia as a source, e.g. everywhere knowledge affects life and death. I really hope GWB does not read in Wikipedia about the evilness of country XYZ and decide to invade it, and that my doctor does not read in Wikipedia how to transplant my heart.
The question is: Is there a quality assurance (qa) process in Wikipedia?
No? So why does everybody believe Wikipedia is a well-researched, unbiased, opinion-free and worthy resource?
Wikipedia is the work of amateurs. They believe that an endless number of humans may, somewhere in a distant future, produce a free encyclopedia that contains all information, wisdom and knowledge – remember the quote about monkeys and typewriters?
I think Horst is very right, but he forgot two main principles of information retrieval:
1. Don’t rely on one source only.
2. Don’t rely on one source only.
My personal extension on this topic:
Don’t rely on one search-engine, don’t rely on the world wide web.
The deep internal and external linking within Wikipedia is not a Wikipedia problem. Google started to count incoming links as a criterion of quality or interest, but that isn’t a good idea and it is easy to forge – as we all know, Google sucks even more.
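(As an aside, here is a stripped-down sketch of that link-counting idea – a simplified power-iteration PageRank, not Google’s actual implementation. The made-up “link farm” in the example shows how easily incoming links can be forged.)

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified power-iteration PageRank. `links` maps page -> list of
    pages it links to; dangling pages simply leak rank in this toy version."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    rank = dict.fromkeys(pages, 1.0 / len(pages))
    for _ in range(iterations):
        new = dict.fromkeys(pages, (1.0 - damping) / len(pages))
        for page, targets in links.items():
            if not targets:
                continue
            share = damping * rank[page] / len(targets)
            for target in targets:            # each incoming link adds rank
                new[target] += share
        rank = new
    return rank

# Made-up web: one honest page plus a tiny link farm boosting "spam".
web = {
    "honest": ["spam"],
    "farm1":  ["spam", "farm2"],
    "farm2":  ["spam", "farm1"],
    "spam":   ["farm1"],
}
for page, score in sorted(pagerank(web).items(), key=lambda kv: -kv[1]):
    print("%-6s %.3f" % (page, score))
```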
On the other hand: why is everybody reading, citing and linking to Joerg aka the Schockwellenreiter?
Joerg has no clue and is extremely biased; he associates UBE/UCE with Hormel Foods Corp. and believes he can stop UBE/UCE-related problems with forged e-mail addresses. I bet he and many other individuals like Wikipedia because Wikipedia has all the easy answers to their easy questions.
No Wikipedia user or author wants a qa process, because somebody could prove most of the Wikipedia authors and articles wrong, biased or incomplete. That would prove Wikipedia to be a waste of resources and effort.
It’s a very common problem. Face it: almost nobody likes complex, well-researched, high-quality, peer-reviewed, unbiased articles, or is able to produce, afford, understand and consume that hard stuff – your head may explode, your view of the world may change, you may change.
Almost everyone wants free (as in free beer), easy (as in easy to understand) information (did I say information – err, I mean biased entertainment) tailored to his needs. To name some names: the WWW, Wikipedia, blogs, Google, Schockwellenreiter – all are easy, biased and incomplete, and everybody who claims “Google knows” knows nothing at all (and isn’t aware of this fact).
Schopenhauer said: “Every man takes the limits of his own field of vision for the limits of the world.” and he is damn right.
Yes, there is a qa process in Wikipedia, but it just doesn’t scale.
I think a lot of people would like a qa process, but nearly nobody wants to do the hard work of reviewing for free. That’s one of the reasons why Nupedia failed.
I think Schopenhauer is right, and I think that’s the reason it is very difficult to write “complex, well researched, high quality, peer-reviewed, nonbiased articles”.
Conclusion: Mankind isn’t ready for a free encyclopedia.
Stefan: Your longish comment summarizes *excellently* what I wanted to express. Thank you.
My opinion is that the wiki (software and concept) is fine for ongoing work and for describing progress on a topic or a piece of work; it’s scrap paper.
I don’t see peer review in Wikipedia as a qa process at all, because there is no end to it.
If an article is (almost) complete, somebody will come along and rewrite it, possibly removing or replacing information, forcing others to raise the quality again, rewrite the article again and review it again, destroying the efforts of other authors and reviewers. That means there may be quality, but it is not assured that it stays that way. The Wikipedians compensate for a lack of concept, authors, skills or knowledge by raising their efforts again and again:
http://openfacts.berlios.de/index-en.phtml?title=Wikipedia_plans
http://www.heise.de/newsticker/meldung/47996 – looks like it’s not only the qa process that doesn’t scale (I bet Horst is ROTFLing his ass off right now).
Some thoughts:
What makes Matthias think that Wikipedia’s quality assurance (the “many eyeballs” idea) will work for Wikipedia exactly the same way it does for the Linux kernel? The Linux kernel uses a completely different approach, and you have to know at least a programming language (a skill) to contribute.
Why does the Wikipedia database store so much redundant information internally? Is it really necessary to be able to reproduce each article from its earliest origins? That sounds like vanity or a big MySQL testbench to me, not like software to carry out an encyclopedia project.
Why doesn’t Wikipedia elect a board of directors or reviewers who try to assure quality, seal or mark complete articles, and focus on other articles that are incomplete or wrong?
Why does every Wikipedian believe he can do better than Denis Diderot? Shouldn’t his attempt serve as a warning to each of them?
Ad qa: it is still being discussed whether good articles should be moderated. The question is who should decide that an article is good (or maybe good enough for mankind). A lot of changes are of a political nature.
When the Wikipedia people chose MySQL as backend, they probably didn’t expect it to grow that fast.
MySQL is nice for small and medium-sized sites, but not for sites that large.
Ad Linux kernel: I have seen enough people posting on open source projects’ mailing lists who can’t program. (“On the internet nobody knows you’re a dog.”)
Ad database size: the database is already split into two parts, cur and old (e.g. for de, cur is 87 MB and old is 1789 MB).
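To illustrate the redundancy question above: as far as I understand the schema, old keeps every revision as full text. Here is a toy comparison of that approach with storing only diffs between consecutive revisions (roughly what CVS does); the article history below is made up.

```python
import difflib

def full_copies(revisions):
    """Bytes used when every revision is stored as full text."""
    return sum(len(text) for text in revisions)

def current_plus_diffs(revisions):
    """Bytes used when only the current text plus unified diffs between
    consecutive revisions are stored (roughly the CVS/RCS approach)."""
    if not revisions:
        return 0
    size = len(revisions[-1])                         # current revision in full
    for older, newer in zip(revisions, revisions[1:]):
        diff = "\n".join(difflib.unified_diff(
            older.splitlines(), newer.splitlines(), lineterm=""))
        size += len(diff)
    return size

# Made-up history: a 200-line article where each edit touches one line.
base = "\n".join("Paragraph %d of a longish article." % i for i in range(200))
revisions = [base]
for n in range(20):
    revisions.append(revisions[-1].replace(
        "Paragraph %d of" % n, "Section %d of" % n))

print("full copies stored:", full_copies(revisions), "bytes")
print("current + diffs:   ", current_plus_diffs(revisions), "bytes")
```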
And I have to agree, it is impressive how Horst predicted the future.
“MySQL is nice for small and medium-sized sites, but not for sites that large.”
Try saying that to Jeremy Zawodny, head MySQL guru at Yahoo.
http://jeremy.zawodny.com/blog/
scaling mysql isn’t down to mysql itself – it’s down to the architecture of your app and the network of servers you have and they way they are configured.
the scaling is pushed down the stack so to speak.
Thanks for the late comment. Well, I searched Jeremy’s blog for “Wikipedia”, and the only entry I found was an article about Brad from LJ, who wrote a caching daemon called memcached to take load off MySQL; it is now used by Wikipedia too.
That was exactly my point: with sites like Wikipedia or LJ you hit the limits of MySQL fast.
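For the curious, this is roughly how such a cache in front of MySQL works: a minimal cache-aside sketch using the python-memcached client. The key scheme and the load_from_db callback are made up for illustration, not Wikipedia’s actual code.

```python
import memcache   # python-memcached client

mc = memcache.Client(["127.0.0.1:11211"])

def get_article(title, load_from_db):
    """Cache-aside lookup: ask memcached first, fall back to the database
    on a miss and store the result for the next reader.
    `load_from_db` stands in for whatever MySQL query the wiki would run."""
    key = "article:" + title.replace(" ", "_")   # memcached keys may not contain spaces
    text = mc.get(key)
    if text is None:                             # cache miss: hit MySQL once
        text = load_from_db(title)
        mc.set(key, text, time=300)              # keep it for five minutes
    return text

# Hypothetical usage:
# print(get_article("Encyclopedia", lambda title: run_expensive_mysql_query(title)))
```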