A Jury of Your Peers: Quality, Experience and Ownership in Wikipedia

Authors

Abstract

Wikipedia is a highly successful example of what mass collaboration in an informal peer review system can accomplish. In this paper, we examine the role that the quality of the contributions, the experience of the contributors and the ownership of the content play in the decisions over which contributions become part of Wikipedia and which ones are rejected by the community. We introduce and justify a versatile metric for automatically measuring the quality of a contribution. We found little evidence that experience helps contributors avoid rejection. In fact, as they gain experience, contributors are even more likely to have their work rejected. We also found strong evidence of ownership behaviors in practice despite the fact that ownership of content is discouraged within Wikipedia.

Research Question

How do quality, experience and ownership affect which changes are rejected in Wikipedia?

Methods

To answer this question, we constructed a logistic regression model. The boolean outcome variable is whether or not a revision was reverted. The explanatory variables relate to the experience of the editor making the change, the recent quality of that editor's work and the ownership structure of the edited version of the article.

We randomly sampled 1.4 million revisions from the January 2008 dump of English Wikipedia. Within our model, a revision is an observation.
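As a rough illustration of the kind of model described above, the following sketch fits a logistic regression by plain gradient descent using only the Python standard library. The feature names (tenure_years, recent_quality, toes_stepped_on), the toy rows and the fitting routine are assumptions made for this example. They are not the variables, data or software used in the study.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_revert_probability(weights, x):
    """weights[0] is the intercept; weights[1:] pair with the features in x."""
    return sigmoid(weights[0] + sum(w * v for w, v in zip(weights[1:], x)))

def fit_logistic(rows, labels, learning_rate=0.1, epochs=2000):
    """Fit a logistic regression by batch gradient descent on the log-loss."""
    weights = [0.0] * (len(rows[0]) + 1)
    for _ in range(epochs):
        gradient = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            error = predict_revert_probability(weights, x) - y
            gradient[0] += error
            for j, value in enumerate(x):
                gradient[j + 1] += error * value
        weights = [w - learning_rate * g / len(rows)
                   for w, g in zip(weights, gradient)]
    return weights

# Invented toy observations: one row per revision.
# Columns (hypothetical): tenure_years, recent_quality, toes_stepped_on
rows = [
    (0.1, 0.2, 0), (0.5, 0.6, 0), (1.0, 0.8, 1), (2.0, 0.7, 0),
    (0.2, 0.3, 4), (1.5, 0.9, 5), (3.0, 0.5, 6), (0.3, 0.1, 3),
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = the revision was reverted

weights = fit_logistic(rows, labels)
print(predict_revert_probability(weights, (1.0, 0.5, 6)))
```

In this toy data the reverted revisions are exactly those that remove the work of several editors, so the fitted coefficient on toes_stepped_on comes out positive. A real analysis would use a statistics package and report standard errors for each coefficient.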


Two models of the effects of quality, experience and ownership on the probability of being reverted. On the left is the hypothesized "ideal" model. On the right is the actual model modified for the effects we saw in Wikipedia.
(+) means a positive correlation.
(-) means a negative correlation.
(0) means no correlation.

First, we propose an ideal model (left) of how quality, experience and ownership should affect the probability of being reverted. We then fit the model to our sample to determine the actual effects. Finally, we describe how the actual model (right) deviates from the ideal model in Wikipedia.

Major Contributions

The following sections represent the three major contributions of this work.

Word persistence as a measure of contribution quality.

A toy example of six revisions of an article. In this example, the editor named "Steve" adds the word "apple" to the article in the first revision. That word then "persists" for at least five subsequent revisions.

For this work, we developed a metric, based on previous work by Priedhorsky et al. and Adler et al., that we call word persistence. Word persistence is simply a measure of how long a word lasts through the revision history of an article. We use word persistence to estimate the quality of a contribution based on the assumption that editors who edit an article act as reviewers of its content. We assume that the more revisions that take place without removing a word from the article, the higher the quality of the original contribution that added it.
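The idea can be sketched in a few lines of Python. This is a simplified illustration, not the implementation used in this work: it treats each revision as a set of whitespace-separated words, whereas a real implementation would diff revisions and track individual tokens. All editor names other than "Steve" and all revision texts are invented for the example.

```python
def word_persistence(revisions):
    """Count, for each newly added word, how many subsequent
    consecutive revisions still contain it.

    revisions: list of (editor, text) tuples in chronological order.
    """
    results = []
    previous_words = set()
    for index, (editor, text) in enumerate(revisions):
        current_words = set(text.split())
        # A word counts as "added" if it was absent from the previous revision.
        for word in sorted(current_words - previous_words):
            survived = 0
            for _, later_text in revisions[index + 1:]:
                if word not in later_text.split():
                    break  # the word was removed, so it stops persisting
                survived += 1
            results.append({
                "word": word,
                "editor": editor,
                "revision": index,
                "persisted": survived,
            })
        previous_words = current_words
    return results

# Toy history of six revisions, modeled on the "apple" example above.
revisions = [
    ("Steve", "apple"),
    ("Ann", "apple banana"),
    ("Bob", "apple banana cherry"),
    ("Ann", "apple cherry"),
    ("Cat", "apple cherry date"),
    ("Bob", "apple date"),
]

persistence = {r["word"]: r["persisted"] for r in word_persistence(revisions)}
print(persistence)  # {'apple': 5, 'banana': 1, 'cherry': 2, 'date': 1}
```

Words that survive to the latest revision, such as "apple" here, have persisted for at least the number of revisions counted, since the history may continue.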

In addition to simple tests for interference with other variables and for independent significance in our statistical prediction model, we found a significant correlation between editors with a recent history of high word persistence edits and increases in the Wikipedia 1.0 Assessment rating of the articles they edit.

Editors don't exhibit a learning effect.

This figure shows two plots that compare the probability of having a change rejected against tenure, the amount of time an editor has been editing Wikipedia. The plot on the left represents edits by all editors while the plot on the right represents edits by editors who continue editing for at least a year.

We hypothesized that editors would exhibit a learning effect in Wikipedia through which the probability of their changes being rejected would decrease as they gained experience. We did find that experience was useful for predicting when a change would be rejected; however, when we controlled for the amount of time an editor would eventually continue to edit Wikipedia, we found that the effect we saw was not, in fact, an effect of experience. Instead, there seems to be a drop-out effect whereby editors who are reverted often do not stay long in Wikipedia.

In the figure above, the probability of being reverted (having a change rejected) appears to be affected by tenure when we sample all editors (left). However, when we control for lifetime, the amount of time an editor will continue to edit, by subsampling only those editors who will stick around for at least a year (right), the effect of tenure disappears.
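The subsampling step can be illustrated with a small sketch. The record fields (tenure_days, lifetime_days, reverted), the bucket width and the toy records are all hypothetical and chosen only to show how a drop-out effect can look like a learning effect. They are not data from the study.

```python
from collections import defaultdict

def revert_rate_by_tenure(revisions, bucket_days=180, min_lifetime_days=None):
    """Return {tenure bucket: share of revisions reverted}.

    If min_lifetime_days is given, keep only revisions by editors whose
    eventual editing lifetime is at least that long.
    """
    totals = defaultdict(int)
    reverts = defaultdict(int)
    for rev in revisions:
        if min_lifetime_days is not None and rev["lifetime_days"] < min_lifetime_days:
            continue
        bucket = rev["tenure_days"] // bucket_days
        totals[bucket] += 1
        reverts[bucket] += int(rev["reverted"])
    return {bucket: reverts[bucket] / totals[bucket] for bucket in sorted(totals)}

# Invented records: short-lived editors are reverted often and then leave,
# while long-lived editors are reverted at the same rate throughout.
toy_revisions = [
    {"tenure_days": 5, "lifetime_days": 30, "reverted": True},
    {"tenure_days": 20, "lifetime_days": 30, "reverted": True},
    {"tenure_days": 10, "lifetime_days": 500, "reverted": False},
    {"tenure_days": 50, "lifetime_days": 500, "reverted": True},
    {"tenure_days": 200, "lifetime_days": 500, "reverted": False},
    {"tenure_days": 300, "lifetime_days": 500, "reverted": True},
]

all_editors = revert_rate_by_tenure(toy_revisions)
long_lived = revert_rate_by_tenure(toy_revisions, min_lifetime_days=365)
print(all_editors)  # {0: 0.75, 1: 0.5}
print(long_lived)   # {0: 0.5, 1: 0.5}
```

Across all editors, the revert rate appears to fall with tenure, but only because the frequently reverted editors have left. Among editors who stay at least a year, the rate is flat.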

Ownership has a powerful effect over rejection of contributions.

This figure shows the effect that stepping-on-toes (x axis) has on the probability of being reverted (y axis).

When we examined the effect of our metric for ownership, stepping-on-toes or "removing words added by other editors when they are likely to notice", we found that it had both a powerful and independent effect on the probability of a change being rejected in Wikipedia.

The number of "toes" stepped on by an editor was a very powerful predictor. This plot shows that, if you remove the work of nine other editors when they are likely to notice your change, you have approximately a 50% chance of being reverted.

The number of "toes" stepped on was also particularly independent from the other predictors in our model. Essentially, what that means is this: No matter how much experience you have, how high in quality your edits generally are or how often your changes are accepted by the Wikipedia community, if you step on other editors' toes, you are likely to be reverted.
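A minimal sketch of how the number of "toes" stepped on might be counted follows. This page does not specify how "likely to notice" is determined, so the example assumes, purely for illustration, that an editor is likely to notice if they were active within a recent window of days. The function name, its parameters, the 30-day window and the toy data are all hypothetical.

```python
def toes_stepped_on(removed_words, word_authors, editor,
                    last_active_day, today, notice_window_days=30):
    """Count distinct other editors whose words were removed and who are
    assumed likely to notice (active within notice_window_days)."""
    toes = set()
    for word in removed_words:
        author = word_authors.get(word)
        if author is None or author == editor:
            continue  # unknown author, or the editor removing their own word
        days_inactive = today - last_active_day.get(author, float("-inf"))
        if days_inactive <= notice_window_days:
            toes.add(author)
    return len(toes)

# Invented example: "Cat" removes four words on day 100.
word_authors = {"apple": "Steve", "banana": "Ann", "cherry": "Bob", "date": "Cat"}
last_active_day = {"Steve": 95, "Ann": 10, "Bob": 99}

toes = toes_stepped_on(
    removed_words=["apple", "banana", "cherry", "date"],
    word_authors=word_authors,
    editor="Cat",
    last_active_day=last_active_day,
    today=100,
)
print(toes)  # 2 (Steve and Bob; Ann is inactive and "date" is Cat's own word)
```

Counting distinct editors rather than removed words matches the description above, in which the predictor is the number of other editors whose work is removed.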
