Operating an Employer Reputation System: Lessons from Turkopticon, 2008-2015
In November 2005, Amazon launched Mechanical Turk (AMT), a website where “requesters” can post tasks, called “Human Intelligence Tasks” or “HITs”, for workers to complete for pay. Workers are required to agree that they are independent contractors, not employees, and that they are therefore not entitled to minimum wage or other employment benefits. Requesters post tasks to the platform, workers choose and do tasks, and requesters then review and “approve” or “reject” the submitted work. Workers are paid for approved work; they are not paid for rejected work. Requesters can reject (i.e., decline to pay for) work for any reason.
In 2008, in response to reports from workers describing conditions of low pay, slow pay, poor communication, and arbitrary rejections (i.e., nonpayment), we designed Turkopticon, a website and browser extensions that workers can use to review requesters, mainly along criteria of pay, pay speed, fairness of evaluation, and communication. As of January 2016, 56,000 users have created accounts on the Turkopticon website and about 35,000 use one of the two browser extensions. Since early 2009, these users have posted 290,000 reviews of 42,000 requesters. As on AMT itself, only a small fraction of registered users are “active”: in an average month, about 1,000 workers post about 5,000 reviews; between December 16, 2015 and January 17, 2016, for example, 1,205 reviewers posted 5,205 reviews. Even among these “active” users, participation is very unequally distributed: in any given month, most users who post any reviews post exactly one, but a dozen or so post more than 50, and one or two post more than 100.
To our knowledge, most “professional” AMT workers use Turkopticon. And in 2014, in an ethically fraught but instructive experiment, a group of economists found that effective wages among requesters with “good” reputations on Turkopticon were about 40% higher than effective wages among requesters with “neutral” or “bad” reputations — and that requesters with good reputations attracted workers to their tasks at nearly twice the rate of requesters with bad reputations.
Despite these apparent successes, and generally favorable portrayals in the media, Turkopticon has serious problems that threaten its long-term usefulness to workers. Workers often disagree about how to review “properly”; these disagreements can become heated and even vicious, destroying trust and goodwill and draining participants emotionally and mentally. Turkopticon is also occasionally a site of the harassment, insults, sexism, racism, profanity, baseless accusations, and threats that trouble other online communities (and indeed offline worker communities and organizations). We have not developed robust processes for mediating these disagreements or moderating this harassment and incivility; indeed, our attempts to do so have generally produced further complications.
This paper describes AMT and Turkopticon and attempts to draw lessons from our experiences with Turkopticon for broader efforts to build worker power in the “on-demand economy.” The paper proceeds as follows. Part I describes AMT in detail: the kinds of tasks available, the kinds of requesters who post them, the process by which work is completed, common problems that arise in the course of work, and so on. It also describes the broader “ecosystems” of tools, practices, and discourses developed “around” AMT by both workers and requesters. Part II describes Turkopticon in detail: its design and operation, its outcomes and current problems, and ongoing discussions about possible future developments. Finally, Part III steps back from the technical and social details of AMT and Turkopticon to draw lessons from our experiences.