HSS
California Institute of Technology
Division of the Humanities and Social Sciences

The Two-Sided Turing Tournament

Human or Machine?

Imagine being told you would be playing online chess games against World Chess Champion Vladimir Kramnik and Deep Fritz (a chess program that, like IBM's Deep Blue before it, has taken on a world champion). Do you think you could tell which was which? Perhaps both games would end too quickly to make any such determination, but the underlying question is: can a human be fooled by a machine?

This is the concept behind the Turing test, proposed in 1950 by English logician and mathematician Alan Turing as a test for intelligent behavior in a computer algorithm. In the test, a human judge, engaging in wide-ranging conversation, attempts to determine whether he or she is interacting with another human or with a computer imitating human responses.

Now, imagine that HAL, the intelligent computer in Stanley Kubrick's 1968 film, "2001: A Space Odyssey," replaced you as judge. Do you think HAL could accurately detect whether its opponent was Kramnik or Deep Fritz? In other words, can a machine be programmed to distinguish natural human behavior from a sophisticated computer mimicking human behavior?

A Two-sided "Turing Tournament"

The same ideas can be used to develop social scientific theories of human behavior and to measure how good they are. Caltech researchers Jasmina Arifovic and Richard McKelvey point out that the development of social science theories can be likened to the task of building a computer to mimic human behavior or, equivalently, to building a computer that will pass the Turing test in the range of behavior covered by the theory. Thus, a social science theory can be deemed successful when it is no longer possible for a computer judge to tell the difference between behavior generated by humans and behavior generated by the theory (i.e., by a machine).

Based on these ideas, Caltech researchers plan to run a two-sided computer tournament, called the "Turing Tournament," to try to develop, at the same time, strong models of human behavior and good ways of telling the difference between human and machine behavior. Arifovic and McKelvey, together with post-doctoral student Svetlana Pevnitskaya, will apply this initially to the question of developing theories for how subjects play a repeated, two-person matrix-form game.

In the tournament, Caltech will solicit computer programs that can mimic human behavior, called "emulators." Also solicited will be computer algorithms designed to detect whether observed behavior is generated by humans or by machines, called "sniffers." After all entries are received, repeated rounds of a simple, matrix-form game will be played by humans and by emulators. The data generated from these rounds will then be presented to the sniffers, whose task is to determine which data are human generated and which are machine generated. The winning sniffer will do the best job of distinguishing between the human and machine data, and the winning emulator will do the best job of fooling the best sniffer. Significant monetary prizes will be awarded for the best emulators and sniffers in order to encourage the submission of entries representing the best current thinking on these questions.
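The tournament structure described above can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the organizers' software: the payoff matrix, the player and sniffer functions, the scoring rules (classification accuracy for sniffers, share of sessions labeled "human" by the best sniffer for emulators), and every name below are hypothetical. In particular, `sticky_standin` is a synthetic stand-in for human subjects, since no real laboratory data are available here.

```python
import random

# Hypothetical 2x2 payoff matrix: PAYOFFS[row][col] = (row payoff, column payoff).
PAYOFFS = [[(3, 3), (0, 5)],
           [(5, 0), (1, 1)]]

def play_repeated_game(row_player, col_player, rounds, rng):
    """Play the stage game repeatedly; return a list of (row, col) actions."""
    history = []
    for _ in range(rounds):
        row = row_player(history, 0, rng)
        col = col_player(history, 1, rng)
        history.append((row, col))
    return history

def total_payoffs(history):
    """Sum each player's payoff over a recorded session."""
    row_total = sum(PAYOFFS[r][c][0] for r, c in history)
    col_total = sum(PAYOFFS[r][c][1] for r, c in history)
    return row_total, col_total

# Players take (history, own index 0 or 1, rng) and return action 0 or 1.
def sticky_standin(history, me, rng):
    """Synthetic stand-in for a human subject (assumption, not real data):
    repeats its own previous action with probability 0.8."""
    if not history:
        return rng.randrange(2)
    last = history[-1][me]
    return last if rng.random() < 0.8 else 1 - last

def uniform_emulator(history, me, rng):
    """Hypothetical emulator: picks an action uniformly at random."""
    return rng.randrange(2)

def tit_for_tat_emulator(history, me, rng):
    """Hypothetical emulator: copies the opponent's previous action."""
    return history[-1][1 - me] if history else 0

# Sniffers take a session history and return True for "human", False for "machine".
def switch_rate_sniffer(history):
    """Guess 'human' when the row player rarely changes action."""
    if len(history) < 2:
        return True
    switches = sum(a[0] != b[0] for a, b in zip(history, history[1:]))
    return switches / (len(history) - 1) < 0.35

def always_human_sniffer(history):
    """Baseline sniffer that labels every session 'human'."""
    return True

EMULATORS = {"uniform": uniform_emulator, "tit_for_tat": tit_for_tat_emulator}
SNIFFERS = {"switch_rate": switch_rate_sniffer, "always_human": always_human_sniffer}

def run_tournament(emulators, sniffers, human_standin, rounds=50, sessions=40, seed=0):
    rng = random.Random(seed)
    # Generate "human" sessions and, for each emulator, machine sessions.
    human = [play_repeated_game(human_standin, human_standin, rounds, rng)
             for _ in range(sessions)]
    machine = {name: [play_repeated_game(e, e, rounds, rng) for _ in range(sessions)]
               for name, e in emulators.items()}

    # Sniffer score (assumed rule): fraction of all sessions labeled correctly.
    sniffer_scores = {}
    for name, sniff in sniffers.items():
        correct = sum(sniff(h) for h in human)
        total = len(human)
        for games in machine.values():
            correct += sum(not sniff(h) for h in games)
            total += len(games)
        sniffer_scores[name] = correct / total
    best_sniffer = max(sniffer_scores, key=sniffer_scores.get)

    # Emulator score (assumed rule): share of its sessions the best sniffer calls human.
    judge = sniffers[best_sniffer]
    emulator_scores = {name: sum(judge(h) for h in games) / len(games)
                       for name, games in machine.items()}
    best_emulator = max(emulator_scores, key=emulator_scores.get)
    return sniffer_scores, emulator_scores, best_sniffer, best_emulator

if __name__ == "__main__":
    print(run_tournament(EMULATORS, SNIFFERS, sticky_standin))
```

In this toy setup, tit-for-tat paired against itself never changes its action, so a sniffer that looks only at switching frequency labels it "human" every time. That is the two-sided dynamic the tournament is meant to encourage: a weakness exposed in one sniffer invites a better sniffer, which in turn demands a better emulator.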

Other Applications

The Turing Tournament raises fundamental, unsolved issues in game theory, computer science, econometrics/statistics, and experimental economics. Other applications of this methodology include detection of "program trading" in financial markets, modeling behavior in public good problems, evaluation of machine-translation programs, and building decision-making robots to take the place of humans in economics experiments. Some of these topics will be the focus of the Turing Tournament in future years.

One particularly fertile area is the question of program trading. The SEC has dealt with the problems caused by program trading by introducing market mechanisms such as "circuit breakers" to temporarily slow down or stop trading when prices become too volatile. However, these remedies introduce their own inefficiencies in the market. The Turing Tournament methodology developed here could be used instead to provide a way of detecting program trading, and hence provide alternative ways for the SEC to regulate it.

Another fruitful area of study is experimental economics. Here, with good models of human behavior in a voting setting, decision-making robots could be used in the place of human voters in experiments on candidate competition, to model voters' response to candidate behavior. This would allow experiments on equilibrium candidate behavior in large elections without having to pay for thousands of subjects to play the part of the voters. Instead, the only subjects needed would be the candidates.

Funding

Tournament organizers envision the Turing Tournament running for five years. The first year focuses on the repeated games described above. In subsequent years, other applications will be identified as the program evolves. Caltech has provided funds for the first phase of the project. Funding for the project's second phase, during which scientists will proceed to apply the methodology to specific problems, is now being sought. This funding will be tied in with the Social Science Experiment Laboratory at Caltech, which already has some of the personnel needed for the tournament, as well as the necessary laboratory facilities and access to subjects.