
TUTORIAL 2: for Semester 2 Week 3

Evaluating Conversations

The Loebner Prize Competition in Artificial Intelligence was established in 1990 by Hugh Loebner and the Cambridge (Massachusetts) Center for Behavioral Studies. It is awarded annually to the designer of the computer system that best succeeds in passing a variant of the Turing Test. In 1997, $2,000 and a bronze medal were awarded to David Levy, designer of the Most Human Computer as rated by a panel of 5 judges, for his program Converse. Second place went to Jason Hutchens and Bruce Cooper for SEPO. Third and fourth places were tied (based upon the rank scores assigned by the judges) and shared by Richard Gibbons (Julie) and Robby Glen Garner (Barry Defacto). Fifth was Robert E. Medeksza, with BOB OS.

See http://www.loebner.net/Prizef/loebner-prize.html for more background information.

Your task is to look at the transcripts of the conversations between various judges and the systems, and to comment on them. Transcripts of conversations generated by two of the programs, SEPO and Barry Defacto, are given here:

Extract of conversations between judges and Jason Hutchens and Bruce Cooper's SEPO

Extract of conversations between judges and Robby Glen Garner's Barry Defacto

(Note that these are extracts from the original text.)

A. For each program, select three sections of dialogue that illustrate good examples of human-like conversation between the judge and the program. Justify your selection by saying for each what the program did well and why you selected this example.

B. For each program, select three sections of dialogue that illustrate where the program fails to produce good human-like conversation between the judge and the program. Justify your selection by saying for each where the program failed and why you chose this example.

C. Imagine you were a judge. Give five questions you might ask and explain why you think these would be good questions. Speculate on how you think the programs might answer these questions.

D. Compare the two programs. What are their relative strengths and weaknesses? Illustrate your answer with reference to the examples given above. Which would you rate as best? (You may have to say what you mean by "best" to answer this.)

For information about other conversational agents, take a look at:

1. Elbot: developed by Artificial Solutions over the last 7 years. "It is our most intelligent bot with an enormous knowledge base, and he is currently used for advanced testing and sets the basis for the personality module. If you want to have a chat with Elbot, you can visit him at http://www.elbot.com/."

2. Hal, Alan and the nursery.... Ai Research is "a leading artificial intelligence research project. ...... Our expanding web site is an essential part of the emerging global discussion about artificial intelligence. On this website, we showcase the state of the art in pattern-matching conversational machines, demonstrated by Alan, and in reinforcement learning algorithms, demonstrated by HAL." "Ai is hosting a collection of 'Virtual Children': a group of HAL personalities that were trained by users who have agreed to make them available to the general public. If you wish to choose one of these Virtual Personalities to speak to, click on HAL Personalities to view the list".
