ELIZA from the 1960s beat ChatGPT in the Turing test!

"Does GPT-4 Pass the Turing Test?" In a preprint research paper titled , two researchers from UC San Diego pitted OpenAI's GPT-4 AI language model against human participants, GPT-3.5 and ELIZA, to determine which one the participants would use.
 ELIZA from the 1960s beat ChatGPT in the Turing test!
READING NOW ELIZA from the 1960s beat ChatGPT in the Turing test!
In a preprint research paper titled “Does GPT-4 Pass the Turing Test?”, two researchers from UC San Diego pitted OpenAI’s GPT-4 AI language model against human participants, GPT-3.5, and ELIZA to see which was best at tricking participants into believing it was human. Along the way, the non-peer-reviewed study found that human participants correctly identified other humans in only 63 percent of interactions, and that a computer program from the 1960s outperformed the AI model that powers the free version of ChatGPT.

British mathematician and computer scientist Alan Turing first introduced the Turing test in 1950 under the name “the imitation game”. It has since become a famous but controversial benchmark for assessing a machine’s ability to imitate human conversation. In modern versions of the test, a human judge typically talks to either another human or a chatbot without knowing which is which. If the judge cannot reliably tell the chatbot from the human, the chatbot is said to have passed the test. The threshold for passing is subjective, so there has never been broad consensus on what success rate would count as passing.

ChatGPT lost to ELIZA

In the study, published on arXiv in late October, UC San Diego researchers Cameron Jones and Benjamin Bergen set up a website called turingtest.live, where they ran a two-player Turing test to see how well GPT-4, when prompted in different ways, could convince people it was human.

Through the site, human interrogators interacted with various “witnesses” representing either other humans or the AI models in question: the aforementioned GPT-4, GPT-3.5, and ELIZA, a rule-based conversation program from the 1960s. Roles were assigned randomly to everyone who took part, witnesses were instructed to convince the interrogator that they were human, and players matched with an AI model were always cast as the interrogator.
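To give a sense of how a rule-based program like ELIZA can hold up its end of a conversation, here is a minimal Python sketch of the core technique: match the user’s input against keyword patterns and echo the captured text back as a reflected question. The patterns and word swaps below are illustrative stand-ins, not Weizenbaum’s original DOCTOR script.

```python
import random
import re

# Illustrative ELIZA-style rules: regex pattern -> response templates.
# "{0}" is filled with the text the pattern captures from the user.
RULES = [
    (re.compile(r"i need (.*)", re.I),
     ["Why do you need {0}?", "Would it really help you to get {0}?"]),
    (re.compile(r"i am (.*)", re.I),
     ["How long have you been {0}?", "Why do you think you are {0}?"]),
    (re.compile(r"because (.*)", re.I),
     ["Is that the real reason?", "What other reasons come to mind?"]),
]

# Pronoun swaps so the captured fragment reads naturally when echoed back.
REFLECTIONS = {"i": "you", "me": "you", "my": "your",
               "am": "are", "you": "I", "your": "my"}

def reflect(text: str) -> str:
    """Swap first- and second-person words in the captured fragment."""
    return " ".join(REFLECTIONS.get(w.lower(), w) for w in text.split())

def respond(user_input: str) -> str:
    """Return the first matching rule's response, or a generic fallback."""
    for pattern, templates in RULES:
        match = pattern.search(user_input)
        if match:
            return random.choice(templates).format(reflect(match.group(1)))
    return "Please tell me more."

if __name__ == "__main__":
    print(respond("I need a vacation"))  # e.g. "Why do you need a vacation?"
```

Because every reply is either a reformulation of the user’s own words or a vague prompt to continue, such a program can appear attentive without modeling language at all, which helps explain how it can fool some interrogators.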

According to the paper, 652 participants completed a total of 1,810 sessions, of which 1,405 games were analyzed. Surprisingly, ELIZA, developed by computer scientist Joseph Weizenbaum at MIT in the mid-1960s, scored relatively well, achieving a 27 percent success rate. GPT-3.5 achieved a success rate of 14 percent, below ELIZA and varying with the prompt. GPT-4 achieved a success rate of 41 percent, second only to real humans.

GPT-3.5, the model behind the free version of ChatGPT, has been specifically conditioned by OpenAI not to present itself as a human, which may partly explain its poor performance.

The most frequently used strategies during the sessions were small talk and questioning. Participants’ decisions were shaped chiefly by assessments of linguistic style and socio-emotional traits. The study also shows that neither education level nor familiarity with language models was decisive in detecting AI.

GPT-4 also fails

In the end, the study’s authors concluded that GPT-4 did not meet the success criteria of the Turing test: it neither achieved a 50 percent success rate (beating a 50/50 chance) nor exceeded the success rate of the human participants.

The researchers believe that with the right prompt design, GPT-4 or similar models could eventually pass the Turing test. The real challenge, however, lies in crafting a prompt that mimics the subtleties of human conversational style. Like GPT-3.5, GPT-4 has been conditioned not to present itself as human.

On the other hand, human witnesses in the sessions often failed to convince interrogators that they were, in fact, human. This may say more about the design of the test and the expectations of the interrogators than about human intelligence itself. The researchers also note that some participants engaged in “trolling” by pretending to be an AI.
