The legendary Super Mario is being used to benchmark artificial intelligence

While a variety of benchmarks are used to measure the capabilities of artificial intelligence models, a new approach has recently drawn attention: playing Super Mario Bros. Hao AI Lab, a research organization at the University of California, tested popular artificial intelligence models by having them play Super Mario Bros. and obtained striking results.

In the experiment, Anthropic’s Claude 3.7 model performed best, followed by Claude 3.5. Google’s Gemini 1.5 Pro and OpenAI’s GPT-4o performed worse than expected.

Reasoning models fell victim to “thinking” too much

However, the test did not use exactly the same game as the 1985 classic. The game ran in an emulator and was integrated with a custom framework called GamingAgent, which let the artificial intelligence control Mario. The framework supplied screenshots along with simple instructions, such as jumping to avoid obstacles or enemies, so that the model could make its moves. The models steered Mario by generating Python code.
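The loop described above can be pictured roughly as follows. This is only a minimal sketch and not GamingAgent's actual code: `FakeController`, `fake_model`, `step`, the button names, and the `enemy_distance` field are all hypothetical stand-ins, and a small dictionary takes the place of a real screenshot.

```python
from dataclasses import dataclass, field

# Hypothetical instruction text of the kind the article describes.
INSTRUCTIONS = "If an obstacle or enemy is near, jump to dodge."


@dataclass
class FakeController:
    """Hypothetical stand-in for an emulator input layer; it only records presses."""
    pressed: list = field(default_factory=list)

    def press(self, button: str, frames: int = 1) -> None:
        self.pressed.append((button, frames))


def fake_model(instructions: str, frame: dict) -> str:
    """Hypothetical stand-in for a model call: returns Python source as a string."""
    if frame["enemy_distance"] < 3:
        return "controller.press('A', frames=10)"  # jump
    return "controller.press('RIGHT', frames=5)"   # keep walking


def step(controller: FakeController, frame: dict) -> str:
    """One turn: ask the model for code, then run it against the controller."""
    code = fake_model(INSTRUCTIONS, frame)
    # The generated code can only see the controller object, nothing else.
    exec(code, {"__builtins__": {}}, {"controller": controller})
    return code


controller = FakeController()
step(controller, {"enemy_distance": 10})  # nothing nearby: walk right
step(controller, {"enemy_distance": 2})   # enemy close: jump
print(controller.pressed)
```

In a real setup the model call would be a network request that takes an image as input, which is where the response time discussed in the article comes into play.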

According to the Hao AI Lab researchers, the test matters because it measures how well artificial intelligence can plan complex maneuvers and develop game strategies. Interestingly, the “reasoning” models, which work through problems step by step, did worse than models that respond more intuitively. OpenAI’s o1 model, which generally performs strongly on many benchmarks, failed here.

The main reason is that decision speed is critical in real-time games. Models like o1 need to “think” for a certain amount of time before making a move. In Super Mario Bros., however, a delay of even one second can cost the character its life.
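The timing argument can be made concrete with a small calculation. The sketch below assumes an enemy approaching at constant speed; the function name `survives` and all the numbers are illustrative assumptions, not measurements from the experiment.

```python
def survives(enemy_distance_px: float, enemy_speed_px_per_s: float, latency_s: float) -> bool:
    """Return True if the move arrives before the enemy reaches Mario.

    Assumes constant enemy speed and that the move takes effect immediately
    once the model has finished deciding (both are simplifications).
    """
    time_to_contact_s = enemy_distance_px / enemy_speed_px_per_s
    return latency_s < time_to_contact_s


# Illustrative numbers only: an enemy 48 px away, closing at 60 px per second,
# reaches Mario in 0.8 seconds.
fast_model = survives(48, 60, latency_s=0.2)  # answers well within 0.8 s
slow_model = survives(48, 60, latency_s=1.0)  # still "thinking" at contact
print(fast_model, slow_model)
```

Under these assumed numbers, a model that answers in 0.2 seconds jumps in time, while one that deliberates for a full second does not, regardless of how good its eventual decision would have been.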

Artificial intelligence has been tested with games for decades. However, some experts question whether gaming skill says much about an AI system’s general intelligence or about technological progress, because games tend to be more abstract than the real world, follow fixed rules, and in theory provide an unlimited amount of data.

TechnoPixel