OpenAI wants a language model that creates a human-like chain of thought

OpenAI has published a new article outlining progress it has made in reducing the pervasive “hallucination problem,” in which AI (artificial intelligence) makes up things that aren’t true. The article compares two approaches to training reward models that detect hallucinations, called outcome supervision and process supervision.

In outcome supervision, the reward model is trained to provide feedback only on the AI’s final result. In process supervision, the reward model provides feedback on every individual step in a chain of thought, which encourages the model to follow a human-like line of reasoning.
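The difference between the two kinds of feedback can be illustrated with a small toy sketch. This is not OpenAI’s implementation: real reward models are learned neural networks, and the names here (`Step`, `outcome_feedback`, `process_feedback`, `first_error`) as well as the hand-written step labels are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Step:
    text: str
    is_correct: bool  # hypothetical human label for this single step


def outcome_feedback(final_answer: str, reference_answer: str) -> int:
    """Outcome-style signal: one label for the whole solution."""
    return int(final_answer.strip() == reference_answer.strip())


def process_feedback(steps: List[Step]) -> List[int]:
    """Process-style signal: one label per reasoning step."""
    return [int(step.is_correct) for step in steps]


def first_error(steps: List[Step]) -> Optional[int]:
    """Index of the first step labeled incorrect, or None if all are fine."""
    for index, step in enumerate(steps):
        if not step.is_correct:
            return index
    return None


# Toy solution to "simplify 16/64": the final answer is right,
# but it is reached through an invalid step.
solution = [
    Step("Write the fraction as 16/64.", True),
    Step("Cancel the digit 6 in the numerator and denominator.", False),
    Step("So the answer is 1/4.", True),
]

outcome = outcome_feedback("1/4", "1/4")
per_step = process_feedback(solution)
error_at = first_error(solution)

print("outcome feedback:", outcome)
print("process feedback:", per_step)
print("first faulty step:", error_at)
```

In this toy case the outcome-style signal cannot tell the faulty reasoning apart from a sound solution, because only the final answer is checked. The per-step labels point to the exact step where the reasoning went wrong.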

Both approaches were tested on a math dataset, and the process supervision method led to “significantly better performance,” OpenAI states in its research paper. It is worth noting that more work will be required to see whether it performs as well outside of math.

Explaining the possible implications of the process supervision method, OpenAI said, “If these results generalize, we may find that process supervision gives us the best of both worlds – a method that is both more performant and more aligned than outcome supervision.”

It seems too early to say how much this step-by-step verification will help to stave off hallucinations more generally. But given that hallucinations are probably the biggest problem with LLMs at the moment, hopefully it will be an effective solution.

OpenAI has not said when process supervision will be introduced in the publicly available ChatGPT service. The new method is still in the research phase and needs to be tested on general-knowledge tasks.

OpenAI notes that, although the initial results are good, safer methods can sometimes reduce performance, a cost known as the alignment tax. The results suggest that process supervision has so far not incurred such a performance penalty on math problems, but we don’t yet know how it will fare on more general tasks.


