OpenAI wants a language model that creates a human-like chain of thought

OpenAI wants a language model that creates a human-like chain of thought to solve the "hallucination problem" of artificial intelligence.

OpenAI has published a new article outlining some of the progress it has made on the pervasive “hallucination problem,” where AI (artificial intelligence) makes up things that aren’t true. The article describes two approaches to training the reward models that give an AI feedback on its answers, called outcome supervision and process supervision.

In outcome supervision, the reward model is trained to provide feedback only on the AI’s final result. In process supervision, the reward model provides feedback on every step of a human-like chain of thought.
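The difference between the two kinds of feedback can be illustrated with a toy sketch. This is not OpenAI's code: the function names and the sample solution are hypothetical, and the hand-written true/false marks stand in for the feedback a learned reward model would be trained on.

```python
# Toy illustration only: real reward models are learned neural networks,
# and the marks below are hand-written stand-ins for human feedback.

def outcome_labels(final_answer, reference_answer):
    """Outcome supervision: one label for the whole solution."""
    return [int(final_answer == reference_answer)]

def process_labels(steps):
    """Process supervision: one label per reasoning step."""
    return [int(is_correct) for _, is_correct in steps]

def first_error(steps):
    """Index of the first step marked wrong, or None if every step is fine."""
    for i, (_, is_correct) in enumerate(steps):
        if not is_correct:
            return i
    return None

# Hypothetical solution to (2 + 3) * 4 that reaches the right answer by luck.
steps = [
    ("2 + 3 = 6", False),   # wrong addition
    ("6 * 4 = 20", False),  # wrong multiplication that happens to land on 20
]
final_answer, reference_answer = 20, 20

print(outcome_labels(final_answer, reference_answer))  # [1]
print(process_labels(steps))                           # [0, 0]
print(first_error(steps))                              # 0
```

In this made-up case the outcome label rewards a solution whose reasoning is wrong at every step, while the per-step labels flag the first mistake directly.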

Both approaches were tested on a math dataset, and OpenAI found that process supervision leads to “significantly better performance,” the company states in its research paper. It is worth noting that more work will be required to see whether the method performs as well outside of math.

Explaining the possible implications of process supervision, OpenAI said that if these results generalize, process supervision may offer the best of both worlds: a method that both performs better and is more aligned than outcome supervision.

It is too early to say how much this step-by-step verification will help to stave off hallucinations more generally. But given that hallucinations are probably the biggest problem with large language models (LLMs) at the moment, hopefully it will prove an effective solution.

OpenAI has not said when process supervision will be introduced in the publicly available ChatGPT service. The new method is still in the research phase and has yet to be tested on general knowledge.

OpenAI mentions that, while the initial results are good, safer methods can come at the cost of reduced performance, a cost known as an alignment tax. The results so far suggest that process supervision does not incur such a performance penalty on math problems, but we don’t yet know what the result will be on more general information.
