OpenAI’s “thinking” AI gets a $50 rival

Researchers from Stanford University and the University of Washington have developed a “reasoning” artificial intelligence model using only $50 in cloud compute credits. The model, called “S1,” performed similarly to OpenAI’s o1 and DeepSeek’s R1 models and raised an important question in the artificial intelligence world: do you really need million-dollar investments?

An artificial intelligence that thinks, for $50

The S1 model is available as open source on GitHub. To create it, the researchers started from an existing artificial intelligence model using a method called “distillation.” In this process, the answers of Google’s Gemini 2.0 Flash Thinking Experimental model were used to develop S1’s reasoning capabilities.
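To make the distillation step concrete, here is a minimal sketch of how such a dataset could be collected: a curated set of questions is sent to the teacher model, and its answers are saved as training examples for the student. It assumes the google-generativeai Python package and an API key; the model name, file names, and question format are illustrative assumptions, not details taken from the paper.

```python
# Sketch: collecting a teacher model's answers to build a distillation
# dataset. Assumes the google-generativeai package and an API key; the
# model name, file names, and question format are illustrative.
import json

import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder, not a real key
teacher = genai.GenerativeModel("gemini-2.0-flash-thinking-exp")

# A curated question set, one JSON object per line: {"question": "..."}
with open("questions.jsonl") as f:
    questions = [json.loads(line)["question"] for line in f]

# Save each question with the teacher's full answer (its reasoning trace)
# as a prompt/completion pair for fine-tuning the student model.
with open("distill_dataset.jsonl", "w") as out:
    for q in questions:
        answer = teacher.generate_content(q)
        out.write(json.dumps({"prompt": q, "completion": answer.text}) + "\n")
```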

This type of distillation has been used before. For example, researchers from UC Berkeley spent about $450 to develop a similar artificial intelligence model last month. Still, the most remarkable aspect of S1 is that it emerged at a cost of only $50. As you may recall, DeepSeek was accused of training its own models on the outputs of OpenAI’s models.

Achievements like this naturally raise questions. If relatively small research groups can produce high-performance models for a few hundred dollars, the sustainability of billion-dollar investments comes into question.

Meanwhile, the Gemini 2.0 Flash Thinking Experimental model used during S1’s training is freely accessible via Google AI Studio, but Google’s terms of use prohibit the reverse engineering of its models. Google has not yet made a statement on the issue.

A great way to imitate

S1’s research paper suggests that reasoning models can be distilled with a relatively small dataset through supervised fine-tuning (SFT), a process in which an artificial intelligence model is explicitly instructed to imitate certain behaviors found in a dataset.

The S1 model was built on a small open-source model from the Qwen lab of Chinese technology giant Alibaba. The researchers trained it on 1,000 carefully selected questions paired with detailed reasoning traces for those questions. The training took only 30 minutes and used 16 Nvidia H100 GPUs.
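As an illustration of what such an SFT run might look like, here is a minimal sketch using Hugging Face’s trl library. The base checkpoint, dataset path, and hyperparameters are assumptions chosen for illustration; trl’s API also varies between versions, and the paper’s exact recipe may differ.

```python
# Sketch: supervised fine-tuning (SFT) on a distilled dataset with
# Hugging Face's trl library. Checkpoint, paths, and hyperparameters
# are illustrative assumptions; the paper's exact recipe may differ.
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Prompt/completion pairs produced by the teacher model (see sketch above).
dataset = load_dataset("json", data_files="distill_dataset.jsonl", split="train")

trainer = SFTTrainer(
    model="Qwen/Qwen2.5-7B-Instruct",  # an open Qwen checkpoint; S1 built on a Qwen model
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="s1-sft",
        num_train_epochs=3,              # illustrative; tune for a 1,000-example set
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
    ),
)
trainer.train()
```

Because the dataset is only about a thousand examples, a run like this finishes quickly on modern hardware, which is consistent with the 30-minute figure reported above.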

Meta, Google, and Microsoft plan to invest hundreds of billions of dollars in their artificial intelligence infrastructure in 2025. However, projects like S1 show that strong models can be produced without huge investments.

Up to this point everything sounds great, but there is an important point missing from these discussions. The distillation method will not always give great results. You can think of it as a teacher-student relationship: the teacher can pass on what he knows, but in doing so he can also pass on his own errors, biases, and mistakes.

Distillation is often used to convert the capabilities of existing models into a smaller and more efficient structure. However, this process is not suitable for creating new and stronger models than scratch. That is, distilled models usually cannot exceed the capacity of the model that educates themselves. Yes, in the real world, the horn ears can pass, but you cannot make a small model much more competent by distillation in the world of artificial intelligence. Therefore, with distillation methods, you can imitate the powerful models quite accurately and strengthen the model with fine adjustments. However, in order to do something completely new, it is still necessary to invest a high amount of investment.