GPT-5 Pro conducted mathematical research independently, giving a tighter bound after reading a research paper. OpenAI's president says: "This is a sign of life."
Can AI think independently and prove new mathematical results?
Researchers at OpenAI said that they fed a paper to GPT-5 Pro, and after reading it, the model reached a new conclusion.
For a boundary problem in convex optimization, GPT-5 Pro gave a more precise threshold than the original paper, together with a corresponding proof.
The news immediately sparked heated discussion across the internet: in less than half a day, the tweet had passed 2.3 million views.
However, the researcher did not publish GPT-5 Pro's result as a paper, because humans had beaten it to the punch.
A new version of the paper had since been posted, presenting a new bound that surpassed GPT-5 Pro's.
GPT-5 Pro's proof idea is nonetheless different, indicating that it can already explore on its own, so the human counterattack does not make this any less of a breakthrough for GPT-5 Pro.
OpenAI President Brockman even called this achievement "signs of life".
Is the convex optimization curve convex?
The paper fed to GPT-5 Pro studied a convex optimization problem. Convex optimization is a subfield of mathematical optimization that studies the minimization of convex functions over convex sets.
Specifically, the title of this paper is "Is the convex optimization curve convex?" It studied the following problem:
When using the gradient descent algorithm to optimize a smooth convex function, is the resulting optimization curve convex?
Here, the "optimization curve" refers to the curve of the function value f(x_n) changing with the number of iterations n. If this curve is convex, it means that the optimization rate (i.e., the decrease in the function value between two adjacent iterations) is monotonically decreasing.
Regarding this problem, the paper's conclusion is that whether the optimization curve is convex depends on the choice of the step size. The key points are as follows (a numerical sketch illustrating the first and third points follows the list):
Guaranteed convexity interval: when the step size η ∈ (0, 1/L] (L is the smoothness constant), the optimization curve is guaranteed to be convex;
Possibly non-convex interval: when the step size η ∈ (1.75/L, 2/L), even though gradient descent still converges monotonically, the optimization curve may fail to be convex;
Gradient norm property: For the entire convergence interval η ∈ (0, 2/L], the gradient norm sequence ||∇f(x_n)|| is always monotonically decreasing;
Convexity of the gradient flow of a twice-differentiable convex function: for a convex, twice continuously differentiable function, the optimization curve of the gradient flow is always convex;
Convexity of the gradient flow of a smooth convex function: for a convex L-smooth function (twice differentiability is not required), the optimization curve of the gradient flow is always convex;
Monotonicity of the gradient norm of the gradient flow: for the continuous-time gradient flow, the gradient norm ||∇f(x(t))|| is always monotonically decreasing.
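Here is that numerical sketch. It is our own toy setup, not code from the paper: gradient descent on a simple one-dimensional convex L-smooth test function, checking both that the per-step decreases are non-increasing (curve convexity) and that the gradient norms decrease monotonically:

```python
# Minimal sketch (our toy setup, not from the paper): run gradient descent on
# a 1-D convex L-smooth test function and check two conclusions numerically.

# Toy convex function with smoothness constant L = 1:
# quadratic near zero, linear in the tails (a Huber-type function).
def f(x):
    return 0.5 * x * x if abs(x) <= 1 else abs(x) - 0.5

def grad(x):
    return x if abs(x) <= 1 else (1.0 if x > 0 else -1.0)

L = 1.0

def run_gd(x0, eta, steps=20):
    """Gradient descent: x_{n+1} = x_n - eta * f'(x_n)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(xs[-1] - eta * grad(xs[-1]))
    return xs

def non_increasing(seq, tol=1e-12):
    return all(a >= b - tol for a, b in zip(seq, seq[1:]))

for eta in (0.5 / L, 1.0 / L, 1.9 / L):
    xs = run_gd(x0=-1.8, eta=eta)
    values = [f(x) for x in xs]
    decreases = [a - b for a, b in zip(values, values[1:])]  # f(x_n) - f(x_{n+1})
    print(f"eta = {eta:.2f}: curve convex? {non_increasing(decreases)}; "
          f"gradient norms non-increasing? "
          f"{non_increasing([abs(grad(x)) for x in xs])}")
```

On this toy function all three step sizes happen to produce a convex curve; the theorem only guarantees convexity for η ≤ 1/L, and actually exhibiting non-convexity for η ∈ (1.75/L, 2/L) requires the paper's specific counterexample construction.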
Regarding the first conclusion, the core of the proof is to show that the sequence {f(x_n) - f(x_{n+1})} is non-increasing.
The author of the paper cleverly introduced an auxiliary function g_k(t): the discrete iteration step is rewritten as the integral of a continuous function, the properties of convex functions give the monotonicity of the auxiliary function, and comparing two adjacent auxiliary functions finally yields the convexity of the optimization curve.
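One plausible shape for such an argument, reconstructed here from the description above (the paper's exact definition of g_k may differ), writes each per-step decrease as an integral along the segment the iterate traverses:

```latex
% Hedged reconstruction, not necessarily the paper's exact construction.
% With x_{k+1} = x_k - \eta \nabla f(x_k), define the auxiliary function
%   g_k(t) = f\big(x_k - t\,\nabla f(x_k)\big), \qquad t \in [0, \eta].
% Then the per-step decrease becomes the integral of a continuous function:
f(x_k) - f(x_{k+1}) = g_k(0) - g_k(\eta)
  = \int_0^{\eta} \big\langle \nabla f\big(x_k - t\,\nabla f(x_k)\big),\; \nabla f(x_k) \big\rangle \, dt,
% and comparing the integrands for indices k and k+1 via convexity
% shows the decreases are non-increasing.
```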
The possibly non-convex interval was handled by constructing a piecewise function (a quadratic piece combined with a linear piece) as a counterexample.
The author chose the specific initial point x_0 = -1.8 and directly computed the decrease in function value over the first three iterations, verifying that within this step-size range a later decrease can exceed an earlier one, violating the convexity requirement.
Since GPT-5 Pro's proof focuses on the boundary problem, the proofs of the latter four conclusions are not detailed here; interested readers can consult the original paper.
GPT-5 Pro gives a new bound
In the first version of the paper, the author proved the cases where the step size is at most 1/L and where it is greater than 1.75/L, but reached no conclusion for the range (1/L, 1.75/L].
Using more refined inequality techniques, GPT-5 Pro moved the 1/L bound up to 1.5/L in 17 and a half minutes.
Checking the proof took a human 25 minutes, longer than GPT-5 Pro had spent reading the paper and producing the proof.
Its core idea is similar to the original paper's: reduce the convexity of the optimization curve to showing that the per-step decrease in function value is itself decreasing.
However, GPT-5 Pro cleverly combined two basic inequalities satisfied by a convex L-smooth function: the Bregman divergence inequality (which provides a tighter lower bound) and the standard cocoercivity inequality.
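For reference, these two inequalities are textbook facts about a convex L-smooth function f, stated for any points x, y:

```latex
% Bregman divergence inequality (a tighter lower bound than plain convexity):
f(y) \;\ge\; f(x) + \langle \nabla f(x),\, y - x \rangle
      + \frac{1}{2L}\,\lVert \nabla f(y) - \nabla f(x) \rVert^2,
% Cocoercivity of the gradient:
\langle \nabla f(x) - \nabla f(y),\, x - y \rangle
      \;\ge\; \frac{1}{L}\,\lVert \nabla f(x) - \nabla f(y) \rVert^2.
```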
Through this algebraic combination, GPT-5 Pro refined the convexity condition further.
After that, before GPT-5 Pro's discovery could be written up, the original author updated the paper, adding a new co-author; crucially, the revision proved that 1.75/L is the exact threshold, closing the previously unexplored interval.
The idea was to apply the Bregman divergence inequality of a convex L-smooth function to each of the three point pairs (x_0, x_1), (x_1, x_2), and (x_0, x_2), multiply the three resulting inequalities by different weights, sum them, and simplify the complicated gradient terms through an identity.
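Concretely, instantiating the Bregman inequality above at the three point pairs gives inequalities of the following form (one natural orientation is shown; the weights used to combine them are the paper's and are not reproduced here):

```latex
% Bregman inequality at the pairs (x_0, x_1), (x_1, x_2), (x_0, x_2),
% where x_{n+1} = x_n - \eta \nabla f(x_n):
f(x_1) \ge f(x_0) + \langle \nabla f(x_0), x_1 - x_0 \rangle + \tfrac{1}{2L} \lVert \nabla f(x_1) - \nabla f(x_0) \rVert^2,
f(x_2) \ge f(x_1) + \langle \nabla f(x_1), x_2 - x_1 \rangle + \tfrac{1}{2L} \lVert \nabla f(x_2) - \nabla f(x_1) \rVert^2,
f(x_2) \ge f(x_0) + \langle \nabla f(x_0), x_2 - x_0 \rangle + \tfrac{1}{2L} \lVert \nabla f(x_2) - \nabla f(x_0) \rVert^2.
% A weighted sum of these three, plus an identity that simplifies the
% gradient terms, yields the exact 1.75/L threshold.
```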
Although GPT-5 Pro's proof was ultimately outdone by the humans, its idea and derivation differ from those of the updated paper.
In other words, GPT-5 Pro did not obtain its refined bound by discovering the new version of the paper; it genuinely has the ability to independently discover and prove mathematical results.
Reference links:
[1] https://x.com/SebastienBubeck/status/1958198661139009862
[2] https://arxiv.org/abs/2503.10138v1
[3] https://arxiv.org/abs/2503.10138v2
This article is from the WeChat public account "QbitAI", which focuses on cutting-edge technology. It is published by 36Kr with permission.