The Era When Pictures No Longer Guarantee the Truth Has Finally Arrived

Every picture mentioned below is generated by GPT-Image-2...?

Since the release of ChatGPT Images 2.0, one of its most popular uses has been forging screenshots.

Tweets on X, posts on Weibo, WeChat Moments, chat interfaces, corporate press release pages: describe what you want in one sentence, and it will generate the screenshot for you. A screenshot that looks flawless at first glance and whose flaws remain hard to spot even on closer inspection. A screenshot with perfect fonts, margins, color schemes, and interactive elements.

Every picture the editor describes below was generated by GPT-Image-2.

The Xiaomi Corporation's Weibo account announced that Tim Cook has been appointed as the CEO of Xiaomi Auto. Tim Cook's personal Weibo account followed up with a response saying "Let's power up together," accompanied by an advertising video of the SU7 by the seaside. The post was on the hot search with 287,000 views.

Luo Yonghao of Smartisan Technology tweeted an announcement of the acquisition of Apple Inc., accompanied by an official poster in the style of a Chinese landscape painting. The tweet had 1.8 million views.

Anthropic tweeted an announcement of its return to Baidu and the relocation of its headquarters to Beijing, accompanied by a poster of the Beijing skyline.

SpaceX's official account tweeted an announcement of a partnership with Cursor, stating that Cursor has granted SpaceX the right to acquire Cursor for $60 billion. The tweet had 6.567 million views.

The investor relations pages of Huya and Douyu published a press release announcing the completion of their merger. The page layout, sub-navigation, forward-looking statements, and investor contact information were all in place.

Anthropic's official account announced the official opening of Claude to Chinese users, accompanied by a well-designed promotional poster. The post had 1.2 million views.

Because the earlier pictures were so plainly absurd, people took them as jokes. In the editor's group chat, however, some people really believed that Claude had been opened to the Chinese mainland.

By the way, the editor lied above. Actually, the picture of SpaceX's announcement of the partnership with Cursor is real.

The spread of false information through forged screenshots on the Internet is not a new thing.

An earlier article by Hangtongshe, "21 Years Ago, a Genius Teenager Was Already Causing a Stir on Wall Street," documented the oldest form of fraud of the information age.

On April 8, 1999, SEC and NASDAQ officials opened a joint investigation. An engineer at a technology company had forged a news report imitating Bloomberg's official website, claiming that the company was about to be acquired at a high price. As a result, the company's stock price quickly rose by more than 30%, making it the 12th most active stock on NASDAQ that day.

The fake webpage was strikingly realistic, with citations, comments, and annotations, and it read as thoroughly as a corporate press release. It began circulating half an hour before the trading session opened, which left little time for verification but plenty for investors to place orders. By pleading guilty, the perpetrator avoided the maximum 20-year prison sentence and was instead sentenced to house arrest and ordered to pay $93,000 to compensate investors for their losses.

During the same period, 15-year-old high school student Jonathan Lebed flooded the Yahoo Finance message boards with bullish posts, creating the illusion that many different people agreed with one another. He manipulated the prices of penny stocks and became the first minor to be sued by the U.S. Securities and Exchange Commission for stock market fraud.

After that, phishing websites imitating bank pages emerged one after another. A well-known case in the 2000s used domain names like 1cbc.com.cn to impersonate the online banking site of the Industrial and Commercial Bank of China (ICBC), or passed off icbc.com.cn.bk-bj.com as the official ICBC link, to trick users into handing over their usernames and passwords.
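Both tricks fail a simple check on the host name: the first swaps a character, the second buries the real domain at the front of a longer one. The following is a minimal sketch of that check, not a complete anti-phishing tool. The function name `belongs_to` and the constant `OFFICIAL_DOMAIN` are hypothetical, and the official domain is assumed from the example above.

```python
from urllib.parse import urlsplit

# Assumption: the official domain, as implied by the article's ICBC example.
OFFICIAL_DOMAIN = "icbc.com.cn"


def belongs_to(url: str, official: str = OFFICIAL_DOMAIN) -> bool:
    """Return True only if the URL's host is the official domain or a subdomain of it.

    URLs without a scheme (e.g. "icbc.com.cn") have no parsable host and are rejected.
    """
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    # Match from the right: the official domain must be the END of the host name.
    return host == official or host.endswith("." + official)


if __name__ == "__main__":
    for candidate in (
        "https://www.icbc.com.cn/login",
        "http://1cbc.com.cn/",            # look-alike character
        "http://icbc.com.cn.bk-bj.com/",  # real domain used as a prefix
    ):
        print(candidate, belongs_to(candidate))
```

The point of the design is that a domain name is read from right to left, so a prefix that looks official proves nothing about who owns the site.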

In 2018, a forged screenshot of Ma Huateng's WeChat Moments circulated widely in tech circles. Even Tencent Technology quoted it by mistake. The matter was settled only when the forger came forward and admitted it. However, all of these forgeries had a certain threshold.

Hangtongshe's article from that time, "If You Believe That Fake Ma Huateng WeChat Screenshot, What Right Do You Have to Laugh at Your Elders and WeChat Businesspeople?", described these thresholds and speculated about how screenshots were being forged back then:

In fact, the cost of creating a screenshot of a WeChat chat record is very low. You can even create one directly in Windows Paint, which might deceive people who don't usually read the news.

However, for more discerning readers, it may be difficult to ignore the subtle differences in icon size, spacing, font size, etc. in such a rough picture. Even based on the differences between the iOS and Android systems, obvious interface differences can be found.

But what if your opponent has mastered such identification methods and deliberately avoided these problems when creating fake pictures to further improve the level of authenticity?

An app called "News Camera" can apply templates imitating the news programs of major TV stations to pictures on your phone, making them look like a news segment broadcast by a TV station. Some well-known rumor pictures, such as "The whole nation welcomes the rise in oil prices," were made this way.

However, to avoid more trouble, "News Camera" deliberately left some flaws in the templates. For example, the station logo is not "CCTV" but "CCFV," and the fonts are also different. In a TV enthusiast discussion area on Baidu Tieba, there are posts specifically discussing these differences to debunk rumors for ordinary netizens.

The problem is that the same group of people in the same discussion area, out of interest, are also comparing how to use tools such as Photoshop and PowerPoint to create flawless and realistic static screenshots and animated GIFs, and they are sharing their works and exchanging experiences on Tieba.

That fake Ma Huateng screenshot deceived so many people precisely because its creator deliberately avoided the common flaws and started from a flawless screenshot of the iOS WeChat interface. Achieving this takes time and skill.

GPT-Image-2 has removed this threshold.

Why is this time different?

The revolutionary difference between GPT-Image-2 and previous models is the "one-shot" feature. You don't need to specify what words to put in, what fonts to use, or what the interface should look like. It will search, think, and match on its own. The logo is really the logo of SpaceX, the avatar is a real avatar, and the layout of the X interface is also real.

After comparing the pictures, the editor made a discovery that made him break out in a cold sweat. So far, no one else seems to have made the same discovery. Maybe it's just an accidental result of a single generation, or maybe it doesn't mean much. But let's take a look:

Let's go back to the pictures above and compare the "Luo Yonghao tweet" with the other X tweet screenshots. The "Luo Yonghao tweet" uses STHeiti, the default font of Smartisan phones, while another X tweet uses the PingFang font.

This shows that the model doesn't really "understand" how to handle the Luo Yonghao tweet described in the prompt. It still hasn't escaped its nature as a probability prediction machine. It is simply using every means available to get close to the intention of the prompt. Statistical regularities have linked Luo Yonghao, Smartisan, and STHeiti together.

If there are still inappropriate parts, you can modify the prompt separately, or feed the genuine picture to it for image-to-image generation. For example, the simplest workaround: if you're not allowed to include a name in the generated image, you can feed the avatar instead.

This has led to a fundamental change:

Spotting flaws used to be relatively easy. A more efficient, lower-threshold model has now made it hard.

Readers who are experts in this field can infer the brand and model of a phone from details such as the notification bar and fonts, and they can check the picture's EXIF data for further verification. The "intuition" of ordinary people works like that of a bank teller: after handling a lot of real money, a teller can tell that a bill is fake with just a light touch, even without being able to say exactly what is wrong. It all comes from practice.
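As a rough illustration of the EXIF check mentioned above, the sketch below reports whether a JPEG file carries an Exif segment at all. It assumes the picture is a JPEG, it does not parse individual tags, and the function name `has_exif_segment` is hypothetical. The result is a weak signal at best, since many platforms strip metadata when a picture is uploaded.

```python
import struct


def has_exif_segment(data: bytes) -> bool:
    """Return True if the JPEG bytes contain an APP1 segment that holds Exif data."""
    if not data.startswith(b"\xff\xd8"):       # SOI marker missing: not a JPEG
        return False
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:                   # lost sync with the segment structure
            return False
        marker = data[pos + 1]
        if marker == 0xFF:                      # padding byte before a marker; skip it
            pos += 1
            continue
        if marker in (0xDA, 0xD9):              # SOS or EOI: no more header segments
            return False
        # Segment length is big-endian and includes its own two bytes.
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        payload = data[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload.startswith(b"Exif\x00\x00"):
            return True
        pos += 2 + length
    return False


if __name__ == "__main__":
    import sys

    # Usage: python check_exif.py picture.jpg
    with open(sys.argv[1], "rb") as fh:
        print(has_exif_segment(fh.read()))
```

A missing Exif segment does not prove a picture is forged, and a present one does not prove it is genuine; it is only one more detail to weigh.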

Is the font consistent with the claimed device? Are the notification bar icons reasonable? Do the line spacing and font size conform to the system specifications? The model now gets all of these right automatically.

Now we have to extract all the text from the screenshot, search for the original, and check the domain name in the browser's address bar on the page we find before we can trust it.

When the editor saw the real news about SpaceX and Cursor, he did just that, because GPT-Image-2 could synthesize something completely identical.

In mainland China, where foreign websites are hard to access, debunking rumors will take longer, and that may cause greater confusion.

Even if the claim that "Claude is open to China" sounds a bit implausible, what if someone says that "Claude no longer requires domestic users to hold up their ID cards"? Might you place an order with your dual-currency card without checking?

When Seedance 2.0 was released, Tim from FilmForce felt fear immediately after trying it. He uploaded his facial photo, and without any text prompts or audio files, the AI automatically generated a video of him speaking, with his mouth shape, voice, micro-expressions, and the exterior view of his office. In that video, Tim said "terrifying" six times.

As a media and journalism professional, the editor is sensitive to GPT-Image-2 because his profession and experience give him reason to be afraid.

"Seeing is believing" is no longer a simple truth. The fake news screenshots of the past were full of flaws. Once a tool is known to generate realistic fakes, as is now the case, it encourages more people to forge things just for fun.

But should we regulate this model?

The simple answer is: of course not. Developers should always be allowed to release fully functional, state-of-the-art models so that users can get the most out of AI.

It is the mission of a tool to evolve and become better.

The editor believes that the "too advanced to be made public" approach taken with Claude Mythos Preview is, firstly, unnecessary and, secondly, ineffective. Although it can independently discover and exploit thousands of 0day vulnerabilities, the time latecomers need to catch up will only get shorter and shorter.

That is

The views in this article are solely those of the author. The 36Kr platform provides information storage services only.
