Forcing AI to talk like a caveman: Claude's anti-verbosity plugin goes viral. Netizens: fed up with AI's rambling.
Recently, a Claude Code plugin called "caveman" has caused a stir on Hacker News.
Judging from the GitHub star growth curve, "JuliusBrussee/caveman" climbed slowly for a long time at first and then soared sharply:
In just about half a day, the number of stars jumped from dozens to 500 and has now exceeded 20,000!
The "caveman" skill of saving tokens has become extremely popular!
Behind caveman's overnight popularity is a classic case of community resonance: the seemingly minor pain point of AI "yap" (rambling), which has long exasperated countless users, has been hit squarely once again.
Netizens were quick to crown caveman "the most powerful prompt skill of 2026", saying it cuts out the tokens wasted on politeness and preambles like "I'm happy to help you".
What this plugin does is actually very simple: make the AI agent speak like a caveman.
Delete "the", "please", "thank you"... delete every bit of "human politeness" that adds no technical meaning but steadily burns tokens.
https://github.com/JuliusBrussee/caveman
The project was created by developer Julius Brussee, under the GitHub repository "JuliusBrussee/caveman".
The core question Julius poses in the README is blunt: why spend so many tokens saying something that a few tokens can state clearly?
This is a skill/plugin that is compatible with both "Claude Code" and "Codex".
Its core idea is to make the agent talk like a caveman: without sacrificing technical accuracy, it compresses output to the extreme and claims to cut token consumption by about 75%.
The question then arises: can deleting articles and polite words really save users three-quarters of their money?
Digging into SKILL.md, netizens were stunned: is this all there is?
How exactly does caveman "save" tokens?
Opening its core file SKILL.md, the content is indeed not long.
https://raw.githubusercontent.com/JuliusBrussee/caveman/main/skills/caveman/SKILL.md
The file's frontmatter defines it directly as an "Ultra-compressed communication mode".
And it states:
By speaking like a caveman, the goal is to reduce token usage while maintaining technical accuracy.
It is enabled when the user says "caveman mode", "talk like caveman", "use caveman", "less tokens", "be brief", or calls "/caveman".
It can also be automatically triggered when the user explicitly requests higher token efficiency.
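According to the Anthropic skill format (also covered later in this article), SKILL.md opens with YAML frontmatter whose `name` and `description` control when the skill loads. A sketch of what caveman's frontmatter plausibly looks like, paraphrased from the README rather than copied verbatim:

```markdown
---
name: caveman
# The description is what Claude matches against to auto-load the skill;
# the wording below is paraphrased, not the repository's exact text.
description: Ultra-compressed communication mode. Speak like caveman to
  reduce token usage while maintaining technical accuracy. Trigger on
  "caveman mode", "be brief", or explicit requests for token efficiency.
---

Rules: no articles, no filler, no politeness. Keep technical terms,
code blocks, and verbatim error messages.
```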
Its token-saving rules are equally simple and blunt: no articles, no rambling, no politeness; keep technical terms and code blocks, and cut everything else where possible.
Delete the following: articles, filler words, polite expressions, hesitant expressions.
Allow the use of short sentences and fragmented sentences.
Give priority to using shorter synonyms, for example, say "big" instead of "huge", and say "fix" instead of "implement a solution".
Technical terms must be kept accurate.
Don't change code blocks.
Error messages must be quoted as they are.
Recommended sentence pattern: [Problem][Action][Reason]. [Next step].
For example, don't write like this: "Sure! I'm happy to help you. The problem you encountered is probably caused by..."
Instead, write like this: "Bug in authentication middleware. Token expiration judgment used <, not <=. Change here:"
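The plugin itself is just a system prompt, not code, but the deletion rules above can be mimicked mechanically. A toy Python sketch of the "lite" idea (the word list is an illustrative assumption, not the plugin's actual list):

```python
# Illustrative set of articles/filler/politeness words, in the spirit of
# caveman "lite". The real plugin is a prompt, not a text filter.
FILLER = {"the", "a", "an", "please", "just", "really", "basically",
          "sure", "happy", "to", "help"}

def caveman_lite(text: str) -> str:
    """Drop articles and filler words; keep the remaining words in order."""
    kept = [w for w in text.split()
            if w.lower().strip(".,!?") not in FILLER]
    return " ".join(kept)

before = "Sure! The problem is probably in the authentication middleware."
print(caveman_lite(before))  # -> "problem is probably in authentication middleware."
```

A real saving would be measured with a tokenizer rather than word counts, and caveman applies such rules at generation time via the system prompt instead of post-processing, but the flavor is the same.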
It supports three levels of intensity: lite, full (default), ultra.
lite: Remove filler words and hesitant expressions. Keep complete sentences and a normal written style. Professional and concise;
full: Further compress the expression. Some function words can be omitted, fragmented sentences are allowed, and short words are used as substitutes. Typical caveman style;
ultra: Use a large number of abbreviations, such as DB, auth, config, req, res, fn, impl; Try to remove conjunctions; Use arrows to express causality, such as "X→Y"; Use one word if possible instead of two.
For example:
lite: "The connection pool will reuse the already opened database connections instead of creating a new one for each request, thus avoiding the overhead of repeated handshakes."
full: "Connection pool reuses opened DB connections. Don't create new one for each request. Save handshake overhead."
ultra: "Connection pool = reuse DB connections. Skip handshake → faster for high concurrency."
Of course, for safety warnings, confirmations of irreversible operations, multi-step procedures, or when the user is clearly confused, clarity still takes priority; this exception logic is spelled out in SKILL.md as well.
caveman changes nothing in the model architecture and performs no compression at the inference level; at its core, it is a carefully written system prompt that constrains the AI's output style.
More importantly, the author Julius Brussee himself actively clarified in the HN discussion post that this skill does not target hidden reasoning tokens and thinking tokens.
The model's background "thinking" does not get any shorter because of caveman; what gets compressed is mainly the part that is finally spoken aloud.
The official Anthropic documentation also mentions that the names and descriptions of skills themselves will occupy the context budget.
In other words, loading the caveman skill itself will consume tokens.
So the real end-to-end cost savings may fall short of the eye-catching "75%" in the README.
caveman may significantly compress the length of the visible output, but that should not be read as a proportional drop in total cost.
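To see why a 75% cut in visible output does not mean a 75% cut in cost, here is a back-of-the-envelope calculation; every token count below is a made-up assumption for illustration, not a measurement:

```python
# Hypothetical per-request token breakdown (all numbers are assumptions).
input_tokens    = 2000   # prompt + loaded context
thinking_tokens = 3000   # hidden reasoning, untouched by caveman
output_tokens   = 1000   # visible answer, the only part caveman compresses

skill_overhead  = 100    # loading the skill consumes context too
output_savings  = 0.75   # the README's claimed visible-output reduction

before = input_tokens + thinking_tokens + output_tokens
after  = (input_tokens + skill_overhead) + thinking_tokens \
         + output_tokens * (1 - output_savings)

total_savings = 1 - after / before
print(f"total savings: {total_savings:.0%}")  # -> total savings: 11%
```

Output tokens are often billed at a higher rate than input tokens, which would push the real figure up somewhat, but the structural point stands: hidden thinking and input context dilute the headline 75%.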
Is the 75% in the README reliable?
Judging from the repository's public contents, the author did provide a benchmark script and listed token comparisons for several tasks in the README, with savings ranging from 22% to 87% and averaging 65%.
As of now, though, only the test scripts and example tables are visible in the public repository; outsiders cannot fully audit the reproduction chain behind each number from what is currently there.
The author said in the HN post that this was only preliminary testing, not a rigorous benchmark.
That said, the academic community has in fact studied whether concise expression harms AI performance.
https://arxiv.org/pdf/2401.05618
The 2024 paper "The Benefits of a Concise Chain of Thought on Problem-Solving in Large Language Models" found that:
When researchers asked the models to use a more concise reasoning chain, the average answer length of GPT-3.5 and GPT-4 fell by 48.70% with virtually no significant loss in overall problem-solving ability; on math problems, however, GPT-3.5's performance dropped by 27.69% on average.
The 2026 paper "Brevity Constraints Reverse Performance Hierarchies in Language Models" goes further:
On some benchmarks, adding brevity constraints to large models raised accuracy by as much as 26 percentage points, and could even reshuffle the performance ranking among models of different scales.
https://arxiv.org/pdf/2604.00025
These two papers provide research context for the claim that conciseness does not necessarily hurt performance.
To be clear, though, they study brevity as a general prompting strategy; they are not evaluations of the caveman GitHub repository.
The README's citation of these studies shows at most that the idea has some theoretical grounding; it cannot be taken as rigorous verification of the project's own effect.
Claude Code's plugin ecosystem is starting to take off
There is another background reason for the popularity of caveman:
Anthropic has provided a relatively complete skill and plugin mechanism for Claude Code.
https://code.claude.com/docs/en/skills
According to the official Anthropic documentation, a developer only needs to create a SKILL.md file for Claude to recognize it as a skill; the description determines when it loads automatically, and the name becomes a slash command that can be triggered directly.
The official documentation also specifies the directory structure a plugin-level skill must follow.
In the caveman repository, you can indeed see directories such as .claude-plugin, plugins/caveman, and skills/caveman, indicating it is not just a toy at the level of "a few prompt lines" but an extension packaged according to Claude Code's skill/plugin mechanism.
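Putting the directories named above together, the repository layout looks roughly like this (a sketch; only skills/caveman/SKILL.md is confirmed by the raw URL cited earlier):

```
caveman/
├── .claude-plugin/       # plugin manifest metadata
├── plugins/
│   └── caveman/          # plugin packaging for Claude Code
└── skills/
    └── caveman/
        └── SKILL.md      # the core skill definition
```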