Adam Smith created a "large model"
While the whole world is chasing after AI, a Scottish professor who lived 267 years ago might have figured out how to build a "large model" earlier than any engineer in Silicon Valley. More importantly, the engine he described doesn't just operate in the realm of AI; it also runs within every enterprise and every management decision. Understanding where it fails may be one of today's most underestimated management problems.
Fast Reading
- In his 1759 book The Theory of Moral Sentiments, Adam Smith used the "simulation engine" of sympathy to construct a "Large Moral Model" (LMM) that explains the origin of human society. Its training data consists of social experiences, its architecture is empathetic simulation, and its output is what we call "conscience." This is astonishingly consistent with today's Large Language Models (LLMs): both abandon writing rules and let the system learn from experience.
- From children learning to handle conflicts and new employees learning to make judgments, to enterprises expanding overseas, board independence, and the alienation of KPIs, Smith's theory of how the "judge in one's heart" develops and fails applies almost directly to the pain points of enterprise management. The failure modes of the empathy system (self-deception, ignorant confidence, collective mania, and the worship of the "visible") reappear, in different forms, in both AI alignment research and enterprise management.
- A subordinate who always agrees with you, a product that always caters to users, and a board of directors that always supports the CEO are all examples of failed "perfect alignment." What we need is not unison but harmony: close enough to understand each other, and far enough to correct each other. This is the deepest isomorphism between the Large Moral Model and the Large Language Model.
I. Why is it hurtful when no one laughs at a joke?
In 1759, Adam Smith published his first book, The Theory of Moral Sentiments. In the first volume, he described a common scenario we've all experienced:
A person tells a joke at a dinner party. After finishing, he looks around expectantly but finds that no one else is laughing except himself.
Smith said that this is one of the most embarrassing moments in human society. But he saw a profound question in it: Why is this person embarrassed? He wasn't scolded, nor did he suffer any material loss. Why should it matter to him that others don't laugh?
Smith's answer is: Humans are naturally eager to have an emotional resonance with others. He used the word "sympathy." We need others to "feel what we feel," just as we need air and water. This need is an inherent part of human biology.
But how is this resonance achieved? We can never directly enter another person's mind. Smith was well aware of this: we have no direct experience of others' feelings. The only way is to imagine how we would feel in the same situation.
So what can we do? We can only simulate: put ourselves in the other person's shoes and run the scenario in our minds. The more times we do this, the more accurate our simulation becomes. As Smith put it, first come thousands of specific experiences, and only then abstract moral laws.
Written 267 years ago, this passage precisely describes the operating conditions of one of today's most-watched technologies. Large Language Models also have no way to directly access the user's mind. The only thing they can do is infer the user's situation from the information provided and then generate a response based on the experience "learned" from a vast amount of human language.
One path comes from Scotland in 1759, and the other from Silicon Valley in the 21st century. These two paths didn't converge under Smith's guidance; they independently reached the same place: If you can't get into someone else's head, you have to rely on experience to guess.
The theory Smith constructed is essentially a "Large Moral Model" (LMM) that explains the origin of human society. And today's "Large Language Models" (LLMs) created by engineers are surprisingly similar to it.
Is the one-letter difference between LMM and LLM just a coincidence?
II. How to make a computer recognize a kitten?
Let's put Smith aside for a moment and talk about how Large Language Models came about.
There have been two distinct routes in the history of AI development: Symbolism and Connectionism. Essentially, they are two ways to deal with complex problems. One is top-down design, and the other is bottom-up experience accumulation.
For example, suppose you want to teach a computer to recognize whether there is a cat in a picture.
The symbolic approach is to first list the key features of a cat, such as pointed ears, big eyes, whiskers, and a furry body, and then write a bunch of rules based on these features. Then, these rules are fed into the computer for it to follow. The appeal of this approach is obvious. The rules are transparent, and the logic is traceable. If something goes wrong, you know which rule is at fault. Scientists naturally prefer this route because it is in line with the ideal temperament of science: clear, precise, and verifiable. From the 1950s to the early 21st century, Symbolism occupied a central position in the field of AI for a long time.
But this path has a fatal problem: The world is too complex, and the rules can never be fully written.
Cats don't all look the same. The pixel features of the same cat can vary greatly when photographed under different lighting conditions. You would need to write a rule for each situation, which is almost impossible. What's more, how pointed does an ear have to be to be considered "pointed"? A dog with its ears up also has somewhat pointed ears. It's difficult to give the computer a precise definition to distinguish between "pointed" and "not so pointed."
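The brittleness of the rulebook can be made concrete with a minimal sketch. Everything in it is an assumption made for illustration: the `Animal` features, the 0.7 cutoff for "pointed," and the three sample animals are invented here and do not come from any real system.

```python
from dataclasses import dataclass


@dataclass
class Animal:
    ear_pointedness: float  # hypothetical score: 0.0 (round) to 1.0 (sharp)
    has_whiskers: bool
    is_furry: bool


# Arbitrary cutoff: the engineer has to pick some number for "pointed".
EAR_THRESHOLD = 0.7


def is_cat_by_rules(animal: Animal) -> bool:
    """Hand-written rulebook: pointed ears AND whiskers AND fur."""
    return (
        animal.ear_pointedness >= EAR_THRESHOLD
        and animal.has_whiskers
        and animal.is_furry
    )


typical_cat = Animal(ear_pointedness=0.9, has_whiskers=True, is_furry=True)
folded_ear_cat = Animal(ear_pointedness=0.3, has_whiskers=True, is_furry=True)
alert_dog = Animal(ear_pointedness=0.8, has_whiskers=True, is_furry=True)

print(is_cat_by_rules(typical_cat))     # True
print(is_cat_by_rules(folded_ear_cat))  # False: a real cat the rules miss
print(is_cat_by_rules(alert_dog))       # True: a dog the rules let through
```

Moving the cutoff only trades one mistake for the other, and each new rule written to patch a case invites new exceptions of its own.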
Connectionism takes a completely different approach. It doesn't write rules. It builds a network composed of a large number of "neurons" and directly shows this network a vast number of labeled pictures: these are cats, and those are not. Every time it makes a mistake, it fine-tunes its internal connections. After looking at millions of pictures, it figures out the rules on its own. What these rules are, no one can fully explain. That's why deep-learning models are called "black boxes." Although they work, you can't understand what's going on inside them.
In 2012, AlexNet from the Connectionist camp cut the top-5 error rate in the ImageNet image-recognition competition from about 26% to about 15%, seemingly overnight. The black box defeated the rulebook. In the following decade, the descendants of Connectionism (deep neural networks, deep learning, and Large Language Models) swept across the entire AI field.
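The "adjust after every mistake" loop can be sketched with a single artificial neuron, the classic perceptron. This is a toy under stated assumptions, not how a modern deep network is trained: the two features (`ear_pointedness`, `snout_length`), the eight labeled examples, and the learning rate are all invented for illustration.

```python
# Toy labeled data: (ear_pointedness, snout_length) -> 1 for cat, 0 for not-cat.
# Both features and every value are hypothetical.
EXAMPLES = [
    ((0.9, 0.2), 1), ((0.3, 0.1), 1), ((0.8, 0.3), 1), ((0.6, 0.2), 1),
    ((0.8, 0.8), 0), ((0.4, 0.9), 0), ((0.7, 0.7), 0), ((0.2, 0.6), 0),
]


def predict(weights, bias, features):
    """Fire (1) if the weighted sum of the inputs is positive, else 0."""
    total = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if total > 0 else 0


def train(examples, learning_rate=0.1, max_epochs=1000):
    """Perceptron rule: nudge the connections only when a guess is wrong."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in examples:
            error = label - predict(weights, bias, features)
            if error != 0:
                mistakes += 1
                weights = [
                    w + learning_rate * error * x
                    for w, x in zip(weights, features)
                ]
                bias += learning_rate * error
        if mistakes == 0:  # a full pass with no errors: stop adjusting
            break
    return weights, bias


weights, bias = train(EXAMPLES)
print("learned weights:", weights, "bias:", bias)
print("all training examples correct:",
      all(predict(weights, bias, f) == y for f, y in EXAMPLES))
```

No one wrote a rule about ears or snouts; the numbers in `weights` are whatever the corrections left behind. With two weights they can still be read off by hand, but with millions or billions of them they become the "black box" the text describes.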
III. What makes Zhao Benshan's jokes funny?
Let's get back to jokes. How can a computer judge whether a joke is funny?
Suppose you are a "symbolic humor engineer." You make the following list: Surprise reversal, add points; logical gap, add points; involving taboos but not crossing the line, add points; completing the setup and punchline within three sentences, add points. Then, you feed in the lines from Zhao Benshan's skits. The system outputs: Not funny.
It checks item by item and finds that Zhao Benshan's skits have a messy structure. It takes a long time to get to the point, the same gag is used over and over again, and they rely heavily on dialects, facial expressions, and body language. According to the rulebook, there are only demerits. But Zhao Benshan was the anchor of the Spring Festival Gala ratings for more than a decade. Where does his "comedic effect" come from?
It comes from the long-established emotional resonance between him and the audience. His rhythm is not the one in textbooks; it's the rhythm of chatting in rural Northeast China. Hundreds of millions of people who have moved from rural areas to cities know this rhythm in their bones, and the rhythm itself is a source of security. The words he uses are not carefully crafted punchlines but what real people would say. The characters he plays (sly, a bit rascally, face-saving but cash-strapped little people) are the people around those hundreds of millions, or even themselves.
Later, Zhao Benshan faded from the stage, and stand-up comedy became the new mainstream of comedy. This is not a matter of taste. When the main body of the audience changes, the threshold for collective resonance changes too. The new audience needs a different rhythm to deal with a different kind of anxiety. With the same human hardware but different experience data, what counts as "funny" is different.
The Chinese crosstalk circle has actually done what a "symbolic humor engineer" does. Ma Ji summarized 22 techniques for organizing punchlines, such as "laying the groundwork and delivering the punch." But Guo Degang has said that his best punchlines are not delivered according to the rules; they come from the on-the-spot chemistry with the audience and Yu Qian. The rules are the framework, but the thing that makes people laugh uncontrollably can't be captured by the rules.
Smith saw through this. Whether a joke is funny doesn't depend on the joke itself but on whether the emotions of the joke-teller and the audience can be in sync at that moment. This depends on what has settled out of long-term social experience. You "know" what's funny, just as a neural network "knows" that it's a cat.
The Theory of Moral Sentiments
Author: [British] Adam Smith
Translators: Jiang Ziqiang / Qin Beiyu / Zhu Zhongdi / Shen Kaizhang
Publisher: The Commercial Press
IV. How does human society work?
The fact that no one laughs at a joke seems like a trivial matter. Whether Zhao Benshan's jokes are funny seems like a topic for after-dinner chats. But Smith saw a big problem in these trivial things: How does human society actually work?
There are billions of people, each with their own desires, fears, and interests, and none of them can ever directly know what is going on in another person's mind. Under these conditions, instead of falling into the "war of all against all" described by Hobbes, humans have established families, villages, cities, countries, and civilizations. How is this possible?
Economists would say: Markets and prices. But the premise of a market is trust. You have to believe that the bread you buy hasn't been poisoned. Political scientists would say: Laws and the state. But the premise of laws is that most people voluntarily abide by the rules most of the time. So, before markets and laws, there must be a more fundamental thing at work. Smith said that it is empathy, the simulation ability that each of us unconsciously runs all the time.
When you put yourself in the other person's situation and simulate, you can roughly judge whether their behavior is reasonable or outrageous. The other person is also doing the same simulation of you. Over time, countless such small interactions precipitate into a set of shared behavioral norms. They are not designed by anyone but emerge spontaneously from daily life.
Smith used an architectural metaphor: Benevolence is like the decoration of a building, something that adds beauty. But justice is the main pillar of the building. Without it, "the huge and magnificent edifice of human society would collapse in an instant". The foundation that supports this main pillar is the automatic simulation system in each person's heart.
This system, in which norms emerge from small interactions, not only supports the grand edifice of human civilization but also runs through the capillaries of the daily operation of modern enterprises.
Why is a team "capable of fighting"? On the surface, it's because of clear division of labor and smooth processes. But what really makes the collaboration work is often something finer under the system: You roughly know why I'm in a hurry, and I roughly know why you're hesitating. Whether the customer's complaint is an emotional outburst or an early warning, several people in the team have a general idea. These things can be partially replaced by processes, but it's difficult to fully write them down in the processes. Processes can specify who reports to whom, but they can't specify when a person should ask one more question.
So, an enterprise is also like a model. It takes in not only reports and meeting minutes but also the experiences left over from dealing with customers, failures, misunderstandings, and cooperation in the past. Over time, it will form a collective intuition. The difficulty of management is not to write the rules clearly but to let this collective judgment ability grow gradually.
But Smith didn't stop there. He asked a deeper question: How does this simulation system develop and mature in a person?
He offered a thought experiment: If a person grows up alone on a desert island from childhood, he has no concept of his own beauty or ugliness. When he is brought into the crowd, the reactions of others become his mirror, and he "sees" himself for the first time through the eyes of others.
The same is true for judging one's own behavior. As a child, you watched your mother's face. As you grew up, you watched your teachers, colleagues, and leaders. Every external reaction quietly adjusts your inner sense of "what is right." But you gradually find that these mirrors are not reliable: if you rely on them completely, your self-evaluation will be as changeable as the weather forecast.
So, you start to install your own mirror in your heart. It's not any specific person but a judgment standard refined from all social experiences: If there is a person who fully understands my situation and has no stake in this matter, how would he evaluate me?
This is what Smith called the "impartial spectator." He also called it "the man within the breast" or "the great judge and arbiter."
You may think this is a state only saints can achieve. In fact, you used it yesterday. Your leader praised you at a meeting, but you know that the project was mainly done by your colleague. Your reaction is not joy but uneasiness, because the judge within is saying: You don't deserve it. Conversely, you made a decision that you know is right, maybe by rejecting an improper request or speaking the truth when everyone else chose to remain silent. Your family doesn't understand, your friends think you're stupid, and your colleagues think you're out of touch. You're aggrieved and sad. But there is a part of your heart that remains firm. That firm part is saying: I didn't do wrong.
Chinese people are familiar with this. "There is a god three feet above your head" expresses restraint, but that restraint still comes from external monitoring. "Self-discipline when alone" goes a step further: you remain self-disciplined even when the external mirror is removed. Smith goes further still: The "judge in one's heart" is not innate, nor is it taught by saints. It is distilled from your own entire social experience.
From the embarrassment of no one laughing at a joke, to the big question of how civilization is possible, to the development and maturity of the "judge in one's heart," Smith strung all these together with the same simulation engine. This engine is innate, but its calibration is acquired. First, there are thousands of specific experiences, and then there are abstract moral laws. It doesn't derive from rules but learns from experience.
Does this process sound familiar to you?
V. In what ways are the Large Moral Model and the Large Language Model similar?
Yes, it does. This process is almost the moral-philosophical version of the learning method of Large Language Models.
It's not a coincidence. Smith and the engineers of Large Language Models are facing the same fundamental constraint: They can't directly access the internal state of the other party and can only simulate based on experience. Under this constraint, they independently reached the same path. Philosophers before Smith and in his own time took completely different paths: Some said that humans have a special "moral sense" that can directly perceive good and evil; some said that all moral behaviors are disguised self-interest; and some said that moral truths can be rationally deduced like mathematical theorems. Each of these views captures an aspect of moral life but takes it as the whole truth. Their methods are narrow: rule-driven, not learning from experience, and failing when they encounter situations outside the training scope.