High-EQ AI Agent Arrives: Cambridge Team Unveils EvoEmo, an Evolutionary RL Framework That Bargains Using Anger and Sadness


In daily life, negotiations are everywhere: When shopping online, you might say, "I'll place an order if the seller drops the price by $50." When renting an apartment, you could try to persuade the landlord to accept "a one-month deposit and monthly payment." In the workplace, you finalize cooperation details with clients... At such times, "emotional skills" are often the key. For example, pretending to hesitate and saying, "My budget is really tight," or moderately expressing your expectations can often steer the negotiation in a direction more favorable to you.

However, negotiation is a major challenge for AI.

Most existing LLM Agents ignore the functional role of emotion in negotiations. The emotional responses they generate are often passive and preference-driven, making them vulnerable to manipulation and exploitation by opponents. Even the most advanced LLMs often stumble during multi-round negotiations: They either remain "polite throughout" and are easily taken advantage of, or they cannot tell whether the other party is genuinely anxious or just pretending, and readily make concessions.

Recently, a team from the University of Cambridge and its collaborators proposed an evolutionary reinforcement learning framework called "EvoEmo," which fills the "emotional negotiation" gap for LLMs. The research paper has been published on the preprint server arXiv.

Paper link: https://arxiv.org/abs/2509.04310

Extensive experiments and ablation studies show that EvoEmo improves success rate, efficiency, and cost savings for buyers. This finding underlines the crucial role of adaptive emotional expression in making LLMs more effective in multi-round negotiations.

Traditional LLM Negotiations: Three Major Shortcomings Holding LLMs Back

Why have previous AI negotiations always been less than satisfactory?

Extensive behavioral research shows that human decision-making systematically deviates from the assumption of pure rationality in classical economics. It is dynamically shaped by psychological biases and emotional states, rather than just stable personality traits.

Although modern LLMs have made progress in replicating personality-driven behavior patterns through Chain of Thought (CoT), the role of emotion in decision-making remains relatively under-researched, especially compared with methods based on static human traits. However, in delicate negotiation scenarios such as price bargaining, emotional dynamics play a key role. Emotions directly influence tactical choices and immediately affect negotiation outcomes. In contrast, personality traits can only capture broad behavioral tendencies but cannot explain adaptive and immediate dynamic changes.

According to the paper, compared with human negotiators, LLMs have three fundamental flaws:

Firstly, tactical inflexibility. Human negotiators can dynamically adjust emotional signals and flexibly change tactics based on the opponent's reaction. If the seller is tough, they may deliberately show "disappointment" or "giving up" to apply pressure; if the seller relents, they will quickly use "gratitude" to consolidate the results. However, LLMs usually default to a static response mode and only respond in a fixed pattern. No matter what the seller says, they will only mechanically "request a price reduction," making their behavior both predictable and exploitable.

Secondly, adversarial naivety. LLMs' strong emotion recognition becomes a fatal weakness: although they can recognize signals such as frustration or empathy, they cannot distinguish genuine emotions from manipulative tactics, such as feigned urgency in price negotiations. Faced with such tactics, LLMs often concede readily and have little ability to resist.

Thirdly, strategic myopia. Before negotiating, humans build up emotions and actively shape the emotional trajectory of the interaction. For example, they may chat with the seller about daily life, praise the product quality, and only then ask for a price reduction after building a good impression; during the negotiation, they also control the rhythm and do not reveal their bottom line immediately. LLMs, lacking the ability to reason about emotional causality, remain reactive rather than proactive when managing emotional dynamics. They can only respond one step at a time and struggle to take the initiative in the negotiation.

The above three flaws explain why LLMs with strong reasoning abilities may perform worse than humans in emotion-sensitive negotiations, especially in "bargaining," where strategic emotional regulation is of utmost importance.

EvoEmo: Building an "Emotional Evolution Pipeline" for AI

The EvoEmo framework is an evolutionary reinforcement learning framework for optimizing emotional strategies in multi-round, emotion-sensitive negotiations. This method discovers the optimal emotional transition rules through a population-level evolutionary learning mechanism and iteratively optimizes the strategy based on the rewards obtained during the negotiation process. Evolutionary operations (including crossover and mutation) can efficiently explore the strategy space and spread high-reward emotional strategies. EvoEmo combines the exploration advantages of population optimization with the sequential decision-making framework of reinforcement learning, providing an effective way to evolve complex emotional strategies.

In other words, the core idea of the EvoEmo framework is very simple: Since AI can't learn to use emotions flexibly on its own, let it continuously evolve in "actual combat." Just like biological evolution, good emotional strategies will be retained, and bad ones will be eliminated, gradually screening out the optimal solution.

Figure | Schematic diagram of the working process of the EvoEmo framework

The effectiveness of this framework lies in the following designs, which make AI's emotional decision-making "rule-based":

Firstly, an emotion-aware MDP. The EvoEmo framework formalizes the negotiation process as a Markov decision process (MDP) defined by states, actions, a policy, and rewards, and divides the emotions in negotiation into 7 basic types: anger, disgust, fear, happiness, sadness, surprise, and neutral. Each emotion corresponds to different negotiation intentions. For example, "moderate anger" can express dissatisfaction with the price, "neutral" is suitable for rational communication of details, and "surprise" can consolidate the results when the seller makes a concession, so that AI's emotional expression is no longer chaotic.
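To make the formalization concrete, here is a minimal Python sketch of how the seven emotions and a negotiation state might be represented. The seven emotion names come from the article; the `NegotiationState` fields and the `step` helper are hypothetical illustrations, and the paper's actual state definition may differ.

```python
from dataclasses import dataclass, field
from enum import Enum


class Emotion(Enum):
    # The seven basic emotions named in the article.
    ANGER = "anger"
    DISGUST = "disgust"
    FEAR = "fear"
    HAPPINESS = "happiness"
    SADNESS = "sadness"
    SURPRISE = "surprise"
    NEUTRAL = "neutral"


@dataclass
class NegotiationState:
    # Hypothetical state fields, chosen for illustration only.
    turn: int = 0
    current_offer: float = 0.0
    emotion_history: list = field(default_factory=list)


def step(state: NegotiationState, action: Emotion, new_offer: float) -> NegotiationState:
    """Apply one action (an emotion choice) and record the offer that followed it."""
    return NegotiationState(
        turn=state.turn + 1,
        current_offer=new_offer,
        emotion_history=state.emotion_history + [action],
    )
```

In this sketch the action is simply the emotion the Agent chooses to express in its next message; the reward would be computed once the dialogue ends.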

Secondly, systematic strategy composition. Each negotiation strategy encodes three core components that govern the Agent's emotional behavior: an emotional trajectory, a temperature parameter, and an emotional transition matrix. The combination of these components makes AI's emotional decision-making both planned and flexible.
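One way these three components could fit together is sketched below in Python. The article does not say how the temperature enters the computation, so the temperature-scaled softmax over a transition-matrix row is an assumption, as are the names `EmotionPolicy`, `next_emotion_probs`, and `sample_next_emotion`.

```python
import math
import random
from dataclasses import dataclass

EMOTIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise", "neutral"]


@dataclass
class EmotionPolicy:
    trajectory: list     # emotions expressed so far, most recent last
    temperature: float   # must be > 0; higher values give more random choices
    transition: list     # 7x7 matrix; row i holds P(next emotion | current emotion i)


def next_emotion_probs(policy: EmotionPolicy) -> list:
    """Temperature-scaled distribution over the next emotion (assumed scheme)."""
    row = policy.transition[EMOTIONS.index(policy.trajectory[-1])]
    # Softmax over log-probabilities divided by the temperature.
    logits = [math.log(max(p, 1e-12)) / policy.temperature for p in row]
    peak = max(logits)
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


def sample_next_emotion(policy: EmotionPolicy, rng: random.Random) -> str:
    """Draw the next emotion to express from the scaled distribution."""
    return rng.choices(EMOTIONS, weights=next_emotion_probs(policy), k=1)[0]
```

With a temperature of 1 the row is used unchanged; values below 1 sharpen the distribution toward the most likely transition, and values above 1 flatten it.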

Thirdly, a principled reward mechanism. Evolutionary optimization evaluates strategies through a reward function, which can be interpreted as a fitness score measuring the effectiveness of a negotiation. Each negotiation is scored: A basic score is given for reaching a deal, and more points are added the more money the buyer saves and the fewer rounds are used. This "success rate + cost savings + efficiency" standard discourages the AI from either bargaining too hard just to save money or conceding too easily just to finish quickly, and pushes it toward the best balance point.

Finally, improved reinforcement learning. The EvoEmo framework transforms the optimization of emotional strategies into an evolutionary reinforcement learning task. Through generation-by-generation evaluation and a population optimization mechanism, it continuously improves the emotional transition parameters of the strategy. In each iteration, candidate strategies are first deployed in a multi-round dialogue simulation environment, which is jointly constructed from the LLM and a set of interaction prompts. Each executed strategy produces a complete sequence of emotional states and dialogue, and its effect is quantitatively evaluated through the reward function. After this evaluation stage, the system selects strategies for further optimization probabilistically.
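The scoring idea can be sketched as a small Python function. The article gives only the three ingredients (success, savings, efficiency), not the formula, so the additive form, the weights, and the `max_rounds` cap below are all hypothetical.

```python
def negotiation_reward(success, seller_price, final_price, rounds,
                       max_rounds=20, base=1.0, w_saving=1.0, w_speed=0.5):
    """Hypothetical fitness: base score plus savings and speed bonuses; 0 on failure."""
    if seller_price <= 0:
        raise ValueError("seller_price must be positive")
    if not success:
        return 0.0
    # Fraction of the seller's target price that the buyer saved.
    saving = (seller_price - final_price) / seller_price
    # Fewer rounds gives a larger bonus; max_rounds is an assumed cap.
    speed = 1.0 - rounds / max_rounds
    return base + w_saving * saving + w_speed * speed
```

Under these assumed weights, a deal closed at $80 against a $100 target in 5 rounds scores higher than the same deal reached in 10 rounds, and a failed negotiation scores zero regardless of the offers made.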

The entire evolutionary process is like an "assembly line" for emotional strategies: First, initialize a batch of random emotional strategies, let them participate in negotiations and be scored respectively; then keep the strategies with good performance, generate new strategies by combining the advantages of two good strategies and randomly adjusting some parameters; then let the new strategies participate in negotiations and be scored... Iterate repeatedly until the most powerful emotional strategy is found.
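The loop described above can be sketched as a generic evolutionary search in Python. This is an illustration under stated assumptions, not the authors' implementation: the dictionary-based policy encoding, the row-wise crossover, the Gaussian mutation, the elitism, and `toy_fitness` (a stand-in for the reward obtained from simulated LLM negotiations) are all hypothetical.

```python
import random

N = 7  # number of basic emotions


def normalize(row):
    total = sum(row)
    return [x / total for x in row]


def random_policy(rng):
    """Random temperature plus a random row-stochastic 7x7 transition matrix."""
    return {
        "temperature": rng.uniform(0.1, 2.0),
        "transition": [normalize([rng.random() + 1e-6 for _ in range(N)]) for _ in range(N)],
    }


def crossover(a, b, rng):
    """Child takes each transition row, and the temperature, from one parent at random."""
    rows = [list(rng.choice((ra, rb))) for ra, rb in zip(a["transition"], b["transition"])]
    return {"temperature": rng.choice((a["temperature"], b["temperature"])), "transition": rows}


def mutate(policy, rng, rate=0.1, scale=0.1):
    """Perturb some rows (then renormalize) and occasionally the temperature."""
    rows = []
    for row in policy["transition"]:
        if rng.random() < rate:
            row = normalize([max(x + rng.gauss(0, scale), 1e-6) for x in row])
        rows.append(list(row))
    temp = policy["temperature"]
    if rng.random() < rate:
        temp = min(max(temp + rng.gauss(0, scale), 0.1), 2.0)
    return {"temperature": temp, "transition": rows}


def evolve(fitness, generations=20, pop_size=20, elite=2, seed=0):
    rng = random.Random(seed)
    population = [random_policy(rng) for _ in range(pop_size)]
    for _ in range(generations):
        # Score every candidate once, best first.
        scored = sorted(((fitness(p), p) for p in population),
                        key=lambda pair: pair[0], reverse=True)
        ranked = [p for _, p in scored]
        weights = [max(s, 1e-9) for s, _ in scored]
        children = ranked[:elite]  # keep the best strategies unchanged
        while len(children) < pop_size:
            # Fitness-proportional (probabilistic) parent selection.
            mum, dad = rng.choices(ranked, weights=weights, k=2)
            children.append(mutate(crossover(mum, dad, rng), rng))
        population = children
    return max(population, key=fitness)


def toy_fitness(policy):
    # Stand-in objective: average probability of moving to emotion index 0.
    return sum(row[0] for row in policy["transition"]) / N
```

Calling `evolve(toy_fitness)` returns the best policy found. In a real setting the fitness call would run a full simulated negotiation, which is expensive and noisy, so the number of evaluations per generation becomes the main cost.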

Bargaining with Anger and Sadness

To test the effectiveness of EvoEmo, the research team conducted a rigorous set of experiments: They selected a subset of negotiation cases from the CraigslistBargain dataset for evaluation, comprising 20 multi-round negotiation scenarios across categories such as electronics, furniture, cars, and housing. Each scenario includes three elements: product details, a specific target price set by the seller, and emotional annotations reflecting real bargaining dynamics. The scenarios cover a wide price range from $50 to $5000 and include products in different conditions, such as brand-new or second-hand, so as to comprehensively evaluate the effectiveness of negotiation strategies in different market environments.

The research team selected three mainstream LLMs, GPT-5-mini, Gemini-2.5-Pro, and DeepSeek-V3.1, to drive the buyer and seller Agents in the experiment.

During the evaluation process, the researchers defined two baseline models for comparison: The first baseline only includes standard Agents, and neither the buyer nor the seller receives emotional guidance. This setting ensures that both parties act entirely based on their internal emotional tendencies and strategic reasoning abilities, thus providing a reference baseline reflecting default negotiation behavior.

The second baseline pairs a standard seller with a fixed-emotion buyer, who maintains a constant emotional state throughout the negotiation. By comparing these baselines with the setting where the buyer's emotions are optimized by EvoEmo, the impact of emotions on negotiation outcomes can be quantified, and the effectiveness of EvoEmo in enhancing LLM-based, emotion-driven negotiations can be evaluated.

The experimental results confirm the effectiveness of EvoEmo: EvoEmo consistently achieves the highest cost-saving rate for buyers in all buyer-seller pairings, significantly outperforming the baseline models (ordinary setting and fixed-emotion setting).

Figure | The percentage of cost savings for buyers in the negotiation results of 9 buyer-seller pairs. The black vertical line at the top of each bar represents the 95% confidence interval (CI) for each setting.
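For readers who want to reproduce this kind of summary on their own negotiation logs, the sketch below computes a buyer's saving percentage and a 95% confidence interval. The article does not state how the intervals were calculated, so the normal-approximation formula (1.96 times the standard error) is an assumption, and the function names are hypothetical.

```python
import math
import statistics


def saving_pct(seller_price, final_price):
    """Buyer's saving as a percentage of the seller's target price."""
    return 100.0 * (seller_price - final_price) / seller_price


def mean_ci95(values):
    """Mean and half-width of a normal-approximation 95% CI (needs >= 2 values)."""
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, 1.96 * sem
```

With only 20 scenarios per setting, a t-distribution interval would be slightly wider than this approximation.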

In addition, the research results also reveal two interesting findings:

Firstly, in terms of emotional strategies, buyers using fixed negative emotions (such as anger and sadness) usually perform better than the ordinary baseline. This effect is particularly pronounced for buyers who continuously express disgust or sadness, indicating that LLM seller Agents are more likely to make concessions when faced with continuous negative emotional signals.

This finding highlights the important role of sustained negative emotions in shaping negotiation dynamics and outcomes. By contrast, buyers with fixed positive emotions, such as happiness and surprise, save less money than the baseline. This indicates that when seller Agents interpret the buyer's emotions as positive, they defend the price more effectively, perhaps judging that there is no urgent need to make concessions.

Figure | Negotiation success rate (%) and negotiation efficiency (number of dialogue rounds) for 9 buyer-seller pairs.

Secondly, the performance differences among different language models (LLMs) are significant.

On the seller side, the Gemini-2.5-Pro model shows the strongest price-defense ability against ordinary buyers and buyers with fixed emotions, but it is still vulnerable when facing emotion-adaptive buyers optimized by EvoEmo. The results on the buyer side vary by model: Gemini-based buyers achieve the maximum cost savings when negotiating with GPT-5-mini sellers, while GPT-5-mini buyers perform best when dealing with DeepSeek-V3.1 sellers. It is worth noting that no buyer model shows a significant advantage when negotiating with the robust Gemini-2.5-Pro seller, highlighting its strength as a challenging negotiation opponent.

Table | Comparison of negotiation performance among different reward function formulas

Buyers with emotional configurations optimized by EvoEmo consistently maintain a success rate close to 100% and are more efficient than buyers using the ordinary or fixed-emotion settings, with significantly fewer rounds required to reach an agreement. These results demonstrate the advantages of EvoEmo over both baselines.

High-EQ AI? Still a Long Way Off

The above research results show that emotion is an important factor in successful negotiations. Compared with the ordinary baseline and the fixed-emotion baseline, the emotional strategies optimized by EvoEmo consistently improve negotiation performance, with a higher success rate, greater efficiency, and more cost savings for buyers. The results indicate that the ability to dynamically adjust emotional states is crucial for effective multi-round bargaining, enabling Agents to use emotional intelligence strategically in negotiations.

Of course, EvoEmo is not perfect and still has some limitations:

Limitations of emotional spectrum and baseline comparison. This study only examined 7 basic emotional states and may not fully capture the complexity of human emotional expression in real negotiations. In addition, the baseline comparisons are limited to fixed-emotion strategies and emotion-neutral strategies, omitting potentially valuable comparison schemes such as random emotional sequences.

Situational dependence and generalization challenges. The evaluation is based on 20 daily negotiation scenarios and focuses on the traditional business field, raising questions about potential selection bias and limited generalization ability. The effectiveness of EvoEmo in diverse negotiation scenarios has not been verified, especially in high-risk, emotionally intense fields, where the emotional dynamics may differ significantly from the standard business environment.

Interpretability of emotional strategies. Due to the black-box nature of LLM responses and the evolutionary optimization mechanism, it is difficult to explain why a specific emotional sequence achieves results in a specific negotiation scenario.

The gap between simulation and reality. Simulation verification based on LLMs may not capture human expertise, and the high computational intensity limits real-time adaptation in actual deployment.

In addition, future work will also explore the ethical implications and behavioral consistency of evolutionary strategies, and pay special attention to the emergence of deceptive or compromising behaviors.

However, it cannot be denied that EvoEmo has pointed out a new direction for the development of AI emotional intelligence. In the near future, it may well be a high-EQ AI that helps you bargain when shopping online or conducts cross-border trade negotiations on your behalf.

This article is from the WeChat official account "Academic Headlines" (ID: SciTouTiao)

The views in this article are solely those of the author; the 36Kr platform only provides information storage services.
