With a margin of error under 400 votes, a team led by a 16-year-old CTO used 5,000 AI agents to accurately predict a US election.
Can you know what people are thinking without ever talking to them? A group of founders born after 2000 is rewriting the research industry with AI.
In 2024, a team with an average age of 18 used about 5,000 AI conversations (each lasting only 30-90 seconds) to successfully predict the results of the Democratic primary in New York State, with a vote error of under 400 votes, at nearly zero cost.
In less than two years, Aaru, the AI research company these young people founded, has secured top-tier partners such as Accenture, Ernst & Young, and IPG, and at the end of 2025 closed a $50 million Series A at a $1 billion valuation.
Behind all this is a simple yet audacious idea: replacing the "finite sample" with "infinite simulation".
The core of Aaru is not making AI better at "asking questions" but teaching AI to "be human". The company has trained thousands of AI agents, each endowed with rich demographic attributes and behavioral-cognitive patterns, each functioning like a miniature of a real person.
When these "synthetic people" interact in the digital world, they can answer questions that were previously unanswerable, such as the collective reaction of a crowd to a new product, new policy, or new advertisement.
The "synthetic behavior" represented by Aaru is at the top of the technology stack. It is reshaping the $80 billion research market together with other explorers of "synthetic interaction" (such as Keplar, Outset) and "synthetic data" (such as Gretel, YData).
01 When AI Agents Think Like Humans
While most AI competitors are still racing to "collect human insights more efficiently", Aaru takes a different approach: what if, instead of relying on real people, we directly "synthesize" an unlimited number of digital agents that simulate human behavior and use them to predict group reactions?
Its core offering is "simulation prediction": a dynamic "model, simulate, predict" loop built around the question "what if...?"
The technical path is to train a large number of AI agents as a multi-agent system (MAS). These agents draw on structured and unstructured data from many sources, such as socioeconomic statistics, consumer behavior data, and social media sentiment signals.
Each agent not only carries labels such as age and income but is also endowed with behavioral patterns, decision-making motives, and even cognitive preferences, making it an individual "simulated user".
Combined, these agents form a dynamic, interactive knowledge base of human behavior. In other words, this is not just synthetic data; it is synthetic people.
For example, once Aaru has trained agents for a specific population segment such as "white-collar workers aged 25-30 in first-tier cities", it can simulate their decision logic, such as whether they would buy a company's new product or how they would react to a public event.
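To make the idea concrete, here is a minimal sketch, in Python, of polling persona-conditioned agents and aggregating their answers. Everything in it (the Persona fields, the toy decision rule) is our own illustrative assumption, not Aaru's actual system; in a real deployment the answer would come from a language model conditioned on the persona rather than a probabilistic stub.

```python
# Illustrative sketch (not Aaru's actual code): persona-conditioned agents
# answer a poll question, and the results are aggregated over the population.
import random
from dataclasses import dataclass

@dataclass
class Persona:
    age: int
    income: str          # e.g. "high", "middle", "low"
    city_tier: int       # 1 = first-tier city
    occupation: str
    risk_aversion: float # behavioral trait in [0, 1]

def ask_agent(persona: Persona, question: str) -> str:
    """Stand-in for an LLM call conditioned on the persona.

    In a real system the persona would be serialized into the system
    prompt and a language model would produce the answer; here a toy
    probabilistic rule lets the sketch run without an API key.
    """
    p_yes = 0.7 - 0.4 * persona.risk_aversion  # toy decision logic
    return "yes" if random.random() < p_yes else "no"

def simulate_poll(personas: list[Persona], question: str) -> float:
    answers = [ask_agent(p, question) for p in personas]
    return answers.count("yes") / len(answers)

if __name__ == "__main__":
    random.seed(0)
    population = [
        Persona(age=random.randint(25, 30), income="middle", city_tier=1,
                occupation="white-collar", risk_aversion=random.random())
        for _ in range(5000)  # the article cites ~5,000 agent conversations
    ]
    share = simulate_poll(population, "Would you buy this new product?")
    print(f"Predicted 'yes' share: {share:.1%}")
```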
What can these "synthetic people" do?
Aaru has found a "lighthouse scenario" that best showcases its advantages: political election prediction.
Using about 5,000 AI Q&A sessions (each taking only 30-90 seconds), it successfully predicted the results of the 2024 Democratic primary in New York State, missing the actual tally by fewer than 371 votes, reportedly at one-tenth the cost of traditional polls.
A traditional market research project of this kind might take weeks and cost hundreds of thousands of dollars.
The scenario offers public results, a short verification cycle, and a clear winner, so an accurate prediction at extremely low cost became "solid evidence" of the company's technical capabilities.
Aaru's accuracy has also been recognized. The Chief Solution Officer of IPG (Interpublic Group) commented that Aaru's accuracy is "higher than any website survey, poll, or focus group".
In addition to political elections, Aaru's applications extend to areas such as corporate decision-making and public strategy. Project scale is also flexible, supporting everything from small tests with a few agents to large-scale simulations of hundreds of thousands of agents.
Currently, Aaru's products are mainly divided into three categories:
① Lumen, for corporate decision-making simulation. It can simulate hard-to-reach groups such as corporate executives and high-net-worth customers, and is used for product concept testing, ultra-targeted marketing strategy verification, and the like. Example target audiences include "people who spend $30,000 a year on handbags" and "new parents with diabetes in rural markets".
② Dynamo, which simulates human behavior over time and focuses on election prediction. A large number of agents continuously receive and process information, mimicking how real voters consume media and update their views, allowing the simulation to replace traditional polls in predicting election results (see the sketch after this list).
③ Seraph, designed for the public sector. It supports configuring any time, place, and media environment, and is used to simulate public opinion and information dissemination in dynamic settings to assist high-risk decision-making.
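As a concrete illustration of the Dynamo-style mechanism described in ②, here is a minimal opinion-dynamics sketch: agents repeatedly ingest media items and shift their candidate preference. The update rule and all parameters are our assumptions for illustration, not Aaru's model.

```python
# Illustrative belief-update loop in the spirit of Dynamo (our sketch,
# not Aaru's implementation): each agent repeatedly ingests media items
# and nudges its candidate preference toward the item's slant.
import random

class VoterAgent:
    def __init__(self, lean: float, receptivity: float):
        self.lean = lean                # -1 (candidate A) .. +1 (candidate B)
        self.receptivity = receptivity  # how strongly media shifts the agent

    def consume(self, item_slant: float) -> None:
        # Simple opinion-dynamics update: move a fraction of the way
        # toward the item's slant, scaled by the agent's receptivity.
        self.lean += self.receptivity * (item_slant - self.lean)

def run_campaign(agents, media_stream, rounds: int) -> float:
    for _ in range(rounds):
        for agent in agents:
            agent.consume(random.choice(media_stream))
    return sum(a.lean > 0 for a in agents) / len(agents)  # B's vote share

if __name__ == "__main__":
    random.seed(1)
    electorate = [VoterAgent(lean=random.uniform(-1, 1),
                             receptivity=random.uniform(0.01, 0.1))
                  for _ in range(10_000)]
    media = [-0.8, -0.2, 0.1, 0.5, 0.9]  # slants of circulating stories
    print(f"Predicted vote share for B: {run_campaign(electorate, media, 30):.1%}")
```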
Currently, Aaru has established a "Simulation Studio" in cooperation with IPG.
Simply put, IPG will integrate Aaru's "population simulation" capability into Acxiom, its own consumer data platform. With data used legally and compliantly, the simulated population profiles become more detailed and closer to reality, helping brands with segmentation and market reach.
It is worth noting that the team driving this vision has an average age of just 18, and the company's CTO is only 16.
▲ Cam Fink, Ned Koh, John Kessler (from left to right)
Cam Fink, 20 years old, co-founder & CEO, with work/research experience at Kleiner Perkins, RSI, etc.;
Ned Koh, 20 years old, co-founder, formerly studied at Harvard University, with research experience at Northwestern University and startup co-founding experience;
John Kessler, 16 years old, co-founder & CTO.
Data is the new gold. With a method bordering on science fiction, Aaru is trying to disrupt a traditional research industry built on experience and samples, and the entry of industry giants as partners is already a signal that cannot be ignored.
02 Replacing the "Finite Sample" with "Infinite Simulation"
Behind the $80 billion research market is a large labor force. The traditional model is built on "sample, ask, tabulate", and its bottlenecks are the finiteness of samples, high costs, and lagging feedback.
AI is reshaping this industry in two ways:
Interview-enhancement companies
The first type of company focuses on the "front end" of the research process, using AI to run the interaction (the interview) while still talking to real people.
The barriers lie in natural-interaction technology and process automation. By gathering qualitative insights at scale and capturing non-verbal cues such as tone and facial expression, these companies aim for deeper emotional and behavioral insight.
① AI Voice Interview Research: Keplar
Keplar is an AI voice-interview platform that replaces traditional human-led interviews with voice AI. Its highlight is the authenticity of its multimodal conversations: the AI host conducts voice interviews under human-like personas such as "Ellie" and "Andrew", and participants often forget that the other party is an AI. In the recordings you can even hear participants naturally addressing the AI by name.
It turns any product question into an interview guide, pulls the customer list directly from the CRM, conducts hundreds of voice interviews, and analyzes answer themes in real time.
Compared with traditional research firms, it cuts the interview cycle from weeks to hours and the cost to a fraction. Its differentiation is a voice-first approach: it builds trust through vocal tone and pacing to elicit deeper emotional feedback, and its deliverables are presentation-ready decks and reports rather than raw data.
② Video In-depth Interviews: Listen Labs
Listen Labs is an AI user-research platform heavily backed by Sequoia Capital, with cumulative financing of $27 million. Its uniqueness is balancing depth and scale in video interviews: the AI host conducts video interviews, and participants can respond by video, voice, text, or screen sharing, restoring much of the richness of face-to-face interviews.
The core difference is the combination of video and AI analysis. The platform emphasizes "qualitative depth at quantitative scale": it can run hundreds of video interviews simultaneously while the AI automatically codes answers, identifies themes, and generates reports.
Listen Labs captures visual cues such as facial expressions, on-screen behavior, and environmental background, making it especially suitable for UX research and product testing. Its customers include large enterprises in consumer goods, healthcare, and other sectors.
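As an illustration of what "automatically codes answers and identifies themes" can mean in practice, here is a minimal sketch that clusters interview answers with TF-IDF vectors and k-means, a deliberately simple stand-in for the embedding models such platforms presumably use; it is not Listen Labs' pipeline.

```python
# Minimal sketch of automated thematic coding (our illustration, not
# Listen Labs' pipeline): group interview answers into themes by
# clustering TF-IDF vectors with k-means.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

answers = [
    "The app crashes whenever I upload a photo.",
    "Uploading images makes it freeze on my phone.",
    "I love the new dark mode, it is easy on the eyes.",
    "Dark theme looks great, please keep it.",
    "Checkout took too many steps, I almost gave up.",
    "Paying was confusing, too many screens.",
]

vectors = TfidfVectorizer(stop_words="english").fit_transform(answers)
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(vectors)

# Print each discovered theme with its member answers.
for theme in range(3):
    print(f"Theme {theme}:")
    for answer, label in zip(answers, labels):
        if label == theme:
            print(f"  - {answer}")
```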
③ AI-Hosted Interviews: Outset
Outset focuses on AI-led in-depth interviews and has raised $21 million in total, led by 8VC with participation from Bain Capital. Its platform lets an AI host converse with thousands of participants via video or voice and automatically synthesizes the results.
The core highlight is scale and speed: where 25 in-depth interviews traditionally take 4-6 weeks, Outset can complete 250 interviews in one week and analyze them automatically, an 8x speedup at 81% lower cost.
Its differentiation is full automation of the research process, from creating discussion guides and recruiting respondents to analyzing results and generating reports. The research team only inputs the research questions, and the platform handles the rest.
Customers include Fortune 500 companies such as Nestlé, Microsoft, and Weight Watchers. Compared with Listen Labs, Outset emphasizes end-to-end automation and enterprise-grade integration, suiting complex projects that need customers' "why" quickly and at scale.
④ Neuromarketing AI Platform: Neurons
Neurons focuses on predicting advertising and creative effectiveness, building on cognitive neuroscience, machine learning, and psychology. Its highlight is attention prediction in seconds: upload an ad asset and the AI generates a heat map within seconds, predicting how the audience's attention will be distributed and scoring KPIs such as engagement and ad recall.
The platform serves advertising agencies and brand marketing teams, addressing the question of "will this creative actually work?" and reducing rounds of revision.
Unlike interview platforms that collect what users "say", Neurons measures what users "see" and predicts subconscious reactions. Its core value is data-driven creative decision-making: identifying the best assets before launch, reducing risk, and improving ROI.
⑤ AI User Research Platform: Synthetic
The core of Synthetic is collecting product feedback through simulated interviews. Its technology is a multi-agent system built on models such as GPT, LLaMA, and Mistral. Each synthetic user is parameterized with the Five-Factor Model (FFM) of personality to simulate cognitive biases and behavioral patterns, and adjusts its trust and tone in conversations with real people.
The platform lets enterprises upload proprietary data such as historical interviews and customer-service tickets to ground the synthetic users in real customer backgrounds.
Its customers come mainly from industries such as pharmaceuticals, automotive, and consumer goods. The platform is SOC 2 certified and provides an API. In one pharmaceutical case, the expert-interview cycle was shortened from 3 months to a few hours. The company publicly states that its synthetic results match real user insights about 85-92% of the time.
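To show how an FFM profile might parameterize a synthetic respondent, here is a sketch in which Big Five trait scores are rendered into a system prompt. The trait names are standard FFM; the data structure, thresholds, and prompt format are our assumptions, not Synthetic's design.

```python
# Sketch of an FFM-parameterized synthetic respondent (our assumption of
# how such a system might work, not Synthetic's actual design): the Big
# Five trait scores are rendered into the system prompt that conditions
# the interview model.
from dataclasses import dataclass

@dataclass
class FFMProfile:
    openness: float           # each trait scored in [0, 1]
    conscientiousness: float
    extraversion: float
    agreeableness: float
    neuroticism: float

def system_prompt(profile: FFMProfile, background: str) -> str:
    def level(x: float) -> str:
        return "high" if x > 0.66 else "moderate" if x > 0.33 else "low"
    traits = ", ".join(
        f"{name}: {level(score)}" for name, score in vars(profile).items()
    )
    return (
        "You are a synthetic interview respondent.\n"
        f"Background: {background}\n"
        f"Personality (Big Five): {traits}\n"
        "Answer in character; let these traits shape your trust and tone."
    )

if __name__ == "__main__":
    skeptic = FFMProfile(0.4, 0.8, 0.2, 0.3, 0.7)
    print(system_prompt(skeptic, "hospital procurement manager, 15 years"))
```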
Synthetic data companies
The second type of company focuses on the "back end" of the stack: the data itself. The barriers lie in data fidelity, privacy compliance, and system integration; these companies supply safe, high-quality "fuel" for upper-layer models and traditional analysis.
① Developer-friendly Synthetic Data API Platform: Gretel Labs
Gretel Labs' core highlight is instant generation with privacy assurance. It provides SDKs and APIs for engineers that can be embedded seamlessly into existing data pipelines, so high-fidelity synthetic data can be generated with just a few lines of code. Its customizable generative AI models can synthesize text and time-series data while maintaining the integrity of cross-table relationships, suiting complex scenarios such as financial transactions and medical records.
Gretel serves technology companies such as Techstars and HelloFresh, meeting high-frequency needs like development testing and data sharing. Its barriers lie in low-friction integration and model generalization rather than complex UIs or consulting services.
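To illustrate what "a few lines of code inside an existing pipeline" might look like, here is a deliberately hypothetical sketch. SyntheticClient and its fit/sample methods are invented for illustration and are not Gretel's real SDK; a production generator would learn joint structure across columns and tables and enforce privacy guarantees rather than naively resampling.

```python
# Hypothetical sketch of embedding synthetic-data generation in a data
# pipeline. SyntheticClient and its methods are invented names for
# illustration; they are NOT Gretel's real SDK.
import pandas as pd

class SyntheticClient:
    """Toy stand-in: memorize per-column values, then sample new rows."""

    def fit(self, df: pd.DataFrame) -> "SyntheticClient":
        self._columns = {c: df[c] for c in df.columns}
        return self

    def sample(self, n: int) -> pd.DataFrame:
        # Naive column-wise resampling; a real generator would also learn
        # joint structure and add differential-privacy-style guarantees.
        return pd.DataFrame(
            {c: s.sample(n, replace=True).to_numpy()
             for c, s in self._columns.items()}
        )

real = pd.DataFrame({
    "amount": [12.5, 99.0, 7.2, 45.0],
    "merchant": ["grocer", "airline", "cafe", "grocer"],
})
synthetic = SyntheticClient().fit(real).sample(10)
print(synthetic.head())
```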
② Enterprise-level Synthetic Data Platform: Tonic.ai
Tonic.ai focuses on providing de-identified stand-ins for production data to Fortune 500 companies. Its differentiation is database subsetting with relational fidelity: it can extract a representative subset from petabyte-scale production databases while preserving cross-table foreign-key associations, timestamp logic, and business-process integrity, which is indispensable for testing complex enterprise systems (ERP, CRM).
Technically, it uses structure-aware generation: it first parses the database schema and then trains the generative model table by table, ensuring the synthetic data is 100% structurally compatible with the original system. Its customers include Adobe, eBay, and others. The core value is replacing traditional data masking in a compliant way, avoiding the insufficient test coverage that masking and encryption cause.
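The core subsetting idea fits in a few lines: choose seed rows in a child table, then pull in every parent row they reference so no foreign key dangles. This is our simplified illustration of the concept, not Tonic.ai's algorithm.

```python
# Illustrative sketch of foreign-key-preserving database subsetting (our
# simplification, not Tonic.ai's algorithm): pick seed rows, then pull in
# every parent row they reference so referential integrity survives.
orders = [  # child table: (order_id, customer_id)
    (1, "c1"), (2, "c2"), (3, "c1"), (4, "c3"), (5, "c2"),
]
customers = {  # parent table keyed by customer_id
    "c1": "Ada", "c2": "Grace", "c3": "Edsger",
}

def subset(orders, customers, keep_order_ids):
    """Return a subset of both tables with no dangling foreign keys."""
    order_subset = [o for o in orders if o[0] in keep_order_ids]
    needed_customers = {cust_id for _, cust_id in order_subset}
    customer_subset = {k: v for k, v in customers.items()
                       if k in needed_customers}
    return order_subset, customer_subset

sub_orders, sub_customers = subset(orders, customers, keep_order_ids={1, 4})
print(sub_orders)     # [(1, 'c1'), (4, 'c3')]
print(sub_customers)  # {'c1': 'Ada', 'c3': 'Edsger'}
```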
③ Data Privacy and Analysis Enhancement Platform: YData
YData's uniqueness lies in closing the loop between synthetic data and data quality. It not only generates data but first diagnoses data defects (missing values, biases, class imbalance) and then synthesizes supplementary samples accordingly to improve model training results.
Its Fabric platform covers the whole pipeline from data annotation and generation to model training, making it particularly suitable for data-quality-sensitive fields such as autonomous driving and financial risk control.
Unlike most synthetic-data tools, which focus on privacy protection, YData positions itself as an "AI development accelerator": its synthetic data should not merely "look like" real data but should make downstream AI learn better and predict more accurately.
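A minimal version of that diagnose-then-synthesize loop might look like the sketch below, which detects class imbalance and tops up the minority class with jittered resamples. It is our illustration of the concept, not YData's Fabric product.

```python
# Minimal sketch of a diagnose-then-augment loop (our illustration, not
# YData's Fabric): detect class imbalance, then synthesize extra
# minority-class rows by jittered resampling.
import numpy as np
import pandas as pd

def diagnose(df: pd.DataFrame, label: str) -> pd.Series:
    counts = df[label].value_counts()
    print("Class balance:\n" + counts.to_string())
    return counts

def augment_minority(df: pd.DataFrame, label: str) -> pd.DataFrame:
    rng = np.random.default_rng(0)
    counts = df[label].value_counts()
    minority, majority = counts.idxmin(), counts.idxmax()
    deficit = counts[majority] - counts[minority]
    # Resample minority rows, then add small Gaussian noise to the
    # numeric feature columns so the new rows are not exact copies.
    noisy = df[df[label] == minority].sample(deficit, replace=True,
                                             random_state=0).copy()
    num_cols = noisy.select_dtypes("number").columns.drop(label, errors="ignore")
    noisy[num_cols] += rng.normal(0, 0.01, size=noisy[num_cols].shape)
    return pd.concat([df, noisy], ignore_index=True)

df = pd.DataFrame({"x": np.r_[np.zeros(90), np.ones(10)],
                   "fraud": [0] * 90 + [1] * 10})
diagnose(df, "fraud")
balanced = augment_minority(df, "fraud")
diagnose(balanced, "fraud")
```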
Whether at the "front end" or the "back end", all of this points to one transformation: market research is moving from passive collection built on "finite samples" to active prediction built on "infinite simulation". An AI-driven research era that delivers both speed and depth looks like an irreversible trend.
This article is from the WeChat public account "Silicon-based Observation Pro", author: Silicon-based Jun. Republished by 36Kr with permission.