
In-depth Analysis of 151 Job Descriptions: Unveiling "Data Labelers" – Polarization, 30-Fold Salary Gap, and a Harsh Future

硅星人Pro, 2026-06-08 11:59
Is it a good job?

At nine in the morning, Xiaolin put on her headphones and opened the annotation platform. A Mandarin voice with a Sichuan accent came into her ears.

She had to transcribe this voice word by word first, then mark the pronunciation deviations, abnormal intonations, and dialect feature words. Finally, she had to judge where the AI's recognition results were correct and where they went wrong. It sounded like she was listening to a podcast, and others might think she was slacking off, but this was her job.

Her official title is "data annotator", but she prefers to say she is an "AI trainer" – it sounds cooler. After all, in most people's perception, this job is like an assembly line in the AI era: sitting in front of a computer, mechanically clicking the mouse, drawing boxes and adding labels day after day. It doesn't require much technical skill and is a typical "human battery".

But once someone asks "what exactly do you do", Xiaolin usually pauses for two seconds and then replies, "Well... teaching AI to understand human language." She can't say much more than that.

Perhaps the job description in the following resume of a practitioner can roughly illustrate what they do every day.

For a growing number of young people who want to enter the AI industry, data annotation is becoming an entry-level job worth considering. How did this job come into being, what does the industry look like as a whole, and where are its practitioners headed? We retrieved 302 Beijing-based positions under the keyword "data annotation" on Boss Zhipin and dissected the 151 that had complete job descriptions one by one.

1

The monthly salary for the same annotation job can vary by thirty times

ChatGPT can write poems because annotators evaluate each line, saying "this line is good, that line is bad"; autonomous driving can recognize traffic lights at intersections because someone traces the boundaries pixel by pixel on tens of thousands of street-view images. When you say to an agent, "Play Jay Chou's songs", it understands and executes in a second. Behind this are thousands of voice commands with accents, environmental noise, and elisions that have been manually annotated.

Now, for more natural interaction, the complexity of voice annotation is increasing exponentially. It's no longer just about converting sounds into text, but also about marking emotions, intentions, and pragmatic scenarios, and even marking the subtle differences in dialects so that the model can truly learn to "understand human language".

Every flash of intelligence is supported by human hands. How much the owners of these hands earn and how long they can do the job is another matter. Let's first look at the income.

In 151 complete job descriptions, the median monthly salary for data annotation positions in Beijing is 10,500 yuan, the lowest is 2,000 yuan, and the highest is 65,000 yuan – a difference of more than thirty times between the two ends.

Most of the low-paying jobs are internships, part-time work, and crowdsourcing. There are 84 positions paid by the day, with a median of 185 yuan per day. The recruitment posts say "data annotation, two-day weekend, suitable for beginners", set no education or experience requirements, and offer a monthly salary of 4,000-5,000 yuan. At the other end of the spectrum, Baidu offers 500-600 yuan per day for autonomous-driving data annotation algorithm interns and requires a master's degree; Alibaba's AI trainer position offers 20,000-35,000 yuan a month, paid 16 times a year, and also requires a master's degree.
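To put these figures on a common scale, here is a minimal Python sketch. The salary numbers are the ones quoted in this article; the 21.75 paid working days per month is an assumption (a conversion figure commonly used for wage calculations in China), and the helper names are made up for illustration.

```python
# Figures quoted in the article, in yuan.
LOWEST_MONTHLY = 2_000
HIGHEST_MONTHLY = 65_000
MEDIAN_DAILY = 185

# Assumption, not from the article: 21.75 paid working days per month.
WORKING_DAYS_PER_MONTH = 21.75


def daily_to_monthly(daily_wage, days=WORKING_DAYS_PER_MONTH):
    """Convert a daily wage to a rough full-time monthly equivalent."""
    return daily_wage * days


def annual_pay(monthly_salary, months_paid=12):
    """Annual pay when a job pays `months_paid` monthly salaries per year."""
    return monthly_salary * months_paid


gap = HIGHEST_MONTHLY / LOWEST_MONTHLY
median_daily_as_monthly = daily_to_monthly(MEDIAN_DAILY)
alibaba_low = annual_pay(20_000, months_paid=16)
alibaba_high = annual_pay(35_000, months_paid=16)

print(f"Top-to-bottom gap: {gap:.1f}x")
print(f"185 yuan/day is about {median_daily_as_monthly:,.0f} yuan/month")
print(f"Alibaba range per year: {alibaba_low:,} to {alibaba_high:,} yuan")
```

Under that assumption, the median day rate works out to roughly 4,000 yuan a month, in line with the 4,000-5,000 yuan quoted in the beginner postings.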

For what is nominally the same annotation job, pay can differ by a factor of ten or more. The gap comes from a split in the work itself: the low-paying end is about execution, following requirements and standard operating procedures (SOPs); the high-paying end is about definition, setting annotation rules, managing quality standards, and closing the loop between algorithms and data. The former can be easily replaced, while the latter is difficult to replicate.

A veteran with six years in the annotation industry put it bluntly: In 2016, annotators were craftsmen, and experienced workers were in high demand; now, annotators are assembly-line workers, and anyone can do the job. You're just an account.

2

From "drawing boxes" to making decisions for AI

The 151 job descriptions can be divided into four main modalities.

Text annotation accounts for 16%, including corpus cleaning, dialogue quality assessment, and multilingual translation proofreading.

In the era of large models, demand for this type of work has skyrocketed. Tencent is recruiting for "large-model data annotation in the code direction". Applicants need to understand code, judge where AI-written programs are good and where they contain bugs, and then correct them using their own judgment.

Image and video annotation accounts for 17%, including box selection, point marking, segmentation, and key-point annotation. Autonomous driving is a major area.

An "Intelligent Driving Data Annotation Algorithm Engineer" position at a large automobile company requires processing 4D point clouds and LiDAR data and outputting temporally consistent BBox ground truth, with a monthly salary of 40,000-70,000 yuan, paid 15 times a year. This is no longer just "drawing boxes"; it requires an understanding of sensor principles and three-dimensional spatial relationships.

Voice and audio annotation has the smallest share on the recruitment platform, only 1%, but the requirements are anything but low.

At the beginning of June this year, Elon Musk's xAI recruited Chinese AI tutors globally to train Grok's Chinese voice ability. The requirements include native-level Chinese, familiarity with dialects and regional accent differences, and the ability to do voice transcription, pronunciation correction, and audio annotation, with an hourly wage of 35-45 US dollars in the United States. At the same time, JD Technology is recruiting annotators for French, German, and Dutch. A TEM-8 certificate is the minimum requirement, and candidates need to be able to identify pronunciation errors and intonation deviations, a standard comparable to that of linguistic research.

Multi-modal and comprehensive annotation accounts for the largest share, 36%. A single position may involve text, images, audio, and video at the same time, which is common in large-model data teams. Single-skill workers are becoming ever easier to replace, and all-round players are in greater demand.

Looking at these 151 job descriptions by business area, the distribution is highly concentrated:

Large-model/AIGC corpora alone account for 28%, and autonomous driving and medical imaging each account for 7%. Nearly 30% of the positions are "feeding" large models. The arms race has become white-hot. The number of GPUs is no longer the only bargaining chip. Whoever has data closer to real human behavior has a greater chance of winning.

The entry requirements have also changed. Nearly 90% of the positions require a bachelor's or associate degree, so on paper this is still a low-barrier job. But of the 13 positions that require a master's degree, almost all are in large-model evaluation, algorithm support, and overseas multilingual directions. Shanda Network's "expert-level data annotator" offers a daily wage of 400-800 yuan and requires a master's degree and the ability to work remotely; Alibaba's trainer position offers 20,000-35,000 yuan a month, paid 16 times a year, and only recruits master's degree holders; there are also financial annotation experts with an hourly wage of 150-200 yuan.

Professional barriers are also getting higher. Medical annotation explicitly requires a background in clinical medicine or imaging; the code direction requires a computer science major and the ability to write and debug code; film and television aesthetics annotation favors majors in drama, film and television literature, and digital media art; financial annotation requires a major in finance or economics; and embodied intelligence annotation points to mechanical and automation majors. The closer to the upstream of the data value chain, the more it depends on real-world domain knowledge rather than just carefulness and patience.

Under the same job title, there are crowdsourcing workers with a daily wage of 100 yuan and experts with a monthly salary of 65,000 yuan, and the middle ground is being continuously squeezed.

From another perspective, a single position can accommodate people from medical, coding, design, and finance backgrounds. It is becoming an outlet for almost all majors.

3

Big companies set the rules, and outsourcing companies break them down

Looking at the companies in the recruitment pool, there is a clear distinction between big companies and outsourcing providers.

Among the 302 positions, the recognizable big companies are JD, Tencent, Alibaba, Kuaishou, Xiaohongshu, and Baidu. But the ones that recruit the most are not them, but annotation outsourcing companies and data service providers – Haitian Ruisheng, Yunce Data, and Beisai Technology firmly occupy the top positions in terms of the number of positions.

The industry logic is clear: Big companies set annotation rules and evaluation standards, and outsourcing companies break them down into detailed SOPs and subcontract them layer by layer.

This is why many annotators feel that their work is mechanical and they can't see the whole picture – they are at the end of the assembly line, with only an account and a set of instructions in hand.

But big companies never easily hand over their core model capabilities.

Tencent recruits large-model annotators in the code direction on its own, Kuaishou recruits directly for annotation project management for its Kling AI (可灵), and Xiaohongshu recruits interns for large-model data annotation. The more critical the model, the more these companies want to control data quality themselves.

4

The past, present, and future of data annotation

Data annotation has become the human foundation behind the progress of AI. To understand how it has developed to the present and where it will go in the future, we need to look at its entire history.

2006-2014 was the pre-annotation era.

At that time, "data annotation" was not yet considered a profession. When Fei-Fei Li launched ImageNet at Princeton, she initially hired undergraduates at an hourly wage of 10 US dollars to annotate images one by one. But the students soon could not stand the repetitive work. At the pace of the time, it would have taken 19 years to annotate the entire dataset. The turning point came from Amazon Mechanical Turk: From 2008 to 2010, nearly 50,000 crowdsourcing workers from 167 countries completed the annotation of more than 14 million images. In academic circles, doing annotation was regarded as "manual labor". Fei-Fei Li's grant application was even criticized by the NIH review as "a disgrace for Princeton to do this".

In 2012, AlexNet won the ILSVRC championship with a top-5 error rate of 15.3%, leading the runner-up by more than 10 percentage points. Yann LeCun later called it an undisputed turning point in the history of computer vision. The entire industry then realized that a gap in algorithms could be closed, but a gap in data was the real barrier. For the first time, annotation had a chance of becoming a business.

2014-2017 was the era of annotation factories.

The first batch of data annotation companies were established. These companies often chose third-tier cities as their locations for very practical reasons: cheap labor, low rent, and government subsidies. "We are taking advantage of the demographic dividend," a boss of an annotation company once said without hesitation. "You can't hire people for 4,000 yuan a month in Beijing, but in a county, 2,000 yuan will attract many applicants."

A large number of rural youths, stay-at-home moms in small towns, and people with disabilities became annotators after training. In counties with scarce industries, a monthly income of 3,000-5,000 yuan was already a decent income. But most of them didn't know what they were doing: "We just draw boxes every day. No one tells us what these boxes are for."

2017-2020 was the era of differentiation and upgrading.

Some big companies began to build larger-scale data annotation bases. Their entry brought standardization and stratification. Annotators began to split into tiers: at the bottom was still basic box selection, above that were quality inspectors, and further up were annotation rule designers, who needed to understand the basic principles of AI and whose income increased several times over. At the beginning of 2020, the Ministry of Human Resources and Social Security officially included "Artificial Intelligence Trainer" in the national occupational classification directory, by which time the annual salary of top trainers had exceeded 300,000 yuan.

Since 2020, there has been an AI backlash.

The GPT-3 paper ("Language Models are Few-Shot Learners", NeurIPS 2020) demonstrated the few-shot learning ability of large models. Models no longer need a large amount of manual annotation to complete many tasks. At the same time, automatic annotation technology has matured, and synthetic data has emerged. The industry's automation rate has soared from about 30% three years ago to over 60%.

But RLHF has given rise to a new type of demand: ranking the preferences of model outputs, evaluating factual accuracy, and correcting reasoning chains – these jobs are no longer called annotation, but prompt engineers or AI alignment trainers, and the skill requirements are completely different.
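As an illustration of the first of these tasks, an annotator's ranking of model outputs is commonly expanded into pairwise "preferred vs. rejected" comparisons before it is used to train a reward model. The sketch below shows that expansion only; the function name and the reply labels are hypothetical, and real pipelines differ in the details.

```python
from itertools import combinations


def ranking_to_pairs(ranked_outputs):
    """Turn a best-to-worst ranking into (preferred, rejected) pairs.

    combinations() keeps the input order, so the first element of
    each pair is always the one the annotator ranked higher.
    """
    return list(combinations(ranked_outputs, 2))


# Hypothetical example: an annotator ranked three model replies, best first.
ranking = ["reply_B", "reply_A", "reply_C"]
pairs = ranking_to_pairs(ranking)

for preferred, rejected in pairs:
    print(f"{preferred} > {rejected}")
```

A ranking of n outputs yields n(n-1)/2 comparisons, which is one reason a single careful ranking is worth more to a model team than a single yes/no label.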

This kind of replacement has been written into the recruitment notices. Baidu's "Autonomous Driving Data Annotation Model Algorithm Intern" is responsible for developing pre-annotation models to let AI annotate data by itself. "Automatic annotation + manual review" has become the mainstream workflow: AI does a rough annotation first, and humans are responsible for quality inspection, error correction