
With a $200 CPU and second-hand hardware at rock-bottom prices, he built a "mini Google" in his laundry room: It can process 2 billion web pages, with a total cost of $5,000.

CSDN 2025-09-12 09:10
I won't let it stay in the laundry room forever.

If someone tells you that you can recreate a mini "Google Search" at home using an old server you pieced together yourself, you might think it's a fantasy. But in 2025, this actually happened.

The protagonist of this story is developer Ryan Pearce. Without backing from a tech giant or a large investment, he set out to retrace Google's path the low-cost way: with a pile of scavenged second-hand hardware, a CPU that cost less than $200, and the power of AI, he set up a "mini Google" in his home laundry room.

With just these resources, Ryan Pearce created two search engines:

● Searcha Page: a general-purpose search engine (https://searcha.page/);

● Seek Ninja: more privacy-focused and doesn't collect user data (https://seek.ninja/).

In other words, when users enter a search request on the page, the server behind it is actually running next to the washing machine and dryer in Ryan Pearce's home.

Retracing Google's Early Steps: From a Dorm Room to a Laundry Room

Going back nearly 30 years, the starting point of "Google Search" was also quite humble.

When Google was just starting out, it had no fancy hardware. Its first experimental server had only 40GB of storage and ran in a Stanford University dorm room; its case was even assembled from Duplo bricks, the oversized Lego blocks. Later, with donations from IBM and Intel, Google upgraded to small server racks.

Today, Google Search has grown so large that it can't even fit in a single data center. But if you're willing to put in the effort, with a bit of clever resource management and a lot of perseverance, you can recreate a fairly modern search experience on a machine similar in scale to Google's first-generation server, and even place it in your home laundry room.

Ryan Pearce joked about this:

"Right now, the storage capacity in my laundry room is larger than Google's in 2000. That's just crazy to think about."

In a way, he is retracing Google's historical path, except the setting has changed from a campus dorm room to a home laundry room.

Building a DIY Search Engine: No Cloud, Just Scavenged Old Servers

Unlike most cloud-computing-driven projects, Ryan Pearce's search engines are almost completely cloud-free and self-hosted:

● The main host on top: pieced together mostly from old server parts, with a simple air duct Pearce installed for cooling.

● A second computer underneath: provides additional support for the whole system.

At first, this device was placed in the bedroom, but it was too hot and noisy, making it impossible to sleep. After a "reminder" from his wife, Ryan Pearce moved the device to the laundry room and ran the network cable through the wall. Since then, the server has been sitting next to the washing machine and dryer. Although the heat problem hasn't been completely solved, at least it doesn't affect daily life: "It won't get too hot unless the door is closed for too long."

How does a search engine running in a laundry room perform? According to Ryan Pearce, except for occasional delays in search results in the early days (which have significantly improved in recent weeks), the overall performance of the engine is hard to fault, and the quality of the results is even better than expected, supported by a database of up to 2 billion documents.

Ryan Pearce also plans to expand it to 4 billion documents within half a year. To put this into perspective, in 1998, when Google was still at Stanford, its database had only 24 million web pages; by 2020, this number had reached 400 billion.

By today's Google standards, 2 billion is just a drop in the bucket. But for an individual working alone, it's an astonishing achievement.

The Core Secret: Traditional Search with AI Enhancement

The key to Ryan Pearce's ability to scale up his "old - server" setup lies in large language models (LLMs).

"What I'm actually doing is very traditional search, similar to what Google was doing 20 years ago. But I've added a little 'flavor': using AI for keyword expansion and context understanding. This is actually the most difficult part of search."

So, although Searcha Page and Seek Ninja both have minimalist interfaces, they rely heavily on AI behind the scenes.
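Pearce doesn't publish his implementation, but the pattern he describes, a classic inverted index with LLM-driven query expansion layered on top, can be sketched roughly as follows. This is a toy illustration, not his code: the `EXPANSIONS` table is a hand-written stand-in for what an LLM would generate, and the corpus and ranking are deliberately minimal.

```python
from collections import defaultdict

# Toy corpus standing in for the crawled documents.
DOCS = {
    1: "cheap used server hardware for home labs",
    2: "laundry room cooling tips for hot machines",
    3: "second hand epyc cpu deals on ebay",
}

# Stand-in for the LLM step: in Pearce's description an LLM expands the
# query with related keywords; here a hand-written synonym table plays
# that role so the sketch stays self-contained.
EXPANSIONS = {
    "cheap": ["used", "second", "hand"],
    "cpu": ["epyc", "processor"],
}

def build_index(docs):
    """Classic inverted index: term -> set of doc ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """Expand each query term, then rank documents by how many of the
    expanded terms they contain (a crude proxy for relevance)."""
    terms = []
    for term in query.lower().split():
        terms.append(term)
        terms.extend(EXPANSIONS.get(term, []))
    scores = defaultdict(int)
    for term in terms:
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    return sorted(scores, key=scores.get, reverse=True)

index = build_index(DOCS)
print(search(index, "cheap cpu"))  # doc 3 ranks first: it matches 4 expanded terms
```

The point of the expansion step is that a query like "cheap cpu" also retrieves pages that say "second hand epyc" without containing either literal query word, which is exactly the "context understanding" a keyword-only engine lacks.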

Many people might say, "I just want a search engine without AI." But in fact, AI has been deeply integrated into search engines: for example, reverse image search wouldn't be possible without AI; Google launched RankBrain a decade ago to optimize search results using machine learning; Microsoft revealed as early as 2019 that 90% of Bing's search results rely on machine learning.

Therefore, when people complain that AI is making search worse today, they often overlook the fact that AI has become an integral part of modern search engines. Ryan Pearce's case further proves that even an individual can use AI to build and expand their own search engine.

Second-Hand Hardware + Top-Tier CPU: Drastically Reducing DIY Costs

The core of Ryan Pearce's search engine is a 32-core AMD EPYC 7532:

● When it launched in 2020, it cost over $3,000;

● Now, you can buy it on eBay for less than $200.

To save even more money, Pearce bought a "quality-tested" (used but verified) chip. He added that the same price could have bought a CPU with twice as many threads, but he passed on it because it would generate too much heat for a home environment.

Pearce also bought plenty of other capable second-hand server hardware at low prices. Enterprises typically replace their machines every three years, so the hardware they discard depreciates sharply on the resale market while still performing well. As long as you can tolerate the power consumption, you can get a huge amount of computing power very cheaply.

Pearce took advantage of this and, with dirt-cheap old equipment, pieced together a system capable of running a modern search engine, one stronger than some of Google's early servers. The total cost of the entire system is only about $5,000, roughly $3,000 of which went to memory, since half a terabyte of RAM is still quite expensive; even so, that counts as top-notch in the DIY community.

Using LLMs to Catch Up: From Rapid Prototyping to Continuous Iteration

It's worth noting that Ryan Pearce isn't the only DIY search engine developer.

For example, another tech enthusiast, Wilson Lin, chose a completely different approach: his system relies on at least nine different cloud services, and he developed new data-parsing technology that significantly reduces the engine's operating cost. He explained that this keeps the overall cost much lower than running everything on AWS and lets him "advance the project within a reasonable budget."

These two seemingly different approaches have both reached their current scale mainly thanks to a key factor: AI. Many people complain that AI is reducing the quality of search, but it's also AI that gives these individual developers a chance to achieve a "Google-level" search experience.

One of the biggest controversies surrounding AI is whether search engines overemphasize it. Often, the presence of AI is directly reflected on the result pages, trying to "explain" your search. Some people like it because it saves time, while others strongly dislike it. But for individual developers with limited resources, LLMs are essential tools for quickly building and expanding datasets.

Take Ryan Pearce as an example. With a background in enterprise software and game development, he isn't opposed to introducing AI into programming. The codebase of his search engine has exceeded 150,000 lines, and with repeated iterations, the actual amount of code he's written should be close to 500,000 lines. His approach to using AI is to first let the LLM handle certain functions and then gradually replace them with traditional implementations; this allows him to quickly build a complex system and then refine it through iteration.

Wilson Lin also commented: "LLMs have indeed lowered the threshold. The biggest obstacle preventing us from challenging Google now isn't technology, but the market."

"I Won't Let It Stay in the Laundry Room Forever"

However, the complexity of LLMs still exceeds what the laundry-room server can handle.

So Ryan Pearce connected Searcha Page and Seek Ninja to the Llama 3 inference service provided by SambaNova to get fast AI capabilities at low cost. He also benefits from Common Crawl, an open repository of web crawl data that is also an important training source for large models; he was even temporarily banned by Common Crawl during development for excessive scraping.
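For readers unfamiliar with Common Crawl: it exposes a public CDX index API that, given a URL pattern, returns one JSON record per captured page, telling you which WARC archive file holds the page and at what byte offset. The sketch below shows the general shape of that lookup; the crawl id is an example (Common Crawl publishes a new crawl roughly monthly), and actually fetching pages requires HTTP Range requests against data.commoncrawl.org, which is omitted here.

```python
import json
from urllib.parse import urlencode

# Example crawl id; check Common Crawl's site for current crawl names.
CRAWL = "CC-MAIN-2024-33"

def index_query_url(url_pattern, crawl=CRAWL):
    """Build a query against Common Crawl's CDX index API, which
    returns newline-delimited JSON, one record per captured page."""
    params = urlencode({"url": url_pattern, "output": "json"})
    return f"https://index.commoncrawl.org/{crawl}-index?{params}"

def parse_records(body):
    """Parse the CDX response: each record names the WARC file holding
    the page plus the byte offset and length needed to fetch just that
    page with an HTTP Range request."""
    records = []
    for line in body.splitlines():
        if line.strip():
            rec = json.loads(line)
            records.append({
                "url": rec["url"],
                "warc": rec["filename"],
                "offset": int(rec["offset"]),
                "length": int(rec["length"]),
            })
    return records
```

Note the rate-limiting lesson implicit in Pearce's temporary ban: a polite crawler throttles its requests to the index and the archive servers rather than hammering them.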

Ryan Pearce sighed, "I really appreciate them and hope to give back in the future. Once my project grows, I'll rely on them less."

Of course, not all attempts were successful. Ryan Pearce revealed that at first, he tried to use a vector database for search, but it failed: "It could search, but the results were too 'artistic,' similar to the hallucination problem of LLMs."

So far, Ryan Pearce's search engines have attracted a lot of attention. For example, a Chinese user contacted him, asking for a "censorship-free search" that could be connected to their own LLM proxy. But Pearce admitted that it's currently difficult to support Chinese because it would mean rebuilding the dataset, which is too costly.

As for the future, Ryan Pearce said he plans to move the server out of his home, perhaps to a hosting facility or a colocation data center. To that end, he's also starting to try some lightweight advertising monetization:

"Once the traffic increases, I'll move it to a hosting environment. I won't let it stay in the laundry room forever."

Reference Link

https://www.fastcompany.com/91396271/searcha-page-seekninja-diy-search-engines

This article is from the WeChat official account "CSDN", author: Zheng Liyuan. Republished by 36Kr with permission.