After quitting his job at a big tech company, he spent $48,000 to build a server at home. One year later, it saves him up to $105 per day.

CSDN, 2026-05-27 18:50
As more AI developers complain that cloud GPUs are too expensive, some simply build their own servers. But how much cheaper is building your own server than renting cloud GPUs?

In 2024, Rosmine quit his job at a FAANG company to become an independent researcher.

For his research, he built a server named "grumbl," equipped with six RTX 6000 Ada GPUs.

This article records how the server was built and the problems encountered along the way, and answers a core question: is it more cost-effective to build a server yourself or to rent cloud GPUs?

Rosmine explained that the server is named "grumbl" because he always misspells the word "GPUs".

Treating GPUs as an investment

Rosmine revealed that the machine cost him a total of $48,000. That sounds expensive, but it is far less than the income he gave up by quitting his job.

For him, if more powerful GPUs let his research produce results two months earlier than a smaller machine would, the more powerful server is worth buying.

He therefore decided to buy the most powerful server that the power supply and physical conditions of his apartment could support.

Selection of GPUs

Rosmine consulted the GPU selection guide written by another researcher, Tim Dettmers. After weighing the options, he narrowed the candidates to the A100, the H100, and the RTX 6000 Ada.

The A100 does not support FP8, and its inference performance lags behind that of newer-generation GPUs. Because Rosmine expected to run a large amount of inference for reinforcement learning (RL), that left only two options: the RTX 6000 Ada and the H100.

After comparing the price/throughput ratios of the RTX 6000 Ada, H100, and A100, he chose the RTX 6000 Ada.

Power limitations

Because Rosmine lived in an apartment, he could not upgrade the electrical circuits to support a standard data-center server.

The power draw of six GPUs exceeds what a single circuit in an ordinary apartment can carry, so he had to use two power supplies, each plugged into an outlet on a different circuit.

However, if you search Google for "connecting a PC to multiple sockets," you will find a flood of warnings, as if merely considering the idea would make a person explode on the spot.

To avoid that risk, Rosmine hired a professional PC builder to make sure the whole system was safe and reliable at both the power and hardware levels. This cost more than assembling everything himself, but compared with a serious accident caused by a mistake (damaged equipment, or even a hazard to his home), it was clearly the more prudent investment.

Ironically, although the entire design was built around the apartment's power limitations, "grumbl" ended up in the basement of his parents' house, where he could actually upgrade the circuits, so many of the original constraints no longer applied.

Building your own GPU server vs. renting cloud services?

So, is it more cost-effective to buy GPUs yourself or to rent them from a cloud provider?

Rosmine evaluated this in a straightforward way: he measured his actual GPU usage and compared it with the cost of renting equivalent computing power in the cloud.

At 2024 GPU rental prices, he needed to keep the GPUs at roughly 85% utilization or higher, running continuously for about a year, to break even against the cost of renting in the cloud.
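As a rough illustration of this break-even claim, the sketch below back-computes the cloud price per GPU-hour implied by the article's own figures ($48,000, six GPUs, 85% utilization, about one year). The function names are hypothetical, the resulting price is an implied value rather than one quoted by Rosmine, and electricity is ignored here.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours

def implied_price_per_gpu_hour(hardware_cost, n_gpus, utilization, years=1.0):
    """Cloud price per GPU-hour at which rental cost equals the hardware cost."""
    billed_gpu_hours = n_gpus * HOURS_PER_YEAR * years * utilization
    return hardware_cost / billed_gpu_hours

def break_even_days(hardware_cost, n_gpus, utilization, price_per_gpu_hour):
    """Days of use needed before avoided rental fees cover the hardware cost."""
    daily_rental = n_gpus * 24 * utilization * price_per_gpu_hour
    return hardware_cost / daily_rental

# Figures from the article; the price is derived from them, not quoted.
price = implied_price_per_gpu_hour(48_000, 6, 0.85)
print(f"implied cloud price: ${price:.2f} per GPU-hour")

# Lower utilization stretches the payback period proportionally.
for utilization in (0.85, 0.50):
    days = break_even_days(48_000, 6, utilization, price)
    print(f"utilization {utilization:.0%}: break-even after about {days:.0f} days")
```

The point of the second function is that break-even time scales inversely with utilization, which is why the measured usage figures later in the article matter so much.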

That does not seem difficult to achieve, but a more complete analysis must also include the cost of electricity. A further real-world factor is that as higher-performance GPUs keep launching, the cloud rental price for the same computing power gradually falls.

For more accurate statistics, he wrote a script that records the usage of each GPU once a minute. He also recorded the overall power draw (in watts) so that he could calculate the actual electricity cost.
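The article does not show Rosmine's script, so the following is only a minimal sketch of what a once-a-minute logger might look like. It assumes the `nvidia-smi` command-line tool is installed; the output file name and all function names are hypothetical. Note that `power.draw` covers the GPUs only, so it understates whole-system power, and some GPUs report `[N/A]` for it, which this sketch does not handle.

```python
import csv
import subprocess
import time
from datetime import datetime, timezone

# Fields requested from nvidia-smi: GPU index, utilization (%), power draw (W).
QUERY = "index,utilization.gpu,power.draw"

def parse_nvidia_smi_csv(text):
    """Parse 'index, util, power' lines into (gpu_index, util_percent, watts)."""
    rows = []
    for line in text.strip().splitlines():
        index, util, power = [field.strip() for field in line.split(",")]
        rows.append((int(index), float(util), float(power)))
    return rows

def sample_gpus():
    """Run nvidia-smi once and return one tuple per GPU."""
    result = subprocess.run(
        ["nvidia-smi", f"--query-gpu={QUERY}", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_nvidia_smi_csv(result.stdout)

def log_forever(path="gpu_usage.csv", interval_s=60):
    """Append one timestamped row per GPU to a CSV file every interval_s seconds."""
    while True:
        now = datetime.now(timezone.utc).isoformat(timespec="seconds")
        with open(path, "a", newline="") as f:
            writer = csv.writer(f)
            for index, util, watts in sample_gpus():
                writer.writerow([now, index, util, watts])
        time.sleep(interval_s)
```

Calling `log_forever()` starts the loop. Keeping the parser separate from the subprocess call makes it possible to check the parsing logic on a machine without GPUs.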

In this comparison, he used only the on-demand prices of cloud services as the reference.

Cloud providers also offer reserved-instance plans of 6 to 12 months, but in his view these are of limited value: the discount is small and the cost is not far from that of buying the whole server outright, while buying has the advantage that the GPUs end up belonging to him.

Leaving "grumbl" without a monitor is, in a sense, a waste: the server can drive up to 24 monitors simultaneously. In theory, he could even turn it into a miniature "Las Vegas Sphere."

GPU usage time chart

To measure the actual usage of GPUs, Rosmine conducted statistics on each GPU: recording the number of hours each day when it was "used at least once".

He believes this method approximates how cloud GPUs are billed: in the cloud, if a server is idle for less than an hour, one usually would not stop and restart the instance.

If anything, this method is rather lenient toward the cloud rental model, because it assumes that each GPU can be started and stopped independently. In practice, Rosmine said: "A lot of my idle time occurs when 'running multiple experiments in parallel': one of the experiments ends early or fails, but other experiments are still running. If I were really renting a cloud server, I wouldn't stop the whole machine because of this."

Note that the metric here is GPU usage, not training efficiency. Even if a GPU's utilization is only 10%, it still counts as active for any hour in which it was used. (The same code would be just as inefficient in the cloud.)
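The counting rule described above can be sketched as follows. This is an illustration, not Rosmine's code: the input format, the function name, and the choice to treat any reading above 0% as "used" are all assumptions, and the sample data is made up.

```python
from collections import defaultdict
from datetime import datetime

def active_hours_per_day(samples, threshold=0.0):
    """Count, per (date, GPU), the clock hours with at least one reading above threshold.

    samples: iterable of (timestamp, gpu_index, util_percent) tuples.
    """
    active = defaultdict(set)
    for ts, gpu, util in samples:
        if util > threshold:
            # A set ensures an hour is counted once, however many readings it has.
            active[(ts.date(), gpu)].add(ts.hour)
    return {key: len(hours) for key, hours in active.items()}

# Made-up readings for illustration only.
samples = [
    (datetime(2025, 6, 1, 9, 0), 0, 0.0),    # idle reading
    (datetime(2025, 6, 1, 9, 30), 0, 10.0),  # hour 9 counts despite low load
    (datetime(2025, 6, 1, 9, 45), 0, 95.0),  # same hour, not counted twice
    (datetime(2025, 6, 1, 11, 5), 0, 80.0),  # a second active hour
    (datetime(2025, 6, 1, 11, 5), 1, 0.0),   # GPU 1 never used that day
]
result = active_hours_per_day(samples)
print(result)
```

In this example GPU 0 is credited with two active hours for the day, while GPU 1 does not appear in the result because it was never used.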

[Figure: GPU usage over time]

The chart shows that the server was shut down for maintenance three times during the period.

Each shutdown was highly stressful because the source of the problem was unclear: was it a single failed PCIe riser, or something more serious and systemic, such as a damaged GPU?

Rosmine said that GPU usage has trended clearly upward since June 2025. Before that, he mainly ran small-scale experiments whose development time was about as long as their run time, so the GPUs sat idle for long stretches between experiments.

After June 2025, Rosmine began a project that requires a large amount of computing power. Most of the GPUs ran experiments continuously, with only one or two reserved for development and debugging.

Overall, the average GPU utilization rate is 76%. Counting only the data after January 1, 2025, it is 85%.

He was a little disappointed with this result, because the experiments were running almost 24/7 and there was always a queue of tasks waiting. He had expected utilization to exceed 95% easily.

Final calculation

To calculate the cost, Rosmine took the cloud rental price for each day, multiplied it by the number of GPU hours actually used that day, and accumulated the total day by day.
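A minimal sketch of this day-by-day accumulation is shown below. The function name and the dictionary-based input format are assumptions, and the two days of hours and prices are made-up numbers for illustration, not figures from the article.

```python
def cumulative_cloud_cost(daily_gpu_hours, daily_price_per_gpu_hour):
    """Return a list of (date, running_total) for the equivalent cloud rental cost.

    Both arguments map an ISO date string to a value; every date with usage
    must also have a price.
    """
    total = 0.0
    running = []
    for day in sorted(daily_gpu_hours):  # ISO dates sort chronologically
        total += daily_gpu_hours[day] * daily_price_per_gpu_hour[day]
        running.append((day, total))
    return running

# Illustrative values only: six GPUs used 20 and 24 hours each, with a price drop.
hours = {"2025-06-01": 120.0, "2025-06-02": 144.0}
prices = {"2025-06-01": 1.0, "2025-06-02": 0.5}
print(cumulative_cloud_cost(hours, prices))
```

Using a per-day price rather than a single constant is what lets the calculation reflect falling cloud prices over time.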

Because he lacked historical API price records from cloud providers, he had to reconstruct past prices from publicly available, timestamped information.

From the recorded power data, he calculated a total electricity cost of about $3,000, or roughly $125 per month.
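The conversion from logged watt readings to a dollar figure can be sketched like this. It assumes evenly spaced readings, and the electricity rate is a placeholder: the article does not state what Rosmine pays per kWh, and the function name is hypothetical.

```python
def electricity_cost(watt_samples, sample_interval_s=60, price_per_kwh=0.15):
    """Estimate electricity cost from evenly spaced power readings in watts.

    price_per_kwh is a placeholder assumption, not a figure from the article.
    """
    # Each reading is assumed to hold for one full sampling interval.
    energy_wh = sum(watt_samples) * sample_interval_s / 3600
    return energy_wh / 1000 * price_per_kwh

# Sanity check: one hour of readings at a constant 1,000 W is 1 kWh.
one_hour = [1000.0] * 60
cost = electricity_cost(one_hour)
print(f"${cost:.2f}")
```

Summing readings this way is a simple rectangle-rule approximation of energy, which is adequate when samples are taken as often as once a minute.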

Taking all of this into account, renting cloud GPUs with equivalent computing power would have cost about $68,000 in total as of March 13, 2026. After subtracting the $48,000 purchase price and about $3,000 in electricity, he has saved about $17,000 so far.

By this calculation, the GPU system has already paid for itself. At current market prices, it continues to save about $90 to $105 in computing costs every day.

The real "final conclusion"

Rosmine said that the starting point for buying this server was never to save costs, but to build something "interesting".

Along the way, he invested a lot of time in high-risk, high-reward experiments and went through multiple failures.

In the end, he did get results, and he claims to have solved a key problem in large language models.

He plans to release the results officially next week, which should show whether this is a real technical breakthrough or just another misjudgment of the "LLM psychosis" kind.

Advice

Rosmine cautioned that building a high-end GPU server yourself calls for great care, because costly mistakes are easy to make. He originally assumed that, since the apartment's circuits could not be upgraded, he could not use a standard data-center server and had to connect two power supplies to separate circuits. Working within that constraint, he chose a motherboard with slower GPU interconnect speeds. This configuration suits running many small-scale experiments in parallel (his main use case), but it performs poorly on tasks that require splitting a model across GPUs.

A considerable share of the failures came from PCIe riser components, and Nathan Odle's investigation and analysis of risers was an important help in troubleshooting.

He also mentioned that his spending habits are more like those of a "budget-constrained graduate student," and that the machine was paid for out of years of savings. Although he is fortunate enough to be able to afford such a high-risk expense, he does not recommend that everyone copy his plan.

In his opinion, high-quality research can also be done with just a Google Colab subscription, cheaper cloud GPUs, or small local devices.

The psychological shift from renting GPUs to owning them is pronounced. When renting, every experiment has a direct cost, so one constantly weighs whether it is worth running; when owning, not running experiments instead creates a sense of loss over idle resources. Owning also avoids the hassle of frequently starting and stopping cloud instances.

In addition, this analysis does not include the cost of his time, and building and maintaining the server took a great deal of time and effort.

As for insurance, he tried to add the machine to his renter's insurance, but the insurer declined, and he ultimately had to buy commercial coverage.

Finally, he said that if he had to choose again, he might skip the highly customized build, buy a standard data-center server, and host it in a colocation facility. That, however, would mean losing the personal touch of occasionally greeting "grumbl."

Source: https://rosmine.ai/2026/05/13/was-my-48k-gpu-worth-it/

This article is from the WeChat official account "CSDN Programmer's Life", author: Rosmine; translated by Su Mi, published by 36Kr with authorization.