I've been using AI for general tasks, and to assist with my web design and development work, for a couple of years now. I mainly use SuperGrok for basic research and simple tasks, and Copilot Pro+ with Claude Sonnet 4 in agent mode integrated with VSCode for more complex work. But I want to take this to the next level and try running my own AI stuff locally instead of paying for endless subscriptions to cloud services. So, Project BedroomGPT was born. If it's all a flop, well, I got a new gaming PC haha.
I usually build my own systems (there are four custom-built computers and servers in our home that I put together), but with retail component pricing being what it is, and after playing Lenovo's bizarre coupon games, I ended up buying a ThinkStation P3 Tower Gen (Intel) for $850. That gets me an Intel Core Ultra 5 235, 1x16GB DDR5, a 512GB NVMe SSD, and Windows 11 Pro. I didn't select any upgrades from Lenovo because they are a total rip-off. And honestly, selling anything but the most basic computer with a single RAM stick is a joke... yet so many prebuilts, even "gaming PCs", come gimped like that. Lame!
I'm going to pull out the factory SSD, and set it aside. When I outgrow this server or get tired of the project, I'll just put it back in and use it as a gaming computer or sell it.
I'm putting in a 4TB Samsung 990 Pro I already have. Hopefully the fans don't go nuts: I've had issues with multiple Lenovo laptops and desktops whose OEM SSDs use some sort of custom firmware for thermal/power management, and putting in non-Lenovo SSDs causes increased fan speeds. Worst case, I'll put the factory SSD back in and install the 4TB drive using an NVMe PCIe adapter. Hopefully that works. We'll see.
There is an interesting tidbit in Lenovo documentation for this workstation. The base configuration may or may not include a heatsink on the VRM on the motherboard, but in the documentation they recommend it not just for the higher end CPUs, but also if you're using the base CPU with a high end GPU. So, if mine doesn't come with a VRM heatsink, I will use the $50 of Lenovo rewards I earned from the purchase to buy one.
For the RAM I picked up 128GB (2x64GB) of DDR5 for about $300. I can always add another set of that down the road if it's beneficial, but I'm constrained on budget at this time.
The most important decision was the GPU, as for AI, this is the heart and soul of things. I can run stuff on the CPU using system RAM, but you get dramatically better performance on a GPU. I may explore hybrid setups, where some of a model's layers run on the GPU and its memory, and the rest run on the CPU and system memory. I'm starting with an Nvidia RTX 5060 Ti 16GB, as Nvidia is the most popular and best-supported option. I wish I'd got an Intel Arc Pro B60 24GB the other day when Central Computers (accidentally?) listed them for $599, but I can always upgrade my GPU later. Intel Arc isn't officially supported by many tools yet, so by the time I find another opportunity to pick one up at that price, software and driver integration might be better.
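To put rough numbers on the hybrid idea, here's a back-of-envelope sketch of how many layers might fit on the card. It assumes every layer is the same size and holds back a couple of GB of VRAM for context and overhead; the function name, the 2 GB reserve, and the example model sizes are all made up for illustration, not measurements. As far as I know, Ollama normally works out the split on its own, so this is only for getting a feel for it.

```python
def layers_on_gpu(model_gb: float, n_layers: int, vram_gb: float,
                  reserve_gb: float = 2.0) -> int:
    """Estimate how many layers fit in VRAM, assuming equal-size layers.

    reserve_gb is an assumed allowance for context/KV cache and runtime
    overhead; the real figure depends on the model and context length.
    """
    if model_gb <= 0 or n_layers <= 0:
        raise ValueError("model_gb and n_layers must be positive")
    per_layer_gb = model_gb / n_layers
    usable_gb = max(vram_gb - reserve_gb, 0.0)
    # Never report more layers than the model actually has.
    return min(n_layers, int(usable_gb // per_layer_gb))


# Hypothetical 40 GB model with 80 layers on a 16 GB card:
# 14 GB usable / 0.5 GB per layer -> 28 layers on the GPU, 52 on the CPU.
gpu_layers = layers_on_gpu(model_gb=40.0, n_layers=80, vram_gb=16.0)
cpu_layers = 80 - gpu_layers
```

The layers left on the CPU side run out of system RAM, which is where the 128GB comes in.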
The Lenovo workstation in my config comes with a 750W PSU with 8-pin PCIe power connectors, not 12VHPWR, but Lenovo officially supports bigger cards like the RTX Pro 5000 (Blackwell) with an adapter, so I'm not too worried.
For the OS, I plan to start with Ubuntu Server, which I'm very familiar with.
The software is really the interesting part. I think I'm going to start with Ollama and Open WebUI. If you're curious, check it out: https://docs.openwebui.com/getting-started/quick-start/starting-with-ollama/
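Once Ollama is running, anything on the network can talk to it over its local HTTP API, which is what Open WebUI does under the hood. Here's a minimal sketch of a one-shot prompt from Python using only the standard library. It assumes Ollama is listening on its default port (11434) on the same machine, and `llama3.2` is just a placeholder for whatever model has actually been pulled.

```python
import json
import urllib.request

# Assumes a default local install; change host/port if yours differs.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming generate request for the Ollama API."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text."""
    with urllib.request.urlopen(build_request(model, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]


# Usage (needs a running Ollama server and a pulled model):
#   print(ask("llama3.2", "Say hello in five words."))
```

With `stream` set to false the reply comes back as a single JSON object, which keeps the sketch short; a chat UI would stream tokens instead.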
Then, to integrate with VSCode, I'll use the Continue plugin.
The most complicated part of all this? What models to use. There are so many options: https://ollama.com/search
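One way to narrow the list is to estimate whether a model's weights fit in 16GB of VRAM, spill into system RAM, or are simply too big. This is a rough sketch only: it assumes memory is about parameter count times bits per weight, plus a guessed 20% overhead for context and runtime buffers. The function names and the overhead factor are my own assumptions, and real quantized files usually come out a bit larger than the nominal bit width suggests.

```python
def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB: billions of params * bits / 8."""
    return params_billions * bits_per_weight / 8


def where_it_fits(params_billions: float, bits_per_weight: float,
                  vram_gb: float = 16.0, ram_gb: float = 128.0,
                  overhead: float = 1.2) -> str:
    """Classify a model as GPU-only, GPU+CPU split, or too big.

    overhead is an assumed fudge factor for context and runtime buffers.
    """
    need_gb = weights_gb(params_billions, bits_per_weight) * overhead
    if need_gb <= vram_gb:
        return "GPU"
    if need_gb <= vram_gb + ram_gb:
        return "GPU+CPU"
    return "too big"


# Hypothetical sizes at 4-bit quantization:
for size in (8, 32, 70):
    print(f"{size}B @ 4-bit: {where_it_fits(size, 4)}")
```

By this estimate an 8B model at 4-bit sits comfortably on the GPU, while 32B and 70B models need the hybrid GPU+CPU route.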
I'll update this thread as my project progresses. In the meantime, while I wait for all my stuff to arrive, does anyone have any suggestions on which models to try? Is anyone else running this stuff locally or on their own server?