If you’ve been thinking of getting into self-hosting generative AI, but don’t have a big budget for hardware, you might want to check out [Hardware Haven]’s latest video on an unusually cheap GPU option — but you’ll have to do so quickly, before the market realizes the chance for arbitrage and prices rise accordingly.
He’s gotten hold of a 16 GB NVIDIA V100 card for only about a hundred bucks, mostly because it’s not easy to plug in, sitting on an SXM2 socket rather than the PCIe bus. SXM is a mezzanine socket form factor used in NVIDIA’s server hardware, and not something you’re likely to find on your motherboard. Another hundred got him an adapter board to fit this enterprise GPU into a consumer system. That’s still a lot less than the PCIe version of the same card, which will likely set you back a thousand or more unless you get very lucky on eBay.
It’s not the newest card, dating from 2017, but that doesn’t mean it can’t run the latest open models. After 3D printing a fan shroud so the card wouldn’t cook itself (adding slightly to the build cost), [Hardware Haven] set to work seeing what it could do. Going head-to-head against an RTX 3060 12 GB, the older V100 delivered more tokens per second at slightly better efficiency under load, though with much higher idle power draw.
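If you want to run that kind of head-to-head yourself, the two numbers in question are easy to measure. Below is a minimal sketch, assuming a local Ollama server on its default port (11434) and a model you’ve already pulled (the `llama3.1:8b` tag is just a placeholder): Ollama’s non-streaming response reports `eval_count` and `eval_duration`, from which tokens per second falls right out, and `nvidia-smi` can report the card’s current power draw.

```python
import json
import subprocess
import time
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint
MODEL = "llama3.1:8b"  # placeholder tag; substitute whatever model you have pulled

def tokens_per_second(prompt: str) -> float:
    """Run one non-streaming generation and compute decode speed from
    Ollama's reported eval_count / eval_duration (nanoseconds)."""
    payload = json.dumps({
        "model": MODEL,
        "prompt": prompt,
        "stream": False,
    }).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload,
        headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # eval_count = tokens generated, eval_duration = decode time in ns
    return body["eval_count"] / (body["eval_duration"] / 1e9)

def gpu_power_draw() -> str:
    """Ask nvidia-smi for the board's current power draw."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=power.draw", "--format=csv,noheader"],
        capture_output=True, text=True, check=True)
    return out.stdout.strip()

if __name__ == "__main__":
    tps = tokens_per_second("Explain the SXM2 socket in one paragraph.")
    print(f"decode speed: {tps:.1f} tok/s")
    time.sleep(5)  # let the card settle back toward idle before sampling
    print("idle power:", gpu_power_draw())
```

Run it once per card and compare: the decode speed captures the tokens-per-second figure, and sampling `nvidia-smi` a few seconds after the run finishes gives a rough idle-power reading.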
Still, it’s nice to see a cheap way to get into local AI, even if it might not still be cheap by the time you read this. Once you have the hardware, you might want some easy software options so you don’t have to spend all day on setup. Of course, a hefty GPU is only needed for larger models; if you’re patient, you can get into hosting your own AI on a Raspberry Pi.
