Luminal raises $5.3 million to create a better framework for GPU code

Written by: Mane Sachin

Identifying the Real Bottleneck

Three years ago, Luminal co-founder Joe Fioti was deep in chip-design work at Intel when he spotted a bigger issue: even with top-tier hardware, developers often struggled because the software layer wasn't keeping up. That mismatch, he said, was becoming the real limitation.

He has since launched a company built around solving that exact problem. Earlier this week, Luminal revealed that it raised $5.3 million in seed funding, led by Felicis Ventures, with additional backing from angel investors including Paul Graham, Guillermo Rauch, and Ben Porterfield.

Fioti teamed up with Jake Stevens, formerly at Apple, and Matthew Gunton, previously with Amazon, to build the company. Luminal also participated in Y Combinator’s Summer 2025 cohort.

Luminal’s Approach to Compute and Optimization

The business model is straightforward: Luminal sells compute power, much like newer cloud providers such as CoreWeave or Lambda Labs. But instead of emphasizing hardware, the company focuses on extracting more performance from the GPUs it already has by improving the compiler layer — the same area that once caused Fioti countless frustrations.

Right now, Nvidia’s CUDA dominates the GPU compiler landscape and has been a major factor in Nvidia’s success. Much of the software ecosystem surrounding CUDA is open source, though, and Luminal is wagering that with demand for GPUs still outpacing supply, there’s room to innovate around the rest of the software stack.

This strategy aligns with a rising wave of inference-optimization companies aiming to deliver faster and more cost-efficient model execution. Established players like Baseten and Together AI have long prioritized optimization, and newer entrants such as Tensormesh and Clarifai are emerging with even more specialized approaches.

Competition and Luminal’s Long-Term Bet

Still, Luminal and its peers must contend with in-house optimization teams at major AI labs, which have the advantage of tailoring their systems for a narrow set of models. In contrast, Luminal must be prepared to work with many different architectures, but Fioti says the demand for improvement is accelerating quickly enough that he sees ample opportunity.

He acknowledges that deeply customized tuning on specific hardware can produce the best possible performance, but that such efforts are costly and time-intensive. Luminal’s bet, he says, is that for the majority of use cases, high-quality general-purpose optimization will remain extremely valuable.
