Physical Autonomy (Part I)
Missing Pieces in the Push for Generalist Robotics
Robotics sits atop billion-dollar markets with single-digit penetration. Historically, most robots haven’t scaled because development has been slow, high-friction, and siloed. Waymo’s first public pilot launched in 2017, and only now is it reaching commercial viability. But in 2024 venture interest surged, reflecting a growing belief that the scaling laws which transformed language might finally apply to the physical world.
As Colossus notes, the core bottleneck to progress is software. In AI, model architecture sets the ceiling, but data determines how close you get. Without the right dataset, even the best transformer won’t generalize. Robotics today mirrors early natural language processing (NLP) applications (circa 2018-2021): pre-GPT models plateaued until transformer architectures unlocked emergent behavior through scale.
For a GPT-like foundational model to emerge for the physical world, the scale and diversity of data required are orders of magnitude beyond those of its digital counterparts. Data for physical intelligence is unique across embodiments, tasks, and environments. A quadruped and a drone each experience the world differently.
Data from one robot is not inherently fungible with data from another, which is why companies like Tesla, Waymo, and Aurora focus on curating the right data and not just more of it.
We segment the space into three investable layers:
Generalist: foundational model and humanoid robotics efforts trained across many tasks
Task-specific: vertically integrated systems solving one high-friction problem well
Infrastructure: platforms that enable others to build, simulate, or label faster
Opportunities Within Infrastructure
Within the infrastructure layer, we outline three areas of particular interest:
1. Data Collection and Acceleration:
Collecting teleoperated play data instead of expert demos
Collecting synthetic data using AI generation
Turning video or 3D scenes into usable training inputs (3D domains are notoriously hard)
Building domain-specific and compliant data foundations to accelerate deployment
This is still a wide-open space. No single strategy has emerged.
On the synthetic data front, players like Odyssey AI, which uses strong physics engines to create AI videos, might help generate synthetic training data for robots in the future. Competitors like Runway and Luma AI have already stated that robotics will eventually dwarf their other customer verticals. Yet, as good as AI models are at generating clips for robotic training, many robotics companies are skeptical of simulations, AI-generated or otherwise, and insist that there is no substitute for letting a robot take actions in the real world.
Regarding compliant data foundations, we are watching companies like Nominal AI, which handles sensitive government and industrial data to accelerate the testing of complex hardware systems. Forward-deployed engineer models like Palantir’s are likely to proliferate given the sensitive and complex nature of such data.
2. Latency:
Systems that keep robots running safely even if cloud connection fails
Efficient software to speed up AI decision-making inside the robot
The performance of a robot is a direct function of the models it builds in real time. In dynamic environments, latency is the binding constraint. Early-stage companies like Summer Robotics use laser-event sensing technology to reduce perception latency by 5-10x. Go-to-market strategies for these companies will vary: some will aim to sell their technology into diversified robotics platforms (similar to what the synthetic data companies are doing), while others may find particular traction on deployed machines in fast-moving manufacturing-line environments such as pharmaceuticals or food and beverage.
3. Multimodal Complexity:
Infrastructure that turns different sensor data into a unified format
Tools that translate hard-to-use data like point clouds into more usable formats such as images or text, even if the result is not unified
Mach9 is a solid example. It converts massive-scale image and point cloud datasets into 2D and 3D engineering deliverables, such as CAD or GIS. Terra AI aggregates layers of geological data, from magnetic field readings to seismic activity, to create underground maps and identify the best places to drill. Eventual Computing is building a modality-agnostic query layer, essentially the SQL for multimodal data.
Generalist Models: High Ambition, High Friction
Generalist models rely on continuous, structured real-world data. Some startups outsource this via teleoperation or human labeling. Others collect it organically through hardware use. We favor the latter, especially when usage is gamified or turned into a data marketplace. Companies that can build a data flywheel by letting users interact with their hardware will collect information in a way that’s organic and scalable, much like how Tesla isn’t just a car company but a data company.
Still, friction remains a massive hurdle. Consumers are accustomed to zero-latency digital tools, and even brief delays drive customer churn. We fear that demos of generalist humanoid robots have raised consumer expectations to a level where early results will be disappointing. It will be very hard (but not impossible) to build a flywheel in this space, particularly given the breadth of data that needs to be collected over extended periods of time.
Task-Specific Models
Task-specific robots, on the other hand, have the potential to deliver outsized value per unit of friction and are perhaps better positioned to scale. If a robot reliably handles hard or dangerous tasks, the setup and iteration time is justifiable. Gecko Robotics, for example, builds robots for predictive maintenance in extreme industrial environments.
Regardless of the model type, four key questions apply to both:
Daily utilization per robot (is it used once every few days or every hour?)
Maintenance cycles (how easy is it to keep up?)
Edge cases and labeling (how are you collecting and labeling mission-critical, edge-case data?)
Team experience (does the team have deep industry experience?)
Generalist or task-specific, if a company has sensible answers to the above, with real-world data to support its claims, we are always interested in chatting.
The Task-Specific Moat
Physical intelligence faces unique constraints: data scarcity, embodiment mismatch, and environmental variance. In the long term, even when these issues are solved, we believe task-specific robots will remain defensible against generalist companies due to non-fungible data and hardware moats. Whereas AI agents are easily replicable and need deep vertical integration to stay defensible, efficacy in robotics also derives from a robot’s form factor. A humanoid robot built for washing dishes and folding laundry will not be suited for cleaning the undersides of cruise ships. What’s intuitive to one embodiment may be entirely alien to another. You need a lot more data about a bunny if you don’t want a robot arm to kill it. Safe, robust behavior depends on collecting the right data for the right body in the right context.
Again, regardless of form or task, we are always looking for companies with strong answers to the four key questions outlined above, the technical depth to overcome those same software constraints, and the ability to shift from RaaS (robot-as-a-service) to data platforms. Products should improve with every new deployment. Examples include Gecko Robotics, which builds hardware and software to monitor hard-to-maintain critical infrastructure, and Overland AI, which helps address GPS-denied, dynamic, or low-visibility navigation.
We’re less interested in robots that perform difficult tasks but do not capture the mission-critical data essential for evolving into a broader platform.
Conclusion
Foundational models for robotics won’t emerge overnight. Unlike language or images, physical intelligence is embodied. That makes every decision, from data collection to deployment, not just a software problem, but a systems problem.



