🔑 Key Takeaways
- 56% of enterprises are now running or planning to run production AI inference in a private cloud.
- 83% of organizations are actively considering repatriating workloads from public clouds.
- Cost has overtaken security as the primary concern for public cloud deployments.
- Broadcom’s VCF 9.1 targets this shift with unified, cost-effective private AI infrastructure.
- Data sovereignty and compliance are driving localized, on-premises AI deployments.
The era of defaulting to the public cloud for AI deployments is coming to an end. According to Broadcom’s newly released Private Cloud Outlook 2026 report, enterprise IT has reached a critical inflection point. Driven by spiraling hyperscaler costs, stringent data sovereignty mandates, and the maturation of turnkey private AI infrastructure, organizations are aggressively pursuing AI repatriation. Production AI workloads are moving back to on-premises and private cloud environments at an unprecedented scale.
When enterprises originally began building their artificial intelligence strategies, the default assumption was straightforward: AI would run in the hyperscaler cloud. The APIs were ready, the GPU capacity was building out, and the inertia of a decade of public cloud investment pointed in one direction. However, as organizations transition from pilot programs to production scale, the financial and operational realities of public cloud AI have forced a massive structural reallocation of enterprise technology budgets.
The data from Broadcom’s blind, global survey of 1,800 senior IT leaders across eight countries is undeniable. Last year, 56% of enterprises used public cloud as the primary environment for production AI inference. In 2026, that number has plummeted 15 percentage points to 41%. Conversely, 56% of enterprises are now running or planning to run production inferencing in a private cloud. The broader repatriation trend has accelerated sharply: 83% of enterprises are now considering repatriation, up from 69% in 2025, and half have already moved at least some workloads.
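These figures mix two kinds of change, and it helps to keep percentage points separate from relative change. A minimal Python sketch using only the numbers quoted above; the helper names are illustrative and do not come from the report:

```python
def point_change(before: float, after: float) -> float:
    """Absolute change between two shares, in percentage points."""
    return after - before


def relative_change(before: float, after: float) -> float:
    """Change relative to the starting share, in percent."""
    return (after - before) / before * 100


# Public cloud as the primary environment for production AI inference: 56% -> 41%
print(point_change(56, 41))               # -15
print(round(relative_change(56, 41), 1))  # -26.8

# Enterprises considering repatriation: 69% -> 83%
print(point_change(69, 83))               # 14
print(round(relative_change(69, 83), 1))  # 20.3
```

In other words, the 15-point drop amounts to roughly a 27% relative decline in the share of enterprises using public cloud as their primary inference environment, while the 14-point rise in repatriation interest is about a 20% relative increase.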
The Architectural Reality

To understand the mechanics of this shift, one must look at the underlying infrastructure enabling it. The release of VMware Cloud Foundation (VCF) 9.1 in May 2026 serves as a prime example of how private AI infrastructure is being engineered to be as consumable as public cloud, but with strict cost and governance controls. Broadcom positions VCF 9.1 as the most cost-effective and secure foundation for production AI, operating under a single control plane on infrastructure the enterprise owns and governs.
The architectural divergence between AI training and AI inference is at the heart of this transition. While training massive foundation models still largely requires the immense, concentrated compute power of hyperscaler public clouds, inference (the actual day-to-day running of these models) demands a different approach. Inference requires low latency, predictable costs, and proximity to proprietary enterprise data. By deploying LLM infrastructure directly within a private cloud environment, enterprises can run AI-powered content summarization, agentic workflows, and data analysis without exposing sensitive information to public networks.
VCF 9.1 addresses the operational complexity that previously hindered private cloud adoption. It provides a unified platform that supports mixed compute infrastructure across AMD, Intel, and NVIDIA hardware. This flexibility is crucial for enterprises looking to avoid vendor lock-in and optimize their hardware investments based on specific workload requirements. Furthermore, features like intelligent memory tiering, compression, and deduplication drive down the total cost of ownership (TCO) for storage and compute, making the private cloud economically superior for sustained, high-volume inference workloads.
Security and compliance remain the single most important factor in workload placement decisions, cited by 32% of respondents in the Broadcom report. On top of existing obligations, AI is introducing new ones: data protection and privacy (37%) and security and control (36%) are now the leading infrastructure requirements that AI brings to the table. Private cloud architectures inherently provide the governance framework to meet these requirements by design, built in from the start rather than bolted on after deployment.
Market Impact & Deployment
The financial implications of this shift are significant. For the first time in Broadcom’s ongoing study, cost has overtaken security as the top concern regarding public cloud usage. Fully 97% of IT leaders believe some portion of their public cloud spend is wasted, and more than half (52%) say that waste exceeds 25% of their total spending. Generative AI and agentic workloads are compounding this pressure, with 62% of IT leaders reporting that they are very or extremely concerned about AI infrastructure costs.
This cost crisis has elevated cloud cost management from a tactical IT function to a board-level strategic imperative. Enterprises that built their AI ambitions on variable, consumption-based public cloud pricing are recalculating. Private cloud investment is now growing at more than twice the rate of public cloud, with net intent to increase private cloud investment over three years rising from 51% to 72%.
However, a closer look at the broader market reveals a two-pronged strategy by Broadcom. While its VMware division aggressively pushes enterprises toward private cloud inference, its semiconductor division is simultaneously co-designing and supplying the custom silicon that powers the very hyperscalers those enterprises are fleeing. Broadcom co-designs custom AI chips (ASICs) for tech giants like Google, Meta, Anthropic, and OpenAI.
In April 2026 alone, Broadcom secured a long-term supply agreement for Google’s next-generation TPUs through 2031, expanded its partnership with Meta for the MTIA chip through 2029, and facilitated a massive compute-access arrangement for Anthropic. Broadcom’s CEO, Hock Tan, projects AI chip revenue to exceed $100 billion by 2027, backed by a $73 billion committed customer backlog. This brilliant “picks-and-shovels” strategy ensures that Broadcom profits immensely whether an enterprise runs its AI in a Google data center or on a private VMware server rack.
Geopolitics and Data Sovereignty
Beyond cost and architecture, geopolitics has entered the infrastructure conversation in a significant way. Eighty-six percent of IT leaders say geopolitical and regulatory factors are now directly affecting their IT strategy and operations. Data sovereignty and residency requirements are the top concern, cited by 54% of respondents, followed by jurisdiction-specific compliance requirements at 51%.
For enterprises operating across borders, decisions about where data lives carry direct implications for where workloads can run. AI workloads that process sensitive, regulated, or proprietary data require infrastructure that provides governance and control from the ground up. The European Union’s AI Act, alongside tightening data localization laws in the Asia-Pacific region, has made reliance on centralized, cross-border public clouds a massive legal liability.
AI sovereignty is no longer just about where the data sits; it is about controlling the entire operational stack. As Chris Wolf, global head of AI and advanced services for the VMware Cloud Foundation Division, noted, true sovereignty means that an enterprise can disconnect from the internet and continue to operate its AI models without interruption. This level of resilience and control is fundamentally incompatible with a pure public cloud model, further accelerating the repatriation trend.
The Consumer Translation
While the mechanics of enterprise cloud infrastructure may seem distant from the average consumer, the shift toward private AI has profound implications for the general public. First and foremost is the issue of data privacy. When a consumer interacts with an AI-powered banking assistant, healthcare portal, or retail application, their personal data is processed by an underlying model. If that model runs in a public cloud, the data traverses multiple networks and resides on shared infrastructure. By repatriating these workloads to private clouds, enterprises can substantially reduce the attack surface and keep consumer data within their direct, localized control.
Second, the move to private, localized AI inference can significantly reduce latency. For consumer applications requiring real-time responsiveness, such as autonomous driving networks, augmented reality interfaces, and voice-activated smart agents, the round-trip time to a centralized hyperscaler data center is a critical bottleneck. Localized private clouds and edge computing nodes bring AI processing closer to the user, resulting in smoother, faster, and more reliable consumer experiences.
Finally, there is an economic trickle-down effect. The exorbitant costs of public cloud AI inference are currently being passed down to consumers in the form of subscription fees and premium service tiers. If enterprises can successfully slash their AI operating costs by 30% to 50% through private cloud repatriation and optimized infrastructure, the barrier to entry for advanced AI features will lower, democratizing access to next-generation digital tools.
TechNode HQ Verdict: Pros, Cons & Usability
- Pro (Engineering): VCF 9.1 provides a unified, hardware-agnostic control plane that drastically simplifies the deployment of Kubernetes and AI workloads on-premises, reducing the operational overhead of managing fragmented infrastructure.
- Pro (Consumer): Enhanced data privacy and reduced latency for end-user applications, as sensitive information is processed locally rather than being transmitted to centralized public cloud servers.
- Con: The upfront Capital Expenditure (CapEx) required to build out private AI infrastructure (servers, GPUs, networking gear) remains a significant barrier for mid-market companies compared to the pay-as-you-go public cloud model.
- Con: The industry faces a severe skills gap; 40% of IT leaders cite AI infrastructure and operations as their top talent shortage, making the management of private clouds challenging without heavy reliance on professional services.
Enterprise Usability: CTOs and enterprise architects should immediately audit their public cloud AI inference spend. If inference costs are scaling linearly with user adoption, a phased repatriation strategy using platforms like VCF 9.1 is highly recommended to stabilize budgets and ensure compliance with emerging data sovereignty laws.
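One way to frame such an audit is a simple break-even comparison between linearly scaling public cloud spend and a private deployment with upfront CapEx. The sketch below is a simplified model, not a Broadcom methodology: every price, the `CostModel` fields, and the helper names are hypothetical assumptions, and it ignores hardware refresh cycles, discounting, and utilization.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class CostModel:
    """Hypothetical cost inputs; none of these figures come from the report."""
    public_cost_per_1k_requests: float   # variable public cloud price
    private_capex: float                 # upfront servers, GPUs, networking
    private_monthly_opex: float          # power, staff, licences, support
    private_cost_per_1k_requests: float  # marginal on-premises cost


def monthly_public(model: CostModel, requests_k: float) -> float:
    """Public cloud cost scales linearly with request volume."""
    return model.public_cost_per_1k_requests * requests_k


def monthly_private(model: CostModel, requests_k: float) -> float:
    """Private cost is a fixed monthly base plus a smaller marginal cost."""
    return model.private_monthly_opex + model.private_cost_per_1k_requests * requests_k


def breakeven_months(model: CostModel, requests_k: float) -> Optional[int]:
    """First month in which cumulative private cost is no higher than public.

    Returns None when private is not cheaper per month at this volume.
    """
    saving = monthly_public(model, requests_k) - monthly_private(model, requests_k)
    if saving <= 0:
        return None
    return math.ceil(model.private_capex / saving)


# Illustrative numbers only.
model = CostModel(
    public_cost_per_1k_requests=2.0,
    private_capex=1_200_000,
    private_monthly_opex=25_000,
    private_cost_per_1k_requests=0.5,
)

print(breakeven_months(model, 50_000))  # 50M requests/month -> 24
print(breakeven_months(model, 10_000))  # 10M requests/month -> None
```

Under these assumed inputs, the high-volume case recovers its CapEx in two years, while the low-volume case never breaks even, which mirrors the CapEx barrier noted above for mid-market companies.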
Everyday Usability: For the general public, this shift happens behind the scenes. However, consumers should view this trend positively, as it signals a maturation of the AI industry toward more secure, private, and economically sustainable deployment models that will ultimately protect user data more effectively.