Announcement • May 03
CoreWeave Inc. Expands SUNK Capabilities to Bring AI Workloads Online Faster
CoreWeave, Inc., The Essential Cloud for AI™, shared significant momentum in CoreWeave SUNK capabilities that enable AI research and platform teams to accelerate how clusters are set up and run across CoreWeave and multi-cloud environments. As AI training workloads grow larger, run longer, and span thousands of GPUs, organizations are increasingly constrained not by raw compute availability but by the operational complexity and reliability challenges of standing up and managing AI training infrastructure at scale. With the expansion of SUNK self-service and the recent launch of SUNK Anywhere, CoreWeave continues to address an industry-wide bottleneck by enabling faster, guided cluster setup and greater flexibility in where AI workloads run.

CoreWeave SUNK is the industry’s first unified training system for the most demanding AI workloads, built to deliver production-grade reliability and deep operational visibility for large, long-running training jobs. With the expansion of CoreWeave SUNK self-service, customers can bring SUNK clusters into operation through a guided, self-service process that captures CoreWeave’s operational learnings from supporting research clusters at scale, delivering the following benefits:

- Flexible paths for simple and complex needs: Teams can start with a guided path, while those with advanced requirements can work with CoreWeave Solutions Architects to design custom environments for frontier-scale training. Across both approaches, SUNK delivers consistent behavior, strong operational visibility, and CoreWeave-owned lifecycle management.

- Start standardized, stay consistent: SUNK self-service uses standardized setups that reduce drift over time, providing a production-ready starting point that is easier to onboard, easier to manage, and more consistent as clusters evolve. For researchers, this means less time waiting on environments and fewer barriers between access and experimentation. For platform teams, it means a repeatable way to deploy and operate research clusters without rebuilding the same operating model again and again.

- Secure access from day one: Automated User Provisioning can sync users and groups from an identity provider into CoreWeave, while SUNK User Provisioning automatically configures users, permissions, and accounts inside each cluster, reducing manual onboarding while keeping access aligned with real-world research environments.

CoreWeave SUNK Anywhere extends CoreWeave’s unified training system beyond CoreWeave, giving teams a faster and more secure path from proof of concept to production as their deployments expand anywhere they have infrastructure, whether multi-cloud or on-premises. As teams work across environments, they often lose time switching tools, workflows, or operating models every time the infrastructure changes. SUNK Anywhere lets teams operate demanding AI workloads with consistent workflows and operational discipline across environments and clouds. That consistency helps platform teams expand without fragmentation and helps researchers keep familiar scheduling and workflows as their infrastructure footprint grows.

Taken together, CoreWeave SUNK self-service and CoreWeave SUNK Anywhere reinforce CoreWeave’s continued investment in reducing the infrastructure assembly burden of modern AI research clusters. Central to this momentum is CoreWeave Mission Control™, which helps teams spot performance outliers across GPUs, nodes, or communication paths that can degrade synchronized training and erode productive training time. Mission Control is a core element of how CoreWeave is evolving SUNK: giving teams clearer, real-time operational visibility so quiet degradation is easier to diagnose and less dependent on manual work.
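The "familiar scheduling and workflows" above refers to the standard Slurm interface that SUNK (Slurm on Kubernetes) exposes. As a minimal sketch of what stays consistent across environments, a conventional Slurm batch script like the one below could be submitted unchanged with `sbatch`; the job name, resource counts, partition, and training script are hypothetical examples, not CoreWeave defaults.

```shell
#!/bin/bash
# Hypothetical multi-node training job using standard Slurm directives.
# Resource shapes and paths are illustrative only.
#SBATCH --job-name=llm-pretrain
#SBATCH --nodes=4              # four GPU nodes
#SBATCH --gpus-per-node=8      # eight GPUs per node
#SBATCH --ntasks-per-node=8    # one task per GPU
#SBATCH --time=72:00:00        # three-day wall-clock limit

# srun launches one task per GPU across all allocated nodes.
srun python train.py --config config.yaml
```

Because the scheduler interface is plain Slurm, the same submission workflow carries over whether the cluster runs on CoreWeave or, with SUNK Anywhere, on other clouds or on-premises hardware.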
CoreWeave consistently sets new standards for performance, demonstrated by industry-leading MLPerf benchmark results and its position as the only AI cloud to earn the top Platinum ranking in both SemiAnalysis ClusterMAX™ 1.0 and 2.0, which evaluate AI cloud performance, efficiency, and reliability.