🗓️ Posted: January 22nd, 2026
Shu Liu, Shubham Agarwal, Audrey Cheng, Ion Stoica, and the ADRS team
<aside> 💡
This post is part of our AI-Driven Research for Systems (ADRS) case study series, where we use AI to automatically discover better algorithms for real-world systems problems.
In this post, we study Cloudcast, a problem originally introduced in an NSDI ’24 paper, which focuses on cost-aware data multicasting across multi-region and multi-cloud networks. The goal is to transfer large datasets from one source to many destinations while minimizing total egress cost.
We start from a simple, intuitive baseline strategy and show how GEPA automatically discovers a significantly more cost-efficient solution, achieving over 30% cost reduction and converging to a structure close to the human-designed state-of-the-art. GEPA’s solution builds a single overlay that connects all nodes through a set of carefully chosen intermediate waypoints rather than sending data separately to every destination.
</aside>
In real-world multi-cloud deployments, systems frequently need to multicast the same data from a source node to many destination nodes across different clouds and regions. These multicasts traverse heterogeneous and asymmetric cloud networks, where link bandwidths, latencies, and, most importantly, monetary egress costs vary widely across intra-cloud and inter-cloud connections. Available paths also differ across providers, making the cost of data transfers topology-dependent.

Figure 1. Multi-cloud data multicasting across different cloud providers.
In multi-cloud data multicasting, the primary cost is often data egress. When data is transferred across cloud providers or regions, each transfer incurs a monetary charge that can vary significantly depending on the source, destination, and network path. As shown in Figure 2 below, transferring the same data item independently from a source region to each destination region can lead to substantial costs.
Indeed, sending identical data directly from the source to every destination leads to redundant transfers over expensive inter-cloud links, unnecessarily inflating total egress cost. In contrast, by carefully designing a multicast tree, we can reuse intermediate transfers, route data through cheaper links, and reduce the overall cost of moving data across clouds. Cloudcast therefore seeks multicast trees that minimize total egress cost while delivering data to all destinations.
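To make the difference concrete, here is a minimal sketch that compares direct sends against a small relay tree. It assumes a simplified cost model in which each directed link charges a flat per-GB price and the data crosses each chosen link once. The region names, prices, and the `egress_cost` helper are all hypothetical and chosen only for illustration; they are not taken from the Cloudcast paper or from any provider's price list.

```python
# Hypothetical per-GB egress prices (USD) for directed links.
# Region names and prices are invented for illustration only.
PRICE_PER_GB = {
    ("cloudA:east", "cloudB:west"): 0.09,    # inter-cloud, expensive
    ("cloudA:east", "cloudB:europe"): 0.09,  # inter-cloud, expensive
    ("cloudA:east", "cloudB:asia"): 0.09,    # inter-cloud, expensive
    ("cloudB:west", "cloudB:europe"): 0.02,  # intra-cloud, cheaper
    ("cloudB:west", "cloudB:asia"): 0.05,    # intra-cloud, cheaper
}


def egress_cost(edges, size_gb):
    """Total egress cost when size_gb is sent once over each directed edge."""
    return sum(PRICE_PER_GB[edge] * size_gb for edge in edges)


SOURCE = "cloudA:east"
DESTINATIONS = ["cloudB:west", "cloudB:europe", "cloudB:asia"]

# Baseline: the source sends a separate copy to every destination.
direct_edges = [(SOURCE, dst) for dst in DESTINATIONS]

# Tree: cross the expensive inter-cloud link once, then fan out
# from cloudB:west over the cheaper intra-cloud links.
tree_edges = [
    (SOURCE, "cloudB:west"),
    ("cloudB:west", "cloudB:europe"),
    ("cloudB:west", "cloudB:asia"),
]

direct_cost = egress_cost(direct_edges, size_gb=100)
tree_cost = egress_cost(tree_edges, size_gb=100)

print(f"direct: ${direct_cost:.2f}  tree: ${tree_cost:.2f}")
```

With these made-up prices, the tree pays the inter-cloud price once instead of three times, so it comes out cheaper than the direct baseline. Real topologies have many more candidate links, which is what makes choosing the tree hard.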
This problem is deceptively simple to state. Given: