-
SOSP'25 COpter: Efficient Large-Scale Resource-Allocation via Continual Optimization
Suhas Jayaram Subramanya, Don Kurian Dennis, Gregory R. Ganger, Virginia Smith
Keywords: Resource-Allocation, MILP.
Motivation: 1) Existing solvers are not scalable: extremely slow at large problem sizes. 2) Existing acceleration techniques cannot preserve both efficiency and optimality. 3) Goal: accelerate solving while keeping optimality.
Design: 1) Solve a standard LP: cast the different allocation problems into a unified standard-LP form. 2) System-level manipulation of problem parameters for acceleration. 3) Eliminate slow post-processing from relaxed LP solutions to MILP solutions by using simple rounding (sketch below).
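A minimal relax-then-round sketch (the toy job-to-machine instance is an assumption, not COpter's formulation): solve the LP relaxation with SciPy, then round the fractional solution to an integral allocation.
```python
# Relax-then-round sketch: LP relaxation of a tiny assignment-style MILP,
# followed by simple rounding (illustrative only, not COpter's pipeline).
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy instance: assign 3 jobs to 2 machines, minimize cost,
# each job placed exactly once, each machine holds at most 2 jobs.
cost = np.array([1.0, 3.0, 2.0, 2.0, 1.0, 4.0])   # x[job, machine] flattened
A_eq = np.array([[1, 1, 0, 0, 0, 0],              # job 0 placed once
                 [0, 0, 1, 1, 0, 0],              # job 1 placed once
                 [0, 0, 0, 0, 1, 1]])             # job 2 placed once
b_eq = np.ones(3)
A_ub = np.array([[1, 0, 1, 0, 1, 0],              # machine 0 capacity
                 [0, 1, 0, 1, 0, 1]])             # machine 1 capacity
b_ub = np.array([2, 2])

# LP relaxation: 0 <= x <= 1 instead of x in {0, 1}.
res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=(0, 1))
x_frac = res.x.reshape(3, 2)

# Simple rounding: each job goes to the machine with the largest fractional value.
x_int = np.zeros_like(x_frac)
x_int[np.arange(3), x_frac.argmax(axis=1)] = 1
print("fractional:\n", x_frac, "\nrounded:\n", x_int)
```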
-
SOSP'24 Tiered Memory Management: Access Latency is the Key!
Midhul Vuppalapati, Rachit Agarwal
Keywords: Measurement, Page placement
Motivation: 1) Placing the hottest pages in the fast tier is not always best, as demonstrated by experiments. 2) Placement should instead minimize expected access latency.
Design: 1) Measure expected (queuing-inclusive) latency on the CPU-to-memory datapath with low overhead. 2) Algorithm (or principle): migrate pages between the hot tier and the alternative tier based on measured latencies (sketch below).
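A minimal sketch of the latency-driven placement rule; the latency numbers below are illustrative assumptions, not the paper's measurements.
```python
# Place a page in the tier with the lowest measured expected latency,
# accounting for queuing delay rather than hotness alone.
from dataclasses import dataclass

@dataclass
class Tier:
    name: str
    device_latency_ns: float   # unloaded access latency
    queuing_delay_ns: float    # measured queuing delay on the CPU-to-memory datapath

    def expected_latency(self) -> float:
        return self.device_latency_ns + self.queuing_delay_ns

def choose_tier(tiers):
    """Pick the tier that minimizes expected access latency for this page."""
    return min(tiers, key=lambda t: t.expected_latency()).name

# Under heavy local-DRAM load, even hot pages can see lower expected latency remotely.
dram = Tier("local DRAM", device_latency_ns=100, queuing_delay_ns=400)
cxl = Tier("CXL memory", device_latency_ns=250, queuing_delay_ns=20)
print(choose_tier([dram, cxl]))  # -> "CXL memory"
```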
-
SIGCOMM'25 SaTE: Low-Latency Traffic Engineering for Satellite Networks
Hao Wu, Yizhan Han, Mohit Rajpal, Qizhen Zhang, Jingxian Wang
Keywords: Satellite Network, Traffic Engineering, ML Generalization
Motivation: 1) Dynamics: satellite network topologies change frequently. 2) Generalization: a common weakness of NN-based approaches. 3) Large scale: numerous nodes and links.
Design: 1) GNN-only: eliminate the DNN component to improve generalizability. 2) Graph pruning based on topology similarity to an existing baseline topology, which also improves generalizability. 3) Supervised learning of Gurobi's solutions (sketch below).
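A minimal sketch of supervised training of a simple message-passing GNN against solver-derived labels; the architecture, features, and per-node output below are assumptions, not SaTE's actual design.
```python
# Simple message-passing GNN regressed against placeholder solver labels
# (plain PyTorch; a real model would output per-link traffic splits).
import torch
import torch.nn as nn

class SimpleGNNLayer(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.lin_self = nn.Linear(dim, dim)
        self.lin_nbr = nn.Linear(dim, dim)

    def forward(self, x, adj):
        # Mean-aggregate neighbor features, then combine with self features.
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        nbr = adj @ x / deg
        return torch.relu(self.lin_self(x) + self.lin_nbr(nbr))

class TEModel(nn.Module):
    def __init__(self, in_dim=4, hid=32, layers=2):
        super().__init__()
        self.embed = nn.Linear(in_dim, hid)
        self.gnn = nn.ModuleList(SimpleGNNLayer(hid) for _ in range(layers))
        self.out = nn.Linear(hid, 1)   # per-node score for illustration

    def forward(self, x, adj):
        h = torch.relu(self.embed(x))
        for layer in self.gnn:
            h = layer(h, adj)
        return self.out(h).squeeze(-1)

# Toy training loop: regress against labels that would come from an LP/MILP solver.
n = 8
adj = (torch.rand(n, n) > 0.7).float()
adj = ((adj + adj.t()) > 0).float()
x = torch.rand(n, 4)
solver_labels = torch.rand(n)          # placeholder for Gurobi-derived targets

model = TEModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(100):
    loss = nn.functional.mse_loss(model(x, adj), solver_labels)
    opt.zero_grad(); loss.backward(); opt.step()
```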
-
SIGCOMM'25 Centralium: A Hybrid Route-Planning Framework for Large-Scale Data Center Network Migrations
Yikai Lin, Mohab Gawish (Meta)
Keywords: Centralized Routing, Distributed Routing, Network Migration, Route Planning, BGP
Motivation: 1) Data centers frequently undergo large-scale network migrations. 2) BGP cannot encode the sequential and conditional routing behaviors required during transitional migration phases.
Design: 1) Route Planning Abstraction (RPA): intervenes in BGP decisions by injecting an "intent" (a prioritized set of paths) together with three primitives: Path Selection RPA, Route Attribute RPA, and Route Filter RPA (sketch below). 2) Hybrid control plane: Centralium + BGP. Centralium is a centralized controller embedding the RPA module, while BGP remains responsible for route execution. 3) Protection mechanisms: loop avoidance via worst-attribute advertisement, and avoidance of transient traffic imbalance via bottom-up deployment.
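A hypothetical sketch of how an RPA "intent" with prioritized paths and the three primitive types could be represented; all field names and semantics below are assumptions, not Centralium's API.
```python
# Toy data model for an RPA intent: a prefix, a primitive type, and a
# priority-ordered list of paths the controller wants BGP to follow.
from dataclasses import dataclass, field
from enum import Enum

class RPAType(Enum):
    PATH_SELECTION = "path_selection"    # pin a prioritized set of paths
    ROUTE_ATTRIBUTE = "route_attribute"  # rewrite attributes (e.g., local-pref)
    ROUTE_FILTER = "route_filter"        # suppress specific routes

@dataclass
class Intent:
    prefix: str
    rpa_type: RPAType
    # Paths in priority order; the controller injects these so BGP best-path
    # selection follows the migration plan instead of its default tie-breakers.
    preferred_paths: list = field(default_factory=list)

plan = [
    Intent("10.0.0.0/24", RPAType.PATH_SELECTION,
           preferred_paths=[["spine1", "fabric2"], ["spine3", "fabric4"]]),
    Intent("10.0.1.0/24", RPAType.ROUTE_FILTER),
]
for intent in plan:
    print(intent.rpa_type.value, intent.prefix, intent.preferred_paths)
```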
-
SIGCOMM'25 PreTE: Traffic Engineering with Predictive Failures
Congcong Miao (Tencent), Zhizhen Zhong (Massachusetts Institute of Technology), et al.
Keywords: Wide-area networks, Traffic engineering, Optical failures, Machine learning, Network optimization
Key Insight: 25% of fibers transmit a "warning signal" (optical degradation) seconds to minutes before the actual cut.
Motivation: 1) Network operators want both high utilization and high availability from their fibers. 2) Use this "warning signal" to switch from static guessing to dynamic prediction.
Design: 1) Dynamically adjust failure probabilities with an NN (previously static). 2) Customized: exploit the "time window" of the deterioration signal to build new tunnels in advance. 3) Employ Benders decomposition (the master problem chooses scenarios, the subproblem computes allocations, iterating) to optimize traffic-flow allocation (sketch below).
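A deliberately simplified illustration (not the paper's Benders decomposition): weight link capacities by NN-predicted failure probabilities and pick the path with the highest expected residual capacity.
```python
# Expected-capacity path choice under predicted failure probabilities
# (link values and topology are made up; p_fail would come from the NN).
links = {
    ("A", "B"): {"capacity": 100.0, "p_fail": 0.02},
    ("A", "C"): {"capacity": 80.0,  "p_fail": 0.40},   # degrading fiber: high predicted risk
    ("C", "B"): {"capacity": 80.0,  "p_fail": 0.01},
}
paths = {"direct": [("A", "C"), ("C", "B")], "spare": [("A", "B")]}

def expected_capacity(path):
    # Path survives only if every link survives; capacity is the bottleneck link.
    p_survive = 1.0
    bottleneck = float("inf")
    for link in path:
        p_survive *= 1.0 - links[link]["p_fail"]
        bottleneck = min(bottleneck, links[link]["capacity"])
    return p_survive * bottleneck

best = max(paths, key=lambda name: expected_capacity(paths[name]))
print(best, {name: round(expected_capacity(p), 1) for name, p in paths.items()})
```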
-
SIGCOMM'25 From ATOP to ZCube: Automated Topology Optimization Pipeline and a Highly Cost-Effective Network Topology for Large Model Training
Zihan Yan, Dan Li (Tsinghua University)
Keywords: Data center networks, Network topology, AI infrastructure
Motivation: 1)The explosive growth in LLM training scales requires new large-scale network topology designs. 2)Expert-designed topologies overlook potential asymmetric structures and struggle to balance multi-objective performance; existing automated approaches are not mature enough.
Design: 1) Formalize existing topologies into hyperparameters to construct a topology search space. 2) Use the NSGA-II evolutionary algorithm to explore topology designs, evaluating candidates in a simulator (ASTRA-sim). 3) Two-stage evaluator: the first stage produces a Pareto-optimal set of topologies; the second obtains the final results (sketch below).
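A minimal sketch of the Pareto-filtering step underlying such multi-objective search; the candidate topologies and objective values are made up, and real scores would come from simulation.
```python
# Pareto front over (cost, iteration_time), both to be minimized.
def dominates(a, b):
    """a dominates b if it is no worse on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    front = []
    for name, objs in candidates.items():
        if not any(dominates(other, objs)
                   for other in candidates.values() if other is not objs):
            front.append(name)
    return front

candidates = {
    "fat-tree":   (10.0, 1.0),
    "zcube-ish":  (6.0, 1.1),
    "torus":      (5.0, 1.6),
    "bad-design": (9.0, 1.5),   # dominated by zcube-ish, so it is dropped
}
print(pareto_front(candidates))
```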
-
DAC'21 NVCell: Standard Cell Layout in Advanced Technology Nodes with Reinforcement Learning
Haoxing Ren, Matthew Fojtik
Keywords: Standard Cell Layout, RL, DRC, Placement and Routing
Motivation:1) Advanced technology nodes face DRC explosion with conditional and multi-pattern correlation, hard to model analytically. 2) Traditional methods suffer from long runtime, variable explosion, and poor scalability. 3) Need automated layout generation with competitive area and DRC compliance.
Design: 1) Integrate simulated annealing, RL, and ML routability prediction for placement (simulated-annealing sketch below). 2) Use a genetic algorithm for initial routing and RL for local DRC fixing. 3) Pre-train with simulated-annealing samples. 4) ML routability predictor.
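A generic simulated-annealing placement sketch on a toy 1-D wirelength objective; NVCell's actual placement state, moves, and cost model are not reproduced here.
```python
# Simulated annealing over cell orderings, minimizing total wirelength
# between connected cells (toy connectivity, illustrative only).
import math
import random

random.seed(0)
cells = list(range(6))
nets = [(0, 3), (1, 4), (2, 5), (0, 5)]   # hypothetical connectivity

def wirelength(order):
    pos = {c: i for i, c in enumerate(order)}
    return sum(abs(pos[a] - pos[b]) for a, b in nets)

def anneal(order, t0=5.0, cooling=0.995, steps=5000):
    best = cur = order[:]
    for step in range(steps):
        t = t0 * cooling ** step
        cand = cur[:]
        i, j = random.sample(range(len(cand)), 2)
        cand[i], cand[j] = cand[j], cand[i]              # swap two cells
        delta = wirelength(cand) - wirelength(cur)
        if delta < 0 or random.random() < math.exp(-delta / max(t, 1e-9)):
            cur = cand
            if wirelength(cur) < wirelength(best):
                best = cur[:]
    return best

print(wirelength(cells), "->", wirelength(anneal(cells)))
```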
-
DAC'20 GCN-RL Circuit Designer: Transferable Transistor Sizing with Graph Neural Networks and Reinforcement Learning
Hanrui Wang, Kuan Wang, Jiacheng Yang, et al.
Keywords: Transistor Sizing, Graph Convolutional Network (GCN), RL, Transfer Learning
Motivation: 1) Analog circuit sizing relies on human experts, time-consuming with complex performance tradeoffs. 2) Traditional black-box methods (BO, ES) ignore circuit topology and cannot transfer knowledge across technology nodes/topologies. 3) Need transferable automated sizing with superior performance.
Design: 1) Circuit modeled as a graph to capture topology. 2) A 7-layer GCN aggregates neighbor features; an actor-critic architecture with DDPG handles the continuous action space (sketch below). 3) Action space: component-specific continuous parameters to avoid discrete-space explosion.
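A minimal DDPG-style actor-critic sketch for continuous sizing actions; the paper's 7-layer GCN encoder is replaced by a plain MLP here, and all dimensions are illustrative.
```python
# Actor outputs continuous actions in [-1, 1] (later mapped to sizing ranges);
# critic scores (state, action) pairs; the actor ascends the critic's Q estimate.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM = 16, 4   # e.g., circuit embedding -> per-component W/L scalars

class Actor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, ACTION_DIM), nn.Tanh())
    def forward(self, s):
        return self.net(s)

class Critic(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
                                 nn.Linear(64, 1))
    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

actor, critic = Actor(), Critic()
s = torch.rand(8, STATE_DIM)
a = actor(s)
q = critic(s, a)
# Actor update direction: maximize Q (in full DDPG the critic is trained
# separately against a TD target and frozen during this step).
loss = -q.mean()
loss.backward()
print(a.shape, q.shape)
```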
-
DAC'24 PVTSizing: A TuRBO-RL-Based Batch-Sampling Optimization Framework for PVT-Robust Analog Circuit Synthesis
Zichen Kong, Xiyuan Wang
Keywords: PVT-Robust, TuRBO, PCGrad, Batch Sampling, Cutting
Motivation: 1)Existing tools lack efficient PVT exploration strategies, leading to redundant simulations. 2) Traditional methods either treat each PVT condition as independent or use low-quality initial sampling. 3) Need an automated framework to balance PVT dynamically.
Design: 1) TuRBO for high-quality initial sampling. 2) PCGrad algorithm to handle optimization conflicts across PVT corners (sketch below). 3) Critic-assisted pruning to predict scheme impact and filter ineffective candidates. 4) Tighter training constraints, looser validation constraints.
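A toy NumPy sketch of the PCGrad projection idea: when per-corner gradients conflict (negative dot product), project away the conflicting component before summing. This is not the paper's implementation.
```python
# PCGrad-style gradient surgery across PVT-corner objectives.
import random
import numpy as np

random.seed(0)

def pcgrad(grads):
    """grads: list of per-PVT-corner gradient vectors (1-D numpy arrays)."""
    projected = []
    for i, g in enumerate(grads):
        g = g.copy()
        others = [grads[j] for j in range(len(grads)) if j != i]
        random.shuffle(others)
        for h in others:
            dot = g @ h
            if dot < 0:                  # the two corners pull in conflicting directions
                g -= dot / (h @ h) * h   # drop the conflicting component of g
        projected.append(g)
    return np.sum(projected, axis=0)

g_fast_corner = np.array([1.0, 0.0])
g_slow_corner = np.array([-0.5, 1.0])    # partially conflicts with the fast corner
print(pcgrad([g_fast_corner, g_slow_corner]))   # combined update after projection
```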
-
DAC'25 Reinforcement Learning-Driven Window Selection for Enhanced Window-Based Rip-up and Reroute in Chip Detailed Routing
Yu-Chan Keng, Yu-Chun Pai
Keywords: Detailed Routing, Rip-up and Reroute (RUR), Design Rule Violation (DRV), RL, SE-ResNet, Dynamic Window
Motivation: 1) Traditional tools use fixed windows, failing to adapt to DRV distribution. 2) Fixed RUR modes (RM0/RM1) lead to inefficiency. 3) DRV propagation is hard to handle with static strategies.
Design: 1) Model detailed-routing window selection as an RL task, with SE-ResNet as the core network for feature extraction and decision-making. 2) Action space: sequential window extension and RM mode selection. 3) Multi-component reward: a window shape/size reward, a DRV-correction reward, and an RM-mode penalty to balance effectiveness and efficiency (sketch below).
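A minimal sketch of a multi-component reward of the kind described above; the weights and exact terms are assumptions, not the paper's reward function.
```python
# Reward combines corrected DRVs, a penalty on oversized windows, and a
# penalty for invoking the more expensive RUR mode.
def window_reward(drv_before, drv_after, window_area, max_area, used_rm1):
    drv_fixed = drv_before - drv_after
    r_drv = 1.0 * drv_fixed                     # reward for corrected violations
    r_shape = -0.1 * (window_area / max_area)   # penalize oversized windows
    r_mode = -0.5 if used_rm1 else 0.0          # aggressive RM mode costs runtime
    return r_drv + r_shape + r_mode

print(window_reward(drv_before=12, drv_after=5,
                    window_area=400, max_area=1000, used_rm1=True))
```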
-
DAC'25 Hardware Generation with High Flexibility using Reinforcement Learning Enhanced LLMs
Yifang Zhao, Weimin Fu
Keywords: Offline dataset, RTL Code, LLMs, PPA Optimization, DPO
Motivation: 1) Traditional flows discover mismatches only after synthesis, leading to high iteration cost. 2) Existing LLM-based hardware generation ignores post-synthesis PPA metrics. 3) Real-time PPA simulation is computationally expensive, while ML-based approximation lacks precision.
Design: 1) Integrate RL with LLMs to incorporate PPA feedback into code generation. 2) Build an offline PPA dataset: generate multiple RTL codes per function description and synthesize them to extract PPA values. 3) Create preference datasets via configurable PPA weights to select the best/worst codes (sketch below). 4) Use the DPO algorithm for RL training. 5) Compare two training strategies (RL-only vs. SFT-RL).
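A minimal sketch of turning synthesized PPA numbers into DPO preference pairs; the candidate metrics and weights below are made up.
```python
# Rank RTL candidates by a weighted PPA score (lower is better) and form a
# (prompt, chosen, rejected) triple of the kind DPO trains on.
def ppa_score(metrics, w_power=0.3, w_perf=0.4, w_area=0.3):
    # All metrics normalized so that smaller is better.
    return (w_power * metrics["power"]
            + w_perf * metrics["delay"]
            + w_area * metrics["area"])

candidates = {  # multiple RTL versions generated for the same spec
    "rtl_v1": {"power": 0.8, "delay": 0.5, "area": 0.6},
    "rtl_v2": {"power": 0.4, "delay": 0.7, "area": 0.5},
    "rtl_v3": {"power": 0.9, "delay": 0.9, "area": 0.9},
}
ranked = sorted(candidates, key=lambda k: ppa_score(candidates[k]))
preference_pair = {"prompt": "spec: 8-bit adder",
                   "chosen": ranked[0], "rejected": ranked[-1]}
print(preference_pair)
```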
-
DAC'24 CAMO: Correlation-Aware Mask Optimization with Modulated Reinforcement Learning
Xiaoxiao Liang, Haoyu Yang, Kang Liu, Bei Yu, Yuzhe Ma
Keywords: Optical Proximity Correction, RL, GNN, RNN, Inspired Modulator.
Motivation: 1) Regression- and generative-based OPC are bounded by dataset quality and fail on complex metal layers. 2) Existing RL-OPC ignores spatial correlation among neighboring segments. 3) Action space grows exponentially.
Design: 1) GraphSAGE fuses local geometry along proximity edges. 2) OPC-inspired modulator guides stable and fast convergence. 3) Two-stage training: warm-up with Calibre, followed by self-exploration to surpass the expert.
-
DAC'24 Using Probabilistic Model Rollouts to Boost the Sample Efficiency of Reinforcement Learning for Automated Analog Circuit Sizing
Mohsen Ahmadzadeh, Georges G. E. Gielen
Keywords: Analog Circuit Sizing, Model-Based RL, TD3.
Motivation: 1) Analog sizing demands thousands of SPICE-level simulations. 2) Genetic/Bayesian optimizers converge too slowly. 3) Need sample-efficient RL that retains accuracy while cutting simulation cost.
Design: 1) MBTD3: wraps TD3 with an ensemble of probabilistic neural nets that output the mean and variance of the next circuit specs (sketch below). 2) Short k-step rollouts are branched from past real states, yielding cheap model-generated transitions. 3) Optimal-neighbourhood exploration. 4) MA-MBTD3: partitions large circuits into weakly coupled sub-blocks; each block owns its own TD3 agent.
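A minimal sketch of a probabilistic dynamics ensemble and a short branched rollout; dimensions, the placeholder policy, and the sampling details are assumptions, not the paper's setup.
```python
# Ensemble of networks predicting mean and log-variance of the next state;
# k-step rollouts are branched from real states to generate cheap transitions.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, ENSEMBLE = 6, 3, 4

class ProbModel(nn.Module):
    """Predicts mean and log-variance of the next circuit-spec state."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU())
        self.mean = nn.Linear(64, STATE_DIM)
        self.logvar = nn.Linear(64, STATE_DIM)
    def forward(self, s, a):
        h = self.body(torch.cat([s, a], dim=-1))
        return self.mean(h), self.logvar(h)

ensemble = [ProbModel() for _ in range(ENSEMBLE)]

def rollout(s, policy, k=3):
    """Branch a k-step model rollout from real states using random ensemble members."""
    transitions = []
    for _ in range(k):
        a = policy(s)
        model = ensemble[torch.randint(ENSEMBLE, (1,)).item()]
        mean, logvar = model(s, a)
        s_next = mean + torch.randn_like(mean) * (0.5 * logvar).exp()  # sample next state
        transitions.append((s, a, s_next))
        s = s_next
    return transitions

policy = lambda s: torch.tanh(torch.randn(s.shape[0], ACTION_DIM))  # placeholder actor
real_state = torch.rand(8, STATE_DIM)
print(len(rollout(real_state, policy)))  # 3 imagined steps branched from real data
```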
-
DAC'24 Advanced Reinforcement Learning Algorithms to Optimize Design Verification
Zahra Aref, Rohit Suvarna, Bill Hughes, Sandeep Srinivasan, Narayan B. Mandayam
Keywords: Design Verification, DDPG, PEF, FIFO Depth.
Motivation: 1) Deep FIFOs: hard-to-hit states remain uncovered. 2) Bayesian optimization is slower on the 32-D stimulus-knob space. 3) Claimed as the first RL approach to maximize average FIFO depth.
Design: 1) Discrete blocks are treated as a single-state bandit. 2) DDPG-PER prioritizes high-TD-error experiences, boosting sample efficiency (sketch below).
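A toy sketch of prioritized experience replay: sample transitions with probability proportional to |TD error|^alpha (importance-sampling weights omitted; not the paper's implementation).
```python
# Prioritized replay buffer biased toward high-TD-error transitions.
import numpy as np

rng = np.random.default_rng(0)

class PrioritizedBuffer:
    def __init__(self, alpha=0.6):
        self.items, self.priorities, self.alpha = [], [], alpha

    def add(self, transition, td_error):
        self.items.append(transition)
        self.priorities.append((abs(td_error) + 1e-6) ** self.alpha)

    def sample(self, k):
        p = np.array(self.priorities)
        p = p / p.sum()
        idx = rng.choice(len(self.items), size=k, p=p, replace=True)
        return [self.items[i] for i in idx]

buf = PrioritizedBuffer()
for i in range(10):
    buf.add(transition=f"t{i}", td_error=float(i))  # later transitions have larger TD error
print(buf.sample(4))  # biased toward high-TD-error transitions
```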