## POLITECNICO DI TORINO Repository ISTITUZIONALE

### A Structured Krylov Subspace Projection Framework for Fast Power Integrity Verification

Original

A Structured Krylov Subspace Projection Framework for Fast Power Integrity Verification / Carlucci, Antonio; Grivet-Talocia, Stefano; Mongrain, Scott; Kulasekaran, Sid; Radhakrishnan, Kaladhar. - ELETTRONICO. - (2023), pp. 1-4. (Intervento presentato al convegno 2023 IEEE 27th Workshop on Signal and Power Integrity (SPI) tenutosi a Aveiro, Portugal nel 07-10 May 2023) [10.1109/SPI57109.2023.10145566].

Availability: This version is available at: 11583/2979366 since: 2023-06-14T09:22:11Z

Publisher: IEEE

Published DOI:10.1109/SPI57109.2023.10145566

Terms of use:

This article is made available under terms and conditions as specified in the corresponding bibliographic description in the repository

Publisher copyright IEEE postprint/Author's Accepted Manuscript

©2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collecting works, for resale or lists, or reuse of any copyrighted component of this work in other works.

(Article begins on next page)

# A Structured Krylov Subspace Projection Framework for Fast Power Integrity Verification

Antonio Carlucci\*, Stefano Grivet-Talocia\*, Scott Mongrain§, Sid Kulasekaran§, Kaladhar Radhakrishnan§

\*Dept. Electronics and Telecommunications, Politecnico di Torino, Italy

<sup>§</sup>Intel Corporation, Chandler, AZ, USA

antonio.carlucci@polito.it

Abstract—This paper presents a model order reduction approach, specifically designed for the generation of compact and efficient transient simulation models of system-level power distribution networks (PDN) of multicore processor systems. The proposed approach applies a Krylov subspace projection, with a structure that is adapted to a block-coupled state-space description of individual PDN subsystems. The latter include board-package, averaged models of integrated voltage regulators switching circuitry, and individual models of all cores including regulator inductors and capacitors. Numerical results from proposed reduced-order models provide major speedup with respect to SPICE with negligible loss of accuracy.

#### I. INTRODUCTION

Modern multicore processor systems are equipped with sophisticated voltage regulation systems [1], aimed at stabilizing and controlling power delivery to individual cores. This is usually achieved through Fully Integrated Voltage Regulators (FIVR) [2], usually implemented as multi-phase switching power supplies with buck topology. Power transistors, switching control circuits, and the output decoupling for these FIVRs are fabricated on-die, while the inductors are placed in the package. Feedback loops through dedicated controllers determine the duty cycle of FIVR switches based on the instantaneous output voltage, which is thus adaptively filtered and stabilized. Although FIVRs provide an intended decoupling between the board/package and voltage-regulated core circuitry, a global coupling still exists and must be properly characterized and modeled for full-system power integrity verification. Unfortunately, this type of analysis requires a complete description of all system parts, which for many-core systems may become impractical due to complexity.

In this work, we attempt a complexity reduction through a suitable model order reduction approach [3], [4]. Differently from the black-box approach of [5], based on a compressed representation of the linearized output PDN impedance, in this work we propose a structured projection framework. We first cast the PDN equations in a block-structured system of coupled state-space equations. These include: models of all interconnects and linear passive components, obtained by standard rational fitting of S-parameters from electromagnetic simulations; averaged models of FIVR switches; per-core feedback loops sensing output voltage and providing duty cycle signals through dedicated controllers. The overall coupled system is nonlinear due to the FIVR feedback, however

979-8-3503-3282-7/23/\$31.00 ©2023 IEEE



Fig. 1. Schematic illustration of the power distribution system under investigation, including  $N_c$  cores whose voltage is regulated by  $N_p$ -phase FIVRs.

such nonlinearity is concentrated at the FIVR switch models, whereas all other system parts can be seen as large-scale Linear and Time-Invariant (LTI) blocks. The size of these blocks is reduced through structured and block-partitioned Krylov subspace projection, with projection matrices based on a set of system responses (snapshots) under suitable operating points. The resulting reduced model preserves the block-interconnected structure (with nonlinear feedback) of the original system and is easily solved in time-domain by a basic discretization scheme. Numerical results obtained on a system based on a recent Intel® Core<sup>TM</sup> microprocessor show a major speedup (up to  $200 \times$  with respect to reference SPICE simulations), with a negligible loss of accuracy.

#### **II. PROBLEM STATEMENT AND NOTATION**

The structure under analysis is depicted in Fig. 1. The PDN provides the supply voltage to  $N_c$  microprocessor cores. In this work, all cores are considered identical, so that the subsystem in the box in Fig. 1 is repeated  $N_c$  times. The leftmost voltage source represents the mainboard voltage supply. This is connected to the *input network* block, which represents the electrical behavior of the board and package as a passive LTI system that models the effect of parasitics, on-board decoupling capacitors, and it includes a linear model of the VRM. The *output network* box is another passive linear system describing integrated passive components and the die. The microprocessor units to which the power is supplied are represented as  $N_o$  current sources loading each output network. The load currents for all cores are indicated by  $i^o$  and the corresponding load voltages are  $v^o$ .

Core-level voltage regulation is implemented with FIVRs, depicted in the middle box in Fig.1. These are switching converters with time-varying duty cycle and  $N_p$  phases. The *i*th core has a dedicated controller that senses the output voltage and steers the duty cycle signal  $d_i(t)$  of the respective switches so that the load voltage tracks a constant reference value  $V_{\rm ref}$ . In this work, the FIVR switches are represented with ideal transformers, which is a well-known averaged approximation of buck converters. In particular, each core is associated with  $N_p$  ports on the output side corresponding to an equal number of ports on the input network side; a per-core bank of  $N_p$  ideal transformers connects these two groups of ports.

The overall PDN system can be represented by the following system of differential-algebraic equations

$$\dot{\boldsymbol{x}} = \mathbf{A}\boldsymbol{x} + \mathbf{B}_w \boldsymbol{w} + \mathbf{B}_u \boldsymbol{u} \tag{1a}$$

$$\boldsymbol{z} = \mathbf{C}_z \boldsymbol{x} + \mathbf{D}_{zw} \boldsymbol{w} + \mathbf{D}_{zu} \boldsymbol{u}$$
(1b)

$$y = C_y x + D_{yw} w + D_{yu} u$$
 (1c)

$$w = \Delta(d)z$$
 (1d)

$$\dot{\boldsymbol{x}}_k = \mathbf{A}_k \boldsymbol{x}_k + \mathbf{B}_k \left( \mathbf{N} \boldsymbol{y} - \boldsymbol{V}_{\text{ref}} \right), \quad \boldsymbol{d} = \mathbf{C}_k \boldsymbol{x}_k \quad (1e)$$

where all quantities are defined as follows. Let  $v_1, i_1$  be vectors collecting all voltages and currents at the  $N_c N_p$  ports of the input network (to the left of the FIVRs in Fig.1) and  $v_2, i_2$  be the corresponding quantities on the output network side. The characteristic equations of the FIVR switches are

$$egin{aligned} oldsymbol{v}_2 &= oldsymbol{\Delta}_1(oldsymbol{d})oldsymbol{v}_1 &= i_1 = -oldsymbol{\Delta}_1(oldsymbol{d})oldsymbol{i}_2 \ oldsymbol{\Delta}_1(oldsymbol{d}) &\triangleq ext{blkdiag}\left(oldsymbol{\Delta}_{1,1}, \dots, oldsymbol{\Delta}_{1,N_c}
ight), \quad oldsymbol{\Delta}_{1,i}(oldsymbol{d}) &\triangleq d_i \mathbb{I}_{N_F} \end{aligned}$$

Let us further introduce

$$oldsymbol{w} riangleq egin{pmatrix} oldsymbol{i}_1 \ oldsymbol{v}_2 \end{pmatrix}, \quad oldsymbol{z} riangleq egin{pmatrix} oldsymbol{v}_1 \ oldsymbol{i}_2 \end{pmatrix}, \quad oldsymbol{\Delta}(oldsymbol{d}) riangleq egin{pmatrix} oldsymbol{0} & -oldsymbol{\Delta}_1(oldsymbol{d}) \ oldsymbol{\Delta}_1(oldsymbol{d}) & oldsymbol{0} \end{pmatrix}$$

so that we can compactly represent the FIVR switches with the equation  $w = \Delta(d)z$  in (1d). Furthermore, the behavior of the linear and passive subsystems in Fig. 1, i.e. the input network and all the output networks, can be viewed as a single coupled dynamical system mapping the inputs  $V_{\mathrm{VRM}}, \boldsymbol{i}^o, \boldsymbol{i}_1, \boldsymbol{v}_2$  to the outputs  $v_1, i_2, v^o$ . Hence, by stacking  $V_{\text{VRM}}$  and  $i^o$  in the vector u and letting  $y \triangleq v^o$ , we can use the state-space representation for this subsystem in (1a)-(1b)-(1c). Finally, the  $N_c$  compensators can be grouped into another linear subsystem whose input is the vector of error signals for each core  $m{e} \triangleq \mathbf{N}m{y} - m{V}_{\mathrm{ref}}$ , where  $\mathbf{N} \in \{0,1\}^{N_c imes ar{N}_o N_c}$  is a matrix that selects  $N_c$  entries out of all load voltages to be compared with the vector of constant reference voltages  $V_{\rm ref}$ . The compensators output is the vector of duty cycle signals  $d \in \mathbb{R}^{N_c}$ , whose *i*-th entry is  $d_i(t)$ . The state-space equations of the compensator subsystem are (1e). The main objective of this work is the definition of a fast transient simulation algorithm that can solve (1) under realistic current excitation of all cores.

#### III. FORMULATION

Let us analyze the structure of (1). Due to (1d), the overall system is nonlinear, as it involves products of state variables.

Conversely, the first three equations (1a)–(1c) are linear due to the adopted models of input and output networks. The complexity in (1) mainly comes from the first state equation, which has as many components as the dynamic elements used to model the electrical behavior of board, package and die. In the application at hand, x is high-dimensional and the remaining states  $x_k$  of the duty cycle controllers are few ( $\approx 5 - 10N_c$ ). This motivates a structured approach in which we derive a reduced-order model of (1) by reducing the dimensionality of x only and leaving the controller equations and the other algebraic constraints untouched. Specifically, we derive the following reduced subsystem by projection of the first three equations in (1)

$$\begin{cases} \dot{\boldsymbol{x}}_{r} = \mathbf{A}\boldsymbol{x}_{r} + \mathbf{B}_{w}\boldsymbol{w} + \mathbf{B}_{u}\boldsymbol{u} \\ \boldsymbol{z} = \hat{\mathbf{C}}_{z}\boldsymbol{x}_{r} + \mathbf{D}_{zw}\boldsymbol{w} + \mathbf{D}_{zu}\boldsymbol{u} \\ \boldsymbol{y} = \hat{\mathbf{C}}_{y}\boldsymbol{x}_{r} + \mathbf{D}_{yw}\boldsymbol{w} + \mathbf{D}_{yu}\boldsymbol{u} \\ \boldsymbol{w} = \boldsymbol{\Delta}(\boldsymbol{d})\boldsymbol{z} \\ \dot{\boldsymbol{x}}_{k} = \mathbf{A}_{k}\boldsymbol{x}_{k} + \mathbf{B}_{k}\left(\mathbf{N}\boldsymbol{y} - \boldsymbol{V}_{\text{ref}}\right), \quad \boldsymbol{d} = \mathbf{C}_{k}\boldsymbol{x}_{k} \end{cases}$$
(2)

where  $\hat{\mathbf{A}} = \mathbf{W}^T \mathbf{A} \mathbf{V}$ ,  $\hat{\mathbf{B}}_w = \mathbf{W}^T \mathbf{B}_w$ ,  $\hat{\mathbf{B}}_u = \mathbf{W}^T \mathbf{B}_u$ ,  $\hat{\mathbf{C}}_z = \mathbf{C}_z \mathbf{V}$ ,  $\hat{\mathbf{C}}_y = \mathbf{C}_y \mathbf{V}$  are the reduced state-space matrices obtained by *Petrov-Galerkin* projection via the biorthogonal matrices  $\mathbf{V}$  and  $\mathbf{W}$  (i.e.,  $\mathbf{W}^T \mathbf{V} = \mathbb{I}$ ). These matrices are here constructed so that the input-output mapping from u to y is reproduced accurately, using a specific linearized moment matching approach to deal with the nonlinearity of (1).

#### A. Local linearization

The main tool we use to overcome the fact that (1) is a nonlinear system is local linearization around an operating point. Given a constant input  $\bar{u}$ , the corresponding operating point is  $(\bar{x}, \bar{x}_k, \bar{w}, \bar{z}, \bar{d}, \bar{u}, \bar{y})$ , that is a steady state solution of (1) obtained by letting  $u(t) = \bar{u}$  and  $\dot{x} = \dot{x}_k = 0$ . Let us identify the operating point with  $p(\bar{u}) \triangleq (\bar{x}_k, \bar{w}, \bar{z}, \bar{d}, \bar{u}, \bar{y})$ , and let us split each variable as  $x = \bar{x} + \tilde{x}$ , where  $\bar{x}$  is the bias and  $\tilde{x}$  is the small-signal component. The linearized system around  $p(\bar{u})$  can be easily obtained by replacing (1d) with its first-order approximation  $w = \Delta(d)z \approx \bar{w} + \tilde{w}$ , with  $\bar{w} = \Delta(\bar{d})\bar{z}, \ \tilde{w} = \Delta(\bar{d})\tilde{z} + \Delta(\tilde{d})\bar{z}$ .

The resulting set of equations can be assembled in the compact linearized descriptor form (we omit the detailed expression of the descriptor matrices due to lack of space)

$$egin{aligned} & egin{aligned} \dot{\mathcal{E}} \dot{m{\xi}} = \mathcal{A}_{m{p}} m{\xi} + \mathcal{B}_{m{p}} ilde{m{u}} & \ & m{y} = \mathcal{C}_{m{p}} m{\xi} + \mathcal{D}_{m{p}} ilde{m{u}} & \ & m{with} & m{\xi} = egin{pmatrix} & ilde{m{x}}_k \ & ilde{m{z}} \end{pmatrix}. \end{aligned}$$

The transfer function of the linearization of (1) around the operating point associated to  $\bar{u}$  can be written as

$$\mathbf{H}(s; \boldsymbol{p}(\bar{\boldsymbol{u}})) = \mathcal{C}_{\boldsymbol{p}}(s\mathcal{E} - \mathcal{A}_{\boldsymbol{p}})^{-1}\mathcal{B}_{\boldsymbol{p}} + \mathcal{D}_{\boldsymbol{p}}.$$
 (3)

The same linearization procedure can be carried out on the projected nonlinear system (2) to obtain the reduced linearized transfer function

$$\hat{\mathbf{H}}(s;\hat{\boldsymbol{p}}(\bar{\boldsymbol{u}})) = \hat{\mathcal{C}}_{\hat{\boldsymbol{p}}} \left(s\hat{\mathcal{E}} - \hat{\mathcal{A}}_{\hat{\boldsymbol{p}}}\right)^{-1} \hat{\mathcal{B}}_{\hat{\boldsymbol{p}}} + \hat{\mathcal{D}}_{\hat{\boldsymbol{p}}}$$
(4)

where  $\hat{p}(\bar{u})$  is the operating point corresponding to the input  $\bar{u}$  in the reduced system (2). Note that, by enforcing that the operating points match as  $p(\bar{u}) = \hat{p}(\bar{u})$  for a fixed  $\bar{u}$ , then

$$\hat{\mathcal{A}}_{\hat{p}} = \mathcal{W}^T \mathcal{A}_p \mathcal{V}, \quad \hat{\mathcal{B}}_{\hat{p}} = \mathcal{W}^T \mathcal{B}_p, \quad \hat{\mathcal{C}}_{\hat{p}} = \mathcal{C}_p \mathcal{V}, \quad \hat{\mathcal{D}}_{\hat{p}} = \mathcal{D}_p$$

where  $\mathcal{V} \triangleq$  blkdiag  $(\mathbf{V}, \mathbb{I}, \mathbb{I})$  and  $\mathcal{W} \triangleq$  blkdiag  $(\mathbf{W}, \mathbb{I}, \mathbb{I})$ .

#### B. Projection matrices

This section discusses how to construct the projection matrix  $\mathbf{V}$ . The general idea proposed here is that the linearized transfer function of the reduced nonlinear system (4) should (approximately) interpolate the linearization of the original system (3). We consider separately matching at DC and at other frequencies.

*DC accuracy:* Steady-state conditions are of particular importance in the verification of power delivery networks. Hence, we would like our reduced model to reproduce the exact steady-state response of the full system. In particular, for any constant  $\bar{u}$ , the equilibrium solution  $p(\bar{u})$  of (1) should be the same as  $\hat{p}(\bar{u})$ . This is ensured by taking

$$\mathcal{R}(\mathbf{V}) \supset \mathcal{R}(\mathbf{A}^{-1} \begin{pmatrix} \mathbf{B}_w & \mathbf{B}_z \end{pmatrix})$$
(5)

where  $\mathcal{R}$  denotes the column space (range). This condition implies that, for any steady-state solution of the original system  $(\bar{x}, \bar{x}_k, \bar{w}, \bar{z}, \bar{d}, \bar{u}, \bar{y})$ , the reduced system has a steadystate solution at  $(\bar{x}_r, \bar{x}_k, \bar{w}, \bar{z}, \bar{d}, \bar{u}, \bar{y})$  with  $\bar{x} = V\bar{x}_r$ . Consequently,  $p(\bar{u}) = \hat{p}(\bar{u})$ .

*Transfer function moments:* In order to enforce  $\mathbf{H}(s; \boldsymbol{p}(\bar{\boldsymbol{u}})) \approx \hat{\mathbf{H}}(s; \hat{\boldsymbol{p}}(\bar{\boldsymbol{u}}))$ , we consider a finite set of operating points induced by the inputs  $\bar{\boldsymbol{u}}_1, \ldots, \bar{\boldsymbol{u}}_M$  and a finite set of points  $s_1, \ldots, s_K \subset \mathbb{C}_+$ . For any given  $\bar{\boldsymbol{u}}_j$  with  $\boldsymbol{p}_j = \boldsymbol{p}(\bar{\boldsymbol{u}}_j)$ , consider the block matrix  $\mathbf{M}_i$  defined as

$$\begin{pmatrix} \mathbf{M}_{j,1} \\ \mathbf{M}_{j,2} \\ \mathbf{M}_{j,3} \end{pmatrix} \triangleq \left( (s_1 \mathcal{E} - \mathcal{A}_{\mathbf{p}_j})^{-1} \mathcal{B}_{\mathbf{p}_j}, \dots, (s_K \mathcal{E} - \mathcal{A}_{\mathbf{p}_j})^{-1} \mathcal{B}_{\mathbf{p}_j} \right)$$

where  $\mathbf{M}_{j,1}, \mathbf{M}_{j,2}, \mathbf{M}_{j,3}$  provide a partitioning conformal with  $\boldsymbol{\xi}$ . Now if  $\mathbf{V}$  is such that  $\mathcal{R}(\mathbf{V}) \supset \mathcal{R}(\mathbf{M}_{j,1})$ , then  $\mathcal{R}(\mathcal{V}) \supset \mathcal{R}(\mathbf{M}_j)$ , implying

$$\hat{\mathbf{H}}(s_k; \hat{\boldsymbol{p}}(\bar{\boldsymbol{u}}_j)) = \mathbf{H}(s_k; \boldsymbol{p}(\bar{\boldsymbol{u}}_j)) \qquad k = 1, \dots, K.$$
(6)

In words, if the range space of  $\mathbf{V}$  contains the image of the first block-row of  $\mathbf{M}_j$ , then the linearization of the reduced system (2) around the operating point induced by  $\bar{\mathbf{u}}_j$  matches the linearized transfer function of the original system (1) around the same operating point for the prescribed set of frequencies  $\{s_k\}$ . Finally, multiple operating points are considered at once by collecting all  $\mathbf{M}_{j,1}$  in a single matrix  $\Psi_1$ 

$$\mathbf{\Psi}_1 \triangleq egin{pmatrix} \mathbf{M}_{1,1} & \cdots & \mathbf{M}_{j,1} & \cdots & \mathbf{M}_{M,1} \end{pmatrix}$$

C. Approximate interpolation via SVD

If the matrix  $\mathbf{V}$  is constructed to satisfy

$$\mathcal{R}\left(\mathbf{V}\right)\supset\mathcal{R}\left(\mathbf{\Psi}_{1}\right),\tag{7}$$

then the interpolation condition (6) will hold for all j = 1, ..., M. However, the dimension of  $\mathcal{R}(\Psi_1)$  can quickly grow

large if many operating points and frequencies are considered. Therefore, we give up the exact interpolation condition and construct V by taking only the first  $\rho$  principal components of  $\Psi_1$  as given by its singular value decomposition

$$\mathbf{\Psi}_1 = egin{pmatrix} \mathbf{P}_1 & \mathbf{P}_2 \end{pmatrix}$$
blkdiag $\{\mathbf{\Sigma}_1,\,\mathbf{\Sigma}_2\}\mathbf{Q}^T$ 

with the condition (7) relaxed to  $\mathcal{R}(V) \supset \mathcal{R}(\mathbf{P}_1)$ . Combining this with the condition for DC accuracy in (5), we can finally state our choice of **V** 

$$\mathbf{V} = \operatorname{orth} \{ \mathbf{P}_1, \, \mathbf{A}^{-1} \begin{pmatrix} \mathbf{B}_w & \mathbf{B}_z \end{pmatrix} \}.$$

#### D. Stability

In order to establish stability results, let us focus on the first four equations in (1). By viewing it as a linear system with time-varying parameters d = d(t), we can use a result analogue to [6, p.63], stating that if the LTI system in (1a)–(1c) is passive and  $\Delta(d) + \Delta(d)^T \leq 0$ , then the time-varying system is stable for any trajectory  $d_i(t) \in [0, 1]$ . Since the original subsystem represented by (1a)–(1c) is passive and the duty cycle signals are positive and smaller than 1 by construction, the same property can be preserved in the reduced model (2) by preserving passivity in the projection. As explained in [7], this can be done by first finding a positive definite matrix **X** that certifies the passivity of the original subsystem based on the Positive Real Lemma. This matrix is then used to construct **W** as  $\mathbf{W}^T = (\mathbf{V}^T \mathbf{X} \mathbf{V})^{-1} \mathbf{V}^T \mathbf{X}$ .

#### IV. NUMERICAL RESULTS

The proposed approach was tested on a mobile system based on a 4-core Intel<sup>®</sup> Core<sup>TM</sup> microprocessor. This is the same system already used in [5], for which each core has  $N_o = 36$ output ports where voltage is to be sensed and regulated, as well as  $N_p = 4$  ports connecting to the four phases of each FIVR. One output voltage for each core is fed to a feedback controller which determines the duty cycle of the corresponding FIVR switches. Application of proposed MOR scheme led to a reduced model with 527 states, with  $\rho = 350$ and 177 additional states to enforce DC accuracy, whereas the full-order system has 2673 state variables. The global basis **V** was constructed based on M = 4 distinct operating points defined by a sequential activation sequence for the cores.

The resulting system is excited by a sequence of current pulses, whose cumulative waveforms for each core are depicted in the bottom-left panel of Fig. 2. Such currents are uniformly distributed across the ports of each core, so that each core port is excited with the respective waveform divided by 36. The rise time of each pulse is 5 ns. The top panels of Fig. 2 compare the results obtained by proposed reduced-order solver to reference SPICE results, for two different port voltages located on two different cores. Bottom-right panel of Fig. 2 reports the maximum instantaneous deviation between our proposed solution and SPICE, computed for each time step as the largest error among all  $N_cN_o = 144$  output voltages. This error is uniformly less than 5 mV, which is well below the engineering accuracy that is requested for this transient



Fig. 2. Transient simulation results of a 4-core microprocessor system. Top: load voltage at selected ports of two different cores (left: port 1 of core 1; right: port 18 of core 3); bottom-left: total load currents applied to each core; bottom-right: maximum instantaneous error across all load voltages.

power integrity verification. Within this good accuracy level, larger instantaneous errors are related to multiple concurrent core switching events.

In terms of runtime, the SPICE simulation required 6880 seconds whereas the proposed model could be solved under the same loading conditions and on the same workstation in 33 seconds. Both simulations were performed with a maximum step size of 50 ps, resulting in about  $2 \cdot 10^5$  time steps. The speedup in runtime is about  $200 \times$ . These results confirm the excellent potential of proposed approach for efficient and comprehensive transient power integrity verification.

#### V. CONCLUSIONS

This paper presented a structured Krylov subspace projection framework for the model order reduction of largescale descriptions of system-level power distribution networks from VRM to core loads, including per-core integrated voltage regulators. The proposed approach preserves the structure of the initial system through a block-partitioned projection, constructed by assembling snapshots of the linearized system states at suitably chosen operating points. The resulting reduced model can be simulated (including feedback voltage stabilization) very efficiently, as demonstrated by the  $200 \times$ speedup with respect to SPICE that was achieved on a mobile system based on an Intel® Core<sup>TM</sup> microprocessor. In order to be ready for routine application, some work is still required for automating the various modeling steps, including selection of number and placement of the snapshots. This ongoing work will be documented in a future report.

#### REFERENCES

- K. Radhakrishnan, M. Swaminathan, and B. K. Bhattacharyya, "Power delivery for high-performance microprocessors—challenges, solutions, and future trends," *IEEE Transactions on Components, Packaging and Manufacturing Technology*, vol. 11, no. 4, pp. 655–671, 2021.
- [2] E. A. Burton, G. Schrom, F. Paillet, J. Douglas, W. J. Lambert, K. Radhakrishnan, and M. J. Hill, "FIVR — Fully integrated voltage regulators on 4th generation Intel<sup>®</sup> Core<sup>™</sup> SoCs," in 2014 IEEE Applied Power Electronics Conference and Exposition - APEC 2014, Mar. 2014, pp. 432–439.
- [3] W. H. A. Schilders, H. A. Van Der Vorst, and J. Rommes, *Model order reduction: theory, research aspects and applications.* Springer Verlag, 2008.
- [4] A. C. Antoulas, *Approximation of large-scale dynamical systems*. Society for Industrial and Applied Mathematics, 2005.
- [5] A. Carlucci, S. Grivet-Talocia, S. Mongrain, S. Kulasekaran, and K. Radhakrishnan, "Towards accelerated transient solvers for full system power integrity verification," in 2022 IEEE 31st Conference on Electrical Performance of Electronic Packaging and Systems (EPEPS), 2022, pp. 1–3.
- [6] S. Boyd, L. El Ghaoui, E. Feron, and V. Balakrishnan, *Linear matrix inequalities in system and control theory*. Society for Industrial and Applied Mathematics, 1994, vol. 15.
- [7] A. Megretski, "6.242, Fall 2004: Model Reduction Course Notes, ch. 4, MIT, Department of Electrical Engineering and Computer Science." [Online]. Available: web.mit.edu/6.242/www/images/lec4\_6242\_2004.pdf