1 Some Efficient Optimization Methods for Solving the Security-Constrained Optimal Power Flow Problem Dzung Phan and Jayant Kalagnanam Abstract—The security-constrained optimal power flow problem considers both the normal state and contingency constraints, and it is formulated as a large-scale nonconvex optimization problem. We propose a global optimization algorithm based on Lagrangian duality to solve the nonconvex problem to optimality. As usual, the global approach is often time-consuming, thus for practical uses when dealing with a large number of contingencies, we investigate two decomposition algorithms based on Benders cut and the alternating direction method of multipliers. These decomposition schemes often generate solutions with smaller objective function values than those generated by the conventional approach and very close to the globally optimal points. Index Terms—Security-constrained optimal power flow, Lagrangian duality, branch-and-bound, Benders decomposition, alternating direction method of multipliers. I. I NTRODUCTION The optimal power flow (OPF) problem aims at determining an optimal operating point for control variables which minimizes a given objective function subject to physical constraints and control limits [1]. The most commonly considered objective function is total cost of generation. In current practices of the real-time dispatch, the Regional Transmission Organizations (RTOs) and Independent System Operators (ISOs) often perform contingency analysis to assure that the power system remains balanced and no operational constraint is violated during both the normal case and contingency scenarios when the outage of a component incurs, the so-called N − 1 condition. The OPF problem with the contingency constraints is often referred to as the security-constrained optimal power flow (SCOPF) problem in the literature [2]. In this paper, we are interested in the alternate current (AC) power flow constraints for the problem. The SCOPF problem has been widely classified into two classes: preventive [3] and corrective [4] formulations. The preventive formulation minimizes some cost function by considering only the normal case control variables, that are feasible for both the normal and contingency conditions. For C contingency scenarios, the size of the problem is approximately C + 1 times larger than that of the conventional OPF. Consequently, a direct method for handling large-scale power systems with numerous contingencies in a centralized D. Phan and J. Kalagnanam are with the Department of Business Analytics and Mathematical Sciences, IBM T.J. Watson Research Center, Yorktown Heights, NY, 10598 USA, email: {phandu,jayant}@us.ibm.com. This material is based upon work supported by the U.S. Department of Energy under Award Number DE-OE0000190. manner would result in a prohibitive memory usage and unacceptably large execution time. In practice, however, many postcontingency constraints are redundant [5]. As a result, a class of algorithms based upon contingency filtering techniques has been developed [5]–[7] that identify and only include those potentially binding contingencies into the formulation. For instance, the contingency ranking method in [6] is derived by investigating a preventive SCOPF relaxation, where postcontingency conditions and the normal case are considered separately. The ranking scheme makes use of the values of dual variables and the decrease factor of regularized objective function values, and then chooses contingencies with a severity index exceeding some threshold for further consideration. Other contingency filtering approaches [5] focus on determining a minimal subset of contingencies to be added based on the post-contingency constraint violations. Additionally, a method using the generalized Benders decomposition [8] is developed in [9]. The corrective SCOPF formulation exploits the observation that some contingency violations can resist up to some allowed restoration time without damaging the devices [10]. It permits post-contingency control variables such as power generation outputs and transformer taps to be readjusted in order to remove any violations caused by the contingency. The total generation cost given by the corrective model is often smaller than that of the preventive model. However, the model requires additional decision variables and constraints, and more importantly it might need a large number of reschedules for each contingency. The authors in [4] reformulate the problem using only the base-case variables, where constraint reductions are represented as implicit functions of the contingency-free variables. The generalized Benders decomposition is used to solve the reformulation. An extension of a contingency filtering technique from [5] is studied in [10], which includes an additional OPF procedure to check the controllability of postcontingency states. A network compression technique [11] shrinks the networks under contingency states based on the fact that the impact of a failure is, in general, related to a localized area of the power grid. The DC SCOPF approximation can provide a good guess for binding contingencies once integrated in a contingency filtering framework [12]. The recent paper [13] introduces a semidefinite program relaxation to the problem and proves that a zero dual gap is obtained if strong duality holds for every classical OPF problem associated with each configuration; thus a globally optimal solution can be recovered from the dual. The reader is referred to [14] for a 2 recent literature review on the topic. In this work, we propose three approaches to solve the corrective SCOPF problem. The first one is an exact method based on a branch-and-bound algorithm for the rectangular formulation, which guarantees to find a global solution. The main idea comes from an extension of the technique for the conventional OPF problem proposed in [15], which exploits the quadratic structure of the problem together with the box and spherical constraints. For practical purposes, we also investigate two fast local algorithms based on decomposition techniques for any coordinates, which treat each contingency separately. The first local one is based on Benders decomposition, for which we derive new feasibility cuts by linearizing a feasibility checking optimization problem. An adaptive cut is proposed to handle the nonconvexity, which aims at improving the quality of solution. The second heuristic is an application of the alternating direction method of multipliers (ADMM) [16]. The decomposition algorithms can be run on parallel computers to reduce the running time. Note that these three approaches can also be used to solve the preventive model. The paper is organized as follows. In Section II, we review the problem formulation in a general format as well as in rectangular coordinates. Section III presents a Lagrangian duality and a rectangular branching procedure for a branch-andbound algorithm. In Section IV, we introduce a decomposition scheme based on Benders cut, and in Section V we study the application of ADMM for solving the SCOPF problem. Section VI contains numerical experiments and Section VII concludes the paper. II. P ROBLEM F ORMULATION In this paper, we are interested in the following corrective SCOPF problem: minuc ,xc :∀c∈{0}∪C f (u0 , x0 ) s.t. hc (uc , xc ) = 0, ∀c ∈ {0} ∪ C, gc (uc , xc ) ≤ 0, ∀c ∈ {0} ∪ C, |uc − u0 | ≤ uc , ∀c ∈ C, (1) where C = {1, 2, . . . , C} is the set of pre-specified C contingencies, c = 0 represents the base case, the index c ∈ C represents a contingency, uc is the control variables, xc is the state variables, uc is the upper bound deviation for control variables, hc and gc are operational constraints such as power flow equations and transmission line limits. The last ramping rate constraint is understood componentwise, i.e., |uic −ui0 | ≤ uic for every i. Note that some controls can operate in a preventive mode when setting uic = 0, and they can model for transformers, phase shifters and shunts but in a continuous space. We assume the list of contingencies is given. We now review the model (1) in rectangular coordinates, which will be used to develop a global optimization algorithm. Because of the limitation of our approach, only the levels of power generation are considered as control variables. Let N denote the set of buses in the grid, G denote the set of generators, and B denote the set of branches. Suppose that every complex nodal voltage can be modeled in rectangular coordinates vi = ei + jfi , for all i ∈ N , where j is the imaginary unit. The net active power Pi and reactive power Qi injections into bus i ∈ N are given by ∑( ) Pi (e, f , G, B) = ei (gij ej − bij fj ) + fi (gij fj + bij ej ) , j∈N ∑( Qi (e, f , G, B) = ) fi (gij ej − bij fj ) − ei (gij fj + bij ej ) , j∈N where e ∈ R|N | and f ∈ R|N | are vectors with voltage components, gij is the ij-th entry of the bus conductance matrix G ∈ R|N |×|N | , and bij is the ij-th entry of the bus susceptance matrix B ∈ R|N |×|N | . The active power flow Pij from bus i to bus j is computed from Pij = (e2i + fi2 )ḡii + (ei ej + fi fj )ḡij − (ei fj − ej fi )b̄ij , where ḡii is the self-conductance of branch admittance at bus i, ḡij is the mutual conductance, and b̄ij is the mutual susceptance of branch admittance from buses i to j. Define Pig,c and Qg,c as dispatchable active and reactive i powers, Pid and Qdi as active and reactive power demands at bus i. To ease the notation, we assume that Pig,c = Qg,c =0 i and Pid = Qdi = 0 for any bus i that has zero generation or zero demand, respectively. Without loss of generality, we assign the index 1 as the reference bus f1c = 0 [2]. The SCOPF problem of minimizing the generation cost becomes: ∑( ) g,0 2 g,0 g,0 min f (P ) = c (P ) + c P + c (2) 0i 1i 2i i i g,c g,c c c ei ,fi ,Pi s.t. ,Qi i∈G Pi (e , f , G , B ) = Pig,c − Pid , i ∈ N d Qi (ec , f c , Gc , Bc ) = Qg,c i − Qi , i ∈ N max,c c c c c Pij (e , f , Ḡ , B̄ ) ≤ Pij , (i, j) ∈ B c f1 = 0 Pimin,c ≤ Pig,c ≤ Pimax,c , i ∈ G Qmin,c ≤ Qg,c ≤ Qmax,c ,i ∈ G i i i (vimin,c )2 ≤ (eci )2 + (fic )2 ≤ (vimax,c )2 , i c |Pig,c − Pig,0 | ≤ P i , i ∈ G c g,0 |Qg,c i − Qi | ≤ Qi , i ∈ G, c c c c (3) (4) (5) (6) (7) (8) ∈N (9) (10) (11) for all c ∈ {0}∪C, where Pmin,c and Pmax,c are active power generation limits, Qmin,c and Qmax,c are reactive power generation limits, vmin,c and vmax,c are voltage magnitude c c limits, P and Q are pre-determined maximal allowed variation of power outputs, and c0i , c1i , and c2i are generation c cost parameters. The entries of Ḡc are ḡij ’s, and the entries c c of B̄ are b̄ij ’s. Other forms of the objective function and transmission line constraints can be considered by using the same techniques described below if the quadratic feature still holds, but for simplicity in the presentation, we present a global algorithm only for the cost of generation and the active power flow limits. III. E XACT METHOD BASED ON BRANCH - AND - BOUND In this section, we investigate an exact method based on a spatial branch-and-bound approach to solve the SCOPF problem in rectangular coordinates (2)-(11) to optimality. The lower bound for the objective function is obtained by Lagrangian duality, while the feasible set subdivision is based 3 on a rectangular bisection. Note that if control variables such as transformer ratio and phase shift are considered, the power flow balance equations are not quadratic, the exact method cannot be applied. A. Lagrange Dual Problem We propose a Lagrange dual problem for the SCOPF modeled by (2)-(11), which can provide a lower bound to the global optimum in our branch-and-bound algorithm if strong duality does not hold. In particular, we dualize the complicating quadratic constraints, but retain the simple bound constraints such as the boxes and spheres in the feasible set. In many applications, the solution of the primal problem can be achieved from the Lagrange dual problem, for example, when strong duality holds. We will show later in the numerical simulation that a zero gap is observed for almost our test instances for the proposed dual problem. Let Ω be the feasible set of the SCOPF problem and assume Ω is nonempty. We form a new constraint for the summation of squared voltage magnitudes ∑ min,c ∑ ∑ max,c Rc = (vi )2 ≤ ((eci )2 + (fic )2 ) ≤ (vi )2 . i∈N i∈N i∈N The SCOPF problem is equivalent to min f (Pg,0 ) s.t. (3) − (11), (ec , f c ) ∈ Rc , ∀c ∈ {0} ∪ C. (12) We define the Lagrangian L of (12) by incorporating complicating quadratic constraints into the objective function L(ec , f c , Pg,c , Qg,c , λc , β c , µc , γ c , θ c ) = f (Pg,0 )+ { C ∑ ∑ ( ) ∑ c( d λci Pid + Pi (ec , f c , Gc , Bc ) + βi Qi + c=0 i∈N Qi (ec , f c , Gc , Bc )) + ) Pijmax,c + ∑ γic ( and ∑ minec ,f c i∈N (λci Pi (ec , f c , Gc , Bc ) + βic Qi (ec , f c , Gc , Bc )) ∑ +(θic − γic )((eci )2 + (fic )2 ) + (i,j)∈B µci,j Pij (ec , f c , Ḡc , B̄c ) s.t. (6) and (ec , f c ) ∈ Rc . (16) Note that the problem (15) is a separable convex quadratic programming problem and it can be further decomposed into |G| subproblems (associated with each i ∈ G). Problem (16) is a nonconvex one, the additional spherical constraint Rc and the quadratic structure of the objective function make it tractable. Define the matrix Ac (λc , β c , µc , γ c , θ c ) by −β c Gc −λc Bc λc Gc −β c Bc 1 1: 1 1: 1 1: 1 1: . . .. .. λc Gc −β c Bc c c c c −β|N | G|N |: −λ|N | B|N |: |N | |N |: |N | |N |: β c Gc +λc Bc λc Gc −β c Bc 1 1: 1 1: 1 1: 1 1: . .. ... ( + c c c c β|N | G|N |: +λ|N | B|N |: µc ◦ Ḡc −µc ◦ B̄c µc ◦ B̄c µc ◦ Ḡc ) c c λc|N | Gc|N |: −β|N | B|N |: + diag([θ c − γ c ; θ c − γ c ]), i∈N ∑ ( µci,j Pij (ec , f c , Ḡc , B̄c )− (i,j)∈B (vimin,c )2 ) ∑ − (eci )2 − (fic )2 + θic i∈N ( The dual (14) is a concave maximization problem, the optimal solution can be computed by a subgradient algorithm. A subgradient of function q can be obtained by solving the inner minimization problem (13). The task of solving (13) can be done efficiently in our application for solving the SCOPF. For a given (λ, β, µ, γ, θ), (Pg,c , Qg,c : c = 0, . . . , C) and (ec , f c : c = 0, . . . , C) are decoupled in both the objective function and the constraints of problem (13). Now, the minimization problem (13) can be decomposed into C + 2 subproblems, which are relatively easy to be solved: ∑C ∑ minPg,c ,Qg,c f (Pg,0 ) − c=0 i∈G (λci Pig,c + βic Qg,c i ) (15) s.t. (7), (8), (10), (11) i∈N } ∑ ) max,c 2 c 2 c 2 c g,c c g,c ) − (λi Pi + βi Qi ) . (ei ) + (fi ) − (vi i∈G where Gci: and Bci: are the i-th rows of matrices Gc and Bc , respectively, diag(x) is the diagonal matrix with x down the diagonal, and “◦” denotes the componentwise product. Let Āc (λc , β c , µc , γ c , θ c ) be the matrix obtained from Ac by removing the (|N | + 1)-th row and (|N | + 1)-th column. Likewise, f̄ c and R̄c are obtained from f c and Rc , respectively, by deleting f1c . Then the latter (16) can be expressed as follows The Lagrange dual function is given by min (ec ; f̄ c )T Āc (ec ; f̄ c ) s.t. (ec , f̄ c ) ∈ R̄c . q(λ, β, µ, γ, θ) = minec ,f c ,Pg,c ,Qg,c L(·) The following theorem from [15] gives an optimal solution to the problem (17). s.t. (6) − (8), (10), (11) (ec , f c ) ∈ Rc , ∀c ∈ {0} ∪ C, (13) where we abuse the multiplier notation by assuming λ = (λ0 , . . . , λC ), β = (β 0 , . . . , β C ), etc. It is known that for every {(λ, β, µ, γ, θ) : µ ≥ 0, γ ≥ 0, θ ≥ 0}, the value of the dual function provides a lower bound on the optimal objective function value of the primal problem. The best lower bound is obtained by solving the dual problem: max λ,β,µ,γ,θ q(λ, β, µ, γ, θ) s.t. µ ≥ 0, γ ≥ 0, θ ≥ 0. (14) (17) Theorem 1: Let uc be the eigenvector associated with the smallest eigenvalue λmin of the matrix Āc + (Āc )T . The solution of (17) is now determined as follows: √∑ max,c 2 ) i∈N (vi uc , if λmin < 0 c∥ ∥u c c √ (e ; f̄ ) = ∑ min,c 2 (v ) i∈N i uc , otherwise. ∥uc ∥ Thus we can solve the dual problem (14) efficiently, for which a zero duality gap is numerically shown for many test cases in the numerical experiments. 4 B. Global Optimization Algorithm If strong duality does not hold for the dual problem (14), we propose a branch-and-bound algorithm to obtain a globally optimal solution of the SCOPF. The lower bound of a node is achieved by solving the dual, while the branching procedure is based on an iteratively rectangular bisection. Rectangular bisection: The branching process in the branchand-bound algorithm is based on successive rectangular bisections of the bound constraints for the active and reactive powers: Pg,c and Qg,c ) } {( Pg,c Pimin,c ≤ Pig,c ≤ Pimax,c , i ∈ G c B = : . Qg,c Qmin,c ≤ Qg,c ≤ Qmax,c ,i ∈ G i i i Consider a rectangle B in general form B = {x ∈ Rn : a ≤ x ≤ b}, a point v ∈ B, and an index j ∈ {1, . . . , n}. Suppose that B is partitioned into two subrectangles determined by the hyperplane {x : xj = vj }. We call a subdivision of B via (v, j) is a bisection if the index j corresponds to a longest side of B and v is a point of this side such that In our application for solving the SCOPF, we apply the scheme to B̂0 = B0 × B 1 × . . . × B C . Over a new generated rectangle min,c B̂k = {(Pg,0 , Qg,0 , . . . , Pg,C , Qg,C ) : P̂k,i ≤ Pig,c ≤ max,c P̂k,i , Q̂min,c ≤ Qg,c ≤ Q̂max,c , for all i ∈ G, c = i k,i k,i 0, . . . , C}, the Lagrange dual function is given by (ec , f c ) ∈ Rc , c = 0, . . . , C IV. B ENDERS DECOMPOSITION ALGORITHM We present an application of Benders decomposition to solve the problem (1), where new feasibility cuts are introduced. They are different from the standard technique in the literature by linearizing an optimization problem rather than a contingency-related function. An adaptive cut is proposed to handle nonconvexity to improve solution quality. A. Cut Generation min(vj − aj , bj − vj ) = 0.5(bj − aj ). q(λ, β, µ, γ, θ) = minec ,f c ,Pg,c ,Qg,c L(·) s.t. (6), (10), (11) Theorem 2: The rectangular branch-and-bound algorithm generates an optimal solution of the SCOPF problem in a finite number of iterations, or every accumulation point of the sequence {xk } is a solution of the SCOPF. Proof: Suppose the algorithm does not terminate in a finite number of iterations. Since the branching procedure is performed on the largest side of the rectangle at each iteration, a nested sequence of bisections shrinks to a point. The proof is competed by following the same reasoning used in Theorem 4.1 of [15]. (18) (Pg,0 , Qg,0 , . . . , Pg,C , Qg,C ) ∈ B̂k . Define the set F by F = {(ec , f c , Pg,c , Qg,c : ∀c ∈ {0} ∪ C) : (3)-(6), (9)-(11)} . For any box B̂, define ℓ(B̂) the lower bound of f (Pg,0 ) over F ∩ B̂ given by the dual. Now, the branch-and-bound algorithm can be stated as follows: R ECTANGULAR BRANCH - AND - BOUND ALGORITHM Initialize S = {B̂0 }, compute ℓ(B̂0 ) by the dual and use a local solver to get an upper bound u and a feasible point x. Repeat until S = ∅ a) Select B̂ ∈ S such that ℓ(B̂) = min{ℓ(B̂) : B̂ ∈ S}. Use the rectangular bisection for B̂ to obtain B̂1 and B̂2 . b) If B̂i ∩ Ω = ∅, set ℓ(B̂i ) = ∞; Otherwise compute ℓ(B̂i ) by the dual. c) Update the best upper bound u and the associated feasible point xk by applying a local solver over F ∩ B̂i . d) Set S = {B ∈ S ∪ {B̂1 } ∪ {B̂2 } : ℓ(B) < u and (u − ℓ(B))/u > ϵ, B ̸= B̂} The following theorem shows that all accumulation points of a branch-and-bound algorithm are the global optima. We notice that the study of Benders decomposition for the problem (1) amounts to considering the following problem minx∈X f (x) h(y) = 0, g(y) ≤ 0 Mx + Ny + b ≤ 0 (19) yl ≤ y ≤ yu , where X ⊂ Rn , y ∈ Rm , M and N are matrices. The connection between (1) and (19) can be explained as follows. The variable x represents the base-case variable (u0 , x0 ) and X is the feasible set of the base case. For each contingency, the feasible set can be written in the form of the constraints in (19), where y represents (uc , xc ), the power flow balance constraint hc (uc , xc ) is h(y), the ramping rate constraints is Mx + Ny + b ≤ 0, the flow limit and operational limit constraints gc (uc , xc ) ≤ 0 are separated into g(y) ≤ 0 and yl ≤ y ≤ yu . We only need to present the feasibility cut generation for one contingency since every contingency can be expressed in the same format. Denote x̄ by the solution of the initial master problem min f (x) s.t. x ∈ X. (20) Then we have to check if the following system is feasible {h(y) = 0, g(y) ≤ 0, Mx̄ + Ny + b ≤ 0, yl ≤ y ≤ yu } (21) The algorithm is terminated if (21) is feasible for the given x̄. Otherwise, we generate a linear cut, which is added to the master problem (20) to enforce the feasibility of the system (21). We now construct a linear cut if (21) is infeasible. For a given x̄, consider the optimization problem ∑ miny,α i αi s.t. h(y) = 0, g(y) ≤ 0 Mx̄ + Ny + b − α ≤ 0 yl ≤ y ≤ yu , α ≥ 0, (22) 5 by introducing new variables αi ’s, whose optimal solution is (ȳ, ᾱ). In general, in order to generate a linear cut based on the dual solutions, we need the strong duality assumption. However, the problem arising from our application is nonconvex. It motivates us to study the linearized version of (22). The infeasibility of (21) implies that the optimal function value of (22) is strictly positive. We make the substitution z = y − yl and use the first-order approximations of h(y) and g(y) at ȳ − yl , we yield the linear programming relaxation ∑ minz,α i αi h(ȳ) + ∇y h(ȳ)(z − (ȳ − y )) = 0 l s.t. g(ȳ) + ∇y g(ȳ)(z − (ȳ − yl )) ≤ 0 (23) Nz − α + Mx̄ + b + Nyl ≤ 0 z ≤ yu − yl , 0 ≤ α, 0 ≤ z. Because the optimal function value of (22) is strictly positive, it follows from the first-order optimality conditions that the optimal function value of (23) is also strictly positive. We derive a linear cut based upon the dual variables of (23). The Lagrange dual problem of (23) is defined as max min π T (Nz − α + Mx̄ + Nyl + b) + ) ( − (ȳ − yl )) + µT g(ȳ) + ) ∑ ∇y g(ȳ)(z − (ȳ − yl )) + αi max min ( π≥0,µ≥0,η 0≤α 0≤z≤yu −yl T l i )T T NT π + ∇T z y h(ȳ)η + ∇y g(ȳ)µ +π (Mx̄ + Ny + b) − η T ∇y h(ȳ)(ȳ − yl ) + ∑ µT (g(ȳ) − ∇y g(ȳ)(ȳ − yl )) + αi (1 − πi ) = π (Mx̄ + Ny + b) − η ∇y h(ȳ)(ȳ − y ) ∑ ℓi (π, µ, η), where +µT (g(ȳ) − ∇y g(ȳ)(ȳ − yl )) + max l T l 0≤πi ≤1,µ≥0,η i ] [ { T 0 if NT π + ∇T y h(ȳ)η + ∇y g(ȳ)µ i ≥ 0 ℓi (·)= [ T ] u T l N π + ∇T y h(ȳ)η + ∇y g(ȳ)µ i [yi − yi ] otherwise Let (π̂, η̂, µ̂) be the optimal solution of the dual problem. Since strong duality holds for (23), we have ∑ l T T l i α̂i = π̂ (Mx̄ + Ny + b) − η̂ ∇y h(ȳ)(ȳ − y )+ ∑ µ̂T (g(ȳ) − ∇y g(ȳ)(ȳ − yl )) + i ℓi (π̂, η̂, µ̂). ∑ From the definition of ℓi (π, µ, η), we have i ℓi (π̂, η̂, µ̂) ≤ ∑ 0. Because i α̂i > 0, it implies that π̂ T (Mx̄ + Nyl + b) − η̂ T ∇y h(ȳ)(ȳ − yl )+ µ̂T (g(ȳ) − ∇y g(ȳ)(ȳ − yl )) > 0. Thus, we should impose the inequality (24) to the master π̂ T (Mx + Nyl + b) − η̂ T ∇y h(ȳ)(ȳ − yl )+ µ̂ (g(ȳ) − ∇y g(ȳ)(ȳ − y )) ≤ 0, T l (26) Define U = {(y, α) : yl ≤ y ≤ yu , α ≥ 0}. Applying the first-order optimality conditions for (22) yields ( ) ( ) ( ) 0 Nj: ∇y hj (ȳ) ∑ ∑ + j π̄j + j η̄j 1 −ej 0 ( ) ( ) (27) ∇y gj (ȳ) uy ∑ + j µ̄j + = 0, 0 uα (uy ; uα )T (ȳ − y; ᾱ − α) ≥ 0, for all (y, α) ∈ U . We choose (y, α) = (yl , 0) and then multiply (27) by (ȳ − yl , ᾱ), which yields ∑ T l T T l i ᾱi + π̄ N(ȳ − y ) − π̄ ᾱ + η̄ ∇y h(ȳ)(ȳ − y )+ µ̄T ∇y g(ȳ)(ȳ − yl ) ≤ 0. i T Proof: We prove by contradiction. Suppose that ( π̄ T Mx̄ + π̄ T (Nyl + b) − η̄ T ∇y h(ȳ)(ȳ − yl )− ) µ̄T ∇y g(ȳ)(ȳ − yl ) ≤ 0. where Nj: is the j-th row of matrix N, 0 is the vector whose entries are all 0, 1 is the vector whose entries are all 1, ej is the j-th unit vector, (uy ; uα ) is a vector in the normal cone of U at (ȳ, ᾱ), that is, π≥0,µ≥0,η 0≤α 0≤z≤yu −yl ( T η h(ȳ) + ∇y h(ȳ)(z = multipliers corresponding to the constraints Mx̄ + Ny + b − α ≤ 0, h(y) = 0 and g(y) ≤ 0, respectively, in problem (22). Because of the complementary slackness condition, it follows µ̄T g(ȳ) = 0. From (24), we derive the following result. Theorem 3: For a given x̄, if the system (21) is infeasible, then the following inequality is a cut for x̄ ( π̄ T Mx + π̄ T (Nyl + b) − η̄ T ∇y h(ȳ)(ȳ − yl )− ) (25) µ̄T ∇y g(ȳ)(ȳ − yl ) ≤ 0. (24) which acts as a linear cut. Motivated by the above analysis, we propose an alternative cut when (21) is infeasible. Let (π̄, η̄, µ̄) denote the Lagrange Notice that π̄ T ᾱ = π̄ T (Mx̄ + Nȳ + b), then we have ∑ l T i ᾱi − π̄ (Ny + Mx̄ + b)+ (28) η̄ T ∇y h(ȳ)(ȳ − yl ) + µ̄T ∇y g(ȳ)(ȳ − yl ) ≤ 0. ∑ Combining i ᾱi > 0, (26), and (28) gives a contradiction. Note that our cut (25) is different from the Benders cut used in [4], which is obtained by the linearization of an implicit function with respect to the base-case variable for each contingency. We explicitly use not only the Lagrange multipliers of the linear constraints but also that of nonlinear ones as well as the bounds on the decision variables. B. Description of Benders Decomposition Algorithm We now describe a Benders decomposition algorithm using linear cuts (24) and (25). The master problem is expressed as min f (u0 , x0 ) s.t. {h0 (x0 , u0 ) = 0, g0 (x0 , u0 ) ≤ 0, Pu0 + q ≤ 0} (29) where Pu0 + q ≤ 0 denotes the set of feasibility cuts at a given step, P is a matrix, and q is a vector. B ENDERS D ECOMPOSITION A LGORITHM 1. Solve the regular OPF to get (x10 , u10 ): min f (u0 , x0 ) s.t. {h0 (x0 , u0 ) = 0, g0 (x0 , u0 ) ≤ 0} 6 2. For k = 1, 2, . . . (a) For c = 1, . . . , C: Solve a problem of∑form (22) to get k the optimal value i αi , where u0 is in place of x̄. ∑ ∗ If i αi > 0, then add the cut (24) or (25) ∑ into the master problem (29). ∗ If i αi = 0 for all c ∈ C, terminate the algorithm. (b) Solve the master problem (29) to get (xk+1 , uk+1 ). 0 0 message passing scheme based on the application of ADMM for solving the optimal power scheduling problem in a distributed manner. Let us briefly review the main idea of ADMM. Consider the optimization problem with block separable structure min {ϕ(x) + φ(y) : Mx + Ny = d, x ∈ X, y ∈ Y }, x,y where X ⊂ Rn , Y ⊂ Rm , M ∈ Rp×n , N ∈ Rp×m and d ∈ Rp . The augmented Lagrangian function is given by Remark 1 (Adaptive cuts): Because SCOPF is a nonconvex optimization problem, if we use a linear cut without care, its optimal solutions will likely be trimmed off. It might delete subregions of the feasible set containing good feasible points. Consequently, the generated solution is often far from the global optimum. Even worse, the new cut can make the master problem infeasible (see Figure 1). To alleviate this effect, we propose to use a cut in an adaptive manner. Suppose that c(x) ≤ 0 is a cut for x̄, it is easy to see that Lβ (x, y, λ) =ϕ(x) + φ(y) + λT (Mx + Ny − d)+ β ∥Mx + Ny − d∥2 , 2 where β > 0 is the penalty parameter. Using a Gauss-Seidel step in the augmented Lagrangian multiplier framework, ADMM has the form c(x) − τ c(x̄) ≤ 0, for any τ ∈ (0, 1), λk+1 = λk + β(Mxk+1 + Nyk+1 − d). yk+1 = argminy∈Y Lβ (xk+1 , y, λk ) (30) is also a cut. By choosing τ adaptively based on a constraint violation measure for x̄, we likely improve the quality of the solution. For example, the reference factor can be the optimal function value of the problem (22). The idea is that if the constraint violation is large, we select τ close to 1, otherwise τ is close to 0. Fig. 1. xk+1 = argminx∈X Lβ (x, yk , λk ) We notice that the SCOPF (1) can be reformulated by introducing auxiliary variables u0c in a form suitable for ADMM as min f (u0 , x0 ) s.t. h0 (u0 , x0 ) = 0, g0 (u0 , x0 ) ≤ 0 hc (uc , xc ) = 0, gc (uc , xc ) ≤ 0, ∀c ∈ C |uc − u0c | ≤ uc , ∀c ∈ C u0 − u0c = 0, ∀c ∈ C. Regular and adaptive Benders cuts We define x , (u0 , x0 ), y , (u1 , x1 , . . . , uC , xC ), ϕ(x) , f (u0 , x0 ), φ(y) , 0, X , {(u0 , x0 ) : h0 (u0 , x0 ) = 0, g0 (u0 , x0 ) ≤ 0}, Remark 2: To speedup the Benders algorithm, we can switch off a contingency c if it was feasible for incumbent uk0 because it is likely that no cuts are generated for later . At the end, we need to check if the final u∗0 is feasible uk+j 0 for it. Y , (Y1 , . . . , YC ), where { Yc , (uc , xc , u0c ) : hc (uc , xc ) = 0, gc (uc , xc ) ≤ 0, } |uc − u0c | ≤ uc , ∀c ∈ C, Mx + Ny = d , u0 − u0c = 0, ∀c ∈ C, then the augmented Lagrangian is Lβ (x, y, λ) = f (u0 , x0 )+ V. A LTERNATING DIRECTION METHOD OF MULTIPLIERS This section introduces an alternative approach to solve the large-scale SCOPF problem (1) by decomposing it into a number of smaller subproblems related to each contingency. By reformulating the original problem, we can apply the alternating direction method of multipliers [16] to obtain our algorithm. The first-order primal-dual algorithm ADMM iteratively updates both primal and dual variables. In power system analysis, to solve the classical OPF problem without the security constraints, in [17], Kim and Baldick split the power grid into a number of separate regions. By duplicating the variables in overlap regions, they are able to decompose the OPF problem by ADMM. Kraning et. al [18] develop a C ∑ β 2 (λT c (u0 −u0c )+ ∥u0 −u0c ∥ ). 2 c=1 The ADMM scheme for SCOPF is defined as follows ADMM ALGORITHM Initialize k = 0 and starting points u00c , λ0c . Repeat 1. (xk+1 , uk+1 ) = argmin(u0 ,x0 )∈X f (u0 , x0 ) + ) ∑0 ( (0 k )T λc (u0 − uk0c ) + β2 ∥u0 − uk0c ∥2 . c∈C 2. For c = 1, . . . , C : (xk+1 , uk+1 , uk+1 ) = argmin(uc ,xc ,u0c )∈Yc ( ck )T ck+1 0c λc (u0 − u0c ) + β2 ∥uk+1 − u0c ∥2 . 0 k+1 k+1 k+1 k λc = λc + β(u0 − u0c ). 3. Set k = k + 1. 7 TABLE II CPU TIMES AND NUMBER OF ITERATIONS FOR GLOBAL ALGORITHM Until a stopping criterion is satisfied. Note that if F, G, X, and Y are convex, ADMM is guaranteed to converge to the optimum. However, in our application, X and Y are nonconvex, it can serve as a local algorithm. At each iteration, we decompose the SCOPF into C + 1 subproblems with roughly the same size of the conventional OPF problem, which is suitable for parallel implementation. The feasible sets of subproblems in the ADMM algorithm have not been modified over iterations, a significant computation effort can be saved by using warm-start techniques for these subproblems. VI. C OMPUTATIONAL EXPERIMENTS We use several standard power networks appearing in the literature, whose structures and characteristics are summarized in Table I. NE39 is the the New England system from [19], while IEEE14, IEEE30, IEEE57, IEEE118 and IEEE300 are available at http://www.ee.washington.edu/research/pstca/, and PL3012 is the Polish network during winter 2007-08 evening peak conditions from Matpower [20]. We artificially generated TABLE I T EST SYSTEMS CHARACTERISTICS # 1. 2. 3. 4. 5. 6. 7. Case IEEE14 IEEE30 NE39 IEEE57 IEEE118 IEEE300 PL3012 |N | 14 30 39 57 118 300 3012 |G| 5 6 10 7 54 69 385 |B| 20 41 46 80 186 411 3572 |C| 10 15 20 30 100 150 300 #Var 418 1152 2058 3968 34744 111438 1992620 the list of contingencies; corresponding numbers of scenarios are presented in the fifth column. The last column gives the number of decision variables. In our experiments, a contingency relates to a transmission line outage; every generator allows to readjust up to 5% of its maximal generation output capacity. All the experiments were performed on a personal computer using a Matlab implementation with an Intel Xeon X5570 2.93 Ghz and 8Gb of memory. In the following tables, the basecase OPF cost “Base” represents the generation cost without contingency constraints. Denote “Optimal” by the optimal value generated by a global method. The total generation cost is labeled by “Cost”, “Time” is the running time in seconds, and “It” is the number of iterations. A. Numerical Results for Global Algorithm We present numerical results in Table II using the proposed branch-and-bound algorithm in Section III. The projected subgradient method was used to solve for the dual (14). CPLEX 12.4 was used to solve for (15). The inverse-free preconditioned Krylov subspace projection algorithm [21] deals with the eigenvector associated with the minimum eigenvalue in Theorem 1. First of all, it is worth noting that the branch-andbound algorithm terminates at the first iteration, i.e., It = 1, which implies the Lagrange dual problem gives a zero duality Cases IEEE14 IEEE30 NE39 IEEE57 IEEE118 IEEE300 PL3012 Base 80.81 89.06 361.54 417.37 1296.67 7197.23 25817.09 Optimal 81.04 89.74 391.61 419.55 1333.06 7197.48 Out of It Time 1 81.62 1 19.56 1 0.25 1 0.37 1 336.04 1 875.81 memory gap. For these test cases, the branching procedure is not needed. Thus solving the dual provides a global optimum to the SCOPF problem. As usual, the main drawback of the global scheme is the large computational time required, which suffers from the scalability. However, its proved optimal results are useful because they can be used to evaluate the solution quality of any local approach. We can reduce the running time and alleviate the memory requirement by utilizing the decomposable structure of the dual to implement in parallel. Note that instances NE39 and IEEE57 were solved quickly because we used the Lagrangian multipliers of the primal as the starting point for the dual, and these values are indeed the optimal solution to the dual. TABLE III CPU TIMES AND NUMBER OF ITERATIONS FOR GLOBAL ALGORITHM C Cases Changes IEEE30 Pd = 0.5Pd {2 4 10} 36.3 Qd = 0.1Qd Pd = 0.75Pd {2 6 15 22} 238.6 Qd = 0.2Qd NE39 Base Optimal It Time 36.4 103 218.4 291.2 156 305.7 For all above standard test cases, the Lagrange duality relaxation solves the SCOPF problem, i.e., there is no need to employ the branching step. We use some modified test power systems suggested in [22], for which Lagrangian dual relaxation has a non-zero duality gap for SCOPF, to illustrate the benefit of the branching. The modifications are active and reactive power demands. The power generation readjustments are within 2% of maximal capacities. The stopping criterion for the branch-and-bound algorithm was set at a relative gap of ϵ = 0.01 between the lower bound and the upper bound. The details of these tests and the algorithm performance are shown in Table III. B. Numerical Results for Local Algorithms Version 3.10 of IPOPT [23] was used to solve the subproblems arising in these decomposition algorithms. The stopping criterion for ADMM algorithm is k+1 k ∥uk+1 − uk+1 0 0c ∥∞ ≤ 0.01 and ∥u0c − u0c ∥∞ ≤ 0.01. We used a fixed parameter of β = 1 for the iterations. First, we present the performance of Benders decomposition using the cut (25) and the adaptive cut (30) with the following rule for selecting τ in (30): if the optimal function value of (22) is larger than 1, then we choose τ = 0.8; otherwise τ = 0.4. We omit to report the results for the cut (24) because it has almost 8 CPU TIMES OF TABLE IV B ENDERS [4], OUR PROPOSED B ENDERS DECOMPOSITION ALGORITHMS , AND ADMM Benders [4] Adaptive Benders Prop. Benders Adaptive Prop. Benders ADMM Cases Optimal Cost It Time Cost It Time Cost It Time Cost It Time Cost It Time IEEE14 81.04 83.25 5 0.61 81.08 8 2.09 84.18 3 0.53 81.12 4 1.41 81.07 35 2.20 IEEE30 89.74 failed 90.71 10 2.48 97.25 5 1.44 92.58 20 3.56 89.82 71 7.11 NE39 391.61 415.65 6 1.77 393.59 17 3.31 418.81 5 1.61 391.50 18 3.14 391.58 31 4.74 IEEE57 419.55 457.38 4 2.74 426.23 12 4.06 432.67 3 2.24 427.96 9 3.55 419.52 12 4.46 IEEE118 1333.06 failed - 1333.55 23 16.42 1402.74 7 5.81 1332.92 18 12.63 1333.15 29 23.54 IEEE300 7197.48 7429.50 3 9.72 7197.39 14 21.94 7325.25 5 12.97 7197.36 18 26.15 7197.41 16 34.61 PL3012 26457.20 7 779.14 25923.47 13 1008.28 26137.92 8 831.16 25923.62 12 955.08 25921.58 23 3270.52 similar results with those of (25). If the optimal function value of (22) is bigger than 0.01, we generate the feasibility cut for this contingency. The performance of the standard Benders decomposition algorithm in [4] and the application of our adaptive cut scheme for this method are reported. Table IV shows that both the standard Benders decomposition algorithm and our proposed algorithm have a fast convergence rate (up to 8 iterations) if they are convergent; however, they often generate solutions of poor quality. The standard Benders cut from [4] failed to solve for the 30bus and 118-bus cases, which is due to the infeasibility of the master problem when more linear cuts are added. The difficulty with this sub-optimality and the infeasibility is understandable since hyperplanes from the feasibility cuts often prune the optimal solution from the nonconvex feasible set as explained in Remark 1. The adaptive selection for both Benders cuts gives much better solutions which are often close to the globally optimal points. Moreover, it is also able to cure the infeasibility issue. As expected, the deployment of the technique leads to more iterations to converge in all cases. The solution quality and running times for these adaptive Benders cut algorithms are comparable. ADMM is more robust in terms of solution quality. That is, it often gives a slightly lower generation cost. In view of a smaller number of iterations, the running times of the adaptive Benders decompositions outperform ADMM. This is also partially because of the fact that ADMM cannot exploit the switch-off strategy presented in Remark 2, which requires us to solve C + 1 subproblems per iteration. We notice that all ADMM solutions are the essentially global optima. These numerical results on serial codes show that these algorithms are scalable, they are suitable for solving large-scale networks with a large number of contingencies. The running times will be reduced significantly if we exploit the decomposable properties of these algorithms and implement in parallel in order to meet the real-time operation requirement. Table V gives the overall power generation (MW) and total cost for OPF, SCOPF, and first four contingencies. The fourth column, denoted by “Shifted”, shows the absolute value of overall power (MW) shift between OPF and SCOPF solutions. For cases with a large amount of power rescheduled such as NE39, IEEE118 and PL3012, the generation costs of SCOPF are noticeably bigger than those of OPF. In view of the large distance between the solutions of OPF and SCOPF, Benders decomposition algorithms tend to need more cuts, thus more iterations, to converge. The total cost for each contingency is always bigger than that of the normal case, i.e., the optimal function value of SCOPF. However, the overall power generation for some contingencies can be smaller than that of the SCOPF, for example, the third contingency of the IEEE14 system. TABLE V C OMPARISON BETWEEN OPF AND SCOPF SOLUTIONS Case OPF SCOPF Shifted Ctg. 1 Ctg. 2 Ctg. 3 Ctg. 4 1. MW 268.2 268.4 9.3 270.8 268.3 268.0 269.4 $ 80.8 81.0 82.7 81.2 81.3 81.8 2. MW 295.1 292.8 66.9 292.6 294.8 292.9 292.5 $ 89.1 89.7 90.4 91.8 90.6 89.9 3. MW 6194.2 6207.3 1038.0 6220.2 6215.2 6212.3 6214.8 $ 361.5 391.6 390.0 400.3 389.2 389.5 4. MW 1267.2 1267.3 316.2 1267.1 1269.2 1267.4 1268.9 $ 417.4 419.6 419.8 420.8 419.8 420.5 5. MW 4319.4 4309.6 738.8 4312.1 4311.1 4313.7 4312.3 $ 1296.7 1333.1 1333.7 1333.5 1334.5 1340.8 6. MW 23829.9 23830.4 103.1 23831.6 23827.1 23831.8 23828.9 $ 7197.2 7197.5 7202.6 7202.9 7203.6 7202.4 7. MW 27781.4 27794.3 579.5 27835.4 27814.7 27833.9 27813.5 $ 25817.1 25921.6 26075.8 25938.2 25931.4 25928.6 C. Stressed Networks We tested the performance of these local algorithms on stressed networks, where the upper limits on active power flow in transmission lines were reduced. The modifications are shown in Table VI, where the second column gives bus indices connecting the modified lines, the third one presents the power flows (MW) for the normal case of unmodified networks. We select the lines among the highest values of power flows and set the line limits (MW) significantly smaller than these values, which are given in the fourth column. The new values of power flows for the modified networks are given in the fifth column. The last column shows the total MW adjustment between the OPF and SCOPF. TABLE VI M ODIFIED Cases IEEE30 NE39 IEEE118 PL3012 Mod. Lines 1-2 6-28 2-3 29-38 26-30 68-116 34-40 98-99 191-193 TEST CASES FOR STRESSED NETWORKS Org. Flow 120.1 16.8 522.6 930.0 226.0 184.1 678.9 874.7 425.2 Mod. Limit 100.0 10.0 450.0 800.0 100.0 100.0 580.0 750.0 370.0 Mod. Flow 79.5 6.1 425.8 800.0 100.0 100.0 580.0 746.9 369.5 Shifted 138.8 1410.5 951.1 872.4 9 TABLE VII CPU TIMES FOR STRESSED NETWORKS Benders [4] Adaptive Benders Prop. Cases Optimal Cost It Time Cost It Time Cost IEEE30 94.6 failed 94.6 20 3.1 failed NE39 397.5 failed 397.5 15 3.4 428.6 IEEE118 1346.3 1375.3 5 6.4 1347.8 13 10.2 1366.6 PL3012 26170.2 17 1280.5 26091.4 26 1445.7 26198.3 As shown in Table VII, both regular and our Benders decomposition algorithms require more iterations for the networks with stress conditions, but the increases comparing with the unstressed cases are moderate. Again, the adaptive cut scheme not only alleviates the convergence failure of these Benders cuts but also improves the solution quality. For many cases, it helps to find an operating point very close to the optimal point. The convergence rate of ADMM seems to be insensitive to the stress condition of test systems, that is, the number of iterations does not depend on the congestions. Its found solutions are consistent, which are essentially global optima. VII. C ONCLUSIONS We have developed three different approaches to solve the SCOPF problem, namely a global optimization algorithm and two decomposition schemes. Lagrangian duality is the main tool for the global technique to compute a lower bound, and the dual shows a zero duality gap for most of our test instances. Decomposition algorithms are based on Benders cuts and the alternating direction method of multipliers. We have shown that if one uses the regular Benders cut without care, the solution will be rather far from the optimal operating point. By taking the nonconvexity into account properly, our adaptive Benders decomposition often generates a solution of high quality. The resulting ADMM algorithm is capable of yielding a robust solution, which is numerically proved to be the global optimum. These proposed decomposition schemes deal with each contingency separately, therefore they are suitable for large-scale problems and can be implemented in parallel to handle many contingencies. R EFERENCES [1] J. Carpentier, “Contribution to the economic dispatch problem,” Bulletin Society Francaise Electriciens, vol. 8, no. 3, pp. 431–447, August 1962. [2] A. J. Wood and B. F. Wollenberg, Power Generation Operation and Control. New York: John Wiley & Sons, Inc., 1996. [3] O. Alsac and B. Stott, “Optimal load flow with steady state security,” IEEE Trans. Power Appa. Syst., vol. PAS-93, no. 3, pp. 745–751, 1974. [4] A. Monticelli, M. V. F. Pereira, and S. Granville, “Security-constrained optimal power flow with post-contingency corrective rescheduling,” IEEE Trans. Power Syst., vol. 2, no. 1, pp. 175–180, 1987. [5] F. Capitanescu, M. Glavic, D. Ernst, and L. Wehenkel, “Contingency filtering techniques for preventive security-constrained optimal power flow,” IEEE Trans. Power Syst., vol. 22, no. 4, pp. 1690–1697, 2007. [6] F. Bouffard, F. D. Galiana, and J. M. Arroyo, “Umbrella contingencies in security constrained optimal power flow,” in 15th Power Systems Computation Conference (PSCC 05), Liège, Belgium, Aug 2005. [7] D. Ernst, D. Ruiz-Vega, M. Pavella, P. M. Hirsch, and D. Sobajic, “A unified approach to transient stability contingency filtering, ranking and assessment,” IEEE Trans. Power Syst., vol. 16, no. 3, pp. 435–443, 2001. [8] A. M. Geoffrion, “Generalized Benders decomposition,” J. Optim. Theory Appl., vol. 10, no. 4, pp. 237–260, 1972. Benders Adaptive Prop. Benders ADMM It Time Cost It Time Cost It Time 94.6 22 3.4 94.7 67 7.8 7 1.9 397.8 18 3.5 397.5 33 62 8 7.3 1347.0 21 14.1 1347.6 25 24.7 13 1165.1 26089.6 19 1295.2 26085.7 24 3582.5 [9] Y. Li and J. D. McCalley, “Decomposed SCOPF for improving effciency,” IEEE Trans. Power Syst., vol. 24, no. 1, pp. 494–495, 2009. [10] F. Capitanescu and L. Wehenkel, “A new iterative approach to the corrective security-constrained optimal power flow problem,” IEEE Trans. Power Syst., vol. 23, no. 4, pp. 1533–1541, November 2008. [11] K. Karoui, H. Crisciu, A. Szekut, and M. Stubbe, “Large scale security constrained optimal power flow,” in 16th Power System Computation Conference, Glasgow, Scotland, 2008. [12] A. Marano-Marcolini, F. Capitanescu, J. Martinez-Ramos, and L. Wehenkel, “Exploiting the use of DC SCOPF approximation to improve iterative AC SCOPF algorithms,” IEEE Trans. Power Syst., vol. 27, no. 3, pp. 1459–1466, Aug. 2012. [13] J. Lavaei, “Zero duality gap for classical OPF problem convexifies fundamental nonlinear power problems,” in American Control Conference, 2011, pp. 4566–4573. [14] F. Capitanescu, J. L. Martinez Ramos, P. Panciatici, D. Kirschen, A. Marano Marcolini, P. Platbrood, and L. Wehenkel, “State-of-theart, challenges, and future trends in security constrained optimal power flow,” Electric Power Systems Research, vol. 81, no. 8, pp. 1731–1741, 2011. [15] D. T. Phan, “Lagrangian duality and branch-and-bound algorithms for optimal power flow,” Operations Research, vol. 60, no. 2, pp. 275–285, 2012. [16] J. Eckstein and D. P. Bertsekas, “On the Douglas-Rachford splitting method and the proximal point algorithm for maximal monotone operators,” Math. Program., vol. 55, pp. 293–318, 1992. [17] B. H. Kim and R. Baldick, “A comparison of distributed optimal power flow algorithms,” IEEE Trans. Power Syst., vol. 15, no. 2, pp. 599–604, 2000. [18] M. Kraning, E. Chu, J. Lavaei, and S. Boyd, “Dynamic network energy management via proximal message passing,” Foundations and Trends in Optimization, to appear. [19] M. A. Pai, Energy Function Analysis for Power System Stability. Boston, MA: Kluwer Academic Publishers, 1989. [20] R. Zimmerman, C. Murillo-Sanchez, and R. Thomas, “MATPOWER: steady-state operations, planning, and analysis tools for power systems research and education,” IEEE Trans. Power Syst., vol. 26, no. 1, pp. 12–19, feb. 2011. [21] J. H. Money and Q. Ye, “Algorithm 845: EIGIFP: a MATLAB program for solving large symmetric generalized eigenvalue problems,” ACM Trans. Math. Softw., vol. 31, no. 2, pp. 270–279, 2005. [22] A. Gopalakrishnan, A. U. Raghunathan, D. Nikovski, and L. T. Biegler, “Global optimization of optimal power flow using a branch and bound algorithm,” in 50th Annual Allerton Conference on Communication, Control, and Computing, 2012, pp. 609–616. [23] A. Wächter and L. T. Biegler, “On the implementation of an interiorpoint filter line-search algorithm for large-scale nonlinear programming,” Math. Program., vol. 106, no. 1, pp. 25–57, 2006. Dzung Phan is a Research Staff Member of the Business Analytics and Mathematical Sciences Department at IBM T.J. Watson Research Center, Yorktown Heights, NY. His research interests lie in the field of the optimization theory and algorithms including methods for problems arising from power systems. He received a Ph.D. in Mathematics from the University of Florida in 2010. Jayant Kalagnanam is a Program Manager in the Business Analytics and Mathematical Sciences Department and has been at IBM T.J. Watson Research Center as a Research Staff Member in 1996. Before joining IBM Research he worked as a Research Faculty at the Department of Engineering and Public Policy at Carnegie Mellon University where he developed large cost minimization models for designing and retrofitting utilities with new technologies. He graduated with a Ph.D. in 1991 from the same department. After joining IBM research, he worked on developing optimization models for production planning and scheduling in the context of manufacturing for various industries including steel, semiconductor, mining and shipbuilding.