# Solving diagonal simultaneous quadratic equations

A problem I am trying to solve has led me to the following system of equations:

$$A(x^2) + Bx + c = 0$$

Where $$A$$ and $$B$$ are known matrices, $$c$$ is a known vector, $$x$$ is the vector of unknowns for which we want to solve, and $$(x^2)$$ is the vector of squared elements of $$x$$, i.e. $$(x^2) = (x_1^2, x_2^2, ... x_N^2)$$.

I have been reading a lot about solving simultaneous quadratic equations, quadratic matrix equations, etc., but it seems to be a rather large field with few general solutions known. Somehow, though, the form above seems particularly simple, since there are no "cross-terms" (no $$x_0\cdot x_1$$ etc.), so I wonder if it is solvable? I think my matrices are all real and invertible, but if a solution exists that requires further special properties I'd still be interested. I am also interested in efficient numerical solutions, since I ended up with these equations as part of an attempt to avoid performing a numerical minimisation in high dimensions (around 100, so these matrices will be of that sort of size).

• Do you need one solution, or all of them? – Federico Poloni Mar 21 at 12:52
• Well for my cases there should usually only be one solution, I'm pretty sure. There is probably some extra structure that ensures this, but I'm afraid I'm not sure what it is. Is there some sort of equivalent of a determinant that could be computed to show this? – Ben Farmer Mar 22 at 13:24
• Bézout's theorem tells you that generically you can expect a lot of solutions, $2^N$ of them (possibly complex). Are you sure you have only one in your case? Have you tried small-dimensional cases to check what happens there? – Federico Poloni Mar 22 at 15:15
• I've tried it numerically and not found other solutions, but I haven't been very thorough about it. These are a set of extremisation conditions for finding the maximum likelihood estimator in a statistics problem, so it will be the regularity of the parameter space plus some sort of asymptotic theory guaranteeing the (asymptotic) uniqueness of the maximum likelihood estimator that means there should be only one solution, at least for sufficiently large amounts of data. Not sure how that property will manifest in these equations though. – Ben Farmer Mar 22 at 17:40

The first numerical method to try might be a multivariate Newton's method. The Jacobian of $$F(x) = A (x^2) + B x + c$$ is $$J(x)= 2 A\; \text{diag}(x) + B,$$ and you iterate the map $$x \mapsto x - J(x)^{-1} F(x).$$ As usual, this is not guaranteed to succeed, but when it does work (if you start "sufficiently close" to a solution) it is efficient. Of course, you won't actually compute $$J(x)^{-1}$$; instead you solve $$J(x)\, y = F(x)$$ and take $$x - y$$.
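A minimal NumPy sketch of this iteration (the helper name `newton_solve` is mine, not from the post; it assumes $A$ and $B$ are square, as in the asker's invertible case):

```python
import numpy as np

def newton_solve(A, B, c, x0, tol=1e-12, max_iter=50):
    """Solve A(x∘x) + Bx + c = 0 by Newton's method.

    Jacobian: J(x) = 2 A diag(x) + B.  Each step solves J(x) y = F(x)
    and updates x <- x - y, rather than forming J(x)^{-1} explicitly.
    Assumes A and B are square (n x n).
    """
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(max_iter):
        F = A @ (x * x) + B @ x + c
        if np.linalg.norm(F) < tol:
            break
        # A * x scales column j of A by x[j], i.e. A @ diag(x)
        J = 2.0 * A * x + B
        x -= np.linalg.solve(J, F)
    return x
```

As noted above, convergence depends on the starting point; in practice one would add a line search or damping for robustness.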

We have the following system of quadratic equations in $$\mathrm x \in \mathbb R^n$$

$$\mathrm A (\mathrm x \circ \mathrm x) + \mathrm B \mathrm x + \mathrm c = 0_m$$

where $$\mathrm A \in \mathbb R^{m \times n}$$, $$\mathrm B \in \mathbb R^{m \times n}$$ and $$\mathrm c \in \mathbb R^m$$ are given, and $$\circ$$ denotes the entry-wise (Hadamard) product.

Let $$\mathrm a_i, \mathrm b_i \in \mathbb R^n$$ denote the $$i$$-th rows of matrices $$\rm A$$ and $$\rm B$$, respectively. Let $$c_i$$ denote the $$i$$-th entry of vector $$\rm c$$. Hence, the system of quadratic equations can be rewritten as follows

$$\mathrm x^\top \mbox{diag} (\mathrm a_i) \, \mathrm x + \mathrm b_i^\top \mathrm x + c_i = 0, \qquad i \in [m]$$

Note that

\begin{aligned} \mathrm x^\top \mbox{diag} (\mathrm a_i) \, \mathrm x + \mathrm b_i^\top \mathrm x + c_i &= \begin{bmatrix} \mathrm x\\ 1\end{bmatrix}^\top \begin{bmatrix} \mbox{diag} (\mathrm a_i) & \frac 12 \mathrm b_i\\ \frac 12\mathrm b_i^\top & c_i\end{bmatrix} \begin{bmatrix} \mathrm x\\ 1\end{bmatrix}\\ &= \mbox{tr} \left( \begin{bmatrix} \mbox{diag} (\mathrm a_i) & \frac 12 \mathrm b_i\\ \frac 12\mathrm b_i^\top & c_i\end{bmatrix} \begin{bmatrix} \mathrm x\\ 1\end{bmatrix} \begin{bmatrix} \mathrm x\\ 1\end{bmatrix}^\top \right)\\ &= \mbox{tr} \left( \begin{bmatrix} \mbox{diag} (\mathrm a_i) & \frac 12 \mathrm b_i\\ \frac 12\mathrm b_i^\top & c_i\end{bmatrix} \begin{bmatrix} \mathrm x \mathrm x^\top & \mathrm x\\ \mathrm x^\top & 1\end{bmatrix} \right)\end{aligned}

Let

$$\mathrm M_i := \begin{bmatrix} \mbox{diag} (\mathrm a_i) & \frac 12 \mathrm b_i\\ \frac 12\mathrm b_i^\top & c_i\end{bmatrix}$$

The original system of $$m$$ quadratic equations in $$\mathrm x \in \mathbb R^n$$ can then be replaced by the following system of $$m$$ linear equations in the symmetric positive semidefinite matrix $$\mathrm Y \in \mathbb R^{(n+1) \times (n+1)}$$, which plays the role of the rank-$$1$$ lifting $$\begin{bmatrix} \mathrm x\\ 1\end{bmatrix} \begin{bmatrix} \mathrm x\\ 1\end{bmatrix}^\top$$

$$\langle \mathrm M_i , \mathrm Y \rangle = 0, \qquad i \in [m]$$

plus the constraint $$\mbox{rank}(\mathrm Y) = 1$$ and the equality constraint $$Y_{n+1,n+1} = 1$$.
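The trace identity above is easy to verify numerically; the following NumPy check (mine, not part of the original answer) confirms that $$\langle \mathrm M_i, \mathrm Y \rangle$$ reproduces the $$i$$-th quadratic residual when $$\mathrm Y$$ is the rank-$$1$$ lifting of $$\mathrm x$$:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 4
A = rng.standard_normal((m, n))
B = rng.standard_normal((m, n))
c = rng.standard_normal(m)
x = rng.standard_normal(n)

z = np.append(x, 1.0)          # the lifted vector [x; 1]
Y = np.outer(z, z)             # rank-1 matrix [x; 1][x; 1]^T

for i in range(m):
    # M_i = [[diag(a_i), b_i/2], [b_i^T/2, c_i]]
    M = np.block([[np.diag(A[i]), 0.5 * B[i][:, None]],
                  [0.5 * B[i][None, :], np.array([[c[i]]])]])
    residual = A[i] @ (x * x) + B[i] @ x + c[i]
    # <M_i, Y> = tr(M_i Y) since M_i is symmetric
    assert np.isclose(np.trace(M @ Y), residual)
```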

The rank constraint is a major difficulty. Let us discard it and minimize the rank instead:

$$\begin{array}{ll} \text{minimize} & \mbox{rank}(\mathrm Y)\\ \text{subject to} & \langle \mathrm M_i , \mathrm Y \rangle = 0, \qquad i \in [m]\\ & Y_{n+1,n+1} = 1\\ & \mathrm Y \succeq \mathrm O_{n+1}\end{array}$$

Since the rank is non-convex, let us use the nuclear norm, which is a convex proxy for rank. Hence, we obtain the following convex optimization problem

$$\begin{array}{ll} \text{minimize} & \| \mathrm Y \|_*\\ \text{subject to} & \langle \mathrm M_i , \mathrm Y \rangle = 0, \qquad i \in [m]\\ & Y_{n+1,n+1} = 1\\ & \mathrm Y \succeq \mathrm O_{n+1}\end{array}$$

Let $$\bar{\rm Y}$$ denote the minimal solution of the convex program above. There are two cases to consider:

• If $$\mbox{rank} (\bar{\rm Y}) = 1$$, find $$\bar{\mathrm x} \in \mathbb R^n$$ such that $$\begin{bmatrix} \bar{\mathrm x}\\ 1\end{bmatrix} \begin{bmatrix} \bar{\mathrm x}\\ 1\end{bmatrix}^\top = \bar{\mathrm Y}$$ Note that $$\bar{\mathrm x}$$ is a solution of the original system of quadratic equations.

• If $$\mbox{rank} (\bar{\rm Y}) \neq 1$$, try another approach.
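In the rank-$$1$$ case, $$\bar{\mathrm x}$$ can be recovered from the top eigenpair of $$\bar{\rm Y}$$; a NumPy sketch (the helper `extract_rank1` and its tolerance are mine, for illustration):

```python
import numpy as np

def extract_rank1(Y_bar, tol=1e-8):
    """Recover x from a (numerically) rank-1 PSD matrix Y = [x;1][x;1]^T."""
    vals, vecs = np.linalg.eigh(Y_bar)        # eigenvalues in ascending order
    if vals[-2] > tol * vals[-1]:
        return None                           # Y_bar is not close to rank 1
    v = np.sqrt(vals[-1]) * vecs[:, -1]       # Y_bar ≈ v v^T
    if abs(v[-1]) < tol:
        return None                           # cannot normalize last entry
    v = v / v[-1]                             # fix scale/sign so v = [x; 1]
    return v[:-1]
```

Dividing by the last entry both fixes the sign ambiguity of the eigenvector and enforces the $$Y_{n+1,n+1} = 1$$ normalization.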