How to calculate the gradient of matrix equation

By Emma Johnson
$\begingroup$

Short question: How do I calculate the gradient of the $MSE(a, b)$ equation below?


Longer explanation: This problem arises while I'm following a derivation of an expression for the optimal beam vector $a$ in a data transmission. The mean square error (MSE) of this data transmission is calculated as follows:

$$MSE(a, b) = a^H(Hbb^HH^H+R_n)a + 1 - a^HHb - b^HH^Ha$$

where:

  • $a$, $b$: vectors that can be chosen freely
  • $H$, $R_n$: matrices that are fixed
  • $a^H$: denotes the Hermitian adjoint of $a$

The vector $a$ can be optimized (as a function of $b$) by setting the gradient of the MSE with respect to $a$ to zero.

The problem is that I don't know how to calculate the gradient when the equation has the above form. The $a^H$ at the beginning and the $a$ at the end of the first summand confuse me...

The answer should be:

$$ a^* = (Hbb^HH^H+R_n)^{-1}Hb = R_n^{-1}Hb\frac{1}{1+b^HH^HR_n^{-1}Hb}$$

But how to calculate this?


Update:

Using equations from The Matrix Cookbook I got this far:

$$\frac{\partial MSE(a, b)}{\partial a} = \frac{\partial}{\partial a} \left[ a^H\left(Hbb^HH^H+R_n\right)a\right] + \frac{\partial}{\partial a} 1 - \frac{\partial}{\partial a} \left[a^HHb\right] - \frac{\partial}{\partial a} \left[b^HH^Ha\right]$$

With

  • $\frac{\partial}{\partial a} 1 = 0$
  • $\frac{\partial b^TX^TDXc}{\partial X} = D^TXbc^T + DXcb^T$ (Cookbook (74))

I get:

$$\frac{\partial MSE(a, b)}{\partial a} = (Hbb^HH^H+R_n)^Ha + (Hbb^HH^H+R_n)a - \frac{\partial}{\partial a} \left[a^HHb\right] - \frac{\partial}{\partial a} \left[b^HH^Ha\right]$$

And that's it. I don't even know if I used equation (74) from the cookbook correctly, but it was the closest equation I could find for the first summand. I'm sorry, I just don't get it...

$\endgroup$

2 Answers

$\begingroup$

I'm not sure whether the following results hold for complex cases.

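Let all the vectors and matrices be real valued. Then $$A=a^TBa+1-a^THb-b^TH^Ta$$ where $B=Hbb^TH^T+R_n$. $B$ is symmetric if $R_n$ is symmetric. Taking the differential gives $$dA=da^TBa+a^TBda-da^THb-b^TH^Tda$$ Each term is a scalar and therefore equals its own transpose, so $da^TBa=a^TB^Tda$ and $da^THb=b^TH^Tda$, which gives $dA=\left(a^T(B^T+B)-2b^TH^T\right)da$. Setting the gradient to zero: $$a^T(B^T+B)-2b^TH^T=0$$ If $B$ is symmetric, this becomes $2Ba=2Hb$, which implies $$a^*=B^{-1}Hb=(Hbb^TH^T+R_n)^{-1}Hb$$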
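The real-valued gradient $2Ba-2Hb$ can be sanity-checked numerically. The following is only a sketch using NumPy: the sizes `n`, `m`, the random data, and the helper names `cost`, `grad`, and `numerical_grad` are assumptions made for illustration, and $R_n$ is assumed symmetric positive definite.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 3                              # hypothetical sizes
H = rng.standard_normal((n, m))
b = rng.standard_normal(m)
G = rng.standard_normal((n, n))
R_n = G @ G.T + n * np.eye(n)            # assumed symmetric positive definite

B = np.outer(H @ b, H @ b) + R_n         # B = H b b^T H^T + R_n


def cost(a):
    # A = a^T B a + 1 - a^T H b - b^T H^T a
    return a @ B @ a + 1 - a @ H @ b - b @ H.T @ a


def grad(a):
    # Analytic gradient for symmetric B: 2 B a - 2 H b
    return 2 * B @ a - 2 * H @ b


def numerical_grad(f, a, eps=1e-6):
    # Central finite differences, one coordinate at a time
    g = np.zeros_like(a)
    for i in range(a.size):
        e = np.zeros_like(a)
        e[i] = eps
        g[i] = (f(a + e) - f(a - e)) / (2 * eps)
    return g


a0 = rng.standard_normal(n)              # arbitrary test point
a_star = np.linalg.solve(B, H @ b)       # candidate optimum B^{-1} H b

grad_matches = np.allclose(grad(a0), numerical_grad(cost, a0), atol=1e-5)
grad_vanishes = np.allclose(grad(a_star), 0)
print(grad_matches, grad_vanishes)
```

Solving the linear system with `np.linalg.solve` avoids forming $B^{-1}$ explicitly, which is usually the better-conditioned choice.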

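The rest of your expected answer also holds. With $c=Hb$, the claim is $$(cc^T+R_n)^{-1}c=\frac{R_n^{-1}c}{c^TR_n^{-1}c+1}$$ Multiplying both sides from the left by $(cc^T+R_n)$ and by the scalar $(c^TR_n^{-1}c+1)$ gives $$c\,(c^TR_n^{-1}c+1)=(cc^T+R_n)R_n^{-1}c=c\,(c^TR_n^{-1}c)+c$$ which is true in any dimension because $c^TR_n^{-1}c$ is a scalar. This is the Sherman–Morrison formula applied to $R_n+cc^T$; it requires $R_n$ to be invertible and $c^TR_n^{-1}c\neq-1$.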
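A numerical experiment can test this identity directly. The sketch below uses NumPy with random data and compares both sides for several dimensions; the dimensions tried, the random `c` standing in for $Hb$, and the symmetric positive definite $R_n$ are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
results = {}

for n in (1, 2, 5, 10):                      # several dimensions
    c = rng.standard_normal(n)               # plays the role of c = H b
    G = rng.standard_normal((n, n))
    R_n = G @ G.T + n * np.eye(n)            # assumed symmetric positive definite

    # Left-hand side: (c c^T + R_n)^{-1} c
    lhs = np.linalg.solve(np.outer(c, c) + R_n, c)

    # Right-hand side: R_n^{-1} c / (1 + c^T R_n^{-1} c)
    Rinv_c = np.linalg.solve(R_n, c)
    rhs = Rinv_c / (1 + c @ Rinv_c)

    results[n] = bool(np.allclose(lhs, rhs))

print(results)   # True for a dimension means both sides agree there
```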

$\endgroup$

$\begingroup$

Define two new variables

$$\eqalign{ c = Hb,\quad M = cc^H+R_n \\ }$$

Write the cost function in terms of these new variables. Then calculate the gradient with respect to $a$ while treating $a^H$ as a constant (i.e. the Wirtinger derivative).

$$\eqalign{ \phi &= a^HMa + 1 - c^Ha - a^Hc \\ d\phi &= a^HM\,da - c^Hda \\ &= (M^Ha - c)^Hda \\ \frac{\partial\phi}{\partial a} &= (M^Ha - c)^H \\ }$$

Set the gradient to zero and solve

$$\eqalign{ M^Ha &= c \\ a &= (M^H)^{-1}c \\ &= (cc^H+R_n^H)^{-1}c \\ &= (Hbb^HH^H+R_n^H)^{-1}Hb \\ }$$

This equals the prescribed answer only if $\,R_n^H=R_n$.
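The complex case can be checked the same way. The sketch below is an illustration under stated assumptions, not a definitive implementation: it assumes $R_n$ is Hermitian positive definite (so that $R_n^H=R_n$ and the MSE is real), uses random complex data, and the helper names `crandn`, `mse`, and `wirtinger_grad` are made up for this example. It approximates the Wirtinger derivative $\frac{\partial}{\partial a}=\frac12\left(\frac{\partial}{\partial x}-i\frac{\partial}{\partial y}\right)$, with $a=x+iy$, by finite differences and compares it with $(M^Ha-c)^H$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 3                                   # hypothetical sizes


def crandn(*shape):
    # Complex Gaussian test data
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)


H = crandn(n, m)
b = crandn(m)
G = crandn(n, n)
R_n = G @ G.conj().T + n * np.eye(n)          # assumed Hermitian positive definite

c = H @ b
M = np.outer(c, c.conj()) + R_n               # M = c c^H + R_n


def mse(a):
    # phi = a^H M a + 1 - c^H a - a^H c, real because M is Hermitian
    return (a.conj() @ M @ a + 1 - c.conj() @ a - a.conj() @ c).real


def wirtinger_grad(f, a, eps=1e-6):
    # d/da = (d/dx - 1j * d/dy) / 2 via central differences
    g = np.zeros(a.size, dtype=complex)
    for k in range(a.size):
        e = np.zeros(a.size, dtype=complex)
        e[k] = eps
        d_re = (f(a + e) - f(a - e)) / (2 * eps)
        d_im = (f(a + 1j * e) - f(a - 1j * e)) / (2 * eps)
        g[k] = 0.5 * (d_re - 1j * d_im)
    return g


a0 = crandn(n)                                # arbitrary test point
analytic = (M.conj().T @ a0 - c).conj()       # entries of (M^H a - c)^H
grad_matches = np.allclose(analytic, wirtinger_grad(mse, a0), atol=1e-5)

a_star = np.linalg.solve(M.conj().T, c)       # solves M^H a = c
# Small random perturbations should never lower the MSE at the optimum
is_minimum = all(mse(a_star + 1e-3 * crandn(n)) >= mse(a_star)
                 for _ in range(100))
print(grad_matches, is_minimum)
```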

$\endgroup$
