{"id":458,"date":"2017-03-28T14:10:48","date_gmt":"2017-03-28T14:10:48","guid":{"rendered":"http:\/\/drsfenner.org\/blog\/?p=458"},"modified":"2017-03-28T14:11:40","modified_gmt":"2017-03-28T14:11:40","slug":"dc-svd-ii-from-values-to-vectors","status":"publish","type":"post","link":"https:\/\/drsfenner.org\/blog\/2017\/03\/dc-svd-ii-from-values-to-vectors\/","title":{"rendered":"DC SVD II:  From Values to Vectors"},"content":{"rendered":"<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>In our <a href=\"http:\/\/drsfenner.org\/blog\/2017\/02\/dc-svd-i-just-cant-let-it-go\/\">last installment<\/a>, we discussed solutions to the secular equation. These solutions are the eigenvalues (and\/or) singular values of matrices with a particular form. Since this post is otherwise light on technical content, I&#8217;ll dive into those matrix forms now.<!--more--><\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<h3 id=\"setting-up-the-secular-equation-to-solve-the-eigen-and-singular-problems\">Setting up the Secular Equation to Solve the Eigen and Singular Problems<\/h3>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>In dividing-and-conquering to solve the <em>symmetric tridiagonal eigenvalue<\/em> problem, we start with a symmetric tridiagonal matrix <span class=\"math inline\">\\(T\\)<\/span>, and eventually, we have to find the eigendecomposition &#8211; eigenvalues and eigenvectors &#8211; of a &quot;rank-one update&quot; of a diagonal matrix: <span class=\"math inline\">\\(ROU = D + zz^T\\)<\/span>. For my mathematician friends, I&#8217;m using <span class=\"math inline\">\\(ROU\\)<\/span> as a multi-character name for one variable. Sorry &#8211; I&#8217;ve moved on from Fortran 77.<\/p>\n<p>Similarly, in DC SVD, we started with a (lower) bidiagonal matrix <span class=\"math inline\">\\(B\\)<\/span>; eventually, we compute the SVD (singular values and right\/left singular vectors) of a matrix <span class=\"math inline\">\\(M\\)<\/span> (it&#8217;s the &quot;middle matrix&quot; in an equation that derives DC SVD) which has non-zero entries only in its first column and on its diagonal:<\/p>\n<p><span class=\"math display\">\\[M =<br \/>\n\\begin{bmatrix}<br \/>\nz_1 \\\\<br \/>\nz_2 &amp; d_2 \\\\<br \/>\n\\vdots &amp; &amp;  \\ddots \\\\<br \/>\nz_n &amp; &amp; &amp; d_n<br \/>\n\\end{bmatrix}<br \/>\n\\]<\/span><\/p>\n<p>Two quick notes. (1) For <span class=\"math inline\">\\(B\\)<\/span>, the top-left entry is the value of <span class=\"math inline\">\\(z_1\\)<\/span>. We also introduce <span class=\"math inline\">\\(d_1 = 0.0\\)<\/span> to simplify our bookkeeping. (2) <span class=\"math inline\">\\(T \\neq ROU\\)<\/span> in the eigen case and <span class=\"math inline\">\\(B \\neq M\\)<\/span> in the SVD case. These are subcomputations that are needed in a larger process that produces the full decompositions.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>We can do a little bit of matrix algebra to highlight the similarities between the eigen and singular problems. Treat <span class=\"math inline\">\\(z\\)<\/span> as a column-vector, <span class=\"math inline\">\\(d_1=0\\)<\/span> in the <span class=\"math inline\">\\(M\\)<\/span> (singular value) case, and recall that the singular values of <span class=\"math inline\">\\(M\\)<\/span> are the <span class=\"math inline\">\\(\\sqrt{\\text{eig}(MM^T)}\\)<\/span>. Strictly, the last fact holds when <span class=\"math inline\">\\(M\\)<\/span> is tall or square (i.e., <em>cols<\/em> <span class=\"math inline\">\\(&gt;=\\)<\/span> <em>rows<\/em> ), but rest assured, there&#8217;s a corresponding argument when <span class=\"math inline\">\\(M\\)<\/span> is wide instead of tall. We get the following:<\/p>\n<p><span class=\"math display\">\\[<br \/>\n\\begin{aligned}<br \/>\nROU = D + zz^T =<br \/>\n\\begin{bmatrix}<br \/>\nd_1 &amp;     &amp;        &amp; \\\\<br \/>\n    &amp; d_2 &amp;        &amp; \\\\<br \/>\n    &amp;     &amp; \\ddots &amp; \\\\<br \/>\n    &amp;     &amp;        &amp;  d_n<br \/>\n\\end{bmatrix}<br \/>\n+<br \/>\n\\begin{bmatrix}<br \/>\nz_1 z_1 &amp;  &amp; z_1 z_n \\\\<br \/>\n &amp; \\ddots &amp;  \\\\<br \/>\nz_n z_1 &amp;  &amp; z_n z_n<br \/>\n\\end{bmatrix}<br \/>\n=<br \/>\n\\begin{bmatrix}<br \/>\nd_1 + z_1^2 &amp; z_1 z_2 &amp; &amp; z_1 z_n \\\\<br \/>\nz_1 z_2 &amp; d_2 + z_2^2 &amp; &amp; \\\\<br \/>\n&amp; &amp; \\ddots &amp; \\\\<br \/>\nz_1 z_n &amp; &amp; &amp; d_n + z_n^2<br \/>\n\\end{bmatrix}<br \/>\n\\overset{eig}{\\rightarrow} Q \\Lambda Q^T<br \/>\n\\end{aligned}<br \/>\n\\]<\/span><\/p>\n<p><span class=\"math display\">\\[<br \/>\n\\begin{aligned}<br \/>\nM =<br \/>\n\\begin{bmatrix}<br \/>\nz_1 &amp; &amp; &amp; \\\\<br \/>\nz_2 &amp; d_2 &amp; &amp; \\\\<br \/>\n&amp; &amp; \\ddots &amp; \\\\<br \/>\nz_n &amp; &amp; &amp; d_n<br \/>\n\\end{bmatrix}<br \/>\n\\end{aligned}<br \/>\n\\quad \\text{and} \\quad<br \/>\n\\begin{aligned}<br \/>\nM^T =<br \/>\n\\begin{bmatrix}<br \/>\nz_1 &amp; z_2 &amp; &amp; z_n\\\\<br \/>\n &amp; d_2 &amp; &amp; \\\\<br \/>\n &amp; &amp; \\ddots &amp; \\\\<br \/>\n &amp; &amp; &amp; d_n<br \/>\n\\end{bmatrix}<br \/>\n\\end{aligned}<br \/>\n\\implies<br \/>\n\\begin{aligned}<br \/>\nMM^T =<br \/>\n\\begin{bmatrix}<br \/>\n0 + z_1^2&amp; z_1 z_2 &amp; &amp; z_1 z_n \\\\<br \/>\nz_1 z_2 &amp; d_2^2 + z_2^2 &amp; &amp; \\\\<br \/>\n &amp; &amp; \\ddots &amp; \\\\<br \/>\nz_1 z_n &amp; &amp; &amp; d_n^2 + z_n^2<br \/>\n\\end{bmatrix}<br \/>\n\\overset{svd}{\\rightarrow} U \\Omega V^T<br \/>\n\\end{aligned}<br \/>\n\\]<\/span><\/p>\n<p>So, up to the difference between using <span class=\"math inline\">\\(d\\)<\/span> and <span class=\"math inline\">\\(d^2\\)<\/span>, we have <span class=\"math inline\">\\(ROU=MM^T\\)<\/span> and <span class=\"math inline\">\\(\\text{eig}(ROU) = \\text{eig}(MM^T)\\)<\/span>. If we really wanted to massage out the differences, we could define <span class=\"math inline\">\\(M\\)<\/span>&#8216;s diagonal with the square roots of the values. But that seems excessive and gets us further away from the formulations used in the literature.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<h3 id=\"back-to-the-secular-equation\">Back to the Secular Equation<\/h3>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>For clarity, the secular equations for the eigen and singular cases are:<\/p>\n<p><span class=\"math display\">\\[f_\\text{eig}(\\lambda) = 1 + \\sum_{i=1}^n \\frac{z_{i}^2}{d_i &#8211; \\lambda} \\qquad<br \/>\nf_\\text{sv}(\\omega) = 1 + \\sum_{i=1}^n \\frac{z_{i}^2}{d^2_i &#8211; {\\omega}^2}\\]<\/span><\/p>\n<p>I won&#8217;t wade through the mathematics that gets us there, but about 10 lines of linear algebra will get the job done. You just have to get the right 10 lines. :grin: Seriously though, in the references from last time Arbenz lays it out in some detail and Demmel does it short-and-sweet (and, leaves some of it as an exercise!).<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>Practically speaking, when we want to solve the equivalent problems for the eigen or SV problem we are looking at code like this that compares a &quot;known good&quot; NumPy solution with our methods we are building:<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[&nbsp;]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span><\/span><span class=\"k\">def<\/span> <span class=\"nf\">test_eig<\/span><span class=\"p\">(<\/span><span class=\"n\">d<\/span><span class=\"p\">,<\/span><span class=\"n\">z<\/span><span class=\"p\">):<\/span>\n    <span class=\"n\">ROU<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">diag<\/span><span class=\"p\">(<\/span><span class=\"n\">d<\/span><span class=\"p\">)<\/span> <span class=\"o\">+<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">outer<\/span><span class=\"p\">(<\/span><span class=\"n\">z<\/span><span class=\"p\">,<\/span><span class=\"n\">z<\/span><span class=\"p\">)<\/span> <span class=\"c1\"># rank-one updated matrix<\/span>\n    \n    <span class=\"n\">expected<\/span> <span class=\"o\">=<\/span> <span class=\"n\">nla<\/span><span class=\"o\">.<\/span><span class=\"n\">eig<\/span><span class=\"p\">(<\/span><span class=\"n\">ROU<\/span><span class=\"p\">)<\/span>\n    <span class=\"n\">actual<\/span>   <span class=\"o\">=<\/span> <span class=\"n\">outer_eigval_eigvec<\/span><span class=\"p\">(<\/span><span class=\"n\">d<\/span><span class=\"p\">,<\/span><span class=\"n\">z<\/span><span class=\"p\">)<\/span>\n    \n    <span class=\"k\">return<\/span> <span class=\"n\">compare_eig_results<\/span><span class=\"p\">(<\/span><span class=\"n\">expected<\/span><span class=\"p\">,<\/span> <span class=\"n\">actual<\/span><span class=\"p\">)<\/span>\n\n<span class=\"k\">def<\/span> <span class=\"nf\">test_sv<\/span><span class=\"p\">(<\/span><span class=\"n\">d<\/span><span class=\"p\">,<\/span><span class=\"n\">z<\/span><span class=\"p\">):<\/span>\n    <span class=\"n\">M<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">diag<\/span><span class=\"p\">(<\/span><span class=\"n\">d<\/span><span class=\"p\">)<\/span>\n    <span class=\"n\">M<\/span><span class=\"p\">[:,<\/span><span class=\"mi\">0<\/span><span class=\"p\">]<\/span> <span class=\"o\">=<\/span> <span class=\"n\">z<\/span>\n\n    <span class=\"n\">expected<\/span> <span class=\"o\">=<\/span> <span class=\"n\">nla<\/span><span class=\"o\">.<\/span><span class=\"n\">svd<\/span><span class=\"p\">(<\/span><span class=\"n\">M<\/span><span class=\"p\">)<\/span>\n    <span class=\"n\">actual<\/span>   <span class=\"o\">=<\/span> <span class=\"n\">outer_singval_singvec<\/span><span class=\"p\">(<\/span><span class=\"n\">d<\/span><span class=\"o\">**<\/span><span class=\"mi\">2<\/span><span class=\"p\">,<\/span><span class=\"n\">z<\/span><span class=\"p\">)<\/span>\n    \n    <span class=\"k\">return<\/span> <span class=\"n\">compare_sv_results<\/span><span class=\"p\">(<\/span><span class=\"n\">expected<\/span><span class=\"p\">,<\/span> <span class=\"n\">actual<\/span><span class=\"p\">)<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>The code for the <code>outer_XX<\/code> functions is very, very similar. They are so similar it is almost worthwhile refactoring it to a common function. The argument against doing so is two-fold. (1) The problems being solved are slightly different: compare <span class=\"math inline\">\\(\\lambda\\)<\/span> and <span class=\"math inline\">\\({\\omega}^2\\)<\/span> above and the different matrices in the corresponding decompositions <span class=\"math inline\">\\(ROU \\rightarrow Q \\Lambda Q^T\\)<\/span> versus <span class=\"math inline\">\\(M \\rightarrow U \\Omega V^T\\)<\/span>. (2) The names and concepts of the two don&#8217;t unify entirely perfectly. For example, <span class=\"math inline\">\\(Q\\)<\/span> and <span class=\"math inline\">\\(U\\)<\/span> are basically the same. Calling the variable <code>q_or_u<\/code> sounds like a band name to me. In fact, there is a mathrock band called <a href=\"https:\/\/en.wikipedia.org\/wiki\/Q_and_Not_U\">Q and not U<\/a>. What is mathrock? Think of absurd key signatures and time changes.<\/p>\n<p>The visible differences in computing the SVD are: (1) passing in <span class=\"math inline\">\\(d^2\\)<\/span> instead of <span class=\"math inline\">\\(d\\)<\/span> (these terms act as constants), (2) adding a <code>np.sqrt<\/code> to get from <span class=\"math inline\">\\({\\omega}^2 \\rightarrow {\\omega}\\)<\/span>, and (3) we also need a series of steps to compute <span class=\"math inline\">\\(V^T\\)<\/span>. There are one or two differences behind the scenes. If we ignore the reasons against refactoring (because we are laz<sup>h<\/sup>h^h, err, efficient and we follow the DRY principle: don&#8217;t repeat yourself), we get code that looks like this:<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[&nbsp;]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span><\/span><span class=\"k\">def<\/span> <span class=\"nf\">outer_eigval_eigvec<\/span><span class=\"p\">(<\/span><span class=\"n\">d<\/span><span class=\"p\">,<\/span><span class=\"n\">z<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">if<\/span> <span class=\"n\">z<\/span><span class=\"o\">.<\/span><span class=\"n\">size<\/span><span class=\"o\">==<\/span><span class=\"mi\">1<\/span><span class=\"p\">:<\/span> <span class=\"c1\"># for consistency with singval\/vec<\/span>\n        <span class=\"k\">return<\/span> <span class=\"n\">z<\/span><span class=\"o\">**<\/span><span class=\"mi\">2<\/span><span class=\"p\">,<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">ones<\/span><span class=\"p\">((<\/span><span class=\"mi\">1<\/span><span class=\"p\">,<\/span><span class=\"mi\">1<\/span><span class=\"p\">))<\/span>\n    <span class=\"n\">lambdas<\/span><span class=\"p\">,<\/span> <span class=\"n\">Q<\/span> <span class=\"o\">=<\/span> <span class=\"n\">unified<\/span><span class=\"p\">(<\/span><span class=\"n\">d<\/span><span class=\"p\">,<\/span><span class=\"n\">z<\/span><span class=\"p\">)<\/span>\n    <span class=\"n\">Q<\/span> <span class=\"o\">\/=<\/span> <span class=\"n\">nla<\/span><span class=\"o\">.<\/span><span class=\"n\">norm<\/span><span class=\"p\">(<\/span><span class=\"n\">Q<\/span><span class=\"p\">,<\/span> <span class=\"n\">axis<\/span><span class=\"o\">=<\/span><span class=\"mi\">0<\/span><span class=\"p\">)<\/span>   <span class=\"c1\"># norm dup&#39;d row wise<\/span>\n    <span class=\"k\">return<\/span> <span class=\"n\">lambdas<\/span><span class=\"p\">,<\/span> <span class=\"n\">Q<\/span>\n\n<span class=\"k\">def<\/span> <span class=\"nf\">outer_singval_singvec<\/span><span class=\"p\">(<\/span><span class=\"n\">d_sq<\/span><span class=\"p\">,<\/span><span class=\"n\">z<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">if<\/span> <span class=\"n\">z<\/span><span class=\"o\">.<\/span><span class=\"n\">size<\/span><span class=\"o\">==<\/span><span class=\"mi\">1<\/span><span class=\"p\">:<\/span> <span class=\"c1\"># slightly ill-behaved without this<\/span>\n        <span class=\"k\">return<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">ones<\/span><span class=\"p\">((<\/span><span class=\"mi\">1<\/span><span class=\"p\">,<\/span><span class=\"mi\">1<\/span><span class=\"p\">)),<\/span> <span class=\"n\">z<\/span><span class=\"p\">,<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">ones<\/span><span class=\"p\">((<\/span><span class=\"mi\">1<\/span><span class=\"p\">,<\/span><span class=\"mi\">1<\/span><span class=\"p\">))<\/span>\n\n    <span class=\"n\">omega_sq<\/span><span class=\"p\">,<\/span> <span class=\"n\">U<\/span> <span class=\"o\">=<\/span> <span class=\"n\">unified<\/span><span class=\"p\">(<\/span><span class=\"n\">d_sq<\/span><span class=\"p\">,<\/span><span class=\"n\">z<\/span><span class=\"p\">)<\/span>\n    <span class=\"n\">U<\/span> <span class=\"o\">*=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">sign<\/span><span class=\"p\">(<\/span><span class=\"n\">z<\/span><span class=\"p\">)<\/span><span class=\"o\">.<\/span><span class=\"n\">reshape<\/span><span class=\"p\">(<\/span><span class=\"o\">-<\/span><span class=\"mi\">1<\/span><span class=\"p\">,<\/span><span class=\"mi\">1<\/span><span class=\"p\">)<\/span>\n    \n    <span class=\"c1\"># the calculations for V are described below<\/span>\n    <span class=\"n\">V<\/span> <span class=\"o\">=<\/span> <span class=\"n\">U<\/span> <span class=\"o\">*<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">sqrt<\/span><span class=\"p\">(<\/span><span class=\"n\">d_sq<\/span><span class=\"p\">)<\/span><span class=\"o\">.<\/span><span class=\"n\">reshape<\/span><span class=\"p\">(<\/span><span class=\"o\">-<\/span><span class=\"mi\">1<\/span><span class=\"p\">,<\/span><span class=\"mi\">1<\/span><span class=\"p\">)<\/span>\n    <span class=\"n\">V<\/span><span class=\"p\">[<\/span><span class=\"mi\">0<\/span><span class=\"p\">,:]<\/span> <span class=\"o\">=<\/span> <span class=\"o\">-<\/span><span class=\"mf\">1.0<\/span>\n    <span class=\"n\">U<\/span> <span class=\"o\">\/=<\/span> <span class=\"n\">nla<\/span><span class=\"o\">.<\/span><span class=\"n\">norm<\/span><span class=\"p\">(<\/span><span class=\"n\">U<\/span><span class=\"p\">,<\/span> <span class=\"n\">axis<\/span><span class=\"o\">=<\/span><span class=\"mi\">0<\/span><span class=\"p\">)<\/span>\n    <span class=\"n\">V<\/span> <span class=\"o\">\/=<\/span> <span class=\"n\">nla<\/span><span class=\"o\">.<\/span><span class=\"n\">norm<\/span><span class=\"p\">(<\/span><span class=\"n\">V<\/span><span class=\"p\">,<\/span> <span class=\"n\">axis<\/span><span class=\"o\">=<\/span><span class=\"mi\">0<\/span><span class=\"p\">)<\/span>\n    <span class=\"k\">return<\/span> <span class=\"n\">U<\/span><span class=\"p\">,<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">sqrt<\/span><span class=\"p\">(<\/span><span class=\"n\">omega_sq<\/span><span class=\"p\">),<\/span> <span class=\"n\">V<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>For both problems, once we have the <span class=\"math inline\">\\(\\lambda\\)<\/span>s (or <span class=\"math inline\">\\(\\omega\\)<\/span>s), we can compute a temporary vector: <span class=\"math display\">\\[<br \/>\nt_k=\\sqrt{\\frac<br \/>\n{\\prod\\limits_{i=1}^{k-1}(d_k &#8211; {\\lambda}_i)  \\prod\\limits_{i=k}^{n}({\\lambda}_i &#8211; d_k)}<br \/>\n{\\prod\\limits_{i=1}^{k-1}(d_k &#8211; d_i)  \\prod\\limits_{i=k}^{n}(d_i &#8211; d_k)}}<br \/>\n= \\sqrt{\\left|<br \/>\n\\frac{\\prod\\limits_{i=1}^{n}d_k &#8211; {\\lambda}_i}<br \/>\n     {\\prod\\limits_{i \\neq k}d_k &#8211; d_i}<br \/>\n\\right|}<br \/>\n\\]<\/span> We get the second equality by noting that the effect of the partial products is to make sure each individual term is positive. If that is confusing you, remember that the <span class=\"math inline\">\\(d\\)<\/span> values are sorted in ascending order with the corresponding <span class=\"math inline\">\\({\\lambda}\\)<\/span> sit between them. So, we can simplify this by taking an absolute value at the end. Note that in the denominator, we also have to skip the term <span class=\"math inline\">\\(d_k &#8211; d_k = 0\\)<\/span> to avoid division by zero. In the code below, we compute the product term-by-term to prevent very large or small values from accumulating (if we wanted to be very careful, we could work with logarithms).<\/p>\n<p>In the eigen case, we then compute the <span class=\"math inline\">\\(k\\)<\/span> column vectors making up <span class=\"math inline\">\\(Q\\)<\/span>. For the SV case, <span class=\"math inline\">\\(U\\)<\/span> is identical except that we let <span class=\"math inline\">\\(\\text{sign}(t_i) = \\text{sign}(z_i)\\)<\/span> to get the directions of the vectors correct (where &quot;correct&quot; means doing the same thing LAPACK does). Note, the normalization (the <span class=\"math inline\">\\(\\|\\cdot\\|_2\\)<\/span> values in the denominators) happens after the results are returned from the <code>unifed<\/code> function. This is because the <span class=\"math inline\">\\(V\\)<\/span> calculation for the SV case needs the unnormalized vectors. At least I think so, I&#8217;m fairly certain I broke a testcase when I tried to normalize before calculating <span class=\"math inline\">\\(V\\)<\/span>).<\/p>\n<p><span class=\"math display\">\\[<br \/>\n\\begin{aligned}<br \/>\nq_k \\leftarrow<br \/>\n\\left(<br \/>\n\\frac{t_1}{d_1 &#8211; {\\lambda}_k},<br \/>\n\\dots,<br \/>\n\\frac{t_n}{d_n &#8211; {\\lambda}_k}<br \/>\n\\right)^T \\big\/ \\|q_k\\|_2<br \/>\n\\end{aligned}<br \/>\n\\qquad<br \/>\n\\begin{aligned}<br \/>\nu_k \\leftarrow<br \/>\n\\left(<br \/>\n\\frac{t_1}{d_1^2 &#8211; {\\omega}_k^2},<br \/>\n\\dots,<br \/>\n\\frac{t_n}{d_n^2 &#8211; {\\omega}_k^2}<br \/>\n\\right)^T \\big\/ \\|u_k\\|_2<br \/>\n\\end{aligned}<br \/>\n\\]<\/span><\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[&nbsp;]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span><\/span><span class=\"k\">def<\/span> <span class=\"nf\">unified<\/span><span class=\"p\">(<\/span><span class=\"n\">d<\/span><span class=\"p\">,<\/span><span class=\"n\">z<\/span><span class=\"p\">):<\/span>\n    <span class=\"n\">N<\/span> <span class=\"o\">=<\/span> <span class=\"n\">z<\/span><span class=\"o\">.<\/span><span class=\"n\">size<\/span>\n    <span class=\"c1\"># filling in column by column<\/span>\n    <span class=\"n\">lambdas<\/span><span class=\"p\">,<\/span> <span class=\"n\">d_m_lambda<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">empty<\/span><span class=\"p\">(<\/span><span class=\"n\">N<\/span><span class=\"p\">),<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">empty<\/span><span class=\"p\">((<\/span><span class=\"n\">N<\/span><span class=\"p\">,<\/span><span class=\"n\">N<\/span><span class=\"p\">))<\/span>\n    <span class=\"k\">for<\/span> <span class=\"n\">k<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">xrange<\/span><span class=\"p\">(<\/span><span class=\"n\">N<\/span><span class=\"p\">):<\/span>\n        <span class=\"n\">lambdas<\/span><span class=\"p\">[<\/span><span class=\"n\">k<\/span><span class=\"p\">],<\/span> <span class=\"n\">d_m_lambda<\/span><span class=\"p\">[:,<\/span><span class=\"n\">k<\/span><span class=\"p\">]<\/span> <span class=\"o\">=<\/span> <span class=\"n\">solve_secular_oneroot<\/span><span class=\"p\">(<\/span><span class=\"n\">d<\/span><span class=\"p\">,<\/span> <span class=\"n\">z<\/span><span class=\"p\">,<\/span> <span class=\"n\">k<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"n\">T<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">empty<\/span><span class=\"p\">(<\/span><span class=\"n\">N<\/span><span class=\"p\">)<\/span>\n    <span class=\"k\">for<\/span> <span class=\"n\">k<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">xrange<\/span><span class=\"p\">(<\/span><span class=\"n\">N<\/span><span class=\"p\">):<\/span>\n        <span class=\"n\">numers<\/span><span class=\"p\">,<\/span> <span class=\"n\">denoms<\/span> <span class=\"o\">=<\/span> <span class=\"n\">d_m_lambda<\/span><span class=\"p\">[<\/span><span class=\"n\">k<\/span><span class=\"p\">],<\/span> <span class=\"n\">d<\/span><span class=\"p\">[<\/span><span class=\"n\">k<\/span><span class=\"p\">]<\/span> <span class=\"o\">-<\/span> <span class=\"n\">d<\/span>\n        <span class=\"n\">lower_prod<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">prod<\/span><span class=\"p\">(<\/span><span class=\"n\">numers<\/span><span class=\"p\">[:<\/span><span class=\"n\">k<\/span><span class=\"p\">]<\/span> <span class=\"o\">\/<\/span> <span class=\"n\">denoms<\/span><span class=\"p\">[:<\/span><span class=\"n\">k<\/span><span class=\"p\">])<\/span>\n        <span class=\"n\">upper_prod<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">prod<\/span><span class=\"p\">(<\/span><span class=\"n\">numers<\/span><span class=\"p\">[<\/span><span class=\"n\">k<\/span><span class=\"p\">:<\/span><span class=\"o\">-<\/span><span class=\"mi\">1<\/span><span class=\"p\">]<\/span> <span class=\"o\">\/<\/span> <span class=\"n\">denoms<\/span><span class=\"p\">[<\/span><span class=\"n\">k<\/span><span class=\"o\">+<\/span><span class=\"mi\">1<\/span><span class=\"p\">:])<\/span>\n        <span class=\"n\">term<\/span> <span class=\"o\">=<\/span> <span class=\"n\">lower_prod<\/span> <span class=\"o\">*<\/span> <span class=\"n\">upper_prod<\/span> <span class=\"o\">*<\/span> <span class=\"n\">numers<\/span><span class=\"p\">[<\/span><span class=\"o\">-<\/span><span class=\"mi\">1<\/span><span class=\"p\">]<\/span>\n        <span class=\"n\">T<\/span><span class=\"p\">[<\/span><span class=\"n\">k<\/span><span class=\"p\">]<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">abs<\/span><span class=\"p\">(<\/span><span class=\"n\">term<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">sqrt<\/span><span class=\"p\">(<\/span><span class=\"n\">T<\/span><span class=\"p\">,<\/span> <span class=\"n\">out<\/span><span class=\"o\">=<\/span><span class=\"n\">T<\/span><span class=\"p\">)<\/span>\n    <span class=\"k\">with<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">errstate<\/span><span class=\"p\">(<\/span><span class=\"n\">invalid<\/span><span class=\"o\">=<\/span><span class=\"s1\">&#39;ignore&#39;<\/span><span class=\"p\">,<\/span> <span class=\"n\">divide<\/span><span class=\"o\">=<\/span><span class=\"s1\">&#39;warn&#39;<\/span><span class=\"p\">):<\/span>\n        <span class=\"c1\"># Q is Q for eigen, U for SVD<\/span>\n        <span class=\"n\">Q<\/span> <span class=\"o\">=<\/span> <span class=\"n\">T<\/span><span class=\"o\">.<\/span><span class=\"n\">reshape<\/span><span class=\"p\">(<\/span><span class=\"n\">N<\/span><span class=\"p\">,<\/span><span class=\"mi\">1<\/span><span class=\"p\">)<\/span> <span class=\"o\">\/<\/span> <span class=\"n\">d_m_lambda<\/span>  <span class=\"c1\"># dup Q column wise<\/span>\n        <span class=\"n\">Q<\/span><span class=\"p\">[<\/span><span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">logical_or<\/span><span class=\"p\">(<\/span><span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">isnan<\/span><span class=\"p\">(<\/span><span class=\"n\">Q<\/span><span class=\"p\">),<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">isinf<\/span><span class=\"p\">(<\/span><span class=\"n\">Q<\/span><span class=\"p\">))]<\/span> <span class=\"o\">=<\/span> <span class=\"mf\">1.0<\/span>\n    <span class=\"k\">return<\/span> <span class=\"n\">lambdas<\/span><span class=\"p\">,<\/span> <span class=\"n\">Q<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>Speaking of <span class=\"math inline\">\\(V\\)<\/span>, it is calculated in <code>outer_singval_singvec<\/code> above, but here&#8217;s the mathematical definition of the columns:<\/p>\n<p><span class=\"math display\">\\[<br \/>\nv_k \\leftarrow \\left(<br \/>\n-1.0,<br \/>\n\\frac{d_2 t_2}{d_2^2 &#8211; {\\omega}_k^2},<br \/>\n\\dots,<br \/>\n\\frac{d_n t_n}{d_n^2 &#8211; {\\omega}_k^2}<br \/>\n\\right)^T \\big\/ \\|v_k\\|_2<br \/>\n\\]<\/span><\/p>\n<p>The denominators in <span class=\"math inline\">\\(V\\)<\/span> are the same as in <span class=\"math inline\">\\(U\\)<\/span>. The numerators in <span class=\"math inline\">\\(V\\)<\/span> simply need an additional <span class=\"math inline\">\\(d\\)<\/span> term. So, we can basically copy <span class=\"math inline\">\\(U\\)<\/span> and update a few things. A quick note: we take the &quot;easy-coding&quot; way out of reusing <span class=\"math inline\">\\(U\\)<\/span>, calculating the first row, and then overwriting the first row with <span class=\"math inline\">\\(1\\)<\/span>s. If we were being completely memory conscious and receiving pre-allocated memory to return our answer in, we could just fill in the correct spaces once, without repeated work. I&#8217;m saving up a whole series of posts on how to do that &quot;one big memory allocation and make use of it in subcalls&quot;: this is one of the ways LAPACK stays lean-and-mean. But, we can do much the same thing in Python &#8211; if we&#8217;re careful and tricky. In that order, please!<\/p>\n<p>If you noticed, these cells don&#8217;t <em>do<\/em> anything other than define several functions. I do promise that I&#8217;ll get to test code &#8212; after we discuss deflation in the next post.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<h3 id=\"a-postscript\">A Postscript<\/h3>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>A quick note about the mathematics behind the scenes. When we perform these calculations, we have a given a matrix <span class=\"math inline\">\\(S\\)<\/span> that we start with. To simplify the discussion, I&#8217;ll talk about singular values, but the same applies to the eigen case. When we calculate the singular values of <span class=\"math inline\">\\(S\\)<\/span>, we are actually computing <em>approximations<\/em> (numerical approximations) to the singular values of <span class=\"math inline\">\\(S\\)<\/span>. The numerical approximation error here can be magnified in the calculations for <span class=\"math inline\">\\(U\\)<\/span> and <span class=\"math inline\">\\(V\\)<\/span> and can result in &quot;reasonable&quot; singular values but &quot;unreasonable&quot; singular vectors. Specifically, the singular vectors can lose their orthogonality. Doh!<\/p>\n<p>What do we do? The approximate singular values we calculcated can be used to reconstruct a new matrix <span class=\"math inline\">\\(\\hat{S}\\)<\/span>. We make <span class=\"math inline\">\\(\\hat{S}\\)<\/span> in such a way that the <em>approximate singular values of<\/em> <span class=\"math inline\">\\(S\\)<\/span> are the <em>exact singular values of<\/em> <span class=\"math inline\">\\(\\hat{S}\\)<\/span>. Then, the singular vectors we get by applying the formulas for <span class=\"math inline\">\\(U\\)<\/span> and <span class=\"math inline\">\\(V\\)<\/span> (really <span class=\"math inline\">\\(\\hat{U}\\)<\/span> and <span class=\"math inline\">\\(\\hat{V}\\)<\/span>) will be highly accurate. How does this help us? <span class=\"math inline\">\\(S\\)<\/span> and <span class=\"math inline\">\\(\\hat{S}\\)<\/span> are numerically close, as are the respective <span class=\"math inline\">\\(U\\)<\/span> and <span class=\"math inline\">\\(V\\)<\/span>. Phew. This postscript doesn&#8217;t do near enough justice to the mathematical sleuthery (new word alert!) that solved this problem in such an elegent way. In the &quot;old days&quot;, people had to hack together extra precision solutions to the singular values to create more accuracte singular vectors.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<h3>\nAdditional Resources<br \/>\n<\/h3>\n<p>\nYou can grab a <a href=\"http:\/\/drsfenner.org\/public\/notebooks\/DC_SVD_02_SingularAndEigenVectors.ipynb\">copy of this notebook<\/a>.\n<\/p>\n<p>\nEven better, you can <a href=\"http:\/\/nbviewer.ipython.org\/url\/drsfenner.org\/public\/notebooks\/DC_SVD_02_SingularAndEigenVectors.ipynb\">view it using nbviewer<\/a>.\n<\/p>\n<h3>\nLicense<br \/>\n<\/h3>\n<p>\nUnless otherwise noted, the contents of this notebook are under the following license. The code in the notebook should be considered part of the text (i.e., licensed and treated as as follows).\n<\/p>\n<p><a rel=\"license\" href=\"http:\/\/creativecommons.org\/licenses\/by-nc-sa\/4.0\/\"><img decoding=\"async\" alt=\"Creative Commons License\" style=\"border-width:0\" src=\"https:\/\/i.creativecommons.org\/l\/by-nc-sa\/4.0\/88x31.png\" \/><\/a><br \/><span xmlns:dct=\"http:\/\/purl.org\/dc\/terms\/\" property=\"dct:title\">DrsFenner.org Blog And Notebooks<\/span> by <a xmlns:cc=\"http:\/\/creativecommons.org\/ns#\" href=\"drsfenner.org\" property=\"cc:attributionName\" rel=\"cc:attributionURL\">Mark and Barbara Fenner<\/a> is licensed under a <a rel=\"license\" href=\"http:\/\/creativecommons.org\/licenses\/by-nc-sa\/4.0\/\">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License<\/a>.<br \/>Permissions beyond the scope of this license may be available at <a xmlns:cc=\"http:\/\/creativecommons.org\/ns#\" href=\"drsfenner.org\/blog\/about-and-contacts\" rel=\"cc:morePermissions\">drsfenner.org\/blog\/about-and-contacts<\/a>.<\/p>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>In our last installment, we discussed solutions to the secular equation. These solutions are the eigenvalues (and\/or) singular values of matrices with a particular form. Since this post is otherwise light on technical content, I&#8217;ll dive into those matrix forms now.<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6,7],"tags":[],"class_list":["post-458","post","type-post","status-publish","format-standard","hentry","category-mrdr","category-sci-math-stat-python"],"_links":{"self":[{"href":"https:\/\/drsfenner.org\/blog\/wp-json\/wp\/v2\/posts\/458","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/drsfenner.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/drsfenner.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/drsfenner.org\/blog\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/drsfenner.org\/blog\/wp-json\/wp\/v2\/comments?post=458"}],"version-history":[{"count":1,"href":"https:\/\/drsfenner.org\/blog\/wp-json\/wp\/v2\/posts\/458\/revisions"}],"predecessor-version":[{"id":459,"href":"https:\/\/drsfenner.org\/blog\/wp-json\/wp\/v2\/posts\/458\/revisions\/459"}],"wp:attachment":[{"href":"https:\/\/drsfenner.org\/blog\/wp-json\/wp\/v2\/media?parent=458"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/drsfenner.org\/blog\/wp-json\/wp\/v2\/categories?post=458"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/drsfenner.org\/blog\/wp-json\/wp\/v2\/tags?post=458"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}