{"id":430,"date":"2016-02-08T16:28:20","date_gmt":"2016-02-08T16:28:20","guid":{"rendered":"http:\/\/drsfenner.org\/blog\/?p=430"},"modified":"2016-02-08T16:29:59","modified_gmt":"2016-02-08T16:29:59","slug":"basic-cholesky-implementation","status":"publish","type":"post","link":"https:\/\/drsfenner.org\/blog\/2016\/02\/basic-cholesky-implementation\/","title":{"rendered":"Basic Cholesky Implementation"},"content":{"rendered":"<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>I <a href=\"http:\/\/drsfenner.org\/blog\/2015\/12\/into-the-weeds-i-lapack-lapacke-and-three-paths\/\">spent<\/a> a <a href=\"http:\/\/drsfenner.org\/blog\/2016\/01\/into-the-weeds-iii-hacking-lapack-calls-to-save-memory\/\">bunch<\/a> of <a href=\"http:\/\/drsfenner.org\/blog\/2015\/12\/into-the-weeds-ii-lapack-lapacke-and-two-paths-without-copies\/\">time<\/a> talking about using lower level libraries (LAPACK directly and via LAPACKE or hand wrappers). My next set of posts is going to look at manually implementing these techniques. Mostly, I wanted to spend somem time with the numerical algorithms. I last visited them about 15 years ago in graduate school. There are also several performance lessons along the way (demonstating the limits of a &quot;just compile it&quot; for optimization). For starters, we&#8217;ll implement the Cholesky decomposition which is very easy to describe mathematically. <!--more--><\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<h3 id=\"spin-up\">Spin Up<\/h3>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[2]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"kn\">from<\/span> <span class=\"nn\">math<\/span> <span class=\"kn\">import<\/span> <span class=\"n\">sqrt<\/span>\n<span class=\"kn\">import<\/span> <span class=\"nn\">numpy<\/span> <span class=\"kn\">as<\/span> <span class=\"nn\">np<\/span>\n<span class=\"kn\">import<\/span> <span class=\"nn\">numba<\/span>\n<span class=\"kn\">import<\/span> <span class=\"nn\">itertools<\/span> <span class=\"kn\">as<\/span> <span class=\"nn\">it<\/span>\n<span class=\"kn\">import<\/span> <span class=\"nn\">functools<\/span> <span class=\"kn\">as<\/span> <span class=\"nn\">ft<\/span>\n<span class=\"kn\">import<\/span> <span class=\"nn\">operator<\/span> <span class=\"kn\">as<\/span> <span class=\"nn\">op<\/span>\n<span class=\"kn\">import<\/span> <span class=\"nn\">timeit<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<h3 id=\"some-testing-helpers\">Some Testing Helpers<\/h3>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>I really need to factor these testing helpers out into a <code>testing_utils.py<\/code> module for all these posts. The problem with that is the posts are no longer self-contained for reading and running. *sigh* Nothing&#8217;s perfect. Turns out, there was enough &quot;testing&quot; related code that I <em>did<\/em> <a href=\"http:\/\/drsfenner.org\/blog\/2016\/02\/testing-a-cholesky-implementation\/\">factor it out into a standalone post<\/a>. In turn, I placed the python code from that post in <code>chol_test_post.py<\/code> and I&#8217;m going to <code>import<\/code> it here.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[3]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"kn\">from<\/span> <span class=\"nn\">chol_test_post<\/span> <span class=\"kn\">import<\/span> <span class=\"p\">(<\/span><span class=\"n\">time_routines<\/span><span class=\"p\">,<\/span> <span class=\"n\">same_results<\/span><span class=\"p\">,<\/span> <span class=\"n\">expand<\/span><span class=\"p\">,<\/span> <span class=\"n\">compose<\/span><span class=\"p\">,<\/span> \n                           <span class=\"n\">make_random_vecs_with_cond<\/span><span class=\"p\">,<\/span> <span class=\"n\">make_spd_opt<\/span><span class=\"p\">)<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<h3 id=\"basic-cholesky\">Basic Cholesky<\/h3>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>As I said, the mathematical description of Cholesky is pretty simple. For a symmetric, positive-definite matrix (so says Atkinson, Intro. to Numerical Analysis &#8212; I forget the corner case of positive-semidefinite), we can factor it: <span class=\"math inline\">\\(A \\rightarrow LL^T\\)<\/span> where <span class=\"math inline\">\\(L\\)<\/span> is a lower-triangular matrix. Sometimes, this is called a &quot;matrix square root&quot;. Element wise, the diagonal and off diagonal entries of <span class=\"math inline\">\\(L\\)<\/span> are:<\/p>\n<p><span class=\"math display\">\\[l_{kk} = \\sqrt{a_{kk} &#8211; \\sum^{k-1}_{j=1} l^2_{kj}}\\]<\/span> <span class=\"math display\">\\[l_{ik} = \\frac{1}{l_{kk}} \\left( a_{ik} &#8211; \\sum^{k-1}_{j=1} l_{ij} l_{kj} \\right)\\]<\/span><\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[4]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"k\">def<\/span> <span class=\"nf\">cholesky0<\/span><span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">):<\/span>\n    <span class=\"sd\">&quot;&quot;&quot;<\/span>\n<span class=\"sd\">       Performs a Cholesky decomposition of on symmetric, pos-def A.<\/span>\n<span class=\"sd\">       Returns lower-triangular L (full sized, zeroed above diag)<\/span>\n<span class=\"sd\">    &quot;&quot;&quot;<\/span>\n    <span class=\"n\">n<\/span> <span class=\"o\">=<\/span> <span class=\"n\">A<\/span><span class=\"o\">.<\/span><span class=\"n\">shape<\/span><span class=\"p\">[<\/span><span class=\"mi\">0<\/span><span class=\"p\">]<\/span>\n    <span class=\"n\">L<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">zeros_like<\/span><span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"c1\"># Perform the Cholesky decomposition<\/span>\n    <span class=\"k\">for<\/span> <span class=\"n\">row<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">xrange<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">for<\/span> <span class=\"n\">col<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">xrange<\/span><span class=\"p\">(<\/span><span class=\"n\">row<\/span><span class=\"o\">+<\/span><span class=\"mi\">1<\/span><span class=\"p\">):<\/span>\n            <span class=\"n\">tmp_sum<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">dot<\/span><span class=\"p\">(<\/span><span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,:<\/span><span class=\"n\">col<\/span><span class=\"p\">],<\/span> <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">col<\/span><span class=\"p\">,:<\/span><span class=\"n\">col<\/span><span class=\"p\">])<\/span>         \n            <span class=\"k\">if<\/span> <span class=\"p\">(<\/span><span class=\"n\">row<\/span> <span class=\"o\">==<\/span> <span class=\"n\">col<\/span><span class=\"p\">):<\/span> <span class=\"c1\"># Diagonal elements<\/span>\n                <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span> <span class=\"n\">col<\/span><span class=\"p\">]<\/span> <span class=\"o\">=<\/span> <span class=\"n\">sqrt<\/span><span class=\"p\">(<\/span><span class=\"nb\">max<\/span><span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">row<\/span><span class=\"p\">]<\/span> <span class=\"o\">-<\/span> <span class=\"n\">tmp_sum<\/span><span class=\"p\">,<\/span> <span class=\"mi\">0<\/span><span class=\"p\">))<\/span>\n            <span class=\"k\">else<\/span><span class=\"p\">:<\/span>\n                <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">col<\/span><span class=\"p\">]<\/span> <span class=\"o\">=<\/span> <span class=\"p\">(<\/span><span class=\"mf\">1.0<\/span> <span class=\"o\">\/<\/span> <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">col<\/span><span class=\"p\">,<\/span><span class=\"n\">col<\/span><span class=\"p\">])<\/span> <span class=\"o\">*<\/span> <span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">col<\/span><span class=\"p\">]<\/span> <span class=\"o\">-<\/span> <span class=\"n\">tmp_sum<\/span><span class=\"p\">)<\/span>\n    <span class=\"k\">return<\/span> <span class=\"n\">L<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>We&#8217;re doing as direct an implementation as possible. But, that allows us to call on <a href=\"https:\/\/continuum.io\">ContinuumIO&#8217;s<\/a> great <a href=\"http:\/\/numba.pydata.org\">numba<\/a> project to JIT-compile (just-in-time) our code. It helps (in some ways), as we&#8217;ll see below. I&#8217;ll throw a note out: I&#8217;m not a jit-jedi. More specifically, I&#8217;ve only really toyed around with <code>numba<\/code>, so nothing I say here is necessarily current (<code>numba<\/code> moves pretty fast in terms of features) or even best-practice.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[5]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"nd\">@numba.jit<\/span><span class=\"p\">(<\/span><span class=\"n\">numba<\/span><span class=\"o\">.<\/span><span class=\"n\">double<\/span><span class=\"p\">[:,:](<\/span><span class=\"n\">numba<\/span><span class=\"o\">.<\/span><span class=\"n\">double<\/span><span class=\"p\">[:,:]))<\/span>\n<span class=\"k\">def<\/span> <span class=\"nf\">cholesky1<\/span><span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">):<\/span>\n    <span class=\"sd\">&quot;&quot;&quot;<\/span>\n<span class=\"sd\">       Performs a Cholesky decomposition of on symmetric, pos-def A.<\/span>\n<span class=\"sd\">       Returns lower-triangular L (full sized, zeroed above diag)<\/span>\n<span class=\"sd\">    &quot;&quot;&quot;<\/span>\n    <span class=\"n\">n<\/span> <span class=\"o\">=<\/span> <span class=\"n\">A<\/span><span class=\"o\">.<\/span><span class=\"n\">shape<\/span><span class=\"p\">[<\/span><span class=\"mi\">0<\/span><span class=\"p\">]<\/span>\n    <span class=\"n\">L<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">zeros_like<\/span><span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"c1\"># Perform the Cholesky decomposition<\/span>\n    <span class=\"k\">for<\/span> <span class=\"n\">row<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">xrange<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">for<\/span> <span class=\"n\">col<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">xrange<\/span><span class=\"p\">(<\/span><span class=\"n\">row<\/span><span class=\"o\">+<\/span><span class=\"mi\">1<\/span><span class=\"p\">):<\/span>\n            <span class=\"n\">tmp_sum<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">dot<\/span><span class=\"p\">(<\/span><span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,:<\/span><span class=\"n\">col<\/span><span class=\"p\">],<\/span> <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">col<\/span><span class=\"p\">,:<\/span><span class=\"n\">col<\/span><span class=\"p\">])<\/span>         \n            <span class=\"k\">if<\/span> <span class=\"p\">(<\/span><span class=\"n\">row<\/span> <span class=\"o\">==<\/span> <span class=\"n\">col<\/span><span class=\"p\">):<\/span> <span class=\"c1\"># Diagonal elements<\/span>\n                <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span> <span class=\"n\">col<\/span><span class=\"p\">]<\/span> <span class=\"o\">=<\/span> <span class=\"n\">sqrt<\/span><span class=\"p\">(<\/span><span class=\"nb\">max<\/span><span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">row<\/span><span class=\"p\">]<\/span> <span class=\"o\">-<\/span> <span class=\"n\">tmp_sum<\/span><span class=\"p\">,<\/span> <span class=\"mi\">0<\/span><span class=\"p\">))<\/span>\n            <span class=\"k\">else<\/span><span class=\"p\">:<\/span>\n                <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">col<\/span><span class=\"p\">]<\/span> <span class=\"o\">=<\/span> <span class=\"p\">(<\/span><span class=\"mf\">1.0<\/span> <span class=\"o\">\/<\/span> <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">col<\/span><span class=\"p\">,<\/span><span class=\"n\">col<\/span><span class=\"p\">])<\/span> <span class=\"o\">*<\/span> <span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">col<\/span><span class=\"p\">]<\/span> <span class=\"o\">-<\/span> <span class=\"n\">tmp_sum<\/span><span class=\"p\">)<\/span>\n    <span class=\"k\">return<\/span> <span class=\"n\">L<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>I&#8217;m not a numba wizard, but it seemed that with my version, I had to eliminate <code>np.dot<\/code> to enable <code>numba<\/code>&#8216;s <code>nopython<\/code>-mode (full-speed-ahead-numba).<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[6]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"nd\">@numba.jit<\/span><span class=\"p\">(<\/span><span class=\"n\">numba<\/span><span class=\"o\">.<\/span><span class=\"n\">double<\/span><span class=\"p\">[:,:](<\/span><span class=\"n\">numba<\/span><span class=\"o\">.<\/span><span class=\"n\">double<\/span><span class=\"p\">[:,:]),<\/span> <span class=\"n\">nopython<\/span><span class=\"o\">=<\/span><span class=\"bp\">True<\/span><span class=\"p\">)<\/span>\n<span class=\"k\">def<\/span> <span class=\"nf\">cholesky2<\/span><span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">):<\/span>\n    <span class=\"sd\">&quot;&quot;&quot;<\/span>\n<span class=\"sd\">       Performs a Cholesky decomposition of on symmetric, pos-def A.<\/span>\n<span class=\"sd\">       Returns lower-triangular L (full sized, zeroed above diag)<\/span>\n<span class=\"sd\">    &quot;&quot;&quot;<\/span>\n    <span class=\"n\">n<\/span> <span class=\"o\">=<\/span> <span class=\"n\">A<\/span><span class=\"o\">.<\/span><span class=\"n\">shape<\/span><span class=\"p\">[<\/span><span class=\"mi\">0<\/span><span class=\"p\">]<\/span>\n    <span class=\"n\">L<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">zeros_like<\/span><span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"c1\"># Perform the Cholesky decomposition<\/span>\n    <span class=\"k\">for<\/span> <span class=\"n\">row<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">xrange<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">for<\/span> <span class=\"n\">col<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">xrange<\/span><span class=\"p\">(<\/span><span class=\"n\">row<\/span><span class=\"o\">+<\/span><span class=\"mi\">1<\/span><span class=\"p\">):<\/span>\n\n            <span class=\"c1\"># eliminating np.dot allows us to go numba nopython<\/span>\n            <span class=\"c1\"># but, the release notes for numba 0.23.0<\/span>\n            <span class=\"c1\"># reference:  https:\/\/github.com\/numba\/numba\/pull\/1560<\/span>\n            <span class=\"n\">tmp_sum<\/span> <span class=\"o\">=<\/span> <span class=\"mf\">0.0<\/span>\n            <span class=\"k\">for<\/span> <span class=\"n\">j<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">xrange<\/span><span class=\"p\">(<\/span><span class=\"n\">col<\/span><span class=\"p\">):<\/span>\n                <span class=\"n\">tmp_sum<\/span> <span class=\"o\">+=<\/span> <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">j<\/span><span class=\"p\">]<\/span> <span class=\"o\">*<\/span> <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">col<\/span><span class=\"p\">,<\/span><span class=\"n\">j<\/span><span class=\"p\">]<\/span>\n\n            <span class=\"k\">if<\/span> <span class=\"p\">(<\/span><span class=\"n\">row<\/span> <span class=\"o\">==<\/span> <span class=\"n\">col<\/span><span class=\"p\">):<\/span> \n                <span class=\"c1\"># diag elts.<\/span>\n                <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">col<\/span><span class=\"p\">]<\/span> <span class=\"o\">=<\/span> <span class=\"n\">sqrt<\/span><span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">row<\/span><span class=\"p\">]<\/span> <span class=\"o\">-<\/span> <span class=\"n\">tmp_sum<\/span><span class=\"p\">)<\/span>\n            <span class=\"k\">else<\/span><span class=\"p\">:<\/span>\n                <span class=\"c1\"># off-diag elts.<\/span>\n                <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">col<\/span><span class=\"p\">]<\/span> <span class=\"o\">=<\/span> <span class=\"p\">(<\/span><span class=\"mf\">1.0<\/span> <span class=\"o\">\/<\/span> <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">col<\/span><span class=\"p\">,<\/span><span class=\"n\">col<\/span><span class=\"p\">]<\/span> <span class=\"o\">*<\/span> <span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">col<\/span><span class=\"p\">]<\/span> <span class=\"o\">-<\/span> <span class=\"n\">tmp_sum<\/span><span class=\"p\">))<\/span>\n    <span class=\"k\">return<\/span> <span class=\"n\">L<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>One of the key rules of numerical linear algebra is that you gain performance when you don&#8217;t do work. Ha! Actually, that&#8217;s a general rule of life. But, here, it means that (1) when we know the structure of a matrix, then (2) we take advantage of that structure to do less work. Practically speaking, instead of working with a zero-ed starting matrix, we&#8217;ll start with raw (arbitrary) memory, and only use values that we know have been set properly. We&#8217;ll zero out the super-diagonal pieces separately.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[7]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"nd\">@numba.jit<\/span><span class=\"p\">(<\/span><span class=\"n\">numba<\/span><span class=\"o\">.<\/span><span class=\"n\">double<\/span><span class=\"p\">[:,:](<\/span><span class=\"n\">numba<\/span><span class=\"o\">.<\/span><span class=\"n\">double<\/span><span class=\"p\">[:,:]),<\/span> <span class=\"n\">nopython<\/span><span class=\"o\">=<\/span><span class=\"bp\">True<\/span><span class=\"p\">)<\/span>\n<span class=\"k\">def<\/span> <span class=\"nf\">cholesky3<\/span><span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">):<\/span>\n    <span class=\"sd\">&quot;&quot;&quot;<\/span>\n<span class=\"sd\">       Performs a Cholesky decomposition of on symmetric, pos-def A.<\/span>\n<span class=\"sd\">       Returns lower-triangular L (full sized, zeroed above diag)<\/span>\n<span class=\"sd\">    &quot;&quot;&quot;<\/span>\n    <span class=\"n\">n<\/span> <span class=\"o\">=<\/span> <span class=\"n\">A<\/span><span class=\"o\">.<\/span><span class=\"n\">shape<\/span><span class=\"p\">[<\/span><span class=\"mi\">0<\/span><span class=\"p\">]<\/span>\n    <span class=\"n\">L<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">empty_like<\/span><span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">)<\/span>\n\n    <span class=\"c1\"># Perform the Cholesky decomposition<\/span>\n    <span class=\"k\">for<\/span> <span class=\"n\">row<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">xrange<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">for<\/span> <span class=\"n\">col<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">xrange<\/span><span class=\"p\">(<\/span><span class=\"n\">row<\/span><span class=\"o\">+<\/span><span class=\"mi\">1<\/span><span class=\"p\">):<\/span>\n            <span class=\"n\">tmp_sum<\/span> <span class=\"o\">=<\/span> <span class=\"mf\">0.0<\/span>\n            <span class=\"k\">for<\/span> <span class=\"n\">j<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">xrange<\/span><span class=\"p\">(<\/span><span class=\"n\">col<\/span><span class=\"p\">):<\/span>\n                <span class=\"n\">tmp_sum<\/span> <span class=\"o\">+=<\/span> <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">j<\/span><span class=\"p\">]<\/span> <span class=\"o\">*<\/span> <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">col<\/span><span class=\"p\">,<\/span><span class=\"n\">j<\/span><span class=\"p\">]<\/span>\n            \n            <span class=\"k\">if<\/span> <span class=\"p\">(<\/span><span class=\"n\">row<\/span> <span class=\"o\">==<\/span> <span class=\"n\">col<\/span><span class=\"p\">):<\/span> \n                <span class=\"c1\"># diag elts.<\/span>\n                <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">col<\/span><span class=\"p\">]<\/span> <span class=\"o\">=<\/span> <span class=\"n\">sqrt<\/span><span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">row<\/span><span class=\"p\">]<\/span> <span class=\"o\">-<\/span> <span class=\"n\">tmp_sum<\/span><span class=\"p\">)<\/span>\n            <span class=\"k\">else<\/span><span class=\"p\">:<\/span>\n                <span class=\"c1\"># off diag elts.<\/span>\n                <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">col<\/span><span class=\"p\">]<\/span> <span class=\"o\">=<\/span> <span class=\"p\">(<\/span><span class=\"mf\">1.0<\/span> <span class=\"o\">\/<\/span> <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">col<\/span><span class=\"p\">,<\/span><span class=\"n\">col<\/span><span class=\"p\">]<\/span> <span class=\"o\">*<\/span> <span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">col<\/span><span class=\"p\">]<\/span> <span class=\"o\">-<\/span> <span class=\"n\">tmp_sum<\/span><span class=\"p\">))<\/span>\n        <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span> <span class=\"n\">row<\/span><span class=\"o\">+<\/span><span class=\"mi\">1<\/span><span class=\"p\">:]<\/span> <span class=\"o\">=<\/span> <span class=\"mf\">0.0<\/span>\n    <span class=\"k\">return<\/span> <span class=\"n\">L<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<h3 id=\"packed-storage-calculations\">Packed Storage Calculations<\/h3>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>Turns out, that we really were only scratching the surface of respecting the structure of the matrix. We were still using a <span class=\"math inline\">\\(n^2\\)<\/span> piece of memory for a triangular matrix when a <span class=\"math inline\">\\(\\frac{n(n+1)}{2}\\)<\/span> piece of memory will do. Sure, it&#8217;s still <span class=\"math inline\">\\(O(n^2)\\)<\/span>, but savings is savings. And saving 50% is a good thing. We&#8217;ll use a long, flat array to store each row of the triangular matrix. To help us out, we&#8217;ll definte some helpers to convert from row-and-columns &quot;full&quot; indices to a &quot;straightline&quot; packed index.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[8]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"c1\">#if we were column packing<\/span>\n<span class=\"c1\">#@numba.jit(numba.uint(numba.uint, numba.uint, numba.uint), nopython=True)<\/span>\n<span class=\"c1\">#def col_packed_lt(r,c,n):<\/span>\n<span class=\"c1\">#    return r + (c * (-c + 2*n + 1) * .5)<\/span>\n\n<span class=\"c1\"># this would be row-packed upper triangular<\/span>\n<span class=\"c1\">#@numba.jit(numba.uint(numba.uint, numba.uint, numba.uint), nopython=True)<\/span>\n<span class=\"c1\">#def row_packed(r,c,n):<\/span>\n<span class=\"c1\">#    return col_packed(c,r,n)<\/span>\n\n<span class=\"nd\">@numba.jit<\/span><span class=\"p\">(<\/span><span class=\"n\">numba<\/span><span class=\"o\">.<\/span><span class=\"n\">uint<\/span><span class=\"p\">(<\/span><span class=\"n\">numba<\/span><span class=\"o\">.<\/span><span class=\"n\">uint<\/span><span class=\"p\">,<\/span> <span class=\"n\">numba<\/span><span class=\"o\">.<\/span><span class=\"n\">uint<\/span><span class=\"p\">,<\/span> <span class=\"n\">numba<\/span><span class=\"o\">.<\/span><span class=\"n\">uint<\/span><span class=\"p\">),<\/span> <span class=\"n\">nopython<\/span><span class=\"o\">=<\/span><span class=\"bp\">True<\/span><span class=\"p\">)<\/span>\n<span class=\"k\">def<\/span> <span class=\"nf\">row_packed_lt<\/span><span class=\"p\">(<\/span><span class=\"n\">r<\/span><span class=\"p\">,<\/span><span class=\"n\">c<\/span><span class=\"p\">,<\/span><span class=\"n\">n<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">return<\/span> <span class=\"n\">c<\/span> <span class=\"o\">+<\/span> <span class=\"p\">(<\/span><span class=\"n\">r<\/span> <span class=\"o\">*<\/span> <span class=\"p\">(<\/span><span class=\"n\">r<\/span><span class=\"o\">+<\/span><span class=\"mi\">1<\/span><span class=\"p\">))<\/span> <span class=\"o\">*<\/span> <span class=\"o\">.<\/span><span class=\"mi\">5<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>And, for testing purposes, we may want to expand our packed form back to a full, normal form.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[9]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"k\">def<\/span> <span class=\"nf\">expand_row_packed<\/span><span class=\"p\">(<\/span><span class=\"n\">L<\/span><span class=\"p\">,<\/span> <span class=\"n\">n<\/span><span class=\"p\">,<\/span> <span class=\"n\">cols<\/span><span class=\"o\">=<\/span><span class=\"bp\">None<\/span><span class=\"p\">):<\/span>\n    <span class=\"n\">E<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">empty<\/span><span class=\"p\">((<\/span><span class=\"n\">n<\/span><span class=\"p\">,<\/span><span class=\"n\">cols<\/span><span class=\"p\">))<\/span> <span class=\"k\">if<\/span> <span class=\"n\">cols<\/span> <span class=\"k\">else<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">empty<\/span><span class=\"p\">((<\/span><span class=\"n\">n<\/span><span class=\"p\">,<\/span><span class=\"n\">n<\/span><span class=\"p\">))<\/span>\n    <span class=\"n\">E<\/span><span class=\"p\">[<\/span><span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">tril_indices<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"p\">,<\/span>  <span class=\"n\">m<\/span><span class=\"o\">=<\/span><span class=\"n\">cols<\/span><span class=\"p\">)]<\/span> <span class=\"o\">=<\/span> <span class=\"n\">L<\/span>\n    <span class=\"n\">E<\/span><span class=\"p\">[<\/span><span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">triu_indices<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"p\">,<\/span><span class=\"mi\">1<\/span><span class=\"p\">,<\/span><span class=\"n\">m<\/span><span class=\"o\">=<\/span><span class=\"n\">cols<\/span><span class=\"p\">)]<\/span> <span class=\"o\">=<\/span> <span class=\"mf\">0.0<\/span>\n    <span class=\"k\">return<\/span> <span class=\"n\">E<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>Here&#8217;s a conversion of <code>[0, 1, 2, 3, 4, 5]<\/code> interpreted as a 3&#215;3 lower triangular matrix:<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[10]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"n\">L<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">arange<\/span><span class=\"p\">(<\/span><span class=\"mi\">6<\/span><span class=\"p\">)<\/span>\n<span class=\"n\">expand_row_packed<\/span><span class=\"p\">(<\/span><span class=\"n\">L<\/span><span class=\"p\">,<\/span> <span class=\"mi\">3<\/span><span class=\"p\">)<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"output_wrapper\">\n<div class=\"output\">\n<div class=\"output_area\">\n<div class=\"prompt output_prompt\">Out[10]:<\/div>\n<div class=\"output_text output_subarea output_execute_result\">\n<pre>array([[ 0.,  0.,  0.],\n       [ 1.,  2.,  0.],\n       [ 3.,  4.,  5.]])<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>The elements in column 0 of a lower-triangular matrix stored in row-packed form will be at:<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[11]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"n\">N<\/span> <span class=\"o\">=<\/span> <span class=\"mi\">3<\/span>\n<span class=\"k\">for<\/span> <span class=\"n\">i<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">range<\/span><span class=\"p\">(<\/span><span class=\"n\">N<\/span><span class=\"p\">):<\/span>\n    <span class=\"k\">print<\/span> <span class=\"n\">row_packed_lt<\/span><span class=\"p\">(<\/span><span class=\"n\">i<\/span><span class=\"p\">,<\/span> <span class=\"mi\">0<\/span><span class=\"p\">,<\/span> <span class=\"n\">N<\/span><span class=\"p\">)<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"output_wrapper\">\n<div class=\"output\">\n<div class=\"output_area\">\n<div class=\"prompt\"><\/div>\n<div class=\"output_subarea output_stream output_stdout output_text\">\n<pre>0\n1\n3\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<h3 id=\"packed-cholesky-routines\">Packed Cholesky Routines<\/h3>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[12]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"nd\">@numba.jit<\/span><span class=\"p\">(<\/span><span class=\"n\">numba<\/span><span class=\"o\">.<\/span><span class=\"n\">double<\/span><span class=\"p\">[:](<\/span><span class=\"n\">numba<\/span><span class=\"o\">.<\/span><span class=\"n\">double<\/span><span class=\"p\">[:,:]),<\/span> <span class=\"n\">nopython<\/span><span class=\"o\">=<\/span><span class=\"bp\">True<\/span><span class=\"p\">)<\/span>\n<span class=\"k\">def<\/span> <span class=\"nf\">cholesky_packed1<\/span><span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">):<\/span>\n    <span class=\"sd\">&quot;&quot;&quot;<\/span>\n<span class=\"sd\">       Performs a Cholesky decomposition of on symmetric, pos-def A.<\/span>\n<span class=\"sd\">       Returns lower-triangular L packed (row-wise) in a vector<\/span>\n<span class=\"sd\">    &quot;&quot;&quot;<\/span>\n    <span class=\"n\">n<\/span> <span class=\"o\">=<\/span> <span class=\"n\">A<\/span><span class=\"o\">.<\/span><span class=\"n\">shape<\/span><span class=\"p\">[<\/span><span class=\"mi\">0<\/span><span class=\"p\">]<\/span>\n    <span class=\"n\">numElts<\/span> <span class=\"o\">=<\/span> <span class=\"n\">n<\/span> <span class=\"o\">*<\/span> <span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"o\">+<\/span><span class=\"mi\">1<\/span><span class=\"p\">)<\/span> <span class=\"o\">\/<\/span> <span class=\"mi\">2<\/span>\n    <span class=\"n\">L<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">empty<\/span><span class=\"p\">(<\/span><span class=\"n\">numElts<\/span><span class=\"p\">)<\/span>\n    \n    <span class=\"k\">for<\/span> <span class=\"n\">row<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">xrange<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">for<\/span> <span class=\"n\">col<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">xrange<\/span><span class=\"p\">(<\/span><span class=\"n\">row<\/span><span class=\"o\">+<\/span><span class=\"mi\">1<\/span><span class=\"p\">):<\/span>\n            <span class=\"k\">if<\/span> <span class=\"p\">(<\/span><span class=\"n\">row<\/span> <span class=\"o\">==<\/span> <span class=\"n\">col<\/span><span class=\"p\">):<\/span> \n                <span class=\"c1\"># diag elts<\/span>\n                <span class=\"n\">tmp_sum<\/span> <span class=\"o\">=<\/span> <span class=\"mf\">0.0<\/span>\n                <span class=\"n\">base<\/span> <span class=\"o\">=<\/span> <span class=\"n\">row_packed_lt<\/span><span class=\"p\">(<\/span><span class=\"n\">col<\/span><span class=\"p\">,<\/span> <span class=\"mi\">0<\/span><span class=\"p\">,<\/span> <span class=\"n\">n<\/span><span class=\"p\">)<\/span>\n                <span class=\"k\">for<\/span> <span class=\"n\">j<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">xrange<\/span><span class=\"p\">(<\/span><span class=\"n\">col<\/span><span class=\"p\">):<\/span>\n                    <span class=\"n\">tmp_sum<\/span> <span class=\"o\">+=<\/span> <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">base<\/span> <span class=\"o\">+<\/span> <span class=\"n\">j<\/span><span class=\"p\">]<\/span> <span class=\"o\">**<\/span> <span class=\"mi\">2<\/span>\n\n                <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">row_packed_lt<\/span><span class=\"p\">(<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">col<\/span><span class=\"p\">,<\/span><span class=\"n\">n<\/span><span class=\"p\">)]<\/span> <span class=\"o\">=<\/span> <span class=\"n\">sqrt<\/span><span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">row<\/span><span class=\"p\">]<\/span> <span class=\"o\">-<\/span> <span class=\"n\">tmp_sum<\/span><span class=\"p\">)<\/span>\n            <span class=\"k\">else<\/span><span class=\"p\">:<\/span>\n                <span class=\"c1\"># off diag elts<\/span>\n                <span class=\"n\">tmp_sum<\/span> <span class=\"o\">=<\/span> <span class=\"mf\">0.0<\/span>\n                <span class=\"k\">for<\/span> <span class=\"n\">j<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">xrange<\/span><span class=\"p\">(<\/span><span class=\"n\">col<\/span><span class=\"p\">):<\/span>\n                    <span class=\"n\">tmp_sum<\/span> <span class=\"o\">+=<\/span> <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">row_packed_lt<\/span><span class=\"p\">(<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span> <span class=\"mi\">0<\/span><span class=\"p\">,<\/span> <span class=\"n\">n<\/span><span class=\"p\">)<\/span> <span class=\"o\">+<\/span> <span class=\"n\">j<\/span><span class=\"p\">]<\/span> <span class=\"o\">*<\/span> <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">row_packed_lt<\/span><span class=\"p\">(<\/span><span class=\"n\">col<\/span><span class=\"p\">,<\/span> <span class=\"mi\">0<\/span><span class=\"p\">,<\/span> <span class=\"n\">n<\/span><span class=\"p\">)<\/span> <span class=\"o\">+<\/span> <span class=\"n\">j<\/span><span class=\"p\">]<\/span>\n                <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">row_packed_lt<\/span><span class=\"p\">(<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">col<\/span><span class=\"p\">,<\/span><span class=\"n\">n<\/span><span class=\"p\">)]<\/span> <span class=\"o\">=<\/span> <span class=\"p\">((<\/span><span class=\"mf\">1.0<\/span> <span class=\"o\">\/<\/span> <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">row_packed_lt<\/span><span class=\"p\">(<\/span><span class=\"n\">col<\/span><span class=\"p\">,<\/span><span class=\"n\">col<\/span><span class=\"p\">,<\/span><span class=\"n\">n<\/span><span class=\"p\">)])<\/span> <span class=\"o\">*<\/span> \n                                               <span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">][<\/span><span class=\"n\">col<\/span><span class=\"p\">]<\/span> <span class=\"o\">-<\/span> <span class=\"n\">tmp_sum<\/span><span class=\"p\">))<\/span>\n    <span class=\"k\">return<\/span> <span class=\"n\">L<\/span>    \n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>And now, let&#8217;s factor out some come computations:<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[13]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"nd\">@numba.jit<\/span><span class=\"p\">(<\/span><span class=\"n\">numba<\/span><span class=\"o\">.<\/span><span class=\"n\">double<\/span><span class=\"p\">[:](<\/span><span class=\"n\">numba<\/span><span class=\"o\">.<\/span><span class=\"n\">double<\/span><span class=\"p\">[:,:]),<\/span> <span class=\"n\">nopython<\/span><span class=\"o\">=<\/span><span class=\"bp\">True<\/span><span class=\"p\">)<\/span>\n<span class=\"k\">def<\/span> <span class=\"nf\">cholesky_packed2<\/span><span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">):<\/span>\n    <span class=\"sd\">&quot;&quot;&quot;<\/span>\n<span class=\"sd\">       Performs a Cholesky decomposition of on symmetric, pos-def A.<\/span>\n<span class=\"sd\">       Returns lower-triangular L packed (row-wise) in a vector<\/span>\n<span class=\"sd\">    &quot;&quot;&quot;<\/span>\n    <span class=\"n\">n<\/span> <span class=\"o\">=<\/span> <span class=\"n\">A<\/span><span class=\"o\">.<\/span><span class=\"n\">shape<\/span><span class=\"p\">[<\/span><span class=\"mi\">0<\/span><span class=\"p\">]<\/span>\n    <span class=\"n\">numElts<\/span> <span class=\"o\">=<\/span> <span class=\"n\">n<\/span> <span class=\"o\">*<\/span> <span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"o\">+<\/span><span class=\"mi\">1<\/span><span class=\"p\">)<\/span> <span class=\"o\">\/<\/span> <span class=\"mi\">2<\/span>\n    <span class=\"n\">L<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">empty<\/span><span class=\"p\">(<\/span><span class=\"n\">numElts<\/span><span class=\"p\">)<\/span>\n    \n    <span class=\"k\">for<\/span> <span class=\"n\">row<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">xrange<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">for<\/span> <span class=\"n\">col<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">xrange<\/span><span class=\"p\">(<\/span><span class=\"n\">row<\/span><span class=\"o\">+<\/span><span class=\"mi\">1<\/span><span class=\"p\">):<\/span>\n            <span class=\"k\">if<\/span> <span class=\"p\">(<\/span><span class=\"n\">row<\/span> <span class=\"o\">==<\/span> <span class=\"n\">col<\/span><span class=\"p\">):<\/span> <span class=\"c1\"># Diagonal elements<\/span>\n                <span class=\"n\">tmp_sum<\/span> <span class=\"o\">=<\/span> <span class=\"mf\">0.0<\/span>\n                <span class=\"n\">base<\/span> <span class=\"o\">=<\/span> <span class=\"n\">row_packed_lt<\/span><span class=\"p\">(<\/span><span class=\"n\">col<\/span><span class=\"p\">,<\/span> <span class=\"mi\">0<\/span><span class=\"p\">,<\/span> <span class=\"n\">n<\/span><span class=\"p\">)<\/span>\n                <span class=\"k\">for<\/span> <span class=\"n\">j<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">xrange<\/span><span class=\"p\">(<\/span><span class=\"n\">col<\/span><span class=\"p\">):<\/span>\n                    <span class=\"n\">tmp_sum<\/span> <span class=\"o\">+=<\/span> <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">base<\/span> <span class=\"o\">+<\/span> <span class=\"n\">j<\/span><span class=\"p\">]<\/span> <span class=\"o\">**<\/span> <span class=\"mi\">2<\/span>\n                <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">row_packed_lt<\/span><span class=\"p\">(<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">col<\/span><span class=\"p\">,<\/span><span class=\"n\">n<\/span><span class=\"p\">)]<\/span> <span class=\"o\">=<\/span> <span class=\"n\">sqrt<\/span><span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">row<\/span><span class=\"p\">]<\/span> <span class=\"o\">-<\/span> <span class=\"n\">tmp_sum<\/span><span class=\"p\">)<\/span>\n            <span class=\"k\">else<\/span><span class=\"p\">:<\/span>\n                <span class=\"n\">tmp_sum<\/span> <span class=\"o\">=<\/span> <span class=\"mf\">0.0<\/span>\n                <span class=\"n\">base1<\/span> <span class=\"o\">=<\/span> <span class=\"n\">row_packed_lt<\/span><span class=\"p\">(<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span> <span class=\"mi\">0<\/span><span class=\"p\">,<\/span> <span class=\"n\">n<\/span><span class=\"p\">)<\/span>\n                <span class=\"n\">base2<\/span> <span class=\"o\">=<\/span> <span class=\"n\">row_packed_lt<\/span><span class=\"p\">(<\/span><span class=\"n\">col<\/span><span class=\"p\">,<\/span> <span class=\"mi\">0<\/span><span class=\"p\">,<\/span> <span class=\"n\">n<\/span><span class=\"p\">)<\/span>\n                <span class=\"k\">for<\/span> <span class=\"n\">j<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">xrange<\/span><span class=\"p\">(<\/span><span class=\"n\">col<\/span><span class=\"p\">):<\/span>\n                    <span class=\"n\">tmp_sum<\/span> <span class=\"o\">+=<\/span> <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">base1<\/span> <span class=\"o\">+<\/span> <span class=\"n\">j<\/span><span class=\"p\">]<\/span> <span class=\"o\">*<\/span> <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">base2<\/span> <span class=\"o\">+<\/span> <span class=\"n\">j<\/span><span class=\"p\">]<\/span>\n                <span class=\"c1\">#+= base1 and base2 here *crushed* the timing<\/span>\n                <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">base1<\/span><span class=\"o\">+<\/span><span class=\"n\">col<\/span><span class=\"p\">]<\/span> <span class=\"o\">=<\/span> <span class=\"p\">((<\/span><span class=\"mf\">1.0<\/span> <span class=\"o\">\/<\/span> <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">base2<\/span><span class=\"o\">+<\/span><span class=\"n\">col<\/span><span class=\"p\">])<\/span> <span class=\"o\">*<\/span> <span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">][<\/span><span class=\"n\">col<\/span><span class=\"p\">]<\/span> <span class=\"o\">-<\/span> <span class=\"n\">tmp_sum<\/span><span class=\"p\">))<\/span>\n    <span class=\"k\">return<\/span> <span class=\"n\">L<\/span>    \n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[14]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"nd\">@numba.jit<\/span><span class=\"p\">(<\/span><span class=\"n\">numba<\/span><span class=\"o\">.<\/span><span class=\"n\">double<\/span><span class=\"p\">[:](<\/span><span class=\"n\">numba<\/span><span class=\"o\">.<\/span><span class=\"n\">double<\/span><span class=\"p\">[:,:]),<\/span> <span class=\"n\">nopython<\/span><span class=\"o\">=<\/span><span class=\"bp\">True<\/span><span class=\"p\">)<\/span>\n<span class=\"k\">def<\/span> <span class=\"nf\">cholesky_packed3<\/span><span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">):<\/span>\n    <span class=\"sd\">&quot;&quot;&quot;<\/span>\n<span class=\"sd\">       Performs a Cholesky decomposition of on symmetric, pos-def A.<\/span>\n<span class=\"sd\">       Returns lower-triangular L packed (row-wise) in a vector<\/span>\n<span class=\"sd\">    &quot;&quot;&quot;<\/span>\n    <span class=\"n\">n<\/span> <span class=\"o\">=<\/span> <span class=\"n\">A<\/span><span class=\"o\">.<\/span><span class=\"n\">shape<\/span><span class=\"p\">[<\/span><span class=\"mi\">0<\/span><span class=\"p\">]<\/span>\n    <span class=\"n\">numElts<\/span> <span class=\"o\">=<\/span> <span class=\"n\">n<\/span> <span class=\"o\">*<\/span> <span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"o\">+<\/span><span class=\"mi\">1<\/span><span class=\"p\">)<\/span> <span class=\"o\">\/<\/span> <span class=\"mi\">2<\/span>\n    <span class=\"n\">L<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">empty<\/span><span class=\"p\">(<\/span><span class=\"n\">numElts<\/span><span class=\"p\">)<\/span>\n    \n    <span class=\"k\">for<\/span> <span class=\"n\">row<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">xrange<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"p\">):<\/span>\n        <span class=\"k\">for<\/span> <span class=\"n\">col<\/span> <span class=\"ow\">in<\/span> <span class=\"nb\">xrange<\/span><span class=\"p\">(<\/span><span class=\"n\">row<\/span><span class=\"o\">+<\/span><span class=\"mi\">1<\/span><span class=\"p\">):<\/span>\n            <span class=\"k\">if<\/span> <span class=\"p\">(<\/span><span class=\"n\">row<\/span> <span class=\"o\">==<\/span> <span class=\"n\">col<\/span><span class=\"p\">):<\/span> <span class=\"c1\"># Diagonal elements<\/span>\n                <span class=\"n\">base<\/span> <span class=\"o\">=<\/span> <span class=\"n\">row_packed_lt<\/span><span class=\"p\">(<\/span><span class=\"n\">col<\/span><span class=\"p\">,<\/span> <span class=\"mi\">0<\/span><span class=\"p\">,<\/span> <span class=\"n\">n<\/span><span class=\"p\">)<\/span>\n                <span class=\"n\">tmp_sum<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">sum<\/span><span class=\"p\">(<\/span><span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">base<\/span><span class=\"p\">:<\/span><span class=\"n\">base<\/span><span class=\"o\">+<\/span><span class=\"n\">col<\/span><span class=\"p\">]<\/span><span class=\"o\">**<\/span><span class=\"mi\">2<\/span><span class=\"p\">)<\/span>\n                <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">row_packed_lt<\/span><span class=\"p\">(<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">col<\/span><span class=\"p\">,<\/span><span class=\"n\">n<\/span><span class=\"p\">)]<\/span> <span class=\"o\">=<\/span> <span class=\"n\">sqrt<\/span><span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span><span class=\"n\">row<\/span><span class=\"p\">]<\/span> <span class=\"o\">-<\/span> <span class=\"n\">tmp_sum<\/span><span class=\"p\">)<\/span>\n            <span class=\"k\">else<\/span><span class=\"p\">:<\/span>\n                <span class=\"n\">base1<\/span> <span class=\"o\">=<\/span> <span class=\"n\">row_packed_lt<\/span><span class=\"p\">(<\/span><span class=\"n\">row<\/span><span class=\"p\">,<\/span> <span class=\"mi\">0<\/span><span class=\"p\">,<\/span> <span class=\"n\">n<\/span><span class=\"p\">)<\/span>\n                <span class=\"n\">base2<\/span> <span class=\"o\">=<\/span> <span class=\"n\">row_packed_lt<\/span><span class=\"p\">(<\/span><span class=\"n\">col<\/span><span class=\"p\">,<\/span> <span class=\"mi\">0<\/span><span class=\"p\">,<\/span> <span class=\"n\">n<\/span><span class=\"p\">)<\/span>\n                <span class=\"n\">b1p<\/span> <span class=\"o\">=<\/span> <span class=\"n\">base1<\/span> <span class=\"o\">+<\/span> <span class=\"n\">col<\/span>\n                <span class=\"n\">b2p<\/span> <span class=\"o\">=<\/span> <span class=\"n\">base2<\/span> <span class=\"o\">+<\/span> <span class=\"n\">col<\/span>\n                <span class=\"n\">tmp_sum<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">sum<\/span><span class=\"p\">(<\/span><span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">base1<\/span><span class=\"p\">:<\/span><span class=\"n\">b1p<\/span><span class=\"p\">]<\/span> <span class=\"o\">*<\/span> <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">base2<\/span><span class=\"p\">:<\/span><span class=\"n\">b2p<\/span><span class=\"p\">])<\/span>\n                <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">b1p<\/span><span class=\"p\">]<\/span> <span class=\"o\">=<\/span> <span class=\"p\">((<\/span><span class=\"mf\">1.0<\/span> <span class=\"o\">\/<\/span> <span class=\"n\">L<\/span><span class=\"p\">[<\/span><span class=\"n\">b2p<\/span><span class=\"p\">])<\/span> <span class=\"o\">*<\/span> <span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">[<\/span><span class=\"n\">row<\/span><span class=\"p\">][<\/span><span class=\"n\">col<\/span><span class=\"p\">]<\/span> <span class=\"o\">-<\/span> <span class=\"n\">tmp_sum<\/span><span class=\"p\">))<\/span>\n    <span class=\"k\">return<\/span> <span class=\"n\">L<\/span>    \n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<h3 id=\"testing-correctness\">Testing (Correctness)<\/h3>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[15]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"n\">A<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">array<\/span><span class=\"p\">([[<\/span><span class=\"mi\">6<\/span><span class=\"p\">,<\/span> <span class=\"mi\">3<\/span><span class=\"p\">,<\/span> <span class=\"mi\">4<\/span><span class=\"p\">,<\/span> <span class=\"mi\">8<\/span><span class=\"p\">],<\/span> \n              <span class=\"p\">[<\/span><span class=\"mi\">3<\/span><span class=\"p\">,<\/span> <span class=\"mi\">6<\/span><span class=\"p\">,<\/span> <span class=\"mi\">5<\/span><span class=\"p\">,<\/span> <span class=\"mi\">1<\/span><span class=\"p\">],<\/span> \n              <span class=\"p\">[<\/span><span class=\"mi\">4<\/span><span class=\"p\">,<\/span> <span class=\"mi\">5<\/span><span class=\"p\">,<\/span> <span class=\"mi\">10<\/span><span class=\"p\">,<\/span> <span class=\"mi\">7<\/span><span class=\"p\">],<\/span> \n              <span class=\"p\">[<\/span><span class=\"mi\">8<\/span><span class=\"p\">,<\/span> <span class=\"mi\">1<\/span><span class=\"p\">,<\/span> <span class=\"mi\">7<\/span><span class=\"p\">,<\/span> <span class=\"mi\">25<\/span><span class=\"p\">]],<\/span> <span class=\"n\">dtype<\/span><span class=\"o\">=<\/span><span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">float64<\/span><span class=\"p\">)<\/span>\n<span class=\"k\">print<\/span> <span class=\"s2\">&quot;A:&quot;<\/span>\n<span class=\"k\">print<\/span> <span class=\"n\">A<\/span>\n\n<span class=\"k\">print<\/span> <span class=\"s2\">&quot;L (np.linalg):&quot;<\/span>\n<span class=\"k\">print<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">linalg<\/span><span class=\"o\">.<\/span><span class=\"n\">cholesky<\/span><span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">)<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"output_wrapper\">\n<div class=\"output\">\n<div class=\"output_area\">\n<div class=\"prompt\"><\/div>\n<div class=\"output_subarea output_stream output_stdout output_text\">\n<pre>A:\n[[  6.   3.   4.   8.]\n [  3.   6.   5.   1.]\n [  4.   5.  10.   7.]\n [  8.   1.   7.  25.]]\nL (np.linalg):\n[[ 2.44948974  0.          0.          0.        ]\n [ 1.22474487  2.12132034  0.          0.        ]\n [ 1.63299316  1.41421356  2.30940108  0.        ]\n [ 3.26598632 -1.41421356  1.58771324  3.13249102]]\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[16]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"n\">routines<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[<\/span><span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">linalg<\/span><span class=\"o\">.<\/span><span class=\"n\">cholesky<\/span><span class=\"p\">,<\/span> <span class=\"n\">cholesky0<\/span><span class=\"p\">,<\/span> <span class=\"n\">cholesky1<\/span><span class=\"p\">,<\/span> <span class=\"n\">cholesky2<\/span><span class=\"p\">,<\/span> <span class=\"n\">cholesky3<\/span><span class=\"p\">]<\/span>\n<span class=\"n\">same_results<\/span><span class=\"p\">(<\/span><span class=\"n\">routines<\/span><span class=\"p\">,<\/span> <span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">,))<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"output_wrapper\">\n<div class=\"output\">\n<div class=\"output_area\">\n<div class=\"prompt output_prompt\">Out[16]:<\/div>\n<div class=\"output_text output_subarea output_execute_result\">\n<pre>True<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[17]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"n\">expander<\/span> <span class=\"o\">=<\/span> <span class=\"n\">ft<\/span><span class=\"o\">.<\/span><span class=\"n\">partial<\/span><span class=\"p\">(<\/span><span class=\"n\">expand_row_packed<\/span><span class=\"p\">,<\/span> <span class=\"n\">n<\/span><span class=\"o\">=<\/span><span class=\"n\">A<\/span><span class=\"o\">.<\/span><span class=\"n\">shape<\/span><span class=\"p\">[<\/span><span class=\"mi\">0<\/span><span class=\"p\">])<\/span>\n<span class=\"n\">routines<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[<\/span><span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">linalg<\/span><span class=\"o\">.<\/span><span class=\"n\">cholesky<\/span><span class=\"p\">]<\/span> <span class=\"o\">+<\/span> <span class=\"p\">[<\/span><span class=\"n\">compose<\/span><span class=\"p\">(<\/span><span class=\"n\">expander<\/span><span class=\"p\">,<\/span> <span class=\"n\">r<\/span><span class=\"p\">)<\/span><span class=\"k\">for<\/span> <span class=\"n\">r<\/span> <span class=\"ow\">in<\/span> <span class=\"p\">(<\/span><span class=\"n\">cholesky_packed1<\/span><span class=\"p\">,<\/span> \n                                                                <span class=\"n\">cholesky_packed2<\/span><span class=\"p\">,<\/span> \n                                                                <span class=\"n\">cholesky_packed3<\/span><span class=\"p\">)]<\/span>\n<span class=\"n\">same_results<\/span><span class=\"p\">(<\/span><span class=\"n\">routines<\/span><span class=\"p\">,<\/span> <span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">,))<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"output_wrapper\">\n<div class=\"output\">\n<div class=\"output_area\">\n<div class=\"prompt output_prompt\">Out[17]:<\/div>\n<div class=\"output_text output_subarea output_execute_result\">\n<pre>True<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<h3 id=\"testing-performance\">Testing (Performance)<\/h3>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[18]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"n\">A<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">array<\/span><span class=\"p\">([[<\/span><span class=\"mi\">6<\/span><span class=\"p\">,<\/span> <span class=\"mi\">3<\/span><span class=\"p\">,<\/span> <span class=\"mi\">4<\/span><span class=\"p\">,<\/span> <span class=\"mi\">8<\/span><span class=\"p\">],<\/span> \n              <span class=\"p\">[<\/span><span class=\"mi\">3<\/span><span class=\"p\">,<\/span> <span class=\"mi\">6<\/span><span class=\"p\">,<\/span> <span class=\"mi\">5<\/span><span class=\"p\">,<\/span> <span class=\"mi\">1<\/span><span class=\"p\">],<\/span> \n              <span class=\"p\">[<\/span><span class=\"mi\">4<\/span><span class=\"p\">,<\/span> <span class=\"mi\">5<\/span><span class=\"p\">,<\/span> <span class=\"mi\">10<\/span><span class=\"p\">,<\/span> <span class=\"mi\">7<\/span><span class=\"p\">],<\/span> \n              <span class=\"p\">[<\/span><span class=\"mi\">8<\/span><span class=\"p\">,<\/span> <span class=\"mi\">1<\/span><span class=\"p\">,<\/span> <span class=\"mi\">7<\/span><span class=\"p\">,<\/span> <span class=\"mi\">25<\/span><span class=\"p\">]],<\/span> <span class=\"n\">dtype<\/span><span class=\"o\">=<\/span><span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">float64<\/span><span class=\"p\">)<\/span>\n<span class=\"k\">print<\/span> <span class=\"s2\">&quot;A:&quot;<\/span>\n<span class=\"k\">print<\/span> <span class=\"n\">A<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"output_wrapper\">\n<div class=\"output\">\n<div class=\"output_area\">\n<div class=\"prompt\"><\/div>\n<div class=\"output_subarea output_stream output_stdout output_text\">\n<pre>A:\n[[  6.   3.   4.   8.]\n [  3.   6.   5.   1.]\n [  4.   5.  10.   7.]\n [  8.   1.   7.  25.]]\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[19]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"n\">time_routines<\/span><span class=\"p\">([<\/span><span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">linalg<\/span><span class=\"o\">.<\/span><span class=\"n\">cholesky<\/span><span class=\"p\">],<\/span> <span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">,),<\/span> <span class=\"n\">number<\/span><span class=\"o\">=<\/span><span class=\"mi\">10000<\/span><span class=\"p\">)<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"output_wrapper\">\n<div class=\"output\">\n<div class=\"output_area\">\n<div class=\"prompt\"><\/div>\n<div class=\"output_subarea output_stream output_stdout output_text\">\n<pre>cholesky : 0.1380\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[20]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"n\">time_routines<\/span><span class=\"p\">([<\/span><span class=\"n\">cholesky0<\/span><span class=\"p\">,<\/span> <span class=\"n\">cholesky1<\/span><span class=\"p\">,<\/span> <span class=\"n\">cholesky2<\/span><span class=\"p\">,<\/span> <span class=\"n\">cholesky3<\/span><span class=\"p\">],<\/span> <span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">,),<\/span> <span class=\"n\">number<\/span><span class=\"o\">=<\/span><span class=\"mi\">10000<\/span><span class=\"p\">)<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"output_wrapper\">\n<div class=\"output\">\n<div class=\"output_area\">\n<div class=\"prompt\"><\/div>\n<div class=\"output_subarea output_stream output_stdout output_text\">\n<pre>cholesky2 : 0.0064\ncholesky3 : 0.0065\ncholesky0 : 0.2947\ncholesky1 : 0.3358\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[21]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"n\">time_routines<\/span><span class=\"p\">([<\/span><span class=\"n\">cholesky_packed1<\/span><span class=\"p\">,<\/span> <span class=\"n\">cholesky_packed2<\/span><span class=\"p\">,<\/span> <span class=\"n\">cholesky_packed3<\/span><span class=\"p\">],<\/span> <span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">,),<\/span> <span class=\"n\">number<\/span><span class=\"o\">=<\/span><span class=\"mi\">10000<\/span><span class=\"p\">)<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"output_wrapper\">\n<div class=\"output\">\n<div class=\"output_area\">\n<div class=\"prompt\"><\/div>\n<div class=\"output_subarea output_stream output_stdout output_text\">\n<pre>cholesky_packed1 : 0.0073\ncholesky_packed2 : 0.0076\ncholesky_packed3 : 0.0179\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[22]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"n\">_all<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[<\/span><span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">linalg<\/span><span class=\"o\">.<\/span><span class=\"n\">cholesky<\/span><span class=\"p\">,<\/span> \n        <span class=\"n\">cholesky0<\/span><span class=\"p\">,<\/span> <span class=\"n\">cholesky1<\/span><span class=\"p\">,<\/span> <span class=\"n\">cholesky2<\/span><span class=\"p\">,<\/span> <span class=\"n\">cholesky3<\/span><span class=\"p\">,<\/span>\n        <span class=\"n\">cholesky_packed1<\/span><span class=\"p\">,<\/span> <span class=\"n\">cholesky_packed2<\/span><span class=\"p\">,<\/span> <span class=\"n\">cholesky_packed3<\/span><span class=\"p\">]<\/span>\n<span class=\"n\">time_routines<\/span><span class=\"p\">(<\/span><span class=\"n\">_all<\/span><span class=\"p\">,<\/span> <span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">,),<\/span> <span class=\"n\">number<\/span><span class=\"o\">=<\/span><span class=\"mi\">1000<\/span><span class=\"p\">)<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"output_wrapper\">\n<div class=\"output\">\n<div class=\"output_area\">\n<div class=\"prompt\"><\/div>\n<div class=\"output_subarea output_stream output_stdout output_text\">\n<pre>       cholesky2 : 0.0006\n       cholesky3 : 0.0006\ncholesky_packed1 : 0.0007\ncholesky_packed2 : 0.0007\ncholesky_packed3 : 0.0017\n        cholesky : 0.0138\n       cholesky0 : 0.0296\n       cholesky1 : 0.0336\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<h3 id=\"bigger-fish\">Bigger Fish<\/h3>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>What happens when we scale up to bigger input matrices? Well, since we need SPD (symmetric, positive-definite) matrices, we have to construct them. Doing that in a naive way has a major issue: we can get a <em>horribly<\/em> conditioned matrix (and Cholesky is sensitive to the <em>square<\/em> of the condition number). We can improve that by adding (relatively) large elements on the diagonal to push the matrix towards being &quot;diagonally dominant&quot;.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[23]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"n\">n<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">random<\/span><span class=\"o\">.<\/span><span class=\"n\">random_integers<\/span><span class=\"p\">(<\/span><span class=\"mi\">1<\/span><span class=\"p\">,<\/span><span class=\"mi\">10<\/span><span class=\"p\">,(<\/span><span class=\"mi\">5<\/span><span class=\"p\">,<\/span><span class=\"mi\">1<\/span><span class=\"p\">))<\/span>\n<span class=\"n\">outer_n<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">outer<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"p\">,<\/span><span class=\"n\">n<\/span><span class=\"p\">)<\/span>\n<span class=\"k\">print<\/span> <span class=\"n\">outer_n<\/span>\n<span class=\"k\">print<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">real_if_close<\/span><span class=\"p\">(<\/span><span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">linalg<\/span><span class=\"o\">.<\/span><span class=\"n\">eig<\/span><span class=\"p\">(<\/span><span class=\"n\">outer_n<\/span><span class=\"p\">)[<\/span><span class=\"mi\">0<\/span><span class=\"p\">])<\/span> <span class=\"c1\"># remove small imaginary parts<\/span>\n<span class=\"k\">print<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">linalg<\/span><span class=\"o\">.<\/span><span class=\"n\">cond<\/span><span class=\"p\">(<\/span><span class=\"n\">outer_n<\/span><span class=\"p\">)<\/span> <span class=\"c1\"># can go to float(&quot;inf&quot;)<\/span>\n\n<span class=\"c1\"># condition normalizer; not-so-regular Tikhonov regularization<\/span>\n<span class=\"c1\"># pushes towards diagonally dominant<\/span>\n<span class=\"n\">outer_n<\/span> <span class=\"o\">=<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">outer<\/span><span class=\"p\">(<\/span><span class=\"n\">n<\/span><span class=\"p\">,<\/span><span class=\"n\">n<\/span><span class=\"p\">)<\/span> <span class=\"o\">+<\/span> <span class=\"mi\">5<\/span> <span class=\"o\">*<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">eye<\/span><span class=\"p\">(<\/span><span class=\"mi\">5<\/span><span class=\"p\">)<\/span> \n<span class=\"k\">print<\/span> <span class=\"n\">outer_n<\/span>\n<span class=\"k\">print<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">real_if_close<\/span><span class=\"p\">(<\/span><span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">linalg<\/span><span class=\"o\">.<\/span><span class=\"n\">eig<\/span><span class=\"p\">(<\/span><span class=\"n\">outer_n<\/span><span class=\"p\">)[<\/span><span class=\"mi\">0<\/span><span class=\"p\">])<\/span>\n<span class=\"k\">print<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">linalg<\/span><span class=\"o\">.<\/span><span class=\"n\">cond<\/span><span class=\"p\">(<\/span><span class=\"n\">outer_n<\/span><span class=\"p\">)<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"output_wrapper\">\n<div class=\"output\">\n<div class=\"output_area\">\n<div class=\"prompt\"><\/div>\n<div class=\"output_subarea output_stream output_stdout output_text\">\n<pre>[[ 49  35  70  14  70]\n [ 35  25  50  10  50]\n [ 70  50 100  20 100]\n [ 14  10  20   4  20]\n [ 70  50 100  20 100]]\n[  0.00000000e+00   2.78000000e+02   1.84889275e-32   8.20146785e-17\n   0.00000000e+00]\n3.80552400456e+49\n[[  54.   35.   70.   14.   70.]\n [  35.   30.   50.   10.   50.]\n [  70.   50.  105.   20.  100.]\n [  14.   10.   20.    9.   20.]\n [  70.   50.  100.   20.  105.]]\n[   5.  283.    5.    5.    5.]\n56.6\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>But, we&#8217;re going to make use of the <code>make_spd_opt<\/code> routine we defined in our testing helpers. Here&#8217;s what it looks like (3 is the length of the diagonal, 10 is the condition number of the result).<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[24]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"n\">A<\/span> <span class=\"o\">=<\/span> <span class=\"n\">make_spd_opt<\/span><span class=\"p\">(<\/span><span class=\"mi\">3<\/span><span class=\"p\">,<\/span> <span class=\"mi\">10<\/span><span class=\"p\">)<\/span>\n<span class=\"k\">print<\/span> <span class=\"n\">A<\/span>\n<span class=\"k\">print<\/span> <span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">linalg<\/span><span class=\"o\">.<\/span><span class=\"n\">eig<\/span><span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">)[<\/span><span class=\"mi\">0<\/span><span class=\"p\">]<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"output_wrapper\">\n<div class=\"output\">\n<div class=\"output_area\">\n<div class=\"prompt\"><\/div>\n<div class=\"output_subarea output_stream output_stdout output_text\">\n<pre>[[ 3.02040816  2.75510204  2.20408163]\n [ 2.75510204  8.43877551  0.55102041]\n [ 2.20408163  0.55102041  5.04081633]]\n[  1.   10.    5.5]\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[24]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"n\">A<\/span> <span class=\"o\">=<\/span> <span class=\"n\">make_spd_opt<\/span><span class=\"p\">(<\/span><span class=\"mi\">2<\/span><span class=\"o\">**<\/span><span class=\"mi\">10<\/span><span class=\"p\">,<\/span> <span class=\"mi\">10<\/span><span class=\"p\">)<\/span>\n<span class=\"n\">_all<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[<\/span><span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">linalg<\/span><span class=\"o\">.<\/span><span class=\"n\">cholesky<\/span><span class=\"p\">,<\/span> \n        <span class=\"n\">cholesky0<\/span><span class=\"p\">,<\/span> <span class=\"n\">cholesky1<\/span><span class=\"p\">,<\/span> <span class=\"n\">cholesky2<\/span><span class=\"p\">,<\/span> <span class=\"n\">cholesky3<\/span><span class=\"p\">,<\/span>\n        <span class=\"n\">cholesky_packed1<\/span><span class=\"p\">,<\/span> <span class=\"n\">cholesky_packed2<\/span><span class=\"p\">,<\/span> <span class=\"n\">cholesky_packed3<\/span><span class=\"p\">]<\/span>\n<span class=\"n\">time_routines<\/span><span class=\"p\">(<\/span><span class=\"n\">_all<\/span><span class=\"p\">,<\/span> <span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">,),<\/span> <span class=\"n\">number<\/span><span class=\"o\">=<\/span><span class=\"mi\">1<\/span><span class=\"p\">)<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"output_wrapper\">\n<div class=\"output\">\n<div class=\"output_area\">\n<div class=\"prompt\"><\/div>\n<div class=\"output_subarea output_stream output_stdout output_text\">\n<pre>        cholesky : 0.0240\n       cholesky3 : 0.1463\n       cholesky2 : 0.1465\ncholesky_packed2 : 0.2107\ncholesky_packed1 : 0.2144\ncholesky_packed3 : 0.3146\n       cholesky0 : 1.1020\n       cholesky1 : 1.3282\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[25]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"n\">A<\/span> <span class=\"o\">=<\/span> <span class=\"n\">make_spd_opt<\/span><span class=\"p\">(<\/span><span class=\"mi\">2<\/span><span class=\"o\">**<\/span><span class=\"mi\">12<\/span><span class=\"p\">,<\/span> <span class=\"mi\">10<\/span><span class=\"p\">)<\/span>\n<span class=\"n\">drop_pokey<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[<\/span><span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">linalg<\/span><span class=\"o\">.<\/span><span class=\"n\">cholesky<\/span><span class=\"p\">,<\/span>\n              <span class=\"n\">cholesky2<\/span><span class=\"p\">,<\/span> <span class=\"n\">cholesky3<\/span><span class=\"p\">]<\/span>\n<span class=\"n\">time_routines<\/span><span class=\"p\">(<\/span><span class=\"n\">drop_pokey<\/span><span class=\"p\">,<\/span> <span class=\"p\">(<\/span><span class=\"n\">A<\/span><span class=\"p\">,),<\/span> <span class=\"n\">number<\/span><span class=\"o\">=<\/span><span class=\"mi\">1<\/span><span class=\"p\">)<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"output_wrapper\">\n<div class=\"output\">\n<div class=\"output_area\">\n<div class=\"prompt\"><\/div>\n<div class=\"output_subarea output_stream output_stdout output_text\">\n<pre> cholesky : 0.8519\ncholesky3 : 10.5039\ncholesky2 : 10.5152\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[26]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"n\">time_routines<\/span><span class=\"p\">([<\/span><span class=\"n\">cholesky_packed2<\/span><span class=\"p\">],<\/span> <span class=\"p\">[<\/span><span class=\"n\">A<\/span><span class=\"p\">],<\/span> <span class=\"n\">number<\/span><span class=\"o\">=<\/span><span class=\"mi\">1<\/span><span class=\"p\">)<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"output_wrapper\">\n<div class=\"output\">\n<div class=\"output_area\">\n<div class=\"prompt\"><\/div>\n<div class=\"output_subarea output_stream output_stdout output_text\">\n<pre>cholesky_packed2 : 14.2552\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing code_cell rendered\">\n<div class=\"input\">\n<div class=\"prompt input_prompt\">In&nbsp;[27]:<\/div>\n<div class=\"inner_cell\">\n<div class=\"input_area\">\n<div class=\" highlight hl-ipython2\">\n<pre><span class=\"n\">A<\/span> <span class=\"o\">=<\/span> <span class=\"n\">make_spd_opt<\/span><span class=\"p\">(<\/span><span class=\"mi\">2<\/span><span class=\"o\">**<\/span><span class=\"mi\">13<\/span><span class=\"p\">,<\/span> <span class=\"mi\">10<\/span><span class=\"p\">)<\/span>\n<span class=\"n\">showdown<\/span> <span class=\"o\">=<\/span> <span class=\"p\">[<\/span><span class=\"n\">np<\/span><span class=\"o\">.<\/span><span class=\"n\">linalg<\/span><span class=\"o\">.<\/span><span class=\"n\">cholesky<\/span><span class=\"p\">,<\/span> <span class=\"n\">cholesky2<\/span><span class=\"p\">,<\/span> <span class=\"n\">cholesky_packed2<\/span><span class=\"p\">]<\/span>\n<span class=\"n\">time_routines<\/span><span class=\"p\">(<\/span><span class=\"n\">showdown<\/span><span class=\"p\">,<\/span> <span class=\"p\">[<\/span><span class=\"n\">A<\/span><span class=\"p\">],<\/span> <span class=\"n\">number<\/span><span class=\"o\">=<\/span><span class=\"mi\">1<\/span><span class=\"p\">)<\/span>\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"output_wrapper\">\n<div class=\"output\">\n<div class=\"output_area\">\n<div class=\"prompt\"><\/div>\n<div class=\"output_subarea output_stream output_stdout output_text\">\n<pre>        cholesky : 5.3662\n       cholesky2 : 82.1272\ncholesky_packed2 : 113.4598\n<\/pre>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<p>When the data gets bigger, we can&#8217;t compete. Why not? Because while we are increasing the efficiency of the steps of computation, we&#8217;re not increasing the efficiency of data movement. And to solve that, generally we switch to &quot;block-algoirthms&quot; that rely on moving chunks (submatrices) of data around, instead of just single rows and\/or columns. Here&#8217;s the <a href=\"http:\/\/apfel.mathematik.uni-ulm.de\/~lehn\/flens\/flens\/examples\/tut01-page08.html\">fundamental issue laid out nicely<\/a> and <a href=\"http:\/\/apfel.mathematik.uni-ulm.de\/~lehn\/FLENS-Trinity\/flens\/examples\/tut01-page08.html\">here&#8217;s how deep that rabbit hole goes<\/a>.<\/p>\n<p>If you want to implement a block-Cholesky, see:<\/p>\n<ul>\n<li>http:\/\/www.netlib.org\/utk\/papers\/factor\/node9.html<\/li>\n<li>http:\/\/www.cs.utexas.edu\/users\/flame\/Notes\/NotesOnCholReal.pdf<\/li>\n<li>https:\/\/www.cs.utexas.edu\/users\/plapack\/icpp98\/node2.html<\/li>\n<\/ul>\n<p>I may come back to it myself, but I&#8217;ve had my fun and there are other fish to fry.<\/p>\n<p>Also, even though the packed routines save a lot of work, the cost of indexing (and the lack of efficient memory transfer) outweighs the benefits. I was curious if the <code>jit<\/code>-ed calls could be <a href=\"https:\/\/en.wikipedia.org\/wiki\/Inline_function\"><em>inlined<\/em><\/a> and I found this: https:\/\/github.com\/numba\/numba\/issues\/160 which seems to say the answer is &quot;yes, it is down automatically, if LLVM heuristics say to&quot;.<\/p>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"cell border-box-sizing text_cell rendered\">\n<div class=\"prompt input_prompt\">\n<\/div>\n<div class=\"inner_cell\">\n<div class=\"text_cell_render border-box-sizing rendered_html\">\n<h3>\nAdditional Resources<br \/>\n<\/h3>\n<p>\nYou can grab a <a href=\"http:\/\/drsfenner.org\/public\/notebooks\/ImplementingCholeskyCoding.ipynb\">copy of this notebook<\/a>.\n<\/p>\n<p>\nEven better, you can <a href=\"http:\/\/nbviewer.ipython.org\/url\/drsfenner.org\/public\/notebooks\/ImplementingCholeskyCoding.ipynb\">view it using nbviewer<\/a>.\n<\/p>\n<h3>\nLicense<br \/>\n<\/h3>\n<p>\nUnless otherwise noted, the contents of this notebook are under the following license. The code in the notebook should be considered part of the text (i.e., licensed and treated as as follows).\n<\/p>\n<p><a rel=\"license\" href=\"http:\/\/creativecommons.org\/licenses\/by-nc-sa\/4.0\/\"><img decoding=\"async\" alt=\"Creative Commons License\" style=\"border-width:0\" src=\"https:\/\/i.creativecommons.org\/l\/by-nc-sa\/4.0\/88x31.png\" \/><\/a><br \/><span xmlns:dct=\"http:\/\/purl.org\/dc\/terms\/\" property=\"dct:title\">DrsFenner.org Blog And Notebooks<\/span> by <a xmlns:cc=\"http:\/\/creativecommons.org\/ns#\" href=\"drsfenner.org\" property=\"cc:attributionName\" rel=\"cc:attributionURL\">Mark and Barbara Fenner<\/a> is licensed under a <a rel=\"license\" href=\"http:\/\/creativecommons.org\/licenses\/by-nc-sa\/4.0\/\">Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License<\/a>.<br \/>Permissions beyond the scope of this license may be available at <a xmlns:cc=\"http:\/\/creativecommons.org\/ns#\" href=\"drsfenner.org\/blog\/about-and-contacts\" rel=\"cc:morePermissions\">drsfenner.org\/blog\/about-and-contacts<\/a>.<\/p>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>I spent a bunch of time talking about using lower level libraries (LAPACK directly and via LAPACKE or hand wrappers). My next set of posts is going to look at manually implementing these techniques. Mostly, I wanted to spend somem time with the numerical algorithms. I last visited them about 15 years ago in graduate [&hellip;]<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[6,7],"tags":[],"class_list":["post-430","post","type-post","status-publish","format-standard","hentry","category-mrdr","category-sci-math-stat-python"],"_links":{"self":[{"href":"https:\/\/drsfenner.org\/blog\/wp-json\/wp\/v2\/posts\/430","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/drsfenner.org\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/drsfenner.org\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/drsfenner.org\/blog\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/drsfenner.org\/blog\/wp-json\/wp\/v2\/comments?post=430"}],"version-history":[{"count":1,"href":"https:\/\/drsfenner.org\/blog\/wp-json\/wp\/v2\/posts\/430\/revisions"}],"predecessor-version":[{"id":431,"href":"https:\/\/drsfenner.org\/blog\/wp-json\/wp\/v2\/posts\/430\/revisions\/431"}],"wp:attachment":[{"href":"https:\/\/drsfenner.org\/blog\/wp-json\/wp\/v2\/media?parent=430"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/drsfenner.org\/blog\/wp-json\/wp\/v2\/categories?post=430"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/drsfenner.org\/blog\/wp-json\/wp\/v2\/tags?post=430"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}