Givens Rotations and QR

Today I want to talk about Givens rotations. Givens rotations are a generalization of the rotation matrix you might remember from high school trig class. Instead of rotating in the plane of a 2D matrix, we can rotated in any plane of a larger dimension matrix. We’ll use these rotations to selectively place zeros in a target matrix. Continue reading

Basic Cholesky Implementation

I spent a bunch of time talking about using lower level libraries (LAPACK directly and via LAPACKE or hand wrappers). My next set of posts is going to look at manually implementing these techniques. Mostly, I wanted to spend somem time with the numerical algorithms. I last visited them about 15 years ago in graduate school. There are also several performance lessons along the way (demonstating the limits of a "just compile it" for optimization). For starters, we’ll implement the Cholesky decomposition which is very easy to describe mathematically. Continue reading

Testing a Cholesky Implementation

In working through the Cholesky, QR, and SVD methods for solving the normal regression equations, I got curious about the underlying implementations of each. I did QR and Cholesky in grad school (a great Numerical Analysis sequence). I never got around to SVD (I think we did some variation of tridiagonalization). Anyway, Cholesky is so simple that I think testing Cholesky generated more interesting points for me. There were two big issues and the "testing" post is (I think) longer than the "implementation" post.

Executing the tests brought up an interesting software engineering (code reuse) issue and creating good test inputs brought up an intersting mathematical issue. Read on! Continue reading

Into the Weeds III – Hacking LAPACK Calls To Save Memory

In the last post, we tried to use lower-level LAPACK/LAPACKE calls to remove some of the copying overhead of our operations. Well, it turns out that it didn’t work out so well (see the table at the end of the post). The fundamental reason why is that LAPACKE took the easy way out, when they decided to wrap LAPACK functions. In particular, whenever there is a C-/Fortran-ordering issue (for a reminder), LAPACKE simply makes a copy. Continue reading

Into the Weeds II – LAPACK, LAPACKE, and Two Paths without Copies

In the last post, we tried to use lower-level LAPACK/LAPACKE calls to remove some of the copying overhead of our operations. Well, it turns out that it didn’t work out so well (see the table at the end of the post). The fundamental reason why is that LAPACKE took the easy way out, when they decided to wrap LAPACK functions. In particular, whenever there is a C-/Fortran-ordering issue (for a reminder), LAPACKE simply makes a copy. Continue reading

Into the Weeds I – LAPACK, LAPACKE, and Three+ Paths

In the last post, we looked at three ways of solving the least-squares regression problem (Cholesky, QR, and SVD decompositions) using LAPACK routines in a round-about fashion. The different decompositions (i.e., factorizations) were performed by (1) calling a NumPy np.linalg routine that called (2) an LAPACK driver routine for a "bigger" problem like linear equation solving or least-squares. The driver routine called one of the decompositions internally. Continue reading