## Posts Tagged ‘**linear regression**’

## Curve Fitting using Linear Algebra

My initial interest in curve fitting came a while ago when programming tanks for Robocode, but realizing the complexity given my limited knowledge of calculus, my plans came to a screeching halt. However, I recently got into the concept of hacking, and subsequently found HackThisSite, which poses training puzzles to the hackers of the future. One, required the player to reverse engineer an encryption algorithm (an extremely simple one at that) but the puzzle made me realize that I could probably use what I was learning in my math classes to automate the process and tackle any simple encryption algorithms. And then I realized that actually, this is how targeting algorithms are built.

Linear algebra is cool because it let’s us compose functions that take map an arbitrary number of arguments to a real. This is useful in a lot of cases. For one, I could map angular velocity, angular acceleration, and distance of a target to firing angles. Alternatively, I could map the value of a pot, the number of opponents, and my current winnings to an amount to bet.

Let’s start simple. Let’s say we wanted to fit a curve of power 2 through the points (1,2), (3,4), and (5,6). To do this, we want to find the general solution to y=ax^2+bx+c where a, b, c are unknown. So let’s write our matrices out.

a*(1)+b*(1)+c*1=(2)

a*(9)+b*(3)+c*1=(4)

a*(25)+b*(5)+c*1=(6)

Using linear algebra, we can turn all of this into three matrices of the form Ax=b. In this case, A=((1,1,1), (9,3,1), (25,5,1)), x=((a), (b), (c)) and b = ((2), (4), (6)). In this form, this solution unfolds itself very clearly: x=A^-1*b. The solution is a=0, b=1, c=1.

Now let’s get a little more complex. The next question is what if we have more rows than columns? In this case, A has no inverse. The solution: we turn to projection. For those who are unfamiliar, the details are messy, but basically, we come up with (A^t)Ax=(A^t)b. Since (A^t)A is square, we can invert it, and therefore we get x=((A^t)(A))^(-1)(A^t)b. It’s a bit messy, so I won’t present an example, but it works.

What that equation basically gives us is actually the least squares approximation. So basically, if we were to take A as a column vector of the x’s, and b a column vector of the y’s, and if A had a bunch of rows with some noise in the data, we would end up with a line that goes through them all pretty nicely.

The next question is, how we do fit a plane through a set of data points- and subsequently, how do we fit an n-dimensional object through a set of data points?

The solution is surprisingly simple. Just add a column for the new input! So let’s say we wanted to fit a straight plane through some data points- Well, the equation for a plane is ax+by+cz=d. Or more relevantly, ax+by+d =z (the negative signs don’t really matter since a, b, and d will take care of that). Now it looks like something we can handle. Simply write out each row in A to be (x, y, 1), x to be a column vector of (a, b, d) and b to be the corresponding z’s. Use the equation presented above, and you’ve got your solution!

This is awesome, because now we can add as many variables as we want, and add dependencies as well. Say the enemy tanks are factoring in some amount of distance*velocity into their equations in order to trip us up a little- we simply add in a distance*velocity vector into A, another variable in x, and now we’ve got it covered.

The only problem as I see it is the enormous scale we are looking at here. Basically, if we look to include all dependencies, the width of our matrix A is equal to n^v, where n is the degree of the polynomial that we’re trying to fit through the data, and v is the number of variables we have. Not the worst, especially considering we generally want small polynomials (since large polynomials tend to behave oddly) but with a bunch of variables, even inititally small amounts of computation could become costly.

Next objective: Build a program that updates a curve with new data without recalculating the whole thing.