Ceasar's Mind

Follow me: @Ceasar_Bautista

Some Segmentation Ideas

leave a comment »

Continuing with my earlier post on Robocode, I’d like to describe some thoughts on  segmentation strategies.

So the basic idea is, we want to minimize our bins at all times. In effect, this means calculating the Maximum Escape Angle, calculating the arc formed by the intersection of the target and the circle of radius distance to target with center on our tank (that is, the orbital path our opponent could take), and then taking the ceiling of one half of the ratio (one half because a single shot will hit our opponent just as well as if hits the leftmost part of our opponent as the rightmost side).

Interestingly, we can adjust the Maximum Escape by adjusting the speed of our bullet. Given that, we can actually reduce or expand the number of bins as we desire- a useful ability when trying to maximize probability (since things are discrete, a change from N to N-1 bins can be fairly significant assuming the probability to hit is a uniform distribution at both ranges).

Furthermore, we can supplement our segmentation data by adjusting the range at which we fight, spiraling toward and away our opponent as necessary, in order to keep our targets at optimal distances.

 

Written by Ceasar Bautista

2011/03/22 at 03:05

Wave Surfing Explained

leave a comment »

I’ve recently reignited my interest in Robocode but this time armed with everything I’ve learned since September.

For the uninitiated, Robocode is an educational programming game, originally written with Java in mind, where players code up algorithms to guide virtual tanks in destroying each other. It’s a unique challenge, combining fields like programming, statistics, linear algebra, and game theory all into one.

Originally intended for students, the game was quickly picked up by experienced programmers and is now rather competitive. Many players use sophisticated algorithms in order to dodge enemy bullets and make sure their own find their mark.

Anyway, let me continue to explain wave surfing- but first I need to explain segmentation.

Segmentation

Since in Robocode neither bot can see any of the bullets, developers have to find ways to track their enemy through alternative means. Segmentation is a targeting technique that involves finding which firing angles are most likely to hit the target by dividing the continuous range of firing angles into discrete bins, and finding which bins are most popular. To implement segmentation, one needs to track each bullet with “waves”.

A wave is basically analogous to a ripple in a pond after tossing a stone into the water. We track the bullets until the “wave” hits our opponent. At that point, we can determine what the correct firing angle was at the time of firing, and subsequently increment the value in the bin which contains the correct firing angle. We then proceed to fire at the center of the most popular bin whenever we get the chance.

If our bins looked like this, we would fire into the 15% bin.

A primitive approach of segmentation might only segment by firing angle. However more sophisticated approaches first segment by distance, and then angle. (And even more sophisticated approaches segment by even more dimensions.) However, these improvements come at a cost of slowing down the rate of learning, but the accuracy gain is typically well worth the price.

Optimal segmentation (I think anyway) ought to reduce the number of bins down to the minimum needed to hit every possible position of the opponent. For example, an opponent at point blank range would only require one bin, since all firing angles within the escape angle would be guaranteed to hit the target. As the distance increases however, more and more bins become necessary (I believe at an exponential rate). By reducing the number of bins in this fashion, we increase learning speed, and reduce memory costs.

Wave Surfing

Wave Surfing is an evasive technique first developed and implemented by ABC in mid-2004. To put it succinctly, wave surfing is anti-segmentation. Every time the enemy fires a shot, we create a wave, and see whether or he not the bullet hit when the wave reaches our bot, and subsequently increment our own bin. In this way, we know what our opponent believes to be the most popular firing angle, and therefore make sure to avoid it, producing a near uniform distribution at all ranges.

Why this is optimal

Refer back to my earlier post on the game theoretical solution to Rock-Paper-Scissors. Basically, to do anything else is to tip our opponent off to a flaw in our strategy. If you know for example that your opponent will play Rock 2/3 of the time and Paper 1/3 of the time, your optimal strategy becomes to keep playing Paper.

Implementation Issues

A literal implementation of the above is still susceptible to intelligent algorithms. For example, at the start of a game or Rock-Paper-Scissors, if your opponent first plays Scissors, it would be illogical to assume that because your opponent has so far only played Scissors, it would be logical to assume he would do so again. This illustrates the problem of determining when the opponent is playing sub-optimally with an unequal probability distribution. Thankfully, statistics provides an answer. Using statistical methods to determine statistically significant variance will lead to a solution- typically, with low amount of data, statisticians say nothing can be concluded, and when more data, more confidence can be placed on inferences. A simple implementation of confidence intervals ought to be sufficient.

Written by Ceasar Bautista

2011/03/20 at 01:31

Educational Gates or: How I Learned to Hate School and Love Games

with 2 comments

If you’ve yet to watch Salman Kahn’s talk at TED, head over now and give it a watch- you won’t regret it.

Sal says a lot of insightful stuff here, so I’ll be revisiting his talk several times, but at the moment I want to comment on a small point that Sal makes.

Sal explains that one reason education is failing is because it passes students who don’t fully understand what they are being taught. Not just on a grade-to-grade basis, but lecture-to-lecture. A student might not fully understand a subject, get labeled a B student after the test, and then be expected to understand the next lecture which builds on what the student ought to know but doesn’t. Sal makes the analogy, it would be liking a father trying to teach his son to ride a bike, evaluating him after a week, seeing his son is having trouble maintaining his balance on left turns and noticing trouble with managing the brakes, and then handing his son a unicycle, and expecting his son to manage. It’s obviously faulty, it doesn’t work, and there’s a good reason this analogy doesn’t take place outside of schools.

If you don't know how to jump by now, you're fucked.

Personally, I’ve experienced this the hard way. As a game designer during high school, I once collaborated with a friend to produce a tactical puzzle game (called “Pinnacle”) where a team of five players had to coordinate their military units in order to defeat an AI opponent by utilizing a particular tactic. At one point, we made the decision to, rather than failing the players if they couldn’t figure out a level and making them try again, instead just push them through to the next level (the idea being, we wanted to make it more arcade-y and let players taste the entire game, rather than getting stuck and quitting). In theory, this could possibly work. If each level didn’t require an understanding of previous levels, this would be totally okay. Not being the case though, it (and the players) failed miserably, with players progressing to harder and harder levels despite having never learned the basics (which were often difficult to convey with one try). We quickly realized our mistake and reverted it, and learned firsthand why in Super Mario Bros and other professional games, you can’t just skip ahead nor does the designer push you forward- if you don’t understand the skills required to pass the current level, you’re experience with the next level is going to suck.

I’m not sure if the concept has a name, but if I had to call it something, I’d call it an educational gate. You can’t pass until you have the skills that will be expected of you on the other side of the gate. Most notably, these gates show up in the form of boss battles (although frankly, almost every instance, from the first Goomba in Mario, to random pits in Prince of Persia are technically all gates). Rather than trying to challenge the player, boss battles are typically designed to stop the player from progressing until he has achieved a certain level of mastery with a particular skill. (If you’ve ever tried you’re hand at one of the Legend of Zelda games, you know exactly what I mean.) In fact, I recall reading an article on Gamasutra that detailed a designers experiences with designing boss battles that did not test a player’s skills, and his explanation of how they sucked.

Frankly, I love this article because the contrast between exams and boss battles is ridiculous, despite them being analogous. I mean, okay, they both test us, but really, how much cooler would tests be if instead of just testing abstract concepts, all of the questions were connected to a central theme, that made us feel like we were really accomplishing something?

And furthermore, what if each lecture was a test in itself, that also made us feel like we were accomplishing something, while preparing us to take the exam? That’s how games work. Consider the scene below from Valve’s critically acclaimed Portal.

This scene made me cry.

In this particular scene in the game, the player must sacrifice his friend, the Companion Cube, in order to progress by dropping it in an incinerator. A relatively simple task, but it forces the player to understand how incinerators work.

Another incinerator, but this time, it's used to avenge the Companion Cube and destroy Glados, Portal's boss.

Later, an understanding of incinerators is required to defeat Glados, Portal’s boss. This is the only the tip of the iceberg though- the entirety of Portal, Super Mario Bros, Zelda, Metroid, and many other classics were designed using this pattern. In reality, games are hardly games at all- they’re more like extremely engaging classrooms. (Spoiler: Learning is actually fun.)

Really, schools have such a long way to go, having made virtually no progress in pedagogy despite game designers having illuminated the way since the 70s. Anyway, now you understand why I’m such a critic of education. It’s just too hard not to be when you see it consistently done wrong.

Written by Ceasar Bautista

2011/03/19 at 20:43

Curve Fitting using Linear Algebra

with one comment

My initial interest in curve fitting came a while ago when programming tanks for Robocode, but realizing the complexity given my limited knowledge of calculus, my plans came to a screeching halt. However, I recently got into the concept of hacking, and subsequently found HackThisSite, which poses training puzzles to the hackers of the future. One, required the player to reverse engineer an encryption algorithm (an extremely simple one at that) but the puzzle made me realize that I could probably use what I was learning in my math classes to automate the process and tackle any simple encryption algorithms. And then I realized that actually, this is how targeting algorithms are built.

Linear algebra is cool because it let’s us compose functions that take map an arbitrary number of arguments to a real. This is useful in a lot of cases. For one, I could map angular velocity, angular acceleration, and distance of a target to firing angles. Alternatively, I could map the value of a pot, the number of opponents, and my current winnings to an amount to bet.

Let’s start simple. Let’s say we wanted to fit a curve of power 2 through the points (1,2), (3,4), and (5,6). To do this, we want to find the general solution to y=ax^2+bx+c where a, b, c are unknown. So let’s write our matrices out.

a*(1)+b*(1)+c*1=(2)

a*(9)+b*(3)+c*1=(4)

a*(25)+b*(5)+c*1=(6)

Using linear algebra, we can turn all of this into three matrices of the form Ax=b. In this case, A=((1,1,1), (9,3,1), (25,5,1)), x=((a), (b), (c)) and b = ((2), (4), (6)). In this form, this solution unfolds itself very clearly: x=A^-1*b. The solution is a=0, b=1, c=1.

Now let’s get a little more complex. The next question is what if we have more rows than columns? In this case, A has no inverse. The solution: we turn to projection. For those who are unfamiliar, the details are messy, but basically, we come up with (A^t)Ax=(A^t)b. Since (A^t)A is square, we can invert it, and therefore we get x=((A^t)(A))^(-1)(A^t)b. It’s a bit messy, so I won’t present an example, but it works.

What that equation basically gives us is actually the least squares approximation. So basically, if we were to take A as a column vector of the x’s, and b a column vector of the y’s, and if A had a bunch of rows with some noise in the data, we would end up with a line that goes through them all pretty nicely.

The next question is, how we do fit a plane through a set of data points- and subsequently, how do we fit an n-dimensional object through a set of data points?

The solution is surprisingly simple. Just add a column for the new input! So let’s say we wanted to fit a straight plane through some data points- Well, the equation for a plane is ax+by+cz=d. Or more relevantly, ax+by+d =z (the negative signs don’t really matter since a, b, and d will take care of that). Now it looks like something we can handle. Simply write out each row in A to be (x, y, 1), x to be a column vector of (a, b, d) and b to be the corresponding z’s. Use the equation presented above, and you’ve got your solution!

This is awesome, because now we can add as many variables as we want, and add dependencies as well. Say the enemy tanks are factoring in some amount of distance*velocity into their equations in order to trip us up a little- we simply add in a distance*velocity vector into A, another variable in x, and now we’ve got it covered.

The only problem as I see it is the enormous scale we are looking at here. Basically, if we look to include all dependencies, the width of our matrix A is equal to n^v, where n is the degree of the polynomial that we’re trying to fit through the data, and v is the number of variables we have. Not the worst, especially considering we generally want small polynomials (since large polynomials tend to behave oddly) but with a bunch of variables, even inititally small amounts of computation could become costly.

Next objective: Build a program that updates a curve with new data without recalculating the whole thing.

Written by Ceasar Bautista

2011/03/09 at 00:43

Broken Windows Revisited

leave a comment »

Phrased in the sense of external cues giving permission to express internal desires, I realized that the Broken Windows theory showed up in more places than I had originally thought.

First off, consider diets. I recently read an article at Alternet that explained that diets don’t work, and they actually make us fatter. I’m not sure I entirely agree, but there were a few gems to be gleaned, one, particularly interesting, was a study conducted by Janet Polivy and Peter Herman in 1999. In the experiment, Polivy and Herman created two groups, one composed of dieters and the other of non-dieters. Then they split each of those groups into three, with one group told to drink one milkshake, another group told to drink two milkshakes, and finally a control group which did not drink any milkshakes.  Then they offered the participants as much ice cream as they wanted, and observed. The results are interesting, though hardly surprising:

The results revealed that the nondieters ate as you might expect: those who hadn’t consumed any milkshakes ate the most ice cream, those who’d consumed one milkshake ate less ice cream, and those who’d consumed two milkshakes ate the least. The dieters, by contrast, reacted in the opposite way. Those who were offered no milkshakes before the taste test ate small amounts of ice cream, those who drank one shake ate more ice cream, and those who’d consumed two milkshakes ate the most ice cream!

This makes sense in light of the Broken Windows theory. The milkshakes gave permission to the dieters to pig out a little. As all of us have probably been through this before, it’s very easy to understand what happened- the diet is hard to maintain and deprives of sweets, and when we get a little, we think “I already screwed up, why not enjoy myself a little.” (It also doesn’t help that sugar increases blood sugar, which increases appetite.) On that note, truth is diets can work, but you have to actually stay on it.

Likewise, this phenomenon appears when doing homework in the form of distraction. Since we never really want to do our homework, one Wikipedia entry turns into five, which turns into Facebook, email, Youtube, internet games, and so on, usually until a good deal of time has been wasted and we suddenly realize we actually have something worth doing.

Those examples considered, it’s actually quite clear that the theory pops up a lot.  Furthermore, those examples reveal a practical piece of insight into improving ourselves a little- that is, don’t ever give into desire even just a little when you have a goal that requires discipline.

Written by Ceasar Bautista

2011/02/16 at 02:15

The Broken Windows Theory

leave a comment »

I’ve spent a good deal of time trying to write this post, and every time I come to a conclusion I rethink it and I come to discover I may be concluding more than I can. So anyway, here’s my best attempt to link some important theories together.

First off, let me explain what the Broken Windows theory even is. It was first introduced to me (and probably to most people) by Malcom Gladwell’s The Tipping Point, but in reality, the BW theory was first proposed by two sociologists in 1982 as a way to attempt to explain why crime is so bad in places and why it’s not so in other. The full article is here, and I encourage you to read at least the first two pages, because it’s extremely interesting, but if you don’t want to read it, I’ll just quote the most important part here:

Philip Zimbardo, a Stanford psychologist, reported in 1969 on some experiments testing the broken-window theory. He arranged to have an automobile without license plates parked with its hood up on a street in the Bronx and a comparable automobile on a street in Palo Alto, California. The car in the Bronx was attacked by “vandals” within ten minutes of its “abandonment.” The first to arrive were a family—father, mother, and young son—who removed the radiator and battery. Within twenty-four hours, virtually everything of value had been removed. Then random destruction began—windows were smashed, parts torn off, upholstery ripped. Children began to use the car as a playground. Most of the adult “vandals” were well-dressed, apparently clean-cut whites. The car in Palo Alto sat untouched for more than a week. Then Zimbardo smashed part of it with a sledgehammer. Soon, passersby were joining in. Within a few hours, the car had been turned upside down and utterly destroyed. Again, the “vandals” appeared to be primarily respectable whites.

Untended property becomes fair game for people out for fun or plunder and even for people who ordinarily would not dream of doing such things and who probably consider themselves law-abiding. Because of the nature of community life in the Bronx—its anonymity, the frequency with which cars are abandoned and things are stolen or broken, the past experience of “no one caring”—vandalism begins much more quickly than it does in staid Palo Alto, where people have come to believe that private possessions are cared for, and that mischievous behavior is costly. But vandalism can occur anywhere once communal barriers—the sense of mutual regard and the obligations of civility—are lowered by actions that seem to signal that “no one cares.”

So basically, the name comes from the idea that when someone sees a broken window, they come to think that nobody cares, and thus believe it’s okay to behave badly. Some believe that the theory can be used to recursively explain the majority of crime, the idea being that someone who sees this broken window might think it’s okay to tag walls with graffiti, and someone will see that and think it’s okay to steal, and so on, until ultimately you have murder and rape and the worst of anarchy.

Now, I spent the last few hours considering how this might be applied to activism. In his book, Here Comes Everybody, Clay Shirky describes the what happened in the East German city of Leipzig during 1989. At the beginning of that year, people in the city began protesting. At first the protests were small- only 500 or so people and a small fraction of them might be arrested. In the government’s eye, the protests were inconsequential to use any real force against them without looking bad. Soon though, the protests began taking place weekly, and as the protests only grew a little bit each week, the government continued to do nothing, and people seeing the government do nothing, would join in small numbers. In September, the numbers swelled to a size where the government began to get worried, but by then it was too late- by October, the protests attracted tens of thousands of protesters, and at a  protest early in November, some 400,000 people showed up to protest in Leipzig. The entire East German government subsequently resigned. (The lesson for oppressive governments here is to quell protests before they get big.)

Okay, so we have two examples here, one for good and one for bad. So now we just need to find what it’s common here in order to leverage the theory.

First, you need something that is visible that transmits a message. In one case, a broken window showing nobody cares, in another, a protest showing people want freedom.

Second you need people to see your mark, and make more marks. Broken windows lead to looted cars, and each protest lead to a slightly bigger protest.

This last point is the most important. You need a population who wants to do whatever you want to do in the first place, but who are too afraid to do it on their own. In one case, we had people who want to do bad stuff, but just can’t bring themself to do for fear of punishment, or simply moral guilt. In another, we had people who genuinely wanted freedom, but were afraid to challenge an autocratic government. In neither case did the visible markers cause these sentiments- they only serve as permission to express them.

Thus, marketers and revolutionists take heed (although I would suppose most know this already)- no amount of demonstration, ads, or what have you will do anything for you unless people want to do something in the first place. So aim to change the sentiments of people first.

Written by Ceasar Bautista

2011/01/28 at 23:52

Randomness and Strategy Games

leave a comment »

At first, randomness and strategy games would seem like unlikely partners. Strategy games emphasize the role of information in making decisions, while randomness is chaotic, and undermines planning. Despite their differences however, when used properly they can combine nicely.

First off, let me begin with some terminology. When I refer to tactics, I refer to a series of maneuvers designed to accomplish a specific task. In other words, tactics are the little tricks that you can calculate in your head. When I refer to strategical moves, I refer to moves that are made in the absence of calculation, and instead guided by heuristics (in particular, experience).

With those definitions in mind, it is clear that when randomness is introduced into the low-levels (like, at an individual unit level) of a game, it elevates many tactics to the level of strategy. This can be good or bad, depending on your intentions with the game.

Randomness should not however come into high-levels of a game. I often debate the idea of macro-strategies, and how if done improperly, there can very easily be a rock-paper-scissors game disguised as a strategy game. This is bad, just in general.

Randomness should also be predictable. Not like, in the sense that it isn’t random, but in the sense that it shouldn’t be a 1% chance to create some uber-unfair effect. Again, depending on your intentions, the proc-rates will vary, but typically, I would claim that higher values are worse than lower values, the reason being, high proc-rates will cause players to depend on them working, and when they don’t it’ll cause a lot of frustration, while with low values, a player will find it more like it was just good luck.

For those unfamiliar with the idea, check out prospect theory. To make things brief though, marginal gains and marginal losses both decrease. Put in another way, let me present to you a classic problem. An impending disease will strike a city, and you are put in charge of choosing a plan that will protect the lives of its inhabitants. In Case 1, you are presented two options: Plan A will save 100 people. Plan B has a 50% chance to save 200 people, and a 50% chance to save none. Record your choice. In Case 2, you are again presented two options: This time, Plan A will result in the death of 100 people and Plan B has a 50% chance to result in the death of 200 people, and a 50% chance to result in the death of none. Record your choice. Now, if you are like most people, you chose Plan A in Case A and plan B in case B, despite both being equally rational choices. Here’s why: In Case 1, the value of the first 100 lives saved is to our brains worth more than the value of the next 100 lives saved. Thus, a 50% chance to double the number of lives saved doesn’t cut it. In Case 2, the value of the first 100 lives lost is worth more than the next 100 lives lost. In this case, where we want to minimize loss, taking the risk is a good idea.

What does this mean for designers? Well, for one, it means that if people are going to use something, they will prefer a 100% proc with a small effect rather than anything less than a 100% proc with a bigger effect. If they’re opponent is going to use something, they prefer the inverse. And finally, it means that testers will likely report on randomness in ways that pretty much mean nothing.

Debatably, it may make sense to not even make randomness random at all. I believe it was Civilization Revolution that completely did away with randomness altogether, and instead opted for an approach that guaranteed that a 1/x proc would proc once every x times, citing that players typically felt that the game was cheating them when abilities proc-ed repeatedly in their opponent’s favor. I personally think that if this idea can be reasonably implemented, it’s a better approach, as people tend to remember the negative experiences a lot more strongly than the positive ones. (This is scientifically verified, not just opinion. This is why when we are unsure we keep our original answers on tests rather than changing them, despite evidence that says to do the opposite. That is, we remember when we change an answer and got something wrong, and as opposed to changing an answer and getting it right.)

Randomness also removes a player’s sense of agency, which is vital in games. I can’t remember where I read it, but player’s prefer an opponent to be buffed than for them to be negatively buffed (and, I presume, the opposite for their enemies). Randomness has the same effect, and should be used sparingly.

Anyway, while you should factor all of those thoughts in, the real aim of this post was how it relates to strategy games. As said before, we want our randomness to not be so random, in order to develop sounder strategies. I would rather have a 50% proc which forces me to plan for what happens in either case, as opposed to a >50% proc in which I will count on it working, and totally be screwed, or perhaps better yet, a <50% proc, where I will count on the proc failing, and be happy when it turns out to go well.

In order to maintain senses of agency, which are of utmost importance in strategy games, randomness ought to have minimal effect, which means the range of its possible outcomes should be small. That is, if a unit is to deal on average 50 damage, it’s better to have it deal 40-60 with equal weights as opposed to 0-100 with equal weights. Furthermore, careful effort should be made to ensure that luck never (directly, or rather, noticeably) affects the outcome of a game.

Finally, let me refer back to this post on the TDG forums that I made a while ago, detailing randomness and its effects on intensity.

Written by Ceasar Bautista

2011/01/28 at 03:45