## Posts Tagged ‘**strategy**’

## The Princess and Monster game

The current assignment in my programming class (CIS 121) is to build two AIs for a variation on the classic Pursuit-Evasion game, the Princess and the Monster. I think it’s an extremely interesting problem and will present some thoughts on the game here, and would greatly appreciate any feedback or response to further guide me given how little I know about the subject.

For the uninitatiated, the Princess and Monster game goes like this: Take a connected graph of any size. Assign one node to be the location of the Monster and another to be the location of the Princess. Each turn, the Princess and Monster may move to any nodes adjacent to their respective locations. If the Monster and Princess move onto the same square, the Monster wins. Else, if the Princess can avoid the Monster for a certain amount of turns, the Princess wins.

*The game in practice varies a lot. For example, in some variants, the game is played on a continuous and bounded plane rather than a graph. Other interesting variations include revealing the starting location of each player to all of the player at the start of the game, enabling the Monster to capture the Princess if the Monster can simply get within a certain range, allowing the two player to “hear” each other if they are within a certain range, and changing the permitted speeds of each of the players.*

In particular, I’ll examine the case where the Monster wins if it moves onto a node adjacent to the Princess’ location, and where both players are notified of each other’s location if they are within three edges of each other. Furthermore, for simplicity, I’ll assume the game is played on a 10×10 grid, rather than a random undirected graph.

So some quick observations.

**The Really Simple Case**

If we simplify the situation so that the game is the pure game, with not knowledge of starting positions, no ability to hear, and no capture range, we can observe a few things. First off, so long as the monster’s strategy is non-deterministic, there is no way that the princess can guarantee she can evade capture. However, should the monster and princess choose a random-walk strategy, then given a 100 node graph, the probability that they meet is 1/100 * 1/100. So for now on, let’s just assume the princess adopts such a strategy.

Now we can make a far more interesting observation. With the princesses’ strategy in mind, we note that every time we explore a node and move on, the probability that the princess is at the node that we just explored is equal to the probability that the princess is at once of the adjacent nodes multiplied by the chance that the princess would move there. Furthermore, because the princess cannot reach any adjacent nodes from the location that we were just at, she in fact is less likely to be nearby. (Which is kind of weird, but it makes sense if you imagine us putting a dummy princess on each node, and each turn, splitting the dummies into multiple dummies, sending some out and keeping some around, so that, without the monster, each probability would remain 1.)

Given all that, the intuitive solution is to continuously move to the most probable area nearby. The problem with this approach is that it very easily leads to deterministic searches that cover only a small part of the graph. (See the video below.)

This is not say the approach is incorrect. It’s just that we aren’t looking far enough ahead.

The code used in the video above works like this. First we go through every node and score it equal to 1, representing it’s probability of 1%. Then we place the monster down, remove the 1% from where he’s standing, and distribute it to all of the other nodes evenly. Lastly, we simulate what would happen if the princess moved according to an arbtirary Markov process, which in this case is simple a 20% chance to move to any other neighbor. (For those unfamiliar, as I initially was, this preserves the chance that all nodes will equal 1% given the absence of the monster since the princess will leave the corner only 2*20% of the time when she is on a corner, and she will only enter a corner 20% of the time if she is on either of the adjacent nodes.) Specifically, this is implemented with two hashmaps. One keeps track of the current probability, and another is a copy of the first. Then we go through each node and tell it to distribute 20% of it’s current probability to each of neighbors in the copied hashmap (which is needed so as to not corrupt the flow). Finally we replace the hashmap we are using with the copy. (Note here that 20% is arbitrary. Ideally, we would use some machine learning techniques to figure out what exact matrix the princess is using. Note also that if the chance of the princess moving is 0, then the best strategy becomes to create a Eulerian path that explores each node exactly once.)

So basically, as mentioned on the Wikipedia page for Princess and Monster games, it becomes evident that simply standing still most of the time and only occasionally find a new spot to chill is a pretty viable strategy for the monster. This will kind of create a drain, with the monster being the gutter, simplifying the search from one that is completely blind to only that is only half blind. Unfortunately, I’m not at all on the specifics on how long to wait, but it seems to make some intuitive sense. (Note here that the more the princess moves, the better this strategy of waiting is. )

**Add in Capture Range**

And it’s no different obviously. The graph basically becomes relatively smaller. There are sometimes a few problem handling corners nicely, but largely it’s inconsequential.

**Add in both players knowing the start location of the other or add in hearing range**

These are both extremely interesting changes and deserve their own posts, so that I’ll give them. (But probably not until next Tuesday when we turn in our AIs. Check back then!)

## Dynamic Segmentation

In order to properly parse the input data gathered by a bot in Robocode, it’s insufficient to statically declare a number of bins beforehand and simply insert the data into the structure.

The main problem with this approach is that the data is getting subdivided way too many times to be useful. For example, if wanted to increase accuracy, we would desire to eliminate any confounding information, and therefore segment on every possible independent variable and more- the reason being that if we can assume bots are imperfect, than our goal is to find actually find flaws. This means we want to segment on a large number of variables, for example distance to target, time since start of battle, bullet power, target velocity, target acceleration, target angular acceleration, target heading, target x-distance from center, target y-distance from center, target hit points, shoot velocity, shooter acceleration, shooter angular acceleration ,shooter heading, shooter x-distance from center, shooter y-distance from center, shooter hit points, and possibly more. That’s a lot, and if we segment on all of that, it’s going to take a very long time to acquire meaningful data that we can act on.

Another problem is that by dividing information, information loses value near the divisions. This can be remedied to an extent through some information blurring, but generally it’s not a trivial problem to solve. Particularly, static divisions are particularly worrisome since there is no good way to know where to divide the data.

Fortunately, there is a solution- dynamic segmentation. The idea is to, rather than statically declaring the segmentation divisions, simply declare along which axes the data can be segmented, and then build a tree structure, which splits data when it makes sense to. This is not as simple as it seems, but to illustrate the general idea, if our enemy is moving on a clockwise spiral relative to our position moving randomly towards our away for sizable amounts of time, then while our first few shoots will all be clumped together in one node, eventually the bot should realize that by splitting the data up into ‘target moving closer’ and ‘target moving away’ that it’s shots will become more accurate. This is very cool, because the bot will generally have some pretty good guesses most of the time, and only improve with more data. Furthermore, it reduces the need to worry about any kind of covariance, since the bot will automatically detect it, being able to split anywhere- for example, rather than tracking distance to corners, the bot will eventually learn where the corners are (provided the enemy acts unusually in them) by first segmenting on target x-distance from center and then segmenting on target-y distance from center.

The problem now is to determine when is it appropriate to split. By virtue of the problem of determining whether or not a die is load, we can figure this out. Immediately, it’s apparent that small amounts of data won’t suffice. Additionally, it’s fairly clear that variance has something to do with it, since if we’re consistently hitting our target it would be rather foolish to split the data. The question is how much variance do we allow.

To that, unfortunately I’m not exactly sure. I think a chi-squared test is the solution, although from my research it seems it can get pretty involved. (Even determining whether or not a coin is biased is fairly complicated.) For now though, I just want to throw out the idea of utilizing a tree structure.

## Some Segmentation Ideas

Continuing with my earlier post on Robocode, I’d like to describe some thoughts on segmentation strategies.

So the basic idea is, we want to minimize our bins at all times. In effect, this means calculating the Maximum Escape Angle, calculating the arc formed by the intersection of the target and the circle of radius distance to target with center on our tank (that is, the orbital path our opponent could take), and then taking the ceiling of one half of the ratio (one half because a single shot will hit our opponent just as well as if hits the leftmost part of our opponent as the rightmost side).

Interestingly, we can adjust the Maximum Escape by adjusting the speed of our bullet. Given that, we can actually reduce or expand the number of bins as we desire- a useful ability when trying to maximize probability (since things are discrete, a change from N to N-1 bins can be fairly significant assuming the probability to hit is a uniform distribution at both ranges).

Furthermore, we can supplement our segmentation data by adjusting the range at which we fight, spiraling toward and away our opponent as necessary, in order to keep our targets at optimal distances.

## Wave Surfing Explained

I’ve recently reignited my interest in Robocode but this time armed with everything I’ve learned since September.

For the uninitiated, Robocode is an educational programming game, originally written with Java in mind, where players code up algorithms to guide virtual tanks in destroying each other. It’s a unique challenge, combining fields like programming, statistics, linear algebra, and game theory all into one.

Originally intended for students, the game was quickly picked up by experienced programmers and is now rather competitive. Many players use sophisticated algorithms in order to dodge enemy bullets and make sure their own find their mark.

Anyway, let me continue to explain wave surfing- but first I need to explain segmentation.

**Segmentation**

Since in Robocode neither bot can see any of the bullets, developers have to find ways to track their enemy through alternative means. Segmentation is a targeting technique that involves finding which firing angles are most likely to hit the target by dividing the continuous range of firing angles into discrete bins, and finding which bins are most popular. To implement segmentation, one needs to track each bullet with “waves”.

A wave is basically analogous to a ripple in a pond after tossing a stone into the water. We track the bullets until the “wave” hits our opponent. At that point, we can determine what the correct firing angle was at the time of firing, and subsequently increment the value in the bin which contains the correct firing angle. We then proceed to fire at the center of the most popular bin whenever we get the chance.

A primitive approach of segmentation might only segment by firing angle. However more sophisticated approaches first segment by distance, and then angle. (And even more sophisticated approaches segment by even more dimensions.) However, these improvements come at a cost of slowing down the rate of learning, but the accuracy gain is typically well worth the price.

Optimal segmentation (I think anyway) ought to reduce the number of bins down to the minimum needed to hit every possible position of the opponent. For example, an opponent at point blank range would only require one bin, since all firing angles within the escape angle would be guaranteed to hit the target. As the distance increases however, more and more bins become necessary (I believe at an exponential rate). By reducing the number of bins in this fashion, we increase learning speed, and reduce memory costs.

**Wave Surfing**

Wave Surfing is an evasive technique first developed and implemented by ABC in mid-2004. To put it succinctly, wave surfing is anti-segmentation. Every time the enemy fires a shot, we create a wave, and see whether or he not the bullet hit when the wave reaches our bot, and subsequently increment our own bin. In this way, we know what our opponent believes to be the most popular firing angle, and therefore make sure to avoid it, producing a near uniform distribution at all ranges.

**Why this is optimal**

Refer back to my earlier post on the game theoretical solution to Rock-Paper-Scissors. Basically, to do anything else is to tip our opponent off to a flaw in our strategy. If you know for example that your opponent will play Rock 2/3 of the time and Paper 1/3 of the time, your optimal strategy becomes to keep playing Paper.

**Implementation Issues**

A literal implementation of the above is still susceptible to intelligent algorithms. For example, at the start of a game or Rock-Paper-Scissors, if your opponent first plays Scissors, it would be illogical to assume that because your opponent has so far only played Scissors, it would be logical to assume he would do so again. This illustrates the problem of determining when the opponent is playing sub-optimally with an unequal probability distribution. Thankfully, statistics provides an answer. Using statistical methods to determine statistically significant variance will lead to a solution- typically, with low amount of data, statisticians say nothing can be concluded, and when more data, more confidence can be placed on inferences. A simple implementation of confidence intervals ought to be sufficient.

## Randomness and Strategy Games

At first, randomness and strategy games would seem like unlikely partners. Strategy games emphasize the role of information in making decisions, while randomness is chaotic, and undermines planning. Despite their differences however, when used properly they can combine nicely.

First off, let me begin with some terminology. When I refer to **tactics**, I refer to a series of maneuvers designed to accomplish a specific task. In other words, tactics are the little tricks that you can calculate in your head. When I refer to **strategical moves**, I refer to moves that are made in the absence of calculation, and instead guided by heuristics (in particular, experience).

With those definitions in mind, it is clear that when randomness is introduced into the low-levels (like, at an individual unit level) of a game, it elevates many tactics to the level of strategy. This can be good or bad, depending on your intentions with the game.

Randomness should not however come into high-levels of a game. I often debate the idea of macro-strategies, and how if done improperly, there can very easily be a rock-paper-scissors game disguised as a strategy game. This is bad, just in general.

Randomness should also be predictable. Not like, in the sense that it isn’t random, but in the sense that it shouldn’t be a 1% chance to create some uber-unfair effect. Again, depending on your intentions, the proc-rates will vary, but typically, I would claim that higher values are worse than lower values, the reason being, high proc-rates will cause players to depend on them working, and when they don’t it’ll cause a lot of frustration, while with low values, a player will find it more like it was just good luck.

For those unfamiliar with the idea, check out prospect theory. To make things brief though, marginal gains and marginal losses both decrease. Put in another way, let me present to you a classic problem. An impending disease will strike a city, and you are put in charge of choosing a plan that will protect the lives of its inhabitants. In Case 1, you are presented two options: Plan A will save 100 people. Plan B has a 50% chance to save 200 people, and a 50% chance to save none. Record your choice. In Case 2, you are again presented two options: This time, Plan A will result in the death of 100 people and Plan B has a 50% chance to result in the death of 200 people, and a 50% chance to result in the death of none. Record your choice. Now, if you are like most people, you chose Plan A in Case A and plan B in case B, despite both being equally rational choices. Here’s why: In Case 1, the value of the first 100 lives saved is to our brains worth more than the value of the next 100 lives saved. Thus, a 50% chance to double the number of lives saved doesn’t cut it. In Case 2, the value of the first 100 lives lost is worth more than the next 100 lives lost. In this case, where we want to minimize loss, taking the risk is a good idea.

What does this mean for designers? Well, for one, it means that if people are going to use something, they will prefer a 100% proc with a small effect rather than anything less than a 100% proc with a bigger effect. If they’re opponent is going to use something, they prefer the inverse. And finally, it means that testers will likely report on randomness in ways that pretty much mean nothing.

Debatably, it may make sense to not even make randomness random at all. I believe it was Civilization Revolution that completely did away with randomness altogether, and instead opted for an approach that guaranteed that a 1/x proc would proc once every x times, citing that players typically felt that the game was cheating them when abilities proc-ed repeatedly in their opponent’s favor. I personally think that if this idea can be reasonably implemented, it’s a better approach, as people tend to remember the negative experiences a lot more strongly than the positive ones. (This is scientifically verified, not just opinion. This is why when we are unsure we keep our original answers on tests rather than changing them, despite evidence that says to do the opposite. That is, we remember when we change an answer and got something wrong, and as opposed to changing an answer and getting it right.)

Randomness also removes a player’s sense of agency, which is vital in games. I can’t remember where I read it, but player’s prefer an opponent to be buffed than for them to be negatively buffed (and, I presume, the opposite for their enemies). Randomness has the same effect, and should be used sparingly.

Anyway, while you should factor all of those thoughts in, the real aim of this post was how it relates to strategy games. As said before, we want our randomness to not be so random, in order to develop sounder strategies. I would rather have a 50% proc which forces me to plan for what happens in either case, as opposed to a >50% proc in which I will count on it working, and totally be screwed, or perhaps better yet, a <50% proc, where I will count on the proc failing, and be happy when it turns out to go well.

In order to maintain senses of agency, which are of utmost importance in strategy games, randomness ought to have minimal effect, which means the range of its possible outcomes should be small. That is, if a unit is to deal on average 50 damage, it’s better to have it deal 40-60 with equal weights as opposed to 0-100 with equal weights. Furthermore, careful effort should be made to ensure that luck never (directly, or rather, noticeably) affects the outcome of a game.

Finally, let me refer back to this post on the TDG forums that I made a while ago, detailing randomness and its effects on intensity.

## The Nature of Imbalance

Before I can explain the combat effectiveness of a heterogeneous group, I need make sure the reader understands a few things first.

First, recall that in homogeneous groups, every additional unit increases the strength of the group exponentially.

Recall also that a rational player will target enemy units according to their Offense to Defense ratio, the highest of which will be targeted first.

With that in mind, I can explain.

Let’s consider the simple case of a group of two units, one unit with 1 Offense (O) and 1 Defense (D), and the other with sqrt(3) O and sqrt(3) D. Since both units in the group have the same Offense to Defense ratio, the order in which our enemy targets our units is unimportant and therefore we can calculate the strength of our group against the Universal Unit to get a K-value for our group. That said, it comes out to be (1 + sqrt(3))*1+sqrt(3)*sqrt(3)=4+sqrt(3) ~= 5.73K.

Notice though, that the unit with sqrt(3) Offense and sqrt(3) Defense has a K-value of 3, which means it worth exactly twice as much the other unit. That is to say, the value of our group is $1 + $2 = $3.

However, notice that $3 worth of 1O, 1D units produces an effective strength of 1+2+3=6K. This suggests that my previous suggestion of how to price units is not adequate.

Things break down even more when the Offense to Defense ratios are different. If a player rationally targets the unit with the highest Offense to Defense ratio, he will make heterogeneous groups even less effective. Consider for example, another group worth $3, except this time composed of a unit with 1 Offense and 1 Defense and a unit with 1 Offense a 3 Defense. In this case, the 1 Offense, 1 Defense unit is more threatening, and so with that unit targeted first the group’s strength comes out to a measly (1+1)*1+1*3=4K. Should the player have done otherwise, he would improved the group’s strength to an impressive (1+1)*3+1*1=7K.

This has something interesting implications. What we are basically saying here is that the effective strength for a two unit group is given by,

(O1+O2)D1+O2D2 = D1O1 + D1O2 + D2O2 or (O1+O2)D2+O1D1 = D1O1 + D2O1 + D2O2

What is important here in the middle factor, the D1O2. Typically, if Unit 1 and Unit 2 were the same price we would want to D1=O2, since if D1!=O2 then D2!=O1, and since our opponent gets to choose which unit he will target first, he will always pick the unit that makes the middle term smallest. Thus we try to make D1O2=D2O1, or equivalently D1/O1=D2/O2. Therefore, homogeneous groups are typically the most effective.

However, it is sometimes possible to force our enemy to attack our units in the way that *we* want *him* to. In this case, we want very heterogeneous units. In fact, if we could have it, we would have our units as different as possible- give one with infinite Offense and infinitesimal Defense and the other infinite Defense and infinitesimal Offense, and we get a middle term that equals to infinity^2!

As we can see, prices cannot possibly be determined by looking at stats alone. In fact, prices for units are completely arbitrary- fine-tuned to make the game play as the designer intended. Likewise, the strength of units cannot be measured, since units rarely fight alone or in purely homogeneous groups. Only so long as prices keep the decisions interesting then they are completely left to the designer’s discretion.

## Research

So a recent post on the TDG forums got me thinking about the value of upgrades and research, and it’s really rather simple, but I think worth saying anyhow since they are so common and to amateurs often a bit bizarre.

**Research**

Research enables new the construction of new units and technology can often be researched in a variety of ways. New units can often be unlocked by simply building or upgrading structures, and upgrades or new abilities can often be unlocked via researching the technology directly at a building.

However, the difficult part is really just determining if it’s really worth it to buy many of these upgrades or research the technology. Sometimes the choice can be simple- If for example, a player already owns a hundred soldiers and researching an attack upgrade is cheap, it’s a no brainer. The player will get his money’s worth as soon as the research is completed.

If however, a player is spending resources to simply unlock the *ability* to construct a new unit (or alternatively use a new ability) the price at first is a little more difficult to figure out. If we assume that the new units is more cost effective than existing units, the player could simply determine if he will reasonably build enough units to get his money’s worth. If for example, the current tier of units get a 100:10 strength to resource ratio, and the new tier would get a 200:10 strength to resource ratio, but initially costs 200 resources, then to make those lost resources back, the player would have to purchase 10 of the new tier units. Beyond that is a profit in strength.

It’s in my opinion though, that having such strict tiers, with each new tier nullifying old ones, is often bad design. (Sometimes, as in the case of Tower Defense War games this is necessary.) The alternative approach, as appears to be most common in RTS games, is to make units cost effective in certain scenarios, say good at defeating groups of units via splash or effective at combating biological units via poison, such that no unit is truly superior to any other, except in certain scenarios. It then becomes the player’s job to guess at how often the player will be able to effectively use unlocked units, and if the benefits outweigh the costs, then to purchase the technology.

To give an example- Consider you are playing StarCraft. You are the Terran (an army of gritty future humans) playing against the Zerg (an alien army that has evolved to kill). The game begins, and you start building Marines, the basic component of early Terran armies. Very quickly, the option to purchase Medics arrives, which are extremely useful units when the opponent is unable to kill your Marines quickly. Knowing that the Zerg only has access to Zerglings and Hydralisks, both of which are relatively fragile and also unable to overwhelm the healing effects of Medics, it’s a logical decision to buy them. The benefits greatly outweigh the costs, and you will make up the money spent on the research very quickly. You will not however, proceed to buy Firebats, because despite their splash attack, which is effective against clustered enemies, their short range makes them particularly weak against Hydralisk, which your enemy will have many of, and therefore, your Firebats will always be fighting in situations where they are not cost effective, and they will not make up the money spent on research.

As designers, this effectively means that the price of research is effectively arbitrary. Higher costs however, will mean that the research will much less often be purchased, since players will only see it as useful if they can pretty much guarantee that the unlocked units will be able to be deployed in cost effective circumstances.

Research is kind of interesting because players will shape their armies based on the conditions they are fighting in. Based on the map, it may or may not be cost effective to purchase certain units. For example, ground units lose a lot of their effectiveness of island maps. Additionally, the race of the opponent will have to be factored in. For example, Science Vessels are pretty much a must have against Zerg, with their ability to Irradiate enemy units and detect Lurkers. And perhaps even more interestingly, the units that you have already unlocked has to be considered. Certain units may work extremely well together, such as Marines and Medics.

So basically, research is kind of the depth aspect of RTS games. It takes experience to figure out what’s worth researching and what’s not, and these decisions, strategically, are often the most important, and really fuel the interesting interplay behind the RTS.

Rethinking research in the frame of context of group combat, it occurred to me that beyond the points covered in my original post on research, the real value of research comes from the ability to construct more specialized units, which in turn result in stronger heterogeneous groups. From this perspective, I would claim that in any well made RTS, the lowest tiered units would be largely unspecialized- namely, stats that are effective for 1v1 combat and as more research permits the creation of more units, the new units become more and more specialized. We see this in Starcraft- Terrans have the well rounded Marines, Zerg have well rounded Hydralisks, and Protoss typically produces well rounded combinations of Zealots and Dragoons.