User:Tarkisflux/Rants/Probability and Averages

New blog post time. I recently realized that a lot of people have a hard time with averages in conjunction with probability, or just aren't as familiar with the terms and how to apply the concepts as others here are. So I’m going to write about it for a while. Let’s do some work on the individual terms first though, for those who haven’t thought about this stuff at all, and then throw down some increasingly complicated examples.

Probability
Probability is a pretty straightforward word, and just means “the chance that a particular result will happen in response to a particular event”. This chance can be as big as 100% or as low as 0%, and any number in between. And I really mean ''any'' number. You could have a 0.0000000001% chance of something happening, or even a much much much smaller one, and it would be a valid probability. Since ‘something’ happens in response to an event all the time, even if that ‘something’ is ‘nothing’, all of your individual probabilities have to add up to 100%. If they don’t add up to 100%, you have a set of ''undefined'' possibilities and those are bad. It’s not that nothing happens in those cases, it’s that you don’t know what happens in those cases and that makes prediction impossible.

Most talk about probability on the wiki refers to die rolls and ignores the more abstract stuff. Rolling a single die, like a d20, will return one of 20 possible results. If the die is fair, we know that each result is equally likely and so the probability of any individual result is 100% / # of results, 20 in this case, which gives us 5% for any individual result. This works for any single fair die, regardless of size. If the die were not fair, we couldn’t assume equal probability for each result and we would have to resort to actual hard math or lots and lots and lots of trials to get the probability of each possible result.

Multiple die rolls, like for damage rolls in DnD or in a 3d6 system, are a bit trickier. We’ll use 2d4 as an example. Your results range from 2 to 8, and while there is only one way to get a 2 (roll a 1 and 1), for values like 5 there is more than one way to get it. You could roll 1 and 4, 2 and 3, 3 and 2, or 4 and 1 and still wind up with 5 as a result even though those are 4 different rolls (this is especially clear if the dice are different colors). What this means is that getting a 5 has a higher probability than getting a 2. There are two basic ways to get probabilities for each possible result. You can sort out a math equation that just gives you the number of combinations that would give you the result, called a counting function, but that takes some effort. The other thing you can do is brute force it and just write out all of the possibilities, then see what portion of them give you back a given result. I chose 2d4 for the example here so I could do this without it getting really really long.

Die 1	Die 2	Result
1	1	2
1	2	3
1	3	4
1	4	5
2	1	3
2	2	4
2	3	5
2	4	6
3	1	4
3	2	5
3	3	6
3	4	7
4	1	5
4	2	6
4	3	7
4	4	8

Ok, that’s the brute force list. There are 16 possible results that correspond to 7 different values. You get a 5 four out of the 16 times, or 25% of the time. Hence, 25% is the probability of getting a 5 on 2d4. You get a 4 only 3 of 16 times, or 18.75% of the time. And so on. For comparison, if we were looking at 3d6 there are 216 individual results that correspond to values from 3 to 18. This sort of thing is easy to do with spreadsheets or simple computer code, but it’s not very pretty to look at on a page. And that’s as much probability as we need to worry about for now.
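For anyone who would rather let a computer do the brute forcing, here is a minimal Python sketch of the same listing. The function name `sum_distribution` is made up for this example, and it assumes fair dice that all have the same number of sides.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def sum_distribution(num_dice, sides):
    """Map each possible total to its exact probability (hypothetical helper).

    Assumes num_dice fair dice, each numbered 1..sides.
    """
    # Every ordered roll, e.g. (1, 4) and (4, 1) are listed separately.
    rolls = list(product(range(1, sides + 1), repeat=num_dice))
    # Count how many ordered rolls produce each total.
    counts = Counter(sum(roll) for roll in rolls)
    return {total: Fraction(count, len(rolls))
            for total, count in sorted(counts.items())}


dist_2d4 = sum_distribution(2, 4)
for total, prob in dist_2d4.items():
    print(total, f"{float(prob):.2%}")
```

Calling `sum_distribution(3, 6)` walks the 216 individual 3d6 results the same way.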

One last thing about probability before we move on. If something can happen on multiple results, like a to-hit roll meaning a hit on every number equal to or above your target number, then the odds of that happening are just the sum of the odds of getting each of the individual triggering values. This isn't particularly useful in a flat system like d20, but is extremely useful in a bell-curved system like 3d6 where the different numbers are differently likely to come up.
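As a rough sketch of that "add up the triggering values" idea, the Python below sums the probability of every total at or above a target number. The names `sum_distribution` and `chance_at_least` are invented for this example, and fair dice are assumed.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def sum_distribution(num_dice, sides):
    """Exact probability of each total on num_dice fair dice (hypothetical helper)."""
    rolls = list(product(range(1, sides + 1), repeat=num_dice))
    counts = Counter(sum(roll) for roll in rolls)
    return {total: Fraction(count, len(rolls)) for total, count in counts.items()}


def chance_at_least(target, num_dice, sides):
    """Add up the odds of every total that meets or beats the target."""
    dist = sum_distribution(num_dice, sides)
    return sum((p for total, p in dist.items() if total >= target), Fraction(0))


# Same target number, very different odds on a flat die and a bell curve.
print(float(chance_at_least(13, 1, 20)))  # d20
print(float(chance_at_least(13, 3, 6)))   # 3d6
```

On the d20 every step of the target number is worth the same 5%, while on 3d6 the size of each step depends on where you are on the curve.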

“Average”
Average can mean a lot of different things depending on who is using it and the context it’s found in. When we’re talking about it in games, we’re probably using it in one of two different ways and either talking about the mode or the mean.

The mode comes up more in actual games when people are talking about how likely they are to do something and is more of an “on average…” or “I’ll probably” construction, as in “I’ll probably hit him if I attack him”. It’s useful for knowing what the most likely result is, but that’s about it. It doesn’t tell you how you should expect things to go given enough attempts, just how most of those attempts will turn out.

The mean (just mean from now on) on the other hand tells you what your results will work out to per attempt, given enough repeated attempts. When people here talk about an average, that’s the one they’re referring to. There are several types of mean though. The arithmetic mean is the old “add them all up and divide by the number of things you added together”, and it is equivalent to multiplying each value by the probability of getting that value and adding up the products, where the probabilities are all equal. If the probabilities are not equal it means that some values show up more than once in the possible results, like in our 2d4 example above. Averaging those is a straightforward weighted mean, and the weights in this case are exactly equal to the probability of getting each value. The math to prove that is pretty straightforward algebra that I will do if anyone asks, but I’m going to skip it for now because this is already technical enough. Because the weighted mean is weighted with probabilities, that’s the one that people use here.

So when I use “average” from here out, I’m referring to the number that you get when you multiply each result by its probability and add them all together, or weighted mean.
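Here is a small Python sketch of that definition using the 2d4 numbers from the brute force table. The names `prob_2d4` and `weighted_mean` are just labels picked for this example. It also checks that the plain "add all 16 results and divide by 16" version lands on the same number.

```python
from fractions import Fraction

# Probability of each 2d4 total, read off the brute force table (counts out of 16).
prob_2d4 = {
    2: Fraction(1, 16),
    3: Fraction(2, 16),
    4: Fraction(3, 16),
    5: Fraction(4, 16),
    6: Fraction(3, 16),
    7: Fraction(2, 16),
    8: Fraction(1, 16),
}


def weighted_mean(distribution):
    """Multiply each result by its probability and add the products together."""
    return sum(value * prob for value, prob in distribution.items())


# Arithmetic mean over all 16 equally likely rolls, for comparison.
all_rolls = [a + b for a in range(1, 5) for b in range(1, 5)]
arithmetic_mean = Fraction(sum(all_rolls), len(all_rolls))

print(weighted_mean(prob_2d4), arithmetic_mean)
```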

Minor side note, there is a nice property about averages that is worth pointing out. The average of a sum of different sets is equal to the sum of the averages of the different sets. As an example, the average of 1d6 + 1d8 is equal to the average of 1d6 + the average of 1d8. This is going to be really important anytime you need to add stuff together, like in damage rolls, because it means we can treat the damage list as a 1d6 + 1d6 + 1d6 +… instead of trying to figure out the average of Nd6 or whatever.
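If you don't want to take that property on faith, a quick Python check is below. It is only a sketch: `die_average` and `brute_force_average` are names invented here, and the dice are assumed to be fair.

```python
from fractions import Fraction
from itertools import product


def die_average(sides):
    """Average of one fair die numbered 1..sides."""
    return Fraction(sides + 1, 2)


def brute_force_average(*dice):
    """Average total of several fair dice, found by listing every roll."""
    rolls = list(product(*(range(1, sides + 1) for sides in dice)))
    return Fraction(sum(sum(roll) for roll in rolls), len(rolls))


# 1d6 + 1d8 the long way, and as the sum of the two separate averages.
print(brute_force_average(6, 8), die_average(6) + die_average(8))
```

The shortcut matters more as the dice pile up, since the brute force list for something like 10d6 has over 60 million entries while the sum of averages is still one line.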

Damage Per Round
So, let’s say you wanted to know what the average damage you dealt per round with your attacks was. Since the average of a few things that add together is the sum of their individual averages, we can focus on a single attack and just add extra ones to the total as needed. When you go to deal damage with an attack, the first thing you need to do is roll a d20. And there are 4 possible results when that happens: you miss, you score a normal hit, you threaten but do not confirm a critical hit (normal hit), and you threaten and confirm a critical hit. Since we’re taking an average, we need probabilities for each of these outcomes.

On any given attack roll there is going to be some minimum number that will cause you to score a hit, and if you roll beneath that number you’re going to miss. The odds of you rolling that exact number, or any other number on the die, are 5%. If you need to roll at least a 13 to hit, you also hit on 14-20. There are a total of 8 numbers in that range, each with a 5% chance of you rolling them, so your total probability of hitting is 40%. In general, your probability of hitting on any attack is (21 - the minimum number you need to roll to hit) x 5%. Since that is the probability of you hitting a guy, this is the weight that we multiply your damage by to get an average. We will call this number %Hit.

But some of those rolls will result in a critical hit. You threaten a critical hit when you roll a number on the d20 that is within your critical threat range. If your threat range is 18 to 20, there are three numbers that cause you to threaten a critical which gives you a 15% chance. In general, your probability of threatening a critical is (21 – minimum threat range number OR minimum hit number [whichever is higher]) x 5%. And we’ll call this number %Threat. It’s possible that this number and %Hit are the same in some edge cases, and that’s not a big deal.
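The two formulas so far fit in a couple of lines of Python. This is a sketch under the same assumptions as the text: the function names are invented for the example, the minimum numbers are assumed to fall between 1 and 20, and no special handling of natural 1s or 20s is included.

```python
from fractions import Fraction


def hit_chance(min_hit):
    """%Hit: (21 - minimum roll needed to hit) x 5%, kept as an exact fraction."""
    return Fraction(21 - min_hit, 20)


def threat_chance(min_threat, min_hit):
    """%Threat: uses whichever is higher, the threat range minimum or the hit minimum."""
    return Fraction(21 - max(min_threat, min_hit), 20)


print(float(hit_chance(13)))         # need a 13 to hit
print(float(threat_chance(18, 13)))  # threat range 18 to 20
print(float(threat_chance(18, 19)))  # only hit on 19+, so only 19 and 20 threaten
```

The last line is the edge case mentioned above, where %Threat and %Hit come out the same.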

After you threaten a critical hit, you need to confirm it with a second to-hit roll, one that has the same odds as the original to-hit roll. So we’ll use %Hit again for success here. Rather than invent a new term for missing, we’ll take advantage of the fact that %Hit includes the threatened values, and the only thing not included is the values on which you miss. So we can write the chance of failing to confirm as (1 - %Hit), which makes the threatened but not confirmed probability %Threat x (1 - %Hit).

That’s enough to start constructing an equation for the average damage. What does it look like?

Average Damage = (1 - %Hit) x 0 + (%Hit - %Threat) x Normal + %Threat x (1 - %Hit) x Normal + %Threat x %Hit x Critical

I want to make one more distinction. When you succeed on a critical hit, you get your normal stuff + some bonus stuff. And if we write that as Normal + Bonus, we get:

Average Damage = (1 - %Hit) x 0 + (%Hit - %Threat) x Normal + %Threat x (1 - %Hit) x Normal + %Threat x %Hit x (Normal + Bonus)

And when you multiply that out and do some algebra on it you get:

Average Damage = %Hit x (Normal + %Threat x Bonus)
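If the algebra feels too good to be true, the Python sketch below compares the four-outcome version with the collapsed version for every minimum hit number and threat range minimum from 2 to 20. The function names and the placeholder damage values of 14 are choices made for this example only.

```python
from fractions import Fraction


def average_damage_long(p_hit, p_threat, normal, bonus):
    """The four-outcome equation, one term per outcome."""
    miss = (1 - p_hit) * 0
    plain_hit = (p_hit - p_threat) * normal
    unconfirmed = p_threat * (1 - p_hit) * normal
    confirmed = p_threat * p_hit * (normal + bonus)
    return miss + plain_hit + unconfirmed + confirmed


def average_damage_short(p_hit, p_threat, normal, bonus):
    """The collapsed equation: %Hit x (Normal + %Threat x Bonus)."""
    return p_hit * (normal + p_threat * bonus)


# Collect every combination where the two versions disagree (expected: none).
mismatches = []
for min_hit in range(2, 21):
    for min_threat in range(2, 21):
        p_hit = Fraction(21 - min_hit, 20)
        p_threat = Fraction(21 - max(min_hit, min_threat), 20)
        long_form = average_damage_long(p_hit, p_threat, 14, 14)
        short_form = average_damage_short(p_hit, p_threat, 14, 14)
        if long_form != short_form:
            mismatches.append((min_hit, min_threat))

print("mismatches:", mismatches)
```

Exact fractions are used instead of floats so that the comparison isn't thrown off by rounding.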

Really.

To get further we need to stick values in there. The %Hit and %Threat values are going to come from the formulas above.

We also know what the Normal and Bonus terms look like. Normal is pretty easy. It’s going to look something like 1d6+3, but it may have a bunch of extra stuff on there as well. Since each damage die is just added together, we can just take their individual averages and add those together as well. For 1d6 we just add up its possible values, 1 + 2 + 3 + 4 + 5 + 6 = 21, and then divide by the number of them for its average, 21 / 6 = 3.5. So if you had a greatsword and dealt 2d6+7 damage on a hit, you would average 3.5 + 3.5 + 7 = 14 damage. If you had sneak attack, you would deal an extra 3.5 average damage per die. And so on.

The Bonus term is going to be anything you get ''in addition'' to your normal hit, which means it can’t include your normal hit. If you have a x2 crit, this means that you just add one more weapon roll and set of bonuses, since half of the x2 is your normal damage and already included above. Similarly, if you have a x3 you only add two more weapon rolls. You also add in any sort of on-crit effects, and ignore anything that you don’t get on a crit.

And once you have all those %s and damage values, you just stick them into the equation and math them. If you have multiple hits you want to figure out, get the numbers for each hit and do them in their own equation, and then add them all together. What you get at the end is your total average damage per hit / round / hour / whatever. As long as you keep adding terms, it’s pretty extensible.
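As a worked example, here is a Python sketch that runs the 2d6+7 greatsword through the equation for two attacks in a round. Everything specific in it is an assumption made for illustration: a 19 to 20 threat range with a x2 crit, a first attack that needs a 13 to hit, a second attack that needs an 18, and the helper names themselves.

```python
from fractions import Fraction


def dice_average(count, sides):
    """Average of count fair dice, each numbered 1..sides."""
    return count * Fraction(sides + 1, 2)


def attack_average(min_hit, min_threat, normal, bonus):
    """Average Damage = %Hit x (Normal + %Threat x Bonus) for one attack."""
    p_hit = Fraction(21 - min_hit, 20)
    p_threat = Fraction(21 - max(min_hit, min_threat), 20)
    return p_hit * (normal + p_threat * bonus)


normal = dice_average(2, 6) + 7  # 2d6+7 averages 14
bonus = normal                   # x2 crit (assumed): one more weapon roll and set of bonuses

first_attack = attack_average(13, 19, normal, bonus)
second_attack = attack_average(18, 19, normal, bonus)
round_total = first_attack + second_attack

print(float(first_attack), float(second_attack), float(round_total))
```

Each attack gets its own equation and the results are simply added, which is the "sum of the averages" property doing the work.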

Average Rounds Feared
For our next example, we’re going to determine the average number of rounds spent suffering any fear condition as a result of the shake resolve ability of my ToP Intimidation. I’m choosing it because there are a lot of different results here because of the built-in scaling, and it will nicely illustrate how to do an average when faced with multiple different outcomes.

So, there’s a base DC, and if you roll under it your target is not shaken at all. If you roll the DC or one better, the target is shaken for 2 rounds. If you roll the DC +2 or DC +3, the target is shaken for 3 rounds. If you roll the DC +4 the target is shaken for 4 rounds. If you roll the DC +5 or DC +6, you get 5 rounds of fearedness. And so on. You need to add up all of those individual “number of rounds” results, multiplied by the chances of getting it, and then you have your average. Each result can happen on no more than two values of the die, depending on what your DC is (since there is a maximum value on the die that cuts off our possibilities) and whether you are next to the higher results tier or not.

The brute force average looks like:

[(DC - 1) x 0 + 2 + 2 + 3 + 3 + 4 + 5 + 5 + 6 + 6 + 7 + 8 + 8 + ...] / 20

And at some point, after (21 - DC) non-zero terms, you stop adding terms, because you’d have to roll over 20 to get them. For example, if the DC was 9 we would stop at the “...” above.
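The brute force sum is easy to hand to a computer. The Python sketch below only uses the twelve non-zero terms written out in the sum, so it refuses any DC below 9 rather than guess at what comes after the “...”. Here "DC" means the number you need to show on the d20, and the names are invented for the example.

```python
from fractions import Fraction

# Rounds shaken for rolling DC+0, DC+1, DC+2, ... exactly as listed in the sum.
ROUNDS_BY_MARGIN = [2, 2, 3, 3, 4, 5, 5, 6, 6, 7, 8, 8]


def average_rounds_brute(dc):
    """Add up the rounds for all 20 die faces and divide by 20."""
    if dc < 9:
        raise ValueError("only the twelve listed terms are known, so DC must be 9 or more")
    total = 0
    for roll in range(1, 21):
        margin = roll - dc
        if margin >= 0:  # rolls under the DC contribute 0 rounds
            total += ROUNDS_BY_MARGIN[margin]
    return Fraction(total, 20)


print(float(average_rounds_brute(9)))
print(float(average_rounds_brute(15)))
```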

The non-brute force average takes advantage of the fact that each of the higher number of round results contains the lower results in it, similarly to our critical hit example above. So we can split the results up and regroup them, and get this instead:

[(21 - DC) / 20] x 2 + [(21 - (DC + 2)) / 20] + [(21 - (DC + 4)) / 20] + [(21 - (DC + 5)) / 20] + [(21 - (DC + 7)) / 20] + ...
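Here is a Python sketch of the regrouped version, checked against the brute force sum. It reads the points where an extra round kicks in (DC, DC+2, DC+4, DC+5, DC+7, and so on) off the twelve listed brute force terms, and it drops any term that would need a roll over 20. The names are invented for this example and DCs below 9 are refused because the listed terms run out.

```python
from fractions import Fraction

# Rounds shaken for rolling DC+0, DC+1, DC+2, ... as listed in the brute force sum.
ROUNDS_BY_MARGIN = [2, 2, 3, 3, 4, 5, 5, 6, 6, 7, 8, 8]


def average_rounds_brute(dc):
    """Brute force: rounds for every face from the DC up to 20, divided by 20."""
    return Fraction(sum(ROUNDS_BY_MARGIN[roll - dc] for roll in range(dc, 21)), 20)


def average_rounds_regrouped(dc):
    """Add the odds of reaching each point where more rounds are gained."""
    if dc < 9:
        raise ValueError("only the twelve listed terms are known, so DC must be 9 or more")
    average = Fraction(0)
    previous = 0
    for margin, rounds in enumerate(ROUNDS_BY_MARGIN):
        gained = rounds - previous        # 2 at the DC itself, then 1 or 0
        previous = rounds
        faces = 21 - (dc + margin)        # die faces that reach DC + margin
        if gained and faces > 0:          # skip terms that need a roll over 20
            average += Fraction(faces, 20) * gained
    return average


for dc in (9, 15, 20):
    print(dc, float(average_rounds_regrouped(dc)), float(average_rounds_brute(dc)))
```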

Where you wind up adding the odds of getting an additional round for each possible addition that you could roll, and drop any term that would need a roll over 20. Which is a bit more extensible than the brute force version, but still doesn't collapse into anything nice looking. If the equation were smoother, or the same results happened on more than a couple of numbers, it would collapse to something more usable. If we pretend that it was smooth and took the average value on a success (halfway between the minimum of 2 and a rough maximum of 2 + (21 - DC) / 2) then multiplied by the odds of getting a success, we get a pretty close equation of [(2 + 2 + (21 - DC) / 2) / 2] x (21 - DC) / 20. At DC 9 that gives 3.0 against an exact 2.95. But if you want accurate you need to actually do the sum and divide.

This is the sort of thing I would break out a spreadsheet and a bunch of conditionals for and plug in lots of different DC numbers so that it was easier to see trends over various DCs. Even if you didn’t want to do that it’s not a hard thing to figure out by hand, it’s just a time consuming one.
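A loop can stand in for the spreadsheet. The Python sketch below prints the exact average next to a smooth estimate for every DC from 9 to 20. It rests on a few assumptions: the estimate is read as the midpoint between the minimum of 2 rounds and a rough maximum of 2 + (21 - DC) / 2, multiplied by the odds of success; only the twelve listed brute force terms are used, so DCs under 9 are left out; and the function names are made up for the example.

```python
from fractions import Fraction

# Rounds shaken for rolling DC+0, DC+1, DC+2, ... as listed in the brute force sum.
ROUNDS_BY_MARGIN = [2, 2, 3, 3, 4, 5, 5, 6, 6, 7, 8, 8]


def exact_average(dc):
    """Exact average rounds: sum over the faces that meet the DC, divided by 20."""
    return Fraction(sum(ROUNDS_BY_MARGIN[roll - dc] for roll in range(dc, 21)), 20)


def smooth_estimate(dc):
    """Midpoint of the lowest and (roughly) highest result, times the odds of success."""
    successes = 21 - dc
    midpoint = (2 + (2 + Fraction(successes, 2))) / 2
    return midpoint * Fraction(successes, 20)


print("DC  exact  estimate")
for dc in range(9, 21):
    print(f"{dc:2d}  {float(exact_average(dc)):.3f}  {float(smooth_estimate(dc)):.3f}")
```

Under that reading the estimate runs a little high across this range, which fits the "close, but do the sum if you want accurate" advice.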

Parting Stuff
This is long long longity long, so we’ll leave it here. If anyone has examples they’d like to see feel free to leave them on talk. I don’t guarantee I’ll do them though ;-).