Difference between revisions of "Dungeons and Dragons Wiki talk:The Same Game Test"

From Dungeons and Dragons Wiki
(What does the hallway full of runes actually do?)
:::::::::: Yeah, good luck getting anyone who's not a sock puppet to agree with that. --[[User:Ghostwheel|Ghostwheel]] ([[User talk:Ghostwheel|talk]]) 15:41, 3 September 2019 (MDT)
:::::::::: Hallway full of magical runes is: can you disable or somehow bypass magical traps? Then you pass. You can't? Then you (probably) fail. Not sure why you're having difficulty with this particular thing. [[User:Surgo|Surgo]] ([[User talk:Surgo|talk]]) 19:29, 3 September 2019 (MDT)

Latest revision as of 01:29, 4 September 2019

"Using Your Results" Word-age

Should the levels of balance be re-written from "monk, fighter, rogue, wizard" in the last section of the SGT article to "low, medium, high, very high" or are they preserved for a reason? --YouLostMe (talk) 00:30, 13 March 2013 (UTC)

It should have been updated already. Feel free to beat me to it :-) - Tarkisflux Talk 02:19, 13 March 2013 (UTC)


How can we be certain the Same Game Test is accurate, when D&D's balance point is a party of four of mixed class? --Jonathan Drain 12:02, September 20, 2009 (UTC)

Taken from somewhere else I wrote, but relevant here:
Challenge Ratings for NPCs:

An NPC with a PC class has a Challenge Rating equal to the NPC’s level.

—Dungeon Master's Guide, page 37
The same thing means that a level X PC should be a CR X creature. So a level 5 Barbarian should be a CR 5 creature. And a level 8 Monk should be a CR 8 creature. And a level 7 Wizard should be a CR 7 creature. With me so far? I didn't go too fast? Alright, let's move on.
Two creatures of the same CR are supposed to be at around the same level of power, and when they directly face each other with neither having an overwhelming advantage (like fighting in a pit full of lava against a fire elemental), the victor should be completely random. That is, on average, the victor between the two creatures should be each of the creatures, 50% of the time. With trial combats between the two, one creature should win half the time, and the other should win half the time. Why? Because they're supposed to both be of equal power. This means that the Same Game Test attempts to balance classes according to the DMG.
No, a single monster of a CR equal to the average level of a party is not supposed to be a major threat.
Challenge Rating

This shows the average level of a party of adventurers for which one creature would make an encounter of moderate difficulty. Assume a party of four fresh characters (full hit points, full spells, and equipment appropriate to their levels). Given reasonable luck, the party should be able to win the encounter with some damage but no casualties. For more information about Challenge Ratings, see pages 36 and 48 of the Dungeon Master’s Guide.

—Monster Manual, page 7
A single monster of a specific Challenge Rating when faced by itself has an Encounter Level about equal to its Challenge Rating.
—Table 3-1, Dungeon Master's Guide, page 49
The average adventuring group should be able to handle four challenging encounters [of an Encounter Level equal to the party level] before they run low on spells, hit points, and other resources.
—Dungeon Master's Guide, page 50
This means that if a party faces a single creature with a CR equal to the party level, they should expend approximately 20-25% of their resources for the day. That's not a very hard challenge, and virtually none of the PCs risks dying. No, a whole party should not win only 50% of the time against a monster that has a CR equal to their party level. It should be a breeze for them. Only when a character is alone against a monster of his CR should he have a chance of dying, and then 50% of the time, since it comes down purely to the dice on who wins and who dies (since the PC is supposed to be of the same CR as the monster, by the book).
Thus, a PC who passes the Same Game Test is balanced according to the DMG. However, not all PC classes are balanced. Monks and Fighters are usually far weaker than creatures of equal CR. Wizards, Druids, and Clerics are often far stronger than creatures of equal CR. The page you linked in the article that this talk page belongs to tries to explain that. So... yeah. Not only is the "system" right, but it doesn't design anything. Instead, it attempts to detail out the power levels of certain things. It explains how the system works.
This means that a character should be able to take on a same-EL encounter by himself and win 1/2 the time, since he is supposed to represent the same threat as the encounter. However, CR and EL are out of whack and don't really work, so... *shrug* The SGT only really works if you assume that the system works. But if you assume that, then it works great.
--Ghostwheel 12:11, September 20, 2009 (UTC)
The problem with that argument is that the DMG's not accurate when it says CR = NPC level. The developers admit this:
Reason I hate the drow #2: They’re basic humanoids, so I have to build them with class levels. The rule that says “an nth-level NPC is a CR n monster”… well, let’s just say that the rule isn’t beyond reproach. It’s true of some classes within some level ranges, but it’s simply not accurate as a general rule. I don’t think any designer will tell you with a straight face that a 1st-level NPC wizard is a good challenge for four 1st-level PCs. (Better hope the NPC gets that sleep spell off, huh?) So my low-level drow have 1 point of CR vanish into thin air, and they lose more oomph because they’re built with class levels.
—Design & Development: Let’s Get Small: Adventure Design, Part 1[1]
--Jonathan Drain 15:08, September 20, 2009 (UTC)
Drow are supposed to have LA. The issue in that case is with the drow, not the system. Also, what he's saying ("It’s true of some classes within some level ranges, but it’s simply not accurate as a general rule") is exactly what the Same Game Test points out as a flaw in the system; that works for rogues and similarly powerful classes (PsyWar, Swordsage, etc.), doesn't work (classes are too strong) for wizards, druids, and the like, and also doesn't work for fighters, monks, and the like, but this time because they are too weak. The SGT is a method for balancing homebrew classes that says that this (“an nth-level NPC is a CR n monster”) should apply to all classes over all levels. We know that this ("the DMG's... says CR = NPC level") isn't accurate, but that's because the designers made certain classes poorly, not because the system is flawed. I don't think that was quite as succinct as it could have been, but do you get where I'm coming from? -- Jota 16:00, September 20, 2009 (UTC)
I see the logic behind the Same Game Test. However, there's a complicating factor: what if the class' effectiveness changes when it's in a group? Consider a "Healer" class, who may be balanced or even overpowered when in an adventuring party, but would fail the Same Game Test nine times out of ten. Likewise, when my 3.5 game had a bard, we counted how much bonus damage his music contributed and it worked out to as much as the party average. Solo, he'd fail the test because his class is less powerful when he's not tested as part of a group. --Jonathan Drain 17:38, September 20, 2009 (UTC)
That is a worthwhile thing to note and is a complicating factor for the test. I know there is some discussion going on about the Marshal (3.5e Class) for that very reason. Check out the talk page if you are interested. The same game test is a standard to balance things to. We will need a more sophisticated standard for more complicated things. For most classes, however, they can solo just as well as they do in a party and the test works fine. --Andrew Arnott (talk, email) 17:59, September 20, 2009 (UTC)
Ugh, Arnott beat me to it, but here you go anyway.
Having a non-subjective measure against which you can judge a class's relative merit is worth quite a lot when you're trying to keep power levels in the same range and avoid the problems you admit the base system has. Just because it fails in the case of classes that don't do as much directly doesn't mean we should drop the test, though it is a good case for expanding it. Ghostwheel's Marshal has exactly this problem, and there have already been suggestions made on how to expand or alter the test to account for it. - TarkisFlux 18:14, September 20, 2009 (UTC)
A non-subjective measure is a great idea. The difficulty is making sure the measure is accurate. It's not just support classes that work differently in a party than solo, but monsters too. Some have area attacks which work better against a party, some are more deadly when they survive longer due to not being outnumbered, and some are weak to standard group tactics like flanking. Conversely, few monsters are actually weaker against one man than against four.
The problem of course is that to test a class alongside three comrades (so that the test is accurate to real D&D conditions), how do you ensure those comrades don't contaminate the test results? If you use only classes previously proven balanced for his comrades, how do you test those first without picking their comrades? --Jonathan Drain 20:17, September 20, 2009 (UTC)
You have actually read the test and aren't just complaining about it on principle right? It is, in order: a 'trap' that's likely to kill you if you can't disable or avoid it, a big boring hard hitting melee monster with reach, a flying monster that can kite you all day, a closet / ambush monster who murders you if you can't detect it, a potentially tactical well defended melee monster, a pair of creatures who are weak in melee but carry mental SoDs, a creature who is weak in melee but has lots of supporting creatures and fort SoDs, a large group of slightly weaker melee monsters with reach, and a load of individually weak creatures that happen to be insubstantial and could overwhelm you. Which is basically all of the stuff that you just complained about it not measuring with the exception of the thing that I've already admitted probably needs to be addressed. It's not designed to test the monsters, so whether they function better against a party or not is pretty moot. It generally ignores puzzle monsters as well, because they're not consistently challenging.
The 4 man balance point you mentioned above did not work when the game came out, and it doesn't work now. It really falls apart because you have to account for atypical party composition and play styles as well as bizarre class feature synergies. I think there may be room in the SGT for a couple of '5th wheel' tests where we drop the class into a pre-determined encounter with a pre-determined party, but those really can't replace examining the class on its own against challenges that are actually appropriate for their level. - TarkisFlux 21:22, September 20, 2009 (UTC)
The four man balance point is a correct balance point, because that's how Dungeons & Dragons is played. That point isn't incorrect just because it's harder to reliably test than SGT. A test may be precise without being accurate.
Same Game Test is only proven valid if solo and group performance are proven equatable. This could be true, but the article I read doesn't prove that this is the case.
Example: Suppose a class passes SGT with a 50% rating, but anyone who uses the class in a game finds it underpowered. Is the game at fault, or the test? --Jonathan Drain 07:01, September 21, 2009 (UTC)
It works since the balance of a group is determined by its individual members. How they synergize can come into play, but unlike 4e, 3.x classes are for the most part made to be self-encapsulating and don't affect other party members too much (with an occasional exception, during which you'd be right that the SGT doesn't work well). It is also accurate, since a character of a certain level should be at around the same power as an encounter of the same level (see the quotes from the DMG above). Furthermore, someone who passed the SGT is going to be fine against equivalent encounters in a standard party (by definition), unless everyone else is using Tome material and/or are wizards/clerics/druids, since the power of someone who can pass the SGT is about that of a Tome of Battle class (Swordsage, Warblade, Crusader, etc). --Ghostwheel 07:19, September 21, 2009 (UTC)
Well, this discussion is on its descending end, but I simply feel I must add my own two cents. The Same Game Test fully achieves what it attempts to do: test a character against a menagerie of challenges, to see if they are able to remain on par with 50% of them. Though, the real question is "If a class doesn't do anything by itself, how can it be tested by this system?" The answer is simply "Pair them with a class that beats the SGT evenly, and then double the power of the test." A few examples of this 'doubling' are detailed below:
  • A hallway filled with magical runes (No change, if you can't get past em, you can't get past em).
  • Two Fire Giants (Hells yes).
  • Two Young Blue Dragons (I love the smell of Ozone in the morning).
  • Two Bebiliths. (Yeah, you're screwed).
  • Two Vrocks (Royally Screwed).
  • Two tag teams of Mind Flayers (Must've interrupted their football game).
  • Two Evil Necromancers (Probably Boyfriend/Girlfriend).
  • 12 Trolls (Not a good time for the matches to be wet).
  • Two hordes of Shadows (ie. A very dark room).
Well, having said that, you could also adjust the test to allow for parties of 3 characters (one standard, two buffers), or even four or five, and the test should hold true as a testing mechanism. Of course, then there is also the question of making sure a party of 'two standard, one buffer' doesn't make things go off the charts. Now then, for the cases of unbalanced parties, well, honestly, there is no way to test that, because the values you are looking at are completely subjective, and therefore merely an academic exercise about a particular group. Take away from this what you will, I just hope I cleared something up. Also, two more things: this section shouldn't be headed 'Inaccuracies', it should be headed 'Questions', for accuracy; and if a class passes the SGT and then feels underpowered in a game, you probably aren't playing the class correctly (taking bad feats, underoptimizing, etc.) → Rith (talk) 08:56, September 21, 2009 (UTC)

Reverting indentation

People have probably said this already, but one of the ideas behind the SGT is that not every class or build will win out over some of the same general types of encounters (bruisers, casters, groups, high-mobility enemies, etc.) Ideally, using a SGT you'd be able to make a party that covers the varieties of encounters and let the party handle what they can reasonably expect to see.
I've seen people suggest testing a four-man group (for pretty much the same reasons you've said), JD, but the general feeling is it'd be too many variables and wouldn't let you check specific characters to see if they're under/overperforming. I wouldn't want to play a character who can't keep up with the rest of the group, and not many people want to play in a game where one character wins everything for the party, you know? --Genowhirl 11:48, September 21, 2009 (UTC)
I'm sorry JD, but DnD is not only played as a 4-man group game. It was an assumption made by the designers that is simply a ridiculous overgeneralization that leads to massive gaps in testing. It may be a common way to play, but attempting to test against it leaves out or ignores so many variables as to make it both less accurate and precise. Even if that wasn't the case, and you could safely ignore all non-4-man-groups in testing, it suffers an inability to identify the individual contributions of each member to ensure that they are keeping up with each other and not simply playing second fiddle in a symphony that works just as well without them.
So let's use a monk as an example, since he can easily be placed in a 'standard' party that passes a 4-man test. The designers did something like that anyway when they decided that it was a great class. That character is really weaker than the rest of the party, and I would say that it is the fault of the test for not picking it up precisely because it can be passed in spite of his not pulling his weight. That character could also have been placed in a party with a wizard, cleric, and druid, and would have been largely unnecessary and the test still wouldn't have seen it, because that group would pass. Do these cases mean that the class is balanced and fine in the right party, in spite of player complaints that it's not powerful enough or that he's just "running around distracting enemies while everyone else does the real work" (a real complaint from the last time I caved and let someone play a monk, at level 10)? As a counter example, he could be placed in a party with a paladin and a fighter and a barbarian and be almost as useful as everyone else, but the group probably wouldn't pass the test. Is that a failure of the group, or the class, or both?
The test you're proposing lacks the precision and granularity to find issues and also the accuracy to fail groups with bad apples but otherwise good coverage. You can take SGT results and stack them up on top of each other and get a good sense of how a group of characters will be able to deal with general challenges, because you will see the individual gaps that may or may not be filled by the additional group members. You can't go the other way with a 4-man test, and that major failing combined with incorrect assumptions about 'how the game is played' really do make it a non-starter. - TarkisFlux 16:37, September 21, 2009 (UTC)
Here's one hypothesis. Take a party of Fighter/Cleric/Rogue/Wizard and run these through Same Game Test as a baseline group. Next, replace one character with the class you want to test, and run the Same Game Test the same way. If average party efficacy goes up or down, we can measure this as the difference between the new character and the character he has replaced. By measuring only the difference between two classes, you can have an accurate reading in a context closest to real play where balance is most important. If Ghostwheel's argument above is correct, class synergies in 3E are only a minor factor, so what classes you pair them with won't matter too much.
I think testing in a group of four is reasonable. The average 3E group has between 3 and 6 players, surely. Larger and smaller groups exist, but I'm certain that four-man is more commonly played than one-man.
One-man SGT sounds like a good way to test a character's challenge rating as an NPC. This reveals that certain core classes are significantly weaker as solo opponents than Level = CR would suggest, which most D&D writers would agree with. Compare the SRD:Titan to a level 21 fighter and I'm certain you'll find the fighter woefully underpowered. I'm just not certain this means a level 21 fighter needs to be beefed up to the level of a Titan.
Certainly it's not true that character level always equals challenge rating, or else we wouldn't have level adjustment. --Jonathan Drain 18:44, September 21, 2009 (UTC)
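The replace-one-member hypothesis above can be written down as a small calculation. This is only a sketch under stated assumptions: the win/loss lists are invented for illustration, and `party_delta` is a hypothetical helper name, not something from the SGT article.

```python
from statistics import mean


def party_delta(baseline_results, swapped_results):
    """Change in mean win rate when one party member is replaced.

    Each argument is a list of 0/1 outcomes (1 = the party won),
    one entry per encounter run. Both lists are hypothetical inputs.
    """
    return mean(swapped_results) - mean(baseline_results)


# Invented outcomes, for illustration only (not real test data).
baseline = [1, 1, 0, 1, 0, 1, 1, 0, 1]  # Fighter/Cleric/Rogue/Wizard
swapped = [1, 1, 1, 1, 0, 1, 1, 0, 1]   # same party, one class replaced

delta = party_delta(baseline, swapped)
print(f"change in party win rate: {delta:+.2f}")
```

A positive value would suggest the new class contributes more than the one it replaced, a negative value less. Whether a difference of this size means anything depends on how many runs sit behind each list.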
I actually don't agree with Ghostwheel on the synergy point, but it's not really relevant at the moment. I don't think either of us is going to make any actual progress until someone sits down and does the test as you think it should be done so that it can be held up to more substantial scrutiny (instead of this theoretical back and forth), which it will either weather and prove me wrong or crack under. As I have neither any interest in it nor the time to do it, I'll leave that as an exercise for someone else.
You mention the Titan against a 20 fighter, and that the fighter would get stomped, or charmed, or mazed, or something equally nasty and loss causing pretty much all of the time. That alone does not mean that a fighter needs to be boosted, and suggesting so suggests a misunderstanding of what the SGT tells us. We don't care about individual results on the SGT, but only the aggregate results. We care about the fact that a fighter can't deal with any of the CR 20 challenges in an even way. On average, he loses to everything at that level, and as such he needs something if he's going to be more relevant at that level than a torchbearer. He may need to be boosted by artifact swords or by taking the leadership feat so he's really a fighter and a slightly weaker wizard who walks around making up for his shortcomings, but he does need to be boosted (and as these solutions generally involve you not playing a fighter anymore but a sword or a follower, I find them quite distasteful).
He may not need to be boosted to the level of a Titan, because the Titan may live in a specific role environment that isn't appropriate for the fighter, but he does need help if you want his level to mean anything. Back in 2e when you had different xp tracks and your level didn't mean as much as your xp total it was completely fine for an equal level fighter to be worse than an equal level wizard, because it took less to get there. Since the game has moved on to a point where the entire party is supposed to be roughly equal in power if they are of the same level (not that they can't be specialized in an area, but it needs to even out), that doesn't fly anymore.
No one here will dispute that character level is not always equal to challenge rating, and you could hold up a monk or barbarian or spellthief or warmage or lots of other things as proof without even bringing up the poorly done system that is LA. The point of this guideline is that it should be the case, because if it does work out then the game proceeds more smoothly for all involved and we have fewer people complaining about not feeling like they contribute. I'll grant you that the game doesn't always work like this and you don't have to follow this if you don't like it or are more concerned about other things (you can seriously ignore this for your own work and we will still offer whatever criticism we can to help you get to whatever balance point you want), but as a guideline it works very well for making sure that new homebrew classes aren't out of line with existing challenges or reasonable class builds. - TarkisFlux 20:00, September 21, 2009 (UTC)
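The "aggregate, not individual results" point can be sketched in a few lines. Everything numeric here is an assumption: the win chances are invented, and the `tolerance` band around 50% is a hypothetical parameter that the thread never defines. The encounter names are borrowed from the level 10 list quoted earlier on this page.

```python
def aggregate_win_rate(results):
    """Average the per-encounter win chances (each between 0.0 and 1.0)."""
    return sum(results.values()) / len(results)


def verdict(rate, tolerance=0.15):
    """Compare an aggregate rate with the 50% target.

    tolerance is an assumed band; the discussion does not specify one.
    """
    if rate < 0.5 - tolerance:
        return "below"
    if rate > 0.5 + tolerance:
        return "above"
    return "on target"


# Invented estimates for a hypothetical character, for illustration only.
estimates = {
    "rune hallway": 0.0,
    "fire giant": 0.5,
    "young blue dragon": 0.0,
    "bebilith": 0.25,
    "vrock": 0.5,
    "mind flayers": 0.0,
    "necromancer": 0.25,
    "trolls": 0.5,
    "shadows": 0.0,
}

rate = aggregate_win_rate(estimates)
print(f"{rate:.2f} -> {verdict(rate)}")
```

Losing badly to one entry does not decide the verdict on its own; only the average across the whole list does.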
  1. Level SHOULD always be equal to CR, despite the fact that many core classes fail overwhelmingly at this.
  2. Yes, Fighter sucks, thank you for pointing that out.
  3. LA is an asinine concept, and whoever thought of it should have been fired.
Thank you. → Rith (talk) 23:52, September 21, 2009 (UTC)
As a totally superfluous side note, I think Sean K. Reynolds has claimed the whole LA system as his idea/writing. Something like that. --Genowhirl 02:19, September 22, 2009 (UTC)
  1. If level should equal CR, when it does not, then the balance point defined by SGT is going to be significantly higher than the normal standard for Dungeons & Dragons. That'll create a new tier of special high-powered classes.
  2. Same Game Test doesn't take into account the amount of damage the fighter enables by taking hits for his party, and the amount of time he buys by tripping the enemy. I won't deny that it's one of the weakest core classes, but it may not be as weak as it seems.
  3. I concur that level adjustment doesn't work well. Here's a great example: Sean K. Reynolds' Anger of Angels features the Seraph, a CR4 creature with 4HD and LA +9. It's an ECL13 character with a host of special abilities, none of which make it more powerful in practice than a level 4 cleric.
Suppose the titan is the benchmark for a level 21 warrior-type class. The fighter already has some of the right attributes:
  • 370HP (for example, d12 hit dice with 30 Con)
  • AC32 (+3 Dex, +13 fullplate +5, +5 natural armour, +5 deflection, +2 insight AC)
  • +37/+32/+27/+22 to hit (+21 base, +10 from Str 30, +5 magic weapon, +2 feats)
  • Saves at +26/+13/+21 (+11/+6/+6 base, +10/+3/+3 Con/Dex/Wis, +5 cloak, +5 will misc)
To match the titan, he would also need to gain something equivalent to:
  • Damage reduction 15 to something rare
  • SR32 (about half of all magic misses)
  • 41 damage per hit on average (2d6+34: +10 Str, +5 magic, +5 feats, and 14pts or 4d6 from elsewhere)
  • Offensive spells at will, including chain lightning, fire storm, charm monster, greater dispel magic, hold monster - around DC22, so not very effective against CR20 monsters
  • Defensive spells at will, including invisibility and cure critical wounds
  • Quickened chain lightning each round for 20d6 or 70 damage; CR20 will pass the save easily so this amounts to about 38 damage to one target and 19 to every other.
--Jonathan Drain 11:25, September 22, 2009 (UTC)
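The damage averages in the list above can be checked with a short calculation. This is a sketch only: the 90% save chance is an assumption picked to land near the "about 38" figure in the post (the post only says the save is passed easily), and both function names are hypothetical.

```python
def avg_dice(count, sides, bonus=0):
    """Average of `count` dice with `sides` faces each, plus a flat bonus."""
    return count * (sides + 1) / 2 + bonus


def expected_save_for_half(full_damage, save_chance):
    """Expected damage when a successful save halves the damage."""
    return full_damage * (1 - save_chance) + (full_damage / 2) * save_chance


melee_hit = avg_dice(2, 6, 34)       # 2d6+34 averages 41.0
chain_lightning = avg_dice(20, 6)    # 20d6 averages 70.0

# Assumed save chance; not stated in the post.
primary = expected_save_for_half(chain_lightning, 0.9)
secondary = primary / 2              # secondary targets take half as much

print(melee_hit, chain_lightning, round(primary, 1), round(secondary, 1))
```

Under that assumed save chance the primary target takes roughly 38 to 39 on average and each secondary target about half of that, in line with the figures quoted.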
Point by point:
  1. No, there is no "new" tier. There are just fighters that are on par with wizards, which freaks the crap out of people who don't know any better.
  2. You assume the fighter is going to be able to do these things (take hits, trip, etc.). If he's a great tripper, chances are he's blown most of his feats getting to that point and can't do much else (Spiked Chain, Combat Expertise, Improved Trip, etc.). In said case, how does he combat flying monsters or quadrupeds? What happens when highly mobile monsters ignore him due to his need to get close while they focus on everyone else? If he's got the right feats (Stand Still, etc.) to be a zone controller so that he can dictate what enemies do, that still requires that he gets close and it's also no guarantee of control, while wizards could just cast forcecage. The equivalence is just not there. Monsters that teleport at will get around him as well, so I guess the point is your intangible benefits, while not fake, do not make the SGT any worse of a predictor because the same would apply to other fighter-types who can pass the SGT (warblade, for example) with closer to the expected efficiency.
  3. N/A
  4. Regarding the titan: the fighter doesn't need to match the titan at anything. He just needs to be able to beat 50% of encounters with a CR equal to his level. If titans whoop his ass every time, that's fine, as long as he wins other combats. The problem with the core fighter is it can't.
Adding superfluous text here so that the next post lines up more indented-like. -- Jota 13:35, September 22, 2009 (UTC)
Level already does equal CR for a subset of the classes, and you could look over here for some of them. All that setting that as the standard would do is make sure that new stuff and corrections to old stuff get on the same balance point as existing well balanced material. This isn't some new ultra powered balance point that's different from a non-existent and not defined 'normal' balance point in DnD. It's just a rejection of the current situation where some classes are allowed to be weaker than others in every area of the game all of the time. It's a rejection of the idea that if you go out and you get the same XP as the guy next to you that you're allowed to suck more than he does in every area of the game all of the time because your class is different than his. It really is not as big of a deal as you are making it out to be. - TarkisFlux 17:30, September 22, 2009 (UTC)
I like the SGT because,
  • Helps newbies create classes that can defeat unique challenges. Not just a sword-wielding humanoid.
  • A passing class can survive challenges more often. Which is more fun for everyone in the group.
  • Teaches that every class should see itself acting alone, at least once, without the party's cleric or wizard.
  • Encourages breadth, depth, and feedback.
I dislike the SGT because,
  • Ignores party play when testing alone.
  • Encourages a one-man army. Should the tested class pass >50% challenges.
I take this system with a grain of salt. I do not consider it perfect. I do think it is a positive learning experience. --Jay Freedman 19:08, September 22, 2009 (UTC)
Isn't it also important to note that some of the encounters listed in the SGT are likely not intended for an evenly balanced character to be able to take one-on-one (I would be referring to creatures that most players would probably regard as party killers such as dragons, titans, maybe drow, and similarly powerful beings)? So maybe if your class can breeze through said encounters, it seems logical to guess that it may be overpowered instead of having a class that can win against the not-so-tough encounters but struggles against dragons and the like. Basically, Jay, it's not encouraging a one-man army; if anything, it seems to me like this test would simply be the red light to tell the player that their class/build qualifies as such. - TG Cid 20:56, September 22, 2009 (UTC)
That makes sense TG. No class is meant to win more than half of the encounters. Good call. --Jay Freedman 22:40, September 22, 2009 (UTC)
Every encounter on the SGT list is within 1 EL of the level it's for, taking into account that some monsters are awesome and have their CR understated. Dragons, for instance, have their CRs consistently understated by 4, so a CR 6 dragon is listed as a level 10 challenge in accordance with its real CR.
As for "4th Man" tests, the Gaming Den had a discussion a while back [2]. The conclusion was that they were too much work for the marginal improvement in accuracy. --IGTN 00:37, September 23, 2009 (UTC)
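The dragon adjustment and the "within 1 EL" rule described above can be sketched as follows. The function names are hypothetical, and treating a single monster's EL as equal to its CR follows the DMG table quoted earlier on this page rather than anything the SGT article is confirmed to do.

```python
DRAGON_CR_OFFSET = 4  # per the post: dragon CRs are understated by 4


def effective_cr(listed_cr, is_dragon=False):
    """Listed CR, corrected for the claimed dragon understatement."""
    return listed_cr + (DRAGON_CR_OFFSET if is_dragon else 0)


def fits_level(listed_cr, level, is_dragon=False):
    """True if the corrected CR is within 1 of the level being tested.

    Assumes a lone monster's EL equals its CR.
    """
    return abs(effective_cr(listed_cr, is_dragon) - level) <= 1


print(effective_cr(6, is_dragon=True))    # the CR 6 dragon from the post
print(fits_level(6, 10, is_dragon=True))
print(fits_level(6, 10))
```

With the offset applied, the CR 6 dragon counts as a level 10 challenge; without it, the same listed CR would fall well outside the level 10 band.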

Other Options

If the character challenging the SGT has other options instead of attempting to fight and defeat the creature (sneaking past, being invisible, flying around it) would it still be considered a loss? What about if they have the ability to get away from the enemy almost at will? Would it still be considered an actual loss if they don't die? Just because if the only way to "defeat" encounters were to actually kill the enemies, the rogue would have no chance of passing the SGT without a flanking partner. --Ghostwheel 02:45, September 28, 2009 (UTC)

I would argue that if they can automatically trivialize the encounter in any way, then it is a win (whether or not they kill it). The DMG says that exp is awarded if a fight can be avoided. If you earn exp, you have defeated the encounter. --Andrew Arnott (talk, email) 02:53, September 28, 2009 (UTC)
Agreed mostly. A class that can teleport probably wins the hallway if it can see the end and skip the traps because it has bypassed them and moved on. Being able to run away is not the same as being able to get past the encounter though, and in cases where escape is trivial but moving forward is not, I would argue it is a loss. Which of those fit those criteria in the SGT is a good question. - TarkisFlux 03:32, September 28, 2009 (UTC)
IMO it's a loss: using a wand of invisibility and a scroll of silence could get you past most low-level encounters. That would make, say, a human paragon a wizard-level class (it isn't); a character should be able to defeat the encounter, not avoid it. Also, some encounters cannot be avoided, like an animated statue ready to crush the nearby town (CR 5). If you avoid it, do you win? Nah, it's going to raze the town. The key to the treasure of GAHHAHAHLHA is around the mummy's neck? Bypassing the mummy won't let you win. Avoiding an encounter is far too circumstantial to be a valid tactic for metagaming observation. --Leziad 03:39, September 28, 2009 (UTC)
That would be a class using wizard-level tactics, and we know that wizards don't pass the SGT (they beat it), so that doesn't really work. Rather, the SGT is meant to measure a character's/class's inherent abilities, not the stuff they can pick up from other classes. --Ghostwheel 03:42, September 28, 2009 (UTC)
A human paragon can gain UMD, and most will. It's not like picking stuff from other classes; it's just not a valid tactic to pass the SGT, IMO. --Leziad 03:56, September 28, 2009 (UTC)
In theory, you can do this with any class. 4 ranks in UMD (at 2 skill points per rank) + 2 (masterwork item) + 5 (2.5k magic item) + 1 (charisma) = +12 to UMD, more than enough to UMD a wand. Does that mean all classes can beat the SGT without a problem? Of course not. They're just borrowing the wizard's tricks. One should focus on class abilities rather than borrowing stuff from other classes. --Ghostwheel 04:10, September 28, 2009 (UTC)
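The arithmetic in the comment above can be checked with a short sketch. The bonus values are the ones quoted; the DC of 20 for activating a wand is an assumption taken from the 3.5 Use Magic Device skill description, and the function name is only illustrative.

```python
from fractions import Fraction

# Bonus components quoted in the comment above (labels are illustrative).
UMD_BONUSES = {
    "ranks (cross-class, 2 skill points each)": 4,
    "masterwork item": 2,
    "2.5k magic item": 5,
    "Charisma modifier": 1,
}

# Assumption: DC 20 to activate a wand (3.5 Use Magic Device).
WAND_DC = 20


def success_chance(bonus: int, dc: int) -> Fraction:
    """Chance that d20 + bonus meets or beats dc.

    Skill checks are treated as having no automatic success or failure.
    """
    needed = dc - bonus                  # lowest d20 face that succeeds
    winning_faces = 21 - needed          # faces from `needed` through 20
    winning_faces = max(0, min(20, winning_faces))
    return Fraction(winning_faces, 20)


total_bonus = sum(UMD_BONUSES.values())
chance = success_chance(total_bonus, WAND_DC)
print(f"UMD bonus: +{total_bonus}")
print(f"Chance to activate a wand per attempt: {float(chance):.0%}")
```

Under that assumed DC, the +12 total works most of the time per attempt rather than automatically, which may matter when deciding whether a borrowed wand counts as a reliable tactic.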
And if you're doing Tome rules, Product of Infernal Dalliance + Greater Teleport (won't help at fifth level in most cases (no feat selection), but still) will allow some classes that otherwise could not pass the SGT to mutilate it... -- Jota 05:48, September 28, 2009 (UTC)
Eh, most stuff in the Tomes doesn't pass the SGT, with the Base Classes in the Dungeonomicon being an exception. --Ghostwheel 07:22, September 28, 2009 (UTC)
I take back what I said. Leziad is right -- combat encounters should be considered combat encounters. --Andrew Arnott 13:47, September 28, 2009 (UTC)
Why do people think that jesters and assassins aren't better than the Races of War classes? Remember, UMD exists - it's to be used.

Cleaning up the SGTs

So having just stared at this again, it stood out that the 5 and 15 SGTs don't really match up well to the level 10. There's more stuff in them, and more things that look like the same encounter types with different creatures. I want to edit them down a bit so that there is less overlap. I'll probably post them here first as soon as I have them sorted. But before I go spend time on it, anyone opposed? - TarkisFlux 18:17, April 20, 2010 (UTC)

About the 15 SGT, would any player with a half-decent ranged attack and flight with a decent speed be rogue-level? Still a few specifics there that I don't agree with, but that ... doesn't sound right. --Ghostwheel 03:51, April 21, 2010 (UTC)
In an outdoor campaign, maybe. There's lots of times where flight actually doesn't matter at all, they're just not well specified in the SGTs. I should probably assign default terrain to some of these as well (which I sorta assume anyway) so that it tests what we think it tests and not some other thing. - TarkisFlux 04:13, April 21, 2010 (UTC)
Mind weighing in on the talk page above? Seems like it's possible to do the above, which, like I said, doesn't really sound right... --Ghostwheel 05:46, April 22, 2010 (UTC)
Maybe later. I don't even have a direction to take it in that would be helpful. It's not the same game at level 15 that it was at 10 unless you're playing low rogue or below, and that whole page is attempting to treat it like nothing changed in 5 levels and the same tricks work. Which is the whole point why fighter types don't keep up, if the same tricks worked they'd be fine (and often are, ish, in fighter level games). Actually explaining that turns out to be really hard though. I need to spend more time dissecting the SGT and work through some edition confusion (wall of force won't let you sphere anymore? wtf?) before I can call BS / support / whatever anything over there. - TarkisFlux 15:27, April 22, 2010 (UTC)

Level 5 - Updated

So, the first one has been brought back in. If you feel that anything was missed or should be brought back or is decidedly unfair or whatever, please say something. - TarkisFlux 00:41, June 19, 2010 (UTC)

Levels 10 and 15 - Updated

Welp, I've made an attempt. It could probably be easier, harder, or more exotic. --Foxwarrior (talk) 04:37, 11 April 2013 (UTC)

Terrain and Detail

This post here reminded me that the SGT is woefully underdescribed. A few sentences talking about the environment for each encounter would vastly improve the SGT's usefulness for assessing mobility and battlefield control features. --Foxwarrior (talk) 18:21, 10 April 2013 (UTC)

That's part of what I started doing a while ago, before I got bored and stopped after doing the L5 SGT. Feel free to add them though. - Tarkisflux Talk 18:31, 10 April 2013 (UTC)
I believe that there's a general consensus now that the SGT is pretty damn whack and is quite meaningless as an actual test of where you should place a class. Changing it means nothing, since it makes it no less irrelevant, and whether we use this version or the one before, classes should not be changed based on it IMO. --Ghostwheel (talk) 02:19, 13 April 2013 (UTC)
Well, it's a form of quick-and-dirty theoretical playtesting. Usually, the process of writing out how the SGT encounters would go gives people a clearer idea of how the class would actually work if used in a game, which is nice when discussing the balance point. --Foxwarrior (talk) 02:56, 13 April 2013 (UTC)
I believe that the general consensus is that the SGT is open to too much interpretation, potentially suffers from gear / feat / spell bloat, and is generally imperfect. Which makes it of use for ballparking things and not for pegging things concretely, but far from meaningless or irrelevant. Fox is attempting to resolve some of the openness here (as I was before), which can only help it be a better generalizer than it was previously.
But this is mostly the same tired positioning again, and has added nothing but posterity. - Tarkisflux Talk 05:14, 13 April 2013 (UTC)


The above sentence pretty much tells me that the SGT may be too rough to specify this yet, but I'm running it with my own character and I've found on both the 5th and 10th level SGT that the way the Necromancer is played (and built, and even if it's a cleric or wizard) drastically changes the results of the test. Would it be possible for someone to specify how the author thought the spellcasters should be played? Even something as simple as "cleric, uses BFC and doesn't attack," or "wizard, runs away, buffs allies, and spams SoD/SoL's" - 12:25, 13 June 2013 (CNT)

The authors assumed that it was just the single character and they were using whatever the best option for the encounter was, so generally SoDs.
That said, any of those possibilities are valid builds and give valid and different results, though the "runs away" part is problematic as the test assumes it's just you vs them and there are no allies to buff up. But that multiple build -> multiple result ambiguity is a real thing with the test. It helps you determine what sort of tactics are appropriate for a variable balance class (like most of the core stuff) in a game with who knows what level of balance. It's why the balance page refers to builds more than to classes, why we generally peg classes and feats and spells to their reasonably optimized level instead of worrying about their fully optimized or fully sandbagged builds, and why the SGT really frustrates some people here (well, that and the gear dependence). - Tarkisflux Talk 17:17, 12 June 2013 (UTC)
Actually, fully specifying the spell selection and class of the Necromancer would be a useful activity for someone or other to do. Specifying tactics (past the first round), though, would imply less versatility than a Cleric or Wizard Necromancer actually has. -- 18:06, 12 June 2013 (UTC)
I seem to have failed at reading this morning, because I thought you were asking about running a necro through rather than the L5 cleric and the L10 wiz specialist in the test. That's what I get for trying to be helpful after not sleeping. I'll kick it around, see if I can come up with anything more specific for them. - Tarkisflux Talk 18:55, 12 June 2013 (UTC)
Note that the validity of the SGT has been fairly strongly dismissed in that it's way too reliant on exactly what you have, what tactics you use, what consists of "defeating" an encounter, and the progression of an actual class. While it's half-way decent at ballparking, I would strongly recommend not using it as a basis on which you assign an article into a balance range. --Ghostwheel (talk) 19:45, 12 June 2013 (UTC)
What Ghosty said, not to mention that because it only provides a basis for single-character combat, the SGT is also partially unreliable due to the prevalence of party-based characters (that is to say, the ones who would buff allies, etc.). Because their abilities are party-dependent, their actual power can't be determined in an individual setting like this one. - TG Cid (talk) 20:48, 12 June 2013 (UTC)
Yes, yes, these are well known problems. Dead horse is still dead, and I'm still waiting for a reasonable replacement other than the "compare with material we somewhat arbitrarily assigned to these levels, and still generate arguments over the exact same interactions and limitations as the SGT" setup that the SGT's downgrading has left us with. - Tarkisflux Talk 21:13, 12 June 2013 (UTC)
I still think that Aarnott had a fairly good starting point, setting "benchmarks" for different ranges which set their approximate ranges. The more benchmarks you fulfill of a certain area, the more likely it is that you fit into that level range. So for example, VH benchmarks might include the ability to fly by level 5-6 (covers wizzies and sorcs), extensive ability to use SoDs/SoSs, the ability to double (or more) the effectiveness of your entire party (looking at you, Haste/Polymorph), and the ability to scry-and-die at higher levels. Obviously these are just what I'm pulling off the top of my head, but I think they'd be a good starting point, and for the other ranges we could have damage benchmarks and the like, since they differ less from each other (mostly in terms of damage, honestly,) compared to VH classes. --Ghostwheel (talk) 21:34, 12 June 2013 (UTC)
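A rough sketch of how the benchmark idea above could be tallied, assuming each balance range is just a set of named benchmarks and a class is scored by the fraction it meets. Only the VH entries come from the comment; the function name and the example class are hypothetical.

```python
from __future__ import annotations

# Hypothetical benchmark table; the VH entries paraphrase the comment above.
BENCHMARKS: dict[str, set[str]] = {
    "VH": {
        "flight by level 5-6",
        "extensive SoD/SoS use",
        "doubles party effectiveness",
        "scry-and-die at higher levels",
    },
}


def benchmark_fit(abilities: set[str], benchmarks: dict[str, set[str]]) -> dict[str, float]:
    """Return, per balance range, the fraction of its benchmarks the class meets."""
    return {
        name: len(abilities & required) / len(required)
        for name, required in benchmarks.items()
    }


# Illustrative input only, not a rating of any real class.
example_class = {
    "flight by level 5-6",
    "extensive SoD/SoS use",
    "scry-and-die at higher levels",
}
fit = benchmark_fit(example_class, BENCHMARKS)
print(fit)
```

A higher fraction only suggests a range is more likely, in line with the "more benchmarks, more likely" wording; the other ranges would need their own lists (damage benchmarks and the like) before this could compare anything.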
It was a decent starting point. And then no one finished it. I put in a bunch of time on one before coming to the decision that it wouldn't be enough better for me to roll my own, on my own (because committee discussions weren't going anywhere), and then defend it and get it adopted given my position on the appropriate uses, benefits, and limitations of the SGT. Since you think it is enough better, perhaps you should give it a go. - Tarkisflux Talk 23:13, 12 June 2013 (UTC)

Just An Observation

This system works only in one dimension. While the tests are fantastic at gauging combat, they fall flat at telling me how well-rounded a character actually is. The trapped door is a start, but what about basic survival, urban or wilderness? How to gauge political savvy, or social skills on the personal level? Puzzle solving, or just observation ability? Even more important, how does this system tell me how well a character would do in a campaign where combat is just plain a bad idea? Examples that spring to mind are diplomatic/cold war campaigns, alliance winning games and the romance story arc. If my character can survive a burning forest full of elemental dire badgers 100% of the time, but can only succeed at skills to notice betrayal or clues in a mystery game on a nat 20, how is my character optimized?

I draw attention here because this seems to be where all unoriginal and bad ideas seem to be stemming from, and want to know why good ideas not based on this rating structure seem to be rated down. Then again, not sure what a wiki needs a rating system for anyway... 00:04, 20 November 2013 (UTC)

The balance rating system reflects the way WotC's content has a wide range of power levels. It lets some people make casters that work in a campaign with a Fighter and other people make warriors that work in a campaign with a Sorcerer, without the two groups of people getting the wrong ideas and raging at each other too much. It can't be a perfect system, because WotC didn't really seem to intend to create clear balance points (and for literally perfect balance, you'd need to know everything about the context in which it will be used).
It's possible to expand the SGT with a broader variety of challenges, but since people generally care about their win percent, it's somewhat presumptuous to change the SGT; I only did it when I discovered that the lack of environment descriptions by the creators was due to laziness rather than by design.
If you'd like to write up some sort of OGT (Other Game Test) for non-combat challenges that people can measure their classes against, that would probably be pretty cool. --Foxwarrior (talk) 00:44, 20 November 2013 (UTC)
Also note that 90% of WotC content is combat-oriented; if you want to have an intrigue, mystery, exploration, or similar type of campaign where combat is for the most part eschewed, there are many other systems, both more and less complex, that tend to do it better. Since D&D focuses almost exclusively on combat (and even many non-combat spells like Charm Person say what they do in combat), the focus of balance should be on the system's forte--combat, which will often be the centerpiece of the majority of campaigns, and take up most of the time during a session.
Note that I don't mean that that's ALL the system can do. I've run and played in many games that were less focused on combat. But having been introduced to other systems, I can say that they do such campaigns better, even if they're not as popular. --Ghostwheel (talk) 01:00, 20 November 2013 (UTC)
And some of us like to shove the square pegs in whatever holes they got. I'm talking about the guy who just bought D&D or got it as a Christmas present, doesn't want or have the means to buy one of those "better systems," is too honest to pirate, but still wants to play something inspired by Jane Austen with wizards and sword fights every tenth session.
As for some examples: tracking challenges, "how many people can I feed in this sucky situation?", convincing people to surrender with decreasing advantages (because it would be easier to convince a warlord who has lost to give up compared to a warlord who is a button push away from nuking everything), determining friend from foe, leading armies in the existing scenarios x100, etc. 01:15, 20 November 2013 (UTC)
On the first bit, I don't think that the person you're speaking of would find their way here--but if they did, then they'd get a primer on what's stronger in combat, which is where you're most likely to have the worst thing happen to your character (more-or-less), that they die. It's where the majority of DMs screw up, getting frustrated that people are beating combat too easily or not dying where the guidelines say that they should have no problems.
As for the second, those are booleans for the most part.
  1. Do you have Survival and the Track feat?
  2. Do you have Survival? (Or a spell that creates food--magic can do anything.)
  3. Do you have Diplomacy? Reduced to a single roll by the rules.
  4. Do you have Sense Motive?
  5. Do you have the Leadership feat?
Those are just some examples that illustrate how D&D is a poor system for those kinds of things. If you want to try to bash those square pegs into round holes, that's certainly one thing to do, but it's not going to fit as neatly, work as smoothly, or be nearly as interesting as using a different system for resolving those kinds of situations. --Ghostwheel (talk) 01:22, 20 November 2013 (UTC)
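The "booleans" point above can be made concrete with a minimal sketch: each non-combat challenge reduces to whether the character has a given skill, feat, or spell. The `Character` type, its fields, and the example character are hypothetical; the five checks mirror the numbered list.

```python
from dataclasses import dataclass, field


@dataclass
class Character:
    """Hypothetical stand-in for a character sheet."""
    skills: set = field(default_factory=set)
    feats: set = field(default_factory=set)
    can_create_food: bool = False  # e.g. a spell that creates food


def noncombat_checks(c: Character) -> dict:
    """Map each challenge from the list above to a yes/no answer."""
    return {
        "tracking": "Survival" in c.skills and "Track" in c.feats,
        "feeding people": "Survival" in c.skills or c.can_create_food,
        "convincing a surrender": "Diplomacy" in c.skills,
        "telling friend from foe": "Sense Motive" in c.skills,
        "leading armies": "Leadership" in c.feats,
    }


# Illustrative character only.
scout = Character(skills={"Survival", "Sense Motive"}, feats={"Track"})
results = noncombat_checks(scout)
for challenge, passed in results.items():
    print(f"{challenge}: {'yes' if passed else 'no'}")
```

Nothing here models degrees of success, which is the limitation the comment is pointing at.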
The one thing I never got about D&D 3/3.5- their advocates seem so keen on selling the system based on what is wrong with it. I see the same books and skills, and I see a system that is like any other good system- tries to cover most bases, but still incomplete enough to be fun. TSR made a mistake selling the game, that we have seen, but did right in the OGL. Now we have awesomeness like Pathfinder, Paizo, Green Ronin, and huge amounts of books and websites dedicated to providing ideas. And if you still aren't finding what you are looking for, make your own shit up and post it!
But here... so many people complaining about bad ideas. A rating system that reinforces the idea that there even are good or bad ideas in a subject that is 99.9% subjective...
One question: would members agree that not only is this a game of ideas, but that those ideas should be explored and proliferated, regardless of whether or not they suit individual tastes, on the grounds that they match someone's tastes? 02:09, 20 November 2013 (UTC)
Bwa? I see a few conflated things here, so I'm going to deal with them one at a time...
The SGT - yes, this is a combat heavy test. It is not designed to tell you how well rounded a character is (which is almost impossible in the specific with the myriad options that skill and feat selections open up for any actual character), nor how well rounded a class is (which might be a slightly more tractable problem). It is entirely a test designed to answer complaints of "overpowered" or "underpowered" so that you can tell if one character is likely to overshadow another in a game, and deal with that as you feel like dealing with it. In that it sort of works with the right set of assumptions, sometimes, which is why we initially used it to determine our balance categories. The idea was that if you group classes with similar power curves under the same label, picking only from within a label will reduce complaints of "OP, nerf it!" in game (as opposed to using the test to edit classes up or down in power until they were all on the same page, which is also totally a thing you could do). And that grouping plan works pretty well in general even if it is still subject to campaign specific fluctuations, which is why we kept them even after moving away from this test.
That bears repeating: we don't really use this test very seriously anymore. It has some well known limitations, even within its rather limited goals of determining overall combat/trap encounter survivability, and there was no consensus on solving them, so the balance categories are mostly a "feel" and "compares well with" sort of thing these days. None of the balance categories are inherently good or bad though, and I've played enjoyable games in all of them.
On Other Challenges - The challenges you propose aren't irrelevant, but they're also not necessarily common in DnD games. Some of them also lack actual rules within the game (possibly the incomplete bits you were referring to) or the rules for them don't work well, and that makes testing them in similar fashion basically impossible. They're also completely orthogonal to this test, and you could well do both if you wanted a more complete picture of a class. If those bits are more important to you than the general combat/trap encounter survivability, then that test would be helpful in the same way that this is helpful for combaty things (at least, if you care about keeping PCs roughly equal anyway, no one said you have to) and you could safely ignore the balance categories generated by this one. It's supposed to be a tool to help people make their desired option in a way that fits their power playstyles, and if that's not a relevant concern for you then you should feel free to disregard the tool (or accept the community label for your own material).
On Bad Ideas - There are a few actually bad ideas out there in that they fail to meet their stated goals, but many more ideas that are simply "bad for my playstyle preferences". People are welcome to downrate the former because it needs actual work still, but downrating the latter counts as a violation of the merit guidelines and would get the rating blocked or removed (this was more explicitly codified relatively recently though, so some older ones may have slipped through. Also, volunteer enforcement potentially leaves holes). Playstyle reasons are not valid for hating on something, and not valid for getting something removed. So in that respect, I completely agree with you that ideas should be explored and available whether or not they match my personal taste, so long as they're actually complete and functional and meet their stated goals. We have sandboxes for in-progress or needs-work things.
On Rating System - Originally this was designed as a system to promote the "best" articles to the top so that they could be featured on the main page and discovered by a larger group. It's gone through a few evolutions but still serves that purpose while also indicating that people have looked at an article (and were sufficiently interested to write something) or noting when things need to be sent back to the drawing board (as indicated above). So we have it as a peer review tool above and beyond normal comments, because it does useful things for content discoverability and pruning (there's already an "everything stays" wiki and the admins here wanted something different). Some people are very, shall we say, 'generous' with their opposition to articles that don't match their view of what the game should be. And that's extremely frustrating to me personally and rather damaging to being an open and welcoming community, but I haven't figured out a way to deal with it without losing the very real and tangible benefits that the rating system allows us in general.
Wrapping Up - Hopefully that clears a couple of things up. If you see an idea that you think is good get rated down, I encourage you to make an account and rate it up yourself (or login to yours if you're just posting anon for this conversation) or ask about it in our chat. Or leave a message on my talk asking me to take a look at it. I don't promise that I have time to take a look at everything, but I try to keep abreast of things here. - Tarkisflux Talk 08:21, 20 November 2013 (UTC)

What does the hallway full of runes actually do?

symbol spells? sepia snake sigil? explosive runes? something else? how many of each? caster levels? save DCs? hallway dimensions? material type? —Preceding unsigned comment added by (talkcontribs) at

I always imagined it as explosive runes. To pass it, basically have infinite hp/fast healing, be able to dimension door or teleport some other way, or have trapfinding and search/disable device. --Ghostwheel (talk) 08:44, 28 August 2019 (MDT)
How is this supposed to be a useful, objective test if it's left up to how each user "imagines it"? Why is this information not provided in the test itself? As is, the reader is given no indication how to actually run this test, making it useless. —Preceding unsigned comment added by (talkcontribs) at
Yeah, it's not really an objective test, and things like the hall of runes especially are maddeningly vague, but quickly running through what you think a character would do for each encounter in an hour has helped me get a better gut feeling for how strong and versatile a class is a few times. --Foxwarrior (talk) 14:29, 28 August 2019 (MDT)
How is anyone supposed to "run through what their character would do" when there's no indication of what kind of threat or problem they're facing? —Preceding unsigned comment added by (talkcontribs) at
That's idiotic, the whole point of a balance test is to provide a way of comparing relative balance. If everyone's making up their own test, it serves no purpose. —Preceding unsigned comment added by (talkcontribs) at
Added some recommendations for information to add so that the test is actually coherent. —Preceding unsigned comment added by (talkcontribs) at
I'm not a huge fan of the SGT, but basically it's a bunch of benchmarks for a character to check themselves against to see if they have what it takes, and you can usually eyeball the result pretty easily if you understand the game decently.
Also, leaving just recommendations on a main page isn't something we want. Feel free to discuss adding things here, but until it's finished, it shouldn't be added to the article page. --Ghostwheel (talk) 08:21, 29 August 2019 (MDT)
How do you "eyeball" a hallway full of magical runes? That might as well say nothing at all. What if it's a bunch of arcane marks I can just walk past? How does this help anyone? —Preceding unsigned comment added by (talkcontribs) at
Did you read what I said above? If you fulfill one of those criteria, or have something similar, you pass. If not, you don't. --Ghostwheel (talk) 08:59, 29 August 2019 (MDT)
Ok, the hall is 10 feet long and contains a single rune that does nothing. Guess I pass. —Preceding unsigned comment added by (talkcontribs) at
Yeah, good luck getting anyone who's not a sock puppet to agree with that. --Ghostwheel (talk) 15:41, 3 September 2019 (MDT)
Hallway full of magical runes is: can you disable or somehow bypass magical traps? Then you pass. You can't? Then you (probably) fail. Not sure why you're having difficulty with this particular thing. Surgo (talk) 19:29, 3 September 2019 (MDT)