Rhythm and Readability: Why Bubsy: Paws on Fire! is the Best Bit.Trip Runner

Rhythm Games are For Flow

Why do people play rhythm games?

I don’t speak for everyone, but based on the comments I could find online, I think a lot of people share my reason: Rhythm games let us lose ourselves in music, and that feels good.

Musicians will tell you: when things are going well, making music puts you in a euphoric state of complete absorption. You are no longer aware of your own self as a separate entity; you’re one with the music. An anonymous composer put it this way:

“You are in an ecstatic state to such a point that you feel as though you almost don’t exist. I have experienced this time and again. My hand seems devoid of myself, and I have nothing to do with what is happening. I just sit there watching it in a state of awe and wonderment. And [the music] just flows out of itself.”

This quote comes from Mihaly Csikszentmihalyi’s TED talk on “flow”. Flow is a popular term in games analysis, but in case you haven’t come across it before, here’s a brief summary: it’s a term coined and popularized by Csikszentmihalyi for a particular mental and emotional state of being “in the zone”. It’s a form of focus that allows for continual high-level performance without conscious thought. Researchers studying this state in musicians have described it as “effortless attention.”

Flow feels great, but it generally requires a high level of skill. Learning to play a musical instrument well enough to enter a flow state takes a lot of effort, and many of us don’t want to or can’t afford to spend our limited time and energy on this. And for those of us in that position, rhythm games provide another way to achieve flow with a much lower barrier to entry.

Some games like Guitar Hero or Rock Band simulate the playing of actual instruments, allowing the player to also fantasize about being a real-life rock star. But this isn’t necessary for creating flow, and many rhythm games take a more abstract approach. As long as the player’s actions match the rhythm, the player can get lost in the music even if what they’re doing doesn’t look like any real-life musical activity.

After all, sheet music is already an abstract visual representation of music. Sight-reading sheet music is a matter of parsing visual cues and performing the right motions with the right timing on a musical instrument. It’s not that different from parsing on-screen visual cues and performing the right motions with the right timing on a game controller. The connection is easy to see in games like Rock Band where the cues resemble scrolling sheet music, but it’s still the same in other games like Hatsune Miku: Project DIVA Future Tone where the cues don’t look like any sheet music that could exist on paper. These cues may be abstract, but they still represent specific actions that must be taken at the right time.

Other games use other metaphors, like Bit.Trip Runner, where the cues resemble obstacles in an auto-running platformer. You still have to recognize what action is implied by each cue and perform it at the right time - it’s just that the cues are things like rocks, which imply the action of jumping over them. Seeing an approaching rock means hitting the jump button at the right time, just like seeing an approaching musical note can mean hitting the right piano key at the right time.

So, while a game controller is much easier to get the hang of than most musical instruments and the cues of rhythm games tend to be less complicated than those for actual music, playing a rhythm game can tap into the same principles that create flow in actual music-making. You can lose yourself in a rhythm game in basically the same way you can lose yourself in a musical performance. Flow is flow, regardless of where you find it.

But that does mean that finding it requires the same conditions. There are a few things that researchers agree you need in order to create and maintain flow. Researchers may group and describe them somewhat differently, but these are the basic ideas:

  1. Clear goals
  2. Immediate and clear feedback
  3. Challenge that matches skill

Imagine a musician sight-reading a new piece. It’s within their ability to play, but unfamiliar and somewhat difficult so it requires focus. They don’t have time to think about all the notes they’re seeing on the page - the song keeps going, so they have to respond immediately to each note as it comes. They hear the notes as they play them, and make adjustments on the fly if something sounds off. This process is faster than conscious thought - it’s basically automatic. This is effortless attention. This is flow.

Video games can provide these same factors and create a similar flow experience. Their interactive nature means that they can provide immediate feedback to the player’s actions, and multiple difficulty settings let challenge be adjusted to match skill. But some rhythm games stumble a bit with readability and present unclear goals as a result. Even if the feedback and challenge are tuned perfectly, unclear goals will destroy flow, so readability absolutely shouldn’t be neglected. Let’s take a look at a few ways I’ve seen this go wrong.

Failure Mode: Unexplained Cues

Remember our sight-reading musician? They’re playing a new piece, are deep in flow, and having a great time. Each note in their sheet music presents a goal - to perform the right action with the right timing to create the sound the note represents. Suppose that suddenly in the middle of the song, mixed in with the rest of the notes, there’s a symbol the musician has never seen before.

Before this point, the notation has been standard, the sheet music has been readable, and the goals have been clear. But this new symbol is not standard, the musician doesn’t know how to read it, and the goal it presents isn’t clear. The musician can’t proceed automatically with effortless attention - they have to engage conscious thought and try to figure out what the symbol means, breaking their flow.

This is true even if the actual meaning of the symbol turns out to be simple and easily accomplished. It doesn’t matter whether the goal is easy if it isn’t clear. The musician won’t be able to maintain flow on songs that use this new symbol until they learn and internalize that symbol and can respond to it without conscious thought.

This situation isn’t common with sheet music, which has had the same basic notation for several hundred years. But rhythm games don’t have standard notation to lean on. Like most video games, they need to teach the player what their own cues mean. The player needs to know how to read the cues in order for the goals to be clear and for flow to be possible. Tutorials in rhythm games are like lessons in reading musical notation. Once the player has been taught, they should be able to sight-read the game’s levels just as musicians can sight-read sheet music.

But in practice, that’s not always the case. Let’s take an example from Aaero, a rhythm shooting game. The tutorial explains how to maneuver the ship to follow curving rails and how to target and destroy enemies. These mechanics are easy to understand and after a bit of practice they become automatic. After completing the tutorial, most players can probably get through the first couple songs in a satisfying state of flow.

But then Aaero starts using cues it never taught to the player. The third song begins with moving barriers that damage the player’s ship if they don’t fly to the right place - if that happens a few times, they fail the song and must start over. But how is the player supposed to know what the right place is? There’s nothing like this in the tutorial, and the barriers don’t leave much time to react.

It turns out that the colored lights around the perimeter are the cue. Red lights indicate barriers that will block the player and blue lights indicate barriers that won’t, so the ship needs to be positioned at the intersection of the lines implied by the blue lights.

Screenshot of barrier with implied lines and safe flight zone drawn in.

Once explained, it’s easy enough to understand, but nothing in the game explains it. These barriers are like the unfamiliar symbol that stopped our hypothetical musician from sight-reading their sheet music - and similarly, the barriers make it impossible to sight-read Aaero. The player can’t respond automatically and must step back and figure out what the cue means.

For a rhythm game to be readable, it needs to explain its cues to the player before testing the player on them.

Failure Mode: Ambiguous Cues

Let’s go back to our sight-reading musician. Suppose they’ve learned all the symbols their sheet music uses, including the sharp symbol which normally indicates a note being a half-step higher. But it turns out that in this sheet music, a sharp actually means a half-step lower if it appears in an arpeggio.

Even though the musician knows everything required to play the song correctly, it would still be difficult to sight-read. Whenever they see a sharp, they have to fight their previously-trained instinct to automatically interpret it as a higher note and look at the surrounding notes to see whether it actually needs to be played as a flat. They might eventually be able to make this an automatic response, but it’d be difficult and in the meantime just seeing a sharp can break their flow.

Again, sheet music in real life doesn’t really do this, but some games do. Let’s take an example from Bit.Trip Runner’s sequel, Runner2. Runner2 does a good job avoiding the “unexplained cues” failure mode by introducing abilities in short tutorials at the beginnings of early levels. The player is taught how to use each ability, shown what cues prompt the use of that ability, and given the chance to practice using the ability in response to appropriate cues. Later on, these cues will appear mixed in with other cues as part of the game’s regular challenge. By that time, the player has hopefully internalized the meaning of the cues so they can respond automatically and maintain flow. Each cue is used throughout the game, and although levels are grouped into worlds that each have their own visual theme, the cues use consistent iconography and have identical behavior. Once the player is taught to slide under a fireball in world one, they know to slide under fireballs in later worlds as well.

But then something funny happens. After sliding under fireballs for many levels, a fireball comes along while the player character is riding a rail. The rail slopes downward, so the fireball is actually below the player character when it appears. If the player tries to go under the fireball, they’ll collide with another obstacle. Instead, and opposite to every fireball so far encountered, the player needs to jump over it.

This may seem like a minor quibble, but the game has broken its own rules and its implicit promise that the same cues will always require the same actions. A fireball usually means “slide” but sometimes it means “jump.” Just like the musician couldn’t react automatically to sharps anymore, the Runner2 player can’t react automatically to fireballs anymore. Now they have to consider the environment - are there height changes such that the meaning of the fireball is reversed?

The fireball cue no longer presents a clear goal, but an ambiguous one. To read it, the player must focus on other details they were previously taught to ignore. Fireballs now require conscious evaluation, breaking flow.

The next sequel, Runner3, adds even more ambiguity. That game introduces the slide move in its third level, teaching the player to use it to go under a series of wooden structures.

More wooden structures like these recur in the next several levels as obstacles to be slid under. They are a clear and consistent cue for using the slide move. That’s all well and good.

Then in the ninth level, Runner3 introduces a new move: the double-jump. At first this seems fine; it’s just a way to cross particularly long gaps or go over particularly tall obstacles. But then the game presents one of the wooden structures the player has previously needed to slide under - with a gold collectible on top of it.

Looking at just this, everything still seems fine. It’s clear that the player is supposed to double-jump over the structure and collect the gold. The cue is readable and the player can respond automatically. The problem is that - just like with the fireballs - this inverts the meaning of a previously-established cue. Up to this point, the game has consistently presented wooden structures as an obstacle to be slid under, and now suddenly the player is obliged to jump over them.

However, it’s actually much worse than the fireballs in Runner2. In that case, the player had to be vigilant to look for height changes around fireballs to know the proper response. But in Runner3, the wooden structures the player is meant to jump over are at the same height as the ones they are meant to slide under. By sending the player over them when it’s still possible to go under, the game is emphasizing that the double-jump means that the player now has choices.

A lot of the obstacles that previously needed to be slid under can be double-jumped over. The player can even go back to previous levels and jump over those obstacles if they want to. That includes the wooden structures, but also floating enemies and Runner3’s equivalent to the fireballs. All of these obstacles are now ambiguous cues - they can all be avoided by going over or under them - and sometimes, but not all the time, there will be a correct answer.

Every time the player comes across one of these obstacles, they don’t just have to figure out whether it’s slidable or jumpable - it’s usually both, and the player has to figure out what the consequences will be for each approach. There isn’t always gold sitting on top of or under it to make things clear. The player can’t respond automatically to each cue because the cue is no longer readable on its own. Its meaning changes with context, so the player has to look further ahead and consciously evaluate what’s coming or play the level multiple times so they know what’s coming.

For a rhythm game to be readable, its cues need to have consistent and unambiguous meanings that don’t change with external context.
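If it helps to see that rule stated mechanically, here’s a minimal sketch of a check a level designer could run. It assumes a level can be flattened into a list of (cue, required action) pairs. The function name, the data format, and the sample encounters are all hypothetical, made up for illustration and not taken from any Runner game’s actual data.

```python
from collections import defaultdict


def ambiguous_cues(encounters):
    """Return every cue that demands more than one action across the encounters."""
    actions_by_cue = defaultdict(set)
    for cue, required_action in encounters:
        actions_by_cue[cue].add(required_action)
    # A cue is ambiguous if the same visual has been paired with different actions.
    return {
        cue: sorted(actions)
        for cue, actions in actions_by_cue.items()
        if len(actions) > 1
    }


# Hypothetical encounter list, loosely modeled on the Runner2 fireball example.
runner2_like = [
    ("fireball", "slide"),
    ("rock", "jump"),
    ("fireball", "slide"),
    ("fireball", "jump"),  # the fireball met on a downward-sloping rail
]

print(ambiguous_cues(runner2_like))  # {'fireball': ['jump', 'slide']}
```

An empty result means every cue in the list maps to exactly one action, which is the property a sight-reading player relies on.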

Failure Mode: Misleading Cues

Let’s check back in with our sight-reading musician. They’ve internalized all the cues in their sheet music, including the ones that can mean multiple different things. They can handle anything the song might throw at them. But now suppose that in some parts of their sheet music, the printed notes actually move around when the musician is about to play them.

At this point, the sheet music is actively fighting the musician. Even if they are fully versed in every symbol used by the song, it’s nearly impossible to sight-read, because some of the information it contains is outright false. The musician has no real choice but to play the song and memorize which notes move around with very little chance of getting them right the first time.

Of course, notes don’t move around in printed sheet music. But sometimes cues in rhythm games act this way. Here’s an example from the very first level of Runner3, when the player is still learning the basic vocabulary of the game. After the player runs onto one particular platform, it suddenly drops down and then tilts to an angle.

The player couldn’t possibly predict either of these movements and they don’t have much time to notice that the end of the platform has ended up slightly below the next part of the path. There’s no obvious gap to make it clear the player should jump, but if the player doesn’t jump they’ll collide with the path and get sent back to the latest checkpoint. I’d expect the vast majority of players don’t jump on their first playthrough of this level - it’s not a clear goal since the cue is obscured by the sudden movements.

It’s a dirty trick for the game to play in its first level, but on the plus side I guess it’s good that the game shows its nature and intentions so early. Runner3 keeps using unpredictable changes as a source of unfair difficulty, through things like moving platforms and sudden shifts in camera angle. A particularly brutal example comes in the game’s seventh level, which teaches the player to use spring pads for high jumps. After a few demonstrations of using the springs to cross gaps or gain altitude, the player comes to a cliff. Below the cliff is a platform with a spring pad, and the player might naturally expect to drop down and bounce back up. But - surprise! - just before the player reaches the cliff, with not nearly enough time to react, the platform shoots up to the player’s height. Unless they somehow already knew they were supposed to jump, they’ll crash into its side.

This kind of “gotcha” moment has no place in a rhythm game. It’s one thing to use surprises to slap the wrist of a careless player in a game like Dark Souls, but there is no way to play Runner3 cautiously. The player character runs forward at constant speed regardless of the player’s input. In a game like Runner3, all that these surprises do is create trial-and-error gameplay by obscuring goals to the point where the player cannot reasonably achieve them without the extra information gained through failure. It’s a near-guaranteed loss of flow, as players can’t just look at what’s on the screen and respond automatically, and it keeps happening level after level.

I get the impression the creators wanted the game to be more visually interesting and to provide a lot of surprises and “wow” moments, but it ends up sacrificing readability to do so, resulting in a game that’s consistently impossible to sight-read.

For a rhythm game to be readable, its cues need to be presented in straightforward ways that don’t hide their meaning or deceive the player.

Getting it Right: Bubsy: Paws on Fire!

As described above, Runner3 is very difficult to read. But the good news is that the developer turned things around with their next game, Bubsy: Paws on Fire!, which plays like Runner but stars characters from the Bubsy series. And it’s enough of a refinement of the Runner formula that it ends up being a textbook example of doing readability right in a rhythm platformer.

The Runner games never really had a problem with tutorials or unexplained cues, but Paws on Fire! deserves credit for getting all of its tutorials out of the way very quickly. Each of the game’s four playable characters is simple and easy to understand, and all their abilities are taught in their first level. After that, the player is done learning what buttons do and the rest of the game is about applying those abilities in more difficult and interesting ways.

Misleading cues aren’t really a thing either. The camera stays at a static angle, positioned to show the player what’s coming with plenty of time to react. And no parts of any stage suddenly move around before you get to them. What’s particularly interesting, though, is the way the game avoids ambiguous cues like Runner2’s fireballs or Runner3’s wooden structures.

At first glance, it may seem that Paws on Fire! has ambiguous cues - after all, it also includes obstacles that you can either go over or go under. But this time, the obstacles aren’t the cues. The collectibles are.

For example, one type of obstacle is a floating frog. Bubsy can jump onto these frogs, “pounce” through them, or avoid them completely. The player is never taught that the frog indicates a specific action. Instead, the pattern of collectible yarn balls around the frog makes it clear what the player should do.

Each level of Paws on Fire! has one hundred fifty collectibles. That’s enough to show an ideal path through the entire level. The game’s cues thus indicate a path the player must follow by using the player character’s abilities, rather than mapping one-to-one to individual actions. Certain patterns suggest specific actions - an arc of collectibles suggests a jump while a row of them suggests a pounce - but ultimately the player just needs to be in the right places at the right times.

Obstacles can therefore be used in different ways throughout the game without creating any ambiguity. The player is trained early to follow the collectibles, and regardless of the obstacles that’s all they ever have to do - using the exact same abilities taught in the very first level. Even the game’s final level, which strings together difficult maneuvers above a floor of deadly saw blades, is still fundamentally about jumping and pouncing through patterns of yarn balls.

The levels of Bubsy: Paws on Fire! thus remain consistently readable even as they become quite difficult. Harder levels may take some practice to clear flawlessly, but the player can still sight-read those levels and easily understand what the goals are. The player can thus spend a lot of their play time in a flow state - and since that’s the point of a rhythm game, Bubsy: Paws on Fire! is arguably the best and most successful iteration of the Runner formula yet.

Reading Cues Should Be The Easy Part

Real musical performances often require practice too. Just because you know how to read musical notation doesn’t mean you can play, say, The Flight of the Bumblebee on your first attempt. But the song is still readable. You could look at the sheet music for The Flight of the Bumblebee, understand what it’s asking you to do, and know whether you’re up to the challenge. Similarly, you could look at Guitar Hero 3’s note chart for Through the Fire and Flames, understand what it’s asking you to do, and know whether you’re up to the challenge. To play these songs, you don’t need to memorize arbitrary surprises from cues that suddenly change meaning or become ambiguous - you need to improve your skill at performing the actions required by the cues. Understanding what you are being asked to do isn’t the hard part - doing it is.

This may seem like a subtle distinction, so let’s illustrate by comparing to a sight-reading musician again. But this time, let’s take Lieutenant Commander Data from Star Trek: The Next Generation. Data’s an android with perfectly accurate control over his own body, so precise physical movements aren’t difficult for him. With Data, the distinction is clear. Understanding what he’s being asked to do is the only part that can be hard. The doing it part is always trivial. Once he knows musical notation, he can sight-read and play any song.

Data playing violin
Data playing oboe
Data playing guitar

It’s my argument that any well-designed rhythm game could also be sight-read by Data. Once he’s completed the tutorials, he should know everything he needs to know about the game’s cues and should be able to play every level or song perfectly on his first attempt. Any game for which this isn’t the case - where Data would sometimes fail because cues are ambiguous or misleading and thus cannot be sight-read - is not playing fair.
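Here’s a rough sketch of that “Data test” as code, under some loud assumptions. The player is modeled as nothing more than the cue-to-action mapping taught by the tutorials, with perfect execution. A level is modeled as a list of (cue shown, action actually required) pairs. Every name and every sample level below is hypothetical and invented for illustration.

```python
def first_unreadable_moment(tutorial, level):
    """Return the index of the first encounter a perfect-execution player fails.

    `tutorial` maps each taught cue to its action. The player never guesses:
    they do exactly what the tutorial said the cue means. Returns None if
    the whole level can be sight-read.
    """
    for index, (cue, required_action) in enumerate(level):
        # An untaught cue yields None here, which never matches a real action.
        if tutorial.get(cue) != required_action:
            return index
    return None


# Hypothetical tutorial and levels.
tutorial = {"rock": "jump", "wooden structure": "slide"}
fair_level = [("rock", "jump"), ("wooden structure", "slide")]
unfair_level = [
    ("rock", "jump"),
    ("wooden structure", "double-jump"),  # cue reused with a new meaning
    ("moving barrier", "dodge"),          # cue never taught at all
]

print(first_unreadable_moment(tutorial, fair_level))    # None
print(first_unreadable_moment(tutorial, unfair_level))  # 1
```

Under this toy model, unexplained cues and ambiguous cues both surface as the same failure: the taught meaning of what’s on screen doesn’t match what the level demands. Misleading cues would need a richer model, since there the cue the player sees isn’t the one that ends up mattering.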

That’s the bar that I think designers should have in mind when making rhythm games. Cues should be consistently readable, such that the challenge comes from executing the required actions in time, not deciphering them. Every time the player has to stop and think, every time the player has to learn through trial and error, and every time the player has to guess what the goal is, flow is destroyed and the player isn’t getting what they came for.

Rhythm games are at their best when the player doesn’t need to consider or choose their actions but can respond automatically to clear and consistent cues and lose themselves in the music. An unreadable rhythm game is a bad rhythm game.