
Super Tilt Bro. for NES


RogerBidon


Technical highlight: Real time online gaming with the NES

Super Tilt Bro. is not a retro-game. It always tried to be a modern game on retro hardware, and modern games are playable online!

We "simply" put a WiFi chipset in the cartridge, and let's rock! New millennium, here we come!

[Figure: A prototype of the WiFi cartridge by @BrokeStudio.]

Challenges of online gaming

Writing a game to be played online is not an easy task. At any time, we have to ensure that both players see the same scene. When the game is fast-paced, some milliseconds of ping can make a big difference.

Let's assume Alice is playing against Bob. Bob unleashed his super-attack, and Alice dodged it on the last frame, EPIC! But, there is a ping of 20 milliseconds between Alice and Bob's houses (typical), and a frame lasts 16 milliseconds. Alice did actually dodge on time, so her game shows her character ready for a deadly counter-strike. Now, the information of Alice's epic dodge took 20 milliseconds to reach Bob's home, so Bob sees that Alice dodged too late, taking heavy damage. That's a typical case of desynchronization: the two players see a completely different outcome of their fight.

[Figure: Alice and Bob's different timelines. Everybody wins!]

The rollback netcode

Super Tilt Bro. is based on a rollback netcode. It does not wait to know the opponent's inputs, it guesses them. When the inputs finally arrive, the game discovers whether its guess was right or wrong. A right guess is good. If the guess was wrong, the game rolls back its scene and replays it with the real inputs instead of the guessed ones.

What does it mean for the epic fight between Alice and Bob? Alice still dodges on time, her game shows it without problem. Bob's game begins by guessing that Alice did not dodge (that was an epic dodge like we rarely see), so it begins to show the attack hitting. Then, less than 20 milliseconds later, the information arrives. The game rolls back: Alice was never hit. There will typically be one frame of flicker (Alice is hit on one frame, has dodged on the next), but in the end both players see the same action, and can continue to fight.

[Figure: The rollback engine, casually rewriting the past.]
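To make the idea concrete, here is a minimal Python sketch of a classic rollback loop. It is not Super Tilt Bro.'s code: the one-number game state, the `step` function and the `Rollback` class are all made up for the illustration, and the sketch assumes only input changes are sent over the network.

```python
def step(state, local_inp, remote_inp):
    """Deterministic toy simulation of one frame; the whole game state is one number."""
    return state + local_inp - remote_inp


class Rollback:
    def __init__(self):
        self.state = 0
        self.frame = 0      # next frame to simulate
        self.saved = {}     # frame -> state just before that frame was simulated
        self.local = {}     # frame -> local input used on that frame
        self.remote = {}    # frame -> confirmed remote input (only changes are sent)

    def remote_input(self, frame):
        """Confirmed input if known, else guess that nothing changed since the last one."""
        known = [f for f in self.remote if f <= frame]
        return self.remote[max(known)] if known else 0

    def advance(self, local_inp):
        """Simulate one frame right away, without waiting for the network."""
        self.saved[self.frame] = self.state
        self.local[self.frame] = local_inp
        self.state = step(self.state, local_inp, self.remote_input(self.frame))
        self.frame += 1

    def receive(self, frame, inp):
        """A remote input arrives, possibly for a frame that was already guessed."""
        self.remote[frame] = inp
        if frame < self.frame:
            self.state = self.saved[frame]          # rewind to the saved state
            for f in range(frame, self.frame):      # replay with the real input
                self.saved[f] = self.state
                self.state = step(self.state, self.local[f], self.remote_input(f))


game = Rollback()
for _ in range(3):
    game.advance(local_inp=1)   # frames 0 to 2, remote input guessed as 0
guessed = game.state            # 3
game.receive(frame=1, inp=1)    # the real input for frame 1 arrives late
corrected = game.state          # 1, frames 1 and 2 were replayed
```

Note how the sketch has to keep one saved state per frame. That list of states is precisely the memory cost a rollback engine has to pay somewhere.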

The guessing algorithm is super simple. It always guesses that nothing changed: no button was pressed or released. For a game running at 60 FPS, even for a nervous player doing six inputs per second, this simple algorithm is right 90% of the time (54 frames out of 60).
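The 90% figure is easy to check with a few lines of Python. This is only a sketch: `repeat_last_accuracy` is a hypothetical helper, and the input trace (a button toggled every ten frames) is an assumed stand-in for "six inputs per second".

```python
def repeat_last_accuracy(inputs):
    """Share of frames where guessing 'same as the previous frame' is right."""
    pairs = list(zip(inputs, inputs[1:]))
    hits = sum(1 for prev, cur in pairs if prev == cur)
    return hits / len(pairs)


# One second at 60 FPS: 61 samples give 60 guesses, with a change every 10 frames.
one_second = [(frame // 10) % 2 for frame in range(61)]
accuracy = repeat_last_accuracy(one_second)   # 54 right guesses out of 60
```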

Adding some input lag

Another trick is to delay inputs while sending them immediately over the network. Let's say we artificially delay all inputs by four frames. If an input takes less than four frames to be transmitted from a player to another, rollback is not even necessary.

Let's get back to the game between Alice and Bob, but with an input lag of four frames. When Alice inputs the dodge on the last possible frame, it has no immediate effect, so she is hit hard. One or two frames later, Bob receives the network packet with the dodge input, which has not yet had any effect. Both players see the same action: Alice took the hit. It is less epic, but at least everybody sees the same thing and there is no visual glitch.

[Figure: Input-lag: glitch-less, but frustrating at times.]
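The interplay between transit time and input lag can be sketched as a small computation. This is a simplified model, not the game's code: `frames_to_rollback` is a hypothetical helper, it assumes a 60 FPS game and rounds the transit time up to whole frames.

```python
import math

FRAME_MS = 1000 / 60   # duration of a frame at 60 FPS


def frames_to_rollback(transit_ms, input_delay_frames):
    """Frames to re-simulate when a remote input arrives after transit_ms."""
    transit_frames = math.ceil(transit_ms / FRAME_MS)
    return max(0, transit_frames - input_delay_frames)


no_rollback = frames_to_rollback(20, input_delay_frames=4)    # 0: input lag absorbed it
pure_rollback = frames_to_rollback(20, input_delay_frames=0)  # 2: no input lag at all
spike = frames_to_rollback(120, input_delay_frames=4)         # 4: a latency spike
```

With four frames of input lag, the 20 milliseconds between Alice and Bob never trigger a rollback; only a spike does.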

Of course, both approaches can be used together. That's what Super Tilt Bro. does. There is a little bit of input lag, which should be sufficient in most cases. When a latency spike makes the input lag insufficient, the rollback code saves the day, keeping glitches to a minimum. In a nutshell, rollback makes your opponent teleport at the beginning of a move, input lag makes your character slow to react, and a balance between both has to be found.

Some other implementations

All that is good, now you know how the Super Tilt Bro.'s netcode mitigates internet's latency. Most game developers just never openly discuss their netcode, while it is extremely interesting to see different approaches. Here are some known ones.

Super Smash Bros, the direct inspiration for Super Tilt Bro., does not do the same thing. There is no rollback at all in the original series; instead the game slows down or even freezes while waiting for inputs. It can easily be seen by playing on an unstable connection: the game will regularly freeze. To limit the impact of these slowdowns, the input lag seems dynamic. (Yes, sorry for that "seems", most available info is reverse engineered or outright guessed.) We know for sure that Super Smash Bros Brawl rates your connection and assigns it an input lag that can vary from 3 frames for the best to 15 frames for the worst. People trying to measure input lag in Super Smash Bros Ultimate failed in online mode, the input lag varied too much. Maybe Ultimate dynamically adjusts the input lag for each player during the game.

The approach of avoiding rollbacks and freezing the game has its benefits and drawbacks. First, it is really easy to implement: the only special case to handle is waiting for the needed information to be available. As a side effect, it requires very little overhead for the CPU and memory. The game engine does not have to be able to roll back to a previous point in time, the game can just run forward, forgetting anything that happened on previous frames. The effect on laggy connections is a freeze, while with a rollback netcode characters may teleport around. It is a matter of preference: a freeze is more "understandable", while a slight teleport is smoother to play. The biggest problem is competitive play. Even at a moderate level, players train their combos, learning to execute moves with very precise timing (even frame-perfect sometimes). If the game slows down, freezes or changes the input lag in the middle of a frame-perfect combo, it messes it up, making it fail although the player's execution was perfect. Finally, since freezes must be avoided at all costs, the player cannot be offered the choice to configure their input lag: it has to be conservative.

Another well-known solution is GGPO. It is a standalone rollback netcode made to be used by developers in their own games. It is especially popular in arcade emulation, and is the solution of choice of Skullgirls (which is known to have a good netcode). GGPO is a rollback engine; its documentation does not mention input lag, but Skullgirls allows it to be configured to a fixed number of frames. This way players can find their own balance between more rollbacks or more input lag.

It is a lot like the Super Tilt Bro.'s netcode. The good thing is that input lag is fixed and constant for an entire game. You can still perform your frame-perfect combos. The biggest problem is when there is a big latency spike: characters will teleport around for a while until the game successfully re-synchronizes itself with the other player. This kind of action is really confusing for the player. Also, the game engine has to take save-states and manage them to be able to roll back, putting pressure on the CPU and memory.

All these modern implementations have something in common: they are peer-to-peer. Not routing packets through a server noticeably reduces latency. That's a really cool solution. Super Tilt Bro., however, is server based, let's see why.

Super Tilt Bro.'s game server

We saw that peer-to-peer is the best model for online versus fighting. Super Tilt Bro. does not have this luxury. We also saw that rollback netcodes are costly for the CPU, as the game has to be able to roll back, and for the RAM, because it has to store its old states. The NES runs with an 8-bit CPU at about 1.7 MHz (1.79 MHz on NTSC, 1.66 MHz on PAL) and 2 KB of RAM. It is really far from modern systems. Implementing the rollback system entirely on the NES would be very limiting, most resources would be allocated to it at the expense of gameplay. The server is here to help.

Remember Alice and Bob? With the Super Tilt Bro.'s protocol, when Alice presses a button the game only sends the state of the gamepad and a timestamp to the server. The server knows the game, it is able to simulate frames, and compute the game's state at any point. Based on this knowledge and Alice's input, the server computes the state of the game at the frame of Alice's input, then sends all that to Bob. Bob receives the timestamp, Alice's input and a full game state. If input lag did its work, the game state can be ignored. If a rollback is needed, Bob's NES can use the state received from the server. Bob's NES does not have to manage a list of game states, the server generates it when needed, removing almost all pressure on the limited NES memory.
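Here is a rough Python sketch of what the server does with Alice's message. Everything in it is an assumption made for the illustration: the toy `step` function stands in for the real game simulation, and `EnhancedMessage`, `Server` and their fields are hypothetical names, not the actual protocol.

```python
from dataclasses import dataclass


def step(state, buttons_a, buttons_b):
    """Deterministic toy simulation of one frame; the real server runs the game logic."""
    return state + buttons_a - buttons_b


@dataclass(frozen=True)
class EnhancedMessage:
    frame: int     # timestamp of the input, in frames
    player: int    # who pressed the button
    buttons: int   # state of the gamepad
    state: int     # game state at that frame, computed by the server


class Server:
    def __init__(self):
        self.inputs = {0: {}, 1: {}}   # player -> {frame: gamepad state}

    def buttons_at(self, player, frame):
        """Gamepad state on a frame: the last one received at or before it."""
        known = [f for f in self.inputs[player] if f <= frame]
        return self.inputs[player][max(known)] if known else 0

    def state_at(self, frame):
        """Re-simulate from the start of the match up to (not including) frame."""
        state = 0
        for f in range(frame):
            state = step(state, self.buttons_at(0, f), self.buttons_at(1, f))
        return state

    def on_input(self, player, frame, buttons):
        """Turn a tiny client message into the enhanced one sent to the opponent."""
        self.inputs[player][frame] = buttons
        return EnhancedMessage(frame, player, buttons, self.state_at(frame))


server = Server()
first = server.on_input(player=0, frame=0, buttons=1)    # Alice presses at frame 0
second = server.on_input(player=1, frame=3, buttons=1)   # Bob presses at frame 3
```

The sketch replays the whole match on each input to stay short; a real server would more likely keep recent states around. The point is that this bookkeeping lives on the server, not in the NES's 2 KB of RAM.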

The server can also help with the CPU budget. Let's say there is always a minimum of two frames of delay between Alice and Bob. When receiving Alice's input, instead of computing the state at the time of the input, the server can compute the state two frames later. The server is actually predicting the future, sparing Bob's NES from doing it itself. Of course, this means a full rollback engine also has to be implemented in the server, but it can help a lot with big latencies. Currently, Super Tilt Bro. is able to roll back only three frames of gameplay before running out of time. The server doing the rollback for one or two frames can help a lot.

UDP vs TCP... And web browsers

There are two big, omnipresent protocols on the internet: UDP and TCP. TCP is the most common one, it allows a computer to connect to another and send data. In TCP land, we don't lose data, and data is received in the same order as it was sent. Most of the internet is constructed on TCP, it is simple to use. UDP is a more lightweight protocol. In UDP's world, when we send a packet, the only sure thing is that we sent it. The packet may be randomly lost, or arrive before a packet sent earlier. So, TCP is a better protocol, right? No. Sadly, TCP's magic has a cost.

In Super Tilt Bro.'s protocol we don't care a lot about lost packets. A message from the server to a client contains a full game state. A recent message completely invalidates any older one. If we use TCP and lose a message, it will be re-emitted before the receiver processes any other message. If we send two messages in a row, only the most recent one is useful to Super Tilt Bro. If the useful one has physically arrived but the useless one is lost, the useful one will be delayed until the useless one is re-emitted. During development, we tested a simple ping implementation using TCP while deliberately dropping 10% of packets. With UDP, it resulted in 10% packet loss, without impact on the ping (around 20 ms in our test setup). With TCP, it resulted in 0% of messages lost (that's the TCP magic), but with the ping sometimes reaching three full seconds.
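The blocking effect can be shown with a toy model of the two delivery policies. This is not a network simulation: the function names are hypothetical, and the arrival times (including the 3010 ms retransmission) are made-up numbers chosen to echo the test above, not a model of TCP's real timers.

```python
def in_order_delivery(arrivals_ms):
    """TCP-like: a message reaches the game only once all earlier ones have arrived."""
    delivered, latest = [], 0
    for t in arrivals_ms:
        latest = max(latest, t)
        delivered.append(latest)
    return delivered


def first_fresh_state(arrivals_ms):
    """UDP-like, latest wins: lost messages (None) are skipped, the rest used on arrival."""
    return min(t for t in arrivals_ms if t is not None)


# Three messages sent in a row. The first one is lost on its first trip.
tcp_like = in_order_delivery([3010, 26, 42])    # retransmission arrives at 3010 ms
udp_like = first_fresh_state([None, 26, 42])    # the lost message is just forgotten
```

In the TCP-like case all three messages reach the game at 3010 ms, although two of them were physically there after 42 ms. In the UDP-like case a fresh state is usable at 26 ms.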

Imagine having your input delayed by three seconds because a useless packet was lost? No way!

If you played the game, you probably noticed that it is available in an HTML5 NES emulator on the itch.io page. As innocent as it seems, that's actually a big deal. Web browsers don't let web pages send UDP packets. Super Tilt Bro.'s protocol is based on UDP and needs it. The solution was to use WebRTC, a modern protocol made for video conferencing. That's what your favorite Google Meet, Skype online, or Facebook video chat uses. Hijacking a video conference protocol to use it for gaming is a large topic. I may or may not write a devlog entry on the subject, for now I'll leave you with the real-time Twitter thread of my fight against the internet's security: https://twitter.com/RogerBidon/status/1259171335135211523

See you!

Implementing modern network capabilities for a NES game was an epic journey... And it is far from over! This made it to the ALPHA stage. It is not complete, and I am in dire need of feedback.

You can test it yourself: https://sgadrat.itch.io/super-tilt-bro

Join the Discord to find a rival and/or send your feedback: https://discord.gg/qkxHkfx

Hope this little piece of internet knowledge can help somebody out there. Do not hesitate to reach me for more details, nothing is secret. Super Tilt Bro. is an open source software. It is made to make the internet a better place, not to keep its secrets!
 


Little update on how the project is going.

As often, I went quiet for a long time. I was re-implementing the music engine from scratch. I had nothing fancy to share. The new engine is in its infancy, but already has some interesting features:

  • Able to play Famitracker files with effects (Fxx, Gxx, Dxx, Sxx, Axx, Qxx, Rxx, 1xx, 2xx, 3xx, and 4xx for now)
  • Speed and code size comparable to ggsound and famitone (if not better, always debatable)
  • Data a little heavier than ggsound/famitone, but still better than Famitracker's driver

I made my own engine to prepare for my idea of dynamic music: music that adapts itself to what happens. The engine will be able to seamlessly switch between calm or nervous versions of the same track depending on the actual action of the game. Anyway, I will write a technical highlight on what is already implemented, I am sure it has enough technicalities to write a big wall of text.

Also, I had to take a little break to make a new trailer, hope you will enjoy it.

It was made by a good friend of mine who is starting to take freelance jobs in motion design. If you have similar needs, you may contact her 😉


  • 3 months later...

Technical highlight: rollback netcode on the NES, the gory details

Hey folks! Ready for a really technical one?

Super Tilt Bro. ALPHA 5 just hit the public, and with it the ability to play online with any character on any stage. Time to take a step back, see how the netcode works, and what were the biggest challenges.

As you know from reading the previous technical highlight (or at least the illustrated part), Super Tilt Bro. implements a rollback netcode on the NES. It means that the game is always smooth: when the messages from the other NES take time to transit, the game just predicts them and continues. Sometimes the prediction turns out to be wrong, and the game has to revert the predicted frames and re-compute the real ones.

How the protocol works

Rollback netcodes are notoriously hard to implement. Browsing the internet, you will see very long posts explaining that the Switch is not powerful enough to do it. How the heck can it run on the NES?! (Spoiler: of course, by cheating!)

A very big pain point in traditional implementation of such a netcode is the memory management. The game has to keep a list of its previous states to be able to rollback from any of them. With only 2 KB of very slow memory, Super Tilt Bro. has to avoid it. The solution is to make the server smarter.

When you press a button, a message is sent to the server. This message is tiny, it contains just the button pressed and a timestamp. In turn, the server computes the state of the game when you pressed the button, so it can send an enhanced message to the other NES. The enhanced message contains the button pressed, a timestamp, and the state of the game at this point. The NES receiving such a message now has everything at hand. It just replaces its current state with the received one, and re-simulates enough frames to compensate for transit time.
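Seen from the receiving NES, this could look like the following Python sketch. It is an illustration only: `step` is a toy simulation, `apply_server_message` is a hypothetical name, and the sketch assumes the console still remembers its own inputs for the few frames to replay.

```python
def step(state, local_inp, remote_inp):
    """Deterministic toy simulation of one frame; the whole game state is one number."""
    return state + local_inp - remote_inp


def apply_server_message(msg_frame, msg_state, msg_buttons, current_frame, local_inputs):
    """Replace the state with the server's one, then catch up to current_frame.

    No list of past states is needed: the only old state used comes from the message.
    """
    state = msg_state
    for frame in range(msg_frame, current_frame):
        state = step(state, local_inputs[frame], msg_buttons)
    return state


# The opponent pressed a button at frame 1; the message arrives while we are at frame 3.
new_state = apply_server_message(
    msg_frame=1, msg_state=1, msg_buttons=1,
    current_frame=3, local_inputs={1: 1, 2: 1},
)
```

Two frames (1 and 2) are re-simulated here, which is the "transit time" the NES has to compensate for.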

The good point of this approach is that memory management on the NES is completely removed. Of course, there are drawbacks. The server must be smart, it actually has to incorporate an emulator to compute the game's state. Worse, there is no way to switch to a peer-to-peer protocol. Time will tell if this rollback netcode is better than peer-to-peer input-lag netcodes.

Introducing the main foe: the CPU

Now the NES has an old state received from the server, and must simulate missing frames. It has to be speedy. We don't want to see characters moving in fast-motion to catch up, that would destroy the timing of the player's inputs, and make them miss combos. The goal is to roll back immediately, so the game is at the same point in time it was before the message's reception. We have exactly 20 milliseconds to do it without missing a video frame (on PAL).

The first thing to do is to be able to simulate a frame in “rollback mode”. In this mode, nothing has to be displayed, so the engine can safely skip placing sprites on the screen, updating the background, and anything visual.

That done, simulating a frame in rollback mode takes 25% of the 20 milliseconds. It would be enough to roll back four frames, and compensate 80 milliseconds of ping, if the rest of the game did not take 60% of the time. Notably, it has to simulate a frame in normal mode, to show sprites and other visual effects. All in all, it allows for almost no rollback.
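The budget arithmetic, written out as a small Python sketch (`rollback_frames_available` is a hypothetical helper, the percentages are the ones quoted above):

```python
def rollback_frames_available(rollback_pct, other_pct):
    """Whole frames that can be re-simulated in what is left of one video frame.

    Both costs are percentages of the frame time (20 ms on PAL).
    """
    spare_pct = max(0, 100 - other_pct)
    return spare_pct // rollback_pct


ideal = rollback_frames_available(rollback_pct=25, other_pct=0)    # 4 frames, 80 ms of ping
actual = rollback_frames_available(rollback_pct=25, other_pct=60)  # 1 frame
```

With 60% of the frame already spent, only 40% is left, which fits a single 25% rollback frame.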

The next step is to optimize everything that is done multiple times per frame. Collision detection is where we get the biggest gains. It was old code, mostly implemented while learning 6502, and is run for both players, for each platform, multiple times. With this code rewritten, it is finally possible to play online even on complex stages.

Another big win is the optimization of the animation code. While in rollback we don't care about animating sprites on screen, in Super Tilt Bro. hitboxes are stored in animation data. Parsing all meta-sprites to extract a hitbox is time-consuming. Here the trick is to always put hitboxes at the beginning of such data, and stop reading there. Plus some low-level optimizations, which are always good to take.

[Figure: Optimizing gave space to roll back around three frames. The work is not over 🙂]

We are not yet fast enough to compensate for big latencies. Thankfully, the server is smart. It can itself predict frames, and does just that to compensate for the minimal latency ever seen between players. That way, the NES just has to handle the variable part of it. Let's say your ping oscillates between 100 and 140 milliseconds: the server will predict for 100 ms, so in the worst case the NES will have to predict only two frames.
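The split between server and NES can be written as a short Python sketch. `split_prediction` is a hypothetical helper; it assumes 20 ms PAL frames and that the server only predicts whole frames.

```python
import math

PAL_FRAME_MS = 20


def split_prediction(min_ping_ms, max_ping_ms, frame_ms=PAL_FRAME_MS):
    """Return (frames predicted by the server, worst-case frames left to the NES)."""
    server_frames = min_ping_ms // frame_ms
    remaining_ms = max_ping_ms - server_frames * frame_ms
    nes_frames = math.ceil(remaining_ms / frame_ms)
    return server_frames, nes_frames


server_frames, nes_frames = split_prediction(100, 140)   # (5, 2)
```

The server covers the constant 100 ms (five frames), the NES only the 40 ms of jitter (two frames), which fits in the three frames it can currently roll back.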

What's next?

First and foremost, this code needs to be battle tested. For sure, it is not yet perfect. The only solution is to try it, play with it, and report anything strange. The more it is used, the more we can find issues. Super Tilt Bro. needs you! Please bring some friends, play together, and send your feedback. It really helps! Your journey begins here: https://sgadrat.itch.io/super-tilt-bro

Although the online mode should work, it is still very bare-bones. There is only a very simple matchmaking available, and the user interface is the ugliest you can imagine. I guess I will now focus on the things around the netcode, beginning with a ranking system. Wouldn't it be nice to go to the official ranking page and find that you are the best player in the world?


  • 4 weeks later...



Technical highlight: C compilers for 6502 benchmark: A new hope!

Adding a C compiler to your game's toolchain is not an easy task. Each has its own strengths and weaknesses. You may want speed, ease of use, compliance with the standard, freedom, or anything else. We'll try to compare some compilers, maybe it will help you.

While choosing a compiler for my game's toolchain, speed was the least of my concerns, but discussions invariably drifted to the subject. This article will mainly focus on this question: what compiler produces blazing fast code?

Some code samples will be compared, each time with an explanation about the code, and why it may be useful to benchmark it.

That done, I'll talk about my experience with each compiler. Their strong points, their weaknesses. This part is necessarily subjective, I'll try my best to be factual.

Finally, a short "how to choose a compiler" guide will conclude the article. Hope it helps.

Let's go!

Introducing contenders

cc65 is the most used compiler. It comes with an extensive toolsuite, is actively maintained, and has a big community of users. However, it is known to be slow... We'll see 🙂

vbcc is the cool kids' compiler. Less used than the king, it has the reputation of generating largely better code.

KickC is the young promising project. Is it a toy for nerds or a real option?

6502-gcc is the mysterious one. Nobody really masters it, the project seems dead, installation instructions are cryptic, ... We'll dig in its dark secrets!

6502-gcc has two interesting optimization flags:

  • "-O3" optimize for speed
  • "-Os" optimize for code size

Run length decoding

This benchmark comes from real life. It is a function prototyped in C for Super Tilt Bro. before being converted to assembly for performance. Data used are also the original ones.

The function parses a compressed blob to extract some random bytes from it. It is pure loop logic, burning many cycles reading the memory.
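To give an idea of the kind of loop being benchmarked, here is a Python sketch of reading one byte out of a run-length encoded blob. The format is an assumption made for the illustration (a list of `count, value` pairs); the actual format used by Super Tilt Bro. and the benchmarked C function are not shown in this article.

```python
def rle_byte_at(blob, index):
    """Return the byte at position index of the decoded data, without decoding it all."""
    position = 0
    for i in range(0, len(blob), 2):
        count, value = blob[i], blob[i + 1]
        if index < position + count:
            return value        # index falls inside this run
        position += count       # skip the whole run
    raise IndexError(index)


# Decodes to: AA AA AA BB BB CC CC CC CC
blob = bytes([3, 0xAA, 2, 0xBB, 4, 0xCC])
picked = [rle_byte_at(blob, i) for i in (0, 3, 8)]   # [0xAA, 0xBB, 0xCC]
```

It is nothing but a loop, a comparison and an addition, repeated over memory reads: exactly the kind of code where the quality of the generated assembly shows.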

[Figure: bench_rle.png. Red is execution time, blue is generated code's size.]

First things first, cc65 lives up to its reputation of sluggishness there. It is a complete KO. The generated code is big and slow.

The winner is 6502-gcc, by a fair bit. KickC has a slight advantage over vbcc.

Fun fact: the more popular the compiler, the worse it performs in this test.

Memcopy

Here is the super common task of copying bytes in memory. It is important to compare compilers on what our programs will spend the most time doing.

The bench comes in two flavors:

  • The normal one directly copies bytes with a for-loop
    • the compiler knows both addresses and number of bytes to copy, food for optimization
  • The no-inline one is a "memcpy-like" function, copying at most 256 bytes at once, not inlineable
    • the compiler knows nothing, it has to implement a generic copy function

[Figure: bench_memcpy.png. All graphs are sorted: slower on the left, speedier on the right.]

cc65 is still on the last spot.

KickC confirms that it is slightly better than vbcc. Note KickC's absence from the no-inline version: KickC seems to always work with the full source, so there is no way to forbid inlining.

6502-gcc takes a hit. Worse, extensive tests showed that results vary a lot when changing little things in the code.

Also, did you notice how "-O3" poo'd itself in the inline scenario? gcc (the "real" one as well as 6502-gcc) works in two steps: the frontend optimizes C code, then the backend translates it to machine code. The frontend of 6502-gcc is the regular one, it does wonders, then comes the backend. The 6502 backend is seriously lacking love, and it is showing here: the frontend detected that the code is equivalent to a "memcpy", and told the backend "copy 200 bytes from $0400 to $0200". The backend was expected to implement it in the most optimal way, instead it just called the standard "memcpy" function, which is far from optimal for an 8-bit CPU.

RPG engine

This bench is deliberately made to take advantage of 6502-gcc's strength: its frontend.

It is an RPG engine with lots of abstraction. There are structs, functions making the player strike, functions for hitting monsters, the player is wielding a weapon, ...

The benched function initializes the game's state, plays one turn, then returns. So while there are functions updating structures, adding attack points, and subtracting hit points, the end result is just to set the memory in a particular state.

[Figure: bench_rpg.png]

cc65... it becomes redundant. Ok, this test is definitely not for cc65, which refuses to do high-level optimization. This benchmark actually tests the quality of the high-level optimizer.

KickC is a better vbcc performance-wise. That's a trend.

6502-gcc doesn't disappoint. Its frontend is world-class, and nails it perfectly. The generated code is a short list of "lda <constant> : sta <memory>", with even redundant LDAs pruned.

Code tailored for cc65

Did you read Ilmenit's great essay "Advanced optimizations in CC65"? Here it is: https://github.com/ilmenit/CC65-Advanced-Optimizations

Let's save cc65! Here we bench a code especially tailored for cc65, made by an expert. Should do it, yes?

The code itself is comparable to the RPG benchmark: it is a uselessly abstracted RPG engine. The difference: it loops 100 times and prints characters on screen each time (so, no way to reduce it to a list of LDA+STA, computations will be done). Then, Ilmenit applies various optimizations to make it really fast.

We'll bench two versions: the first one, without any optimization, and the last one, fully optimized.

[Figure: bench_ccgame.png]

KickC disappeared, and vbcc is half absent!

KickC had just too many limitations to compile the code: notably it refuses to output any runtime modulo or division.

vbcc generated an infinite loop on the unoptimized code, and suffered my lack of experience when integrating it in the benchmarking tool (hence the lacking code size.)

6502-gcc exhibited a bad bug (or once again my lack of experience), it puts global variables in random segments. The assembly generated had to be fixed by hand. [Edit: the bug is fixed now, 6502-gcc works without manual intervention.]

So, even before watching graphs: cc65 is the winner! It compiled this code!

Interestingly, 6502-gcc does not perform well on the unoptimized version. After all that has been said about its magic high-level optimizations! The problem here is that it was not allowed to inline functions. In C, when you declare a function without preceding it with "static", it has to be accessible to other compilation units, it must be there and fully usable: passing parameters on the stack and all that. Of course, cc65 doesn't inline anyway, so it makes little difference to it.

On the optimized version vbcc seems to be lacking, and cc65 is slightly better than 6502-gcc. The code has been optimized by hand to be straightforward to compile, so the compilers have not a lot of freedom to improve things.

What about this "???" entry? If you didn't read "Advanced optimizations in CC65" by now (shame on you!), you don't know how the optimized code is a mess to read. All available tricks have been used, even the author does not recommend going so far in real life.

The "???" is "6502-gcc -Os" on a version with only two of the 12 optimizations plus the use of the "static" keyword. The resulting code is a lot like the unoptimized code, and the performance penalty is almost gone. That's why I think high-level optimizations are what matter: they allow writing in C, without caring much about the generated assembly. Leave low-level tricks to the compiler, focus on the logic.

Aside from performance

Ok, performance-wise it seems that cc65 is seriously lacking, vbcc and KickC are almost on par, and 6502-gcc varies from excellent to trash.

Actually, performance should not matter a lot when choosing a C compiler. Be sure to learn the assembly language, and you'll be able to get perfs where it is needed. Here is a summary of pros and cons of each compiler.

cc65
Pros

  • Rock-solid and battle-tested, it will not let you down.
  • Active development
  • Great resources available to learn
  • One of the most complete toolsuite out there

Cons

  • Performances (seriously, that's its only dark point)

vbcc
Pros

  • Acceptable performances
  • Active user base
  • Extensive documentation
  • Complete suite of tools (assembler, linker, versatile config files, ...)

Cons

  • Terrible licensing (like, I will recommend avoiding it just for that)
  • Buggy

KickC
Pros

  • Good performances
  • Active development (and a good base, may the future be bright!)

Cons

  • Hard to integrate with other tools (made to compile an entire project)
  • Compliance with C standard very partial (even "const" is badly supported)
  • Compiling is slow
  • Terrible compilation errors (at times, you just have a stack-trace)

6502-gcc
Pros

  • 100% compliance with the C standard
  • God-tier high-level optimizations
  • Generates assembly for ca65 (taking advantage of the cc65's toolsuite)
  • Human readable and on-point warnings and compilation errors.

Cons

  • Development inactive
  • Buggy
  • Variable quality of the generated code

Ok, but which one to choose for my project?

As always, it depends! Who are you?

Are you an experienced developer accustomed to a particular one? Stay on it. You already learned to master your tool, others bring little benefit.

Is it your first C project for the 6502? Go for cc65, it is the most mature, and you'll find help.

You want high-level optimizations, and something that just works? KickC is made for you.

You want high-level optimizations, to rely on the fine print of the C standard, and are ready to build your own toolchain? 6502-gcc is the way.

No, I won't recommend vbcc. Its licensing is terrible: it is closed source and you cannot use it for "commercial purpose" (without a definition of it, nor a word on whether it applies to generated executables). Also, it incorporates various tools and libraries with various licenses. If you want to do things "the right way", you will have to check at least three licenses to know if you can do what you have in mind.

Last word

Benchmarks are a nice tool to see the general picture, but they can never be perfect. Especially these ones: it was my first experience with most of those compilers, and I may have gotten some details wrong. If you want to play with it yourself, the tool used is available here: https://github.com/sgadrat/6502-compilers-bench. It takes C files, and outputs speed metrics as well as the generated assembly.

Hope this little research can help somebody out there. Remember, in retro-development, the most important is to have fun!

 


As the author of the 6502-gcc compiler port -- thank you for including my work in your benchmark comparison! I don't have a lot of time to work on it at the moment but I love to see people getting some use out of it! I'm a bit puzzled as to why the memcpy test came out so bad. The libtinyc version of memcpy (if that's what's being used) is hand-written and I don't think THAT terrible...


Wooo! Hello Itszor, I did not find a way to reach you online, very humbled that you are reaching out to me! I wanted to thank you for your work on 6502-gcc, I use it for non time-critical parts of my game, and really love it! Thank you. Since I tried it, I have been advocating for it at every opportunity.

About memcpy

For the memcpy, the thing is that the compiled code uses an 8-bit counter, and doesn't return the original address. So even a perfectly written memcpy can be beaten. Also, it is on the inlineable version that "gcc -O3" called memcpy instead of writing optimized code for the case.

Compiled code is there: https://github.com/sgadrat/6502-compilers-bench/blob/master/code_samples/memcpy/memcpy_8bit_c_style_static.c
Perfect asm (without unrolling): https://github.com/sgadrat/6502-compilers-bench/blob/master/code_samples/memcpy/memcpy_8bit_asm_static.s (just noticed it is buggy and copies from $401 to $201, anyway that's the idea)

ASM output by gcc:

    benched_routine:
    ; frame size 0, pretend size 0, outgoing size 0
        lda #$c8
        sta _r4
        lda #$00
        sta _r5
        sta _r2
        lda #$04
        sta _r3
        lda #$00
        sta _r0
        lda #$02
        sta _r1
        jsr memcpy
        rts

 

About the variables in random segments

This code:
 

    int g_var;
    int const g_const = 5;
    int g_var2;

    void benched_routine() {
        g_var = 5;
    }

Compiles to (gcc -O3):

		.code
		.segment "CODE"
		.export benched_routine
	benched_routine:
	; frame size 0, pretend size 0, outgoing size 0
		lda #$05
		sta g_var
		lda #$00
		sta g_var+1
		rts
		.global g_var2
	g_var2:
		.res 2
		.export g_const
		.segment "RODATA"
	g_const:
		.word	$0005
		.global g_var
	g_var:
		.res 2

g_var2 is in "CODE" segment, while g_var is in "RODATA", while both should ideally be in "BSS" or at least in "DATA".

Note that I don't use global variables in my game, so I only discovered this while doing these benchmarks. It is the first bug I have encountered while using 6502-gcc every day.

The best would be to make it a GitHub issue, I guess. I wanted to take some time to look for an easy fix and open a pull request instead of an issue (but you know... time... so rare a resource).

Edited by RogerBidon
Adding useless note on parenthesis

Thank you for taking the time to write up those problems! I think both should be fixable - I'll try to find time to do it soonish. (If you find more problems, feel free to create github issues on the gcc-6502-bits repo. I don't check it all the time but I'll see them eventually!).

Thanks,

Julian


  • 4 weeks later...

Hey! This is the first time I share a release note here. I guess you homebrew players also hang out in the brewery, so it may be of interest. Skim through it, and see what's new.

I will post such a release note with each future version. Do not hesitate to give feedback on the content or the format.

Super Tilt Bro. 2.0-alpha6: Worldwide ranking!

What's new in the game?

Ranked play

You can now choose between “Casual” and “Ranked” mode. Casual plays just like before, while in ranked you have a Match Making Rating (MMR). Winning a match gives you MMR points, while losing lowers your MMR.

A worldwide leaderboard is available on the brand new official website: https://super-tilt-bro.com/leaderboard.html

leaderboard.png
Come, the top spot is waiting for you!

Private game

In this mode, you can make sure to be matched with your friend. Simply share a password to be matched together.

private_game.gif
Starting a private game

Longer hitstun

When you receive a hit, there is a short time during which your character blinks and you cannot do anything. It is essential for combos, which consist of repeatedly hitting your opponents while they are in hitstun to keep them in this state. Kill moves, which send the opponent far away, also rely on it to prevent the momentum from being cancelled by a special move.

Hitstun duration has been increased: it is now 1.5 times what it was. Before that, there was literally no true combo in the game. So Sinbad now has more efficient combos, and Kiki's strong strikes are more dangerous.

Kiki's moveset change

Originally, Kiki's down-special put a wall below her while neutral-special was a counter strike. These moves swapped inputs: down-special is now the counter strike while neutral-special creates a platform.

It is a simple fix, as the platform is Kiki's main recovery move, and it is unintuitive to press “down” to avoid falling.

Kiki's counter strike revamp

The counter strike now has more active frames, and more end-lag frames. Moreover, if done in the air, it slows Kiki's fall. The counter strike is still active at frame one.

counter.gif
Kiki's new counter strike

More active frames means your timing can be a bit less precise and still pull it off. More end-lag, on the other hand, makes it easier to punish Kiki for missing it.

The slow fall forces Kiki to set up a proper bait. The sudden change in momentum may make the opponent miss a strike, leaving Kiki vulnerable. It also lessens the risk of being KO-ed by falling too low while in end-lag.

What's new under the hood?

New servers

Originally, there was only one server per region. This server handled the matchmaking and the games.

Now there is a new login server, which handles user accounts, a ranking server, which computes MMR, and a website. This architecture has been intentionally split into tiny, very specific servers. Very specific servers are easier to debug and operate than big ones.

It also allows for a future migration to cloud hosting. Theoretically, if we split the matchmaking from the game server, we could spawn as many game servers as needed to serve all players at any time. Spawnable game servers would have another advantage: the end of regional servers, as we could just spawn the server near the players. Of course, all that is just wishful thinking for now, but better be ready, it costs nothing.


  • 2 months later...

Super Tilt Bro. 2.0-alpha7: Welcome Pepper & Carrot!

What's new in the game?

New character

 

Pepper is a young witch from the awesome online comic Pepper & Carrot. She lives a thrilling life, learning alchemy and witchcraft while exploring the world and its dungeons. The comic itself is of excellent quality, and free (go read it!) That does not only mean that you don't pay money or watch ads to access it. It is free as in freedom: you are free to read it, print it, improve it (if you can), make fan art, ... Actually, putting Pepper in Super Tilt Bro. is openly fan art! (Did you know? Super Tilt Bro. shares the same ideas of freedom.)

Now, how does she play in Super Tilt Bro.?

Pepper is a fast-moving, short-ranged character with unrivaled combo ability. She can teleport around, and fly on her broomstick giving her the mobility to follow combos, even on strong hits. She also throws firework potions around, giving her a decent zoning and strong juggling.

Kiki changes

Kiki's recovery move is to draw a platform behind her feet. It was originally her down-special, which was unintuitive. In the last version it was changed to be the neutral-special, which was not better. The only good mapping for a recovery move is up-special, so it is finally correctly and definitively fixed!

Down-tilt had neither startup time nor end-lag. It was not on par with the design of Kiki, who should hit hard but be punishable if she misses. It was also too strong, so there is now a little startup time and some end-lag.

Counter-strike now keeps some momentum. It should not impact gameplay much, but feels more natural.

Formerly, you could fly by spamming up-special. It is fixed, try it and you will fall (while appearing ridiculous.)

Common gameplay changes

Fast-fall is now activated by releasing the down-button. It allows inputting down-aerial, and down-special without involuntarily fast-falling.
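
Triggering on release rather than on press comes down to comparing the previous frame's controller state with the current one. Here is a minimal sketch of that edge detection. The BTN_DOWN bit value and the function name are assumptions made for the example, not the game's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit for the "down" direction in this sketch;
   the game's real button mapping may differ. */
#define BTN_DOWN 0x04u

/* True only on the frame where "down" goes from held to released. */
static bool down_released(uint8_t prev_buttons, uint8_t cur_buttons) {
    return (prev_buttons & BTN_DOWN) != 0 && (cur_buttons & BTN_DOWN) == 0;
}
```

While down is held, the function stays false, which leaves room to input a down-aerial or a down-special before the fast-fall kicks in.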

Input of down and up attacks has been eased: the game is more tolerant of other buttons being pressed at the same time. It is especially noticeable when attacking at the beginning of a jump, where controls feel more responsive.

Fast-fall is now cancelled by mid-air jumps.

Online mode

Improved netcode on connections with high variance in ping. Because there is server-side prediction, on such connections the game could receive inputs dated in the future. This was terribly handled, leading to big desynchronization between players.

Fixed being disconnected for being idle more than 30 seconds in game (only on the web client.)

Added a new menu “Settings” in online mode. From there you can create an account for ranked play, configure your Wi-Fi (only on real cartridges), or update the game (also on real carts.)


  • 1 month later...

@fcgamer No problem, I'll contact you via private message on VGS, is that ok?

Note that it will still take months before the game is in a good-enough state to be sold. I want to make it the best I can before asking for money. In the meantime, you can play the alpha, which just received an update! Release-note time!

------

Super Tilt Bro. 2.0-alpha8: New musics, and fixes

What's new in the game?

New musics

Two new music tracks, by Kilirane, can be heard in game.

The first is inspired by Sinbad, while the second depicts Pepper's personality.

You will also notice that the main theme is now played at the correct pace in the menus. The audio engine has actually received some long-deserved love in this patch, including the ability to play NTSC music correctly on PAL systems, and vice versa.

Kiki

egWOVjA.gif
Kiki can now walljump

While Kiki does not really need a walljump, this makes it available to all characters. Kiki was the only one unable to do it, which was disturbing after playing other characters. So walljump is now a standard move, like the double-jump, and all future characters are expected to have it.

Fixes

Kiki can no longer get a free jump by buffering the jump action during an aerial. This was a long-standing bug: it originally affected all her moves, then was reduced patch by patch. All cases should be squashed now.

Kiki no longer ignores gravity when spamming side-special.

Kiki's AI no longer KOs itself when thrown off screen. Before this fix, you could often have Kiki's AI draw a platform off screen, then run straight to the blast-line.

Sinbad can no longer buffer grounded moves when his side-tilt lands beside the platform.

Pepper's teleport no longer kills her randomly.

Pepper's teleport on-spot hitbox is now correctly placed at her starting position.

Join the bug-squashing squad!

The game can be freely played (in emulator) here: https://sgadrat.itch.io/super-tilt-bro
To find an online match, there is no better place than Discord: https://discord.gg/qkxHkfx


Quick update: you can now register your email address on https://www.super-tilt-bro.com to be notified when the physical release draws near 🙂

It is a one-time mailing list. You will only receive the Kickstarter* launch notification, straight to the essential!

* Maybe a preorder notification if we decide to go without Kickstarter.

postcard.png


Super Tilt Bro. 2.0-alpha9: Redrawn menu, replays, and wishlist

What's new around the game?

Wishlist the cartridge now!

Ok, there is no Steam for retro game cartridges. We do it in the pure homebrew tradition: our wishlist system is crafted at home with love!

To wishlist your Super Tilt Bro. cartridge, go to https://super-tilt-bro.com/wishlist.html and fill in your e-mail address. This is a notification system: you will receive one mail when pre-orders are available. Nothing more, nothing less.

As soon as the mails are sent, the list will be deleted. We care about your privacy, and the well-being of your inbox. If you want to follow the game's development, you should also follow Super Tilt Bro.'s Twitter or join us on Discord.

Replays

You outrageously out-played your opponent but forgot to record your screen? That happens all the time!

We know this terrible feeling when you cannot show your best moves to the world. You deserve internet felicity for every win! So here comes the replay system.

After playing an online game, you can now head to https://super-tilt-bro.com/replay.html to watch the replay of the game. As simple as that!

It's then up to you to record your screen, and share the game. You can also use this page to watch games, it is kind of a Super Tilt Bro. TV channel.

What's new in the game?

Redrawn online menu

The online menu has been completely redrawn by Martin Le Borgne.

online_mode.gif
Woah!

Other menus have also been updated. It is more subtle but you may notice that there are more sound effects than before, the navigation is slightly better, and clouds' parallax is more consistent.

Bad ping is gone

This is about the infamous "bad ping" message shown when trying to connect to a server.

This message appears when the game decides that network conditions are too bad to even try to connect. Originally, it was triggered when ping was above 200 ms.

With time, and netcode improvements, this limit became too low. The game can handle 200 ms ping and still be playable. The limit has been raised to 800 ms.

Gameplay improvements

The fast-fall mechanic has been improved. Now, your character shines while fast-falling, and weird cases where you stayed in fast-fall mode for too long have been fixed.

Added sound effects for jump, fast-fall, and land moves.

Finally, the maximum time allowed on the respawn platform has been decreased, from five to four seconds.

Kiki changes

Most of Kiki's animations have been extended. The goal is to smooth the animations, and to add end-lag to her moves. Kiki has always been a character with good range and strong attacks. The idea is that she is strong at punishing her opponent's mistakes, but should not spam her attacks. Search for the weakness in your foe's play, and exploit it. Adding end-lag to her animations ensures that Kiki's player has to think twice before attacking.

Animations change details:

- Jab is now 16 frames instead of 12. It can still be cancelled by another jab at any point.
- Side-tilt: Added 4 frames of end-lag. The animation is now 20 frames instead of 16.
- Up-tilt: Added 4 frames of start-lag, and 4 frames of end-lag. Removed 8 frames with active hitbox. The animation's duration is unchanged, 16 frames.

Talking about animations, Kiki now has a proper animation for drawing a platform above her head.

kiki_paint_up.gif

When landing during an attack, Kiki slides on the ground performing the attack. This is not new, but has changed a little. Tilt-slide, and jab-slide distances are slightly reduced.

Finally, Kiki's platforms now last 2 seconds instead of 2.4 seconds.

This patch happens to be quite a nerf for Kiki. It will need some serious play-testing to check that balance is not now broken the other way around.

Sinbad changes

Slide attacks go slightly farther.

What's new under the hood?

Real NTSC compatibility

There are some differences between the PAL and NTSC NES. The most impactful one is the frame rate. An NTSC NES displays 60 frames per second, while the PAL version outputs only 50 frames per second.

Super Tilt Bro. has been primarily developed on PAL hardware. That said, since the beginning the game is playable on NTSC systems. Until now the trick has been to double one out of six frames, to virtually reduce NTSC frame rate to 50 frames per second.

ntsc_pal_framerate.png
Frame doubling trick to get 50 FPS on NTSC
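
As a rough illustration of the trick, here is a minimal sketch in C. The function name and the counter are made up for the example, and the real game is not written this way. The idea is simply that the game logic is skipped on every sixth displayed frame, so the previous image is shown twice.

```c
#include <stdbool.h>
#include <stdint.h>

/* Call once per displayed NTSC frame (60 per second).
   Returns false on every sixth call, meaning "do not advance the game,
   show the previous frame again". The game itself then advances
   50 times per second. */
static bool should_run_game_tick(uint8_t *counter) {
    *counter = (uint8_t)(*counter + 1);
    if (*counter == 6) {
        *counter = 0;
        return false;
    }
    return true;
}
```

Over 60 displayed frames the function returns true 50 times, which is the PAL pace.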

While it basically works, it is far from ideal. This trick not only loses the extra smoothness of 60 FPS, it adds extra bumpiness by oscillating between 60 and 30 FPS.

So, how do we get rid of this doubled frame? There are three impacted domains: physics, animation, and audio.

In physics, all speeds have to be slower on NTSC. An object that moves 1 pixel per frame at 50 FPS needs to move 0.83 pixels per frame at 60 FPS to keep progressing at the same speed (50 pixels per second.) So speeds need to be multiplied by 0.83, but here's the catch: the NES is very poor at two things, multiplication and floating point arithmetic. A full description of the solution would not fit in this release note. Long story short: by decomposing the multiplication byte by byte, and reserving 1 KB of ROM for lookup tables, it is possible to multiply by any constant (even 0.83) blazing fast.
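
For the curious, here is a minimal sketch of the byte-by-byte idea in C. It is an illustration under assumptions, not the game's code: it supposes speeds are unsigned 8.8 fixed-point values, the names are made up, and the tables are filled at startup where a real cartridge would store them as constants in ROM. Two tables of 256 16-bit entries take exactly 1 KB.

```c
#include <stdint.h>

/* Two tables of 256 16-bit entries: 2 * 256 * 2 bytes = 1 KB.
   scale_hi[h] holds (h * 256) * 5 / 6, scale_lo[l] holds l * 5 / 6. */
static uint16_t scale_hi[256];
static uint16_t scale_lo[256];

static void init_scale_tables(void) {
    for (uint32_t i = 0; i < 256; ++i) {
        scale_hi[i] = (uint16_t)((i * 256u * 5u) / 6u);
        scale_lo[i] = (uint16_t)((i * 5u) / 6u);
    }
}

/* speed is assumed to be unsigned 8.8 fixed point, in pixels per PAL frame.
   The product is the sum of two lookups, one per byte of the operand:
   no multiplication is executed at run time. */
static uint16_t pal_to_ntsc_speed(uint16_t speed) {
    return (uint16_t)(scale_hi[speed >> 8] + scale_lo[speed & 0xFFu]);
}
```

For example, a speed of $0100 (1 pixel per frame) becomes 213, that is 213/256 or about 0.83 pixels per frame. Because each table entry is truncated, the result can be one unit below the exact value.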

Character animations also need to be slowed down. These animations do not actually run at 50 FPS; they are more like animated GIFs: each encoded frame takes ROM space, so we limit their number. For such animations, the frame doubling trick is completely unnoticeable, slightly extending one animation frame's duration from time to time. We just take care to always extend the same animation frames, so the animations always play exactly the same way. It is important to have consistent frame data in versus games.
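
One way to always extend the same animation frames is to derive the NTSC durations from the position inside the animation, rather than from a global frame counter. The sketch below is only an assumption about how this can be done (the function name and the rounding rule are made up for the example): it converts each frame's end time from PAL ticks to NTSC ticks, rounding to the nearest tick.

```c
#include <stddef.h>
#include <stdint.h>

/* Converts per-frame durations from PAL ticks (50 Hz) to NTSC ticks (60 Hz).
   The rounding depends only on the position inside the animation, so the
   same animation frames are extended every time it is played. */
static void pal_to_ntsc_durations(const uint8_t *pal, uint8_t *ntsc, size_t count) {
    uint32_t pal_end = 0;       /* end of current frame, in PAL ticks */
    uint32_t ntsc_prev_end = 0; /* end of previous frame, in NTSC ticks */
    for (size_t i = 0; i < count; ++i) {
        pal_end += pal[i];
        /* pal_end * 6 / 5, rounded to the nearest tick */
        uint32_t ntsc_end = (pal_end * 6u + 2u) / 5u;
        ntsc[i] = (uint8_t)(ntsc_end - ntsc_prev_end);
        ntsc_prev_end = ntsc_end;
    }
}
```

With four animation frames of 4 PAL ticks each, this gives durations of 5, 5, 4 and 5 NTSC ticks, identical on every playback.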

And the audio engine. Yes, the audio is impacted by the frame rate: in the NES, the most reliable timing information we have is the periodic display of a new frame. We naturally use it to regulate music's beat. As most authors compose for the NTSC NES, Super Tilt Bro.'s audio engine has been converted to NTSC-native. On PAL, it occasionally plays two beats in one frame to simulate 60 FPS.
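
The "two beats in one frame" part can be pictured with a small accumulator. This is a minimal sketch with hypothetical names, not the actual audio engine: each PAL frame adds 6 to the accumulator, and every 5 accumulated units are worth one music tick, so five frames produce six ticks.

```c
#include <stdint.h>

/* Call once per PAL frame (50 per second) with a persistent accumulator.
   Returns how many music ticks to play during this frame: 1 most of the
   time, 2 once every five frames, for an average of 60 ticks per second. */
static uint8_t music_ticks_this_frame(uint8_t *acc) {
    uint8_t ticks = 0;
    *acc = (uint8_t)(*acc + 6);
    while (*acc >= 5) {
        *acc = (uint8_t)(*acc - 5);
        ++ticks;
    }
    return ticks;
}
```

Starting from an accumulator of 0, the sequence of returned values is 1, 1, 1, 1, 2, then it repeats.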

Note that while it feels the same, the game is slightly altered: 50/60 is not exactly 0.83, and frame data is not the same between PAL and NTSC. That sadly means that the netcode cannot accommodate PAL/NTSC cross-play, as desynchronizations would be terribly common. An option could be to fall back to the frame doubling trick only for cross-play, which may be better than nothing.

And voilà! You know the fun parts of being PAL/NTSC compatible (the right way.) It's been a long time without "under the hood" section in the release note. Please tell if it was a nice surprise, or "too technical, didn't read." Maybe this subject deserves a complete devlog entry?
