Backslide to Arcanum

Technical Details

I’ve always really enjoyed reading about the technical details involved in other demos, like the write-ups from Leonard, Paradroid, and Slummy. So, I’ve included my own commentary below.

Logo & Title

The logo and title screens were coded up only recently when it became obvious that static pictures were not going to cut it. Something more than fades was needed to stand up against Magnar’s awesome music!

Both the logo and title use the same code, which is pretty simple, but the details make the end results quite interesting.

At its core, there’s a static image, and that just gets copied to the back-buffer in 16x16 tiles. The tile destinations stay the same each frame, but the source positions are updated for whichever warping pattern is playing.

It’s all a little stale-looking if the destination tile grid remains the same, so instead, we copy one extra row and column, which allows us to then adjust both the x and y scrolling. The source positions also need to be adjusted to compensate. The consequence of this is that the tile splits are constantly moving, which helps liven the whole thing up and make it less obvious what’s actually going on.
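As a rough illustration of the tile bookkeeping described above, here is a minimal sketch in Python (chosen for readability; the demo itself is 68k assembly). The function name, the `warp` callback, and the assumption that the source image is padded so slightly negative coordinates stay valid are all hypothetical, not taken from the demo.

```python
TILE = 16
SCREEN_W, SCREEN_H = 320, 160

def tile_copies(scroll_x, scroll_y, warp):
    """Return (src_x, src_y, dst_x, dst_y) for every tile copied this frame.

    scroll_x / scroll_y (0..15) are the display scroll offsets.
    warp(col, row) is a hypothetical callback giving the per-tile source
    displacement (dx, dy) for the current animation frame.
    """
    cols = SCREEN_W // TILE + 1   # one extra column so the view can scroll in x
    rows = SCREEN_H // TILE + 1   # one extra row so the view can scroll in y
    copies = []
    for row in range(rows):
        for col in range(cols):
            # Destinations never move inside the back-buffer.
            dst_x = col * TILE
            dst_y = row * TILE
            dx, dy = warp(col, row)
            # The display shows the buffer shifted by the scroll amount, so
            # the source is shifted the other way to keep the image in place.
            src_x = dst_x - scroll_x + dx
            src_y = dst_y - scroll_y + dy
            copies.append((src_x, src_y, dst_x, dst_y))
    return copies
```

With a zero warp the picture stays where it is on screen while the tile seams slide underneath it, which is the "moving splits" effect described above.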

Using the Blitter to copy an arbitrary 16x16 square requires a 32x16 blit. To run at 50 Hz, the screen is limited to 21x11 tiles for a 320x160 8-color image, which is close to the limit for the Blitter. We drive the Blitter from our copper list to keep it fed as quickly as possible while leaving the CPU free.
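The 32x16 figure follows from the Blitter working in 16-pixel words: an arbitrary x position rarely lands on a word boundary, so a 16-pixel-wide square straddles two words. The back-of-the-envelope arithmetic below uses only the numbers quoted above, plus the fact that eight colors means three bitplanes; the variable names are illustrative only.

```python
TILE = 16
cols = 320 // TILE + 1           # 21, including the extra scroll column
rows = 160 // TILE + 1           # 11, including the extra scroll row
planes = 3                       # 8 colors = 3 bitplanes
words_per_row = 2                # 32 pixels wide = two 16-bit words

tiles = cols * rows
words_per_tile = words_per_row * TILE * planes
words_per_frame = tiles * words_per_tile
```

That comes to 231 tiles and 22,176 destination words per frame, before counting the source reads.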

While the Copper is busy working the Blitter, the CPU is writing new source positions into the next copper list. For the patterns themselves, there really isn’t enough CPU grunt to do anything interesting, so I cheated. All frames for the source positions are in a large animation buffer that takes up about 360k of Slow-RAM. Thankfully, LDOS is so fast at loading and decompressing that there’s time to load a second animation in the background, which means the patterns can be different for the logo and title. The animation data itself is stored as deltas which compress really well, so there was no concern about disk space.
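To show why delta storage suits this kind of data, here is a minimal sketch of encoding and decoding. It assumes the deltas are taken frame-to-frame for each tile's source position; the write-up does not say whether that, or some other arrangement, is what the PC tool actually emits.

```python
def encode_deltas(frames):
    """Keep frame 0 as-is, then store each later frame as the difference
    from the frame before it. Smooth motion gives small, repetitive values."""
    encoded = [list(frames[0])]
    for prev, cur in zip(frames, frames[1:]):
        encoded.append([c - p for p, c in zip(prev, cur)])
    return encoded

def decode_deltas(encoded):
    """Rebuild the absolute positions by accumulating the deltas."""
    frames = [list(encoded[0])]
    for deltas in encoded[1:]:
        frames.append([p + d for p, d in zip(frames[-1], deltas)])
    return frames
```

Because a warp pattern changes only slightly between frames, most deltas are tiny, which is the kind of input a general-purpose packer handles well.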

To generate the patterns, I actually prototyped things in Shadertoy. The swirling and wobbly zooms for the “Arcanum” title were the original patterns and took a little time to perfect. When I was happy with those, I ported them to a PC tool that could output a suitable Amiga binary. The first two animation patterns for the “Cosmic Orbs” logo are quite a bit simpler to better suit the style of the artwork.

I think the real cherry on top for this sequence is the sound effects Magnar added. Those really sell the effects and make it all feel a lot more purposeful.

Chunky 3D Rotozoom

I’ve always felt a little disappointed by most copper chunky effects, including my own, and originally I didn’t intend to work on a new one. So it’s amusing that I ended up making this extra chunky example!

The effect starts out by poking fun at both my routine and all the others out there. The idea was to heighten the impact of the final full-3D rotator, which I’m pretty sure is a first on A500. But it doesn’t seem like there’s been much reaction to that one, maybe just because the big rotator that follows grabbed the attention.

The screen is set up so each chunky line of copper list instructions can be repeated easily using copper jumps. Dithering and flickering cover up the low resolution using a combination of wait offsets and jumps to execute the correct chunky line at each y position.

None of this is really new, but a unique addition was to use the Blitter to fill the copper list colors. The CPU code works on the rotozoom and fills an intermediate buffer which the Blitter then writes to the copper list. Doing things this way opens up the option for the Blitter to combine multiple buffers from past frames, which is how we’re able to add a chromatic blur. Funnily enough, the gray effect for the first “Boring” rotator is using the Blitter to recombine one buffer. It takes one of the color channels and moves that into each of the RGB positions.
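The channel tricks can be sketched on plain 12-bit 0xRGB color words. This is only an illustration of the logic: how the Blitter passes, masks, and shifts are arranged in the demo is not described, and both function names are made up here.

```python
def gray_from_channel(color, shift=4):
    """Copy one 4-bit channel of a 12-bit 0xRGB color into R, G and B.

    shift picks the channel: 8 = red, 4 = green, 0 = blue.
    Multiplying by 0x111 is a Python shortcut; hardware would combine
    masked, shifted copies instead.
    """
    c = (color >> shift) & 0xF
    return c * 0x111

def chromatic_blur(newest, older, oldest):
    """Take red from the newest frame, green from the previous one and
    blue from the one before that (the ordering is an assumption)."""
    return (newest & 0xF00) | (older & 0x0F0) | (oldest & 0x00F)
```

When the image is static, all three buffers agree and the blur disappears; during fast movement the channels separate into colored fringes.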

There are really three different rotators in this section, and each one fills the overscan as much as possible. The first colorful rotator has chunks that are 8x4 pixels and a resolution of 44x68. This one is using the classic self-modifying technique, where each line reuses the same single set of interpolated UV offsets. This causes some break-up artifacts when you zoom in close, so for the next version, we switch to a lower resolution with 8x8 chunks but trade up to a precise per-pixel interpolator, which makes the zooming look much nicer. It’s a subtle improvement but maybe something other coders will have noticed.
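The difference between the two interpolators can be sketched as follows. This is my reading of the description rather than the demo's code: the fixed-point format (`FP`), the function names, and the power-of-two square texture are all assumptions.

```python
import math

FP = 8  # fractional bits for fixed-point texture coordinates (assumed)

def _steps(angle, scale):
    """Texture-space step (du, dv) for moving one chunk to the right."""
    du = int(math.cos(angle) * scale * (1 << FP))
    dv = int(math.sin(angle) * scale * (1 << FP))
    return du, dv

def rotozoom_shared_offsets(texture, out_w, out_h, angle, scale):
    """Classic variant: one table of whole-texel offsets shared by every line.
    The fractional part of each line's start is dropped, which is what
    shows up as break-up when zoomed in close."""
    mask = len(texture) - 1
    dudx, dvdx = _steps(angle, scale)
    dudy, dvdy = -dvdx, dudx
    offsets = [((x * dudx) >> FP, (x * dvdx) >> FP) for x in range(out_w)]
    out = []
    for y in range(out_h):
        bu = (y * dudy) >> FP
        bv = (y * dvdy) >> FP
        out.append([texture[(bv + ov) & mask][(bu + ou) & mask]
                    for ou, ov in offsets])
    return out

def rotozoom_per_pixel(texture, out_w, out_h, angle, scale):
    """Precise variant: u and v keep their fractions for every chunk."""
    mask = len(texture) - 1
    dudx, dvdx = _steps(angle, scale)
    dudy, dvdy = -dvdx, dudx
    out = []
    row_u = row_v = 0
    for _ in range(out_h):
        u, v = row_u, row_v
        row = []
        for _ in range(out_w):
            row.append(texture[(v >> FP) & mask][(u >> FP) & mask])
            u += dudx
            v += dvdx
        out.append(row)
        row_u += dudy
        row_v += dvdy
    return out
```

At an angle of zero and a scale of one the two produce identical output; the differences only appear once the steps have fractional parts.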

For the 3D rotator, we stick with the 8x8 chunks, but the interpolator gets a lot trickier. This time we use a quadratic approximation to get as close as possible to a perspective-correct rotator. Arriving at a fast inner loop with just enough precision was a fun challenge, and the supporting math took a while to figure out. That it runs inside 50 Hz is a bit of a miracle, but there was just enough time left to keep the window of text on top to let people know it’s a little special.
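One common way to get a quadratic approximation of a perspective curve is to fit a parabola through three exact samples and then step along it with forward differences, so the inner loop needs only additions. The sketch below shows that general idea in floating point; the demo's actual fit, span length, and fixed-point precision are not stated, and `quadratic_span` is a hypothetical name.

```python
def quadratic_span(f, n):
    """Approximate f(0..n) with the quadratic through f(0), f(n/2), f(n),
    evaluated using forward differences (two adds per step)."""
    h = n / 2
    f0, f1, f2 = f(0), f(h), f(n)
    # q(x) = f0 + b*x + c*x*x, solved from the three samples.
    c = (f2 - 2 * f1 + f0) / (2 * h * h)
    b = (f1 - f0) / h - c * h
    value = f0
    d1 = b + c      # first difference: q(1) - q(0)
    d2 = 2 * c      # second difference, constant for a quadratic
    out = []
    for _ in range(n + 1):
        out.append(value)
        value += d1
        d1 += d2
    return out
```

For a perspective-style function such as `lambda x: (10 + 3 * x) / (1 + 0.02 * x)`, the result is exact at the two ends and the midpoint and drifts slightly in between, which is the trade-off against doing a divide per chunk.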

Syncing this to the music with movement and chromatic blur flares was really satisfying, thanks in no small part to some great moments in Magnar’s track.

Facet’s texture maps also really make the difference for this effect. It took some last-minute experimenting to find the right style, and to avoid an aliased mess when zoomed out.

Big Rotozoom

This effect is my tribute to both Mr. Pet’s version, which I first saw in “World of Commodore”, and Kreator’s even better iteration from “Brian the Lion”. Those routines were mind-blowing and really stuck with me as something I always wanted to implement.

I suspect both examples were inspired by the “Digital Image Warping” book from 1990, in which George Wolberg details how rotation can be separated into two skew passes. The result implicitly adds some zooming which you’re stuck with, and you have to specifically handle each quarter turn since skewing only gets you 90 degrees from one diagonal to the next.

However, the two passes map quite nicely to the Amiga because you can skew vertically with the Blitter and then horizontally along with some vertical scaling by using the Copper. Full respect to Mr. Pet for making those mental leaps 30+ years ago!
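To make the decomposition concrete, here is one set of parameters that is consistent with the description: a vertical skew, followed by a horizontal skew combined with a vertical scale. It is a worked check of the math, not the demo's code, and the function names and sign conventions are assumptions.

```python
import math

def two_pass_params(theta):
    """Parameters for rotating by theta (valid between -45 and +45 degrees)."""
    t = math.tan(theta)                     # pass 1: y += t * x
    s = -math.sin(theta) * math.cos(theta)  # pass 2: x += s * y
    m = math.cos(theta) ** 2                # pass 2: y *= m
    return t, s, m

def combined_matrix(theta):
    """Matrix of pass 1, (x, y) -> (x, y + t*x),
    followed by pass 2, (x, y) -> (x + s*y, m*y)."""
    t, s, m = two_pass_params(theta)
    return [[1 + s * t, s],
            [m * t, m]]

def scaled_rotation(theta):
    """A rotation by theta multiplied by a uniform zoom of cos(theta)."""
    k = math.cos(theta)
    return [[k * math.cos(theta), -k * math.sin(theta)],
            [k * math.sin(theta), k * math.cos(theta)]]
```

With these parameters the two matrices match, so the result is a true rotation shrunk by cos(theta): full size at 0 degrees and about 0.71 at the 45-degree diagonals. That is the implicit zoom, and the tangent growing past 1 beyond 45 degrees is one way to see why each quarter turn needs a fresh buffer.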

It is, however, not quite as straightforward as it sounds. For a start, if you skew using the Blitter by moving each column of pixels, then you can’t handle anything very large at 50 Hz, so you need to come up with a way of doing this incrementally. Mr. Pet’s effect rotates slowly, seemingly so that the skew only needs to move at most half of the image up or down by one pixel each frame, which is very clever but limited.

To rotate more quickly you have to copy the entire previous buffer in 16-pixel columns, some moving up/down and some needing a split down the middle. Tracking where you’re shifting things and carefully choosing the right splits so this works without looking wobbly or messy can be tricky. Kreator’s version follows this approach and does rotate faster, but at a fixed speed. I believe this is partly because the necessary blits are pre-calculated for the one specific speed.

Another constraint comes from the change needed every quarter turn. At each 45-degree diagonal, the underlying buffers need to completely switch to a 90-degree rotated copy of the image, but skewed in the opposite direction. To continue rotating you then start to unskew this buffer.

In order to keep turning through 360 degrees, this needs to happen at each diagonal. The easiest way to deal with this is to keep all the fully skewed images in memory. There’s a snag, however, when you want to keep rotating past 360 degrees or change direction. In those cases, you’ll need the oppositely skewed image. You end up needing eight different buffers, and after adding in the padding necessary for the blits and scrolling, it easily uses up all of the Chip-RAM and more. This is maybe why we don’t see rotation past 360 degrees or in reverse from either Mr. Pet’s or Kreator’s code.

For my effect, I wanted to take things a step beyond what Kreator had achieved. However, he was already using most of the Blitter bandwidth and RAM. My implementation does manage to use a little more bandwidth and makes the rotator significantly bigger, but the compromise is that it’s in eight colors.

Nevertheless, I’ve managed to make these improvements:

Getting the vertical skew to work correctly for varying speeds took plenty of perseverance and some clever ordering of blits. But surprisingly the copper list updates for horizontal skewing and scaling also turned out to be a challenge due to the Blitter limiting available CPU cycles.

To handle the many skew buffers that don’t normally fit in RAM, I generate each as needed using any leftover CPU time. By storing a compact clone of each skewed buffer there’s normally just enough time to fully recreate each just before the rotation comes around and the buffer is needed.

This meant I could keep my promise to Magnar and the music was able to use half of the Chip-RAM for samples.

With this approach, the CPU is barely able to keep up during faster rotation speeds. So a final trick is to move the rotator around in translation. When doing so we adjust the bitplane fetches and this frees up some much-needed cycles so our buffers can always be ready in time.

It turned out there was even a little time remaining which I used to add greetings on top each time a buffer is generated. Many thanks to Prowler for the nice clean font that manages to look great while also remaining readable.

In the end, I’m happy with the result and think it takes a good step beyond Kreator’s version. But then, Kreator still managed to also squeeze in: a regular zoomer, a per-line zoomer, a mosaic effect, and a whole lot of sprite activity. So, I’m still completely in awe of those classic routines!

End Part

The wire-frame polygon ending was going to be more interesting, but as time ran out it became necessary to strip things down to something simple I could finish. Thankfully, Magnar’s music is excellent!

There isn’t much to dig into, but I will say that the math for all the 3D work adds up to a lot for the more complex shapes. Since I’d seen the same shapes in “Rink’a’Dink: Redux”, I felt sure Paradroid must have been doing something really clever. But no matter how much I optimized I couldn’t get it to fit in 50 Hz. After hitting a wall I gave up and just started adding a ring-buffer so the code could get ahead during the simple shapes and then catch up when the complex shapes were needed. This works fine as long as the complex shapes don’t stick around for too long.
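A toy simulation shows why the ring-buffer works, and why it stops working if the complex shapes last too long. Everything here is hypothetical: costs are measured in display frames of CPU time, and the model simply lets the renderer run ahead until the buffer is full.

```python
def plays_without_stall(costs, capacity):
    """Return True if every display frame finds a rendered frame waiting.

    costs[i] is the CPU time, in display frames, needed to render frame i.
    capacity is the number of slots in the ring-buffer.
    """
    queued = 0       # rendered frames waiting to be shown
    credit = 0.0     # CPU time banked toward the frame being rendered
    next_frame = 0
    for _ in costs:  # one display tick per frame shown
        credit += 1.0
        while (next_frame < len(costs) and queued < capacity
               and credit >= costs[next_frame]):
            credit -= costs[next_frame]
            next_frame += 1
            queued += 1
        if queued == capacity or next_frame == len(costs):
            credit = 0.0   # nowhere to render into, so the spare time is lost
        if queued == 0:
            return False   # the display caught up with the renderer
        queued -= 1
    return True
```

Eight cheap frames at 0.5 each followed by four expensive frames at 1.5 each play fine with four slots but stall with a single slot, and a long enough run of expensive frames stalls at any capacity.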

Yet it still bugged me that my math code must be slower than Paradroid’s, so I got in touch with him looking for answers. That was illuminating, because it turned out he hadn’t even really tried to make his math code fast and had gone with a ring-buffer right away. I’d been so sure there must be some secret sauce I was missing. Sometimes the magic really is just about using the right trick, and I love that about demos.

That’s All

I really enjoyed making “Backslide to Arcanum” and would like to thank Facet, Magnar, and Prowler for helping elevate my code and bring everything together.

I’d also like to thank all the Amiga coders, past and present, who’ve inspired me. There are far too many to mention!

Thank you for reading.