author: Franklin Wei <franklin@rockbox.org>, 2019-10-08 14:04:56 -0400
committer: Franklin Wei <franklin@rockbox.org>, 2019-10-08 14:04:56 -0400
commit: bf16fee7eb78b9385f6007c14e9e9c4c857dc174
tree: c79f6830f7cb700b511012ddbc2b9efc7e7f8ca4
parent: cfa7559866c05b0422b6965dd1e32080538e17bd
Clean up.
-rw-r--r--  .gitignore  2
-rw-r--r--  adieu-quake.html  135
-rwxr-xr-x  build.sh  15
-rwxr-xr-x  deploy.sh  6
-rwxr-xr-x  extract_field.sh  5
-rw-r--r--  files/quake.jpg (renamed from quake.jpg)  bin 6905 -> 6905 bytes
-rw-r--r--  footer.inc  4
-rw-r--r--  header.inc  1
-rw-r--r--  index.csv  2
-rw-r--r--  posts/adieu-quake.md (renamed from adieu-quake.md)  158
-rw-r--r--  posts/index.md  7
11 files changed, 127 insertions, 208 deletions
diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..82b1d8e
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,2 @@
+out/
+*~
diff --git a/adieu-quake.html b/adieu-quake.html
deleted file mode 100644
index f9ec53a..0000000
--- a/adieu-quake.html
+++ /dev/null
@@ -1,135 +0,0 @@
-<!DOCTYPE html>
-<html xmlns="http://www.w3.org/1999/xhtml" lang="" xml:lang="">
-<head>
- <meta charset="utf-8" />
- <meta name="generator" content="pandoc" />
- <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
- <title>Adieu, Quake!</title>
- <style>
- code{white-space: pre-wrap;}
- span.smallcaps{font-variant: small-caps;}
- span.underline{text-decoration: underline;}
- div.column{display: inline-block; vertical-align: top; width: 50%;}
- </style>
- <link rel="stylesheet" href="/style.css" />
- <!--[if lt IE 9]>
- <script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
- <![endif]-->
-</head>
-<body>
-<h1 id="adieu-quake">Adieu, Quake!</h1>
-<p><a href="http://www.youtube.com/watch?v=74i8aBOmyos" title="Quake on Rockbox"><img src="http://img.youtube.com/vi/74i8aBOmyos/0.jpg" alt="Quake on Rockbox" /></a></p>
-<p><img src="quake.jpg" /></p>
-<p>TL;DR: I made Quake run on MP3 players. Read how it happened.</p>
-<p>I spent part of this summer playing with two of my favorite things: <a href="https://rockbox.org">Rockbox</a> and id Software’s <a href="https://en.wikipedia.org/wiki/Quake_(video_game)">Quake</a>. I even got the chance to combine the two by porting Quake to run <em>on</em> Rockbox! What more could I ask?</p>
-<p>This post is my story of how it went down. It is a protracted one, dragging on for nearly two years. It is also my first attempt at documenting the development proess in long form and “in the raw,” as opposed to the finished technical documentation I’ve written way too much of – do bear with me. There will be technical details, but I will try to focus on the thought process behind the code.</p>
-<p>Alas, the time has come to bid Rockbox and Quake goodbye, at least for the near term. My free time will be preciously scarce in the coming months, so I’m trying to get this brain dump in before the deluge.</p>
-<h2 id="rockbox">Rockbox</h2>
-<p><a href="https://rockbox.org">Rockbox</a> is a fun open-source project I spend far too much time hacking on. The web page explains it best: “Rockbox is a free replacement firmware for digital music players.” That’s right, we provide a complete replacement for the manufacturer’s software that came on your Sandisk Sansa, Apple iPod, or any of a wide array of other supported targets.</p>
-<p>Not only do we aim to replicate the original firmware’s functionality, we support loadable extensions called <em>plugins</em> – small programs to run on your MP3 player. Rockbox already has a bunch of nifty games and demos, the most impressive of which were probably the first-person shooters <em><a href="https://www.rockbox.org/wiki/PluginDoom">Doom</a></em> and <em><a href="https://www.rockbox.org/wiki/PluginDuke3D">Duke Nukem 3D</a></em>. But I still felt there was something missing.</p>
-<h2 id="enter-quake">Enter Quake</h2>
-<p>Quake is a fully 3D first-person shooter. Let’s break that down. They key words there are <em>fully 3D</em>, as opposed to <em>Doom</em> and <em>Duke Nukem 3D</em>, both of which are usually considered <em>2.5D</em> – imagine a 2D map with an additional height component. Quake, on the other hand, is fully 3D. Every vertex and polygon exists in 3-space. What this means is that the old pseudo-3D tricks no longer work – performance will suffer. Anyhow, I digress. In short, Quake is the Real Deal™.</p>
-<p>Quake is no joke, either. Some research showed that Quake “requires” a ~100 MHz x86 with a FPU and ~32 MB of RAM. Before you chuckle, keep in mind that Rockbox’s targets are probably nothing close to what John Carmack had in mind when writing the game – Rockbox runs on devices with CPUs as slow as 11MHz and as little as 2 MB of RAM (of course, Quake wasn’t going to be running on <em>those</em> devices). With this in mind, I looked at my ever-shrinking DAP collection and picked out the most powerful surviving member: an Apple iPod Classic/6G, with a 216 MHz ARMv5E and 64 MB of DRAM. Nothing to sneeze at, but certainly marginal when it comes to running Quake.</p>
-<h2 id="the-port">The Port</h2>
-<p>There exists a wonderful version of Quake which runs on SDL. It is called, unsurprisingly, <a href="https://www.libsdl.org/projects/quake/">SDLQuake</a>. Thankfully, I already ported the SDL library to Rockbox (that’s for another article), so getting Quake to compile was rather straightforward, if not the most glorious work: copy over the source tree; <code>make</code>; fix errors; rinse; repeat. I’m probably glossing over a lot of minutiae here – but just imagine my excitement when I eventually got a successfully compiling and linking Quake executable. I was ecstatic.</p>
-<p><em>Let’s load her up!</em> I thought.</p>
-<p>And it booted! The beautiful Quake console background greeted me, as did the menu. <em>All good</em>. But not so fast! When I started a game, something wasn’t right. The “Introduction” level seemed to load fine, but the spawn position was completely outside the map. <em>Strange</em>, I thought. I poked and prodded, debugged and <code>splashf</code>’d, but to no avail – the bug was too hard for me, or so it felt.</p>
-<p>And so it remained, for years. I should probably give a little timing information at this point. This first attempt at Quake took place in September 2017, after which I gave up. My Quake-Rockbox abomination sat on a shelf, collecting dust, until July 2019. By just the right combination of boredom and motivation, I resolved to finish what I had started.</p>
-<p>I got to debugging. Now, my flow state is such that I remember virtually no details of what exactly I did, but I’ll try my best here to reconstruct.</p>
-<p>As I discovered, the structure of Quake is divided into two main parts: the engine code, in C; and the high-level game logic, in QuakeC, a bytecode-compiled language. Now, I had always stayed away from the QuakeC VM due to some weird fear of debugging other people’s code. But now it forced me to delve in. Here again I vaguely recall a mad flow session in which I sought out the root of the bug. After what must’ve been a whirlwind of <code>grep</code>s, I found my culprit: <code>pr_cmds.c:PF_setorigin</code>. This function takes a 3-vector specifying the player’s new coordinates when starting a map, which, for some reason, was always <code>(0, 0, 0)</code>. <em>Hmm…</em></p>
-<p>I traced the data flow back and found where it originated – a call to <code>Q_atof()</code> – the classic string to float converter. And then it dawned on me: I had provided a set of wrapper functions, which overrode Quake’s <code>Q_atof()</code> – and my <code>atof()</code> function must’ve been broken. Fixing it was straightforward. I replaced the flawed <code>atof</code> with a correct one. Et voila! The glorious three-passage introduction level loaded flawlessly, and “E1M1: The Slipgate Complex” loaded fine too. The sound output still sounded like a 2-cycle lawnmower, but hey – I’d gotten Quake to boot on an MP3 player!</p>
-<h2 id="down-the-rabbit-hole">Down the Rabbit Hole</h2>
-<p>This project finally gave me an excuse to do something I’d been putting off for a while: learn ARM assembly language. The application was in a performance-sensitive sound mixing loop in <code>snd_mix.c</code>. A <code>SND_PaintChannelFrom8</code> function took an array of 8-bit mono sound samples and produced a 16-bit stereo stream, with left and right channels scaled independently. Here’s the assembly version I churned out after a couple hours (C version follows):</p>
-<pre><code>SND_PaintChannelFrom8:
- // r0: int true_lvol
- // r1: int true_rvol
- // r2: char *sfx
- // r3: int count
-
- stmfd sp!, {r4, r5, r6, r7, r8, sl}
-
- ldr ip, =paintbuffer
- ldr ip, [ip]
-
- mov r0, r0, asl #16 // pre-scale both volumes by 2^16
- mov r1, r1, asl #16
-
- sub r3, r3, #1 // we&#39;ll count backwards
- // sl = 0xffff0000
- ldrh sl, =0xffff
-
-.loop:
- ldrsb r4, [r2, r3] // load *sfx[i] -&gt; r4
-
- // keep endianness in mind here
- // buffer looks like [left_0, left_1, right_0, right_1] in memory
- // but it is loaded as [right1, right0, left1, left0] to registers
- ldr r8, [ip, r3, lsl #2] // load paintbuffer[0:1] = RIGHTCHANNEL:LEFTCHANNEL
-
- // handle high half (right channel) first
- mul r5, r4, r1 // SCALEDRIGHT = SFXI * (true_rvol &lt;&lt; 16) -- bottom half is zero
-
- // r7 holds right channel in high half (dirty bottom half)
- qadd r7, r5, r8 // RIGHTCHANORIG = SCALEDRIGHT + RIGHTCHANORIG (high half)
-
- bic r7, r7, sl // zero bottom bits of r7
-
- // trash r5, r6 and handle left channel
- mul r5, r4, r0 // SCALEDLEFT = SFXI * (true_rvol &lt;&lt; 16)
-
- mov r8, r8, lsl #16 // extract original left channel from paintbuffer
-
- // r8 holds left channel in high half with zero bottom half
- qadd r8, r5, r8
-
- // combine the two 16-bit samples in r7 as 32-bit [left:right]
- // (use lsr to not sign-extend the lower half)
- orr r7, r7, r8, lsr #16
-
- str r7, [ip, r3, lsl #2] // write 32-bit to paintbuffer
- subs r3, r3, #1
- bgt .loop // must use instead of bne because of the corner case count=1
-
- ldmfd sp!, {r4, r5, r6, r7, r8, sl}
-
- bx lr</code></pre>
-<p>There’s some hackery going on here. I’m using the ARM DSP <code>qadd</code> instruction to get saturation addition for cheap, but <code>qadd</code> only works with 32-bit words, and the sound samples are 16 bits. The hack, then, is to first shift the samples left by 16; <code>qadd</code> the samples; and shift them back. This accomplishes in one instruction what GCC took seven to do.</p>
-<p>The C version is below for reference:</p>
-<pre><code>void SND_PaintChannelFrom8 (int true_lvol, int true_rvol, signed char *sfx, int count)
-{
- int data;
- int i;
-
- // we have 8-bit sound in sfx[], which we want to scale to
- // 16bit and take the volume into account
- for (i=0 ; i&lt;count ; i++)
- {
- // We could use the QADD16 instruction on ARMv6+
- // or just 32-bit QADD with pre-shifted arguments
- data = sfx[i];
- paintbuffer[2*i+0] = CLAMPADD(paintbuffer[2*i+0], data * true_lvol); // need saturation
- paintbuffer[2*i+1] = CLAMPADD(paintbuffer[2*i+1], data * true_rvol);
- }
-}</code></pre>
-<p>I calculated about a 60% improvement in instructions/cycle over the optimized C version. Most of the saved cycles come from using <code>qadd</code> and packing two 16-bit samples in a 32-bit read and write.</p>
-<h3 id="a-prime-conspiracy">A “Prime” Conspiracy</h3>
-<p>You’ll notice the assembly listing has a comment by the <code>bgt</code> instruction (branch if greater than) noting that <code>bne</code> (branch if not equal) cannot be used because of a corner case that freezes if the sample count is 1. This will lead to an integer wraparound to <code>0xFFFFFFFF</code> and an extremely long delay (that will eventually resolve itself).</p>
-<p>This corner case was triggered by one sound in particular, of 7325 samples in length (the sound triggered by a 100 health pickup). What’s so special about 7325, you ask? Try taking it modulo any power of two:</p>
-<pre><code>7325 % 2 = 1
-7325 % 4 = 1
-7325 % 8 = 5
-7325 % 16 = 13
-7325 % 32 = 29
-7325 % 64 = 29
-7325 % 128 = 29
-7325 % 256 = 157
-7325 % 512 = 157
-7325 % 1024 = 157
-7325 % 2048 = 1181
-7325 % 4096 = 3229</code></pre>
-<p>Notice anything? That’s right – by some coincidence, 7325 prime whenever taken modulo a power of two. This leads to the sound mixing code being passed a one-sample array, causing the freeze.</p>
-<p>I spent at least a day rooting out this bug, only to find that it all came down to <em>one</em> wrong instruction. Life is like that sometimes, isn’t it?</p>
-<h2 id="adieu">Adieu</h2>
-<p>I’ve omitted a couple interesting things here for the sake of space. There is, for example, the race condition that occured only when gibbing a zombie. Or the assorted alignment issues and the micro-optimizations for rendering. But those are for another time. For now, it is time to say goodbye to Quake – it’s been good to me.</p>
-</body>
-</html>
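Note on the `qadd` trick described in the post: the idea of getting a 16-bit saturating add out of a 32-bit one can be sketched in portable C. This is only an illustration, not the code from the port. `qadd32` is a hypothetical stand-in for the ARM `qadd` instruction, and `qadd16_via_qadd32` is a made-up name for the shift, add, shift-back sequence. Multiplication and division by 65536 stand in for the shifts so that the C stays well-defined for negative values.

```c
#include <stdint.h>

/* Hypothetical stand-in for ARM `qadd`: 32-bit signed addition that
 * clamps to INT32_MIN/INT32_MAX instead of wrapping around. */
static int32_t qadd32(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}

/* Saturating 16-bit add built on the 32-bit one.
 * Both operands are moved into the high halfword (the assembly uses a
 * left shift by 16), added with saturation, and moved back down.
 * A 16-bit overflow becomes a 32-bit overflow, so the clamp applies. */
static int16_t qadd16_via_qadd32(int16_t a, int16_t b)
{
    int32_t hi_a = (int32_t)a * 65536;   /* a << 16, well-defined for a < 0 */
    int32_t hi_b = (int32_t)b * 65536;   /* b << 16 */
    int32_t sum  = qadd32(hi_a, hi_b);
    return (int16_t)(sum / 65536);       /* back to the low halfword */
}
```

For example, `qadd16_via_qadd32(30000, 10000)` clamps to 32767 rather than wrapping to a negative value, which is the behaviour the mixer needs to avoid audible glitches on loud samples.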
diff --git a/build.sh b/build.sh
new file mode 100755
index 0000000..23030c4
--- /dev/null
+++ b/build.sh
@@ -0,0 +1,15 @@
+#!/bin/bash
+
+rm -rf out
+mkdir -p out
+
+cd posts
+
+for f in *.md
+do
+ pandoc --email-obfuscation=javascript -s -t html --css=/style.css -B ../header.inc -A ../footer.inc --metadata pagetitle="FWEI.TK | ""$(../extract_field.sh ../index.csv $f 2)" -o ../out/${f%.md}.html $f
+done
+
+cd -
+
+cp files/* out/
diff --git a/deploy.sh b/deploy.sh
new file mode 100755
index 0000000..bb32201
--- /dev/null
+++ b/deploy.sh
@@ -0,0 +1,6 @@
+#!/bin/bash
+
+ssh-add
+ssh root@fwei.tk rm -rf /var/www/html/blog
+ssh root@fwei.tk mkdir -p /var/www/html/blog
+scp out/* root@fwei.tk:/var/www/html/blog
diff --git a/extract_field.sh b/extract_field.sh
new file mode 100755
index 0000000..aab4b6a
--- /dev/null
+++ b/extract_field.sh
@@ -0,0 +1,5 @@
+#!/bin/bash
+
+# Usage: ./extract_field.sh DBNAME KEY FIELDIDX
+
+awk 'BEGIN { FS = ":" } $1 == "'"$2"'" { print $'"$3"'}' < $1
diff --git a/quake.jpg b/files/quake.jpg
index cbe2a5e..cbe2a5e 100644
--- a/quake.jpg
+++ b/files/quake.jpg
Binary files differ
diff --git a/footer.inc b/footer.inc
new file mode 100644
index 0000000..12c4059
--- /dev/null
+++ b/footer.inc
@@ -0,0 +1,4 @@
+</div>
+<footer>
+ <a href="/blog">Blog index</a> | <a href="/">Home</a>
+</footer>
diff --git a/header.inc b/header.inc
new file mode 100644
index 0000000..a432df0
--- /dev/null
+++ b/header.inc
@@ -0,0 +1 @@
+<div id="container">
diff --git a/index.csv b/index.csv
new file mode 100644
index 0000000..0003e82
--- /dev/null
+++ b/index.csv
@@ -0,0 +1,2 @@
+adieu-quake.md:Adieu, Quake!
+index.md:Quite Frankly
diff --git a/adieu-quake.md b/posts/adieu-quake.md
index 70c70b1..8a81e3a 100644
--- a/adieu-quake.md
+++ b/posts/adieu-quake.md
@@ -4,7 +4,7 @@
![](quake.jpg)
-TL;DR: I made Quake run on MP3 players. Read how it happened.
+**TL;DR** I made Quake run on MP3 players. Read how it happened.
I spent part of this summer playing with two of my favorite things:
[Rockbox](https://rockbox.org) and id Software's
@@ -36,19 +36,19 @@ Not only do we aim to replicate the original firmware's functionality,
we support loadable extensions called *plugins* -- small programs to
run on your MP3 player. Rockbox already has a bunch of nifty games and
demos, the most impressive of which were probably the first-person
-shooters *[Doom](https://www.rockbox.org/wiki/PluginDoom)* and *[Duke
-Nukem 3D](https://www.rockbox.org/wiki/PluginDuke3D)*. But I still
-felt there was something missing.
+shooters [Doom](https://www.rockbox.org/wiki/PluginDoom) and [Duke
+Nukem 3D](https://www.rockbox.org/wiki/PluginDuke3D). But I still felt
+there was something missing.
## Enter Quake
Quake is a fully 3D first-person shooter. Let's break that down. The
-key words there are *fully 3D*, as opposed to *Doom* and *Duke Nukem
-3D*, both of which are usually considered *2.5D* -- imagine a 2D map
-with an additional height component. Quake, on the other hand, is
-fully 3D. Every vertex and polygon exists in 3-space. What this means
-is that the old pseudo-3D tricks no longer work -- performance will
-suffer. Anyhow, I digress. In short, Quake is the Real Deal™.
+key words there are *fully 3D*, as opposed to Doom and Duke Nukem 3D,
+both of which are usually considered *2.5D* -- imagine a 2D map with
+an additional height component. Quake, on the other hand, is fully
+3D. Every vertex and polygon exists in 3-space. What this means is
+that the old pseudo-3D tricks no longer work -- everything is now
+full-blown 3D. Anyhow, I digress. In short, Quake is the Real Deal™.
Quake is no joke, either. Some research showed that Quake "requires" a
~100 MHz x86 with a FPU and ~32 MB of RAM. Before you chuckle, keep in
@@ -63,8 +63,8 @@ marginal when it comes to running Quake.
## The Port
-There exists a wonderful version of Quake which runs on SDL. It is
-called, unsurprisingly,
+There exists a wonderful version of Quake which runs on
+[SDL](https://libsdl.org). It is called, unsurprisingly,
[SDLQuake](https://www.libsdl.org/projects/quake/). Thankfully, I
already ported the SDL library to Rockbox (that's for another
article), so getting Quake to compile was rather straightforward, if
@@ -117,75 +117,76 @@ too. The sound output still sounded like a 2-cycle lawnmower, but hey
## Down the Rabbit Hole
This project finally gave me an excuse to do something I'd been
-putting off for a while: learn ARM assembly language. The application
-was in a performance-sensitive sound mixing loop in `snd_mix.c`. A
-`SND_PaintChannelFrom8` function took an array of 8-bit mono sound
-samples and produced a 16-bit stereo stream, with left and right
-channels scaled independently. Here's the assembly version I churned
-out after a couple hours (C version follows):
+putting off for a while: learn ARM assembly language.
+
+The application was in a performance-sensitive sound mixing loop in
+`snd_mix.c` (remember the lawnmower-like sound?).
+
+The `SND_PaintChannelFrom8` function takes an array of 8-bit mono
+sound samples and mixes it into an existing 16-bit stereo stream, with
+left and right channels scaled independently based on two integer
+parameters. GCC was doing a terrible job at optimizing the saturation
+arithmetic, so I took a shot at it myself. I rather like how it turned
+out.
+
+Here's the assembly version I came up with (C version follows):
```
SND_PaintChannelFrom8:
- // r0: int true_lvol
- // r1: int true_rvol
- // r2: char *sfx
- // r3: int count
+ ;; r0: int true_lvol
+ ;; r1: int true_rvol
+ ;; r2: char *sfx
+ ;; r3: int count
stmfd sp!, {r4, r5, r6, r7, r8, sl}
ldr ip, =paintbuffer
ldr ip, [ip]
- mov r0, r0, asl #16 // pre-scale both volumes by 2^16
+ mov r0, r0, asl #16 ; prescale by 2^16
mov r1, r1, asl #16
- sub r3, r3, #1 // we'll count backwards
- // sl = 0xffff0000
- ldrh sl, =0xffff
-
-.loop:
- ldrsb r4, [r2, r3] // load *sfx[i] -> r4
-
- // keep endianness in mind here
- // buffer looks like [left_0, left_1, right_0, right_1] in memory
- // but it is loaded as [right1, right0, left1, left0] to registers
- ldr r8, [ip, r3, lsl #2] // load paintbuffer[0:1] = RIGHTCHANNEL:LEFTCHANNEL
-
- // handle high half (right channel) first
- mul r5, r4, r1 // SCALEDRIGHT = SFXI * (true_rvol << 16) -- bottom half is zero
+ sub r3, r3, #1 ; count backwards
- // r7 holds right channel in high half (dirty bottom half)
- qadd r7, r5, r8 // RIGHTCHANORIG = SCALEDRIGHT + RIGHTCHANORIG (high half)
+ ldrh sl, =0xffff ; halfword mask
- bic r7, r7, sl // zero bottom bits of r7
+1:
+ ldrsb r4, [r2, r3] ; load input sample
+ ldr r8, [ip, r3, lsl #2] ; load output sample pair from paintbuffer
+ ; (left:right in memory -> right:left in register)
+ ;; right channel (high half)
+ mul r5, r4, r1 ; scaledright = sfx[i] * (true_rvol << 16) -- bottom half is zero
+ qadd r7, r5, r8 ; right = scaledright + right (in high half of word)
+ bic r7, r7, sl ; zero bottom half of r7
- // trash r5, r6 and handle left channel
- mul r5, r4, r0 // SCALEDLEFT = SFXI * (true_rvol << 16)
+ ;; left channel (low half)
+ mul r5, r4, r0 ; scaledleft = sfx[i] * (true_lvol << 16)
+ mov r8, r8, lsl #16 ; extract original left channel from paintbuffer
+ qadd r8, r5, r8 ; left = scaledleft + left
- mov r8, r8, lsl #16 // extract original left channel from paintbuffer
+ orr r7, r7, r8, lsr #16 ; combine right:left in r7
+ str r7, [ip, r3, lsl #2] ; write right:left to output buffer
+ subs r3, r3, #1 ; decrement and loop
- // r8 holds left channel in high half with zero bottom half
- qadd r8, r5, r8
-
- // combine the two 16-bit samples in r7 as 32-bit [left:right]
- // (use lsr to not sign-extend the lower half)
- orr r7, r7, r8, lsr #16
-
- str r7, [ip, r3, lsl #2] // write 32-bit to paintbuffer
- subs r3, r3, #1
- bgt .loop // must use instead of bne because of the corner case count=1
+ bgt 1b ; must use bgt instead of bne in case count=1
ldmfd sp!, {r4, r5, r6, r7, r8, sl}
bx lr
```
-There's some hackery going on here. I'm using the ARM DSP `qadd`
-instruction to get saturation addition for cheap, but `qadd` only
-works with 32-bit words, and the sound samples are 16 bits. The hack,
-then, is to first shift the samples left by 16; `qadd` the samples;
-and shift them back. This accomplishes in one instruction what GCC
-took seven to do.
+There's some hackery going on here that could use some explaining. I'm
+using the ARM `qadd` DSP instruction to get saturation addition for
+cheap, but `qadd` only works with 32-bit words, and the sound samples
+are 16 bits. The hack, then, is to first shift the samples left by 16
+bits; `qadd` the samples together; and then shift them back. This
+accomplishes in one instruction what GCC took seven to do. (Sure, I
+could've avoided this hack altogether if I were working with ARMv6,
+which has MMX-esque packed saturation arithmetic with `qadd16`, but
+alas -- life isn't so easy. And besides, it was a cool hack!)
+
+Notice also that I'm reading and writing both 16-bit channels of a
+stereo sample at once (with a word-sized `ldr` and `str`) to save a
+couple more cycles.
The C version is below for reference:
@@ -210,20 +211,21 @@ void SND_PaintChannelFrom8 (int true_lvol, int true_rvol, signed char *sfx, int
I calculated about a 60% improvement in instructions/cycle over the
optimized C version. Most of the saved cycles come from using `qadd`
-and packing two 16-bit samples in a 32-bit read and write.
+for saturation arithmetic and packing of memory operations.
### A "Prime" Conspiracy
-You'll notice the assembly listing has a comment by the `bgt`
-instruction (branch if greater than) noting that `bne` (branch if not
-equal) cannot be used because of a corner case that freezes if the
-sample count is 1. This will lead to an integer wraparound to
-`0xFFFFFFFF` and an extremely long delay (that will eventually resolve
-itself).
+Here's another interesting bug I ran into along the way. You'll notice
+the assembly listing has a comment by the `bgt` instruction (branch if
+greater than) noting that `bne` (branch if not equal) cannot be used
+because of a corner case that freezes if the sample count is 1. This
+will lead to an integer wraparound to `0xFFFFFFFF` and an extremely
+long delay (which will eventually resolve itself).
This corner case was triggered by one sound in particular, of 7325
-samples in length (the sound triggered by a 100 health pickup). What's
-so special about 7325, you ask? Try taking it modulo any power of two:
+samples in length (the sound triggered by a 100 health pickup,
+incidentally). What's so special about 7325, you ask? Try taking it
+modulo any power of two:
```
7325 % 2 = 1
@@ -240,9 +242,12 @@ so special about 7325, you ask? Try taking it modulo any power of two:
7325 % 4096 = 3229
```
-Notice anything? That's right -- by some coincidence, 7325 prime
-whenever taken modulo a power of two. This leads to the sound mixing
-code being passed a one-sample array, causing the freeze.
+*5, 13, 29, 157*...
+
+Notice anything? That's right -- by some coincidence, 7325 is prime
+whenever taken modulo a power of two. This *somehow* (I'm actually not
+sure exactly how) leads to the sound mixing code being passed a
+one-sample array, triggering the corner case and freeze.
I spent at least a day rooting out this bug, only to find that it all
came down to *one* wrong instruction. Life is like that sometimes,
@@ -252,6 +257,13 @@ isn't it?
I've omitted a couple interesting things here for the sake of
space. There is, for example, the race condition that occurred only
-when gibbing a zombie. Or the assorted alignment issues and the
-micro-optimizations for rendering. But those are for another time. For
+when gibbing a zombie, and only when the audio sample rate was 44.1
+kHz. (This was a result of the sound thread trying to load a sound --
+an explosion -- while the model loader tried to load the gib
+model. These two code paths relied on a common function that used
+the same global variable.) And then there are the assorted alignment
+issues (love ya, ARM!) and the rendering micro-optimizations I made
+to squeeze out a few more frames. But those are for another time. For
now, it is time to say goodbye to Quake -- it's been good to me.
+
+So long, and thanks for all the fish!
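Note on the `bgt` versus `bne` corner case from the "Prime" Conspiracy section: the loop's termination behaviour can be checked with a small C model of the counter. This is a sketch under assumptions, not code from the port. `loop_iterations`, the `branch` enum, and the `limit` cap are all hypothetical names introduced here. The model only mirrors the `sub r3, r3, #1` before the loop and the `subs r3, r3, #1` plus conditional branch at the bottom.

```c
#include <stdint.h>

/* Which conditional branch closes the loop (hypothetical enum). */
enum branch { BRANCH_BNE, BRANCH_BGT };

/* Returns how many times the loop body runs for a given sample count.
 * `limit` caps the simulation so the runaway case still returns. */
static uint64_t loop_iterations(int32_t count, enum branch br, uint64_t limit)
{
    uint32_t r3 = (uint32_t)count - 1u;   /* sub r3, r3, #1 (count backwards) */
    uint64_t iterations = 0;

    for (;;) {
        int taken;

        iterations++;                     /* one pass through the loop body */
        r3 -= 1u;                         /* subs r3, r3, #1 (wraps at zero) */

        if (br == BRANCH_BNE)
            taken = (r3 != 0u);           /* bne: loop while result != 0 */
        else                              /* bgt: loop while signed result > 0 */
            taken = (r3 != 0u && r3 < 0x80000000u);

        if (!taken || iterations >= limit)
            return iterations;
    }
}
```

With `count = 1` the counter starts at 0 and the first `subs` wraps it to `0xFFFFFFFF`. The `bne` model keeps looping until it hits the cap (on hardware it would take 2^32 passes to come back around to zero), while the `bgt` model stops after a single pass. Read literally, the model also runs the body `count - 1` times for larger counts under either branch, which suggests index 0 may be skipped there; that has not been checked against the real build.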
diff --git a/posts/index.md b/posts/index.md
new file mode 100644
index 0000000..0732d2b
--- /dev/null
+++ b/posts/index.md
@@ -0,0 +1,7 @@
+# Quite Frankly
+
+This is my humble blog. Welcome.
+
+- [Adieu, Quake!](adieu-quake.html) (27 Aug 2019)
+
+Contact: <me@fwei.tk>
\ No newline at end of file