Essential Math Weblog Thoughts on math for games


New Thoughts on Random Numbers, Part Two

Filed under: Erratical,Mathematical — Jim @ 10:32 pm

Sorry about the long delay on this. I originally intended to post this right away, but then realized I should do some further testing… and for whatever reason that never happened until last week. In any case, as I mentioned in my previous post, I recently added a new pseudo-random number generator to the Skia test suite. The last post was about verifying that a new PRNG was needed, this one is about what I used to replace it.

My initial thoughts about how to replace the Linear Congruential Generator was to jump right to the Mersenne Twister. However, the Mersenne Twister requires a buffer of 624 integers. While this is not terrible, it’s probably overkill for what we need, and the large buffer is not great for low-memory situations. In addition, part of my research showed that it’s possible to get large random periods with other, simpler methods.

To cut the story short, because we do a lot of testing on mobile platforms I wanted something low-memory and super simple. I ended up taking one of Marsaglia’s generators and modifying it slightly. This is the “mother-of-all” generator that I mention in the random number generator chapter:

return((k << 16)+j);

where k and j are unsigned 32-bit integers, initialized to non-zero values.

When I ran this through the Tuftests suite, it passed the GCD and Birthday Spacings tests, but failed the Gorilla test on half of the bits. So I figured, what the hey, I’ll just combine the high bits of k into the low bits of the result as well, or

return(((k << 16) | (k >> 16)) + j);

Surprisingly, this passed all three tests. In addition, I ran a test to try to see what its period is. It ran over a weekend and still didn’t hit 2^64 values. While a good part of that time was running the test on each new value, it does show that in a game or your average application, we probably don’t need these large periods because practically we’ll never generate that many values (the exception might be a game server, which is up for days). For Skia’s particular bench and unit tests, we might generate a few thousand random values — as long as our period is more than that we should be okay.

I’ve since run the TestU01 suite on this, and it passes SmallCrush, fails one of the Birthday Spacing tests (#11) on Crush, and two other Birthday Spacing tests (#13 and #15) on BigCrush. Other than that it passes. So while it’s not as random as other generators, at this point it doesn’t seem worth it to change it and rebaseline all of our results again, just for those failures.

If that doesn’t work for you, an alternative worth trying might be Marsaglia’s KISS generator, which takes four different random number generators and mixes their bits together. He claims (or claimed: Dr. Marsaglia unfortunately passed on in 2011) that the 2011 version has a period of greater than 10^40000000. Regardless of quality, the 2011 version requires a large buffer which is probably not practical for low-memory mobile platforms. The 1993 version, on the other hand, requires only six values, and has a period of 2^123. It does still fail one Crush and one BigCrush test, but is still probably good enough for games.

One important note: KISS is not cryptologically safe, as proven here, and I strongly doubt the generator I describe above is either. And if you do need a cryptologically secure generator, it’s best to find an approved implementation and use that, as even slight changes in speed due to a naive implementation can be enough to give an attacker enough information to break your code. But of course, if the NSA was involved, you might want to be wary…

In any case, to get results suitable for game logic, you really don’t need Mersenne Twister. Using one of these simple generators will probably do, and are leaps and bounds better than any Linear Congruential Generator.


New Thoughts on Random Numbers, Part One

Filed under: Erratical,Mathematical — Jim @ 9:00 am

I recently (December 2012) switched jobs from being an Engine Programmer at Insomniac Games, to being a Software Engineer at Google. At Google, I’m now working on Skia, which is the 2D graphics library that underlies Chrome, Firefox and parts of Android. One of the first things that came to my attention when I arrived was the pseudorandom number generator (PRNG) used to generate test cases for regressions — namely, that there was some concern that it wasn’t very random. In particular, the least significant bit flips back and forth, and possibly other lower-order bits weren’t very random. “Aha,” I thought, “Here’s something small and isolated that I can take on as a new hire. And it won’t take long; I can just draw on the random number chapter from our book.” Turns out that was partially right.

The first step was determining how good or bad the original PRNG was. Looking at it, it was clear that it was a linear congruential generator (LCG), which do have notoriously poor randomness in lower-order bits. They also fail the spectral test, i.e. if you generate points in space they tend to fall along visible planes. Not so good for generating tests for a graphics library. But how to verify this?

In the book I discuss some batteries of tests, such as Marsaglia’s Diehard and Brown’s DieHarder. In doing some research I found a couple more. First, there’s TestU01, which is a more extensive suite released in 2009. And Marsaglia and Tsang have created a smaller set of tests called tuftests, which they claim are nearly as rigorous as the large suites — i.e. if a PRNG passes tuftests, it’s highly likely to pass Diehard. For our case, we aren’t trying to find the ultimate PRNG, just something fast and reasonably random. So tuftests should suffice.

Tuftests consists of only three tests: the GCD test (as opposed to the GDC test, which probably involves giving a coherent talk after a night of parties), the Gorilla test, and the Birthday Spacings test. The GCD computes a large series of two pseudorandom variables, finds their greatest common denominators, and then compares the distribution of cycles necessary to compute the GCDs to the cycles needed for truly random variables. The Gorilla test is a variant of the Monkey test, where random strings of bits are generated using a single bit position of a random variable. This is done multiple times. The count of strings not generated should fall in a normal distribution. The Birthday Spacings test generates a set of birthdays in a “year”, and then determines the spacings between them, which should fall in a Poisson distribution. For all of these, the generated distribution is compared with the expected distribution, using chi square, Anderson-Darling or other method.

Testing the LCD used by Skia, it was clear that it failed the Gorilla test for all but the highest order bits (it may have also failed the Birthday test, I don’t recall now). So the belief that Skia’s PRNG needed improvement was verified. The next step was to find a decent replacement that doesn’t require a lot of overhead (important for testing on mobile devices). And I’ll cover my search for that next time.


More Errata

Filed under: Erratical,Mathematical — Jim @ 5:02 pm

So Jacob Shields, a student at Ohio State, added a set of errata to the comments on another post. To address them, I’ve pulled them out one by one:

I came across this on page 117 of the Second Edition: “There is a single solution only if the rank of A is equal to the minimum of the number of rows or columns of A.”

If I’m interpreting this correctly, it follows that a 2×3 coefficient matrix with a rank of 2 has a single solution. In reality, it has 1 free variable and thus infinitely many solutions. Shouldn’t there be a single solution only if the rank of A is equal to the number of unknowns, i.e. columns?

Yeah, I was probably only thinking of square matrices when I wrote that. I’ll have to think of another way to put that. Saying that it’s equal to the number of columns doesn’t work if you’re solving xA = b.

Also, on page 126 it says: “The first part of the determinant sum is u_[0,0]*U~_[0,0].” Below, it displays this equation: “det(U) = u_[0,0]*U~_[0,0]”

Assuming I’m interpreting this correctly, this must be wrong. The determinant of a real matrix should be a real number, not another matrix. In both cases I think it means to say u_[0,0]*det(U~_[0,0]) — right?

Yes, that’s correct, it’s a typo. The notation should follow from that used on page 123.

[…] on page 130 it says: “It turns out the columns of R are the eigenvectors of A, and the diagonal elements of D are the corresponding eigenvectors.” Shouldn’t it be that “the diagonal elements of D are the corresponding eigenvalues”?

Yup, another typo.

Could you please explain what is meant by the first matrix multiplication shown on page 138 in Section 4.2.3, Formal Representation [of affine transformations]?

For example, where did the matrix come from? Its entries are vectors, and the dimension of the product is (m+1)x1. What is the significance of the product, and how does it apply to what was just said?

I understand the matrix multiplication below it. In this case, the dimension of the product is 1×1 and, when expanded out, is equal to T(P). On initial reading it seems like the first matrix multiplication and the second one are supposed to be equivalent (before the second it says, “We can pull out the frame terms to get…”), but they’re clearly different, so how did you go from one to the other?

I’m not sure what the issue is; it’s similar to the linear transformation discussion on 104-106. The first n columns of the first matrix are the T(v_j) terms and the last is the T(O_b) term. The product of the two matrices is the transformed result T(P). The last sentence on 138 sums it all up.

On page 180, the description in the first whole paragraph about rotating an object to demonstrate gimbal lock says to “rotate the object clockwise 90 degrees around an axis pointing forward … [then] rotate the new top of the object away from you by 90 degrees … [then] rotate the object counterclockwise 90 degrees around an axis point up …. The result is the same as pitching the object downward 90 degrees (see Figure 5.4).”

However, if I’m interpreting this correctly, the rotation described appears to be a fixed angle x-y-z (-pi/2,-pi/2,pi/2) rotation. The result of this is the same as pitching the object _upward_ 90 degrees. Furthermore, Figure 5.4 demonstrates a different rotation: a fixed angle x-y-z (pi/2,pi/2,pi/2) rotation.

If the x-axis is pointing away from you and you rotate clockwise (from your perspective) by 90 degrees, that’s the same as rotating around the x-axis by positive 90 degrees, using the right-hand rule. Similarly, rotating the top of the object away from you is rotating around the y-axis by positive 90 degrees. The result should be the same as the diagram.

And this is why math uses symbols rather than trying to explain it using natural languages…

I might be missing something here, but on pages 181-182 it says this: “Near-parallel vectors may cause us some problems either because the dot product is near 0, or normalizing the cross product ends up dividing by a near-zero value.”

For near-parallel vectors, wouldn’t the dot product be near +/- 1, not 0?

I’m not sure what I meant there. Possibly that the dot product magnitude being near 1 (not 0, as you point out) can, through floating point error, lead to a result with magnitude slightly greater than 1, which is not valid input to acos.

On page 178, under the heading “5.3.3 Concatenation” it says this: “Applying (pi/2, pi/2, pi/2) twice doesn’t end up at the same orientation as (pi, pi, pi).”

However, for fixed angle z-y-x (the most-recently mentioned convention in the book prior to this section), applying (pi/2, pi/2, pi/2) twice does appear to end up at the same orientation as (pi, pi, pi). This isn’t the case for fixed angle x-y-z, though (and presumably other conventions). That had me confused for a while!

I believe that should be: Applying (pi/4,pi/4,pi/4) twice doesn’t end up at (pi/2,pi/2,pi/2). Your confusion is understandable. Though as (very) weak defense, when I multiply it out on the computer, floating point error does give me slightly different results (some of the “zero” terms end up being very small f.p. values).

On page 191, about halfway down the page, it mentions vectors w_0 and w_1, and then says that if we want to find the vector that lies halfway between them, we need to compute (w_1+w_2)/2. I’m assuming that it really means (w_0+w_1)/2?

As an aside, what is the significance of the division by 2? That is, using this method, what is the expected length of the resultant vector (without normalizing)? For instance, let w_0 = (1,0,0) and w_1 = (0,1,0). Then the halfway vector is (1+0,0+1,0+0)/2 = (0.5,0.5,0), whose length is 1/sqrt(2). Clearly this points in the correct direction, but its length is not the length of w_0, the length of w_1, or the average length of them, so what is it used for?

Correct, it should be w_0 and w_1. And the division by two is just to represent the average. As I point out, you don’t need to divide by 2 if you want a normalized result — you can skip that and do the normalize step. The length is never used.

Equation 5.7 on page 191 has a hat over the q (i.e. unit quaternion notation). However, I don’t believe that it’s actually a unit quaternion–specifically, I think its length is something like 2*sqrt(2*cos(theta) + 2).

Yes, that’s a typo. No hat.

On page 191, when talking about converting from a rotation matrix to a quaternion, it says that “if the trace of the matrix is less than zero, then this will not work.” I was wondering if you could explain why?

While trying to figure out, I found another way of performing this conversion which may or may not be equivalent to your own method:

In that case, it involves taking sqrt(Trace+1), so it actually fails if Trace+1<0. What is the difference between that method and your method which makes the difference between Trace<0 and Trace+1<0? Is it related to the fact that if the rotation was in a 4×4 affine transformation matrix then you’d get +1 when you take the trace?

The trace+derived axis+normalization technique will fail as theta gets closer to pi, because r approaches a zero vector as theta approaches pi (see page 183). In this case, you’d end up with something close to (-1,0,0,0), which is the inverse of the identity quaternion and probably not the correct result. So if the trace is close to -1, you need to do more work.

So why check against 0? In an abstract math world with infinite precision, doing this with a trace between -1 and 0 should be fine. But with floating point your derived rotation axis vector effectively becomes noise as it gets smaller and smaller. It avoids a lot of floating point headaches just to do the zero check.

The website also does a zero check; it’s basically equivalent to what I have but the normalization is handled by the 0.5f/sqrt(trace+1.0f) term (modified slightly for w). Theirs is probably faster as it doesn’t require squaring and summing the components to do the normalization.


Collision Response errata

Filed under: Erratical — Jim @ 8:38 pm

On page 636 of the second edition, the second paragraph begins as follows:

Each object will have its own value of epsilon. […]

On page 637, equation 13.18 contains a term epsilon_a, and below it is written

The equation for j_b is similar, except that we substitute epsilon_b for epsilon_a.

The code on page 638 follows this by referring to m_Elasticity and other->m_Elasticity.

These assertions are incorrect. The coefficient of restitution is not a property of an individual object — it is a property of the collision itself. This can be seen clearly in equation 13.15, which applies to both objects. So if we are considering four materials A, B, C, and D, there would be six possible coefficients of restitution: A colliding with B, A colliding with C, A colliding with D, B colliding with C, B colliding with D, and C colliding with D.

Many thanks to James McGovern of the Art Institute of Vancouver for pointing this out.

An error that was corrected in the second edition, but is not noted in the errata for the first is in the equation for the impulse magnitude for rotational collision (Equation 12.24, page 618). The sum of cross products in the denominator should be dotted with the normalized collision normal.

And in both editions, the moments of inertia tensors used in the collision response equations (on pages 617-618 in the first edition, and page 639 in the second edition) are referred to as I_a and I_b. These should be J_a and J_b to match the previous notation used. The intent of this was to separate the notation for the identity matrix from the notation for the inertia tensor. However, standard convention is to use I for the inertia tensor, hence the typo.

Thanks for Johnny Newman, also of Vancouver, for discovering the latter two errors.

Apologies for any confusion any of this may have caused readers.


Two CD Updates

Filed under: Erratical — Jim @ 9:36 pm

Yes, it’s been a while. But here are two updates for those who make use of the CD that comes with the book.

First, Ben Bass over at Coded Structure wrote to let me know that the code on CD doesn’t build properly with Mac OS 10.6. Fortunately he has provided a guide on his blog showing how to patch up the Makefiles. Thank you, Ben!

Secondly, another reader wrote to point out that the CD notes say that built Windows versions of the OpenGL examples would be provided, but in fact they aren’t on the CD. These are now available in a separate zip file which you can download here.


More errata

Filed under: Erratical — Jim @ 9:41 pm

A couple more errata have come in since the last update. From Tim Lowery comes:

“On page 117, the figures look mislabeled, or the axis is incorrect. For example, Figure 3.5a shows x-axis rotation, but the object is rotating around the y-axis. Likewise, for Figure 3.5b.”

In this case, the captions are correct (the intent is to go in xyz order), but the figures are reversed.

From k. avery comes:

“At the end of p. 623, cot theta should be csc theta / sec theta, not sec theta / csc theta.”

Indeed it should be. And I had thought I caught that one once before. Evasive devils, those cotangents.


Errata and Patch Files

Filed under: Erratical — Jim @ 10:13 pm

I’ve finally gotten around to creating a formal errata file, and a code update. The code update is an encrypted ZIP file; the password is the second word on page 37. The paths should be set up correctly to match the original CD. To use, simply unzip it to the directory where you copied the CD contents.

If there are issues, you can post a comment here, or write me by using my first name at this domain.


Reparameterizing a Curve, Part Three

Filed under: Erratical,Mathematical — Jim @ 9:08 pm

Now we come to the final part: merging Newton-Raphson and bisection together, and choosing the correct one depending on the situation.


Reparameterizing a Curve, Part Two

Filed under: Erratical,Mathematical — Jim @ 9:11 pm

Due to preparing for a week of vacation, and then going on a week of vacation, and then recovering from a week of vacation, this blog has fallen from the wayside a bit. But I’m back, and I’ll try to wrap up this topic in a couple of postings.

When last we left our intrepid heroes, we were discussing Newton-Raphson root finding, how it’s used for reparameterizing a curve, and when it fails (namely when the speed on the curve approaches zero). The solution hinted at was the bisection method.


Reparameterizing a Curve

Filed under: Erratical,Mathematical — Jim @ 8:52 pm

I’m still optimizing memory and speed so nothing exciting to report under the lighting topic. Maybe later this week or next week.

So let’s look at a bit of code from the book, shall we?


Errata: Tensor Product

Filed under: Erratical,Mathematical — Jim @ 10:31 pm

William Brennan points out that on page 84, the equation

is not correct. There are two ways to rewrite this. In one the intended order is correct, but is missing the transpose operator to indicate that it’s a row vector:

Alternatively, you can generate the same result with a column vector by doing:

In Chapter 3, we simplify the Rodrigues formula (page 123) and general planar reflections (page 128) by using the tensor product. Since the arguments of the tensor product are the same in these cases, the ordering doesn’t matter. However, the ordering of the arguments needs to be reversed in the shear matrices on page 132, so


Other tensor product arguments may need to be reversed elsewhere in the text, though I can’t find any at this time.

Errata: Figure 1.9

Filed under: Erratical — Jim @ 9:23 am

Reported by Richard Ruth: In Figure 1.9 on Page 26, the label on the hypotenuse should be .


Getting Axis and Angle from Rotation Matrix

Filed under: Erratical,Mathematical — Jim @ 11:36 am

There is a bug in the code that gets the axis and angle information from a rotation matrix. Where the angle is 180 degrees (the last else statement) it computes the reciprocal value incorrectly. The original code is:
float recip = 0.5f/s;
The fixed code should be:
float recip = 1.0f/s;
This should be corrected in both IvMatrix33 and IvMatrix44.


Correction: Quaternion Creation

Filed under: Erratical,Mathematical — Jim @ 8:37 pm

This is both a correction, and an improvement to the creation of a quaternion from two vectors. Or as it is commonly known, Shortest RotationArc.

The current code doesn’t handle the case where the two vectors are opposing; there’s an if statement to detect it, and a comment, but no recovery code. The problem is that there are an infinite number of orthogonal rotation axes — we just need to pick one. The solution is to check to see which axis the start vector (it could be the finish vector, it doesn’t matter) is generally pointing along. If it’s pointing generally along the -axis, then we take the cross product with the -axis to get our rotation axis. Otherwise we take the cross product with the -axis.


Correction: Quaternions

Filed under: Erratical,Mathematical — Jim @ 9:35 pm

Ah, our first errata. How bittersweet.

Eric Mumau sent me this one. On page 499, equation 10.11 is wrong. It should read:

Correspondingly, the following equation should read:

The rest of the page follows as before.

This affects the code which starts at the bottom of page 499. At the top of 500, the first three lines should be rewritten as:
float vMult = 2.0f*(x*vector.x + y*vector.y + z*vector.z);
float crossMult = 2.0f*w;
float pMult = crossMult*w – 1.0f;
Note that the code on the CD is not the same as that given in the text, and in fact will actually work. However, it is slightly less efficient than what is given here.

Powered by WordPress

Fatal error: Uncaught exception 'wfWAFStorageFileException' with message 'Unable to verify temporary file contents for atomic writing.' in /home/jvsquare/public_html/essentialmath/blog/wp-content/plugins/wordfence/vendor/wordfence/wf-waf/src/lib/storage/file.php:47 Stack trace: #0 /home/jvsquare/public_html/essentialmath/blog/wp-content/plugins/wordfence/vendor/wordfence/wf-waf/src/lib/storage/file.php(650): wfWAFStorageFile::atomicFilePutContents('/home/jvsquare/...', '<?php exit('Acc...') #1 [internal function]: wfWAFStorageFile->saveConfig('livewaf') #2 {main} thrown in /home/jvsquare/public_html/essentialmath/blog/wp-content/plugins/wordfence/vendor/wordfence/wf-waf/src/lib/storage/file.php on line 47