Explicit Animation Techniques

We will begin by reviewing explicit animation techniques. Although they are increasingly being replaced by the more powerful implicit paradigm, there are still many applications for explicit methods. Some games need to render lots of characters onscreen, and the CPU hit imposed by implicit methods would be prohibitive. Other games need only simple animations attached to characters, and an implicit solution would be too much effort for the results needed. Whichever the case, we will review three classic approaches to explicit animation.

Frame Animation

The simplest way of animating a character can be derived from traditional animation techniques. In old-school animation, the frames of a character in an animated movie were drawn on celluloid sheets (cels), with one sheet for each pose or frame to be depicted. Frame animation works similarly: It involves storing each frame in the animation, sampled at whichever rate we will be displaying it (usually 25 frames per second or more). We will thus store many copies of the same mesh, one per frame. Then, at runtime, we will just paint the mesh that depicts the current state of the character. By synchronizing this with a timer or an input controller, the illusion of animation can be achieved.

So, what do we need to store for each frame? Clearly, we need to store vertex coordinates. But it does not really make much sense to store mapping coordinates as well, because texture mapping will very likely be constant throughout the animation cycles. We only need one copy of this data for the whole animation. In the end, you will discover we only really need the vertices for each position and probably the shading information: vertex colors, normals for lighting, and so on. This greatly reduces the memory footprint. Even better, if our character is defined by means of an indexed list of faces, the index list only needs to be stored once.
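As a rough sketch of the storage layout just described, per-frame data can hold only positions and normals, while mapping coordinates and the index list are stored once. The type and field names below are illustrative assumptions, not taken from any particular engine.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct Vec3 { float x, y, z; };
struct Vec2 { float u, v; };

// Data that changes from frame to frame: one entry per vertex.
struct MeshFrame {
    std::vector<Vec3> positions;
    std::vector<Vec3> normals;
};

// Hypothetical container for a frame-animated mesh.
struct FrameAnimatedMesh {
    std::vector<Vec2> texCoords;         // stored once, shared by all frames
    std::vector<std::uint16_t> indices;  // stored once, shared by all frames
    std::vector<MeshFrame> frames;       // one entry per sampled frame
};

// Approximate size of the animation data, in bytes.
std::size_t footprintBytes(const FrameAnimatedMesh& m) {
    std::size_t bytes = m.texCoords.size() * sizeof(Vec2)
                      + m.indices.size() * sizeof(std::uint16_t);
    for (const MeshFrame& f : m.frames)
        bytes += (f.positions.size() + f.normals.size()) * sizeof(Vec3);
    return bytes;
}
```

A helper such as footprintBytes makes it easy to see how quickly the per-frame part dominates as the frame count grows.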

Even with all these optimizations, frame animation is memory hungry. This brute-force approach can only be recommended for very specific situations involving very few frames and small triangle counts.

Frame animation can also have another potential downside. In-game frame rates are never completely stable. Oscillations in geometry complexity, number of onscreen characters, or many other situations can make our ideal frame rate go up or down. As you saw in the opening chapters, we need to implement some measures to ensure that the in-game action does not vary in speed as well. We don't want our characters to walk faster or slower depending on the circumstances.

The solution comes from using a real-time clock and using the timing to compute real-time positions for all in-game elements. The problem with frame animation is that we have the mesh evaluated at discrete points in time. We can choose the closest available evaluation to the current point in time, but we will always see the jitter in animation speed.

To keep animation running at a smooth rate, we will need to accurately compute in-between frames, so we can display a good approximation of the animation with arbitrary precision. To do this, we will use interpolation techniques, which use mathematical functions to compute the desired value. Interpolation is extensively used in the next section, so read on for more information.
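To make the idea concrete, here is a minimal sketch of how a real-time clock value might be mapped onto a looping, uniformly sampled animation. It returns the two stored frames that surround the requested time and the blend factor between them. The function and field names are assumptions made for illustration.

```cpp
#include <cassert>
#include <cmath>

// Where a clock time falls inside a sampled animation.
struct FrameSample {
    int frameA;   // stored frame at or before the requested time
    int frameB;   // the following frame (wraps around for looping cycles)
    float alpha;  // blend factor in [0, 1) going from frameA to frameB
};

// elapsedSeconds comes from a real-time clock, sampleRate is the rate (in
// frames per second) at which the animation was stored, and frameCount is
// the number of stored frames.
FrameSample sampleLoopingAnimation(double elapsedSeconds, double sampleRate,
                                   int frameCount) {
    assert(elapsedSeconds >= 0.0 && sampleRate > 0.0 && frameCount > 0);
    // Position measured in frames, wrapped into [0, frameCount).
    double framePos = std::fmod(elapsedSeconds * sampleRate,
                                static_cast<double>(frameCount));
    int a = static_cast<int>(framePos);  // truncation is a floor here
    FrameSample s;
    s.frameA = a;
    s.frameB = (a + 1) % frameCount;
    s.alpha = static_cast<float>(framePos - a);
    return s;
}
```

Picking only frameA reproduces the "closest available evaluation" approach and its jitter; blending frameA and frameB by alpha gives the in-between frame.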

Keyframe Animation

A more involved technique can help us reduce our memory needs while keeping the same simplicity and elegance as regular frame animation. To better understand this variant, we first need to discuss how animation is created in content-creation packages such as 3ds max, Maya, or Lightwave. Animators usually create a motion by defining a number of "key frames," or well-known positions that determine the complete motion in their package of choice. For a 50-frame sequence, this might involve setting keyframes at frame 0, 15, 30, and 45. The character will then loop through the sequence, using keyframes and interpolation to derive all the remaining frames. This is a timesaver for the artists because they can determine a complex motion using just a few directions. A two-second cycle (which would total 50 frames) can often be described within 5 to 10 keyframes.

Taking advantage of this property is the core of keyframe animation. We will only store these keyframes as specified in the modeling package. This is a clear memory saving over traditional frame animation. The animation engine will take care of computing any in-between frame at runtime. As shown in Figure 15.3, if we are requesting the frame at 0.67, we need to blend the two closest keyframes (the one at 0.5 and the one at 0.9). You might think the interpolator is an added CPU cost to the algorithm, but this is actually untrue. Any frame animator would end up using interpolation as well, and thus, the CPU cost of frame and keyframe animation is essentially the same.

Figure 15.3. Interpolation is required to render a frame in-between two keyframes, such as frame 0.67 in the figure.
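The lookup in the figure can be sketched as follows, assuming keyframe times are kept in a sorted array. For a request at 0.67 between keyframes at 0.5 and 0.9, the blend factor is (0.67 - 0.5) / (0.9 - 0.5) = 0.425. The names used here are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// The two keyframes surrounding a requested time, plus the blend factor.
struct Bracket {
    std::size_t prev;  // index of the keyframe at or before t
    std::size_t next;  // index of the keyframe after t
    float alpha;       // 0 at prev, 1 at next
};

// keyTimes must be sorted in ascending order and hold at least two entries.
// Requests outside the stored range are clamped to the first or last pair.
Bracket findBracket(const std::vector<float>& keyTimes, float t) {
    assert(keyTimes.size() >= 2);
    std::size_t last = keyTimes.size() - 1;
    if (t <= keyTimes.front()) return {0, 1, 0.0f};
    if (t >= keyTimes[last]) return {last - 1, last, 1.0f};
    std::size_t next = 1;
    while (keyTimes[next] <= t) ++next;  // linear scan: fine for a few keys
    std::size_t prev = next - 1;
    float alpha = (t - keyTimes[prev]) / (keyTimes[next] - keyTimes[prev]);
    return {prev, next, alpha};
}
```

With only 5 to 10 keyframes per cycle a linear scan is adequate; a binary search would be the natural replacement for longer sequences.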


There are several variants of the keyframe animation scheme, and each differs fundamentally in the interpolator used. As a first (and most popular) option, we can choose to compute the required frames using linear interpolation, which will interpolate each value (X, Y, and Z) using a straight line between an initial and an end value. Linear interpolation can be computed easily using the following equation:

interpolator = (timevalue - lastkeyframe) / (nextkeyframe - lastkeyframe)
interpolatedvalue = lastvalue * (1 - interpolator) + nextvalue * interpolator

In the equation, lastkeyframe and lastvalue store the time and value of the last keyframe (to the left), and nextkeyframe and nextvalue store the time and value for the next keyframe (to the right). Remember to perform this same computation for X, Y, and Z. Here is the complete code to interpolate two 3D points:

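point interpolate(point p1, long time1, point p2, long time2, long currenttime)
{
    // Convert to float before dividing: dividing two longs would truncate
    // alpha to an integer and lose the in-between position.
    float alpha = (float)(currenttime - time1) / (float)(time2 - time1);
    float alphainv = 1.0f - alpha;
    point res;
    // Blend each coordinate along a straight line from p1 to p2.
    res.x = p1.x * alphainv + p2.x * alpha;
    res.y = p1.y * alphainv + p2.y * alpha;
    res.z = p1.z * alphainv + p2.z * alpha;
    return res;
}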

Linear interpolation ensures smooth movement at a very low CPU cost. In the preceding example, we get a cost of six additions/subtractions, six multiplies, and one divide per vertex. If we are just using frame animation, (time2-time1) equals one, so we save one subtraction and one expensive divide. On the other hand, linear interpolation can sometimes yield poor results. Most animation packages use more powerful interpolators (Bézier, Hermite, and so on), so when we see the results with just linear interpolation applied, some motions tend to look robotic and unnatural.

One solution to raise the quality is to make sure your keyframes are not too far apart, thus reducing the blockiness of the motion. A well-known mathematical result states that any smooth curve is closely approximated by a straight line if the interval is small enough, and this is what we will try to exploit here. Trial and error in this situation is the best choice because fast motions (such as fighting moves) will require higher sampling rates than slower, paused motions. Note, however, that more keyframes imply a higher memory cost.

Thus, a better solution is to automatically extract keyframes from a high-resolution animation sequence, so the chosen keyframes ensure that a fixed quality threshold is maintained, and blockiness is minimized. The coding idea is pretty straightforward: Store the animation at a high sampling rate (such as 25 fps), and then use an analyzer to find out which frames can be discarded with little or no effect on the results. This can be achieved in an iterative manner, discarding the least significant frame at each step until we reach a limit error level. By doing this, we can ensure that our resulting animation is well suited for real-time display, keeping the best possible quality. Be warned though: The problem of deciding on just the perfect set of frames to achieve optimal results is an inherently complex problem, so any solution you implement will be computationally expensive.
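The iterative analyzer can be sketched as a greedy loop. To stay short, this example works on a single scalar channel (one coordinate of one vertex) and assumes linear interpolation at playback. A real tool would measure the error over every vertex of the mesh. The function names are hypothetical, and a greedy pass of this kind does not guarantee the optimal set of frames.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Largest deviation between the original samples strictly inside (a, b) and
// the straight line joining samples[a] and samples[b].
float spanError(const std::vector<float>& samples, std::size_t a, std::size_t b) {
    float worst = 0.0f;
    for (std::size_t i = a + 1; i < b; ++i) {
        float alpha = float(i - a) / float(b - a);
        float approx = samples[a] * (1.0f - alpha) + samples[b] * alpha;
        worst = std::fmax(worst, std::fabs(samples[i] - approx));
    }
    return worst;
}

// Returns the indices of the frames kept as keyframes. At each step the
// frame whose removal introduces the smallest error is discarded, until the
// cheapest removal would exceed maxError.
std::vector<std::size_t> reduceKeyframes(const std::vector<float>& samples,
                                         float maxError) {
    std::vector<std::size_t> kept(samples.size());
    for (std::size_t i = 0; i < kept.size(); ++i) kept[i] = i;

    while (kept.size() > 2) {
        std::size_t best = 0;  // 0 means "no candidate yet"
        float bestError = 0.0f;
        for (std::size_t k = 1; k + 1 < kept.size(); ++k) {
            float e = spanError(samples, kept[k - 1], kept[k + 1]);
            if (best == 0 || e < bestError) { best = k; bestError = e; }
        }
        if (bestError > maxError) break;  // limit error level reached
        kept.erase(kept.begin() + static_cast<std::ptrdiff_t>(best));
    }
    return kept;
}
```

The first and last frames are always kept, and each pass rescans every remaining candidate, which is why this kind of analysis is best run offline in the content pipeline.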

For those working on high-powered platforms, we can solve all these problems by using a more powerful interpolator, hopefully the same one used by your animation package of choice. This way you can store fewer frames and ensure one-to-one correspondence between what your artists animate and what the engine renders.

As an example of this approach, one of the most popular interpolators utilizes a cubic curve, which smoothly interpolates using the beginning and endpoints as well as the beginning and ending tangents. Techniques such as Hermite polynomials or even Bézier curves can be used to do the math.
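As an illustration, a cubic Hermite interpolator for a single component can be written with the standard Hermite basis functions. This is a sketch only: it assumes the tangents are already scaled to the keyframe interval and that t has been normalized to the range 0 to 1. Whether it matches a given animation package depends on how that package defines its tangents.

```cpp
// Cubic Hermite interpolation for one component (apply it to X, Y, and Z).
// p0, p1: values at the two keyframes.
// m0, m1: tangents at the two keyframes, scaled to the keyframe interval.
// t:      normalized time, 0 at the first keyframe and 1 at the second.
float hermite(float p0, float m0, float p1, float m1, float t) {
    float t2 = t * t;
    float t3 = t2 * t;
    float h00 =  2.0f * t3 - 3.0f * t2 + 1.0f;  // weight of p0
    float h10 =         t3 - 2.0f * t2 + t;     // weight of m0
    float h01 = -2.0f * t3 + 3.0f * t2;         // weight of p1
    float h11 =         t3 -        t2;         // weight of m1
    return h00 * p0 + h10 * m0 + h01 * p1 + h11 * m1;
}
```

With zero tangents the curve eases in and out of each keyframe, and with both tangents equal to p1 - p0 it reduces to plain linear interpolation.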

Tagged Interpolation

Both frame and keyframe animation methods are easy to code, but come with a host of problems for the programmer. Memory footprint, as we have seen, can become a serious issue, especially if we try to animate high triangle counts or need to store lots of keyframes.

Still, there are other, more subtle problems. Imagine that you are creating a character for an action title. The character must perform a variety of animations, so the game is visually rich. For example, he must be able to stand still, walk, and run (both forward and backward). He also needs to jump, crouch, and shoot three different weapons. A first, superficial analysis reveals we will be using 10 different animation cycles:

  • Stand still

  • Walk forward

  • Walk backward

  • Run forward

  • Run backward

  • Jump

  • Crouch

  • Shoot first weapon

  • Shoot second weapon

  • Shoot third weapon

Everything looks fine until the lead designer comes along and suggests allowing the player to shoot while moving for increased realism. This might look like an innocent suggestion to the untrained eye, but if you do the math, you will discover you don't need 10 cycles anymore but 28. You have seven "poses" (stand, walk forward, walk back, run forward, run back, jump, and crouch) and four "actions" (do nothing, and shoot each of the three different weapons). Notice the combinatorial pattern? That's the root of an annoying problem with keyframe approaches.

Poses:

  • Stand

  • Walk forward

  • Walk backward

  • Run forward

  • Run backward

  • Jump

  • Crouch

Actions:

  • Do nothing

  • Shoot weapon 1

  • Shoot weapon 2

  • Shoot weapon 3

In the Quake II game engine, this problem could be easily spotted during multiplayer games. To save the then scarce animation resources, these combinatorial cycles simply were not available. So, an enemy running toward us while shooting would correctly render the "shoot" pose while keeping both legs still as in an idle animation. This was sometimes dubbed "player skating" because players seemed to glide over the map.

For Quake III, the team at id Software found a satisfactory solution, which they called tagged animation, and implemented in the MD3 animation system. The key idea is to think of each character as if it were divided into several body parts, much like action figures. For Quake III, characters were usually divided into a "head" block, a "torso" block (from the neck to the belt level), and a "legs" block (from the belt downward). Then, each of these body parts had its own animation cycles, so combinatorial actions could be achieved by stacking cycles from each body part. In this case, our head part would remain unanimated. Animation cycles for the legs and the torso are as follows.

Legs:

  • Stand still

  • Walk forward

  • Walk backward

  • Run forward

  • Run backward

  • Jump

  • Crouch

Torso:

  • Stand still

  • Shoot first weapon

  • Shoot second weapon

  • Shoot third weapon
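A minimal sketch of the bookkeeping this implies is shown below: each body part keeps its own current cycle and playback time, so any legs cycle can be stacked with any torso cycle. The cycle lists come from the text; the enum, struct, and function names are hypothetical.

```cpp
enum LegsCycle {
    LEGS_STAND, LEGS_WALK_FORWARD, LEGS_WALK_BACKWARD,
    LEGS_RUN_FORWARD, LEGS_RUN_BACKWARD, LEGS_JUMP, LEGS_CROUCH,
    LEGS_CYCLE_COUNT
};
enum TorsoCycle {
    TORSO_STAND, TORSO_SHOOT_1, TORSO_SHOOT_2, TORSO_SHOOT_3,
    TORSO_CYCLE_COUNT
};

// Each body part tracks its own cycle and its own playback time.
struct LegsState  { LegsCycle cycle;  float time; };
struct TorsoState { TorsoCycle cycle; float time; };

struct TaggedCharacter {
    LegsState legs;
    TorsoState torso;  // the head is left unanimated, as in the text
};

// Switching one part restarts only that part's cycle.
void playLegs(TaggedCharacter& c, LegsCycle cycle) {
    if (c.legs.cycle != cycle) { c.legs.cycle = cycle; c.legs.time = 0.0f; }
}
void playTorso(TaggedCharacter& c, TorsoCycle cycle) {
    if (c.torso.cycle != cycle) { c.torso.cycle = cycle; c.torso.time = 0.0f; }
}

// Both parts advance with the same clock but at independent positions.
void advance(TaggedCharacter& c, float dt) {
    c.legs.time += dt;
    c.torso.time += dt;
}

// Cycles to store with independent parts (7 + 4) versus one full-body
// cycle per pose/action combination (7 * 4).
int storedCycles()   { return int(LEGS_CYCLE_COUNT) + int(TORSO_CYCLE_COUNT); }
int combinedCycles() { return int(LEGS_CYCLE_COUNT) * int(TORSO_CYCLE_COUNT); }
```

Running forward while shooting the second weapon then amounts to calling playLegs(c, LEGS_RUN_FORWARD) and playTorso(c, TORSO_SHOOT_2), with 11 stored cycles covering all 28 combinations.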

Once the animation cycles have been identified, it is time to link the body pieces together, so the whole looks like a complete character. To do so, the designer must manually specify a pivot point in each body part, so each part basically knows where it should be attached. These pivots (which are nothing but coordinate systems essentially) are called tags, and hence the name of the animation system. In typical Quake III terminology, we have three body parts (head, torso, and legs), and four tags. The first tag (tag_floor) is used to indicate the ground level, so the character stands well on the terrain. This tag is usually attached to the legs. A second tag (tag_legs) specifies the joint between the legs and torso. A third tag (tag_head) provides the binding between the torso and head. The fourth tag (tag_weapon) is usually placed inside the right hand, so the character can hold interchangeable weapons realistically. You can see a tagged animation character with body parts highlighted in Figure 15.4.

Figure 15.4. A character ready for tagged animation. In this case, the body has been divided into head, torso, legs, and arm, so the arm can be animated as well.


Notice that tags specify where each body part should be placed (both in terms of rotations and translation). But how do we ensure continuity between legs and torso (or torso and head)? Ideally, we should stitch both body parts together to ensure a perfect joint between them by sewing continuous triangles together. However, the Quake III animation system chose the easiest solution. No sewing is performed, so the two body parts are not really integrated: They are just put one beside the other like two bricks. To prevent players from noticing this, the body parts interpenetrate each other slightly, so the whole character always stays together. Artist tricks such as wide belts and clever textures help convey the sense of continuity.

Tagged systems can be extended to achieve quite sophisticated effects. Imagine that you need to bind characters to vehicles, whether it's a horse for a medieval role-playing game (RPG) or a motorbike in a racing game. All you have to do is tag your vehicle so your animation engine knows where to place each element. Need to do a system where your character can be heavily customized with weapons, a helmet, and shoulder pads? Tags are the way to go whenever flexibility and expandability are needed. Tags are also very convenient. You can add props to your character dynamically. In the end, a character like a flying dragon that carries a rider holding a sword, which in turn has an apple stuck to the tip, can be achieved with a tagged-animation system. In fact, the paradigm is so well thought-out that some implicit animation systems use tags to handle props for characters.

Tagged systems offer a significantly lower memory footprint than regular keyframed systems, especially for those cases where we need to perform different actions with different body parts simultaneously. Obviously, memory footprint will still grow linearly with the number of vertices as you add cycles to your animation. CPU use, on the other hand, will remain more or less stable and similar to that of regular keyframe animation, because we are just interpolating keyframes and placing geometry blocks in space. If the graphics hardware supports hardware transforms, the placement of each body part comes at almost no cost to the CPU.

However, tagged animation systems have some limitations that we must be aware of. First, animation is still bound explicitly to the character. So, if we need to represent a lot of different characters using the same animation cycles (say, non-player characters in a role-playing title), we will need many copies of those cycles, and sooner or later we will run out of memory. It would be great to share animation loops between different characters so two characters would only use one walk cycle.

Second, a tagged animation engine will not provide us with environment adaptation: Following the slopes in the ground, reaching out for objects realistically, and so on, are nearly impossible to do with the methods we have seen so far. Even if we have a large collection of animation cycles, our range of actions will be limited to what has been canned previously.

Third, there is a limit to body part division. Imagine that you want to do a sword fighting game, where you need to have a wide range of motions. Your animation programmer proposes doing a tagged system, using more tags to ensure that no combinatorial cycles are ever needed. Specifically, he proposes the following list:

  • Head

  • Torso (from neck to belt, no arms)

  • Left arm

  • Right arm

  • Left leg

  • Right leg

This may look good at first glance, but notice how many motions require synchronizing many body parts. Holding a weapon, for example, will involve both arms and the torso. Walking will involve both legs, and if the upper body is actionless, both arms and the torso. All these relationships will need to be coded into the system, so our animation engine will require a significant layer of glue code to ensure that body hierarchies are well kept. Compare that to the classic MD3 file where there are just two body parts to take care of (the head is usually not animated). The animation logic layer is still there, but it is simpler to manage.

So clearly, tagged animation is a case for the "too much of a good thing can be bad" discussion. In the right measure, it can greatly improve upon a keyframe approach, often reducing the memory footprint by half or even better. But dividing our body into more parts than is strictly required will make things worse: The control layer will grow out of control. When programmers fall into this "overdivision" syndrome with a tagged system, it is usually a sign that they really need to get a skeletal animation system instead. They are just trying to force the tagged system to behave like a skeletal system at a fraction of the cost, and that's generally not a good idea.

Implementing a tagged animation system is almost as straightforward as implementing a regular, single-mesh keyframe interpolator. We only need to store animation pointers for each body part, so we can keep track of which animation and frame each mesh is playing. Then, we need to be able to render the whole hierarchy, using the pivot points. To do so, most systems use a geometric object that hints at the pivot's location and orientation. In most Quake-based modelers, a triangle is used, so the first point in the triangle is used as the pivot's location, and then two vectors from that point to the other two define the coordinate axes. Computing the cross product of these two vectors yields the third axis, an "up" vector.
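The triangle-to-pivot step can be sketched as follows. This example assumes a non-degenerate triangle and re-orthogonalizes the second axis, because the two edges drawn by an artist need not be exactly perpendicular. That extra step, and all the names used, are assumptions of this sketch rather than a description of any specific modeler.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

static Vec3 sub(Vec3 a, Vec3 b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }

static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

static Vec3 normalize(Vec3 v) {
    float len = std::sqrt(v.x * v.x + v.y * v.y + v.z * v.z);
    return {v.x / len, v.y / len, v.z / len};
}

// Coordinate system carried by a tag: a location plus three unit axes.
struct TagFrame { Vec3 origin, axisX, axisY, axisZ; };

// p0 is the pivot location; the edges p0->p1 and p0->p2 hint at the axes.
TagFrame frameFromTriangle(Vec3 p0, Vec3 p1, Vec3 p2) {
    Vec3 x = normalize(sub(p1, p0));     // first edge gives the first axis
    Vec3 edge = sub(p2, p0);             // second edge, roughly the second axis
    Vec3 z = normalize(cross(x, edge));  // cross product gives the "up" vector
    Vec3 y = cross(z, x);                // completes an orthonormal basis
    return {p0, x, y, z};
}
```

The three axes can be written directly into the rotation part of a transform matrix, with the origin as its translation.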

At runtime, the pivot's orientation is usually stored in a quaternion, a mathematical construct that encapsulates orientations elegantly, with the pivot's location kept alongside it as an ordinary vector. A quaternion is a generalization of a complex number, consisting of four floating-point values. Quaternions are explained in the next chapter as a way of implementing camera interpolators in games like Tomb Raider. Generally speaking, they are the way to go whenever you need to blend smoothly between successive orientations. Read the next chapter, especially the section on spherical linear interpolation, for a thorough description of how a quaternion can be used to compute the pivot point at runtime. Once we have the pivot point in place, all we need to do is use the transform stacks to apply the right rotations and translations to the vertices.
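As a small sketch of that last step, the code below places a vertex of a child body part using a tag, assuming the tag is held as a unit quaternion for the rotation plus a vector for the translation. In a real engine this work would normally be done by the transform stack or the graphics hardware. The type and function names are hypothetical.

```cpp
struct Vec3 { float x, y, z; };
struct Quat { float w, x, y, z; };  // assumed to be unit length

// A tag as used in this sketch: a rotation plus a translation.
struct Tag { Quat orientation; Vec3 position; };

static Vec3 cross(Vec3 a, Vec3 b) {
    return {a.y * b.z - a.z * b.y,
            a.z * b.x - a.x * b.z,
            a.x * b.y - a.y * b.x};
}

// Rotates v by the unit quaternion q using
// v' = v + 2w(u x v) + 2(u x (u x v)), where u = (q.x, q.y, q.z).
Vec3 rotate(Quat q, Vec3 v) {
    Vec3 u = {q.x, q.y, q.z};
    Vec3 c1 = cross(u, v);
    Vec3 c2 = cross(u, c1);
    return {v.x + 2.0f * (q.w * c1.x + c2.x),
            v.y + 2.0f * (q.w * c1.y + c2.y),
            v.z + 2.0f * (q.w * c1.z + c2.z)};
}

// Moves a vertex from the child part's local space into the parent's space:
// rotate first, then translate to the tag's location.
Vec3 attach(const Tag& tag, Vec3 v) {
    Vec3 r = rotate(tag.orientation, v);
    return {r.x + tag.position.x, r.y + tag.position.y, r.z + tag.position.z};
}
```

Chaining attach calls from the legs up through the torso to the head or weapon mirrors the hierarchy that the tags describe.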
