Implicit Animation OverviewNow that we have mastered explicit techniques and learned to exploit their benefits, we will move on to implicit animation. Be warned though: Implicit systems are significantly more complex than explicit systems, so you better make sure your matrix knowledge is up-to-date. You will surely need it. On the other hand, implicit engines are much more powerful and offer sophisticated controls for the animation programmer. Perfect environment adaptation, very low memory hit, and higher triangle counts are just a few of its benefits. As stated in the opening section, implicit animation stores a description of the movement and evaluates it on the fly, effectively animating vertices in real time. To do this, it requires different data structures than those used in explicit animation. To begin with, we need to know about the character topology. Usually, a skeletal structure is stored, specifying the rigid bones that drive that character's movement as well as the relationship between bones and vertices. We need to know which vertices each bone should influence in order to animate them properly. We will also need to know which bones are connected by joints, so we know how to inherit movement. Moving the knee should affect the foot, for example. Additionally, we will need an animation file that stores the animated cycle in implicit form. Usually this file will consist of a series of snapshots that sample not the geometry, but the joint configuration. By knowing the bone structure and the rotations of each joint at a given time, we can apply the pose to the geometry and thus animate it. Two general paradigms exist for implicit animation: forward (or direct) kinematics (FK) and inverse (or reverse) kinematics (IK). In FK, we will start from a root node (usually, the pelvis) and propagate the skeleton downward, inheriting motions as we advance. The upper arm should inherit the chest, the lower arm both the upper arm and chest, and finally the hand. FK is coded by stacking series of transformation matrices as we enter each body part. It is the method of choice for motion-capture data or for coding general, broad moves. IK works the other way around: It starts at the terminal element (the hand, for example) and computes the joints as it moves higher in the animation skeleton. Thus, IK allows us to locate an element in space, and then calculate the needed configuration of its ancestor joints. We can begin by saying "the hand must be here," and automatically compute where the elbow and shoulder should be. IK is very useful for adaptive and physically correct animation. Characters can adapt their footsteps to irregular ground or reach out to grab an object realistically. But this sophistication comes at a cost: IK is computationally more expensive than FK. IK is not, however, a miracle solution. IK alone allows us to position hands and feet on the ground, but this has nothing to do with realistic walking or running. A new technique has emerged that blends both techniques for better results. The core idea of this hybrid approach is to use FK (possibly coupled with motion capture) for overall movement, and blend the results with IK for terminal joints and its direct ancestors. This way we can combine the perfection and realism of a canned walk cycle while getting realistic knee and feet adaptation to the ground. Some systems use physically based animation for increased realism. Under this technique, movement is computed as the consequence of physical processes. Each body part has a weight and inertia. As the result of the physical simulation, joint configurations are computed, and the character is finally animated using FK. Implicit animation has many advantages (and a few disadvantages) over explicit systems. On the plus side, memory use is greatly reduced. Implicit systems only store joint configurations, which is orders of magnitude cheaper than storing the complete mesh at each keyframe. Moreover, implicit systems can implement many advanced effects such as physically correct animation, adaptation to rough terrain, or realistic intercharacter interactions, to name a few. Last, but not least, an implicit animation system can share animation data between different characters. If you have 10 different in-game characters that share a common walk cycle, you only need to store the animated joint list once, because the vertices are computed on the fly. The main disadvantage of an implicit animation system is the coding complexity and computational cost. Programming a skeletal animation system is significantly harder than storing explicit animation data. Because matrix math is involved, if you decide to implement IK or a physically correct system, the equations are not trivial. As a direct consequence of this, your animation subsystem will consume more CPU cycles. Depending on the application and target platform, this computational cost might be prohibitive, and explicit systems might be the way to go. Forward KinematicsFK is the easiest way of implementing implicit animation. In its simplest conception, we start at some base node (usually the pelvis) and propagate transforms to the descendants. Each joint has its own local coordinate system, which we will use to propagate transforms in a hierarchical manner. The pelvis will only have its own transform, whereas the femur will stack both the pelvis and the local transform, and so on until we reach our toes. You can see the hierarchical nature of FK pictured in Figure 15.5. Figure 15.5. Left-to-right, sequence of FK animation, starting from the pelvis and moving downward to the right foot.FK requires three components: the skeleton, the base mesh, and the animation cycles. To begin with, the skeleton defines the hierarchy of body parts that constitute one complete character. A simple skeleton can consist of about 18 bones:
But more bones can be added as needed: Fingers and facial muscles can all be simulated using extra bones. As an example, we will review bone-based facial animation at the end of this chapter. The second element of an FK animation system is the mesh, which is affected by the underlying bones. Skeletal systems usually employ single-skinned meshes, so continuity is preserved. Additionally, they can choose to let each vertex in the mesh be influenced by one and only one bone in the skeleton or to allow for multibone influences for smoother animation. This is sometimes not very relevant in highly rigid body parts like a finger, but a chest, or even more so, a face, will definitely show improved quality by using multibone influences. Old titles, like the original Tomb Raider, did not use a single-skinned mesh, but instead used a list of separate body parts: Head, arms, legs, and so on were all divided. This way identifying vertex-bone relationships became trivial. Each object was coupled with one bone and only one bone. But this technique was quickly abandoned because gaps appeared as the animation was applied, and the body did not hold together sometimes. Skeletons and meshes are usually stored in a reference pose, which involves horizontal arms to each side of the body, palms-down, and legs straight vertically. We will assume this pose does not have any joint rotation component. Then, animation cycles simply store joint rotations. Most joints are monoaxial (like the elbow, for example), whereas others are biaxial (a shoulder can yaw and pitch, but cannot roll). A third group allows the three degrees of freedom, like the neck or wrist. At runtime, we carry out a two-step process. We first reconstruct the matrices for each body part. To do so, we begin at the base node (pelvis) and follow the skeletal hierarchy, stacking each joint's transforms. Again, this can be achieved easily both with regular matrices or quaternions (see Chapter 16, "Cinematography"). If you choose matrices, be aware that gimbal lock may arise, especially in joints that can rotate over more than one axis. If this is the case, switch to quaternions instead. Once we have recomputed the condensed matrix for each joint, we need to apply matrices to vertices. This process is the most time-consuming, and the one we need to optimize carefully to reach optimal performance. After all, it's nothing more than matrix-vector multiplies. Logically, skeletal animation has a much lower memory footprint than frame-based methods. We only need to store each joint's orientation, and the run-time engine will take care of reconstructing the mesh. Math of Skeletal AnimationLet's now assume we have a character mesh where each vertex is influenced by one or more bones. We will also assume each bone has three degrees of freedom (roll, pitch, and yaw). This can be optimized easily later on for joints with less degrees of freedom. Additionally, bones are laid out in a hierarchy, so we can access ascendants and descendants easily. The algorithm works as follows: We first process the skeleton and compute the actual transform for each vertex. To do so, we begin at the root, which is assigned the transform specified by its roll, pitch, and yaw. Then, we move into the skeleton, chaining transforms along the way. A first degree descendant of the base node will have the base transform and the descendant's own transform applied. The process is recursed until all bones have been processed in a treelike fashion. Let's examine the specifics. We start at the base bone. Then, we perform a nested loop. The outer loop scans the bones that are direct descendants from the one we called the routine with. If we start with the pelvis, we will probably have three bones to scan: the lower back, the right femur, and the left femur. We then need to scan each vertex in the mesh and compute whether the bone we are examining actually influences that vertex. This is performed by a weight function. The weight function usually comes from your modeling package of choice as a series of values that determine which bones influence which vertices and how. But if you don't have such a function, you can cook your own. All you need to do is compute weights based on the distance from the vertex to the bone (a line segment). Computing segment-point distances is not very hard. Store the segment in parametric equation form, such as: X= X1 + (X2-X1)*t Y= Y1 + (Y2-Y1)*t Z= Z1 + (Z2-Z1)*t Thus the segment is defined with t-values in the 0..1 range. Then, solve the point-line test, computing not the actual point, but the t-value assigned to it. Three cases are possible:
Then, the weight function can take the form Weight = k/distancen with k determining the scale of the function, and n controlling the falloff speed. For human-sized characters, values of 1 and 4, respectively, do the job remarkably well. Back to our main algorithm, we now have two loops. The outer loops bones, and the inner iterates vertices and computes bone-vertex weights. Then, if the weight is below a threshold, we can forget about it and leave the vertex unaffected by that bone. If, on the contrary, we detect a significant weight, it is time to transform the vertex. To do so, we need to first translate it with regard to the bone's pivot point. Because rotations are centered around the origin, we need to make sure the bone is properly centered before the math can actually start. We can then stack the transform, either in quaternion or matrix form. At this stage, we actually apply the transform from the bone to the vertex. This involves both the point versus matrix/quaternion multiply and scaling the processed vertex by the bone weight. Remember that other bones further along the hierarchy can affect this vertex as well, so we need to accumulate results from each analyzed bone. In addition, we can undo the translate transform so the bone and vertices are referenced to world coordinates again. This is a recursive algorithm. If the current bone has any descendants, we must call the same routine, so more transforms are weighted in and incorporated into the solution. In the end, once terminal nodes are reached, the whole mesh will have been affected by different bones. Here is the complete pseudocode for this routine: deform (bone root, mesh originaldata, mesh processeddata) for each children of root for each vertex in the original mesh compute weight from bone to vertex if weight > threshold translate vertex by negated bone pivot (place pivot at origin) multiply vertex by joint rotations scale the resulting point by the bone weight translate by bone pivot increment the definitive value of such vertex by this solution end if end for if this bone has children deform (children of this node, original mesh,processed mesh) end if end for Now, the ideal way to perform this on a single-skinned continuous mesh is to store vertices indexed (so less processing is needed) and then paint them using a call to DrawPrimitiveIndexed (Direct3D) or DrawElements (OpenGL). DirectX programmers will have the advantage of internal support for quaternions, but overall the algorithm is not very complex. However, this is a poor solution in terms of performance. Several stages can be improved by thinking a bit. For example, it makes no sense to compute vertex weights on a per frame basis for all bones and vertices. It is just a waste of resources. In the end, the weight is a number in the 0..1 range, which can be easily quantized to a byte. Assuming V vertices and B bones, we could store the weights as a VxB matrix. As an example, such a matrix for a character consisting of 1,000 vertices and 20 bones (a reasonable amount by any of today's standards: notice I said 1,000 vertices, not 1,000 triangles) would occupy around 19KB, which is fairly reasonable. Remember that the weight computation is pretty intensive because there are segment-point tests. Preprocessing all that information looks like a pretty good idea, because we can turn it into a simple table lookup. Hardware-Assisted Skeletal AnimationAnother interesting idea is to rearrange your code so you actually store matrices as you go instead of vertex coordinates. This way you can actually cache results for bones that did not change from one frame to the next, and so on. The source code would use an array of matrix lists, so we can store which matrices affect which vertices. It would look something like this: deform (bone root, mesh originaldata, mesh processeddata) for each children of root for each vertex in the original mesh compute weight from bone to vertex if weight > threshold translate vertex by negated bone pivot (place pivot at origin) multiply vertex by joint rotations translate by bone pivot add the condensed matrix to the list of matrices for the vertex end if end for if this bone has children deform (children of this node, original mesh,processed mesh) end if end for The main advantage of this technique is twofold. First, we can cache partial bone structures and their matrices for parts of the body that did not actually change. Second, we can take advantage of better animation support in later versions of Direct3D (version 7 and later). In Direct3D 9, for example, we can use the SetTransform interface to, in the end, stack matrices on a per vertex basis. The code is as follows: for (vert=0;vert<numvertex;vert++) { // I assume four matrices per vertex d3ddev->SetTransform(D3DTS_WORLDMATRIX(0),matrixarray[vert][0]); d3ddev->SetTransform(D3DTS_WORLDMATRIX(1),matrixarray[vert][1]); d3ddev->SetTransform(D3DTS_WORLDMATRIX(2),matrixarray[vert][2]); d3ddev->SetTransform(D3DTS_WORLDMATRIX(3),matrixarray[vert][3]); d3ddev.DrawPrimitive(D3DPT_TRIANGLELIST,FVF_VERTEX,vertex[vert],3,0); } And vertex weights are passed as part of the Flexible-Vertex-Format (FVF). A final improvement over this approach is to implement all the animation pipeline in a vertex shader. If we follow the previous approach and accumulate the matrices required for each vertex, we can animate on the hardware by passing the mesh in base position and the matrices on a per vertex basis. In the Cg programming language, for example, we can do so with a very simple shader. We receive vertices along with four matrices, which are multiplied and weighted to compute the final sum. Here is the source code for such an approach: struct inputs { float4 position : POSITION; float4 weights : BLENDWEIGHT; float4 normal : NORMAL; float4 matrixIndices : TESSFACTOR; float4 numBones : SPECULAR; }; struct outputs { float4 hPosition : POSITION; float4 color : COLOR0; }; outputs main(inputs IN, uniform float4x4 modelViewProj, uniform float3x4 boneMatrices[30], uniform float4 color, uniform float4 lightPos) { outputs OUT; float4 index = IN.matrixIndices; float4 weight = IN.weights; float4 position; float3 normal; for (float i = 0; i < IN.numBones.x; i += 1) { // transform the offset by bone i position = position + weight.x * float4(mul(boneMatrices[index.x], IN.position).xyz, 1. 0); // transform normal by bone i normal = normal + weight.x * mul((float3x3)boneMatrices[index.x], IN.normal.xyz).xyz; // shift over the index/weight variables, this moves the index and // weight for the current bone into the .x component of the // index and weight variables index = index.yzwx; weight = weight.yzwx; } normal = normalize(normal); OUT.hPosition = mul(modelViewProj, position); OUT.color = dot(normal, lightPos.xyz) * color; return OUT; } Given the CPU hit of any skeletal animation approach, this new hardware-skinning approach will likely become a de facto standard because we get skeletal animation at a very low cost. All the heavy weight lifting is performed on the GPU. |