Cinematic Cameras: Placement Algorithms
We will complete this chapter with an overview of camera placement algorithms for character interaction. Consider this scenario: We have two people speaking, and a third person is approaching them. Which is the ideal location for a camera? Think about it for a second, and you will see it's a nontrivial problem. We must grant visibility to all three characters while keeping the camera as close as we can so no detail is lost. On top of that, we need characters not to occlude each other, so the whole scene is transmitted clearly. Camera placement algorithms have emerged in recent years due to the trend toward greater cinematic values. Many titles are implementing such algorithms already, and many others will surely join them in the near future. I'll provide a survey on how to compute good camera angles, especially when several items (both characters and scenario) must be displayed simultaneously.
The first item to keep in mind is that we will not use full characters but will instead use their bounding volumes for camera computation. Using full characters would be too costly, and as far as results go, the effort would not pay off. Thus, many algorithms will work on spheres or boxes that represent our locations and characters. We then need to compute camera positions and orientation vectors, which help convey the action of the scene. There are some general rules that we will need to implement in order to achieve realistic results. These really boil down to three basic rules:
These rules leave lots of room for experimentation. Using the previous example, we can choose the distance where we will place the camera with respect to the characters. If they were in an empty area, we would place the camera as close as possible, so we could see the action with full details. But if the scenario is somehow relevant, we might choose to move back a bit, so some context information is included in the shot. Let's refine each of our choices a bit.
Selecting the Camera Target
We will begin by aiming the camera in the right direction. But what is "the right direction"? This is basically a storytelling concept: Where is the camera supposed to be facing to cover the shot? There's no golden rule, so let's examine some case-by-case situations.
In a general case, we would begin by targeting the player, who is, after all, the screen alter ego of the gamer. Aiming at the chest of the character ensures that he is well located in the center of the screen, and we get some visibility of his surroundings.
But what if a monster is approaching the character? Maybe we should aim somewhere in the line of interest, which is defined by the character's and the monster's position, much like fighting games target the camera at the midpoint between the two fighters. This way the character shifts closer to the edge of the screen, and the monster is placed on the opposite side. Clearly, any good camera placement algorithm must make sure we not only see the main character, but also everything he is seeing.
A more complex case arises from group situations, such as a conversation between several peers or a small battle. Here the point of interest must be placed around the barycenter of the scene to make sure everyone can fit on the screen. It makes no sense to aim at the person currently speaking because this state will change often, and the camera will move back and forth all the time, surely annoying the player. For these cases, a neutral camera placement is preferred.
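The three targeting cases above (a lone player, a player facing a monster, and a group) can all be treated as a barycenter of points of interest. The following is only a minimal sketch under that assumption; the function name camera_target, the optional weights, and the sample coordinates are hypothetical and not part of any particular engine.

```python
def camera_target(points, weights=None):
    """Return the (optionally weighted) barycenter of (x, y, z) points."""
    if not points:
        raise ValueError("need at least one point of interest")
    if weights is None:
        weights = [1.0] * len(points)
    total = float(sum(weights))
    return tuple(
        sum(w * p[axis] for p, w in zip(points, weights)) / total
        for axis in range(3)
    )

# Hypothetical positions, with y as the up axis.
player_chest = (0.0, 1.4, 0.0)
monster_chest = (6.0, 1.0, 2.0)
bystander_chest = (3.0, 1.4, 5.0)

# Lone player: the barycenter of one point is the point itself.
solo_target = camera_target([player_chest])
# Player and monster: midpoint of the line of interest.
duel_target = camera_target([player_chest, monster_chest])
# Group scene: neutral target at the barycenter of everyone involved.
group_target = camera_target([player_chest, monster_chest, bystander_chest])
```

Because the group target depends on everyone's position rather than on who is speaking, it stays put when the speaker changes, which is the neutral behavior described above.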
Generally speaking, cameras should be moved only when needed. In computer graphics we have far more control over camera placement than a real-world camera operator does. We can swing the camera around as fast as we like, converting our game into a virtual roller coaster ride. This can be seen in many computer graphics-based short films. But we need to understand that the camera should be as unobtrusive as possible, enhancing the gameplay instead of destroying it. Camera movement, and even switching between different cameras, should be kept to a strict minimum.
As an example, imagine a car racing game where the screen follows the action from different ground-based cameras, much like in TV broadcasts. In one camera, the car is seen sideways, advancing toward the right of the screen as it moves along the racecourse. Then, we switch to a different camera, which makes the car advance from the top of the screen to the bottom. The player will get the feeling that the controls have changed, and thus interaction will be degraded. Games are about feeling in control of the gameplay, and switching cameras often destroys that feeling.
Selecting the Relevant Information
We know where we want to aim the camera. Now, how can we ensure that all the interesting information is inside the view cone? This is usually a two-step process that includes 1) understanding what's relevant and what's irrelevant for a given scene and 2) computing an adequate camera frustum so everything fits. The first problem is largely game dependent. We need to tag important elements so we can later use them to place the camera. Generally, the following are considered "relevant" items:
Each of these items must come with a bounding volume. For this algorithm, boxes are preferred for their tight-fitting property. We must then consolidate all bounding boxes into one, which represents the bounding box of the relevant information we want to target. Then, all we need is half a page of trigonometry to compute, given the aperture of the camera, the distance required to fit all the information in it. Notice how we must take into consideration the camera optical parameters (aperture, mainly), scenario characteristics (in the form of the bounding box for the relevant information), and camera view directions. Some scenarios will be larger in one dimension with respect to the others, and thus the distance where we place the camera will also depend on which direction we are aiming from. For example, imagine that you are trying to take a picture of a car from the front or sideways. Clearly, the second situation requires more distance because the object you are targeting covers a larger angle in the picture.
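Thus, the variables for the analysis are as follows:
P: the point of interest the camera is aimed at
V: the view direction
F: the aperture of the camera
B: the bounding box of the relevant information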
We then need to compute a point along the line that passes through P and has direction V so that the whole bounding box B lies within the angular distance defined by F. The algorithm is as follows.
We first create a point along the line emanating from P with V as its direction. Call that point Q. Then, for each vertex in the bounding box, we compute the vector from Q to the vertex and measure the angle between that vector and V. We store the bounding box point that is at maximal angular separation from V. This is the point that will define when we are seeing the whole scene, so we only need to work with this point from now on. Let's call this bounding box point R.
Now we need a point along V that ensures that the point R is at an angular separation from V of exactly F. This is just regular trigonometry. The point is in the form:
X = Px - Vx*t
Y = Py - Vy*t
Z = Pz - Vz*t
And the angular test can be easily added: The angle from V (the axis) to the vector D that runs from the camera point (X, Y, Z) to R must be F. Assuming V is a unit vector, this means:
V dot D = |D|*cos(F)
which is expanded into
Vx*(Rx-X) + Vy*(Ry-Y) + Vz*(Rz-Z) = |D|*cos(F)
Vx*Rx - Vx*X + Vy*Ry - Vy*Y + Vz*Rz - Vz*Z = |D|*cos(F)
Notice how we have four equations and four variables (X, Y, Z, and t), so we can solve this system by substituting the three point equations into the angular one and solving for t. P, V, F, and R are just constants we need to feed into the different equations.
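The distance computation can be sketched in a few lines. This is a sketch under stated assumptions rather than a definitive implementation: boxes are axis-aligned (min corner, max corner) pairs, F is treated as the angle measured from the view axis, and the names merge_boxes, box_corners, and fit_camera are hypothetical. Instead of picking a single point R first, the sketch computes the pull-back every vertex would need and keeps the largest, which is a slight variation on the procedure described above. For each vertex it splits R - P into a component along V and a perpendicular component, so the angular condition reduces to t = perp / tan(F) - along.

```python
import math

def merge_boxes(boxes):
    """Consolidate axis-aligned boxes, each a (min_corner, max_corner) pair."""
    lo = tuple(min(b[0][i] for b in boxes) for i in range(3))
    hi = tuple(max(b[1][i] for b in boxes) for i in range(3))
    return lo, hi

def box_corners(box):
    """Return the eight vertices of an axis-aligned box."""
    lo, hi = box
    return [(x, y, z)
            for x in (lo[0], hi[0])
            for y in (lo[1], hi[1])
            for z in (lo[2], hi[2])]

def fit_camera(p, v, f, box):
    """Return (t, camera) with camera = p - t*v so that every box vertex
    lies within angle f (radians, 0 < f < pi/2) of the view axis v."""
    norm = math.sqrt(sum(c * c for c in v))
    v = tuple(c / norm for c in v)  # make sure V is a unit vector
    t = 0.0  # assumption: the camera is never placed in front of P
    for r in box_corners(box):
        d = tuple(rc - pc for rc, pc in zip(r, p))
        along = sum(dc * vc for dc, vc in zip(d, v))  # part of R-P along V
        perp = math.sqrt(max(sum(dc * dc for dc in d) - along * along, 0.0))
        t = max(t, perp / math.tan(f) - along)  # pull-back this vertex needs
    camera = tuple(pc - t * vc for pc, vc in zip(p, v))
    return t, camera

# Two hypothetical character boxes that together form a cube around the origin.
scene = merge_boxes([((-1, -1, -1), (0, 1, 1)), ((0, -1, -1), (1, 1, 1))])
t, camera = fit_camera((0, 0, 0), (0, 0, -1), math.radians(45), scene)
```

For this cube and a 45-degree angle, the pull-back works out to 1 + sqrt(2), with the binding vertices being the four corners nearest the camera.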
A word of advice: Nobody likes relevant information to lie in an extreme corner of the screen. It is a wise idea to keep a safety zone, so there is a margin between any relevant item and the edge of the screen. This can be easily added to the preceding algorithm by replacing the real aperture of the camera with a value reduced by a set percentage, such as 10–15 percent. This way the camera will move back a bit to make sure we get some space around the different objects in our game world.
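To see the effect of the safety zone in numbers, here is a small sketch. It assumes, purely for illustration, that the relevant information is wrapped in a bounding sphere (which fits inside a cone of half-angle F at a distance of radius / sin(F)) and that F is measured from the view axis. The function names are hypothetical.

```python
import math

def padded_angle(f, margin=0.10):
    """Shrink the angle f (radians) by a fractional safety margin."""
    return f * (1.0 - margin)

def sphere_fit_distance(radius, f):
    """Distance at which a sphere of the given radius fits inside a cone
    whose half-angle is f (radians, 0 < f < pi/2)."""
    return radius / math.sin(f)

f = math.radians(30)
tight = sphere_fit_distance(1.0, f)                      # no margin
safe = sphere_fit_distance(1.0, padded_angle(f, 0.10))   # 10 percent margin
```

With the reduced angle, the computed distance grows, so the camera backs off and leaves free space around the scene.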
Selecting View Angles
Placing the camera at the right distance is a relatively straightforward problem. All we need to do is compute some angular math to do the job. Making sure we select a good view direction is quite a bit harder. We need to ensure that all elements are well proportioned, avoiding occlusions between them. If you review the preceding sections, you'll see that by now we have a point of interest to aim the camera at and a view distance suitable for showing all the objects. That gives us a spherical crust where we can place the camera. Which points in the crust provide a better viewing angle?
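One simple way to explore the spherical crust is to sample it at a handful of azimuth and elevation angles and evaluate each sample as a candidate camera position. The sketch below assumes y is the up axis and uses a single distance for all candidates, even though the fitting distance can vary with direction. The function name and the default step counts are hypothetical.

```python
import math

def sphere_candidates(target, distance, azimuth_steps=8,
                      elevations_deg=(15, 30, 45, 60)):
    """Sample camera positions on the upper part of a sphere around target."""
    candidates = []
    for elev_deg in elevations_deg:
        elev = math.radians(elev_deg)
        for k in range(azimuth_steps):
            az = 2.0 * math.pi * k / azimuth_steps
            x = target[0] + distance * math.cos(elev) * math.cos(az)
            y = target[1] + distance * math.sin(elev)   # height above target
            z = target[2] + distance * math.cos(elev) * math.sin(az)
            candidates.append((x, y, z))
    return candidates

candidates = sphere_candidates((0.0, 1.0, 0.0), 5.0)
```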
From a purely technical standpoint, we can compute the best view direction as the one that generates the fewest occlusions in the scene. But there are other issues to consider: If occlusions were the driving force, most scenes would be shot from a crane at a top-down angle. Obviously, we want as few occlusions as possible while keeping a comfortable, not too vertical viewpoint.
Let's begin by working with the occlusion portion and later add features on top of that. To simplify the math, we can select the position that generates the fewest occlusions by working with average angular separations (AASs) between objects. The AAS is, given a point, a measure of the divergence of the rays emanating from it and heading toward the different objects in a scene. A purely horizontal viewpoint will have a smaller angular separation than a vertical viewpoint, thus recommending vertical over horizontal viewpoints, at least as far as occlusions go.
AAS can be computed in a variety of ways. Because the number of involved elements will usually be small, the algorithm will not imply a significant CPU hit. Here is the pseudocode for such an algorithm:
for each object in the scene
    trace a ray from the viewpoint to the object
end for
average=0
for each ray
    select the closest ray (in terms of angular separation)
    average=average+angular separation between them
end for
average=average/number of rays
The "select the closest ray" step encapsulates most of the algorithm's complexity. For small data sets, a brute-force approach that compares every pair of rays is enough. For larger ones, spatial indexing techniques can improve performance.
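The pseudocode can be turned into a brute-force sketch along these lines. It assumes each object is represented by a single point (for example, the center of its bounding volume) and that the viewpoint does not coincide with any object. The function names are hypothetical.

```python
import math

def _unit(v):
    """Normalize a 3D vector (fails if the vector has zero length)."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def _angle(a, b):
    """Angle in radians between two unit vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return math.acos(max(-1.0, min(1.0, dot)))  # clamp against rounding

def average_angular_separation(viewpoint, objects):
    """Average, over all rays, of the angle to the closest other ray."""
    rays = [_unit(tuple(o - v for o, v in zip(obj, viewpoint)))
            for obj in objects]
    if len(rays) < 2:
        return 0.0  # a single object cannot be occluded by another
    total = 0.0
    for i, ray in enumerate(rays):
        # Brute force: compare this ray against every other ray.
        total += min(_angle(ray, other)
                     for j, other in enumerate(rays) if j != i)
    return total / len(rays)

# Two hypothetical characters standing side by side along the x axis.
characters = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
side_view = average_angular_separation((5.0, 0.0, 0.0), characters)
top_view = average_angular_separation((0.0, 5.0, 0.0), characters)
```

Seen from the side, along the line joining them, the two characters overlap completely and the AAS is zero; seen from above they separate, which matches the observation that vertical viewpoints score higher.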
Now we need to compute the AAS for several points in the scene to discover which one yields better occlusion behavior. Generally speaking, vertical viewpoints have higher AAS values, so we need to compare two camera placements located at the same height for the analysis to provide some meaningful results.
Therefore, we need to modify our AAS value so more horizontal view directions are preferred over higher, vertical shots. One good idea is to use the angle to the ground as a modifier of the computed AAS value. By doing so, you can effectively compute suitable camera locations so the action and gameplay can be followed intuitively.
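One possible way to apply such a modifier is sketched below. The specific weighting, multiplying the AAS by the cosine of the camera's elevation angle, is an assumption chosen for illustration and is not prescribed by the text; any function that decreases as the view becomes more vertical would serve. The sketch assumes y is the up axis and that AAS values have already been computed for each candidate. All names are hypothetical.

```python
import math

def elevation_angle(camera, target):
    """Angle in radians between the ground plane and the camera-target line."""
    dx, dy, dz = (c - t for c, t in zip(camera, target))
    return math.atan2(dy, math.hypot(dx, dz))

def adjusted_score(aas, camera, target):
    """AAS scaled down as the view becomes more vertical (assumed weighting)."""
    return aas * math.cos(elevation_angle(camera, target))

def best_candidate(candidates, target):
    """Pick the (camera_position, aas) pair with the highest adjusted score."""
    return max(candidates, key=lambda c: adjusted_score(c[1], c[0], target))

target = (0.0, 0.0, 0.0)
candidates = [
    ((0.0, 5.0, 0.0), 0.40),  # top-down: best raw AAS, fully vertical
    ((3.0, 4.0, 0.0), 0.35),  # oblique view, slightly lower raw AAS
    ((5.0, 0.5, 0.0), 0.05),  # nearly horizontal, heavy occlusion
]
chosen = best_candidate(candidates, target)
```

With this weighting, the top-down candidate loses despite its higher raw AAS, and the oblique view is selected.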