Facial Animation Basic Concepts

Facial Expressions

The most basic idea in the facial animation system is the facial expression. A facial expression can be anything the face can do, from nodding to closing the eyes to an expression of complete terror. It can be as simple or as complicated as required.

When a facial expression is applied to a model, it alters the model's appearance in some way. The expression can be applied to differing degrees, ranging from -1 to 1. A value of 0 means the expression has no effect, a value of 1 means the expression has its full effect, and a value of -1 applies the opposite of whatever the expression does.

Later on we will see that a facial animation sequence involves specifying how these expressions should be applied at each point on a timeline using curves.

Primitive Expressions

Expressions can be either primitives or compounds. A primitive expression changes only a single thing about the face. Currently there are three supported types of primitive expressions:

  • Vertex morphs.
  • Bone movements.
  • Attachment movements.

A vertex morph is a way of altering the positions of vertices in a mesh directly, without affecting the skeleton. The vertices are moved from their current positions towards a morph target. If the expression value is set to 1, the morph target is fully applied and the model appears as the morph target. For values between 0 and 1, the vertex positions are linearly interpolated to provide an intermediate appearance. A negative expression value results in linear extrapolation in the opposite direction (i.e. the vertices are moved away from their morph target positions).
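
As a rough sketch of the interpolation described above, the following applies a morph target to a set of base vertices using a weight in the range -1 to 1. The types and function names are illustrative only, not part of the engine's API.

    // Minimal sketch of applying a vertex morph (illustrative names, not engine API).
    #include <cstddef>
    #include <vector>

    struct Vec3 { float x, y, z; };

    // Moves each base vertex towards (or, for negative weights, away from)
    // its morph-target position. weight is the expression value in [-1, 1].
    // Assumes base and target have the same number of vertices.
    void ApplyMorph(const std::vector<Vec3>& base,
                    const std::vector<Vec3>& target,
                    float weight,
                    std::vector<Vec3>& out)
    {
        out.resize(base.size());
        for (std::size_t i = 0; i < base.size(); ++i)
        {
            // Plain linear interpolation; weight < 0 extrapolates past the base pose.
            out[i].x = base[i].x + weight * (target[i].x - base[i].x);
            out[i].y = base[i].y + weight * (target[i].y - base[i].y);
            out[i].z = base[i].z + weight * (target[i].z - base[i].z);
        }
    }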

Morphs are suitable for many aspects of facial animation. For instance, they could be used to create an animation of a character puffing out his cheeks, furrowing his brow or wrinkling his nose. These things are not easily done using skeletal animation.

Morphs are generally set up in the modeling package, and are stored as part of the 3D model file. They are generally referred to by name.

A bone movement expression involves rotating a certain bone in the skeleton in a given direction and by a certain amount. Any bone in the skeleton can be moved, even those that are not part of the face. The rotation can be specified as a set of three values representing the rotations around each axis. The actual rotation that is applied to the bone is interpolated based on the expression value.
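
A minimal sketch of how a bone movement expression might scale its rotation by the expression value is shown below. The structures, and the use of per-axis rotations in degrees, are assumptions made for illustration rather than the engine's actual data layout.

    // Minimal sketch of a bone movement expression (illustrative, not engine API).
    #include <string>

    struct EulerRotation { float x, y, z; };  // rotation around each axis, in degrees

    struct BoneExpression
    {
        std::string boneName;        // any bone in the skeleton, not just facial bones
        EulerRotation fullRotation;  // rotation applied when the expression value is 1
    };

    // Returns the rotation to apply for a given expression value in [-1, 1];
    // a negative value rotates the bone in the opposite direction.
    EulerRotation EvaluateBoneExpression(const BoneExpression& expr, float value)
    {
        return { expr.fullRotation.x * value,
                 expr.fullRotation.y * value,
                 expr.fullRotation.z * value };
    }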

Attachment movement expressions allow us to move attachments.

Since head models are generally attached as skin attachments, they share the main skeleton of the body at runtime. This means that moving the bones can affect not only the main character model but also the head (so in this case bone movement expressions can be used instead of attachment movement expressions).

Compound Expressions

Compound expressions allow us to combine primitive expressions into more sophisticated ones. A compound expression, like any other expression, can be applied with a value between -1 and 1. The compound then specifies which sub-expressions to play, and to what degree, based on this input value.

A compound comprises a list of sub-expressions, and for each of these there is an associated curve. This curve maps the input expression value to the expression value of the sub-expression.

For instance, an angry expression might be made up of two sub-expressions: a vertex morph that furrows the brow, and a second morph that purses the lips. When we are completely angry, we might want the brow to be fully furrowed. Therefore the curve that maps the input expression value to the brow expression value could be a straight diagonal line, for which the value at -1 is -1 and the value at 1 is 1. In this case whatever value we set the compound expression to will also be the value we set the primitive expression to.

However, it might be that the lip-pursing expression is too strong for our taste. We might prefer that when we are fully angry the lips are shown only half pursed. To do this we can set the curve for the lip pursing to be a straight line such that an input value of -1 results in an output of -0.5, and an input of 1 results in an output of 0.5.
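
The sketch below shows one way the evaluation described above could look in code. The structures, the use of callable curves, and the expression names in the example are illustrative assumptions rather than the engine's real interface.

    // Minimal sketch of evaluating a compound expression (illustrative, not engine API).
    #include <functional>
    #include <string>
    #include <utility>
    #include <vector>

    // A curve maps the compound's input value to a sub-expression value.
    using Curve = std::function<float(float)>;

    struct SubExpression
    {
        std::string name;  // name of the primitive (or nested compound) expression
        Curve curve;       // maps compound value -> sub-expression value
    };

    struct CompoundExpression
    {
        std::vector<SubExpression> subs;
    };

    // Returns the value each sub-expression should be set to for a given input.
    std::vector<std::pair<std::string, float>>
    EvaluateCompound(const CompoundExpression& compound, float inputValue)
    {
        std::vector<std::pair<std::string, float>> result;
        for (const SubExpression& sub : compound.subs)
            result.push_back({ sub.name, sub.curve(inputValue) });
        return result;
    }

    // Example: the "angry" compound from the text (hypothetical expression names).
    // CompoundExpression angry;
    // angry.subs.push_back({ "brow_furrow", [](float v) { return v; } });        // 1 -> 1
    // angry.subs.push_back({ "lip_purse",   [](float v) { return 0.5f * v; } }); // 1 -> 0.5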

Animation Sequences

We have seen how facial expressions can be used to alter the appearance of the facial model. We can apply a facial expression with a value from -1 to 1, and the expression specifies how the model should be altered.

However this only deals with the appearance of the model at a given time. To achieve animation, we need to specify how much to apply each expression at each point in time. To do this we need to create an animation sequence.

A sequence comprises a list of expressions, and for each expression there is a curve that maps time to the value of that expression at that time. When the engine needs to render the facial model at a given point on the timeline, it queries the sequence to determine what value to set each expression to. Each expression and its corresponding curve is referred to as a channel.
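
As an illustration of how a sequence might be sampled, the sketch below evaluates a set of channels at a time t, using simple piecewise-linear curves. The structures and the choice of linear keyframe interpolation are assumptions made for the example, not the engine's actual implementation.

    // Minimal sketch of evaluating an animation sequence at a point in time.
    #include <cstddef>
    #include <string>
    #include <utility>
    #include <vector>

    struct Key { float time; float value; };

    // A channel pairs an expression with a curve mapping time -> expression value.
    struct Channel
    {
        std::string expressionName;
        std::vector<Key> keys;  // sorted by time; piecewise-linear for simplicity

        float Sample(float t) const
        {
            if (keys.empty()) return 0.0f;
            if (t <= keys.front().time) return keys.front().value;
            if (t >= keys.back().time)  return keys.back().value;
            for (std::size_t i = 1; i < keys.size(); ++i)
            {
                if (t <= keys[i].time)
                {
                    float span = keys[i].time - keys[i - 1].time;
                    float a = span > 0.0f ? (t - keys[i - 1].time) / span : 1.0f;
                    return keys[i - 1].value + a * (keys[i].value - keys[i - 1].value);
                }
            }
            return keys.back().value;
        }
    };

    // At render time the sequence is queried for the value of every channel.
    void EvaluateSequence(const std::vector<Channel>& channels, float t,
                          std::vector<std::pair<std::string, float>>& outValues)
    {
        outValues.clear();
        for (const Channel& ch : channels)
            outValues.push_back({ ch.expressionName, ch.Sample(t) });
    }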

In addition to specifying the value for each expression, the sequence can choose to play an expression with balance. Balance is a way of causing morphs to affect one side of the face only. For instance, we can use balance on an expression that raises the eyebrows so that only one eyebrow is raised. This technique can be used to reduce the number of morphs that are required in the model.

Procedural Animation

In an effort to provide more life-like behavior for characters, the system provides a procedural animation system. Note that this is different from the procedural animation sequence channel described below: that channel works as part of a sequence and is updated as the sequence is updated, whereas the procedural animation system is independent of any sequence and updates in real time.

The procedural animation system plays various expressions by name, including blinks and yawns.
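
One plausible shape for such a system is sketched below: a per-frame update that fires a named expression, such as a blink, at randomized intervals. The timing values, the expression name, and the PlayExpression callback are invented for illustration and do not reflect the engine's actual behavior.

    // Hypothetical sketch of a procedural blink, independent of any sequence.
    #include <cstdlib>
    #include <string>

    struct ProceduralBlink
    {
        float timeToNextBlink = 3.0f;  // seconds until the next blink fires (assumed)

        // Called every frame with the elapsed time; PlayExpression is a stand-in
        // for however the system triggers a named expression.
        void Update(float dt, void (*PlayExpression)(const std::string& name))
        {
            timeToNextBlink -= dt;
            if (timeToNextBlink <= 0.0f)
            {
                PlayExpression("blink");  // hypothetical expression name
                // Re-arm with a randomized interval between 2 and 6 seconds.
                timeToNextBlink = 2.0f + 4.0f * (std::rand() / static_cast<float>(RAND_MAX));
            }
        }
    };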

Lip Synching

A crucial part of facial animation is lip synching. This involves playing expressions that represent the basic mouth movements that a human makes when speaking. This could be done manually, by adding a channel for each phoneme expression and adjusting the curves by hand, but that would be highly labor-intensive and also inefficient, as many phoneme expressions are required. Instead, the system provides special support for this process.

Speech in the facial animation system is made up of phonemes. A phoneme is a movement of the mouth that is used to create a sound. Each phoneme needs a corresponding expression that performs that mouth movement. Lip synching basically involves playing the correct phoneme expressions at the correct times so that the mouth appears to match the sound in a natural way.

The editor uses a library to analyze speech sounds and extract a list of phonemes that a human making those sounds would use. During playback of the animation the system plays particular expressions corresponding to each phoneme in order to move the lips accordingly. The system requires that appropriate expressions for each phoneme be created and added to the expression library in order for it to work.
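
To make the flow concrete, the sketch below maps a list of extracted phoneme timings to expression names from the library and picks the expression that should be active at a given time. The data layout, the phoneme labels, and the hard switching between phonemes are simplifying assumptions; a real implementation would likely blend neighbouring phonemes and store the data differently.

    // Illustrative sketch of driving lip sync from extracted phonemes.
    #include <map>
    #include <string>
    #include <vector>

    struct PhonemeEvent
    {
        std::string phoneme;  // e.g. "AA", "M", "F" -- produced by the analysis step
        float startTime;      // seconds into the speech sound file
        float endTime;
    };

    // Each phoneme must have a corresponding expression in the expression library.
    using PhonemeExpressionMap = std::map<std::string, std::string>;

    // Returns the expression to play (at full value) at time t, or "" for silence.
    std::string ActivePhonemeExpression(const std::vector<PhonemeEvent>& events,
                                        const PhonemeExpressionMap& mapping,
                                        float t)
    {
        for (const PhonemeEvent& e : events)
        {
            if (t >= e.startTime && t < e.endTime)
            {
                auto it = mapping.find(e.phoneme);
                return (it != mapping.end()) ? it->second : std::string();
            }
        }
        return std::string();
    }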

The phoneme extraction process can be performed on any speech sound file. However, the quality of the result is greatly improved if matching text is provided.

Once the phoneme extraction is performed, the user is able to manipulate the resultant phonemes to clean up any inaccuracies he or she perceives. Doing this can further increase the quality of the result.

Joysticks

The previous section describes how animation sequences are made up of expressions and corresponding time-dependent curves. These curves can be edited directly, but there is another, sometimes more convenient, way: virtual joysticks.

A joystick is a two-dimensional box with a knob. The knob can be dragged around inside the box, and by doing so the values of expressions can be set.

Each axis of the joystick can be assigned to a channel in the sequence. New joysticks can be created, named, and used to control a particular aspect of the animation in this way. For example, one joystick could be used to control the brows: the vertical axis could control the raising or lowering of the brows, and the horizontal axis could control the balance that is applied to the sequence, so that only one eyebrow is raised or lowered.
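
A minimal sketch of this mapping is shown below, with the knob position on each axis written directly into an assigned channel. The structure and the SetChannelValue callback are hypothetical stand-ins for the editor's actual data model.

    // Illustrative sketch of a virtual joystick driving two channels.
    #include <string>

    struct Joystick
    {
        std::string name;              // e.g. "Brows"
        std::string verticalChannel;   // channel driven by the knob's Y position
        std::string horizontalChannel; // channel driven by the knob's X position
        float knobX = 0.0f;            // knob position, each axis in [-1, 1]
        float knobY = 0.0f;
    };

    // Writes the knob position into the assigned channels at the current time.
    // SetChannelValue is a stand-in for however the sequence stores keyed values.
    void ApplyJoystick(const Joystick& j,
                       void (*SetChannelValue)(const std::string& channel, float value))
    {
        SetChannelValue(j.verticalChannel, j.knobY);
        SetChannelValue(j.horizontalChannel, j.knobX);
    }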

Further Reading

See: Facial Animation Asset Files