Immersive experiences require sophisticated animation systems. Characters, objects and environments must move and interact realistically and responsively; without that, people cannot fully enjoy XR experiences. Given the processing constraints of current hardware, efficiency is paramount to achieving good animation. That necessitates optimized animation pipelines that minimize computational requirements while maximizing visual fidelity.
What do we mean by an animation pipeline in the context of extended reality? It encompasses the full workflow, from creating 3D assets and characters through rigging, animating, blending and editing animation, up to rendering and displaying the final animated result. There are many optimization opportunities throughout this pipeline, from concept to runtime, for reducing the resources animation demands.
When wearing a VR or AR headset, users expect interactions and movements within the virtual or augmented environment to occur as quickly as possible, with minimal latency between input and response. Most current devices have relatively limited processing capabilities. For example, smartphone processors power many AR applications and standalone VR headsets. Such chips have lower clock speeds, fewer CPU (Central Processing Unit) cores, and less dedicated graphics power compared to desktop and console hardware.
Creating detailed and responsive animations that match the quality of flatscreen games and videos is a significant challenge for XR developers. Even simple character movements or actions in virtual worlds can consume a large amount of processing time without optimization. Complex interactions between multiple animated characters or objects still stress current XR platforms. What we need are comprehensive pipelines designed from the ground up with efficiency and performance in mind. They are crucial for any XR application with substantial animation requirements, such as OWNverse.
The Problem with Complexity
When designing 3D characters and environments for XR, we must consider the complexity and polygon count of models from the start. Every additional triangle that must be skinned, animated and rendered adds to resource demands. Similarly, character riggers must balance the flexibility and range of motion enabled by a rig with its complexity and memory footprint. Optimized skinning methods are also important to reduce computational costs. During animation editing, techniques like motion matching, machine learning and automatic animation retargeting can create animations with significantly less manual labor compared to traditional keyframe animation. Such generated animations still need review to ensure quality before deployment. Caching and compression algorithms can then store up to 80–90% less animation data with minimal losses in fidelity.
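As a rough illustration of how such compression achieves its savings, here is a minimal sketch (not any specific engine's format; the function names are hypothetical) that quantizes 32-bit float keyframe values into 16-bit integers, cutting a channel's storage roughly in half while keeping the reconstruction error imperceptibly small:

```python
def quantize_keys(values, bits=16):
    """Quantize float keyframe values to fixed-point integers.

    Instead of a 32-bit float per key, store the channel's
    min/max range once plus one small integer per key."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0          # avoid division by zero for flat channels
    scale = (1 << bits) - 1
    return lo, hi, [round((v - lo) / span * scale) for v in values]

def dequantize_keys(lo, hi, keys, bits=16):
    """Reconstruct approximate float keyframe values at load time."""
    span = (hi - lo) or 1.0
    scale = (1 << bits) - 1
    return [lo + k / scale * span for k in keys]
```

Real engines layer further tricks on top (range partitioning per bone, curve fitting, delta encoding), but quantization of this kind is where much of the 80–90% saving comes from.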
Concept Creation
The concepts and initial designs for virtual characters, objects and environments set the foundation for an efficient animation pipeline in XR. The choices of asset creators have ramifications throughout the entire workflow. Prioritizing optimization from the start is therefore a go-to strategy.
When designing 3D concepts for XR, the first consideration should be minimizing the eventual resource demands of animation. Asset managers work to constrain polygon counts, UV unwrapping techniques, and texture resolutions based on the performance budgets for a given experience. As a rule of thumb, simpler concepts with fewer polygons, UV shells and materials create less strain.
Concept artists and 3D modelers leverage a variety of techniques for block modeling assets and characters, allowing for precise control over edge loops, smoothing groups and overall silhouette before high resolution details are added. This way they can keep poly counts reasonable from the start. Programs like Blender, 3ds Max and Maya are heavily utilized for creating efficient asset concepts. With these suites, it is possible to pay particular attention to modeling techniques that optimize quad geometry and generate appropriate edge loops for animation deformation. Modelers finesse concepts through numerous iterations to achieve a balance between optimization and expressiveness.
Texturing is also an important consideration at the concept stage. Image-based lighting, PBR texture sets and baked textures are configured with performance in mind. Normal maps, ambient occlusion maps and roughness maps are carefully authored at appropriate resolutions and bit depths. Procedural texturing is additionally leveraged to generate certain high-frequency details, which significantly decreases asset size and memory requirements compared to traditional textures.
Concept art likewise informs the development of optimized XR characters and environments. In particular, 2D turnaround views clearly indicate the silhouettes, proportions and ranges of motion envisioned for 3D assets before they are modeled in software. This is to ensure that virtual concepts achieve their intended purpose in a performant manner. Concept artists’ drawings help establish animation constraints within the planned context of a scene or experience.
All these techniques come together to define assets and characters that can eventually be rigged, skinned and animated with maximum efficiency. But no matter how well optimized a concept may be, riggers and animators ultimately determine whether its potential for performant animation is fully realized. Therefore, optimization-focused workflows for 3D concepts are just the first step toward an efficient animation pipeline.
Constructing the Motions — Rigging, Animating, Blending, Layering
Rigging character models efficiently is paramount. Character rigs are constructed with joints, which are connected in a hierarchy and define the articulation of the 3D model. Riggers must decide the skinning method that binds joints to the model’s mesh, choosing between simple methods like linear or dual quaternion skinning for efficiency, and more complex methods for higher quality. The number and type of joints are optimized to reduce the computational cost of skeleton calculations and skinning. Blend shapes are occasionally used to control facial expressions or other deformations.
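To make the skinning trade-off concrete, here is a minimal, hypothetical sketch of linear blend skinning in 2D: each skinned vertex is the weight-averaged result of transforming it by every joint that influences it. The function names and the rigid-transform representation are illustrative, not from any particular engine:

```python
import math

def apply(xform, p):
    """Apply a 2D rigid transform (angle, tx, ty) to point p."""
    a, tx, ty = xform
    c, s = math.cos(a), math.sin(a)
    x, y = p
    return (c * x - s * y + tx, s * x + c * y + ty)

def linear_blend_skin(vertex, joint_xforms, weights):
    """Linear blend skinning: the skinned position is the
    weighted sum of the vertex transformed by each joint."""
    sx = sy = 0.0
    for xform, w in zip(joint_xforms, weights):
        px, py = apply(xform, vertex)
        sx += w * px
        sy += w * py
    return (sx, sy)

# A vertex halfway between an identity joint and a joint
# translated one unit in y ends up pulled halfway along:
blended = linear_blend_skin((1.0, 0.0),
                            [(0.0, 0.0, 0.0), (0.0, 0.0, 1.0)],
                            [0.5, 0.5])
```

The cost is a handful of multiply-adds per vertex per influencing joint, which is why capping influences per vertex (commonly to four) is a standard optimization.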
For animation, the common options are motion capture data from suits or external cameras and traditional keyframe animation. Keyframing enables the greatest control but is time-intensive, while motion capture provides higher realism at faster speeds. Machine learning techniques are being explored to automate parts of the animation process, for example generating in-between keyframes from motion capture data. To maximize efficiency at this stage, focus on the most perceptible motions of assets and simplify or omit unnecessary movements.
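The in-betweening idea can be sketched simply: keyframes store sparse poses, and intermediate values are generated by interpolation at sample time. This hypothetical helper linearly interpolates a single scalar channel (real rigs would use quaternion slerp for rotations):

```python
from bisect import bisect_right

def sample(keyframes, t):
    """Sample an animation channel at time t by linearly
    interpolating between the two surrounding keyframes.

    keyframes: time-sorted list of (time, value) pairs;
    times outside the range clamp to the end values."""
    times = [k[0] for k in keyframes]
    i = bisect_right(times, t)
    if i == 0:
        return keyframes[0][1]
    if i == len(keyframes):
        return keyframes[-1][1]
    (t0, v0), (t1, v1) = keyframes[i - 1], keyframes[i]
    u = (t - t0) / (t1 - t0)
    return v0 + u * (v1 - v0)
```

Because only the keyframes are stored, the channel's memory cost is independent of the playback frame rate.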
Different animation layers, such as those for limbs, facial expressions and cloth simulation, are blended together using various techniques. Blend shapes are lightweight and efficient for blending facial expressions. Additive layering causes animations to combine through simple summation, while layer prioritization determines which animations overwrite others. Together, these layering methods bring reusability and flexibility into the final assets.
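A minimal sketch of additive layering and priority-based overriding, with hypothetical names and a pose represented as a joint-to-angle dictionary:

```python
def compose_pose(base, layers):
    """Combine animation layers into a final pose.

    base: dict of joint -> angle for the base layer.
    layers: list of (mode, pose) pairs, applied in order, so
    later entries have higher priority. Mode "additive" sums
    angles onto the current pose; mode "override" replaces
    the current value for the joints it touches."""
    pose = dict(base)
    for mode, layer in layers:
        for joint, angle in layer.items():
            if mode == "additive":
                pose[joint] = pose.get(joint, 0.0) + angle
            else:  # "override": higher-priority layer wins
                pose[joint] = angle
    return pose

# A breathing layer adds onto the spine while a facial layer
# overrides the jaw, leaving other joints untouched:
final = compose_pose({"spine": 0.1, "jaw": 0.0},
                     [("additive", {"spine": 0.05}),
                      ("override", {"jaw": 0.3})])
```

Because each layer only touches the joints it owns, the same limb or facial layers can be recombined across many full-body animations.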
What else to consider in regard to rigging and animation?
Selection of skinning methods — Dual quaternion skinning produces the highest quality results by preserving rotations, but is more complex and computationally expensive. Linear blend skinning is the simplest and fastest method.
Inclusion of machine learning — Techniques like neural networks can be trained on motion capture data to output animations, allowing automated generation of motions with minor variations or transitions between motions.
Motion editing — Animators can use tools to pose characters manually, adjust timing curves, scrub through animations frame-by-frame, and apply post-processing filters.
Performance capture — Real-time motion capture of human actors allows for highly realistic animations, in which data is mapped onto a 3D character rig in real-time.
Layer priorities — Some animation layers, like facial expressions, may be deemed “higher priority” and override motions in lower priority layers when they overlap temporally. This can bring an additional variety of expressive animations.
Building reusability — Having modular animation layers and blend shapes enables individual components to be reused across multiple full-body animations, which improves efficiency.
Keeping XR Characters Running Smoothly — Compression, Optimization and Rendering
To pre-compute and compress animation data for optimized playback in XR applications, we can use several techniques:
- Skeletal compression reduces the number of joints in rigs to store only the essential hierarchy for a given animation, increasing compression ratios.
- Animation-oriented file formats such as FBX and BVH reduce stored data by encoding skeletal transforms and only the vertex data needed to reproduce deformed shapes.
- Texture and material compression helps decrease file sizes.
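A common pre-compute step behind such compression is keyframe reduction: dropping keys that interpolation between their neighbors already reproduces within a tolerance. A greedy, illustrative sketch (names are hypothetical):

```python
def reduce_keys(keyframes, tolerance=1e-3):
    """Drop keyframes that linear interpolation between the
    last kept key and the next raw key reproduces within
    `tolerance`. keyframes: time-sorted (time, value) pairs."""
    if len(keyframes) <= 2:
        return list(keyframes)
    kept = [keyframes[0]]
    for i in range(1, len(keyframes) - 1):
        t0, v0 = kept[-1]
        t1, v1 = keyframes[i]
        t2, v2 = keyframes[i + 1]
        u = (t1 - t0) / (t2 - t0)
        predicted = v0 + u * (v2 - v0)
        if abs(predicted - v1) > tolerance:
            kept.append(keyframes[i])
    kept.append(keyframes[-1])
    return kept
```

On flat or linear channels this collapses an entire curve to two keys, which is where large compression ratios come from on idle and hold poses.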
At runtime, we can perform various optimizations to reduce the load of animations on the XR engine. Animations that are not visible to the user due to occlusion or distance can be culled to save processing. Level of detail techniques adjust the complexity and fidelity of animations based on priorities. Animation retargeting maps animations from one rig to another similar rig to reuse assets. Blend shape optimizations only calculate changing weights at each frame.
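These runtime decisions can be sketched as a simple per-character policy; the distance thresholds below are hypothetical placeholders for an application's real performance budgets:

```python
def animation_lod(distance, visible, budgets=(5.0, 15.0)):
    """Pick an animation update policy for one character.

    Returns "full", "reduced", or "culled".
    `budgets` are assumed distance thresholds (in meters)
    separating full-rate updates, reduced-rate updates,
    and culling entirely."""
    if not visible:
        return "culled"      # occluded or off-screen: skip updates
    near, far = budgets
    if distance <= near:
        return "full"        # full skeleton, updated every frame
    if distance <= far:
        return "reduced"     # fewer joints, lower update rate
    return "culled"          # too distant to matter visually
```

An engine would evaluate this per frame and route each character to the matching update path.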
Animation and rendering are closely integrated to produce a seamless experience. After being processed by the animation system, model and bone transforms are passed to the rendering engine to correctly position meshes and apply materials. Rendering optimizes draw calls and batching to minimize GPU overhead from animated objects. Rendering performance metrics can also feed back into the animation system for further optimization, such as indicating when to stream in higher fidelity animations. For even more optimized animation pipelines and life-like, responsive experiences for users, we offer the following tips for caching, compression and optimization:
Baking — Animation transforms and data can be pre-computed and “baked” into the model to reduce CPU loads at runtime. On the other hand, it limits the interactivity and reusability of the animation.
Skeletal compression — Building on the skeletal compression mentioned above, pruning and compressing a rig’s joint hierarchy can reduce its size by up to 70% without greatly impacting animation quality.
Model LOD (Level of detail) — As distance from the viewer increases, lower fidelity meshes and skeletal structures can be loaded in to reduce draw calls and CPU calculations for animations.
Animation masking — Only parts of the character that are relevant to a particular animation are processed, and other parts are masked out.
Occlusion culling — Algorithms determine which animations are currently occluded from view by other objects, which can then be paused to free up resources.
Animation streaming — Higher fidelity or more complex animations are loaded in on an as-needed basis to reduce memory usage.
Material LOD — As with models, simpler materials can be used on more distant animated objects to reduce the load on the GPU (graphics processing unit).
Rendering batching — Groups of similar animated objects are drawn together in a single call to the GPU, which amortizes the overhead cost across multiple models.
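Animation masking from the tips above can be sketched as a per-joint filter, where masked-out joints simply keep their previous pose (names are illustrative):

```python
def masked_update(pose, new_pose, active_joints):
    """Apply an animation update only to the joints the mask
    marks as relevant; all other joints keep their previous
    values and cost nothing to recompute."""
    return {
        joint: (new_pose[joint] if joint in active_joints else angle)
        for joint, angle in pose.items()
    }

# A waving animation only needs to update the arm; the legs
# keep whatever the locomotion system last produced:
result = masked_update({"arm": 0.0, "leg": 0.2},
                       {"arm": 1.0, "leg": 0.9},
                       {"arm"})
```

In practice the mask is authored per animation clip, so the engine can skip sampling and blending entire bone chains that the clip never touches.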
Takeaway & Future Directions
Several prominent XR applications demonstrate that an optimized animation pipeline is possible. Pokémon Go uses simple skeletal rigs, animation masking and LOD techniques to efficiently animate hundreds of Pokémon on mobile devices. The Oculus Quest utilizes dual quaternion skinning, skeletal compression and CPU/GPU multi-threading to achieve high-fidelity hand animations and gestures at moderate polygon budgets. And Unreal Engine’s Animation Blueprints provide a visual scripting interface that streamlines the creation, editing and reuse of modular animation layers for high efficiency.
Several emerging techniques hold promise to further optimize XR animation:
- AI-powered tools may automate rigging, skinning and retargeting of animations to different characters.
- Cloud computing could allow offloading of complex calculations for detailed animations.
- Hybrid rendering techniques combining rasterization with ray tracing may provide next-level realism while minimizing performance impacts.
- Advances in machine learning are also being applied for tasks like motion graphing, motion style transfer and motion prediction.
In the years ahead, production animation pipelines will likely incorporate greater degrees of automation, parallelization, and AI assistance to create higher quality content within the narrow efficiency confines of XR applications and edge devices. Technologies like foveated rendering and variable rate shading also present great opportunities to optimize animations based on user attention and visual importance. All of this depends on clever engineering and design: the efficiency demands of XR need not constrain the creative potential of virtual characters and environments.
OWNverse specializes in custom XR experiences built on high-quality animation and virtual scenery. For a personalized solution, discuss your case with us.