This piece was first published by Brainstorm in 2022 as their “guide to understanding XR”. The editor of this publication believes it is the most comprehensive white paper about virtual production and XR published to date. It is published with the permission of Brainstorm; the full PDF version can be downloaded here


Virtual production is here to stay. While the digital age has driven profound changes in how television is produced and consumed, the traditional way of doing business in television has also been seriously disrupted by the COVID-19 pandemic.

Broadcasters and production companies must work ever harder to attract and retain viewers; however, the current situation also raises opportunities for virtual production, from remote shooting to virtual events. One of the balancing acts that broadcasters and content providers all over the world have to manage is that of costs versus capabilities, and nowhere is that currently more evident than in the use of Virtual Reality (VR).

Virtual studios are becoming more appealing thanks to the democratization of virtual technology and the availability of sufficiently powerful hardware at a price point that not just the big broadcasters can afford.

Although virtual sets and even Augmented Reality (AR) have been around for decades, new developments in rendering technologies, such as Physically Based Rendering (PBR) and real-time Ray Tracing, together with the arrival of game engines such as Unreal Engine, have brought a new era to virtual production. However, the concept has also diversified with the arrival of new display technologies such as LED videowalls, which provide alternatives to traditional chroma keying. Amazingly enough, semantics are important in this matter, as many authors and vendors use concepts that often only describe partial aspects of VR.

This White Paper, therefore, aims to shed some light on the concepts, the possibilities these technologies provide for virtual production, and the pros and cons of each approach.

The terms “Virtual Reality” and “Augmented Reality” have been used lately to describe ways to enhance visual perspectives or views in a variety of media, such as PCs, headsets, or mobile phones, adding data such as advertising or cultural information to pictures or maps.

Virtual Reality, in its vast variety of flavours, is a great tool for enhancing the information displayed in a broadcast program, adding spectacle to the content, and also for creating artificial worlds that the audience can accept as real. Moreover, the enormous amount of data available for display requires visually attractive ways to present it to the audience at home, which can be done by combining real and virtual images.

Let’s start by assuming that all the synthetic or virtual “realities” we will mention in this document are the result of combining 3D computer-generated (CG) imagery with real feeds, and not just standard chroma keying over images or video. By using 3D CG imagery, the resulting camera views display a fully immersive world where cameras move freely in 3D space, compared to the relatively constrained camera movements of typical video compositing. So, we can start by defining Virtual Reality (VR) as the method of displaying computer-generated images along with real feeds. By extension, traditional chroma keying could be considered a form of Virtual Reality, as the resulting image is obtained by compositing different images, but in this document we will not consider it VR, as we established the requirement of using 3D environments.

It is also interesting to introduce the concept of the Virtuality Continuum, as defined by Paul Milgram and Fumio Kishino in 1994. They defined it as a continuous scale ranging between the completely real (reality) and the completely virtual (virtuality). The reality-virtuality continuum, therefore, encompasses all possible variations and combinations of real and virtual objects.


Virtuality continuum

If we apply this concept to audiovisual content creation, we can consider VR as the “container” for all the other synthetic realities, from virtual sets to AR, ER, XR or IMR, which we will detail and classify based on the virtuality continuum concept.

The term Augmented Reality (AR) has been used to describe ways to enhance visual perspectives or views in a variety of media, such as PCs, GPS devices or mobile phones, adding data such as advertising or cultural information to pictures or maps. We will use this term to describe information graphics applied to any content in television or apps, or when 3D virtual elements are added, in context, on top of real footage. As the original content is “enhanced” by computer-generated information or objects, some use the term Enhanced Reality (ER) instead of AR; however, this term is becoming obsolete and has not been widely adopted.

We can also define Mixed Reality as the result of combining real and virtual imagery, specifically CG objects which “enhance” or “augment” the real footage; however, we may take a wider approach and include 3D virtual sets here as well.

As a side note, some conflate AR with Digital Signage, bringing confusion to both terms. Digital Signage is a method of displaying advertising in public spaces using digital media such as screens, tablets, etc., which, of course, can feature AR to capture its audience if required.

When in-context content is added as AR elements over 3D virtual sets or environments, and not over real footage, we talk about Immersive Mixed Reality (IMR). Some consider this a special case of AR, because the virtual environment can fill the whole scene or just a portion of it (meaning the virtual background could be considered yet another AR element). AR and IMR require interactions between sets, talents, and virtual objects, often created out of external data sources such as statistics, charts, bars, and many others. These data-driven objects allow for visually engaging representations of the data, which can be better explained by the presenters on the set.


As an example, during election nights, news, sports or entertainment shows, data bars and other statistics can interact with the talents, creating an attractive AR environment that is more appealing to the audience.

Extended Reality (XR) can be understood as the concept that encompasses most of the above and can be defined as the combination of real and virtual environments. If we include interactive content consumption in the equation, we should add any human-machine interactions generated by computer technology and wearables. About the term itself, some assume the “X” stands for “eXtended”, while other authors maintain that “X” represents a variable for any current or future spatial computing technology.

In practical terms for broadcast and film content creation, XR is normally understood as the combination of virtual environments projected on an LED wall, and sometimes LED floors, with real characters on stage, sometimes including other graphic elements in a sort of IMR. When using LEDs as background/floor, the camera captures the combined image of the background render and the characters. However, there is no technical reason why the LEDs can’t be replaced by a chroma set, so using one method or the other will depend on what users want to achieve and the available workflows, as we will see later in this document.

In any case, regardless of which technology we use in our VR projects, the perfect integration between real and virtual objects and environments becomes essential, so any VR project must take into account the following aspects:
• Real-time performance.
• Seamless, realistic, and accurate integration between the different real and virtual elements.
• Precise perspective matching between the different elements with regard to the camera view, even when it moves, for which camera tracking is required.
• Photorealistic render quality when possible.
• Properly delivered and set-up data.

All the different types of Virtual Reality require a specific setup, which can differ for each of the different workflows. Some of the typical setups and workflows for the most common VR applications are detailed as follows.



Virtual Sets

A virtual set is a 3D digital background on which we place and integrate real talent, captured by a camera on a chroma set and then keyed out using a chroma keyer (software or hardware). Virtual sets can be trackless, TrackFree™ or tracked, and can use multiple renders per workstation (one workstation rendering the inputs of all cameras) or a render per camera (a workstation rendering only the input of its assigned camera).

In all cases, a camera captures the feed of the talent(s) on a chroma set and sends it to a computer, which renders the keyed feed over a 3D virtual set. More information on the types of virtual sets and camera tracking technology can be found in Brainstorm’s Guide to Virtual Sets (Edition 2).
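As an illustration of what a software chroma keyer does internally, the sketch below keys out a green background with a simple green-dominance matte. This is a generic, simplified approach, not Brainstorm’s actual keying algorithm; the thresholds and channel logic are assumptions for illustration only.

```python
def green_key_alpha(r, g, b, threshold=0.1, softness=0.2):
    """Return matte alpha (0 = fully keyed background, 1 = foreground)
    based on how strongly green dominates the other channels."""
    # Green "dominance": how much g exceeds the larger of r and b.
    dominance = g - max(r, b)
    if dominance <= threshold:
        return 1.0                      # foreground: keep the pixel
    if dominance >= threshold + softness:
        return 0.0                      # background: key out completely
    # Soft edge between the two thresholds (hair, transparencies).
    return 1.0 - (dominance - threshold) / softness

# A pure green-screen pixel is keyed out; a skin-tone pixel is kept.
assert green_key_alpha(0.1, 0.9, 0.1) == 0.0
assert green_key_alpha(0.8, 0.6, 0.5) == 1.0
```

Real keyers add spill suppression, noise handling and per-channel gains, but the core idea, a per-pixel matte derived from color dominance, is the same.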


Trackless virtual sets are those that use fixed cameras to capture the talent, so the position of the real camera in space is known and can be recreated in the 3D scene. This removes the need to calibrate the camera and lens, as opposed to tracked environments, but significantly limits the camera views. Brainstorm’s TrackFree™ technology was designed to take fixed camera setups to a different level, allowing tracked and trackless environments to be mixed. This patented technology takes the signal of fixed or tracked cameras, creates virtual camera views from it, and includes features that provide additional freedom for virtual camera movements, which can resemble real PTZ cameras or even cranes.


The virtual camera views are independent of the live feed and can move freely in 3D space, just as pedestals or cranes do in live production environments, expanding the limits of the physical chroma set, and also allowing the virtual cameras to virtually detach themselves from the real camera view and position in 3D space. Of course, it is possible to use LED walls/floors instead of the chroma screen; however, the workflow may become more complicated, and the results may not be as expected, as we will see later in this document.

VIRTUAL SETS WITH TRACKED CAMERAS

By using tracked cameras, the tracking device transmits the camera’s position in space to the workstation, which uses it to render each frame accurately from the perspective of the camera. Hence, if the camera moves, the render will move accordingly to match the camera view; this applies to spatial movements, focus and zoom, depending on the capabilities of the tracking system.
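To illustrate the perspective matching involved, the snippet below derives the horizontal field of view the virtual camera must adopt from the focal length reported by a tracking system. The pinhole model and the 2/3-inch sensor width are illustrative assumptions, not values taken from any particular tracking protocol or camera.

```python
import math

def horizontal_fov_deg(focal_length_mm, sensor_width_mm=9.59):
    """Horizontal field of view of a pinhole camera from its focal length.
    The 9.59 mm default is an illustrative 2/3-inch broadcast sensor width."""
    return math.degrees(2 * math.atan(sensor_width_mm / (2 * focal_length_mm)))

# As the lens zooms in (longer focal length), the virtual camera must
# narrow its FOV to keep real and rendered perspective aligned.
wide = horizontal_fov_deg(10.0)   # roughly 51 degrees
tele = horizontal_fov_deg(100.0)  # roughly 5.5 degrees
assert wide > tele
```

This is why zoom (and, for depth of field, focus) data from the tracker matters: position alone is not enough to keep the render locked to the real camera view.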

Simple PTZ cameras do not change the camera position in space, although they can track pan and tilt movements and also provide zoom information. However, by using TrackFree™ technology, users can expand these camera movements into space to simulate the movement of the cameras on the set.

More complex tracking systems can also be integrated into pedestals or cranes, allowing free three-dimensional camera movements in space, along with zoom and focus information when present. Tracked virtual sets can use more than one camera, so each camera has a different framing, which allows for more complex productions. As mentioned before, using TrackFree™ technology makes it possible to mix fixed and tracked cameras for increased production flexibility.

One workstation may support multiple camera inputs (Brainstorm InfinitySet can also create multiple simultaneous renders), but for better performance each camera can be assigned to a single workstation, which will then send the render to a studio mixer for live production or recording. In any case, the camera can send the feed directly to the workstation to use an internal software chroma keyer, or to dedicated chroma keying hardware, which will send the video+key image to the workstation. Brainstorm technology allows for any combination of the above: workflows combining single or multiple cameras, tracked or fixed, multiple renders in a single workstation or a render per camera, internal or external chroma keyers, etc.

AUGMENTED REALITY

As defined previously in this document, Augmented Reality (AR) is the in-context integration of information and/or graphics into any television content (or apps), or the addition of 3D virtual elements, in context, on top of real footage captured by the camera. As cameras move freely around the set, camera tracking is mandatory to ensure the added graphics are perfectly integrated into the original footage, in movement, perspective and look.

Advanced AR applications such as InfinitySet can add real reflections or shadows to the synthetic graphics, resulting in better integration with the real scene. The basic configuration for creating AR content starts with a tracked camera that feeds a computer with video and tracking data, so the graphics can be rendered on top of the real images but with correct perspective matching and movement control.


AR objects set extension


IMMERSIVE VIRTUAL REALITY: COMBINING VIRTUAL SETS WITH AR GRAPHICS

Sometimes the real environment is not enough for our content requirements, so we must enhance the real scene with virtual imagery, along with the AR elements. These enhancements can be as simple as the inclusion of a virtual set for the talent to be placed on, or the use of larger virtual elements displayed on top of the real content. The basic configuration for an Immersive Mixed Reality workflow is similar to an Augmented Reality setup, but we must take into account that increasingly complex scenes and objects may require, in most cases, a render-per-camera approach rather than a multi-render approach with a single workstation.

XR WITH LED WALLS/FLOORS

When using LED walls/floors for an XR project, there is no need for a chroma screen, as the background image is directly rendered and displayed on the LEDs, so the camera captures the “composite” scene of the LEDs plus the real talents on the set. The render of the virtual background is sent to the LED walls based on the camera tracking information, to ensure the correct perspective for the camera position and parameters. This, of course, requires careful calibration of the whole setup so there is no noticeable lag between the render and the current camera view/position.
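As a rough sketch of such a calibration exercise, the snippet below adds up per-stage delays and expresses the total in frames, which is the figure that matters when compensating tracking data against the image on the wall. The stage names and millisecond values are invented examples, not measured figures from any real system.

```python
def latency_frames(stage_delays_ms, frame_rate=50.0):
    """Total glass-to-glass lag of the LED pipeline, expressed in frames."""
    total_ms = sum(stage_delays_ms.values())
    frame_ms = 1000.0 / frame_rate
    return total_ms / frame_ms

# Hypothetical per-stage delays for a 50p production (example values only).
pipeline = {
    "tracking_capture": 10.0,   # tracker measurement + transmission
    "render": 20.0,             # one frame of rendering
    "led_processing": 30.0,     # LED controller / wall processing
}
frames = latency_frames(pipeline)   # 60 ms total -> 3 frames at 50p
assert frames == 3.0
```

Knowing this figure lets the system delay or predict the tracking data so the render on the wall matches where the camera actually is when it captures the frame.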

If AR elements need to be added on top of the scene, they can in some cases be included in the background render; but if they have to be displayed in front of the presenter, this requires using the camera feed as the template to burn these elements in, involving additional hardware to render the AR elements.
All the above may result in a number of issues that need to be considered, as will be detailed in the next chapters of this document.
The basic configuration for an LED-based XR setup requires a tracked camera, a render-per-camera setup (recommended), and a workstation to take care of the rendering, including the AR elements if present. The complexity of the scene and the possible in-front AR graphics will define the final hardware requirements, so careful preparation is recommended to ensure the real-time performance of the setup.
Other, more complex scenarios can be set up, and are detailed in the next chapter, LED-based XR: What can we do with it?
The workflows detailed in this chapter are just some examples of what could be required to set up a virtual content environment. The possibilities are endless, limited only by the content creator’s imagination and the resources available. However, as mentioned, every setup has its own particularities, so pre-production analysis and work are essential to ensure we achieve the expected results.



XR LED Multiscreen

Focusing on LED-based XR content, the simplest applications start with displaying a virtual set on the LED screens, a setup that can increase in complexity by including virtual set extensions or in-screen AR, and also additional AR graphics on top of the scene. LED-based XR requires a similar workflow for any of these applications.

The basic setup requires a camera with tracking to provide the camera position in space, workstation(s) for the rendering, and LED video walls to display the render of the virtual background or scene created from the tracking data. The cameras capture the live scene as it is, which becomes the final content.

This basic workflow may include several cameras, workstations, or LED walls depending on the applications. In some cases, the camera does not capture the whole screen, but just a portion of it; in other cases, the camera sees the screen as a “window” that displays an “external” world.
In any case, one of the obvious issues of any LED-based XR workflow is delay: the time elapsed between the capture of the tracking data and the moment the rendered image is displayed on the screen and captured by the camera. If the camera only captures a portion of the LED screen, we just need to render an image that fills the field of view (FOV) of the camera, at the resolution of the camera capture. But in any real situation the camera moves while the render is being produced and sent to the screen, so we need to render a larger image than required for the camera FOV, to fill a “safe projection area” that prevents the camera from seeing a blank, non-rendered area when moving.

This is called the “extra render” and can be created either as a render larger than the camera capture (HDTV, 2K, etc.), or as an interpolated image (extra FOV): an image with the same resolution as the camera output but stretched to fit the larger “safe” projection area.

LED-based XR – Single screen, single camera
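The size of that safe projection area can be estimated from the camera FOV, the fastest expected camera movement, and the pipeline latency. The sketch below is a simplified, assumed model (uniform horizontal pan only); a real system would also consider tilt, zoom changes and acceleration.

```python
def safe_fov_deg(camera_fov_deg, max_pan_deg_per_s, latency_s):
    """FOV of the 'extra render' area: the camera FOV widened on each side
    by the angle the camera can sweep during the pipeline latency."""
    sweep = max_pan_deg_per_s * latency_s
    return camera_fov_deg + 2 * sweep  # padding on both left and right

# A 40-degree camera FOV, 30 deg/s maximum pan, 100 ms total latency:
# the camera can sweep 3 degrees before the new render lands on the wall,
# so rendering 46 degrees keeps a safe margin on both sides.
assert safe_fov_deg(40.0, 30.0, 0.1) == 46.0
```

The faster the camera work and the longer the latency, the larger (and more expensive to render) the safe area has to be, which is one reason latency control matters so much in these setups.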


This workflow is the simplest one, and just requires a single LED wall fed by a single workstation and a camera with a tracking system. The tracked camera sends its position in space and other data (zoom, focus…) to the workstation, which renders the virtual world to be displayed on the LED wall with the correct and precise geometry and FOV, based on the tracking information. Then, the resulting scene is captured by the camera.


LED screens can be configured in different shapes, sizes, and geometries, including curved walls or corner configurations, with or without floors. This makes it possible to create immersive environments in which, wherever the camera moves, even with pan/tilt and head rotations, there will be a part of the LED available to display the virtual environment. Brainstorm’s InfinitySet includes a variety of tools for creating several renders simultaneously, so a geometrically correct render can be displayed across several screens, even if they are configured as corners or floors, regardless of the screen configuration, for the required camera view. InfinitySet establishes the position of each screen in the real world, as measured from the origin point specified by the tracking system, prior to rendering. This allows users to create, for example, an optimized corner with two LED walls and a floor that requires only one render, correctly mapped onto all configured screens.

SINGLE OR MULTIPLE LED SCREENS & MULTIPLE CAMERAS

Multi-camera setups for LED-based XR provide better production options but come with some limitations that are important to keep in mind. They require one render (workstation) per camera, providing a continuous render image for each camera’s point of view, and when using several LED screens they may require one render per LED screen as well. Most importantly, this setup requires careful planning during production, because when cutting between cameras on the video mixer, the image on the LED walls needs to change to the corresponding camera at the same moment the mixer cuts. Since LED screens and video walls introduce processing delays, the synchronization can be handled by building macros on the video mixer/switcher, so the device executes two commands when switching between cameras: the first switches the signal (render) going to the LED walls to the correct one for the camera being cut to, and the second, accurately delayed in relation to the first, switches to the image captured by that camera.
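The two-command macro described above can be sketched as a list of scheduled commands. The command names and the frame-based scheduling below are hypothetical, not the macro syntax of any particular mixer; the point is only the ordering and the delay between the two steps.

```python
def camera_cut_macro(camera_id, led_delay_frames):
    """Build the two mixer commands for a clean cut to `camera_id`:
    first switch the wall to that camera's render, then, after the LED
    processing delay has elapsed, cut the program output to the camera."""
    return [
        (0,                "route_render_to_wall", camera_id),   # step 1
        (led_delay_frames, "cut_program_to_camera", camera_id),  # step 2
    ]

# With a 3-frame LED delay, the program cut trails the wall switch by
# 3 frames, so the camera never shows the wall still displaying the old
# point of view.
macro = camera_cut_macro(camera_id=2, led_delay_frames=3)
assert macro[1][0] - macro[0][0] == 3
```

If the delay is wrong in either direction, the audience briefly sees the background rendered for the other camera's perspective.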

This makes operation more complex, often causing problems when video mixers are controlled by automation systems. Another approach is to use a video wall controller capable of displaying all required camera signals simultaneously on the video walls, giving each camera the option to always see its own unique point of view on the LED screens.

LED-based XR – Multiple screens, single camera
LED-based XR – Single or multiple screens, multiple cameras

Some manufacturers provide systems where LED walls can be driven at a multiple of the camera frame rate, so the different cameras can be adjusted to see the signal designated for each of them, using shutter and phase adjustments. Each camera’s unique point of view is sent to the LED walls continuously through the controllers; the images are displayed in succession on the screen, and each camera captures the background according to its own phase and shutter angle adjustments. This setup allows for a more traditional multi-camera workflow, reducing complexity at the production switcher, and fading between camera sources becomes technically possible, although not necessarily correct. The system may, however, impose certain technical requirements and limitations, such as using progressive signals, cameras with shutter angle and phase adjustments, and, most importantly, a camera sensor that is most likely of a Global Shutter rather than a Rolling Shutter type. In any case, video wall controllers are a separate issue within LED-based XR workflows, which must be discussed with the manufacturers/integrators to ensure they meet the requirements of the setup and content production, or to be aware of their limitations.
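The arithmetic behind such frame-rate multiplexing is straightforward. The sketch below assumes one wall subframe per camera, and derives the wall refresh rate and the maximum shutter angle that keeps each camera's exposure inside its own slot; real controllers have their own constraints beyond this simplified model.

```python
def multiplex_params(camera_fps, num_cameras):
    """Wall refresh rate and maximum camera shutter angle when the wall is
    driven at num_cameras x the camera rate, one subframe per camera."""
    wall_fps = camera_fps * num_cameras
    subframe_ms = 1000.0 / wall_fps
    # The exposure must fit inside one subframe, i.e. 1/num_cameras of the
    # camera frame period: 360 degrees / num_cameras maximum shutter angle.
    max_shutter_deg = 360.0 / num_cameras
    return wall_fps, subframe_ms, max_shutter_deg

# Three cameras at 50p: the wall runs at 150 Hz, and each camera's shutter
# must stay at or below 120 degrees, phase-locked to its own subframe slot.
wall_fps, subframe_ms, max_shutter = multiplex_params(50.0, 3)
assert wall_fps == 150.0 and max_shutter == 120.0
```

This also shows why the technique does not scale indefinitely: each added camera shortens the available exposure, costing light sensitivity.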


SET EXTENSION

A set extension is required whenever we need to display the background beyond the limits of the screens as seen by the camera, which is achieved by creating a continuation of the background environment beyond the edges of the LED screens, similar to what we are used to with chroma keying. However, this approach limits the camera views and positions compared with chroma sets, not to mention the color calibration required to match the different renders and displays. Also, differences between the LED display and the additional render may be visible due to image degradation, pixel pitch differences, etc., as the extra render is created on top of the image captured by the camera, which displays the output of the LED screens; this also implies careful video delay control. Set extensions may require modeling a 3D object to represent the location and shape of the real-world LED screens; this 3D model can be used to mask the screen area in the final composite. Conceptually, we end up with a 360 environment where part of the render is shown on the LED screens, plus an overlaid image around the screens that displays the rest of the virtual environment not shown on the LEDs.
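To build that mask, the screen's 3D corners can be projected into the camera image, giving the polygon inside which the wall's own render is kept and outside which the set extension is composited. The following is a minimal pinhole-projection sketch with assumed focal length and screen dimensions, not a description of InfinitySet's internal method.

```python
def project_point(point_xyz, focal_px, image_w, image_h):
    """Project a 3D point in camera space (x right, y up, z forward, metres)
    to pixel coordinates with a simple pinhole model centred on the image."""
    x, y, z = point_xyz
    u = image_w / 2 + focal_px * x / z
    v = image_h / 2 - focal_px * y / z
    return u, v

# Project the four corners of a hypothetical 4 m x 2 m LED wall standing
# 5 m in front of the camera; the polygon they form is the screen mask.
corners_3d = [(-2, 1, 5), (2, 1, 5), (2, -1, 5), (-2, -1, 5)]
mask_polygon = [project_point(c, focal_px=1000, image_w=1920, image_h=1080)
                for c in corners_3d]
assert mask_polygon[0] == (560.0, 340.0)  # projected top-left corner
```

Because the mask is derived from the tracked camera pose, any tracking or lens calibration error shifts the polygon and shows up as a visible tear at the screen edge.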

When the Set Extension is used together with a CAVE (immersive 3D environment) effect, we must consider the following:
• The XR Set Extension and the CAVE effect operate in different time delay domains.
• Along with the CAVE render, an additional render for the XR Set Extension is required.
• There might also be a color difference between the image seen on the CAVE screens and the XR set extension.
• Tracking and lens calibration accuracy plays an extremely important part in this setup, as any inaccuracy will cause a tear between the CAVE image and the XR Set Extension.

LED-based XR – Multiple screens, AR objects, and set extension


IN-SCREEN AR

This application features objects that appear to be outside of the screen even though they are rendered inside the LED screens. Because the camera position and FOV are known, a workstation, with InfinitySet for instance, can render such objects in the same scene, and they will be displayed with the correct perspective with regard to the camera view. For obvious reasons, these AR objects cannot exceed the screen boundaries and will inherently be behind the real-world talent or any other element in front of the screen. With in-screen AR, all the 3D elements within the screen are obviously “behind” the presenter, as if the camera were looking through a window. If the 3D elements are arranged “in front” within the 3D set, they will appear to come out of the screen from the camera position. So, it is essential to choreograph the movements considering that the talent can’t be in front of the AR elements, or the result will not be valid.


In both Set Extension and In-Screen AR applications, AR graphics can be rendered in front of real-world people located in front of the LED screens, and on top of the rest of the elements in the scene. This requires an additional render after the composite scene has been captured by the camera, which may imply an additional workstation and careful calibration of the tracking, scenes, and applicable delays to ensure that all elements work together. Within the Brainstorm environment, if this AR graphic is an Aston graphic, it might be rendered within the same engine; but if it is an Unreal Engine render, it is strongly recommended to use an additional render engine for performance reasons. This additional render allows AR graphics to be placed in front of real-world talent. Note, however, that this will result in two delay domains.


The use of LED walls to display background environments rather than chroma-keying techniques is increasingly common, and there are important advantages and time-saving benefits to using LEDs, although nothing is necessarily perfect. By cinema-style productions, we refer to productions where several background screens are used and only the area of the LED screens seen by the camera receives a render; the rest of the screen(s) could remain unrendered. But we must also bear in mind that the background, the unrendered area not seen by the camera, is extremely relevant in a cinema-style production, as it provides additional light and content for real-world reflections. So, in a film workflow, all background screens need to stay active (displaying an image or a render instead of black) at all times, regardless of the FOV of the camera. Another possibility is using LEDs for the projection of reflections, lights, etc., to save post-production time.

Film productions can use a variety of approaches to fill in content on the screens not seen by the camera: screens may simply display an approximate still image or video of the correct content, a flat color, or a live geometrically correct render. We must keep in mind that film productions are not easy to set up, and there is no single correct workflow or set of technical tools to use. Film productions inherently spend a long time in pre-production, prototyping, and testing workflows, which in most cases means working through possible issues that can arise during production. Another use of this approach is to complement chroma key or live shooting by using LED screens or projectors to simulate a background, which can be quite useful for achieving correct reflections on cars and similar objects.

LED-based XR – Multiple screens, AR objects in front of talent


As we’ve seen before, there is no technical reason why the results obtained for XR applications using LED screens or chroma sets can’t be similar. However, although we can reach a similar result with either method, both have their own advantages and disadvantages, due to their own nature. As there are many factors that affect the results and that we must consider when choosing one technology or the other, let’s see them in detail.


One of the most relevant benefits of LED-based XR is its ability to create a “real” environment in which the virtually generated background is seen as a real “world” from the perspective of the camera. This technique updates traditional filmmaking techniques that used other display sources, such as projectors or painted walls, which have been common since the very early days of cinema. Capturing the scene as a whole means that chroma key compositing is not required, as the background is directly captured by the camera(s): the scene is recorded “as is”, avoiding the need to composite it afterwards. Real-time chroma keying with virtual set technology can, however, also save or even avoid that postproduction time. Still, not everything is as nice as it may seem. Since we are displaying the environment directly as a background, the scene is as “fixed” as it would be in an outdoor shoot, which means that making further adjustments in post will be as complicated as in any standard shoot, so it will not enjoy most of the benefits of virtual production.

XR works best when shooting medium or close-up shots, because when shooting wide or full-body scenes the size of the LED walls increases significantly, or a virtual set extension is required (which has its own issues, as we have seen before), along with carefully installed props. In any case, LED-based XR is likely to be used in the studio; it is impractical, or even impossible, to use it for remote outdoor shooting. Chroma keying, on the other hand, can be used outdoors with little or no problem, taking advantage of the real location environment for the floor, lighting, etc. Also, for studio shooting, real-time virtual set technologies allow for real-time chroma keying, even using photorealistic backgrounds rendered with PBR and game engines like Unreal Engine.

Using advanced chroma keying software/hardware, there is no delay when compositing, and recording the layers independently allows for straightforward postproduction if required. On top of that, new chroma keying techniques under development, which use Vantablack or similar non-reflective “superblack” backgrounds instead of green/blue sets, do not have spill or color contamination issues.

On the other hand, color grading is an essential process in the finishing of any video and film production. It is always a required step that provides the desired look and feel to the final product in postproduction. And the better the integration of the different elements in the image, the easier the process will be, so the colorist can focus on the creative and finishing side of the process rather than on fixing color issues.

So, when using LED-based XR, it is likely that color corrections will not be required, as the lighting and environment can be matched together in the scene. However, when using a multi-screen and/or multi-camera setup, color calibration becomes extremely important, as different physical screens might have distinct color reproduction characteristics. Note also that the lighting of the real-world set may require independent color calibration of the screens, as studio lights will affect how a camera sees the color reproduced on individual screens. This is even more complicated when doing Set Extensions, as the output will be displayed through two different media (virtual render and captured screens), so the extra render pass created by the set extension will require grading the camera input coming into the workstation; otherwise, it will not be possible to match the colors of the graphical elements on the screens (the CAVE effect) with those of the set extension.
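A common way to perform such per-screen calibration is a 3x3 correction matrix applied to each screen's input. The sketch below is a generic illustration; the matrix values are hypothetical, chosen only to show taming a wall whose red primary runs hot.

```python
def correct_rgb(rgb, matrix):
    """Apply a per-screen 3x3 correction matrix to a linear RGB triplet,
    so physically different LED walls reproduce the same color on camera."""
    return tuple(sum(matrix[row][col] * rgb[col] for col in range(3))
                 for row in range(3))

# Hypothetical matrix for a wall whose red primary is about 10% too strong:
# scale red down, leave green and blue untouched.
screen_b_matrix = [[0.909, 0.0, 0.0],
                   [0.0,   1.0, 0.0],
                   [0.0,   0.0, 1.0]]
r, g, b = correct_rgb((1.0, 0.5, 0.25), screen_b_matrix)
assert abs(r - 0.909) < 1e-9 and g == 0.5 and b == 0.25
```

In practice, such matrices are derived per screen (and per lighting condition) by measuring test patches through the actual camera, since the camera's view of the wall, not the wall's native output, is what must match.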

In virtual set environments, the grading can be approximated with intelligent and dynamic lighting, which can be adjusted directly, for instance, from InfinitySet via DMX protocols. This allows for adjusting the color and intensity of the lights as the background changes. In any case, this most often requires fine-tuning in postproduction to ensure the best possible integration but, again, if the scene has been recorded in layers this is not much of an issue, as the background and foreground elements can be easily isolated prior to compositing.
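As a rough illustration of how a controller might drive such dynamic lighting, the sketch below maps a background's average color to 8-bit DMX levels. The channel layout (dimmer plus RGB on channels 1-4) is a hypothetical fixture profile for the example, not InfinitySet's actual DMX mapping:

```python
def background_to_dmx(avg_rgb, intensity=1.0):
    """Map a background's average color (floats 0.0-1.0 per channel)
    to 8-bit DMX levels for a hypothetical RGB fixture.

    DMX512 channels carry values 0-255; the channel-to-function
    mapping below (dimmer, R, G, B on channels 1-4) is an assumed
    fixture profile used only for illustration."""
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must be between 0.0 and 1.0")
    r, g, b = (max(0.0, min(1.0, c)) for c in avg_rgb)
    return {
        1: round(intensity * 255),  # master dimmer
        2: round(r * 255),          # red
        3: round(g * 255),          # green
        4: round(b * 255),          # blue
    }

# A warm sunset background drives the fixture toward orange:
levels = background_to_dmx((1.0, 0.5, 0.2), intensity=0.8)
```

In a real workflow the renderer would sample the background each frame and push the resulting levels to the lighting desk over the DMX link.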


With LED-based XR, most of the issues of chroma keying, such as spill correction or fine details like hair or transparencies, are not a problem. Also, reflections of the background environment on shiny or reflective surfaces (glass, metal, etc.) are captured live and represent real reflections/refractions of that environment, so no postproduction is required to apply them. However, managing light spill in XR environments is more difficult, as studio lights may hit different LED walls from different angles and with different intensities, making it harder to match the color of the different LED displays. The camera angle will also affect the way reflections of studio lights on LED screens are captured. When integrating the talents and the background, shadows are an issue in LED-based environments: what benefits reflections makes shadow management difficult. For virtual production using green screens, the industry has spent decades refining chroma keying to ensure the lighting casts shadows correctly as per the requirements of the scene, and that these are captured with enough detail and quality to be used in the virtual environment. Props and green elements were also often placed to ensure shadows are correctly applied over virtual walls, etc.

Brainstorm TrackFree™ technology, with its 3D Presenter feature, helped to further refine this workflow by allowing talents to cast virtual shadows over virtual objects. But LED walls are light emitters, which means that shadows, especially when using LED floors, are quite difficult to manage in these environments. When using a prop-dressed floor the studio lighting certainly helps but, in any case, the light emitted by large LED walls can interfere with the studio lighting, which must be taken into account, as seen before. Some productions aim to create virtual shadows on the LED floors, but this adds another layer of complexity and hardware requirements, not to mention additional tracking for the character.



Depth of field and focus in XR

Pixel pitch defines how large individual pixels are on a screen: the smaller the pitch, the higher the pixel density of a given screen and, therefore, the better the resolution. The pixel pitch of LED screens affects how the background image resolution will be perceived, and as such can cause resolution issues when the camera focuses on the screens. Sharp images without a shallow enough depth of field may show a visible moiré effect in the background when shooting the scene. When the end user wants to do close-ups with a cave screen in the background, resolution and pixel pitch require special attention. Also, when the wall is not flat and its distance to the camera(s) is variable, the moiré can affect just parts of the image due to depth of field and focusing differences, which may result in an unusable scene. This must be considered when planning the production. Often the background will not be perfectly sharp and, because of the combination of the depth of field of the background image and that of the lens used, the resulting composition will feature a somewhat blurred background, which looks natural. However, if we need a sharp background, the pixel pitch, size and resolution must be carefully taken into account. Just as with tracking, note that pixel pitch issues will be more visible in multi-camera setups, as operators will use the full zoom range of their broadcast lenses.
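The arithmetic behind these resolution issues is simple to sketch. The wall size, pitch and framing values below are illustrative, not drawn from any specific installation:

```python
def wall_pixels_in_frame(pixel_pitch_mm, framed_width_m,
                         sensor_h_pixels=1920):
    """Estimate how many physical LED pixels fall inside the camera's
    horizontal framing, and how many camera pixels land on each one.

    Returns (led_pixels_framed, camera_pixels_per_led_pixel)."""
    led_pixels_per_metre = 1000.0 / pixel_pitch_mm
    led_pixels_framed = framed_width_m * led_pixels_per_metre
    ratio = sensor_h_pixels / led_pixels_framed
    return led_pixels_framed, ratio

# A 2.6 mm pitch wall, with the camera framing a 2 m wide section
# on an HD (1920-pixel wide) sensor:
framed, ratio = wall_pixels_in_frame(2.6, 2.0)
# Roughly 769 LED pixels span the 1920-pixel sensor line, i.e.
# about 2.5 camera pixels per LED pixel: the pixel structure and
# moiré become visible unless the wall is kept slightly out of focus.
```

Zooming in tightens the framed width and makes the ratio worse, which is why pixel pitch problems show up first on long broadcast lenses.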

Defocusing is more complex in XR workflows. In the real world, focusing from near to far elements is simple, but in XR "far" is beyond the videowall, and the videowall itself must remain in focus. Solving this requires additional hardware and software, and the workflow should be thoroughly tested beforehand in a real-world environment. Chroma key production, by contrast, allows for applying in real time whatever focus and depth of field the scene and camera lens/zoom require. In non-live productions, the focus and depth of field of the background image on the LED wall cannot be changed after shooting, while in chroma key production there is room for any improvement in postproduction if the shooting has been recorded in layers.
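Whether a videowall sits inside or outside the zone of acceptable sharpness can be estimated with the classic thin-lens depth-of-field approximation. The lens and distance values below are illustrative:

```python
def dof_limits(focal_mm, f_number, focus_dist_m, coc_mm=0.03):
    """Near and far limits of acceptable sharpness for a lens focused
    at focus_dist_m, using the standard thin-lens DOF approximation.
    coc_mm is the circle of confusion (0.03 mm is a common value
    for a full-frame sensor)."""
    f = focal_mm
    s = focus_dist_m * 1000.0  # metres to millimetres
    hyperfocal = f * f / (f_number * coc_mm) + f
    near = s * (hyperfocal - f) / (hyperfocal + s - 2 * f)
    far = (s * (hyperfocal - f) / (hyperfocal - s)
           if s < hyperfocal else float("inf"))
    return near / 1000.0, far / 1000.0  # back to metres

# A 50 mm lens at f/2.8 focused on a presenter 3 m from the camera:
near, far = dof_limits(50, 2.8, 3.0)
# Acceptable sharpness runs from roughly 2.7 m to 3.3 m, so an LED
# wall a few metres behind the presenter renders naturally soft;
# stopping down or going wider pulls it into focus and risks moiré.
```

This is why DOF has to be planned before the shoot in LED workflows, while in chroma key workflows it can be simulated on the rendered background at any time.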



When working with real-time renderings, delays are inherent to live operation. Even the fastest processing requires information to build the scene, so in any setup, including single-camera ones, the image displayed on a real-world background screen will always be slightly out of sync with the camera movements. This is because the computer first needs to generate a background image (which takes a variable time depending on the complexity of the 3D scene, the use of Unreal Engine, etc.); this background image is then transported to, and displayed on, the LED screens; the camera then captures the image and sends it back to the workstation for composition of the final image. And this process goes on and on. Even if each step takes milliseconds, the delays are cumulative and may add up to something quite noticeable, depending on the complexity of the scene. So, the image the camera captures shows a threefold delay (render, display and capture) compared to the camera position captured by the tracking device, meaning that it is not possible to completely synchronize the image displayed on the screens with the movements of the camera. This can only be somewhat hidden with smooth, slow camera movements and careful zooming. On top of that, when we need to place AR elements in front of the presenter, we must add yet another delay.

Incremental delays using LED walls
Delay 1: The tracking device reads the camera position in space and sends it to the workstation for rendering the scene.
Delay 2: Once the tracking position is known, the workstation renders each frame of the scene. This may take a variable time depending on its complexity.
Delay 3: Each frame needs to be mapped onto the LED wall, so its processor receives the signal, prepares it for the different modules and displays it.
Delay 4: If AR elements must be placed in front of the presenter, this adds another delay.
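The cumulative effect of these stages can be sketched as a simple latency budget. The per-stage figures below are assumptions chosen for the example, not measurements of any real system:

```python
# Illustrative latency budget for an LED-based XR chain at 50 fps.
# All stage values are assumed figures for the sketch.
FRAME_MS = 1000.0 / 50  # 20 ms per frame at 50 fps

stages_ms = {
    "camera tracking read-out": 10.0,
    "scene render (engine)":    20.0,
    "LED processor + display":  25.0,
    "AR overlay composition":   15.0,  # only when AR elements are used
}

total_ms = sum(stages_ms.values())
total_frames = total_ms / FRAME_MS
# 70 ms in total, i.e. 3.5 frames: the wall image lags the tracked
# camera position by several frames, which is why fast pans and
# zooms visibly break the effect while slow moves can hide it.
```

Shaving any single stage helps, but the lag can never reach zero, since each stage must complete before the next one starts.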

On the other hand, virtual production based on chroma keying features only a single delay, which is controlled automatically when rendering the composite image; regardless of the speed of the camera movements, the synchronization of the composed image is fully guaranteed. With set extensions, the delay on the background screens will differ from that of the virtual set extension, whereas all screens must reproduce their images with exactly the same delay.

This may limit the camera movements that can be made without breaking the illusion of the XR effect. XR effects are only visible when moving the cameras over large areas, which provide enough parallax for the viewer to fully perceive the required depth. Tracking is therefore mandatory, and the quality of tracking and lens calibration plays a pivotal role in how functional an XR effect will be. Also, as multi-camera setups tend to use larger tracking areas, tracking quality requirements are more stringent: the tracking systems of all cameras need to be calibrated to exactly the same origin point to achieve the required accuracy, and any problem in tracking quality will be more noticeable in a multi-camera setup.

Delays also need to be considered in multi-camera productions with LEDs, as cutting between cameras becomes extremely complex, and sometimes impossible. As the cameras most likely have different fields of view and perspectives, live cuts from one camera to another imply sending a different render to the LED wall and synchronizing the whole operation accordingly, as we have seen in Single or Multiple LED Screens / Multiple Cameras. Film productions will most likely be shot in single-camera mode, so this issue is less important there, but in live or live-to-tape productions multi-camera setups are widely used for time savings. This needs to be considered when choosing the right tool for our productions.

HARDWARE REQUIREMENTS

Larger walls need more hardware, and the workstations must be coordinated accordingly, not to mention that large or higher-resolution outputs may require significant computing power. Chroma key-based production, however, does not require feeding large videowalls, only the final rendered image, and if we use a render-per-camera approach, each camera's output can carry the best possible signal we can get.


As we have seen, the practical benefits of shooting live and avoiding chroma keying also come with some drawbacks, which affect the production workflow. Immediacy is achieved at the cost of losing the flexibility we are used to having in complex productions. One of the biggest benefits of virtual production is the reduction of costs in actors' time, props, setup, travel, and outdoor shooting time. But, as mentioned, a scene shot in an LED wall environment is fixed and cannot be changed later unless it is shot again. Once a scene has been recorded, changes are often required. Many, of course, can be made while in production, simply by reviewing the shots and reshooting if required, provided the actors and crew are still available. But if changes are required later, as the scene is "baked" we face the same issue mentioned before, so we may have to reshoot or enter a long and complex postproduction, meaning all the time and cost savings of virtual production disappear.
Using LED-based XR will still maintain some of these benefits, but at the cost of not being able to alter shots easily in post; if we need changes in the scene, adjustments to the background, etc., we will still need to reshoot. Of course, rehearsals in live productions can help with these issues; however, other changes may not be possible because of scheduling, availability or a change of mind after production, so they will require going into postproduction.
On the other hand, chroma keying, when used with tracked cameras and multilayer shooting, allows any changes in post with total ease.

FIXING IT IN POST

This is possibly the biggest issue for LED-based XR. As mentioned before, for live productions involving rendered backgrounds (videowalls), this approach is possibly the best option. However, for film and drama productions, shooting the background "as is" leads to a significant loss of flexibility when postproduction is required, such as adding compositing, VFX, environmental grading, particles, etc.

As the image is "fixed", rotoscoping or other techniques may be required to isolate parts of the image prior to applying effects, which makes no sense in complex productions, whereas chroma keying allows VFX operators to easily achieve all that is required, as the elements are already shot separately and stored independently.


As we have seen, both technologies have their own pros and cons, and it is the nature of the production we are planning that will define which technology fits best with what we want to achieve. Of course, LED-based XR and chroma keying are not mutually exclusive and can be mixed in the same production or used for different scenes. For live events, LED-based XR seems the better fit, as the audience on site will also see the production as it was designed, while with chroma key-based production most likely only the audience at home will enjoy the final content. However, chroma keying still has significant benefits in workflows and flexibility, not to mention time savings in production and postproduction.
