Debugging & Optimization
How to get the most out of RenderMan 21
Written by Patrik S. Hadorn | October, 2016
Still Life scene by Dylan Sisson
Artifacts and slow renders can be extremely frustrating, especially with a deadline closing in. Without the right tools and techniques, debugging a scene can quickly take longer than one anticipates and steal away precious time one needs to get the actual work done.
In this lesson, I'd like to talk about some basic techniques which can be used to get a better understanding of what's happening in your renders. This allows you to systematically track down problems and bottlenecks. Once resolved, your renders will finish faster and reach a higher visual quality. With this knowledge, you'll be able to anticipate issues while you're building the scene and eliminate them before they even become a problem.
1 What is debugging and optimizing?
Before we start, let's discuss what is meant by the words optimization and debugging in this lesson:
Artifacts are often intolerable as they prevent you from achieving the desired picture, so you usually have no choice but to remove them. This can be achieved either by finding the source of the issue and fixing it (the topic of this lesson) or by painting over it in Compositing. Which approach to prefer depends on the situation. If it's just a handful of frames, painting over it in Compositing is usually quicker, but you might run into the same issue in the future. If artifacts are present in many frames, however, fixing the issue at its core might get rid of it once and for all and improve performance at the same time.
Bad render times, in contrast to artifacts, are tolerable to a certain extent. One could simply wait longer for the renders to finish or buy additional hardware and licenses. However, if you have your render times under control, you improve your productivity and lower your costs. If you need to reach the highest quality possible, there's no way around being resourceful and clever about your setups.
As you can see, debugging is often something you have to deal with, in one way or another. Optimizing, on the other hand, is something you choose to do when you need to increase productivity, lower costs or push even more detail into your renders.
The good news is that these processes often go hand in hand. Some artifacts increase the time it takes for a render to finish. By removing them, you might get a cleaner result and improve performance in one go!
1.1 Typical artifacts
Users familiar with the older REYES architecture may remember the many different types of artifacts that could appear, depending on the scene setup and techniques used. It required a trained eye to differentiate between them and a good technical understanding to avoid them. Fortunately, this is not the case any more in RIS as the diversity of artifacts is quite manageable.
Now, the most typical artifact we have to deal with is noise in its various manifestations. Depending on the type of noise and its source, we need different techniques to identify and remove it.
1.1.1 Uniform Monte-Carlo (MC) Noise
Uniform Monte-Carlo Noise
This is the most typical artifact we see when rendering with a path tracer. The renderer continuously shoots rays, which cleans up this noise, and the image is done once the desired quality has been met.
A well-performing render shows uniform noise after a few iterations, which cleans up noticeably with each additional iteration.
1.1.2 High variance areas
High variance Noise
A high variance area is especially strong noise, concentrated at a certain location of your image.
In the picture above, you can see a sphere illuminated by a light and some indirect blue light coming from the screen left. If you look closely, you can see that the image is clean in most areas except for the blue sparkles on the floor. The noise on the floor has significantly higher variance than the rest of the image. To clean it up, the renderer will have to continue shooting rays for a long time.
This kind of artifact is also difficult for the renderer to recognize as noise and it might prevent adaptive sampling from working properly, as we'll discuss later. In many cases, high variance can be removed by simple tweaks to a scene or at least improved enough for adaptive sampling to take care of it, as we'll see in the example at the end of this lesson.
1.1.3 Fireflies
Fireflies are very bright spots in your render. This is extremely strong noise caused by something which is very hard for the renderer to sample. While uniform noise and high variance areas can usually be removed by simply rendering longer, fireflies are much harder to get rid of and they can create strong artifacts in denoised images, especially when rendering animations.
It's usually best to track down the source of the fireflies and eliminate it. Often these come from geometry which is very close to a light source, which makes it so bright that it strongly illuminates the scene through indirect contribution. There is the Intensity Near Dist parameter on lights which can help to keep these hot spots under control, but better results can be achieved if such setups are avoided entirely. We'll look into an example of this artifact later in this lesson.
In the example above, a small, bright light has been placed very close to a wall, creating a tiny but intensely illuminated spot. This is extremely difficult to sample and causes the artifacts described here.
1.1.4 Invalid values
We use numbers to define our scenes, be it for geometry, textures or even for shading networks. These numbers have to stay in a valid range and if they don't, all kinds of things can go wrong. The renderer often guards against invalid inputs but you might still get unexpected results.
Examples of these invalid setups are negative numbers for colors or extremely high values for displacements. There are two particularly important values which appear often in incorrect setups: NaN and INF. INF simply stands for infinity and is what the computer uses when a value is too large to represent. NaN stands for Not a Number and is a value used for results of invalid operations. For example, dividing zero by zero is undefined but a computer still has to assign a value to it. In these cases, the result is set to NaN.
One single value is often enough to cause quite a bit of mayhem. For example, if a single vertex of your geometry contains a NaN, it's possible that the whole object doesn't get rendered. If a single pixel in your texture contains a NaN it will destroy whole parts of your texture, especially when rendering from afar.
Texture containing an invalid pixel used as a bump map rendered at increasing distance (simulated by larger texture lookups)
For the picture above, a texture containing NaNs has been used as a bump map. From left to right, an increasing amount of blur was applied to the texture to simulate what's happening when rendering from further away.
When the camera is close, it's hard to spot the issue. However, as the distance to the object increases, the texture gets filtered and the problem becomes more apparent. This might cause much more than just some black areas. It could make objects disappear, cause denoising artifacts, cause a crash or even introduce fireflies.
When rendering larger scenes with countless textures, it can be extremely hard to track down such issues so it's always best to check your assets in isolation before using them in a complex scene.
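To illustrate why a single invalid pixel spreads once a texture gets filtered, here is a minimal sketch in plain Python. It is not how RenderMan filters textures: the box filter, the 8x8 list-of-lists "texture" and the helper names `find_invalid` and `box_filter` are all assumptions made for this illustration. The same kind of check could be run on an asset in isolation before it goes into a complex scene.

```python
import math

def find_invalid(pixels):
    """Return (row, col) of every NaN or INF value in a 2D list of floats."""
    return [(r, c)
            for r, row in enumerate(pixels)
            for c, value in enumerate(row)
            if not math.isfinite(value)]

def box_filter(pixels, radius):
    """Average each pixel with its neighbours (crude stand-in for texture filtering)."""
    height, width = len(pixels), len(pixels[0])
    result = []
    for r in range(height):
        row = []
        for c in range(width):
            neighbours = [pixels[y][x]
                          for y in range(max(0, r - radius), min(height, r + radius + 1))
                          for x in range(max(0, c - radius), min(width, c + radius + 1))]
            # Any NaN among the neighbours turns the whole average into NaN.
            row.append(sum(neighbours) / len(neighbours))
        result.append(row)
    return result

# Hypothetical 8x8 grey texture with one broken pixel.
texture = [[0.5] * 8 for _ in range(8)]
texture[3][4] = float("nan")

bad_original = find_invalid(texture)
bad_small_blur = find_invalid(box_filter(texture, 1))
bad_large_blur = find_invalid(box_filter(texture, 2))

print(len(bad_original), len(bad_small_blur), len(bad_large_blur))
```

The wider the filter, the more pixels the single NaN contaminates, which matches the growing black areas seen when the camera moves away from the object.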
2 What's keeping RenderMan busy?
To render an image, RenderMan is continuously shooting rays for each pixel until the target quality has been reached. The time taken to reach this goal is called render time. The render time can be split into three major contributions: Number of pixels, time to trace a full path (camera ray and bounces) and number of rays to reach the target quality. So, we can reduce render time by:
- reducing the number of pixels: either by rendering at a lower resolution or by using crop regions
- reducing the time to trace a full path: by lowering the number of bounces or the geometric complexity (fewer objects and lower poly-count)
- reducing the number of samples: by improving areas which are hard to sample or by lowering the target quality.
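The three contributions can be written down as a simple back-of-the-envelope model. This is only a sketch: it assumes every pixel takes the same number of samples and every path costs the same, which real renders don't do, and the function name and the numbers are made up for illustration.

```python
def estimated_render_time(width, height, samples_per_pixel, seconds_per_path):
    """Rough estimate: number of pixels x samples per pixel x time per path."""
    return width * height * samples_per_pixel * seconds_per_path

# Hypothetical numbers, chosen only to show the scaling.
base = estimated_render_time(1920, 1080, 256, 2e-6)
half_resolution = estimated_render_time(960, 540, 256, 2e-6)
half_samples = estimated_render_time(1920, 1080, 128, 2e-6)

print(f"half resolution: {half_resolution / base:.2f} of the base time")
print(f"half the samples: {half_samples / base:.2f} of the base time")
```

Halving the resolution in both directions quarters the pixel count, while halving the samples or the cost per path only halves the total, which is why crop regions and lower resolutions are so effective while debugging.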
2.1 Target Quality: PixelVariance
The target quality is what RenderMan has to reach before it can finish rendering an image. This is controlled by specifying a target PixelVariance for your render, meaning you tell RenderMan how much noise is tolerable.
Cleaning up noise is a bit like cleaning your home. The first few sweeps go a long way and just take a few minutes. If you want it to be super clean though, it's not that easy. You have to invest much more time and the cleaner you want it, the longer it takes. Instead of cleaning all day, you stop when it's clean enough. As with rendering, there are multiple factors which affect when this point is reached. One of them is when it looks clean and another aspect is when you're running out of time.
The same principle applies to defining the target quality of your render. After a few iterations, we already see our picture but the details are still lost in the noise. We need to invest more time to clean before we have a picture we can work with. But, if we let the render get cleaner than what it needs to be, we're wasting a lot of time. As with the example above, we need to define the point when it's clean enough for the amount of time spent.
As PixelVariance is being reduced, render time rises dramatically. Data from an example scene; not all renders behave the same.
The picture above illustrates quite nicely how quickly render time increases with decreasing PixelVariance. Whether to use '0.005' or '0.0025' feels like a small difference but the lower value doubles our render time. Spending some time testing different PixelVariance and choosing the one which produces acceptable quality in the smallest amount of time can save a huge amount of render time. Personally, I think this is one of the most important optimizations there is!
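The doubling mentioned above is what you would expect from basic Monte-Carlo behaviour, where the variance of a pixel estimate shrinks roughly in proportion to one over the number of samples. The sketch below assumes that PixelVariance behaves like such a variance target and ignores fixed overhead and adaptive sampling effects, so treat it as an idealized illustration rather than a prediction for your scene. The function name and the reference value of 0.02 are arbitrary choices.

```python
def relative_sample_count(pixel_variance, reference_variance=0.02):
    """Samples needed relative to the reference, assuming variance ~ 1 / samples."""
    return reference_variance / pixel_variance

for pixel_variance in (0.02, 0.01, 0.005, 0.0025):
    factor = relative_sample_count(pixel_variance)
    print(f"PixelVariance {pixel_variance}: about {factor:.0f}x the samples")
```

Under this assumption, each halving of PixelVariance doubles the work, so small-looking changes at the low end of the scale are the expensive ones.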
Visual comparison of decreasing PixelVariance
As you can see in the image above, it doesn't take much time to reach an image without much noise. Even after just a few seconds, we get a pretty decent render for this simple scene. However, if we try to get rid of the barely visible noise at the bottom of the sphere, even this simple render suddenly takes quite a long time.
Noise is quite easy to notice on even surfaces like this grey sphere. When textures are being used though, noise is easier to hide, often allowing you to increase the PixelVariance a bit which gives a noticeable speed up.
PixelVariance has to be tweaked visually to find the right spot for your project. Often, color adjustments (grading, LUTs, etc) or even defocus and motion blur in post are applied to the final render and it's important to take that into consideration when tweaking the PixelVariance. Decide what quality you need for previews, for WIP renders and for final delivery to ensure you're not spending too much time on a render which doesn't need to be completely clean, especially when the denoiser is being applied.
Denoising is a powerful tool and should be used whenever you're rendering more complex scenes. Strong noise and fireflies can cause flickering and artifacts in the denoised result. This is why optimizations make sense, even when using a denoiser as this will further improve the results of your final images. Test your different PixelVariance settings with the denoiser to find the value which works best for your renders. We'll come back to denoising at the end of this lesson.
relativePixelVariance is a setting which allows you to specify different quality levels per object. This is a very interesting optimization when you're rendering sharp images which then get blurred in post, be it for depth of field or motion blur. Cleanly rendering objects which will eventually be blurred is wasteful, so we could use relativePixelVariance to instruct RenderMan to render these objects at a lower quality.
However, if a certain object is noisier than the rest of your render, do not be tempted to use relativePixelVariance! This is usually an indication that the adaptive sampling isn't working correctly.
3 Adaptive sampling
While RenderMan's RIS is significantly easier to use than good old REYES, there are still a few things which are important to understand and easy to get wrong. Adaptive sampling is one of these things and probably the one most misunderstood so let's take some time to look at how it works.
When rendering an image with a path tracer, we're shooting rays (samples) to calculate the color of each pixel. Some renderers ask the user to set how many rays to shoot for each pixel but this is quite unintuitive. The answer to how many rays we want to shoot is: as many as it takes to look clean! For these renderers, one has to tweak this setting until the image reaches the visual quality we're after. This technique is called fixed sampling.
RenderMan's RIS on the other hand uses adaptive sampling by default. Instead of defining a fixed amount of samples to shoot per pixel, we tell the renderer what quality we desire. RenderMan then starts shooting rays and continuously estimates the amount of noise. Once a pixel has reached the target quality, it won't be sampled further and the renderer can focus on the areas which need more attention.
There are several major benefits of using this technique:
- Since the renderer can skip pixels which reached the target quality quickly, we don't waste samples where they aren't needed.
- Particularly difficult areas will get the attention they need without having to globally increase the number of samples.
- When using adaptive sampling correctly, the final image will have reached the target quality in all areas. This gives us a guarantee for the noise level in the image and it will be consistent when rendering animations.
There are only three main settings to control adaptive sampling: PixelVariance, minSamples and maxSamples. Once you understand what each of them does, it's easy to get them right. Setting them incorrectly, however, might remove the benefits of adaptive sampling. We've already looked more closely at PixelVariance so we'll now discuss minSamples and maxSamples.
3.1 minSamples demystified
We're already telling the renderer at what quality to stop so why do we need additional controls? The reason for having these two additional settings is simply because adaptive sampling isn't perfect. We know that the renderer should stop rendering a pixel if the noise is low enough but how can the renderer know how much noise there is left?
As this information isn't available, RIS has to estimate the level of noise by measuring how much a pixel is changing with each iteration. For the first few rays, the pixel is changing color rapidly. Sample after sample, the pixel converges (getting closer and closer) to its true value. Once the variation is smaller than what's been set as a target goal, the pixel is considered done.
While this works fine in most cases, there are situations where things can go wrong. For example, some objects, especially when far away, can be much smaller than a pixel. It's not unlikely, that most of the rays shot for that pixel would completely miss the object.
When this happens at the beginning of a render, RenderMan sees a pixel which isn't changing color at all and it might incorrectly assume that this pixel is clean. This is particularly problematic when rendering an animation as other frames might hit the small object resulting in the object popping in and out of existence.
Tiny water droplet inside a pixel: The red dots (illustrating the camera rays) show that even after shooting quite a few rays, the object might be missed completely, which fools adaptive sampling into thinking that this pixel doesn't need further sampling.
This doesn't just apply to small objects. Anything which has a low chance of being hit by a ray would behave in the same way. This could be hair (fortunately, the hairminwidth setting helps with this), tiny highlights, heavily motion blurred objects (a ray has to hit the object at the right time) or even just the high variance noise discussed above. There's hardly a scene without the potential to run into this issue.
For this reason, we can set the minimum amount of samples to use before adaptive sampling is allowed to even consider stopping rendering a certain pixel. In the case of a tiny object, we'd need to set minSamples large enough to ensure that the object is being hit at least once so RIS knows that there's something which needs further sampling.
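The following toy model shows this failure mode in a few lines of Python. It is a sketch under stated assumptions, not RenderMan's actual algorithm: the stopping rule (estimated variance of the mean below a target), the function names and the 5% hit probability are all invented for illustration. A pixel contains a tiny bright object that each ray hits with a small probability, and we count how often the sampler stops before ever seeing it.

```python
import random

def render_pixel(sample_fn, min_samples, max_samples, target_variance):
    """Toy adaptive sampler for one pixel. Returns (mean, samples_used)."""
    total = 0.0
    total_sq = 0.0
    n = 0
    while n < max_samples:
        value = sample_fn()
        total += value
        total_sq += value * value
        n += 1
        if n < max(min_samples, 2):
            continue  # not allowed to consider stopping yet
        mean = total / n
        sample_variance = max(total_sq / n - mean * mean, 0.0)
        # Estimated variance of the pixel mean; stop once it is low enough.
        if sample_variance / n <= target_variance:
            break
    return total / n, n

def count_missed(min_samples, trials=1000, hit_probability=0.05, seed=1):
    """Count pixels where the tiny object was never hit before sampling stopped."""
    rng = random.Random(seed)
    sample_fn = lambda: 1.0 if rng.random() < hit_probability else 0.0
    missed = 0
    for _ in range(trials):
        mean, _used = render_pixel(sample_fn, min_samples, 512, 0.0001)
        if mean == 0.0:
            missed += 1
    return missed

for min_samples in (4, 16, 64):
    print(f"minSamples {min_samples}: object missed in {count_missed(min_samples)} of 1000 pixels")
```

With a 5% hit chance per ray, the probability that the first 4 rays all miss is about 81%, while for 64 rays it drops to about 4%. In the toy model, a pixel whose first rays all miss shows zero variation and is declared clean immediately, which is exactly the popping described above.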
Rendering tiny dust particles with increasing values for minSamples. When minSamples is set too low, whole particles are missed and there will be flickering in the animation.
It's not always obvious that minSamples has been set too low. Sometimes, it might just lead to noisy areas in which RenderMan underestimated the noise and stopped too early. Some users might be tempted to simply lower the PixelVariance in these cases; after all, that's what controls the amount of noise we want to allow. While a lower PixelVariance might help reduce noise, it will oversample all the other areas which were fine before and lead to excessive render times. Increasing the minSamples on the other hand helps RenderMan get better estimates of the noise and recognize more difficult areas.
If minSamples has been set correctly, the picture shouldn't change noticeably when it's increased further. If a small increase in minSamples makes some areas significantly cleaner, it's usually an indication that the previous value wasn't high enough to estimate the noise accurately.
Higher minSamples will also lead to longer render times but the image will reach the desired noise level for the chosen PixelVariance. Too few minSamples on the other hand, while rendering faster, can't guarantee that the image reaches the desired quality and the noise might be very uneven and change intensity from frame to frame which defeats the purpose of using adaptive sampling.
So, if you're tempted by the faster render times when using lower values for minSamples, be aware of the problems this might introduce!
3.1.1 RenderMan's magic parameter values
For many settings in RenderMan, the values '0' and '-1' have a special meaning. '-1' often means 'infinity' and is for example used for the maximum distance a shadow ray is allowed to travel. '0' on the other hand often means 'automatic' and is the default for the number of light samples used per light. This doesn't mean that no samples are used, but instead, that the internal mechanism is used to determine sample counts instead of relying on the user provided value. Usually, when the default is '0' or '-1', this indicates that these are special values. For details, you'd need to consult the documentation.
Setting minSamples to '0', as it is by default, also triggers a special automatic mode and does not mean that there is no minimum for the sample count. This automatic mode simply takes the square root of the maximum number of samples (called maxSamples in RIS). If maxSamples has been set to 512, the automatic mode would set minSamples to √512 ≈ 23. While the attempt at simplifying things further is nice, I personally would recommend setting minSamples explicitly. The automatic mode can be used as a starting point but might need adjusting to avoid issues in your scenes.
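As a quick way to see what the automatic mode would give for a few common maxSamples values, the rule can be sketched like this. The rounding to the nearest integer is an assumption on my side; the renderer may round differently.

```python
import math

def auto_min_samples(max_samples):
    """Square root of maxSamples, rounded to the nearest integer (rounding is assumed)."""
    return round(math.sqrt(max_samples))

for max_samples in (256, 512, 1024, 2048):
    print(f"maxSamples {max_samples}: automatic minSamples of about {auto_min_samples(max_samples)}")
```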
3.1.2 How to set minSamples
If you want to render very quick previews and you don't care too much about some noisy areas, it's fine to consciously set minSamples very low. In these cases, I often set minSamples to '1' and only increase it in particularly difficult scenes. This gives very fast renders at low quality but might be enough to check certain aspects of the image.
When rendering for higher quality however, it's important to set minSamples accurately. I usually start setting minSamples to '16', maxSamples as high as one would ever need, '2048' is what I usually use, and the PixelVariance to the desired quality. I then create a crop region around areas I recognize as being problematic (the ones with most noise or smallest details) and render several images, each time with '16' more minSamples. So the first render uses '16', then '32', '48', '64', etc. If two subsequent renders are visually barely distinguishable, I choose the lower one for my minSamples.
Render with increasing amount of minSamples while keeping PixelVariance at 0.02. Rendering with too few minSamples underestimates the blue noise on the floor and the renderer fails to reach the target PixelVariance. Only with about 48 samples can adaptive sampling estimate the noise correctly. These are subtle changes and are best observed by flipping between two renders with different minSamples.
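The search for a good minSamples value described above can be sketched as a small loop. Note the substitutions: I compare renders visually, whereas this sketch uses a numeric mean absolute difference with an arbitrary tolerance, and `render_crop` is a hypothetical callable standing in for whatever launches your crop render and returns its pixel values. The `fake_render` function is made-up toy data, used only so the example runs on its own.

```python
def mean_abs_diff(image_a, image_b):
    """Average absolute per-pixel difference between two flat lists of values."""
    return sum(abs(a - b) for a, b in zip(image_a, image_b)) / len(image_a)

def pick_min_samples(render_crop, start=16, step=16, limit=256, tolerance=0.001):
    """Raise minSamples in steps until two subsequent renders barely differ."""
    previous_value = start
    previous_image = render_crop(previous_value)
    current = start + step
    while current <= limit:
        image = render_crop(current)
        if mean_abs_diff(previous_image, image) <= tolerance:
            return previous_value  # keep the lower of the two similar renders
        previous_value, previous_image = current, image
        current += step
    return previous_value

def fake_render(min_samples):
    """Toy stand-in: the crop stops changing once minSamples reaches 48."""
    error = max(0, 48 - min_samples) * 0.01
    return [0.5 + error] * 4

print(pick_min_samples(fake_render))
```

A numeric threshold is convenient for automation, but flipping between the two renders by eye remains the more reliable check for subtle differences.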
3.2 maxSamples demystified
Wow, who'd have thought there's so much to say about a setting like minSamples? Fortunately, maxSamples is much easier to explain.
Picking up the cleaning example from above again, imagine you find a spot of dirt and no matter what you throw at it, you can't get rid of it. We've decided to only stop cleaning once we've reached the level of cleanliness we set ourselves as a goal. If it's a really deep-seated stain though, we might end up cleaning for days. Well, not really: at some point, we reach the limit of our patience and just accept defeat.
Since RenderMan, or a computer in general, is infinitely patient, it would happily keep working on an extremely difficult area for days, so we need to somehow tell the renderer when to stop. For this, we can set the maximum number of samples allowed on a pixel.
As you can see, this is not really a quality control, it's much more like a kill switch. We just set maxSamples to a value which represents the maximum amount of samples we'd ever expect to be necessary to reach the desired quality in normal circumstances.
Should there be a pixel which hasn't reached our desired quality by then, there's not much reason to invest further anyway.
In order to capture all the tiny dust particles, to avoid flickering in animation and to reach the target quality, both minSamples and maxSamples have to be set correctly.
3.2.1 What could go wrong?
There's however no easy way to know how many samples you might need to reach the desired quality. If we're setting maxSamples too low, we disturb adaptive sampling and it might not be able to reach the quality we've defined with PixelVariance. Again, less experienced users might try lowering PixelVariance further in an attempt to reduce noise but this will only make already clean areas even cleaner while leaving the noisy parts pretty much untouched. So the danger of setting maxSamples too low is that it prevents RIS from delivering the target quality. A typical example is that the value chosen for maxSamples works fine for a still image but not when Motion Blur is turned on. It's best to tweak maxSamples in the same setup as your final render, so with Motion Blur enabled, if that's what you're planning to use.
When setting maxSamples too high on the other hand, the only problem is that some tiny areas might continue rendering for a long time, trying to reach the target quality but taking way too many samples in the process. This however is quite rare and usually doesn't cost much so it's usually best to keep maxSamples high so RenderMan can deliver the quality we ask for.
My recommendation again is to try different values and settle on the one which works best. Personally, I've found 512 to work well in most cases but it's perfectly normal to use 1024 or even higher for more difficult renders. The important thing is to understand what setting needs adjusting when you run into issues instead of just desperately lowering the PixelVariance (or even relativePixelVariance!) which might just make things worse.
3.2.2 The SampleCount output
Example of the sampleCount output. Bright areas indicate where many samples have been used.
RenderMan supports some built-in outputs we can render alongside our other images and the sampleCount is an extremely useful one. The value of each pixel is determined by the number of samples used on it. So a pixel that required 32 samples will show a value of 32 in the sampleCount output. This means that you'll have to expose the image down quite heavily to see anything but white and that you might need to check the value of individual pixels, for example by using the Pixel Readout tool in 'it' (RenderMan's Image Tool).
What makes this so useful is that you can easily check where you're shooting most of your samples and if it reaches your maxSamples limit in any area. Also, if there is a noisy spot which shows a low value in your sampleCount output, it's a good indication that you need to increase your minSamples.
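Besides inspecting the output visually, the same questions can be asked numerically. The sketch below assumes the sampleCount values have already been loaded into a 2D list (reading the image file is outside the scope of this example) and uses a tiny made-up array. The function name and the returned keys are my own.

```python
def analyze_sample_count(sample_count, min_samples, max_samples):
    """Summarize a sampleCount image given as a 2D list of per-pixel sample counts."""
    pixels = [value for row in sample_count for value in row]
    total = len(pixels)
    return {
        # Share of pixels which ran into the maxSamples limit.
        "at_max": sum(1 for v in pixels if v >= max_samples) / total,
        # Share of pixels which stopped as early as they were allowed to.
        "at_min": sum(1 for v in pixels if v <= min_samples) / total,
        "mean": sum(pixels) / total,
    }

# Hypothetical 2x4 sampleCount image.
sample_count = [
    [16, 16, 32, 1024],
    [16, 64, 1024, 128],
]
summary = analyze_sample_count(sample_count, min_samples=16, max_samples=1024)
print(summary)
```

A large share of pixels at the maxSamples limit suggests the limit is cutting adaptive sampling short, while noisy regions that sit at the minSamples value point towards raising minSamples.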
4 Optimizing the 'Still Life' scene
The scene we're going to use in this lesson
To show some techniques in a more practical way, we'll look at an actual example scene. This scene was originally built by Dylan Sisson, but has been updated and adjusted for this lesson to better illustrate how optimizations can be applied.
This scene is simple enough to be able to discuss the problems in isolation and yet, it contains enough complexity to show the usual issues that might be lurking in your renders as well.
4.1 Techniques to identify problems
When someone's sick and goes to see a doctor, the doctor will perform some basic tests to determine the patient's physical condition. The doctor will investigate the symptoms described by the patient and might, among other things, check breathing and blood pressure.
We proceed similarly when debugging and optimizing renders. We analyze the artifacts. Then we check how the render performs in a controlled condition and use various tools to check how the different subsystems of the rendering process perform.
There are some simple techniques which provide us with a lot of information about our render and allow us to track down issues and improve performance. In this lesson, we're going to look at the three tools I'm using most: XML statistics, LPE- & Data-AOVs and fixed sampling rendering.
XML statistics: This raw data needs interpretation and experimentation to optimize our renders. We might look at it and find out that we're spending a lot of time tracing transmission (shadow) rays. But this could be because we're shooting a lot of them or because they're expensive, or both. The statistics contain information about how many rays have been shot as well as time spent in tracing rays and executing shaders.
LPE- & Data-AOVs: LPE stands for Light Path Expression and is a language based on regular expressions which allows us to extract very specific parts from a render. With LPEs, we can output additional images alongside our beauty render, which only contain paths which match our expression. This allows us, amongst other things, to render additional images which contain just the illumination from a particular light, the reflection of a certain object or to extract caustics. Outputting LPEs comes pretty much for free (they still cost memory and there is a small overhead) so it's about the same cost as our usual render.
These additional images rendered alongside your beauty image are called arbitrary output variables, or AOVs. LPEs can be written out in AOVs but we can also export other data. For example, we can save out the normals, textures and uv information to separate images, making it easier to check if there are issues coming from our geometry.
Fixed sampling: Adaptive sampling evens out the noise across the image, which hides where the difficult areas are. By rendering with a fixed number of samples however, areas with higher variance become much more obvious. A low number of samples is often enough during debugging, which gives us a way to render quickly in a controlled environment. If we apply a good optimization, we'll directly see it in the improved render time or in the reduction in noise. This would be much harder to measure with adaptive sampling.
Fixed sampling is used when minSamples and maxSamples are both set to the same value. Setting them both to 16 or 32 is common during debugging as it gives quick results and the noise in various areas can be compared more easily.
4.2 Analyzing the Still Life scene
Finally, let's get started on that Still Life scene! First, we launch a render with the sampleCount and cpuTime AOV added in our pass settings and with XML statistics enabled. On my machine, the render came back after 53 minutes:
The render before any optimizations. The image took 53.3 minutes to render on a notebook (i7-4700MQ @ 2.4GHz)
Ok, that's not too bad, right? There are no obvious artifacts. But let's take a bit of a closer look at the area above the candles:
There's some high variance noise on the wall, even with our final quality settings (brightness increased on blow-up).
This is high variance noise! Even 64 minSamples weren't enough to get the adaptive sampling to clean that up and if we go higher, we'd end up with much higher render times. We'll have to find a way to fix this!
Before we start, let's have a look at the sampleCount AOV. This output benefits from some remapping to better visualize it as I'm doing here:
Remapped sampleCount AOV. Hot colors indicate areas which took many samples while blue areas represent areas which converged after just a few samples
The red areas in the sampleCount AOV indicate that we're hitting the maxSamples limit, which is 1024 in this case. We can see these reds on the candles, on the porcelain and on materials reflecting these objects. We can also see that the fur on the wall, the pears and even the barrel needed quite a large amount of samples.
Let's switch to fixed sampling by setting minSamples and maxSamples to 16. This will uncover the problems adaptive sampling has been hiding and gives us reasonably fast feedback.
4.3 Orange fireflies
The orange fireflies on the wall are very obvious when rendering with a low, fixed number of samples, so let's start with that:
The fireflies on the wall are clearly visible with fixed sampling
Looking through our LPEs that get rendered alongside our beauty, we can see that the artifact is only present in the candle lights and only in their indirect diffuse contribution. This means the indirect diffuse rays coming from the wall have a hard time hitting the bright tops of our candles. The candles are relatively thin and brightly illuminated by the flames, which makes things even worse.
The chance for an indirect diffuse ray from the wall to hit the small, bright tips of the candles is very small which leads to fireflies.
The VCM (vertex connection and merging) integrator performs much better in these situations. However, the additional cost of tracing photons and added noise from enabling caustics might not be worth the benefits in our case. What else can we do? Looking at our AOVs again, we can see that, apart from the fireflies, there's not really much indirect contribution on our wall anyway. By adding the maxdiffusedepth and maxspeculardepth attributes to our wall geometry, we can control at what point to stop our indirect bounces on the wall. By setting both to '0', we can disable the indirect contribution for our background wall, which successfully removes the fireflies:
Limiting the indirect on the wall removed the fireflies
If you can't afford to remove the indirect completely, you could render the candles (wax) in a separate pass. This way, you can turn the indirect off on the candles when you render the scene to remove the fireflies and turn it back on when you render the candles.
Rendering indirect illumination in a path tracer is not expensive, it's actually very fast. However, if there is a lot of noise in the indirect illumination, we can improve performance by removing the source of the noise. In this example, this was done by limiting the indirect on the wall.
4.4 Use subsurface scattering where it counts!
When we analyzed the sampleCount AOV, we noticed that the barrel, porcelain and pears took quite a few samples. Looking at these materials, we see that all of them are using subsurface scattering. Let's switch to diffuse on all of them to see what difference it makes:
In this image, subsurface scattering (SSS) has been replaced with diffuse on the candle, pear, barrel and porcelain.
Subsurface scattering (SSS) introduces quite a bit of additional noise but it's also adding realism to many materials. The render without SSS has significantly less noise but the candles and pears lost a major part of their look. The barrel and porcelain on the other hand look better without it so we can leave it off for these materials and benefit greatly from the reduced noise.
4.5 Analyzing the XML statistics
Even when you're doing preview renders from within Maya, XML statistics are being written. So let's have a look at one of those using fixed sampling (they are best viewed in your web browser):
This is the top part from our overview section. Here, we can see some general performance information of our render. Under Performance, we can see the render time, memory usage and CPU utilization (how well we kept RenderMan busy). We can't really judge utilization on these quick preview renders so we can ignore that.
In the Time column, we can see a rough split of where our render time was spent. Shading seems to be our biggest offender which is surprising, considering that this scene doesn't have expensive and complex shading networks. Unaccounted includes any subsystems which aren't explicitly timed. By increasing the statistics level in the render globals, additional timers can be enabled which can give more information but it can also slow down our render.
The Render time heatmap gives a visual indication of where we've spent most time. This heatmap is normalized based on this specific render so you can't compare heatmaps from different renders! Don't be tempted to think that we care about the small red areas. While it would be good to find a way to make these areas faster, they're small and the overall impact on performance would be minimal.
A heatmap from a render with fixed sampling is also different from one using adaptive sampling. With fixed sampling, we know that every pixel got the same number of samples so if one area is more costly than another, we know that this is because the evaluation of samples is more expensive there. With a render using adaptive sampling, hot areas in the heatmap could be due to costly rays or because it took many samples to reach the Pixel Variance. Just keep these things in mind when analyzing heatmaps.
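Here is a small Python sketch of why normalized heatmaps can't be compared between renders. It assumes the heatmap simply divides each pixel's time by the slowest pixel of the same render. I'm not claiming this is the exact normalization RenderMan uses, and the per-pixel times are made up for illustration.

```python
def normalize_heatmap(pixel_times):
    """Scale per-pixel render times to the 0..1 range of this one render.

    Assumption: normalization divides by the slowest pixel of the same render.
    """
    slowest = max(max(row) for row in pixel_times)
    return [[t / slowest for t in row] for row in pixel_times]

# Hypothetical per-pixel times in milliseconds for two renders of the same view.
slow_render = [[2.0, 4.0], [8.0, 16.0]]
fast_render = [[0.5, 1.0], [2.0, 4.0]]  # four times faster everywhere

print(normalize_heatmap(slow_render))
print(normalize_heatmap(fast_render))
```

Both calls print the same values even though one render is four times faster, so the colors only tell us where time went within a single render.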
Looking at this picture, I'm wondering why pixels on the wall took longer to render than the area on the barrel or on the lit part of the teapot? I'd expect this area to be among the easiest to render as the material on the wall is rather simple.
Let's switch to the Shading tab and have a look there:
In the left column, we see all the Bxdfs (Surface shaders) listed. For each of them, we see the surface execution time, the interior (volume) execution time and the time spent in opacity (& presence) calculations. At the bottom, we can see the total time spent in each of these stages. Volume and Opacity don't seem to be a bottleneck in this render. Most of the time is spent in Bxdf Exec which includes the computation of our patterns.
In the right column, there is a list of all the Patterns used in this render. I've sorted it by clicking on Compute Time. What's definitely sticking out here is that PxrVoronoise took nearly 75% of our total shading time even though it's only been used in 1% of the shading points!
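To put these two numbers into perspective, here is a rough back-of-the-envelope calculation in Python. It assumes both percentages refer to the same render totals, which is a simplification, and the function name is hypothetical.

```python
def relative_cost_per_point(time_share, point_share):
    """How many times more expensive one shading point of a pattern is
    compared to the average shading point of everything else.

    Both arguments are fractions of the render totals (0..1).
    """
    pattern_cost = time_share / point_share
    rest_cost = (1.0 - time_share) / (1.0 - point_share)
    return pattern_cost / rest_cost

# 75% of the shading time spent on 1% of the shading points.
ratio = relative_cost_per_point(time_share=0.75, point_share=0.01)
print(f"roughly {ratio:.0f}x the cost of an average shading point")
```

Under these assumptions, a shading point that evaluates this pattern is a few hundred times more expensive than the average of all the others.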
Somehow, this pattern is costing us greatly in our render so let's find out where it's being used. We already know that the wall took longer to render than we'd expect, so let's start with the material assigned to it. And indeed, the wall_mtl material is using a PxrVoronoise to add variation to its diffuse color:
A closer look at the parameters of this node shows that it's set to use 32 octaves. This parameter controls how many iterations the shader goes through to calculate its result and 32 is a lot! Most noise patterns can be clever and stop going further when additional iterations wouldn't add new details. Let's swap it out with a noise pattern which can do that, a PxrFractal in our case, and apply the same parameters to it (frequency, octaves/layers and lacunarity). Here are the new statistics after that simple change:
Wow! We nearly halved our render time with this simple change! The heatmap now also indicates that the wall is the cheapest area to render, as we'd expect.
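The idea of a pattern stopping early can be sketched in a few lines of Python. This is only an illustration of the principle and not how PxrVoronoise or PxrFractal are implemented. The cheap_noise stand-in, the footprint parameter and the gain value are all assumptions of mine.

```python
import math

def cheap_noise(x):
    """Stand-in for a real noise function; returns a value in [-1, 1]."""
    return math.sin(x * 12.9898)

def fractal(x, octaves, lacunarity=2.0, gain=0.5, footprint=0.0):
    """Sum up to `octaves` layers of noise, each at a higher frequency.

    `footprint` is the assumed size of the shaded area in noise space.
    Once a layer's wavelength drops below it, further layers would only
    add detail smaller than the pixel, so the loop stops early.
    Returns (value, layers_evaluated).
    """
    value, frequency, amplitude = 0.0, 1.0, 1.0
    layers = 0
    for _ in range(octaves):
        if footprint > 0.0 and 1.0 / frequency < footprint:
            break
        value += amplitude * cheap_noise(x * frequency)
        frequency *= lacunarity
        amplitude *= gain
        layers += 1
    return value, layers

_, brute_force = fractal(0.37, octaves=32)
_, clamped = fractal(0.37, octaves=32, footprint=0.01)
print(brute_force, clamped)
```

With the same 32 octaves requested, the version that knows about the footprint only evaluates the layers which can actually contribute visible detail.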
We can't always replace an expensive pattern, as there might not be an alternative. Instead of replacing the PxrVoronoise node, we could have simply lowered the number of octaves to a more reasonable level. Since RenderMan 21, we can also bake patterns into textures using the new baking workflow. With this, we can bake whole networks of expensive patterns and read them back in as a texture.
4.6 Limiting multiple scattering on fur
Tracing fur is expensive, not just because it's a complex geometry, but also because shading hair is costly. Every additional bounce potentially adds noise or even fireflies. Our render time heatmap already shows the fur to be a relatively costly area and there's definitely some sparkling, or rather high variance noise, in its indirect specular which is going to take many samples to clean up.
Let's add the maxspeculardepth attribute to our hair material and set it to 1. This will make things a bit darker since we're losing the additional bounces but we can counteract that by brightening up the diffuse and specular a bit:
By limiting the number of bounces traced on fur, we can reduce both noise and render time.
While this doesn't change much on the render time when using fixed sampling, we've reduced the noise in the indirect greatly on the fur and that's going to pay off when we're using adaptive sampling for our final render!
With the changes so far, we've reduced our render time from 53 minutes down to 31. That's nearly a 35% speedup! But it's not just about the reduced render time, it's also about having removed high variance noise and fireflies in several areas. Our render is now faster and gives us better quality at the same time!
4.7 Setting light, bxdf and indirect samples manually
What else can we do? One very popular technique is to manually balance the sample distribution by changing numLightSamples, numBxdfSamples and numIndirectSamples. Doing this hurts performance more often than not and it can create very fragile setups which vary drastically in render time at the slightest change. Still, it's important to understand how it works so you can decide for yourself if this is something your scene would benefit from. Here's an illustration of what these settings represent:
Illustration of sample distributions.
As you can see, these settings only change what's happening at a camera hit and do not affect any bounce after that. This is because we'd quickly end up with what's called a ray explosion where each additional bounce increases the number of rays exponentially. For example, if every ray hit would spawn 2 bxdf, 2 light and 2 indirect samples, we'd have traced 1 + 6 + 7 * (2 + 2) = 35 rays for a single camera ray going through 2 bounces; for 3 bounces it would be 67 and after 4 bounces we'd trace 131 rays. Clearly, this won't work for scenes which need many camera samples and trace multiple bounces.
• numLightSamples: defines how many samples are being generated on lights
• numBxdfSamples: sets the number of samples to be generated by the material
• numIndirectSamples: sets the number of samples the material should generate for tracing indirect illumination
Increasing these settings can make certain noise clean up faster but, at the same time, each camera ray becomes more expensive. This means that if we cleverly balance the sample settings, we can speed up our render as long as the additional cost of each camera ray doesn't outweigh its improved sampling.
Again, we can use fixed sampling to help us set these controls correctly. When we look at the direct and indirect LPEs, we'd like to see an overall similar amount of noise. If there are small areas with stronger noise in the indirect, that's fine as adaptive sampling will take care of that. However, if the indirect is, in large parts of the render, more noisy than the direct, it can help to increase the numIndirectSamples setting.
If we're not rendering fur or other geometry with fine details, we might not need many camera samples as most of the noise is in the lighting. In these cases, it can help to increase all the sample settings and reduce minSamples by the same factor. So, for example, we could set numLightSamples=4, numBxdfSamples=4, numIndirectSamples=4 and reduce minSamples from 64 to 16. This way, instead of having to shoot 8 camera rays to sample the indirect 8 times, we only need to shoot 2 camera rays. If the shading is costly to evaluate or the geometry is expensive to trace, this can help to reduce render time. But keep in mind that lowering minSamples can introduce issues as discussed above.
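The trade between camera samples and per-hit samples is simple arithmetic, sketched here in Python. The helper names are hypothetical, and the sketch only counts samples taken at camera hits, starting from per-hit settings of 1. It says nothing about how expensive each sample is.

```python
def indirect_samples_per_pixel(camera_samples, num_indirect_samples):
    """Indirect samples taken at camera hits for one pixel."""
    return camera_samples * num_indirect_samples

def rebalance(min_samples, factor):
    """Raise the per-hit sample settings from 1 to `factor` and lower
    minSamples by the same factor.

    Returns (new_min_samples, per_hit_samples).
    """
    if min_samples % factor != 0:
        raise ValueError("factor must divide min_samples evenly")
    return min_samples // factor, factor

new_min, per_hit = rebalance(min_samples=64, factor=4)
print(new_min, per_hit)

# The indirect is sampled equally often, but with fewer camera rays.
print(indirect_samples_per_pixel(64, 1), indirect_samples_per_pixel(new_min, per_hit))
print(indirect_samples_per_pixel(8, 1), indirect_samples_per_pixel(2, 4))
```

The number of indirect samples per pixel stays the same in both setups. What changes is how many camera rays, and with them how many evaluations of the first hit, we pay for.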
4.7.1 A word of caution!
The reason why all these settings are set to 1 by default is to keep each ray as cheap as possible. With this, we can leave it to adaptive sampling to keep on refining the areas which need it. If we start increasing the sample settings, the cost of each ray increases and we might lose some of the benefits of adaptive sampling. Our changes might help in one area but could be an overhead in another as we've made each sample more costly.
Especially when rendering animations, the sources of noise are constantly changing and it's rarely possible to find good settings for the whole range of frames. One would need to animate them for the best performance, which would be extremely tedious. Even worse, a small change to the scene could turn these optimized settings into extremely bad ones.
I've seen many users use values even as high as 8 for light, bxdf and indirect samples, claiming that this is giving them significant performance improvements. I'm sure that using these settings on our unoptimized scene would have shown improvements but that's simply because we had an extremely costly shading network on the wall and the reduced number of camera rays would save us shading time. With our optimized scene however, render time would increase with these settings, especially in areas which require a high number of camera rays, be it fur, materials with sharp bump mapping, small highlights, motion blurred and out of focus areas.
People using such high sample settings have to reduce minSamples to avoid extremely long render times. As discussed before however, too low values for minSamples will prevent adaptive sampling from doing its job and we might end up with noise, flickering, lost highlights and overall inconsistent quality.
This is why I recommend keeping the numLightSamples and numBxdfSamples settings at 1 in nearly all cases. Increasing them to 2 can make sense when you have large shading networks with many textures. If you have extremely costly shading networks and you see improved performance with 4 or even 8 samples, I recommend investigating the source of the cost and reducing it (maybe through baking or by replacing the expensive nodes with cheaper alternatives).
Increasing numIndirectSamples on the other hand can be useful, especially in interiors where indirect is a major part of the overall lighting. I wouldn't go higher than 2 but this depends on your scenes.
As this optimization is so sensitive to scene changes, I strongly recommend tweaking these settings as the very last step and considering leaving them at their defaults. A small change to the camera or lighting could turn an optimized setup into a rendering nightmare.
For the Still Life scene, I decided against increasing the sample settings as I'm not getting any benefits out of it. Even if you're seeing small performance improvements in your scenes, always weigh them against the increased fragility of such a setup.
4.8 Denoising
We haven't discussed denoising yet even though this might be the first thing many people suggest when it comes to reducing render time. The reason for this is that denoising should not be your first tool for optimization but instead, be what you use on a well performing render. We've seen at the very beginning of this lesson that removing all the noise in a render can take an incredible amount of time. Reaching a reasonable level of noise on the other hand is quite quick. The denoiser's job is to take these relatively clean renders and remove the last bit of noise.
If you're adding denoising to your render too early, it might be covering up some severe issues in your render which could be responsible for denoising artifacts. By spending some time using the techniques described above, we can remove fireflies and other issues from our renders which greatly improves our denoising results.
The more noise there is left in our renders, the higher is the risk of overblurring fine details, especially in the highlights:
Applying the denoiser on a very noisy render can result in blurred details (circled areas) compared to a render using a lower PixelVariance.
While the denoiser can achieve remarkable results even with very noisy renders, high-quality results are best achieved with good-quality renders.
4.9 What if that's not enough?
Optimizing and debugging is an iterative process. When we've applied a change, we have to go back and look at the statistics again and check our render. When fixing an issue, it might uncover some other problems which hurt performance.
Sometimes, even after going through the setup over and over, you might not be able to achieve the render times you need to fit into your render budget. In these cases, you'll have to be more aggressive with your optimizations which can also include sacrificing quality. Here are a few ideas of what else we could do:
Denoising with higher PixelVariance: Thanks to the denoiser, we can get a clean result even if we'd increase our PixelVariance from 0.01 to 0.02, giving quite a significant performance boost. The downside is that this will slightly blur some of our sharp highlights but this can be hard to notice when we're not directly comparing it to the higher quality render.
Limiting bounces for subsurface scattering: At the very beginning of the optimizations, we've discussed the cost of subsurface scattering and we've switched to a diffuse material for the barrel and porcelain material. Another way to cut the cost of SSS is to avoid tracing multiple bounces by setting the maxdiffusedepth on materials which don't necessarily need it.
Rough reflections of sharp highlights: A caustic, in simple terms, is specular light illuminating a diffuse surface, for example light being refracted in the wineglass and shining onto our barrel. Path tracers have trouble sampling caustics efficiently which is why they're disabled by default and the shader usually fakes them by using a feature called thin shadows. However, we can create situations which are similarly difficult to sample by replacing the diffuse with a rough specular, as is the case on the metal rings around our barrel. Fortunately, adaptive sampling can clean it up eventually but it takes a lot of samples. We could completely remove that by setting the maxspeculardepth to 0 on the barrel. This will result in better performance at the cost of visual complexity.
Tracing and shading hair: The fur rug in the Still Life scene is costing us quite a lot due to its dense geometry and complex shading. We can improve performance further by simplifying the shading, either by reducing the bounces even more or by only using the diffuse lobe on the PxrMarschnerHair shader. We could also reduce the density of the hair while compensating for it with increased width of the individual hairs. These are all optimizations which sacrifice the look for performance and have to be carefully considered for each case.
Removing dark refractions: When rendering dark glass, like the one on the wine bottles, the shader will sample both the refraction and reflection by estimating the relative importance of one over the other. Since the shader can't know that the wall behind it is mostly in shadow, it tends to oversample the refraction, leaving the highlights in the reflections quite noisy for many iterations. If we don't need the refractions in the wine bottles, we can set the refractionGain on the glass layer to 0. With this change, the wine bottles will be clean after just a few iterations, saving additional render time. On most of the bottles, this change will be hard to notice and it might even just look like they're filled with wine, so you could use that trick at least on the bottles on screen left.
Tessellation for subdivision, NURBS and displaced surfaces: Those of us who've been working with RenderMan for a while use subdivision, NURBS (non-uniform rational B-spline) and displaced surfaces without thinking because they're so easy to use and extremely useful. Depending on how close we are to a certain object, RenderMan will subdivide the mesh until it looks perfectly smooth and we can even throw in some displacements if we like. Since we're raytracing more and more and especially with the switch to RIS, we have to be a bit more conscious about where we use additional tessellation as it's costing memory and can slow down raytracing. If you look at our statistics, our roughly 50'000 polygons get tessellated into more than 10 million! This isn't really a problem, RenderMan can handle much more than that. But sometimes, using a reasonably dense poly-mesh, maybe with the rough displacement baked in, can look just as good and it might save memory and improve raytracing performance. Don't go and convert everything to poly now, just be aware that every time you add a subdivision scheme or add displacement, that object will be tessellated into potentially subpixel polygons.
We've looked at some techniques which can help tremendously to find and remove issues in your renders. This allows you to improve your render time, visual quality and denoising results. At the same time, using these techniques makes you more sensitive to potential issues and will have a positive impact on how you approach your setups.
In the Still Life scene, we've been able to drop render times by approximately 35% while improving the quality at the same time. We've also looked at some additional steps we could take to push down the render time even more. I'm sure that, by using the discussed methods, you'll be able to find issues in your scenes and achieve similar improvements if not more!
Thanks go to Chu Tang, Leif Pedersen and Greg Shirah for proofreading and their helpful suggestions.
I'd also like to thank Dylan Sisson for making these lessons possible and for standing by my side during the project.