Critical Rendering Path for Technical SEO

Arto Kylmanen

Oct. 30, 2023


The performance buzzword of the day is the Critical Rendering Path (CRP). It's one of the most important aspects to consider when trying to build a fast website. There is a plethora of methods, mostly in the form of plugins, for optimizing the CRP, but its core concepts are not well understood. In an era where user experience matters more than ever, that's a problem. From the perspective of a Technical SEO, understanding these issues in depth is essential for effective analysis and optimization.

Let's get started.

What is the rendering path?

The rendering path is the sequence of steps a browser takes to transform web code into visual, interactive content. The Critical Rendering Path focuses on the components that matter most for delivering visuals to the user as fast as possible.

Step 1: Get HTML

The initial step involves the browser sending an HTTP GET request to the server to obtain the HTML document. This HTML serves as the foundational structure for rendering the webpage.

Effective metrics: DNS Lookup Time, TTFB

Challenges: DNS and server issues
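As a sketch of how TTFB can be derived: in a browser the timing entry would come from performance.getEntriesByType('navigation')[0]; the plain object below is a hypothetical stand-in with the same field names.

```javascript
// Sketch: compute TTFB from a Navigation Timing entry. In a browser
// the entry would come from performance.getEntriesByType('navigation')[0];
// the object below is a stand-in with the same field names.
function timeToFirstByte(entry) {
  // responseStart marks the first byte of the response; startTime is
  // the start of the navigation, so the difference includes DNS
  // lookup, connection setup and server think time.
  return entry.responseStart - entry.startTime;
}

const fakeEntry = { startTime: 0, responseStart: 220.5 };
console.log(timeToFirstByte(fakeEntry)); // 220.5 (ms)
```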

Step 2: Construct DOM Tree

After receiving the HTML, the browser parses it to construct the Document Object Model (DOM) tree. Each HTML element is converted into a corresponding DOM node, and the hierarchical relationships between these nodes are established based on the HTML markup.

Effective metrics: DOMContentLoaded

Challenges: Complexity and size of HTML, unnecessary nesting and elements.

Step 3: Get CSS and JS

While the DOM tree is being constructed, the browser fetches external CSS and JavaScript resources. These files are essential for styling and interactivity, and they are often fetched in parallel to expedite the rendering process.

Effective metrics: Resource Load Time, Request Count

Challenges: Render-blocking resources, possible 3rd-party reliance, poor usage of the async and defer attributes, lack of connection prewarming
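To make the last two challenges concrete, here is an illustrative head snippet (the URLs are hypothetical placeholders) showing connection prewarming plus non-blocking script loading:

```html
<!-- Prewarm third-party connections before resources are requested -->
<link rel="preconnect" href="https://fonts.example.com" crossorigin>
<link rel="dns-prefetch" href="https://cdn.example.com">

<!-- defer: download in parallel, execute after HTML parsing, in order -->
<script src="/js/app.js" defer></script>
<!-- async: download in parallel, execute as soon as ready (order not guaranteed) -->
<script src="/js/analytics.js" async></script>
```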

Step 4: Construct CSSOM Tree

The browser parses the fetched CSS to construct the CSS Object Model (CSSOM) tree. This tree represents the styling rules applicable to the DOM elements and is crucial for the subsequent construction of the render tree.

Effective Metric: Style Recalculation Time

Challenges: Unused CSS, large CSS files, duplicate CSS

Step 5: Construct Render Tree

The DOM and CSSOM trees are then combined to construct the render tree. This tree contains only the nodes that are visible to the user and includes both their DOM properties and calculated styles from the CSSOM.

Effective Metric: Render Tree Calculation Time

Challenges: Sum of the steps preceding

Step 6: Calculate Layout

Once the render tree is constructed, the browser calculates the layout. This involves determining the exact position and size of each element within the viewport. The layout is calculated relative to the render tree and results in a geometry for each visible element.

Effective Metric: Layout Duration

Challenges: Dynamic changes

Step 7: Painting

After layout calculation, the painting process begins. This step involves filling in pixels and includes applying text, colors, images, and other visual elements. Each node in the render tree is painted according to the calculated layout and styles.

Effective Metrics: Paint Duration, FP, FCP

Challenges: Complex visual elements

Step 8: Compositing

The painted elements are composited into layers. These layers are then assembled in the correct z-index order to form the final visual output. Compositing optimizes the rendering performance by allowing the browser to repaint only the modified portions of the page.

Effective Metrics: Frame Rate, TTI

Challenges: Excessive layers and complex layer interactions

Step 9: Display

Finally, the composited layers are displayed on the screen, completing the rendering process. At this point, the webpage is fully rendered and interactive, ready for user engagement.

Effective Metric: Layout Reflow

Challenges: Layout reflow triggers (various causes)

Layout reflows

Layout reflows, also known as layout recalculations, are essential for user-triggered actions like resizing windows or opening menus. They are triggered when changes are made to the DOM or CSSOM that affect the layout. This requires a recalculation of the render tree.

Performance bottleneck

While necessary for the page to function as it should, layout reflows can also become performance bottlenecks.

Render tree building is a synchronous process. This means that while a layout reflow is in progress, other work, such as JavaScript execution and user interactions, is put on hold. On top of that, the operation is computationally expensive, as it cascades through the layout, often causing multiple elements to be resized or repositioned.

Let's imagine a page with three <section>s, each containing three <div>s. Now we update the height of the second <div> in the first <section>. The elements below that div have to make space, so their positions must be recalculated as well. Thus, depending on the direction of the reflow, it can affect every sibling element further down the page DOM.

During the loading process, this can inflate FCP, LCP and TTI.

During runtime, this can inflate FID and INP.

Depending on the situation, layout reflows can also cause CLS on the page.

Example 1: Batch-less DOM Manipulation

Scenario

A JavaScript library updates the CSS of an element dynamically.

styles.js
// Update styles
document.getElementById('hero').style.margin = '5px'; // Add margin
document.getElementById('hero').style.fontSize = '1.25em'; // Set font size

What does the code above do?

The first line modifies the margin of the hero container. This can trigger a layout reflow, which means the second line is put on hold until the reflow completes. Only then does the second line run.

Now imagine that this happens hundreds of times. This is the reality for some unoptimized sites. There are several real-world scenarios where this can happen; legacy code and jQuery plugins are among the first that come to mind.
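To see why this matters, here's a toy model (not a real DOM) of how browsers behave: a style write invalidates layout, and a later geometry read forces a recalculation. The model counts those forced recalculations for interleaved versus batched operations:

```javascript
// Toy model (not a real DOM): counts how often layout must be
// recomputed when style writes are interleaved with geometry reads.
class FakeElement {
  constructor() {
    this.reflows = 0;
    this.dirty = false;
  }
  setStyle() {          // a style write invalidates the layout
    this.dirty = true;
  }
  get offsetHeight() {  // a geometry read forces a recalculation
    if (this.dirty) {
      this.reflows++;
      this.dirty = false;
    }
    return 100;
  }
}

// Interleaved: write, read, write, read -> 2 forced reflows
const interleaved = new FakeElement();
interleaved.setStyle(); interleaved.offsetHeight;
interleaved.setStyle(); interleaved.offsetHeight;
console.log(interleaved.reflows); // 2

// Batched: write, write, then a single read -> 1 forced reflow
const batched = new FakeElement();
batched.setStyle(); batched.setStyle();
batched.offsetHeight;
console.log(batched.reflows); // 1
```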

Solution

The solution is sort of in the name here: we should batch all these styles and inject them with a single line of code. In this case, we achieve that by using the style.cssText property.

Updated styles.js
// Batch the styles together
document.getElementById('hero').style.cssText = 'margin: 5px; font-size: 1.25em;';
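Another common way to batch is to declare the target styles in a CSS class up front and toggle it with a single DOM operation (the class name below is a hypothetical example):

```html
<style>
  /* all target styles declared once */
  .hero--compact { margin: 5px; font-size: 1.25em; }
</style>
<script>
  // one classList call = one style invalidation
  document.getElementById('hero').classList.add('hero--compact');
</script>
```

This keeps presentation in the stylesheet and JavaScript limited to a single toggle, which is often easier to maintain than inline cssText strings.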

Example 2: Forced Synchronous Layouts

Scenario

Site is building a table dynamically on front-end with React.

datatable.js
import { useEffect, useRef } from 'react';

const DataTable = () => {
  const tableRef = useRef(null);

  useEffect(() => {
    async function fetchData() {
      const data = await new Promise((resolve) => {
        setTimeout(() => {
          resolve([
              { "id": 1, "breed": "Siamese", "rating": 8.5 },
              { "id": 2, "breed": "Maine Coon", "rating": 9.0 },
              { "id": 3, "breed": "Persian", "rating": 7.5 },
              { "id": 4, "breed": "Sphynx", "rating": 8.0 },
              { "id": 5, "breed": "Bengal", "rating": 9.2 },
              { "id": 6, "breed": "Ragdoll", "rating": 8.8 },
              { "id": 7, "breed": "British Shorthair", "rating": 8.3 },
              { "id": 8, "breed": "Abyssinian", "rating": 8.7 },
              { "id": 9, "breed": "Scottish Fold", "rating": 7.9 },
              { "id": 10, "breed": "Burmese", "rating": 8.1 }          
          ]);
        }, 1000);
      });

      const table = tableRef.current;

      for (let i = 0; i < data.length; i++) {
        const row = table.insertRow(-1);
        const cell1 = row.insertCell(0);
        const cell2 = row.insertCell(1);
        const cell3 = row.insertCell(2);

        cell1.innerHTML = data[i].id;
        cell2.innerHTML = data[i].breed;
        cell3.innerHTML = data[i].rating;

        if (data[i].rating > 8) {
          cell3.style.backgroundColor = 'green';
        } else if (data[i].rating >= 6 && data[i].rating <= 8) {
          cell3.style.backgroundColor = 'yellow';
        } else {
          cell3.style.backgroundColor = 'red';
        }
      }
    }

    fetchData();
  }, []);

  return (
    <table ref={tableRef}>
      <thead>
        <tr>
          <th>ID</th>
          <th>Breed</th>
          <th>Rating</th>
        </tr>
      </thead>
      <tbody>
        {/* Rows will be inserted here */}
      </tbody>
    </table>
  );
};

export default DataTable;

So what happens here? We fetch some data in JSON format, assign it to a table, and then color the rating cell based on its value. From the perspective of the critical rendering path, the problematic part is the multiple DOM operations we perform for each row. The useEffect runs after the initial render, and from then on everything is written directly into the DOM, one operation at a time. From a CRP perspective, this is not an optimal implementation.

In terms of the risk of layout reflows (R), for this particular table we are looking at:


N = Number of items (iterables) in the JSON = 10

M = Amount of DOM manipulations per iteration = 4

R = Risk of layout reflows = N x M

R = 10 x 4 = 40

So, there is a possibility of up to 40 layout reflows in this script. Let's try and improve that.

Solution

Fixing this one is not as straightforward. There are a few fixes available, depending on the use case. Let's look at some possible alternatives.

Prerendering

This depends on the nature of the table. For a product listing or similar, prerendering can be an option. However, for tables that display near real-time data, such as stock prices or a live sports feed, constant re-rendering would be inefficient. Additionally, prerendering can be difficult to implement, depending on the current codebase and setup.

Refactor

Another option is to change the code logic. Let's take a look at a possible example here:

import { useEffect, useRef } from 'react';

const DataTable = () => {
  const tableRef = useRef(null);

  useEffect(() => {
    async function fetchData() {
      const data = await new Promise((resolve) => {
        setTimeout(() => {
          resolve([
              { "id": 1, "breed": "Siamese", "rating": 8.5 },
              { "id": 2, "breed": "Maine Coon", "rating": 9.0 },
              { "id": 3, "breed": "Persian", "rating": 7.5 },
              { "id": 4, "breed": "Sphynx", "rating": 8.0 },
              { "id": 5, "breed": "Bengal", "rating": 9.2 },
              { "id": 6, "breed": "Ragdoll", "rating": 8.8 },
              { "id": 7, "breed": "British Shorthair", "rating": 8.3 },
              { "id": 8, "breed": "Abyssinian", "rating": 8.7 },
              { "id": 9, "breed": "Scottish Fold", "rating": 7.9 },
              { "id": 10, "breed": "Burmese", "rating": 8.1 }   
          ]);
        }, 1000);
      });

      const table = tableRef.current;
      const tbody = document.createElement('tbody');

      data.forEach(item => {
        const row = document.createElement('tr');
        const cell1 = document.createElement('td');
        const cell2 = document.createElement('td');
        const cell3 = document.createElement('td');

        cell1.textContent = item.id;
        cell2.textContent = item.breed;
        cell3.textContent = item.rating;

        if (item.rating > 8) {
          cell3.style.backgroundColor = 'green';
        } else if (item.rating >= 6 && item.rating <= 8) {
          cell3.style.backgroundColor = 'yellow';
        } else {
          cell3.style.backgroundColor = 'red';
        }

        row.appendChild(cell1);
        row.appendChild(cell2);
        row.appendChild(cell3);
        tbody.appendChild(row);
      });

      table.appendChild(tbody);
    }

    fetchData();
  }, []);

  return (
    <table ref={tableRef}>
      <thead>
        <tr>
          <th>ID</th>
          <th>Breed</th>
          <th>Rating</th>
        </tr>
      </thead>
      {/* tbody will be inserted here */}
    </table>
  );
};

export default DataTable;

In this snippet, we build the <tbody> in memory and insert it into the table only when it's ready. This keeps the complexity down: because we perform only one DOM modification, we get a maximum of one layout reflow, regardless of the table size.

From the perspective of layout reflow risk, the performance gain is significant:

N = Number of DOM insertions (one tbody instead of per-row operations) = 1

M = 1

R = 1 x 1 = 1

There are three trade-offs to consider, however:

  1. Inserting the whole table at once can cause a layout shift
  2. It might increase the rendering time of the table
  3. If the table has a lot of rows, for example in the tens of thousands, writing them all into memory at once is not the best idea.

All these depend on the size of the table and should be considered when implementing such improvements in production. There are various ways to address each trade-off listed above.
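For the large-table trade-off, one possible mitigation is to build and insert rows in fixed-size chunks, one chunk per requestAnimationFrame callback, so each frame does a single DOM insertion. The splitting helper below is a sketch; chunkRows is my own name, not a standard API:

```javascript
// Split a row array into fixed-size chunks so each chunk can be
// appended in its own requestAnimationFrame callback. One DOM
// insertion per chunk keeps the reflow count near rows/chunkSize
// instead of one reflow per row.
function chunkRows(rows, chunkSize) {
  const chunks = [];
  for (let i = 0; i < rows.length; i += chunkSize) {
    chunks.push(rows.slice(i, i + chunkSize));
  }
  return chunks;
}

// e.g. 10 rows in chunks of 4 -> chunk sizes [4, 4, 2]
const chunks = chunkRows(Array.from({ length: 10 }, (_, i) => i), 4);
console.log(chunks.map(c => c.length)); // [ 4, 4, 2 ]
```

In the React example, each chunk would be turned into a small tbody fragment and appended inside its own animation frame, keeping the main thread responsive while the table fills in.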

Example 3: Non-Optimized CSS Selectors

Highly specific or complex CSS selectors can also make style recalculation, and thus reflows, more expensive. For example, deeply nested descendant selectors like body > div > ul > li > a make the browser work harder to resolve styling, adding unnecessary work to every recalculation.

Scenario

<body>
  <div class="container">
    <div class="wrapper">
      <ul class="list">
        <li class="item"><a href="#">Link 1</a></li>
        <li class="item"><a href="#">Link 2</a></li>
        <li class="item"><a href="#">Link 3</a></li>
      </ul>
    </div>
  </div>
<style>
body > div > div > ul > li > a {
  color: blue;
  text-decoration: none;
}

body > div > div > ul > li > a:hover {
  text-decoration: underline;
}
</style>
</body>

The problem with this is how the browser processes this data. Browsers use so-called "right-to-left" evaluation when matching selectors against the DOM: they start from the rightmost (key) selector and walk up the ancestors until they can confirm or reject a match. So in this case, the browser checks a, then li, then ul, then div, then div, then body. It has to repeat this for every single <a> (the key selector here) in the DOM.

To better understand how big of a bottleneck this can be, let's calculate the computational cost for a process given above:

N = Number of key selector elements = 3

M = Number of steps in selector = 6

R = N×M

R = 3 x 6 = 18

That gives us the answer: 18 individual matching steps (per rule) are required to resolve this.

Considering the above, it's easy to understand how this can contribute to performance bottlenecks in larger-scale applications.

That said, with modern website practices this is a less common problem in most cases.

The exact scenario above is highly exaggerated, but a similar pattern is present, on a smaller scale, on many sites.

Solution

The solution is quite straightforward. We replace the current <li> element with the following:

<li class="item"><a href="#" class="list-link">Link 1</a></li>

<style>
.list-link {
  color: blue;
  text-decoration: none;
}

.list-link:hover {
  text-decoration: underline;
}
</style>

By creating a class, the browser can confirm which elements the styling belongs to in a single step. This greatly reduces the amount of calculation needed for the CSSOM and, further down the line, the render tree.

Let's also calculate the computational cost for this one:

N = 3

M = 1

R = 3 x 1 = 3

So we gained a 6x speedup by creating classes and simplifying our selection method.
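The article's cost model can be sketched as a tiny function, using the N and M values from both selector variants above (selectorCost is my own name for the estimate, not a browser API):

```javascript
// Rough cost model from the article: R = N (key selector matches)
// x M (steps the browser walks per match, right to left).
const selectorCost = (keyElements, selectorSteps) =>
  keyElements * selectorSteps;

const nested = selectorCost(3, 6); // body > div > div > ul > li > a
const flat = selectorCost(3, 1);   // .list-link
console.log(nested, flat, nested / flat); // 18 3 6
```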

So, the rule of thumb when writing custom CSS: avoid deeply nested selectors whenever possible.

Compositing

Compositing is one of the last steps in the rendering pipeline. The process is similar to layering in graphic design software, where each layer can be manipulated independently and then combined to create the final image. The compositing stage is often hardware-accelerated, offloading work to the GPU to improve performance. Each data structure (layer/component) carries information about its pixels, layer index and other visual properties. The compositing algorithm takes these data structures as input and outputs a two-dimensional array of pixels that represents the final image to be painted on screen.

Even if layout reflows have been minimized through CRP optimization, poor decisions at the compositing stage can still create a bottleneck there.

Focus on best practices

The best way to attack this challenge is to keep a tight performance budget and stay mindful during development.

Here are some important things to keep in mind:

  1. Layer Budgeting: Be mindful of the number of layers you create. Each layer consumes memory and requires compositing.
  2. Optimize Paint Areas: Limit the areas of the screen that need to be repainted. Use the will-change CSS property to inform the browser of what kinds of changes to expect. Balance its use, as overusing it can actually increase compositing cost.
  3. GPU Acceleration: Offload some of the graphical work to the GPU by using translate3d, translateZ, or setting will-change: transform;. This can make compositing faster but should be used judiciously to avoid excessive GPU memory consumption.
  4. Batch Visual Changes: If you have to make multiple changes that will trigger compositing, try to batch these changes together in a single animation frame using requestAnimationFrame.
  5. Debounce and Throttle: For operations like scrolling or resizing that trigger frequent compositing, debounce or throttle the events to reduce the number of compositing operations.
  6. Conditional Rendering: In frameworks like React, use conditional rendering to remove elements from the DOM that are not currently needed, reducing the number of elements that need to be composited.
  7. Use Web Workers: Offload non-UI computations to Web Workers to keep the main thread free, ensuring that it can respond quickly to compositing requirements.
  8. Profile and Monitor: Use performance profiling tools to identify bottlenecks. The Performance tab in Chrome DevTools can be particularly useful for this.
  9. Code Splitting: In larger applications, use code splitting to only load the parts of the application that are currently needed, reducing the initial number of elements that need to be composited.
  10. Avoid Complex Visual Effects: Effects like CSS filters, blending, and complex box-shadow can be expensive to composite. Use them sparingly and test their performance impact.
  11. Test Across Devices: Different devices have different compositing capabilities. Always test performance on a range of devices, including those with lower-end hardware.
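As a minimal sketch of point 5, here is a throttle helper. The injectable `now` parameter is my own addition (not a standard API) so the logic can be verified without real timers; in production you would simply call throttle(fn, wait):

```javascript
// Throttle: run fn at most once per `wait` ms. The clock is
// injectable (defaults to Date.now) so the logic can be tested
// without real timers.
function throttle(fn, wait, now = Date.now) {
  let last = -Infinity;
  return (...args) => {
    const t = now();
    if (t - last >= wait) {
      last = t;
      fn(...args);
    }
  };
}

// Fake clock: scroll events arrive at t = 0, 50, 120, 130, 250 ms
let t = 0;
let calls = 0;
const onScroll = throttle(() => calls++, 100, () => t);
for (t of [0, 50, 120, 130, 250]) onScroll();
console.log(calls); // 3 (fires at t = 0, 120, 250)
```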

Know your users

Especially when it comes to compositing, hardware acceleration and GPU offloading, it's very important to consider which users you serve. Use analytics data to gain a better understanding of your audience's general hardware limitations.

Many performance issues I see in the wild stem from the fact that the product team and developers only test on high-end devices. If you serve emerging markets (think India), many users will not have the same hardware power to run your apps.

Summary

This article provided you with knowledge about the Critical Rendering Path and how neglecting it can degrade page performance significantly. But it still only scratches the surface.

When we take into account modern web frameworks and the highly complex applications many web products have become, you could write a book about the intricacies of CRP optimization.

Keep in mind that web performance is always the sum of the whole, not its individual parts. This article only touches on one part. Even with excellent CRP optimization (from the angle of this article, that is), if you're loading a massive JavaScript library or 3 MB images, many of these optimizations will be undermined.

Remember that computational cost can compound quickly, so even individually small performance issues can add up to a subpar user experience and failing Core Web Vitals.

Be mindful of this and make sure your developers are up-to-date in their CRP best practices. The ideal scenario is that these challenges are considered in development and never become a serious problem in production.

If you are having performance problems with your website or web application, it might require in-depth analysis. If you are having trouble with it, why not contact me to ask for some expert help?
