The performance buzzword of the day is Critical Rendering Path (CRP). It's one of the most important aspects to consider when trying to create a fast website. There is a plethora of methods, mainly in the form of plugins, to optimize the CRP, but its core concepts are not too well understood. And in an era where user experience matters more than ever, that's no good. From the perspective of a Technical SEO, understanding these issues in depth is essential for effective analysis and optimization.
Let's get started.
The rendering path is the sequence of steps a browser takes to transform web code into visual, interactive content. The Critical Rendering Path is the subset of those steps that matters most for delivering visuals to the user as fast as possible.
The initial step involves the browser sending an HTTP GET request to the server to obtain the HTML document. This HTML serves as the foundational structure for rendering the webpage.
Effective metrics: DNS Lookup time, TTFB
Challenges: DNS and Server issues
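As a quick way to check these numbers, here's a minimal sketch using the browser's Navigation Timing API; run it in the console of a loaded page.

// A minimal sketch: reading DNS lookup time and TTFB for the
// current page from the Navigation Timing API.
const [nav] = performance.getEntriesByType('navigation');

if (nav) {
  const dnsLookup = nav.domainLookupEnd - nav.domainLookupStart;
  const ttfb = nav.responseStart - nav.startTime; // time to first byte
  console.log(`DNS lookup: ${dnsLookup.toFixed(1)} ms`);
  console.log(`TTFB: ${ttfb.toFixed(1)} ms`);
}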
After receiving the HTML, the browser parses it to construct the Document Object Model (DOM) tree. Each HTML element is converted into a corresponding DOM node, and the hierarchical relationships between these nodes are established based on the HTML markup.
Effective metrics: DOMContentLoaded
Challenges: Complexity and size of HTML, unnecessary nesting and elements.
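To see when the DOM tree is ready on a given page, a small sketch like this can log the DOMContentLoaded timing:

// A minimal sketch: logging when DOM construction finished,
// relative to the start of navigation.
document.addEventListener('DOMContentLoaded', () => {
  const [nav] = performance.getEntriesByType('navigation');
  if (nav) {
    console.log(`DOMContentLoaded fired at ${nav.domContentLoadedEventStart.toFixed(1)} ms`);
  }
});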
While the DOM tree is being constructed, the browser fetches external CSS and JavaScript resources. These files are essential for styling and interactivity, and they are often fetched in parallel to expedite the rendering process.
Effective metrics: Resource Load Time, Request Count
Challenges: Render-blocking resources, possible third-party reliance, poor usage of the async and defer attributes, lack of connection prewarming
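To illustrate, here's a sketch of what non-blocking resource loading can look like in the document head; the file paths and CDN origin are placeholders:

<head>
  <!-- Prewarm the connection to a third-party origin -->
  <link rel="preconnect" href="https://cdn.example.com">

  <!-- async: fetched in parallel, executed as soon as it arrives -->
  <script async src="/js/analytics.js"></script>

  <!-- defer: fetched in parallel, executed after HTML parsing completes -->
  <script defer src="/js/app.js"></script>
</head>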
The browser parses the fetched CSS to construct the CSS Object Model (CSSOM) tree. This tree represents the styling rules applicable to the DOM elements and is crucial for the subsequent construction of the render tree.
Effective Metric: Style Recalculation Time
Challenges: Unused CSS, large CSS files, duplicate CSS
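One rough way to gauge how much the browser had to build here is to count the rules that ended up in the CSSOM; a sketch:

// A rough sketch: tallying how many rules the page's CSSOM holds.
let ruleCount = 0;
for (const sheet of document.styleSheets) {
  try {
    ruleCount += sheet.cssRules.length;
  } catch (e) {
    // Cross-origin stylesheets don't expose their rules.
  }
}
console.log(`CSSOM contains roughly ${ruleCount} rules`);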
The DOM and CSSOM trees are then combined to construct the render tree. This tree contains only the nodes that are visible to the user and includes both their DOM properties and calculated styles from the CSSOM.
Effective Metric: Render Tree Calculation Time
Challenges: Sum of the preceding steps
Once the render tree is constructed, the browser calculates the layout. This involves determining the exact position and size of each element within the viewport. The layout is calculated relative to the render tree and results in a geometry for each visible element.
Effective Metric: Layout Duration
Challenges: Dynamic changes
After layout calculation, the painting process begins. This step involves filling in pixels and includes applying text, colors, images, and other visual elements. Each node in the render tree is painted according to the calculated layout and styles.
Effective Metrics: Paint Duration, FP, FCP
Challenges: Complex visual elements
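First Paint and First Contentful Paint can be observed directly; a minimal sketch:

// A minimal sketch: observing First Paint and First Contentful Paint.
new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    console.log(`${entry.name}: ${entry.startTime.toFixed(1)} ms`);
  }
}).observe({ type: 'paint', buffered: true });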
The painted elements are composited into layers. These layers are then assembled in the correct z-index order to form the final visual output. Compositing optimizes the rendering performance by allowing the browser to repaint only the modified portions of the page.
Effective Metrics: Frame Rate, TTI
Challenges: Excessive layers and complex layer interactions
Finally, the composited layers are displayed on the screen, completing the rendering process. At this point, the webpage is fully rendered and interactive, ready for user engagement.
Effective Metric: Layout Reflow
Challenges: Layout reflow triggers (various causes)
Layout reflows, also known as layout recalculations, are essential for user-triggered actions like resizing windows or opening menus. They are triggered when changes are made to the DOM or CSSOM that affect the layout. This requires a recalculation of the render tree.
While essential for the page to function as it should, layout reflows can also become performance bottlenecks.
Render tree building is a synchronous process. This means that while a layout reflow is in action, other work, such as JavaScript execution and user interactions, is put on hold. Besides that, the operation is computationally expensive, as it cascades through the layout, often causing multiple elements to be resized or repositioned.
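To make this concrete, here's a minimal sketch of the classic read-after-write pattern that forces synchronous reflows; the .item selector is hypothetical:

// Interleaving style writes with layout reads forces the browser
// to recalculate layout on every iteration.
const items = document.querySelectorAll('.item'); // hypothetical elements

items.forEach((item) => {
  item.style.width = '50%';       // write: invalidates layout
  console.log(item.offsetHeight); // read: forces a synchronous reflow
});

// Better: batch all reads, then all writes, so layout settles once.
const heights = Array.from(items, (item) => item.offsetHeight);
console.log(heights);
items.forEach((item) => { item.style.width = '50%'; });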
Let's imagine a site where we have three <section>s, each containing three <div>s. Now we update the height of the second <div> in the first <section>. The elements below that div have to make space, so their new positions need to be recalculated as well. Thus, the reflow can affect every sibling element further down the page DOM, depending on the direction of the reflow.
During the loading process, this can inflate FCP, LCP, and TTI. During runtime, it can inflate FID and INP. Depending on the circumstances, layout reflows can also trigger CLS on the page.
A JavaScript library updates the CSS of an element dynamically.

// Update styles one at a time
document.getElementById('hero').style.margin = '5px';      // set margin
document.getElementById('hero').style.fontSize = '1.25em'; // set font size
What does the code above do?
The first line modifies the margin of the hero container, which invalidates the layout and can trigger a reflow. Only once that work is done does the second line run, invalidating the layout all over again. Every separate style write is another opportunity for the browser to redo this work.
Now imagine this happening hundreds of times. That is the reality for some unoptimized sites, and there are several real-world scenarios where it can happen; legacy code and jQuery are among the first things that come to mind.
The solution is sort of in the name here: we should batch all these style changes and apply them with a single line of code. In this case, we achieve this by using the style.cssText property.
// Batch the styles together
document.getElementById('hero').style.cssText = 'margin: 5px; font-size: 1.25em;';
The site builds a table dynamically on the front end with React.
import { useEffect, useRef } from 'react';

const DataTable = () => {
  const tableRef = useRef(null);

  useEffect(() => {
    async function fetchData() {
      // Simulate a network request that resolves after one second
      const data = await new Promise((resolve) => {
        setTimeout(() => {
          resolve([
            { "id": 1, "breed": "Siamese", "rating": 8.5 },
            { "id": 2, "breed": "Maine Coon", "rating": 9.0 },
            { "id": 3, "breed": "Persian", "rating": 7.5 },
            { "id": 4, "breed": "Sphynx", "rating": 8.0 },
            { "id": 5, "breed": "Bengal", "rating": 9.2 },
            { "id": 6, "breed": "Ragdoll", "rating": 8.8 },
            { "id": 7, "breed": "British Shorthair", "rating": 8.3 },
            { "id": 8, "breed": "Abyssinian", "rating": 8.7 },
            { "id": 9, "breed": "Scottish Fold", "rating": 7.9 },
            { "id": 10, "breed": "Burmese", "rating": 8.1 }
          ]);
        }, 1000);
      });

      const table = tableRef.current;

      // Each iteration mutates the live DOM directly
      for (let i = 0; i < data.length; i++) {
        const row = table.insertRow(-1);
        const cell1 = row.insertCell(0);
        const cell2 = row.insertCell(1);
        const cell3 = row.insertCell(2);
        cell1.innerHTML = data[i].id;
        cell2.innerHTML = data[i].breed;
        cell3.innerHTML = data[i].rating;

        if (data[i].rating > 8) {
          cell3.style.backgroundColor = 'green';
        } else if (data[i].rating >= 6 && data[i].rating <= 8) {
          cell3.style.backgroundColor = 'yellow';
        } else {
          cell3.style.backgroundColor = 'red';
        }
      }
    }
    fetchData();
  }, []);

  return (
    <table ref={tableRef}>
      <thead>
        <tr>
          <th>ID</th>
          <th>Breed</th>
          <th>Rating</th>
        </tr>
      </thead>
      <tbody>
        {/* Rows will be inserted here */}
      </tbody>
    </table>
  );
};

export default DataTable;
How about this? We fetch some data in JSON format, write it into a table, and then assign a color to the rating cell based on its value. From the perspective of the critical rendering path, the problematic part is the multiple DOM operations performed for each row. The useEffect runs after the initial render, and from then on everything is written directly into the live DOM. From a CRP standpoint, this is not an optimal implementation.
In terms of the risk of layout reflows (R), for this particular table we are looking at:
N = number of items (iterables) in the JSON = 10
M = number of DOM manipulations per iteration = 4
R = risk of layout reflows = N × M
R = 10 × 4 = 40
So there is a possibility of up to 40 layout reflows in this script. Let's try to improve that.
This one is not as straightforward to fix. There are a few possible fixes, depending on the use case. Let's look at some alternatives.
Prerendering
This depends on the nature of the table. For something like a product listing, prerendering can be an option. However, for tables that display near real-time data, such as stock prices or a live sports feed, constant re-rendering would be an inefficient solution. Additionally, prerendering can be difficult to implement, depending on the current codebase and setup.
Refactor
Another option is to change the code logic. Let's take a look at a possible example here:
import { useEffect, useRef } from 'react';

const DataTable = () => {
  const tableRef = useRef(null);

  useEffect(() => {
    async function fetchData() {
      // Simulate a network request that resolves after one second
      const data = await new Promise((resolve) => {
        setTimeout(() => {
          resolve([
            { "id": 1, "breed": "Siamese", "rating": 8.5 },
            { "id": 2, "breed": "Maine Coon", "rating": 9.0 },
            { "id": 3, "breed": "Persian", "rating": 7.5 },
            { "id": 4, "breed": "Sphynx", "rating": 8.0 },
            { "id": 5, "breed": "Bengal", "rating": 9.2 },
            { "id": 6, "breed": "Ragdoll", "rating": 8.8 },
            { "id": 7, "breed": "British Shorthair", "rating": 8.3 },
            { "id": 8, "breed": "Abyssinian", "rating": 8.7 },
            { "id": 9, "breed": "Scottish Fold", "rating": 7.9 },
            { "id": 10, "breed": "Burmese", "rating": 8.1 }
          ]);
        }, 1000);
      });

      const table = tableRef.current;

      // Build the whole <tbody> in memory first
      const tbody = document.createElement('tbody');

      data.forEach((item) => {
        const row = document.createElement('tr');
        const cell1 = document.createElement('td');
        const cell2 = document.createElement('td');
        const cell3 = document.createElement('td');
        cell1.textContent = item.id;
        cell2.textContent = item.breed;
        cell3.textContent = item.rating;

        if (item.rating > 8) {
          cell3.style.backgroundColor = 'green';
        } else if (item.rating >= 6 && item.rating <= 8) {
          cell3.style.backgroundColor = 'yellow';
        } else {
          cell3.style.backgroundColor = 'red';
        }

        row.appendChild(cell1);
        row.appendChild(cell2);
        row.appendChild(cell3);
        tbody.appendChild(row);
      });

      // Single DOM insertion: at most one reflow
      table.appendChild(tbody);
    }
    fetchData();
  }, []);

  return (
    <table ref={tableRef}>
      <thead>
        <tr>
          <th>ID</th>
          <th>Breed</th>
          <th>Rating</th>
        </tr>
      </thead>
      {/* tbody will be inserted here */}
    </table>
  );
};

export default DataTable;
In this snippet, we build the <tbody> in memory and then insert it into the table when it's ready. This significantly reduces the complexity: because we perform only one DOM modification, we will always have a maximum of one layout reflow, regardless of the table size.
From the perspective of layout reflow risk, the performance gain is significant:
N = one <tbody> insertion instead of individual rows = 1
M = 1
R = 1 × 1 = 1
There are trade-offs to consider, however. They depend largely on the size of the table and should be weighed when implementing such improvements in production; there are various ways to address each of them.
Highly specific or complex CSS selectors can also induce reflows. For example, using deeply nested descendant selectors like body > div > div > ul > li > a makes the browser work harder to resolve styling, triggering unnecessary recalculations.
<body>
  <div class="container">
    <div class="wrapper">
      <ul class="list">
        <li class="item"><a href="#">Link 1</a></li>
        <li class="item"><a href="#">Link 2</a></li>
        <li class="item"><a href="#">Link 3</a></li>
      </ul>
    </div>
  </div>
  <style>
    body > div > div > ul > li > a {
      color: blue;
      text-decoration: none;
    }
    body > div > div > ul > li > a:hover {
      text-decoration: underline;
    }
  </style>
</body>
The problem is how the browser processes this. Browsers use so-called right-to-left evaluation when matching style rules against the DOM: they start from the key selector (in this case, the <a>) and walk up through li, ul, div, div, and body until they can confirm or reject a match. The browser has to repeat this for every single <a> in the DOM.
To better understand how big of a bottleneck this can be, let's calculate the computational cost for the process given above:
N = number of key selector elements = 3
M = number of steps in the selector = 6
R = N × M
R = 3 × 6 = 18
This gives us the answer: 18 individual matching steps (per rule) are required to resolve this.
Considering the above, it's easy to understand how this can contribute to performance bottlenecks in larger-scale applications.
For most cases, though, this is a less common problem with modern development practices, and the exact scenario above is highly exaggerated. Still, the same concept is present on many sites at a smaller scale.
The solution is quite straightforward: we replace the current <li> element with the following.
<li class="item"><a href="#" class="list-link">Link 1</a></li>

<style>
  .list-link {
    color: blue;
    text-decoration: none;
  }
  .list-link:hover {
    text-decoration: underline;
  }
</style>
By creating a class, the browser can confirm where the styling belongs in a single step. This greatly reduces the work needed to build the CSSOM and, subsequently, the render tree.
Let's also calculate the computational cost for this one:
N = 3
M = 1
R = 3 × 1 = 3
So we gained a 6x improvement by creating a class and simplifying our selector.
So, a rule of thumb when writing custom CSS: avoid deeply nested selectors whenever possible.
Compositing is one of the last steps in the rendering pipeline. The process is similar to layering in graphic design software, where each layer can be manipulated independently and then combined to create the final image. The compositing stage is often hardware-accelerated, offloading the work to the GPU to improve performance. Each data structure (layer/component) holds information about its pixels, layer index, and other visual properties. The compositing algorithm takes these data structures as input and outputs a two-dimensional array of pixels that represents the final image to be painted on screen.
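As a conceptual sketch (not how a real browser compositor is implemented), the core idea looks roughly like this, assuming same-sized, aligned RGBA layers:

// A conceptual sketch of compositing: layers sorted by stacking
// order are blended, back to front, into one output pixel buffer.
function composite(layers, width, height) {
  const output = new Uint8ClampedArray(width * height * 4); // RGBA buffer
  const ordered = [...layers].sort((a, b) => a.zIndex - b.zIndex);

  for (const layer of ordered) {
    for (let i = 0; i < layer.pixels.length; i += 4) {
      const alpha = layer.pixels[i + 3] / 255;
      // Simple source-over blend of the layer onto the buffer.
      for (let c = 0; c < 3; c += 1) {
        output[i + c] = layer.pixels[i + c] * alpha + output[i + c] * (1 - alpha);
      }
      output[i + 3] = Math.min(255, output[i + 3] + layer.pixels[i + 3]);
    }
  }
  return output; // final image to paint on screen
}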
Even if layout reflows have been minimized through CRP optimization, poor decisions at the compositing stage can still create a bottleneck there. The best way to attack this challenge is to keep a tight performance budget and be mindful during development.
Here are some important things to keep in mind:

- Use the will-change CSS property to inform the browser of what kinds of changes to expect. Balance its use, as overuse can actually increase compositing cost.
- Promote frequently animated elements to their own layer with translate3d, translateZ, or by setting will-change: transform;. This can make compositing faster but should be used judiciously to avoid excessive GPU memory consumption.
- Drive JavaScript-based animations with requestAnimationFrame so visual updates stay in sync with the browser's frame cycle.
- Effects like box-shadow can be expensive to composite. Use them sparingly and test their performance impact.
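Tying a few of these together, here's a minimal sketch of a compositor-friendly animation; the hero element id is hypothetical:

// Animate with transform (handled on the compositor) instead of
// top/left (which would trigger layout), driven by requestAnimationFrame.
const box = document.getElementById('hero'); // hypothetical element
box.style.willChange = 'transform';          // hint: promote to its own layer

let start = null;
function step(timestamp) {
  if (start === null) start = timestamp;
  const progress = Math.min((timestamp - start) / 1000, 1);
  box.style.transform = `translateX(${progress * 200}px)`;
  if (progress < 1) {
    requestAnimationFrame(step);
  } else {
    box.style.willChange = 'auto'; // release the layer hint when done
  }
}
requestAnimationFrame(step);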
Especially when it comes to compositing, hardware acceleration, and GPU offloading, it's very important to consider which users you serve. Analytics data can give you a better understanding of your audience's general hardware limitations. Many performance issues I see in the wild stem from the fact that product teams and developers only test on high-end devices. If you serve users in developing markets (think India), they will not have the same hardware power to run your apps.
This article gave you a grounding in the Critical Rendering Path and how problems along it can significantly degrade page performance. But it still only scratches the surface.
When we take into account modern web frameworks and the highly complex applications behind many web products, you could write a book about the intricacies of CRP optimization.
Keep in mind that web performance is always a sum of the whole, not its individual parts, and this article only covers one part. Even with excellent CRP optimization (from the angle of this article, that is), loading a massive JavaScript library or 3 MB images will undermine many of those optimizations.
Remember that computational cost can easily grow exponentially, so even seemingly small individual performance issues can add up to a subpar user experience and failing Core Web Vitals.
Be mindful of this and make sure your developers are up to date on CRP best practices. The ideal scenario is that these challenges are addressed during development and never become a serious problem in production.
If you are having performance problems with your website or web application, it might require in-depth analysis. If you're struggling with it, why not contact me for some expert help?