"What happens after a user enters a URL in the browser address bar?" This question is a classic for front-end developers: it is fundamental to front-end development, a common topic in job interviews, and a basis for performance optimization. The focus of this article, however, is not on what happens afterward, but on what happens beforehand—specifically, the steps our code goes through to become a webpage that internet users can access, and how we can update that webpage safely and sensibly.
The first question involves development and deployment, while the second concerns publishing. Below, I will walk through four parts: webpage entry, development, deployment, and publishing.
Part 1 Webpage Entry
This part will briefly introduce what constitutes the webpage that users see and what work the browser does to present these components to the user. First, here is the main page of bilibili:
A content-rich, beautifully designed, and user-friendly webpage relies on the front-end trio: HTML, CSS, JS, along with resource files such as images and fonts:
- HTML determines the content of the webpage and serves as the entry point for users visiting any website. CSS and JS code can be written directly in HTML or in separate files that are referenced in HTML.
- CSS is responsible for the webpage's styling.
- JS enables user interaction.
<!-- Basic structure of the webpage entry HTML -->
<!DOCTYPE html>
<html>
  <head>
    <title>Webpage Title, displayed on the browser tab</title>
    <meta name="keywords" content="webpage keywords, SEO"/>
    <meta name="description" content="webpage description, SEO"/>
    <!-- Inline CSS in HTML -->
    <style>
      .foo {
        color: red;
      }
    </style>
    <!-- External CSS file reference in HTML -->
    <link rel="stylesheet" href="https://s.alicdn.com/@g/msite/msite-rax-detail-cdn/1.0.73/web/screen.css"/>
  </head>
  <body>
    <!-- Webpage content -->
    <div class="foo">
      Page Content
    </div>
    <!-- Inline JS script in HTML -->
    <script>
      function log(param) {
        console.log(param)
      }
      log('Parsing and executing this JS code')
    </script>
    <!-- External JS file reference in HTML -->
    <script src="https://s.alicdn.com/@g/msite/msite-rax-detail-cdn/1.0.73/web/screen.js"></script>
  </body>
</html>
Before a user accesses any website, they must first enter a valid address in the address bar. The browser then sends a request to the server to retrieve the corresponding webpage entry file, "xxx.html". Opening the Network panel in the browser's developer tools shows that this is the first response the browser receives.
Next, the browser parses the HTML code, identifies other resources, and sends more requests. Through the loading, parsing, and (where applicable) execution of the various types of resources, the page gradually becomes the complete page the user sees. Here we must mention the CRP (Critical Rendering Path), the series of key steps the browser takes to convert HTML, CSS, and JS code into pixels visible on the screen:
- Download HTML over the network and parse the HTML code to construct the DOM.
- Download CSS over the network and parse the CSS code to construct the CSSOM.
- Download JS over the network, parse and execute the JS code, which may modify the DOM or CSSOM.
- Once the DOM and CSSOM are stable, the browser combines them to construct the Render Tree.
- The layout (reflow) step calculates the position and size of each element node.
- The paint (repaint) step draws the actual pixels on the screen.
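To build intuition, the steps above can be modeled as a toy pipeline in plain JS. This is a deliberately naive sketch: the "parsing" here is fake, the function names `buildDOM`, `buildCSSOM`, and `buildRenderTree` are invented, and real browsers interleave these stages and do vastly more work.

```javascript
// Toy model of the critical rendering path (illustrative only).
function buildDOM(html) {
  // Pretend parse: extract tag names in document order.
  return [...html.matchAll(/<(\w+)/g)].map((m) => m[1]);
}

function buildCSSOM(css) {
  // Pretend parse: map each selector to its declaration block.
  const rules = {};
  for (const [, sel, body] of css.matchAll(/([.\w]+)\s*\{([^}]*)\}/g)) {
    rules[sel.trim()] = body.trim();
  }
  return rules;
}

function buildRenderTree(dom, cssom) {
  // Attach matching styles to each node (matched by tag name, for simplicity).
  return dom.map((tag) => ({ tag, style: cssom[tag] ?? '' }));
}

const dom = buildDOM('<body><div>Page Content</div></body>');
const cssom = buildCSSOM('div { color: red; }');
console.log(buildRenderTree(dom, cssom));
// [ { tag: 'body', style: '' }, { tag: 'div', style: 'color: red;' } ]
```

The point of the sketch is the ordering: the render tree cannot be produced until both the DOM and the CSSOM exist, which is why CSS is render-blocking.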
At this point, the webpage is presented to the user for further browsing and interaction.
Part 2 Development Phase
After reviewing the previous part, you should now understand how the browser fetches and renders webpages. This part will briefly introduce the modern web development workflow.
Code Writing
In today's world, where webpage content is increasingly rich and functionality increasingly complex, the amount of HTML, CSS, and JS in a typical project has grown enormously. Clearly, organizing all CSS and JS code within a single HTML file is no longer practical. We no longer write HTML, CSS, and JS in the traditional way; instead, we use UI frameworks (such as React/Vue/Angular) for component-based development and CSS preprocessors (such as Sass/Less/Stylus) for styling.
Engineering Capabilities
Using front-end build tools (such as webpack/Vite/Rollup), we organize the various types of files and gain modularization, automation, optimization, and transpilation capabilities for both local development and production builds.
Modularization deserves a special mention: it allows us to treat different types of files uniformly as modules during development. Modules become first-class citizens in the module system and can reference each other, while the differences between file types are handled by the build tool.
import '@/common/style.scss' // Importing SCSS
import arrowBack from '@/common/arrow-back.svg' // Importing SVG
import { loadScript } from '@/common/utils.js' // Importing a function from JS
Beyond the development phase, build tools also provide rich capabilities for the production environment, applying minification, tree-shaking, name mangling (uglification), compatibility transpilation, chunk extraction, and other processing to the business source code, resulting in optimized output suitable for production. The JS built for production looks like this:
!function(){"use strict";function t(t){if(null==t)return-1;var e=Number(t);return isNaN(e)?-1:Math.trunc(e)}function e(t){var e=t.name;return/(\.css|\.js|\.woff2)/.test(e)&&!/(\.json)/.test(e)}function n(t){var e="__";return"".concat(t.protocol).concat(e).concat(t.name).concat(e).concat(t.decodedBodySize).concat(e).concat(t.encodedBodySize).concat(e).concat(t.transferSize).concat(e).concat(t.startTime).concat(e).concat(t.duration).concat(e).concat(t.requestStart).concat(e).concat(t.responseEnd).concat(e).concat(t.responseStart).concat(e).concat(t.secureConnectionStart)}var r=function(){return/WindVane/i.test(navigator.userAgent)};function o(){return r()}function c(){return!!window.goldlog}var i=function(){return a()},a=function(){var t=function(t){var e=document.querySelector('meta[name="'.concat(t,'"]'));if(!e)return;return e.getAttribute("content")}("data-spm"),e=document.body&&document.body.getAttribute("data-spm");return t&&e&&"".concat(t,".")......
The CSS built for production looks like this:
@charset "UTF-8";.free-shipping-block{-webkit-box-orient:horizontal;-webkit-box-direction:normal;-webkit-box-align:center;-ms-flex-align:center;-webkit-align-items:center;align-items:center;background-color:#ffe8da;background-position:100% 100%;background-repeat:no-repeat;background-size:200px 100px;border-radius:8px;display:-webkit-box;display:-webkit-flex;display:-ms-flexbox;display:flex;-webkit-flex-direction:row;-ms-flex-direction:row;flex-direction:row;margin-top:24px;padding:12px}.free-shipping-block .content{-webkit-box-flex:1;-ms-flex-positive:1;color:#4b1d1f;-webkit-flex-grow:1;flex-grow:1;font-size:14px;margin-left:8px;margin-top:0!important}.free-shipping-block .content .desc img{padding-top:2px;vertical-align:text-top;width:120px}.free-shipping-block .co.....
The build tool automatically includes the JS and CSS resources in the output HTML code:
<!doctype html><html><head><script defer="defer" src="/build/xxx.js"></script><link href="/build/xxx.css" rel="stylesheet"></head><body><div id="root"></div></body></html>
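To make these transformations concrete, here is a toy before-and-after sketch of minification and name mangling. The function and its minified counterpart are invented for illustration; real minifiers such as Terser go much further.

```javascript
// Readable source, as a developer would write it:
function addDiscount(price, discountRate) {
  const discounted = price * (1 - discountRate);
  return Math.round(discounted * 100) / 100;
}

// The same logic after "minification": whitespace stripped, names mangled.
// (Hand-written here to mimic build output.)
const minified = 'function a(b,c){return Math.round(b*(1-c)*100)/100}';
const addDiscountMin = new Function('return ' + minified)();

console.log(addDiscount(100, 0.25));    // 75
console.log(addDiscountMin(100, 0.25)); // 75 — same behavior, smaller payload
```

The behavior is identical; only the bytes shipped over the network shrink, which is exactly the trade the build step is making.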
Part 3 Code Deployment
At this point, we have all the resources needed for the webpage entry (HTML and corresponding CSS, JS, and other static resources). We can simply double-click the HTML file to open it in the browser for local access to our page. Ha! Front-end development is that simple!
Now we can consider the next step: we need to make our page accessible to testers, product managers, operators, and global users on the internet, right? Just running it locally won't suffice (doge); we must upload all these resources to the internet.
In the development phase, the webpage is served by a locally running development server, typically at 127.0.0.1 with a custom port, and accessed as IP + port + path. One option is to manually upload the resources to a server, letting others access the page via that server's IP + port + path (the topic of domain registration and mapping is omitted here). The other option is to automate the entire process through a dedicated publishing platform, which essentially does the following:
- Check branch submission information, mandatory configurations, compliance checks for dependencies, and other checkpoints.
- Run scripts to execute the pre-configured dependency-installation and build commands: initiate a cloud build, install the project's dependencies, and package a production artifact (essentially the same as cloning the project locally, installing dependencies, and building).
- Upload the artifact to the CDN.
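The steps above can be sketched as a miniature pipeline. Everything here is hypothetical: the step names, the context fields, and the example CDN URL are invented for illustration, not taken from any real publishing platform.

```javascript
// Toy sketch of what a publishing platform automates (all names illustrative).
const pipeline = [
  // Checkpoints: branch conventions, mandatory configuration, compliance...
  { name: 'pre-flight checks', run: (ctx) => ctx.branch.startsWith('release/') },
  // Cloud build: install dependencies, run the build, produce an artifact.
  { name: 'cloud build', run: (ctx) => { ctx.artifact = `${ctx.app}-dist.tar.gz`; return true; } },
  // Upload the artifact to the CDN.
  { name: 'upload to CDN', run: (ctx) => { ctx.cdnUrl = `https://cdn.example.com/${ctx.artifact}`; return true; } },
];

function deploy(ctx) {
  for (const step of pipeline) {
    if (!step.run(ctx)) throw new Error(`step failed: ${step.name}`);
  }
  return ctx;
}

const result = deploy({ app: 'my-page', branch: 'release/0.0.2' });
console.log(result.cdnUrl); // https://cdn.example.com/my-page-dist.tar.gz
```

The value of the platform is that every release runs the same checked, ordered steps, instead of relying on someone's laptop and memory.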
At this point, users can enter the URL in their browsers to access our page. The server returns the HTML, which references resources from the CDN, allowing the client (browser) to render the page.
Part 4 Publishing Externally
Iterative Updates
For pages with tens of thousands (or even millions) of daily active users, the high traffic volume and strict performance targets require us to consider safe publishing and user experience before any iterative change officially goes live.
/* index.css */
.foo {
  background-color: red;
}
For index.css, if users have to re-request this file every time they open the page, it not only wastes bandwidth but also makes users wait for the download. We can leverage HTTP strong caching to cache static resources in the browser, letting users see the page faster (the speedup comes from the browser reading files directly from the memory/disk cache, eliminating download time).
<!-- Set cache expiration time -->
Cache-Control: max-age=2592000,s-maxage=86400
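On the server side, the cache policy is usually chosen per resource type. A minimal sketch of that decision, with illustrative values (the specific directives and durations here are common conventions, not a one-size-fits-all recommendation):

```javascript
// Sketch: pick a Cache-Control value by resource type (values illustrative).
function cacheControlFor(path) {
  // Content-hashed static assets never change, so they can be cached "forever".
  if (/\.(js|css|woff2|png|svg)$/.test(path)) {
    return 'max-age=31536000, immutable';
  }
  // HTML entry points must be revalidated so new releases take effect.
  return 'no-cache';
}

console.log(cacheControlFor('/build/index_1i0gdg6ic.css')); // max-age=31536000, immutable
console.log(cacheControlFor('/index.html'));                // no-cache
```

The asymmetry is the key idea of the rest of this section: the HTML stays fresh, while everything it references is cached aggressively.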
For static resources, servers often set a very long cache expiration time to fully utilize caching, so the browser doesn't need to send requests. However, if the browser doesn't send requests, what do we do if we have updates/bug fixes for the page? A common solution is to append a version number to the resource URL, such as:
<!-- Update via version number -->
<!doctype html>
<html>
  <head>
    <script defer="defer" src="https://s.alicdn.com/build/foo.js?t=0.0.1"></script>
    <link href="https://s.alicdn.com/build/index.css?t=0.0.1" rel="stylesheet">
  </head>
  <body>
    <div class="foo"></div>
  </body>
</html>
When updating next time, changing the version number will force the browser to initiate a new request:
<!-- Iteration version 0.0.2 -->
<!doctype html>
<html>
  <head>
    <script defer="defer" src="https://s.alicdn.com/build/foo.js?t=0.0.2"></script>
    <link href="https://s.alicdn.com/build/index.css?t=0.0.2" rel="stylesheet">
  </head>
  <body>
    <div class="foo"></div>
  </body>
</html>
However, this approach has a problem: if the HTML references multiple files and only one of them is changed during an iteration while the others remain unchanged, using a unified version number will invalidate the local cache for all other files!
To solve this problem, we need cache control at file-level granularity. The natural tool is a message digest algorithm (the same family of hash functions used in HTTPS), which generates a unique hash from the file content. If the file is unchanged, the hash stays the same, allowing precise caching at the level of individual files:
<!-- Control updates via file content digest -->
<!doctype html>
<html>
  <head>
    <!-- foo.js remains unchanged, continue using cache -->
    <script defer="defer" src="https://s.alicdn.com/build/foo.js"></script>
    <!-- index.css has changed styles, needs to request the updated file and cache it -->
    <link href="https://s.alicdn.com/build/index_1i0gdg6ic.css" rel="stylesheet">
  </head>
  <body>
    <div class="foo"></div>
  </body>
</html>
Alternatively, we can embed the iteration version number in the resource path:
<!-- Control updates via resource path -->
<!doctype html>
<html>
  <head>
    <!-- Resource path updated, request new resources -->
    <script defer="defer" src="https://s.alicdn.com/0.0.2/build/foo.js"></script>
    <!-- Resource path updated, request new resources -->
    <link href="https://s.alicdn.com/0.0.2/build/index.css" rel="stylesheet">
  </head>
  <body>
    <div class="foo"></div>
  </body>
</html>
Separation of Static and Dynamic Content
Modern front-end deployment solutions often upload static resources (JS, CSS, images, etc.) to a CDN close to users: these resources rarely change and should make full use of caching to maximize hit rates. In contrast, dynamic pages (HTML) carry per-user data, may be server-side rendered (SSR) for SEO, and are served from business servers closer to the data for faster retrieval and injection.
With two types of resources distributed in different locations, static resources are referenced in HTML via CDN links. However, a question arises: when updating the page, should we publish static resources first or the page itself?
Publishing the page first, then the resources:
<!-- New page, old resources -->
<!doctype html>
<html>
  <head>
    <!-- Resources are not fully published yet -->
    <script defer="defer" src="https://s.alicdn.com/0.0.1/build/foo.js"></script>
    <link href="https://s.alicdn.com/0.0.1/build/index.css" rel="stylesheet">
  </head>
  <body>
    <!-- Page has been modified -->
    <div class="bar"></div>
  </body>
</html>
Before the static resources finish publishing, users may receive the new page structure while the resources are still old. They might see a page with broken styles, or hit errors because the old JS fails to find the new element nodes. Not acceptable 🙅.
Publishing the resources first, then the page:
<!-- Old page, new resources -->
<!doctype html>
<html>
  <head>
    <!-- Resources have been published -->
    <script defer="defer" src="https://s.alicdn.com/0.0.2/build/foo.js"></script>
    <link href="https://s.alicdn.com/0.0.2/build/index.css" rel="stylesheet">
  </head>
  <body>
    <!-- Page has not been published yet -->
    <div class="foo"></div>
  </body>
</html>
Before the page is published, the page structure remains unchanged while the resources are new. A user who has visited before and has the old resources cached locally will see a normal page; otherwise, they will load the old page with the new resources, causing the same issues as above: broken styles, or JS errors leading to a blank screen. Also not acceptable 🙅.
Thus, neither option works! This is why, in the past, developers had to work late at night to deploy projects during low-traffic periods, as it minimizes the impact. However, for large companies, there are no absolute low-traffic periods, only relatively low ones. Even then, for those of us who pursue perfection, this is unacceptable!
The root issue is overlay publishing: newly published resources overwrite the ones already live. The corresponding solution is non-overlay publishing: include version numbers or hash values in file paths so that new resources never overwrite old ones. First fully publish the static resources, then gradually roll out the page release, which resolves the issue cleanly.
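The idea of non-overlay publishing can be modeled with a toy in-memory "CDN". Everything here is illustrative: a map stands in for object storage, and the refuse-to-overwrite check stands in for the platform's publishing rules.

```javascript
// Toy model of non-overlay publishing: each release writes to a new
// versioned path and never overwrites an existing one.
const cdn = new Map(); // path -> file content

function publish(version, files) {
  for (const [name, content] of Object.entries(files)) {
    const path = `/${version}/build/${name}`;
    if (cdn.has(path)) {
      throw new Error(`refusing to overwrite ${path}`);
    }
    cdn.set(path, content);
  }
}

publish('0.0.1', { 'index.css': '.foo{background-color:red}' });
publish('0.0.2', { 'index.css': '.foo{background-color:blue}' });

// Both versions coexist, so old HTML keeps working during a gradual rollout:
console.log(cdn.get('/0.0.1/build/index.css')); // .foo{background-color:red}
console.log(cdn.get('/0.0.2/build/index.css')); // .foo{background-color:blue}
```

Because old paths remain valid, HTML still referencing version 0.0.1 is never broken mid-release, and rolling back is just pointing the HTML at the old paths again.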
Therefore, regarding static resource optimization, we should aim to:
- Configure long cache expiration times to improve cache hit rates and save bandwidth.
- Use content digests or versioned file paths as the basis for cache updates to achieve precise cache control.
- Deploy static resources via CDN to shorten network transmission paths and response times.
- Use non-overlay publishing to update resources for a smooth transition.
At this point, the code painstakingly written by front-end developers has gone through continuous iterations, (cloud) builds, resource deployments, and external publishing, allowing global users to experience our products and enjoy surfing the internet!