I only know that Nginx is awesome, but how does it support millions of concurrent connections?

Latest update time: 2024-04-16

I saw an interesting topic on the Internet some time ago: "I only know that Nginx is awesome, but I don't know how it supports millions of concurrent connections."

Indeed, this is a good question. It is often asked in interviews, and many people have stumbled over it!

So, today we will talk about this topic together.

Whether you work in operations, development, or testing, learning the Nginx technology stack is essential. Different roles simply call for different depth and breadth of mastery.

What is Nginx?

Nginx is an open source, high-performance, highly reliable web server and reverse proxy server. It supports hot deployment, can run almost 24/7, and can even upgrade the software version without interrupting service. Nginx performs very well, uses little memory, and handles high concurrency. It supports a range of protocols, such as TCP, UDP, SMTP, and HTTPS. Just as importantly, Nginx is free, open source, and can be used commercially, and it is relatively simple to configure and use.

Many major Internet companies in China, such as Baidu, JD.com, Sina, NetEase, Tencent, etc., are using Nginx, and many high-profile foreign websites are also using Nginx, such as Netflix, GitHub, SoundCloud, MaxCDN, etc.

Official website: http://www.nginx.org

[Figure: Nginx architecture]

How does Nginx support millions of concurrent connections?

Nginx can support millions of concurrent connections, mainly through the following aspects:

Master process and worker processes

When Nginx starts, it creates one master process (master) and one or more worker processes (worker). The output below shows a single worker:

[root@nginx ~]# ps -ef|grep nginx
root       6324      1  0 09:06 ?        00:00:00 nginx: master process /usr/local/nginx-1.12.2/sbin/nginx
nobody     6325   6324  0 09:06 ?        00:00:00 nginx: worker process
root       6327   1244  0 09:06 pts/0    00:00:00 grep --color=auto nginx

The master process is mainly responsible for managing the worker processes and does not handle network requests directly.
The worker processes, which are all equivalent, are the ones that actually handle network requests and responses. Each worker process is independent and can handle thousands of network requests at the same time.
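
As a minimal sketch of how the number of worker processes is set, the fragment below uses the `worker_processes` directive in the top-level context of `nginx.conf`. The value `auto` is an assumption about a typical setup, not something taken from the installation shown above.

```nginx
# Top-level (main) context of nginx.conf.
# "auto" asks Nginx to start one worker process per available CPU core.
# A fixed number such as 4 can be given instead.
worker_processes auto;
```

After a reload, the `ps -ef|grep nginx` command shown earlier should list one master process and several worker processes.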

Event-driven model

Nginx's event-driven model consists of three basic units: an event collector, an event sender, and an event handler.

Event collector: collects the various I/O requests of the worker process
Event sender: sends I/O events to the event handlers
Event handler: handles the response to each kind of event

The Nginx event-driven architecture is based on an asynchronous and non-blocking approach. This design allows Nginx to handle multiple network requests at the same time.

When a client initiates a request, Nginx hands it to a worker process, which is responsible for processing it. The worker process handles requests asynchronously: rather than dedicating a process to each request, a single worker serves many requests in one event loop, so a request that is waiting on I/O does not hold up the others. This lets Nginx handle many client requests at the same time and improves its concurrent processing capability.
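
To make the capacity of this model concrete, here is a hedged sketch of the two directives that bound it. The numbers are assumed purely for illustration and should be sized to the actual machine.

```nginx
worker_processes 4;            # assumed value, used in the arithmetic below

events {
    # Upper bound on simultaneous connections held by ONE worker process.
    # It counts every connection the worker has open, including
    # connections to proxied backend servers, not only client connections.
    worker_connections 10240;
}

# Rough ceiling: 4 workers x 10240 connections = 40960 open connections.
# When Nginx acts as a reverse proxy, a client request usually also holds
# a backend connection, so the number of clients served is lower.
```

Raising these values only helps if the operating system's open-file limit allows that many descriptors.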

Non-blocking I/O

Nginx uses non-blocking I/O operations when processing requests, which means that it does not block the process while waiting for the I/O operation to complete. By using non-blocking I/O, Nginx can handle multiple I/O operations at the same time, thereby improving overall processing capabilities.

As shown in the previous figure, the event sender puts each event into something like a pending list, and then calls the event handler in non-blocking I/O mode to handle the request.

This processing mode is also called "I/O multiplexing". The three most common models are select, poll, and epoll.
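
The multiplexing method can be chosen in the `events` block. The sketch below assumes a Linux host, where epoll is available. Nginx normally selects the most efficient method for the platform on its own, so setting `use` explicitly is optional.

```nginx
events {
    # Select the epoll method explicitly (Linux only).
    use epoll;

    # Let a worker accept all pending new connections on each wake-up
    # instead of one at a time.
    multi_accept on;
}
```

A configuration has only one `events` block, so these lines would sit next to `worker_connections` rather than in a second block.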

Memory management

Nginx uses memory pool technology to manage memory. The memory blocks in a pool are pre-allocated, which avoids frequent memory allocation and release operations and so reduces their overhead. This allows Nginx to handle a large number of concurrent connections more efficiently.
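
The pool sizes are exposed as configuration directives. The fragment below is only a sketch that shows where they live. The values are the usual defaults on 64-bit systems, and the official documentation notes that these directives have minimal impact on performance, so they are normally left untouched.

```nginx
http {
    # Initial size of the memory pool allocated for each connection.
    connection_pool_size 512;

    # Initial size of the memory pool allocated for each request.
    request_pool_size 4k;
}
```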

Load balancing

Nginx can be used as a reverse proxy server to forward client requests to the backend server for processing. By configuring a load balancing strategy, Nginx can distribute requests to multiple backend servers to further improve the overall processing capability. This is also a key technology for supporting millions of concurrencies.

upstream server_pools {
  server 192.168.1.100:8888   weight=5;
  server 192.168.1.101:9999   weight=5;
  server 192.168.1.102:6666   weight=5;
  # The weight parameter sets each server's weight; the higher the weight, the more likely the server is to be assigned a request
}
server {
  listen 80;
  server_name mingongge.com;
  location / {
    proxy_pass http://server_pools;
  }
}

Nginx load balancing strategy

Round-robin strategy: The default strategy. Client requests are assigned to the servers in turn. This works well in general, but if one server comes under too much pressure and slows down, every user assigned to that server is affected.
Least-connections strategy: Requests go preferentially to the server with the fewest active connections, which evens out the queue on each server and avoids sending more requests to a server that is already under heavy load.
Fastest-response-time strategy: Requests go preferentially to the server with the shortest response time.
Client IP binding strategy: Requests from the same IP address are always assigned to the same server, which effectively solves the session sharing problem of dynamic web pages.
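
As a sketch of how two of these strategies are selected, the fragment below reuses the backend addresses from the earlier example. The upstream names `least_conn_pools` and `ip_hash_pools` are made up for illustration. Round-robin needs no directive because it is the default. The fastest-response-time strategy is not shown: as far as I know it is provided by the `least_time` directive in the commercial NGINX Plus, or by third-party modules, rather than by open source Nginx itself.

```nginx
# Least-connections strategy
upstream least_conn_pools {
    least_conn;
    server 192.168.1.100:8888;
    server 192.168.1.101:9999;
    server 192.168.1.102:6666;
}

# Client IP binding strategy
upstream ip_hash_pools {
    ip_hash;
    server 192.168.1.100:8888;
    server 192.168.1.101:9999;
    server 192.168.1.102:6666;
}
```

Either group is then referenced from `proxy_pass` in the same way as `server_pools` above.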

Cache

Nginx supports the caching function. As an important means of performance optimization, Nginx caching can greatly reduce the load on the back-end server.

We can store static files on the local disk through Nginx configuration and provide them directly to the client, which reduces the number of requests to the back-end server and improves performance and concurrent processing capabilities.
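
A minimal sketch of serving static files straight from local disk is shown below. The directory `/data/www`, the `/static/` prefix, and the 30-day lifetime are all assumptions chosen for illustration.

```nginx
server {
    listen 80;

    # Requests under /static/ are answered from local disk and never
    # reach the back-end server.
    location /static/ {
        root /data/www;     # hypothetical directory; files live in /data/www/static/
        expires 30d;        # let browsers cache the files for 30 days
        sendfile on;        # send file contents with the sendfile() system call
        access_log off;     # skip logging for static assets
    }
}
```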

proxy_cache_path  # Path of the proxy cache
# Syntax
proxy_cache_path path [levels=levels] [use_temp_path=on|off] keys_zone=name:size [inactive=time] [max_size=size] [manager_files=number] [manager_sleep=time] [manager_threshold=time] [loader_files=number] [loader_sleep=time] [loader_threshold=time] [purger=on|off] [purger_files=number] [purger_sleep=time] [purger_threshold=time];

proxy_cache  # Enables or disables the proxy cache
# Syntax
proxy_cache zone | off;  # zone is the name of the shared memory zone, i.e. the name set by keys_zone above.

proxy_cache_key  # Defines how the cache key is generated
# Syntax
proxy_cache_key string;  # string is the rule for generating the key, e.g. $proxy_host$request_uri.

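proxy_cache_valid  # Status codes to cache and how long they stay valid
# Syntax
proxy_cache_valid [code ...] time;  # code is the status code and time is the validity period; different status codes can be given different cache times, e.g. proxy_cache_valid 200 302 30m;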

proxy_cache_min_uses  # Number of requests after which a resource is cached
# Syntax
proxy_cache_min_uses number;  # number is the request count; the default is 1.

proxy_cache_use_stale  # Whether Nginx may return a cached response when the backend fails
# Syntax
proxy_cache_use_stale error;  # error is the failure type

proxy_cache_lock  # Whether to enable the cache lock
# Syntax
proxy_cache_lock on | off;

proxy_cache_lock_timeout  # Lock timeout; once it is exceeded, the waiting requests are released.
# Syntax
proxy_cache_lock_timeout time;

proxy_cache_methods  # HTTP methods for which caching is enabled
# Syntax
proxy_cache_methods method;  # method is the request method, e.g. GET, HEAD.

proxy_no_cache  # Conditions under which a response is not stored in the cache
# Syntax
proxy_no_cache string...;  # string is the condition, e.g. $arg_nocache $arg_comment;

proxy_cache_bypass  # Conditions under which a response is not read from the cache
# Syntax
proxy_cache_bypass string...;  # Configured in the same way as proxy_no_cache above.

add_header  # Adds a field to the response header
# Syntax
add_header fieldName fieldValue;

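$upstream_cache_status  # Records whether the cache was hit; the possible values include:
MISS: the request missed the cache.
HIT: the request hit the cache.
EXPIRED: the request hit the cache, but the entry had expired.
STALE: the request hit a stale cache entry.
REVALIDATED: Nginx verified that the stale entry is still valid.
UPDATING: the entry that was hit is stale, but it is being updated.
BYPASS: the response was fetched from the origin server.
# Note: this is a built-in Nginx variable, unlike the directives above.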

The following is a configuration example:

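server {
        location / {
            # Use the cache zone named hot_cache
            proxy_cache hot_cache;
            # Cache responses with status 200, 206, 304, 301, and 302 for 1 day
            proxy_cache_valid 200 206 304 301 302 1d;
            # Cache responses with any other status for 30 minutes
            proxy_cache_valid any 30m;
            # Rule for generating the cache key (host + URI + query arguments)
            proxy_cache_key $host$uri$is_args$args;
            # Cache a resource only after it has been requested at least three times
            proxy_cache_min_uses 3;
            # When identical requests arrive together, let only one go to the backend; the others read from the cache
            proxy_cache_lock on;
            # The lock above times out after 3s; if the data is not available by then, the other requests go to the backend directly
            proxy_cache_lock_timeout 3s;
            # Do not cache responses when the request arguments or cookies ask for no caching
            proxy_no_cache $cookie_nocache $arg_nocache $arg_comment;
            # Add a response header that shows whether the cache was hit (handy for debugging)
            add_header Cache-status $upstream_cache_status;
        }
}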
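
The example above refers to a cache zone named `hot_cache`, which has to be declared with `proxy_cache_path` in the `http` context. The sketch below shows one way to do that. The directory `/data/nginx/cache` and all sizes and times are assumptions for illustration.

```nginx
http {
    # /data/nginx/cache       hypothetical directory that holds cached files
    # levels=1:2              two-level subdirectory layout
    # keys_zone=hot_cache:128m  shared memory zone for cache keys and metadata
    # inactive=1d             drop entries that were not accessed for a day
    # max_size=10g            upper bound on the disk space used
    proxy_cache_path /data/nginx/cache levels=1:2 keys_zone=hot_cache:128m
                     inactive=1d max_size=10g use_temp_path=off;
}
```

The `location` using `proxy_cache hot_cache;` also needs a `proxy_pass` directive, because only proxied responses are cached.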

Modular design

Nginx's modular design enables it to select and load different modules according to needs to support various functions, such as logging, authentication, etc. This design is highly flexible and easy to expand and maintain.

Nginx's modules fall into five groups: core modules, standard HTTP modules, optional HTTP modules, mail service modules, and third-party modules. They all run on top of the event-driven model and non-blocking I/O, which is how Nginx keeps handling a large number of concurrent connections efficiently as functions are added.

The core modules are the foundation of Nginx. They implement part of the underlying communication protocols and provide the runtime environment for the other modules and for the Nginx processes. The standard HTTP modules are built in by default and provide the basic HTTP functions. The optional HTTP modules provide more advanced functions, such as SSL encryption, and have to be enabled when Nginx is built. The mail service modules implement the mail-related functions. Third-party modules are provided by third parties and extend the functions of Nginx.

In general, Nginx's modular design is one of the key factors in its high performance and high concurrency capabilities.
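
Modules built as dynamic modules can be loaded at startup with the `load_module` directive. The fragment below is a sketch that assumes Nginx 1.9.11 or later and that the mail module was built as a dynamic module, so the `.so` file exists under the `modules` directory. Both are assumptions about the installation.

```nginx
# Must appear in the top-level (main) context, before the events and
# http blocks. The path is relative to the Nginx prefix directory.
load_module modules/ngx_mail_module.so;
```

Modules compiled statically into the binary need no such line.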

Proxy mechanism

When Nginx is used as a reverse proxy server, it will receive the client's request and forward it to the back-end server for processing. Through the proxy mechanism, Nginx can implement request forwarding, load balancing, caching and other functions, improving processing performance and concurrency capabilities.

The following is a basic configuration example:

http {
    # .............
    upstream product_server {
        server 127.0.0.1:8081;
    }

    upstream admin_server {
        server 127.0.0.1:8082;
    }

    upstream test_server {
        server 127.0.0.1:8083;
    }

    server {
        # Requests go to the product server by default
        location / {
            proxy_pass http://product_server;
        }

        location /product/ {
            proxy_pass http://product_server;
        }

        location /admin/ {
            proxy_pass http://admin_server;
        }

        location /test/ {
            proxy_pass http://test_server;
        }
    }
}
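
In practice a proxied location usually also passes client information to the backend and reuses backend connections. The sketch below extends the `admin_server` example. The keepalive count and the timeouts are assumed values, not recommendations from the original configuration.

```nginx
upstream admin_server {
    server 127.0.0.1:8082;
    # Keep up to 32 idle connections to the backend open per worker
    # process, so they can be reused instead of reopened (assumed value).
    keepalive 32;
}

server {
    location /admin/ {
        proxy_pass http://admin_server;

        # Both settings are needed for upstream keepalive to take effect.
        proxy_http_version 1.1;
        proxy_set_header Connection "";

        # Tell the backend the original host and the client address.
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        # Fail fast if the backend cannot be reached or stops answering.
        proxy_connect_timeout 5s;
        proxy_read_timeout 60s;
    }
}
```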

These techniques work together throughout Nginx, which is what allows it to handle a large number of concurrent connections efficiently. In practice, configuring Nginx properly lets us tune its performance and raise its concurrent processing capacity further, toward millions of concurrent connections.
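
As a hedged illustration of what such tuning can look like, the fragment below collects a few commonly adjusted settings. Every value is an assumption to be sized against the actual hardware and traffic, and the operating system's own limits (for example the open-file limit) must allow them.

```nginx
# Maximum number of open files (and therefore sockets) per worker process.
worker_rlimit_nofile 65535;

http {
    sendfile on;              # send files with the sendfile() system call
    tcp_nopush on;            # send response headers and file start together
    keepalive_timeout 65s;    # how long an idle client connection stays open
    keepalive_requests 1000;  # requests served over one keep-alive connection
}
```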

Note: The pictures are all from Internet materials, and the copyright belongs to the original author. This article is only for sharing technology.
