A guide to regular expressions and location path matching in nginx

Latest update time:2024-08-13

Link: https://www.cnblogs.com/xiongzaiqiren/p/16968651.html

Preface: I verified the regular expressions, location path matching rules, and priorities described here on a standalone nginx v1.23.2 installation.
First, prepare the environment. The basic configuration, in nginx/conf/conf.d/host.conf, is as follows:

server {
    listen 8081;
    server_name 10.90.5.70;

    proxy_connect_timeout 60;
    proxy_read_timeout 600;
    proxy_send_timeout 600;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Forwarded-Proto "http";
    proxy_set_header Host $host;
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    proxy_next_upstream error non_idempotent;

    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection "upgrade";

    location / {
        root /usr/share/nginx/html;
        index index.html index.htm;
    }
}

All of the cases below were verified against this configuration.


1. Regular expressions in nginx

Regular expressions in nginx follow the standard regular expression (PCRE) format and rules. The one difference is that a special modifier, such as ~ or ~*, precedes the pattern to tell nginx that the string that follows is to be treated as a regular expression.
In nginx, regular expressions can be used at the server level (for example, in server_name) or in a location path.
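
As a minimal sketch of both places, assuming a hypothetical host name pattern and a hypothetical /user/ path (neither comes from the test setup above):

```nginx
server {
    listen 8081;

    # Hypothetical host pattern: the leading ~ marks the server name as a regex
    server_name ~^www\d+\.example\.com$;

    # The ~ modifier marks the location path as a case-sensitive regex
    location ~ ^/user/\d+$ {
        default_type text/plain;
        return 200 "regex location matched\n";
    }
}
```

With this sketch, a request for /user/42 sent with the Host header www1.example.com would be answered by the regex location.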

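Meaning of common regular expression symbols

^ : matches the start of the input string
$ : matches the end of the input string
. : matches any single character except "\n"; to match any character including "\n", use a pattern such as "[\s\S]"
\d : matches a digit
\w : matches a letter, digit, or underscore (Unicode-aware engines also match other word characters, such as Chinese characters)
\s : matches any whitespace character
\b : matches a word boundary (the start or end of a word)

[The following symbols specify the match length (number of characters or repetitions)]
* : matches the preceding character zero or more times. For example, "ol*" matches "o", "ol", and "oll"
+ : matches the preceding character one or more times. For example, "ol+" matches "ol", "oll", and "olll", but not "o"
? : matches the preceding character zero or one time. For example, "do(es)?" matches "do" or "does"; "?" is equivalent to "{0,1}"
{n} : repeats exactly n times
{n,} : repeats n or more times
{n,m} : repeats n to m times

[] : defines a set or range of characters to match
[c] : matches the single character c
Note: inside brackets, - denotes a range:
[a-z] : matches any one lowercase letter from a to z
[a-zA-Z0-9] : matches any one uppercase or lowercase letter or digit
() : groups a subexpression, for example: (jpg|gif|swf)

| : OR operator (alternation)
! : negation. It is not a regex operator on its own; in nginx it negates a match operator, as in !~ and !~* inside an if condition
Regular expressions have no AND operator.

\ : escape character; marks the following character as a special character, a literal character, or a backreference. For example, "\n" matches a newline, while "\$" matches "$"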
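
A short sketch of how a few of these symbols look inside location blocks. The paths and response bodies are hypothetical. One nginx-specific detail: a pattern that contains braces is enclosed in quotes so that nginx does not read "{" as the start of the block.

```nginx
server {
    listen 8081;
    server_name 127.0.0.1;
    default_type text/plain;

    # Grouping and alternation: any URI ending in .jpg, .gif, or .swf,
    # in any letter case because of the ~* modifier
    location ~* \.(jpg|gif|swf)$ {
        return 200 "static file\n";
    }

    # Quantifier with braces: exactly four digits after /order/.
    # The pattern is quoted because it contains { and }.
    location ~ "^/order/\d{4}$" {
        return 200 "four-digit order id\n";
    }
}

# /logo.GIF     -> "static file"
# /order/2024   -> "four-digit order id"
# /order/20245  -> matches neither location
```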

Reference link: https://www.jb51.net/article/149053.htm

2. Location path matching rules and priorities

location: sets configuration depending on the request URI. The location block is the most basic configuration item in nginx, and also a somewhat complicated one.

Location matching rules and priorities

Syntax: location [ = | ~ | ~* | ^~ ] uri { ... }
Default: (none)
Context: server, location
The uri parameter is the string to match against the request; it may or may not contain a regular expression. The search works as follows:

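When nginx searches for a matching location, it first tests the locations that contain no regular expression and remembers the one with the longest match. It then tests the locations that contain regular expressions. If one of them matches, that location handles the request; if none matches, the remembered longest-match location handles it.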

Another way to describe it, meaning the same thing:

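Location matching priority (independent of the order of non-regex locations in the configuration file):
= : exact matches are processed first. If an exact match is found, nginx stops searching for other matches.
Plain string matches: regular expression rules and longer prefix rules take precedence over a plain match, so even when a plain string matches, nginx still checks for a longer prefix match and for a regular expression match.
^~ : if the longest matching prefix carries this modifier, only that rule is used and nginx stops searching; otherwise nginx continues with the other location directives.
Finally, the directives marked with "~" and "~*" are tested in order; at the first match, nginx stops searching.
When there are no regular expressions, or no regular expression matches, the plain string directive with the longest match is used.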

Location priority, as described in the official documentation:

1. Directives with the = prefix that match the query exactly. If found, searching stops.
2. All remaining directives with conventional strings, longest match first. If this match used the ^~ prefix, searching stops.
3. Regular expressions, in order of definition in the configuration file.
4. If #3 yielded a match, that result is used. Else the match from #2 is used.
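
Rule 3 means that the order of regular expression locations matters. A minimal sketch with hypothetical paths and response bodies, in which two patterns both match the same URI:

```nginx
server {
    listen 8081;
    server_name 127.0.0.1;
    default_type text/plain;

    # Both patterns match /report.txt; the one defined first is used
    location ~ \.txt$ {
        return 200 "first regex\n";
    }

    location ~ ^/report {
        return 200 "second regex\n";
    }
}

# /report.txt -> "first regex"
# /report.pdf -> "second regex"
```

Swapping the two blocks would send /report.txt to the ^/report location instead.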

A location block contains a path matching rule and the configuration that applies when that rule matches.

Location rules fall into five categories according to the leading modifier:
Modifier    Description
= uri       Exact match: the rule takes effect only when the request path matches the uri completely.
~ regex     Case-sensitive regular expression match
~* regex    Case-insensitive regular expression match
^~ uri      Prefix match that skips the regular expression check when it is the longest matching prefix
uri         Prefix match with no leading modifier

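When these five types of rules exist in the configuration file at the same time, they take effect according to the following priority:
Priority: ( location = exact match ) > ( location ^~ prefix, when it is the longest matching prefix ) > ( location ~ and ~* regular expressions, in configuration order ) > ( location plain prefix, longest match ) > ( / )
In terms of the five categories in the order listed above: ① > ④ > ② and ③ (in configuration order) > ⑤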

The nginx rule decision process is as follows:

Text description:
The following description is the key part:
1. Check whether the request URI matches an = rule exactly. If so, apply that rule and stop matching.
2. nginx checks all prefix rules, both those with "^~" and those without a modifier, and selects and remembers the one with the longest match against the request URI. At this point the rule is only remembered, not applied.
3. Check whether the prefix rule remembered in step 2 carries ^~. If it does, use that rule and stop matching.
4. Check the regular expressions in configuration order. As soon as one matches, use that rule and stop matching.
5. If no regular expression matches, use the prefix rule remembered in step 2.
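
The steps above can be traced with a small configuration adapted from the example in the nginx documentation. The port, server_name, and response bodies are illustrative additions.

```nginx
server {
    listen 8081;
    server_name 127.0.0.1;
    default_type text/plain;

    location = / {
        return 200 "A: exact match\n";
    }

    location / {
        return 200 "B: plain prefix, matches everything\n";
    }

    location /documents/ {
        return 200 "C: longer plain prefix\n";
    }

    location ^~ /images/ {
        return 200 "D: prefix with ^~, regexes are skipped\n";
    }

    location ~* \.(gif|jpg|jpeg)$ {
        return 200 "E: case-insensitive regex\n";
    }
}

# /                         -> A (step 1)
# /index.html               -> B (step 5: no regex matches)
# /documents/document.html  -> C (step 5: longest prefix, no regex matches)
# /images/1.gif             -> D (step 3: longest prefix carries ^~)
# /documents/1.jpg          -> E (step 4: regex wins over the remembered prefix C)
```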

The following examples illustrate each rule type:

1. No modifier: the request URI must start with the specified pattern

Location directive example:

server {
    listen 8081;
    server_name 127.0.0.1;

    # No modifier: the URI must start with the specified pattern (case-sensitive; a trailing / makes a difference)
    location /aaa {
        default_type text/plain;
        return 200 "access success aaa \n\r";
    }
}

# Matches:
http://127.0.0.1:8081/aaa
http://127.0.0.1:8081/aaa/
http://127.0.0.1:8081/aaadef
http://127.0.0.1:8081/aaa/def/
http://127.0.0.1:8081/aaa?p1=TOM

# Does not match (case-sensitive):
http://127.0.0.1:8081/Aaa

# If the rule ends with a directory slash, location /aaa/ {, only the following two match:
http://127.0.0.1:8081/aaa/
http://127.0.0.1:8081/aaa/def/

2. = : used before a URI that contains no regular expression; the request must match the specified pattern exactly

In actual testing, whether or not there is a space after the equals sign makes no difference. Location directive example:

server {
    listen 8081;
    server_name 127.0.0.1;

    # = : used before a URI without a regular expression; the request must match the pattern exactly (case-sensitive; a trailing / makes a difference)
    location = /bbb {
        default_type text/plain;
        return 200 "access success bbb \n\r";
    }
}

# Matches:
http://127.0.0.1:8081/bbb
http://127.0.0.1:8081/bbb?p1=TOM

# Does not match (extra characters or different case):
http://127.0.0.1:8081/bbb/
http://127.0.0.1:8081/bbbcd
http://127.0.0.1:8081/Bbb

3. Rules that contain regular expressions

~ : indicates that the URI contains a regular expression and is case-sensitive
~* : indicates that the URI contains a regular expression and is case-insensitive
In other words, if the URI contains a regular expression, it must be marked with one of these two modifiers.
^~ : used before a URI that contains no regular expression. It works the same as a rule without a modifier, with one difference: if this rule is the longest matching prefix, nginx stops searching and does not check the regular expressions. (It can be used to raise a rule's priority.)
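
A minimal sketch of how ^~ raises priority, assuming a hypothetical /static/ directory and a regex for .txt files:

```nginx
server {
    listen 8081;
    server_name 127.0.0.1;
    default_type text/plain;

    # Because of ^~, a request under /static/ never reaches the regex below.
    # Written as a plain "location /static/", the regex would win for /static/a.txt.
    location ^~ /static/ {
        return 200 "prefix /static/\n";
    }

    location ~ \.txt$ {
        return 200 "regex .txt\n";
    }
}

# /static/a.txt -> "prefix /static/"
# /other/a.txt  -> "regex .txt"
```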

Location directive with a regular expression, example 1:

server {
    listen 8081;
    server_name 127.0.0.1;

    # ~ : indicates that the URI contains a regular expression and is case-sensitive
    # Regex: case-sensitive, starts with /eee and ends with exactly one letter, digit, or underscore
    location ~ ^/eee\w$ {
        default_type text/plain;
        return 200 "access success. 000 Regular expression matched: eee \n\r";
    }
}

# Matches:
http://127.0.0.1:8081/eeeb
http://127.0.0.1:8081/eeeB
http://127.0.0.1:8081/eee2

# Does not match (case-sensitive; exactly one trailing word character is required):
http://127.0.0.1:8081/eee
http://127.0.0.1:8081/Eee
http://127.0.0.1:8081/eee/
http://127.0.0.1:8081/eeedef
http://127.0.0.1:8081/eee/def/
http://127.0.0.1:8081/eee?p1=TOM

Location directive with a regular expression, example 2:

server {
    listen 8081;
    server_name 127.0.0.1;

    # ~* : indicates that the URI contains a regular expression and is case-insensitive
    # Regex: case-insensitive, starts with /ddd and ends with exactly one letter, digit, or underscore
    location ~* ^/ddd\w$ {
        default_type text/plain;
        return 200 "access success. 111 Regular expression matched: ddd \n\r";
    }
}

# Matches:
http://127.0.0.1:8081/dddb
http://127.0.0.1:8081/dddB
http://127.0.0.1:8081/ddd2
http://127.0.0.1:8081/DddH

# Does not match (exactly one trailing word character is required):
http://127.0.0.1:8081/ddd
http://127.0.0.1:8081/Ddd
http://127.0.0.1:8081/ddd/
http://127.0.0.1:8081/ddddef
http://127.0.0.1:8081/ddd/def/
http://127.0.0.1:8081/ddd?p1=TOM

Location directive without a regular expression, example 3:

server {
    listen 8081;
    server_name 127.0.0.1;

    # ^~ : used before a URI without a regular expression. It works like a rule without a modifier, except that
    # when it matches, nginx stops searching other patterns, so it can be used to raise priority.
    # (Case-sensitive; a trailing / makes a difference.)
    location ^~ /fff {
        default_type text/plain;
        return 200 "access success. Non Regular expression matched: fff \n\r";
    }
}

# Matches:
http://127.0.0.1:8081/fff
http://127.0.0.1:8081/fff/
http://127.0.0.1:8081/fffdef
http://127.0.0.1:8081/fff/def/
http://127.0.0.1:8081/fff?p1=TOM

# Does not match (case-sensitive; the URI must start with /fff):
http://127.0.0.1:8081/Fff
http://127.0.0.1:8081/pp/fff

# If the rule ends with a directory slash, location ^~ /fff/ {, only the following two match:
http://127.0.0.1:8081/fff/
http://127.0.0.1:8081/fff/def/

Defining a named location

Use "@" to define a named location. A named location is used only for internal redirects, for example from error_page or try_files.
Named location example:

# Example: a 404 error is redirected internally to the named location
error_page 404 = @fetch;
location @fetch {
    proxy_pass http://fetch;
}

# Similar cases:
error_page 404 /404.html;
error_page 500 502 503 504 /50x.html;
location = /50x.html {
    root /usr/share/nginx/html;
}
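
A named location can also be the last argument of try_files. A minimal sketch, assuming a hypothetical @fallback location and the document root used earlier:

```nginx
server {
    listen 8081;
    server_name 127.0.0.1;

    location / {
        root /usr/share/nginx/html;
        # Try the file, then the directory; if neither exists,
        # redirect internally to the named location
        try_files $uri $uri/ @fallback;
    }

    # Hypothetical named location: it cannot be requested directly by a client
    location @fallback {
        default_type text/plain;
        return 200 "served by named location @fallback\n";
    }
}
```

Here a request for a file that does not exist under the root is answered by @fallback instead of the default 404 page.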


