Who wouldn't become an operations and maintenance expert after mastering these 12 Linux Shell text-processing skills?
From: Big CC
Link: http://www.cnblogs.com/me115/p/3427319.html
Below I introduce the most commonly used tools for processing text using Shell under Linux:
find, grep, xargs, sort, uniq, tr, cut, paste, wc, sed, awk;
The examples and parameters provided are the most commonly used and practical;
My principle for shell scripts is to write commands as one-liners and try not to exceed 2 lines;
If the task is more complex than that, consider Python;
1. Find file search
Find txt and pdf files
find . \( -name "*.txt" -o -name "*.pdf" \) -print
Regex-based search for .txt and .pdf files:
find . -regex ".*\(\.txt\|\.pdf\)$"
-iregex: like -regex, but case-insensitive
Negate a condition to find all files that are not .txt:
find . ! -name "*.txt" -print
Specifying the search depth
find . -maxdepth 1 -type f
Custom search
Search by type:
find . -type d -print  // list only directories
-type f matches regular files, -type l matches symbolic links
Search by time:
- -atime access time (in days; -amin is the same in minutes, and likewise for the options below)
- -mtime modification time (the file's content was modified)
- -ctime change time (metadata or permissions changed)
All files accessed within the last 7 days:
find . -atime -7 -type f -print
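The numeric argument of these time tests can take a prefix; a minimal sketch of the convention (the same applies to -mtime and -ctime):
find . -type f -atime -7   # accessed less than 7 days ago
find . -type f -atime 7    # accessed exactly 7 days ago
find . -type f -atime +7   # accessed more than 7 days ago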
Search by size:
Units: b (512-byte blocks), c (bytes), w (two-byte words), k (KB), M (MB), G (GB)
Find files larger than 2k:
find . -type f -size +2k
Search by permission:
find . -type f -perm 644 -print  // find all files whose permissions are exactly 644
find . -type f -user weber -print  // find all files owned by user weber
Delete:
Delete all swp files in the current directory:
find . -type f -name "*.swp" -delete
Execute a command on each match (-exec):
find . -type f -user root -exec chown weber {} \;  // change the owner of all root-owned files under the current directory to weber
Note: {} is a special string. For each matching file, {} will be replaced with the corresponding file name.
Eg: copy all found files to another directory:
find . -type f -mtime +10 -name "*.txt" -exec cp {} OLD \;
Combining multiple commands
Tip: if you need to run several commands on each match, write them into a script and invoke the script from -exec, as shown below:
-exec ./commands.sh {} \;
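A minimal sketch of such a script, assuming it receives one file name per call (the name commands.sh, the log file and the backup directory are made up for illustration):
#!/bin/bash
# commands.sh -- run several commands on a single file passed in by find -exec
file="$1"
echo "processing $file" >> /tmp/process.log   # hypothetical log file
cp "$file" /tmp/backup/                       # hypothetical backup directory
Invoked, for example, as:
find . -type f -name "*.txt" -exec ./commands.sh {} \;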
-print delimiter
By default -print uses '\n' as the delimiter between file names;
-print0 uses '\0' as the delimiter instead, which makes it safe to handle file names that contain spaces (pair it with xargs -0);
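For example, a minimal sketch pairing the two (the file names are assumed; the point is that names containing spaces survive the pipe):
find . -type f -name "*.mp3" -print0 | xargs -0 ls -l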
2. Grep text search
grep match_pattern file  // prints matching lines by default
Common parameters
- -o print only the matched part of the line vs -v print only lines that do not match
- -c count the number of matching lines:
grep -c "text" filename
- -n print the line number of each matching line
- -i ignore case when matching
- -l print only the names of files that contain a match
Recursive search for text in multiple directories (a favorite for programmers searching for code):
grep "class" . -R -n
Matching multiple patterns
grep -e "class" -e "vitural" file
Make grep output matching file names terminated by \0 (-Z), so they can be safely consumed by xargs -0:
grep "test" file* -lZ| xargs -0 rm
3. xargs command line parameter conversion
xargs can convert input data into command line parameters of a specific command; in this way, it can be used in combination with many commands, such as grep and find;
Convert multi-line output to single-line output
cat file.txt | xargs
'\n' is the delimiter between the lines of text.
Convert single line to multiple lines of output
cat single.txt | xargs -n 3
-n: specifies the maximum number of arguments placed on each output line
- -d define the input delimiter (the default delimiter is the space character; '\n' separates multiple lines)
- -n specify how many arguments to place on each output line
- -I {} specify a replacement string; {} is replaced by the argument when xargs builds the command. Use it when the command needs the argument somewhere other than the end, or needs several parameters:
cat file.txt | xargs -I {} ./command.sh -p {} -1
- -0 use '\0' as the input delimiter
Eg: Count the number of program lines
find source_dir/ -type f -name "*.cpp" -print0 |xargs -0 wc -l
4. sort
Option descriptions:
- -n sort numerically vs -d sort in dictionary order
- -r sort in reverse order
- -k N sort by the Nth column
For example:
sort -nrk 1 data.txt
sort -bd data  // ignore leading blanks such as spaces, and sort in dictionary order
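A small sketch of -k in practice, on an assumed data.txt whose second column is numeric:
cat data.txt
apple 3
pear 10
plum 7
sort -nrk 2 data.txt
$>pear 10
$>plum 7
$>apple 3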
5. uniq eliminates duplicate lines
Eliminate duplicate rows
sort unsort.txt | uniq
Count how many times each line occurs:
sort unsort.txt | uniq -c
Show only the duplicated lines:
sort unsort.txt | uniq -d
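A common combination, sketched here on an assumed unsort.txt, is counting occurrences and then ranking them:
sort unsort.txt | uniq -c | sort -nr | head -3  # the 3 most frequent lines, with their counts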
6. Use tr for conversion
General Usage
echo 12345 | tr '0-9' '9876543210'  // simple encryption/decryption: replace each digit with its counterpart
cat text | tr '\t' ' '  // convert tabs to spaces
tr delete characters
cat file | tr -d '0-9'  // delete all digits
cat file | tr -d -c '0-9'  // delete everything that is not a digit (keep only the digits)
cat file | tr -d -c '0-9 \n'  // delete non-digit data (keep digits, spaces and newlines)
cat file | tr -s ' '  // squeeze repeated spaces into a single space
Character classes
Various character classes are available in tr:
- alnum: letters and digits
- alpha: letters
- digit: digits
- space: whitespace characters
- lower: lowercase letters
- upper: uppercase letters
- cntrl: control (non-printable) characters
- print: printable characters
Usage: tr [:class:] [:class:]
eg: tr '[:lower:]' '[:upper:]'
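For instance, a quick check of the class syntax (the input string is arbitrary):
echo "Hello World" | tr '[:lower:]' '[:upper:]'
$>HELLO WORLD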
7. cut splits text by column
Extract the 2nd and 4th columns of the file:
cut -f2,4 filename
Print all fields except the 3rd:
cut -f3 --complement filename
-d specifies the delimiter:
cut -f2 -d";" filename
cut ranges
- N-  from the Nth field to the end
- -M  from the first field up to the Mth field
- N-M from the Nth to the Mth field
Units used by cut
- -b in bytes
- -c in characters
- -f in fields (split on the delimiter)
For example:
cut -c1-5 file  // print characters 1 through 5
cut -c-2 file  // print the first 2 characters
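A sketch of a field range on delimited data (the sample file and its contents are assumed):
cat data.csv
1;colin;beijing
2;book;shanghai
cut -d";" -f2- data.csv
$>colin;beijing
$>book;shanghai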
8. paste concatenates text by column
Join two texts together by columns;
cat file1
1
2
cat file2
colin
book
paste file1 file2
1 colin
2 book
paste file1 file2 -d ","
1,colin
2,book
9. wc Tool for counting lines and characters
wc -l file // count lines
wc -w file // count words
wc -c file // count bytes (use -m for characters)
10. sed text replacement tool
First replacement
sed 's/text/replace_text/' file  // replace the first match of text on each line
Global Replacement
sed 's/text/replace_text/g' file
By default sed prints the substituted text; to modify the original file in place, use -i:
sed -i 's/text/replace_text/g' file
Remove blank lines:
sed '/^$/d' file
The matched string can be referenced with the & marker:
echo this is an example | sed 's/\w\+/[&]/g'
$>[this] [is] [an] [example]
Substring match marker: the content of the first matched bracket group is referenced with \1:
sed 's/hello\([0-9]\)/\1/'
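A quick demonstration on an arbitrary input string: the digit captured by \( \) is kept and the literal "hello" is dropped:
echo hello7world | sed 's/hello\([0-9]\)/\1/'
$>7world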
Double quote evaluation
sed is usually quoted with single quotes; inside single quotes the shell does not expand variables, so sed 's/$var/HELLO/' replaces the literal string $var.
When double quotes are used, the shell expands variables in both the sed pattern and the replacement string;
eg:
p=pattern
r=replaced
echo "line con a pattern" | sed "s/$p/$r/g"
$>line con a replaced
Other example: insert a '/' after the first 3 characters of each line:
sed 's/^.\{3\}/&\//g' file
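A quick check of its behavior on an arbitrary string:
echo PEKSHA | sed 's/^.\{3\}/&\//g'
$>PEK/SHA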
11. awk data stream processing tool
awk script structure
awk 'BEGIN{ statements } statements2 END{ statements } '
How it works
- Execute the statement block in BEGIN;
- Read a line from the file or stdin and execute statements2; repeat until the entire input has been read;
- Execute the statement block in END;
print prints the current line
When print is used without parameters, the current line is printed;
echo -e "line1\nline2" | awk 'BEGIN{print "start"} {print } END{ print "End" }'
When print arguments are separated by commas, they are output separated by spaces;
echo | awk '{var1 = "v1"; var2 = "V2"; var3 = "v3"; print var1, var2, var3;}'
$>v1 V2 v3
Joining fields with a custom string such as "-" (placing strings side by side in awk concatenates them):
echo | awk '{var1 = "v1"; var2 = "V2"; var3 = "v3"; print var1"-"var2"-"var3;}'
$>v1-V2-v3
Special variables: NR NF $0 $1 $2
NR: indicates the number of records, which corresponds to the current line number during execution;
NF: indicates the number of fields, which corresponds to the number of fields in the current line during execution;
$0: this variable contains the text content of the current line during execution;
$1: the text content of the first field;
$2: the text content of the second field;
echo -e "line1 f2 f3\n line2 \n line 3" | awk '{print NR":"$1"-"$2}'
Print the second and third fields of each line:
awk '{print $2, $3}' file
Count the number of lines in a file:
awk ' END {print NR}' file
Add up the first field of each line:
echo -e "1\n 2\n 3\n 4\n" | awk 'BEGIN{num = 0 ;
print "begin";} {sum += $1;} END {print "=="; print sum }'
Passing external variables
var=1000
echo | awk '{print vara}' vara=$var  # input comes from stdin
awk '{print vara}' vara=$var file  # input comes from a file
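An alternative worth knowing, sketched here, is awk's -v option, which sets the variable before the BEGIN block runs:
var=1000
awk -v vara=$var 'BEGIN{print vara}'
$>1000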
Filtering the lines awk processes with patterns
- awk 'NR < 5'  # lines whose line number is less than 5
- awk 'NR==1,NR==4 {print}' file  # print lines 1 through 4
- awk '/linux/'  # lines containing the text linux (regular expressions can be used, which is very powerful)
- awk '!/linux/'  # lines that do not contain the text linux
Set the delimiter
Use -F to set the delimiter (default is space)
awk -F: '{print $NF}' /etc/passwd
Read command output
Use getline to read the output of the external shell command into the variable cmdout;
echo | awk '{"grep root /etc/passwd" | getline cmdout; print cmdout }'
Loops in awk:
for(i=0; i<10; i++) {print $i;}
for(i in array) {print array[i];}
Eg:
Print lines in reverse order: (implementation of tac command)
seq 9| \
awk '{lifo[NR] = $0; lno=NR} \
END{ for(;lno>0;lno--){print lifo[lno];} }'
awk implements head and tail commands
head:
awk 'NR<=10{print}' filename
tail:
awk '{buffer[NR%10] = $0;} END{for(i=NR-9;i<=NR;i++){ \
print buffer[i%10]} } ' filename  // assumes the file has at least 10 lines
Print the specified columns
Implementation with awk:
ls -lrt | awk '{print $6}'
Implementation with cut (cut needs a single-character delimiter, so squeeze the repeated spaces first):
ls -lrt | tr -s ' ' | cut -d' ' -f6
Print the specified text area
Determine the line number
seq 100| awk 'NR==4,NR==6{print}'
Confirm text
Print the text between start_pattern and end_pattern;
awk '/start_pattern/, /end_pattern/' filename
For example:
seq 100 | awk '/13/,/15/'
cat /etc/passwd | awk '/mai.*mail/,/news.*news/'
awk built-in string functions
- index(string, search_string): returns the position of search_string within string
- sub(regex, replacement_str, string): replaces the first match of the regular expression in string with replacement_str
- match(regex, string): checks whether the regular expression matches the string
- length(string): returns the length of the string
echo | awk '{"grep root /etc/passwd" | getline cmdout; print length(cmdout) }'
printf formatted output, eg:
seq 10 | awk '{printf "->%4s\n", $1}'
12. Iterate over lines, words, and characters in a file
1. Iterate over each line: while loop method
while read line; do
echo $line;
done < file.txt
Changed into a sub-shell:
cat file.txt | (while read line; do echo $line; done)
awk method:
cat file.txt | awk '{print}'
2. Iterate over each word in a line
for word in $line; do echo $word; done
3. Iterate over each character
${string:start_pos:num_of_chars}: extract a substring from a string (bash text slicing)
${#word}: returns the length of the variable word
for ((i=0; i<${#word}; i++))
do
echo ${word:i:1};
done