Chapter 7 Pattern Matching and Regular Expressions in Python Programming Quick Start 2nd Edition

This post was last edited by woaidownload on 2024-5-10 09:38
Chapter 7 Pattern Matching and Regular Expressions in Python Programming Quick Start to Automate Tedious Work 2nd Edition
This chapter is quite long. The author explains in detail how to match and search text in strings with regular expressions, which is a great help when working with Regex objects.
Exercises
  1. What is the function that creates a Regex object?
    Answer:
    Create a Regex object through re.compile().
  2. Why are raw strings often used when creating Regex objects?
    Answer:
    In a raw string, backslashes do not have to be escaped, so a pattern can be written as r'\d' instead of '\\d'.
  3. What does the search() method return?
    Answer:
    The search() method returns a Match object.
  4. How do you get the actual string that matches the pattern through the Match object?
    Answer:
    The Match object has a group() method that returns the actual matching text in the search string.
  5. In the regular expression created with r'(\d\d\d)-(\d\d\d-\d\d\d\d)', what does group 0 represent? What does group 1 represent? What does group 2 represent?
    Answer:
    Group 0 is the entire matched text, group 1 is the text matched by the first pair of parentheses, and group 2 is the text matched by the second pair of parentheses.
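The answers to exercises 1 to 5 can be tried out in a few lines. This is only a minimal sketch: the sample sentence and the variable names are made up for illustration, and the pattern is the one from exercise 5.

```python
import re

# Pattern from exercise 5: area code in group 1, main number in group 2.
phone_regex = re.compile(r'(\d\d\d)-(\d\d\d-\d\d\d\d)')

# The sample sentence is an assumption made for this example.
mo = phone_regex.search('My number is 415-555-4242.')

whole = mo.group(0)   # entire matched text
area = mo.group(1)    # text matched by the first pair of parentheses
main = mo.group(2)    # text matched by the second pair of parentheses
print(whole, area, main)

# search() returns None when nothing matches, so check before calling group().
no_match = phone_regex.search('no digits here')
print(no_match)
```

Calling group() with no argument gives the same result as group(0).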
  6. Parentheses and periods have special meanings in regular expression syntax. How do I specify a regular expression to match real parenthesis and period characters?
    Answer:
    In regular expressions, parentheses mark a group, and a period is a wildcard that matches any character except a newline. To match real parentheses and periods, escape them with a backslash: \( \) \.
  7. The findall() method returns a list of strings or a list of tuples of strings. What determines which kind of return it provides?
    Answer:
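    If there are no groups in the regular expression, findall() returns a list of strings, one for each full match. If there is exactly one group, it also returns a list of strings, but each string is the text of that group. If there are two or more groups, it returns a list of tuples of strings, one tuple per match.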
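The way the number of groups changes what findall() returns is easy to see side by side. A small sketch, with a made-up sample text:

```python
import re

# Sample text invented for this example.
text = 'Cell: 415-555-9999 Work: 212-555-0000'

# No groups: a list of the full matches.
no_groups = re.findall(r'\d{3}-\d{3}-\d{4}', text)

# Exactly one group: a list of strings holding only that group's text.
one_group = re.findall(r'(\d{3})-\d{3}-\d{4}', text)

# Two groups: a list of tuples, one tuple per match.
two_groups = re.findall(r'(\d{3})-(\d{3}-\d{4})', text)

print(no_groups)
print(one_group)
print(two_groups)
```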
  8. In a regular expression, what does the | character mean?
    Answer:
    In a regular expression, the "|" character is called a "pipe". It matches either one of the expressions on its two sides, and several alternatives can be chained together.
  9. In regular expressions, what are the two meanings of the ? character?
    Answer:
    In regular expressions, the ? character either means "match the preceding group zero or one time" (the group is optional) or, when placed after another quantifier, marks a non-greedy match.
  10. In regular expressions, what is the difference between the + and * characters?
    Answer:
    In regular expressions, * means "match zero or more times", and + (plus sign) means "match one or more times".
  11. In a regular expression, what is the difference between {3} and {3,5}?
    Answer:
    In a regular expression, {3} matches exactly 3 repetitions of the preceding group, and {3,5} matches between 3 and 5 repetitions (3, 4, or 5).
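The quantifiers from exercises 9 to 11 can be compared with a small helper. This is a sketch only: the helper name full() and the Batman/Ha test strings are chosen for illustration, and re.fullmatch() is used so that the whole string has to fit the pattern.

```python
import re

def full(pattern, text):
    """Return True if the whole of text matches pattern."""
    return re.fullmatch(pattern, text) is not None

# ? : the preceding group may appear zero or one time.
optional = [full(r'Bat(wo)?man', s) for s in ('Batman', 'Batwoman', 'Batwowoman')]

# * : zero or more times.
star = [full(r'Bat(wo)*man', s) for s in ('Batman', 'Batwowoman')]

# + : one or more times, so plain 'Batman' no longer matches.
plus = [full(r'Bat(wo)+man', s) for s in ('Batman', 'Batwoman')]

# {3} : exactly three repetitions.
exact = [full(r'(Ha){3}', 'Ha' * n) for n in (2, 3, 4)]

# {3,5} : three, four, or five repetitions.
ranged = [full(r'(Ha){3,5}', 'Ha' * n) for n in range(2, 7)]

print(optional, star, plus, exact, ranged)
```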
  12. What do the \d, \w, and \s abbreviation character classes mean in regular expressions?
    Answer:
    In regular expressions, \d matches any digit from 0 to 9; \w matches any letter, digit, or underscore character (it can be thought of as matching a "word" character); and \s matches a space, tab, or newline character (it can be thought of as matching a "whitespace" character).
  13. What do the \D, \W, and \S abbreviation character classes mean in regular expressions?
    Answer:
    In regular expressions, \D means any character except digits 0 to 9; \W means any character except letters, digits, and underscores; \S means any character except spaces, tabs, and newlines.
  14. What is the difference between .* and .*? in a regular expression?
    Answer:
    .* is a greedy match of any text: it matches as much as possible. .*? is a non-greedy match of any text: it matches as little as possible.
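The difference between greedy and non-greedy matching shows up when the closing character appears more than once. A minimal sketch, using a sample string chosen for illustration:

```python
import re

# The text contains two '>' characters, so the two patterns stop at different places.
text = '<To serve man> for dinner.>'

greedy = re.search(r'<.*>', text).group()   # runs to the last '>'
lazy = re.search(r'<.*?>', text).group()    # stops at the first '>'

print(greedy)
print(lazy)
```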
  15. What is the character class syntax that matches all numbers and lowercase letters?
    Answer:
    [0-9a-z]
  16. How do I make a regular expression case-insensitive?
    Answer:
    You can pass re.IGNORECASE or re.I as the second parameter to re.compile().
  17. What does the character . usually match? If re.DOTALL is passed as the second argument to re.compile(), what will it match?
    Answer:
    The "." character matches any character except a newline. If re.DOTALL is passed as the second argument to re.compile(), it matches any character, including a newline.
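Both flags can be checked on one two-line string. A small sketch with an invented sample text:

```python
import re

# Two lines separated by a newline; the text itself is just sample data.
text = 'Serve the public trust.\nProtect the innocent.'

# Without re.DOTALL the dot stops at the newline.
default_dot = re.compile(r'.*').search(text).group()

# With re.DOTALL the dot also matches the newline, so .* covers everything.
with_dotall = re.compile(r'.*', re.DOTALL).search(text).group()

# re.I (same as re.IGNORECASE) ignores letter case.
ignore_case = re.compile(r'protect', re.I).search(text).group()
case_sensitive = re.compile(r'protect').search(text)

print(default_dot)
print(with_dotall == text)
print(ignore_case, case_sensitive)
```

Several flags can be combined with the | operator, for example re.DOTALL | re.I.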
  18. If numRegex = re.compile(r'\d+'), what does numRegex.sub('X', '12, drummers, 11 pipers, five rings, 3 hens') return?
    Answer:
    The string "X, drummers, X pipers, five rings, X hens"
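The sub() answer can be confirmed directly. The snippet below simply runs the call from the question; only the variable names are changed to snake_case.

```python
import re

num_regex = re.compile(r'\d+')

# Every run of one or more digits is replaced by a single 'X';
# 'five' is left alone because it contains no digits.
result = num_regex.sub('X', '12, drummers, 11 pipers, five rings, 3 hens')
print(result)
```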
  19. What does passing re.VERBOSE as the second argument to re.compile() allow you to do?
    Answer:
    Complex text patterns can require long, convoluted regular expressions. Passing re.VERBOSE as the second argument to re.compile() makes whitespace and comments inside the regular expression string be ignored, so the pattern can be spread over several lines and annotated.
20. Write a regular expression to match numbers with a comma every 3 digits. It must match the following numbers:
  • '42'
  • '1,234'
  • '6,368,745'
But it will not match the following numbers:
  • '12,34,567' (only two digits between the commas)
  • '1234' (missing comma)
Answer:
re.compile(r'''
(?<![\d,])      # not preceded by a digit or a comma
\d{1,3}         # 1 to 3 leading digits
(?:,\d{3})*     # zero or more groups of a comma plus exactly 3 digits
(?![\d,])       # not followed by a digit or a comma
''', re.VERBOSE)
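A quick way to check an answer to exercise 20 is to run it against both lists from the question. The sketch below uses one possible lookaround-based pattern (an assumption for illustration, not the only valid answer); the names comma_number, matched and rejected are made up.

```python
import re

# One possible pattern: the lookarounds stop a match from starting or ending
# in the middle of a longer run of digits and commas.
comma_number = re.compile(r'''
    (?<![\d,])      # not preceded by a digit or a comma
    \d{1,3}         # one to three leading digits
    (?:,\d{3})*     # zero or more groups of a comma plus exactly three digits
    (?![\d,])       # not followed by a digit or a comma
    ''', re.VERBOSE)

should_match = ['42', '1,234', '6,368,745']
should_not_match = ['12,34,567', '1234']

# fullmatch() requires the whole string to be a valid number.
matched = [s for s in should_match if comma_number.fullmatch(s)]

# search() must find nothing at all inside the invalid strings.
rejected = [s for s in should_not_match if comma_number.search(s) is None]

print(matched)
print(rejected)
```

If only whole strings need to be checked, the simpler pattern r'^\d{1,3}(,\d{3})*$' would also pass both lists.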
21. Write a regular expression to match the full name of someone whose last name is Nakamoto. You can assume that the first name before it is always a single word that starts with a capital letter. This regular expression must match:
  • 'Satoshi Nakamoto'
  • 'Alice Nakamoto'
  • 'RoboCop Nakamoto'
But it does not match:
  • 'satoshi Nakamoto' (name without capital letter)
  • 'Mr. Nakamoto' (the preceding word contains non-alphabetic characters)
  • 'Nakamoto' (no name)
  • 'Satoshi nakamoto' (last name without capital letter)
Answer:
re.compile(r'[A-Z][a-zA-Z]*\sNakamoto')
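The name pattern can be checked against the two lists from exercise 21 in the same way. A minimal sketch, assuming the character class is written as [A-Z] (a range from A to Z); the list and variable names are made up.

```python
import re

# A capital letter, then any letters, one whitespace character, then the last name.
name_regex = re.compile(r'[A-Z][a-zA-Z]*\sNakamoto')

should_match = ['Satoshi Nakamoto', 'Alice Nakamoto', 'RoboCop Nakamoto']
should_not_match = ['satoshi Nakamoto', 'Mr. Nakamoto', 'Nakamoto', 'Satoshi nakamoto']

matched = [s for s in should_match if name_regex.fullmatch(s)]
rejected = [s for s in should_not_match if name_regex.search(s) is None]

print(matched)
print(rejected)
```

Note that [AZ] without the hyphen would match only the two letters A and Z, so 'Satoshi Nakamoto' would fail.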
22. Write a regular expression to match a sentence whose first word is Alice, Bob, or Carol.
The second word is eats, pets, or throws, and the third word is apples, cats, or baseballs. The sentence ends with a period. This regular expression is not case sensitive. It must match:
  • 'Alice eats apples.'
  • 'Bob pets cats.'
  • 'Carol throws baseballs.'
  • 'Alice throws Apples.'
  • 'BOB EATS CATS.'
But it does not match:
  • 'RoboCop eats apples.'
  • 'ALICE THROWS FOOTBALLS.'
  • 'Carol eats 7 cats.'
Answer:
re.compile(r'''((?:Alice|Bob|Carol)\s(?:eats|pets|throws)\s(?:apples|cats|baseballs)\.)''', re.VERBOSE | re.I)
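The sentence pattern from exercise 22 can be tested against both lists as well. This sketch drops the outer group and re.VERBOSE, which are not needed for a yes/no check, and keeps re.I for case-insensitivity; the variable names are made up.

```python
import re

sentence_regex = re.compile(
    r'(?:Alice|Bob|Carol)\s(?:eats|pets|throws)\s(?:apples|cats|baseballs)\.',
    re.I)  # re.I makes the match case-insensitive

should_match = ['Alice eats apples.', 'Bob pets cats.', 'Carol throws baseballs.',
                'Alice throws Apples.', 'BOB EATS CATS.']
should_not_match = ['RoboCop eats apples.', 'ALICE THROWS FOOTBALLS.',
                    'Carol eats 7 cats.']

matched = [s for s in should_match if sentence_regex.fullmatch(s)]
rejected = [s for s in should_not_match if sentence_regex.search(s) is None]

print(len(matched), len(rejected))
```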

Latest reply

Regular expressions are difficult now, and there are many more details than before (published on 2024-5-10 12:04)
 
 

Too detailed, oh

 
 
 

There are very few opportunities to apply regular expressions. Through reading, I found that regular expressions are powerful, but they are also very complicated. It would be great if there were a design tool.

 
 
 
