A powerful tool to improve the quality of your code: the use of macro definitions
I. Introduction
I have always had this feeling: when I study a new field, if some concept seems hard to understand the first time I meet it, then no matter how much time I later spend on it, I will keep thinking of it as hard. In other words, first impressions matter a great deal.
For example, macro definitions in C and I never got along. I have always felt that macros are the hardest part of C, just as some friends always feel that pointers are the hardest part of C.
In essence, a macro is a code generator: with the support of the preprocessor it generates code at build time, and the concrete mechanisms are conditional compilation and macro expansion. Let us fix this basic idea in mind first, and then work through explanations and code to understand how macro definitions are put to use.
So today we will summarize and dig into the key points of macro definitions. I hope that writing this article helps me get over my own psychological barrier, and I believe that after reading it you will also have a solid overall grasp of macros.
II. Operation of the preprocessor
1. When macros take effect: preprocessing
When a C program is compiled, it goes through four stages on the way from source file to binary executable: preprocessing, compilation, assembly, and linking.
What we are discussing today is the first stage, preprocessing. The work of this stage is done by the preprocessor and includes the following 4 tasks:
File inclusion (#include); conditional compilation (#if .. #elif .. #endif); macro expansion; line control (#line).
2. Conditional compilation
Normally every line of a C source file is compiled, but sometimes, to keep the program lean, we want only part of the code to be compiled. In that case we add conditions to the program so that only the code that meets them reaches the compiler and the rest is discarded. This is conditional compilation.
Simply put: the preprocessor filters the code according to the conditions we set, writes the code that remains to an intermediate file, and then hands that file to the compiler.
Conditional compilation is used in practically every project. You will almost certainly need it in situations like these:
The program must be built into executables for different platforms; the same code base must run on products with different feature sets on the same platform; the program contains test-only code that must be excluded so that it does not pollute the product code.
Here are 3 examples of conditional compilation that often appear in real code:
Example 1: Used to distinguish between C and C++ code
#ifdef __cplusplus
extern "C" {
#endif
void hello();
#ifdef __cplusplus
}
#endif
Code like this can be found in almost every open source library. Its main purpose is to let C and C++ code be mixed. Specifically:
If you compile with gcc (as C), the macro __cplusplus is not defined, so the extern "C" lines are dropped; if you compile with g++, __cplusplus is defined and extern "C" takes effect, so the function name hello is not mangled by the C++ compiler and can be called from C code.
Example 2: Used to distinguish different platforms
#if defined(linux) || defined(__linux) || defined(__linux__)
sleep(1);    // Linux library function (argument in seconds)
#elif defined(WIN32) || defined(_WIN32)
Sleep(1000); // Windows library function (capital S, argument in milliseconds)
#endif
So where do linux, __linux, __linux__, WIN32 and _WIN32 come from? We can think of them as prepared for us by the compiler for the compilation target platform (operating system).
Example 3: Declaring export and import functions when writing a dynamic library on Windows
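#if defined(linux) || defined(__linux) || defined(__linux__)
    #define LIBA_API
#elif defined(LIBA_API_EXPORTS)
    #define LIBA_API __declspec(dllexport)
#else
    #define LIBA_API __declspec(dllimport)
#endif
// function declaration
LIBA_API void hello();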
This code is taken directly from an example in a short video I recorded on Bilibili. At the time its main purpose was to demonstrate building with make and cmake on Linux. Later, friends asked me to show the same build with make and cmake on Windows, so I wrote the macro definition above.
When building the dynamic library with MSVC, define the macro LIBA_API_EXPORTS in the build options (Makefile or CMakeLists.txt); the LIBA_API in front of the exported function hello is then replaced with __declspec(dllexport), which marks the function for export. When building an application that uses the dynamic library, you include the library's header file but do not define LIBA_API_EXPORTS; LIBA_API in front of hello is then replaced with __declspec(dllimport), which marks the function for import. One more point: a static library needs neither export nor import decoration, so in that case LIBA_API should expand to nothing.
3. Platform predefined macros
As we saw above, some macros are already defined for us by the compiler for the target platform. Besides the operating system macros above, there is another group of predefined macros that is widely used in logging code:
__FILE__ : name of the current source file;
__LINE__ : current line number in the source file;
__FUNCTION__ : name of the enclosing function (a compiler extension; the standard C99 spelling is __func__);
__DATE__ : compilation date;
__TIME__ : compilation time;
For example:
printf("file name: %s, function name = %s, current line:%d \n", __FILE__, __FUNCTION__, __LINE__);
III. Macro expansion
Macro expansion is simply code replacement, and it is the main topic of this article. Its biggest benefits are:
Less duplicated code; things that plain C syntax cannot express (for example, joining pieces of text into a new identifier); data types generated on demand, giving something similar to templates in C++; programs that are easier to understand and modify (for example, named numeric and string constants).
When we write code, every place a macro name appears can be thought of as a placeholder. During preprocessing, each macro name is replaced with the code in its definition. Note: this is nothing more than text replacement.
1. The most common macros
To facilitate the following description, let's first look at several common macro definitions:
(1) Definition of data type
#ifndef BOOL
typedef char BOOL;
#endif
#ifndef TRUE
#define TRUE 1
#endif
#ifndef FALSE
#define FALSE 0
#endif
One thing to watch when defining data types this way: if the program has to build with compilers on different platforms, check whether the toolchain you use already defines these names. For example, gcc itself has no BOOL type, while the Windows headers used with MSVC define BOOL as int. Note also that #ifndef can only detect a macro named BOOL, not a typedef of that name.
(2) Get the maximum and minimum values
#define MAX(a, b) (((a) > (b)) ? (a) : (b))
#define MIN(a, b) (((a) < (b)) ? (a) : (b))
(3) Calculate the number of elements in an array
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
(4) Bit operations
#define BIT_MASK(x) (1 << (x))
#define BIT_GET(x, y) (((x) >> (y)) & 0x01u)
#define BIT_SET(x, y) ((x) | (1 << (y)))
#define BIT_CLR(x, y) ((x) & (~(1 << (y))))
#define BIT_INVERT(x, y) ((x) ^ (1 << (y)))
2. Differences from functions
All of the macros above could also be written as functions, so what are the advantages and disadvantages of each?
Implemented as functions:
The parameter types must be fixed, and arguments are checked at the call site; each call has extra overhead: passing the arguments and the return value through the stack frame.
Implemented as macros:
No argument checking, so argument passing is more flexible; the macro body is expanded in place, with no function call; if the same macro is used in many places, the code size grows.
An example makes the comparison concrete. Take the MAX comparison above:
(1) Implemented with a macro
#include <stdio.h>
#define MAX(a, b) (((a) > (b)) ? (a) : (b))
int main(void)
{
    printf("max: %d \n", MAX(1, 2));
    return 0;
}
(2) Implemented with a function
#include <stdio.h>
int max(int a, int b)
{
    if (a > b)
        return a;
    return b;
}
int main(void)
{
    printf("max: %d \n", max(1, 2));
    return 0;
}
Apart from the function call overhead, there seems to be no difference. Here we compare two integers, but what if we need to compare two floating point values?
With the macro: MAX(1.1, 2.2); everything works. With the function: max(1.1, 2.2); the arguments are implicitly converted to int, so the fractional parts are silently dropped (at most the compiler issues a warning) and the result is wrong.
This is where the macro shows its advantage: a macro has no notion of type, so the caller can pass any data type, and the comparison is then carried out by the > and < operators of C itself.
With functions, you must define another function for floating point data, and later you may need still more versions for char, long, and so on.
In C++, this can be done with function templates. A template is also a mechanism for generating code: once a function template is defined, the compiler generates a separate function for each set of argument types used by callers. For example, define the following function template:
template<typename T> T max(T a, T b) {
    if (a > b)
        return a;
    return b;
}
max(1, 2);     // arguments are integers
max(1.1, 2.2); // arguments are floating point
When the compiler sees max(1, 2), it generates a function int max(int a, int b) { ... }; when it sees max(1.1, 2.2), it generates another function double max(double a, double b) { ... } (the literals 1.1 and 2.2 have type double).
So, from the point of view of code generation, macro definitions are somewhat similar to C++ templates, except that a macro is merely text substitution with no type checking.
The following example is also a nice one: it uses the type independence of macros to generate structures:
#include <stddef.h>
// no trailing semicolon, so that VEC(T) can be followed by a variable name
#define VEC(T)          \
    struct vector_##T { \
        T *data;        \
        size_t size;    \
    }
int main(void)
{
    VEC(int)   vec_1 = { .data = NULL, .size = 0 };
    VEC(float) vec_2 = { .data = NULL, .size = 0 };
    return 0;
}
This example uses ##, which is explained below. In the earlier examples the macro arguments were values, while here the macro arguments are data types. Thanks to the type independence of macros, structures are created "on demand":
struct vector_int {
    int *data;
    size_t size;
};
struct vector_float {
    float *data;
    size_t size;
};
There is a trap here. Note: the data type passed in must not contain spaces. If you write VEC(long long), the expansion is:
struct vector_long long { // syntax error
    long long *data;
    size_t size;
};
IV. Symbols: # and ##
These two operators are used in very clever ways. It is no exaggeration to say that they show up in almost every framework's code! Their functions are:
#: converts the argument into a string literal; ##: pastes two tokens together.
1. #: Stringification
Let’s look at the simplest example:
#define STR(x) #x
printf("string of 123: %s \n", STR(123));
The argument is the number 123, and the result is the string "123". This is stringification.
2. ##: Parameter connection
Paste the macro arguments together to form a new identifier, for example:
#include <stdio.h>
#define MAKE_VAR(name, no) name##no
int main(void)
{
    int MAKE_VAR(a, 1) = 1;
    int MAKE_VAR(b, 2) = 2;
    printf("a1 = %d \n", a1);
    printf("b2 = %d \n", b2);
    return 0;
}
When the macro is invoked as MAKE_VAR(a, 1), name and no on either side of ## are replaced with a and 1 and then pasted together to give a1. The int in front of the invocation makes a1 an integer variable, which is finally initialized to 1.
V. Processing of variable parameters
1. Defining and using parameter names
The number of arguments to a macro need not be fixed, just as when calling printf. In the definition, three dots (...) stand for the variable arguments; as a GNU extension (supported by gcc and clang), a name can also be placed in front of the three dots.
If plain three dots (...) are used to receive the variable arguments, refer to them as __VA_ARGS__ in the macro body:
#define debug1(...) printf(__VA_ARGS__)
debug1("this is debug1: %d \n", 1);
If a name is placed in front of the three dots, that name must be used in the macro body, and __VA_ARGS__ cannot be used:
#define debug2(args...) printf(args)
debug2("this is debug2: %d \n", 2);
2. Handling of zero variable parameters
Take a look at this macro:
#define debug3(format, ...) printf(format, __VA_ARGS__)
debug3("this is debug3: %d \n", 3);
This compiles and runs without problems. However, if the macro is used like this:
debug3("hello \n");
compilation fails with: error: expected expression before ‘)’ token. Why? Look at the code after macro expansion (__VA_ARGS__ is empty):
printf("hello \n",);
Can you see the problem? There is an extra comma after the format string! To solve this, gcc and clang provide an extension: placing ## in front of __VA_ARGS__ deletes the preceding comma when the variable arguments are empty. With the macro changed as follows, the problem goes away:
#define debug3(format, ...) printf(format, ##__VA_ARGS__)
Similarly, if you name the variable arguments yourself, put ## in front of that name:
#define debug4(format, args...) printf(format, ##args)
VI. Fantastic macros
The essence of macro expansion is text replacement, but once variable arguments (__VA_ARGS__) and the pasting power of ## are added, the possibilities are endless.
I have always believed that imitation is the first step to mastery. Read widely, study how others use macros, adapt their techniques to your own needs, and practice following the steps "first rigidify, then optimize, finally solidify"; one day you will become a master too.
Here are some clever things that can be done with macro definitions.
1. Logging
Logging is a standard feature of almost every product. The most common usage looks like this:
#include <stdio.h>
#ifdef DEBUG
#define LOG(...) printf(__VA_ARGS__)
#else
#define LOG(...)
#endif
int main(void)
{
    LOG("name = %s, age = %d \n", "zhangsan", 20);
    return 0;
}
To enable log output, define the macro DEBUG at build time (for example with -DDEBUG), and the debugging information is printed. In a real product the log would of course be written to a file. When logging is not wanted, the logging statement is defined as an empty statement, so it disappears entirely.
Thinking about it differently, we can also control the output with a conditional statement:
#include <stdio.h>
#ifdef DEBUG
#define debug if (1)
#else
#define debug if (0)
#endif
int main(void)
{
    debug {
        printf("name = %s, age = %d \n", "zhangsan", 20);
    }
    return 0;
}
This way of controlling log output is not seen very often, but it does the job. It is included here just to broaden your thinking.
2. Use macros to iterate over each parameter
#include <stdio.h>
#define first(x, ...) #x
#define rest(x, ...) #__VA_ARGS__
#define destructive(...)                              \
    do {                                              \
        printf("first is: %s\n", first(__VA_ARGS__)); \
        printf("rest are: %s\n", rest(__VA_ARGS__));  \
    } while (0)
int main(void)
{
    destructive(1, 2, 3);
    return 0;
}
The main idea is to split the first argument off from the variable arguments __VA_ARGS__ each time and then handle the remaining arguments the same way, so that every argument can be separated out. (A macro cannot invoke itself recursively, so each further step needs its own helper macro.) I remember that Mr. Hou Jie implemented something similar in his C++ videos using variadic templates.
I just found the code Mr. Hou Jie demonstrated in my Youdao notes. Friends who are familiar with C++ can study it:
#include <iostream>
// the final call that ends the recursion
void myprint()
{
}
template <typename T, typename... Types>
void myprint(const T &first, const Types&... args)
{
    std::cout << first << std::endl;
    std::cout << "remain args size = " << sizeof...(args) << std::endl;
    // recurse on the remaining arguments
    myprint(args...);
}
int main()
{
    myprint("aaa", 7.5, 100);
    return 0;
}
3. Dynamically call different functions
// an ordinary enumeration
enum {
    ERR_One,
    ERR_Two,
    ERR_Three
};
// use ## pasting to build both the case label and the function name
#define TEST(no)     \
    case ERR_##no:   \
        Func_##no(); \
        break;
void Func_One()
{
    printf("this is Func_One \n");
}
void Func_Two()
{
    printf("this is Func_Two \n");
}
void Func_Three()
{
    printf("this is Func_Three \n");
}
int main()
{
    int c = ERR_Two;
    switch (c) {
        TEST(One);
        TEST(Two);
        TEST(Three);
    }
    return 0;
}
The core of this example is the TEST macro. Using ## pasting, it builds the case label to compare against and the name of the matching function, which is then called.
4. Dynamically create error codes and corresponding error strings
This is another very clever example. It uses both # (stringification) and ## (pasting) to generate error codes and their matching error strings:
#include <stdio.h>
#define MY_ERRORS     \
    E(TOO_SMALL)      \
    E(TOO_BIG)        \
    E(INVALID_VARS)
#define E(e) Error_## e,
typedef enum {
    MY_ERRORS
} MyEnums;
#undef E
#define E(e) #e,
const char *ErrorStrings[] = {
    MY_ERRORS
};
#undef E
int main()
{
    printf("%d - %s \n", Error_TOO_SMALL, ErrorStrings[0]);
    printf("%d - %s \n", Error_TOO_BIG, ErrorStrings[1]);
    printf("%d - %s \n", Error_INVALID_VARS, ErrorStrings[2]);
    return 0;
}
After the macros are expanded, we get an enumeration type and an array of string constants:
typedef enum {
    Error_TOO_SMALL,
    Error_TOO_BIG,
    Error_INVALID_VARS,
} MyEnums;
const char *ErrorStrings[] = {
    "TOO_SMALL",
    "TOO_BIG",
    "INVALID_VARS",
};
Isn't the expanded code simple? Compiling and running gives:
0 - TOO_SMALL
1 - TOO_BIG
2 - INVALID_VARS
VII. Conclusion
Some people love macros to death, to the point of abusing them, while others hate macros to the core and even call them "evil"! In fact, macros are to C what a kitchen knife is to a chef or a gangster: used well, they make code concise and easy to maintain; used badly, they introduce obscure syntax and hard-to-debug bugs.
As developers, we just need to strike a balance between execution efficiency and code maintainability.