20.1: Function-Like Preprocessor Macros

So far, we've been defining simple preprocessor macros with simple values, such as

	#define MAXLINE 200

and

	#define DATAFILE "data.dat"

These macros always expand to constant text (in these examples, the integer constant 200 and the string literal "data.dat", respectively) wherever they're used. However, it's also possible to define macros which expand to text which is different each time, depending on some subsidiary text which you specify. These macros take arguments, in much the same way that functions take arguments. In either case, the outcome (the expansion of the macro, like the action of the function) depends in some way on the particular values passed to it as arguments. The basic syntax of a function-like macro definition is

	#define macroname( args ) expansion

There must be no space between macroname and the open parenthesis.

We will illustrate the use of function-like macros with several examples.

In a previous chapter, we used the ``bitwise'' operators &, |, and ~ to manipulate individual bits within an integer value or ``flags word.'' In one application, we defined several simple macros whose values were ``bitmasks'':

	#define DIRTY	0x01
	#define OPEN	0x02
	#define VERBOSE	0x04

Then we used code like

	flags |= DIRTY;

to ``set the DIRTY bit,'' and code like

	flags &= ~DIRTY;

to clear the DIRTY bit, and code like

	if(flags & DIRTY)

to test it. With enough practice, these idioms become familiar enough that they can be read immediately, but suppose we wanted to make them less cryptic. Using the preprocessor, we'll be able to set up macros so that we can write

	SETBIT(flags, DIRTY);

and

	CLEARBIT(flags, DIRTY);

and

	if(TESTBIT(flags, DIRTY))

The definition of the SETBIT() macro might look like this:

	#define SETBIT(x, b) x |= b

When a function-like macro is expanded, the preprocessor keeps track of the ``arguments'' it was ``called'' with. When we write

	SETBIT(flags, DIRTY);

we're invoking the SETBIT() macro with a first argument of flags and a second argument of DIRTY. Within the definition of the macro, those arguments were known as x and b. So in the replacement text of the macro, x |= b, everywhere that x appears it will be further replaced by (in this case) flags, and everywhere that b appears it will be replaced by DIRTY. So the invocation

	SETBIT(flags, DIRTY);

will result in the expansion

	flags |= DIRTY;

Notice that the semicolon had nothing to do with macro expansion; it appeared following the close parenthesis of the invocation, and so shows up following the final expansion.

Similarly, we can define the CLEARBIT() and TESTBIT() macros like this:

	#define CLEARBIT(x, b) x &= ~b
	#define TESTBIT(x, b) x & b

Convince yourself that the invocations

	CLEARBIT(flags, DIRTY);

and

	if(TESTBIT(flags, DIRTY))

will result in the expansions

	flags &= ~DIRTY;

and

	if(flags & DIRTY)

as desired.
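As a quick check, here is a minimal sketch that uses the three macros exactly as defined so far. The function demo_flags() is our own invention for illustration; only the macro and bitmask definitions come from the text above.

```c
#include <stdio.h>

#define DIRTY	0x01
#define OPEN	0x02
#define VERBOSE	0x04

/* The three macros as defined in the text so far (not yet parenthesized). */
#define SETBIT(x, b) x |= b
#define CLEARBIT(x, b) x &= ~b
#define TESTBIT(x, b) x & b

/* Hypothetical helper: sets two bits, clears one, and returns the flags word. */
unsigned int demo_flags(void)
{
	unsigned int flags = 0;

	SETBIT(flags, OPEN);		/* flags |= OPEN; */
	SETBIT(flags, DIRTY);		/* flags |= DIRTY; */
	CLEARBIT(flags, DIRTY);		/* flags &= ~DIRTY; */

	if(TESTBIT(flags, OPEN))	/* if(flags & OPEN) */
		printf("OPEN is set\n");
	if(TESTBIT(flags, DIRTY))	/* if(flags & DIRTY) */
		printf("DIRTY is set\n");

	return flags;
}
```

Calling demo_flags() should print only the OPEN line and return 0x02, since DIRTY was set and then cleared again.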

Just as for a regular function, parameter names such as x and b in a function-like macro definition are arbitrary; they're just used to indicate where in the replacement text the actual argument ``values'' should be plugged in. Also, those parameter names are not looked for within character or string constants. If you had a macro like

	#define XX(a, b) printf("%s is a %s\n", a, b)

then the invocation

	XX("John", "pumpkin-head");

would result in

	printf("%s is a %s\n", "John", "pumpkin-head");

It would not result in

	printf("%s is "John" %s\n", "John", "pumpkin-head");

which (in this case, anyway) would not have been at all what you wanted.
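One way to convince yourself of this is to look at how many characters printf reports having written. In this sketch (the function announce() is hypothetical, introduced only for the demonstration), the letter a inside the format string is left alone even though a is also a parameter name:

```c
#include <stdio.h>

#define XX(a, b) printf("%s is a %s\n", a, b)

/* Hypothetical helper: returns printf's character count for one XX() call. */
int announce(void)
{
	/* Expands to printf("%s is a %s\n", "John", "pumpkin-head") */
	return XX("John", "pumpkin-head");
}
```

The line printed is John is a pumpkin-head, which is 23 characters including the newline. If the a inside the string had been replaced as well, the result would not even have compiled.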

If we remember that (other than being careful not to expand macro arguments inside of string and character constants) the preprocessor is otherwise pretty dumb and literal-minded, we can see why there must not be a space between the macro name and the open parenthesis in a function-like macro definition. If we wrote

	#define SETBIT (x, b) x |= b

the preprocessor would think we were defining a simple macro, named SETBIT, with the (rather meaningless) replacement text (x, b) x |= b, and every time it saw SETBIT, it would replace it with (x, b) x |= b. (It would ignore any parentheses and arguments that the invocation of SETBIT happened to be followed with; that is, after the incorrect definition, the invocation

	SETBIT(flags, DIRTY);

would expand to

	(x, b) x |= b(flags, DIRTY);

where the (flags, DIRTY) part passed through without modification, along with the trailing semicolon.)

There are a few potential pitfalls associated with preprocessor macros, and with function-like ones in particular. To illustrate these, let's look at another example. C has no built-in exponentiation operator; if you want to square something, the easiest way is usually to multiply it by itself. Suppose that you got tired of writing

	x * x

and

	a * a + b * b

and

	(x + 1) * (x + 1)

Knowing about function-like preprocessor macros, you might be inspired to define a SQUARE() macro:

	#define SQUARE(z) z * z

Now you can write things like SQUARE(x) and SQUARE(a) + SQUARE(b), and this seems like it will be workable and convenient. But wait: what about that third example? If you write

	y = SQUARE(x + 1);

the simpleminded preprocessor will expand it to

	y = x + 1 * x + 1;

Remember, the preprocessor doesn't evaluate arguments the way a function call would; it just performs textual substitutions. So in this last example, the ``value'' of the macro parameter z is x + 1, and everywhere that a z had appeared in the replacement text, the preprocessor fills in x + 1. But when the rest of the compiler sees the result, it will give multiplication higher precedence, as usual, and it will interpret the result as if you had written

	y = x + (1 * x) + 1;

which will not usually give you the result you wanted!
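To see the damage with actual numbers, here is a small sketch. The name SQUARE_V1 is our own label for this first definition, and the function around it is hypothetical:

```c
/* The first, unparenthesized definition from the text. */
#define SQUARE_V1(z) z * z

/* Hypothetical helper: returns what SQUARE_V1(x + 1) really evaluates to. */
int naive_square_of_successor(int x)
{
	return SQUARE_V1(x + 1);	/* expands to: x + 1 * x + 1 */
}
```

With x equal to 3, we were hoping for 16, but the function returns 3 + (1 * 3) + 1, or 7.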

How can we fix this problem? We could forbid ourselves to ever ``call'' the SQUARE() macro on an argument that wasn't a single constant or variable name, but this seems like a harsh restriction. A better solution is to play with the definition of the macro itself: since the expansion we want is

	(x + 1) * (x + 1)

we can achieve that by defining the macro like this:

	#define SQUARE(z) (z) * (z)

Now

	y = SQUARE(x + 1);

expands to

	y = (x + 1) * (x + 1);

as we wished.

There's another problem, though: what if we write

	q = 1 / SQUARE(r);

Now we get

	q = 1 / (r) * (r);

and the rest of the compiler interprets this as

	q = (1 / (r)) * (r);

(Multiplication and division have the same precedence, and by default they go from left to right.) What can we do this time? We could enclose the invocation of the SQUARE() macro in extra parentheses, like this:

	q = 1 / (SQUARE(r));

but that seems like a real nuisance to remember. A better solution is to build those extra parentheses into the definition of the macro, too:

	#define SQUARE(z) ((z) * (z))

Now the code 1 / SQUARE(r) expands to 1 / ((r) * (r)), and we have a macro that's safe against all of the troublesome invocations we've tried so far.
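The difference between the last two definitions can be checked side by side. In this sketch the names SQUARE_V2 and SQUARE_V3, and the two functions, are hypothetical; they exist only so that both definitions can appear in one file:

```c
#define SQUARE_V2(z) (z) * (z)		/* parameters parenthesized only */
#define SQUARE_V3(z) ((z) * (z))	/* whole replacement text parenthesized, too */

double reciprocal_square_v2(double r)
{
	return 1 / SQUARE_V2(r);	/* 1 / (r) * (r), that is, (1 / r) * r */
}

double reciprocal_square_v3(double r)
{
	return 1 / SQUARE_V3(r);	/* 1 / ((r) * (r)) */
}
```

For r equal to 2.0, the first function returns 1.0 and the second returns 0.25; only the second is the reciprocal of the square.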

There's a third potential problem, though: suppose we write

	y = SQUARE(x++);

Even with all of our parentheses, this expands to

	y = ((x++) * (x++));

and this is a distinct no-no, because we're incrementing x twice within the same expression. We might end up with y containing the value x * x, as we wanted, but it's somewhat more likely that we'll end up with (x + 1) * x or x * (x + 1), instead. (We're now worried not just about what the macro expands to, but what the resultant expression evaluates to.) Furthermore, since expressions like x++ * x++ are undefined according to the ANSI/ISO C Standard, they can actually result in anything, even complete nonsense. So SQUARE(x++) simply isn't going to work. (The explicit parentheses, by the way, don't make the expression any less undefined.)

There's no good fix for this third problem. We are going to have to remember that when we invoke function-like macros, the macro might expand one of its arguments multiple times, so we had better not ever give it an argument with a side effect, such as x++, or else the side effect might end up happening multiple times, with undefined results. (That's one reason we always use capital letters for macro names, to remind ourselves that they are special, and that we might have to be careful when invoking them.)

The other way around the third problem is not to use a function-like preprocessor macro at all, but instead to use a genuine function. If we defined

	int square(int x)
	{
		return x * x;
	}

then we wouldn't have any of these problems. (Of course, then we'd have the limitation that we could only use this square function on arguments of a certain type, in this case, int. We could declare it as accepting and returning type double, but then we might worry that it was doing needless floating-point conversions in the cases where we handed it integer values...)
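Since SQUARE(x++) itself is undefined, we can't safely run it to see what happens. A well-defined way to watch the double evaluation, sketched here under our own assumptions, is to pass a function call that counts how often it runs. Everything in this block except SQUARE() and square() is hypothetical:

```c
#define SQUARE(z) ((z) * (z))

static int calls = 0;	/* how many times next_value() has been called */

/* Hypothetical function with a visible side effect: it counts its calls. */
static int next_value(void)
{
	calls++;
	return calls;
}

/* The genuine function from the text. */
static int square(int x)
{
	return x * x;
}

/* Returns how many times the argument of one SQUARE() was evaluated. */
int evaluations_with_macro(void)
{
	calls = 0;
	(void)SQUARE(next_value());	/* ((next_value()) * (next_value())) */
	return calls;
}

/* Returns how many times the argument of one square() was evaluated. */
int evaluations_with_function(void)
{
	calls = 0;
	(void)square(next_value());	/* the argument is evaluated once, then passed */
	return calls;
}
```

The macro version evaluates its argument twice (and so computes 1 * 2 rather than any square), while the function version evaluates it exactly once.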

When should you use a function-like macro and when should you use a real function? In most cases, it's safer to use real functions. Generally, you use function-like macros only when the code they expand to is quite small and simple, and when defining and using a real function would for some reason be awkward, or when the code will be executed so often that the overhead of calling a real function would significantly impact the program's efficiency.

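As an example of how a real function might be awkward, notice that we couldn't write SETBIT() and CLEARBIT() as conventional functions with the same calling convention, because a C function receives copies of its arguments and so can't modify the caller's variables, yet SETBIT() and CLEARBIT() are supposed to. (That is, SETBIT(flags, DIRTY) modifies flags. A function version would have to be passed a pointer to the flags word instead.)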
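To make the awkwardness concrete, here is a sketch of what function versions would look like. The names set_bit_by_value(), set_bit(), and clear_bit() are hypothetical; they are not part of the examples above.

```c
#define DIRTY	0x01

/* This attempt changes only the function's private copy of x. */
void set_bit_by_value(unsigned int x, unsigned int b)
{
	x |= b;		/* the caller's variable is untouched */
	(void)x;
}

/* Working versions must be handed the address of the flags word. */
void set_bit(unsigned int *x, unsigned int b)
{
	*x |= b;
}

void clear_bit(unsigned int *x, unsigned int b)
{
	*x &= ~b;
}
```

The caller now has to remember to write set_bit(&flags, DIRTY) rather than SETBIT(flags, DIRTY), which is the sort of inconvenience that makes the macro attractive here.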

To summarize the important rules of this section, whenever defining a function-like macro, remember:

1. Put parentheses around each instance of each macro parameter in the replacement text.
2. Put parentheses around the entire replacement text.
3. Capitalize the macro name to remind yourself that it is a macro, so that you won't call it on arguments with side effects.

Remember, too, not to put a space between the macro name and the open parenthesis in the definition.

Rewriting our first three examples to follow these rules, we'd have:

	#define SETBIT(x, b)   ((x) |= (b))
	#define CLEARBIT(x, b) ((x) &= ~(b))
	#define TESTBIT(x, b)  ((x) & (b))

(It's harder to see how SETBIT() and CLEARBIT() might fail if they weren't parenthesized, but unless you're really sure of yourself, there usually isn't a reason not to use the extra parentheses.)
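TESTBIT(), on the other hand, fails quite easily without its parentheses. In this sketch, the names TESTBIT_BARE and TESTBIT_SAFE and the two functions are hypothetical, introduced so that both forms can be compared in one file:

```c
#define DIRTY	0x01

#define TESTBIT_BARE(x, b) x & b		/* the original definition */
#define TESTBIT_SAFE(x, b) ((x) & (b))		/* the parenthesized definition */

/* Intended to return 1 when the DIRTY bit is clear; it never does. */
int is_clean_bare(unsigned int flags)
{
	/* == binds tighter than &, so this is flags & (DIRTY == 0), always 0 */
	return TESTBIT_BARE(flags, DIRTY) == 0;
}

/* Returns 1 when the DIRTY bit is clear, 0 when it is set. */
int is_clean_safe(unsigned int flags)
{
	return TESTBIT_SAFE(flags, DIRTY) == 0;	/* ((flags) & (DIRTY)) == 0 */
}
```

Some compilers will warn about the first function, suggesting parentheses around the comparison, which is a hint that the expansion isn't what we meant.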

A few final notes about function-like preprocessor macros: Sometimes, people try to write function-like macros which are even more like functions in that they expand to multiple statements; however, this is considerably trickier than it looks (at least, if it's not to fall victim to additional sets of pitfalls). Also, people sometimes wish for macros that take a variable number of arguments (in much the same way that the printf function accepts a variable number of arguments). The original ANSI C Standard offered no good way to do this; the 1999 revision of the Standard (C99) added variadic macros, written with ... in the parameter list and __VA_ARGS__ in the replacement text, but these are not available in compilers that support only the original Standard.
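For the curious, here is a hedged sketch of two idioms that are commonly used for these wishes. It assumes a compiler supporting the 1999 revision of the C Standard (C99) or later for the variadic macro; the names SWAP_INT, LOGF, order_pair(), and report() are all hypothetical.

```c
#include <stdio.h>

/* A multi-statement macro wrapped in do { ... } while(0), so that the
   expansion is a single statement, even as the body of an if without braces. */
#define SWAP_INT(a, b) do { int tmp_ = (a); (a) = (b); (b) = tmp_; } while(0)

/* A C99 variadic macro: __VA_ARGS__ stands for whatever matched the "...". */
#define LOGF(...) fprintf(stderr, __VA_ARGS__)

/* Orders two ints so that *lo <= *hi. */
void order_pair(int *lo, int *hi)
{
	if(*lo > *hi)
		SWAP_INT(*lo, *hi);	/* one statement, so the else still pairs up */
	else
		LOGF("%d and %d are already in order\n", *lo, *hi);
}

void report(const char *who, const char *what)
{
	LOGF("%s is a %s\n", who, what);
}
```

Note that SWAP_INT() evaluates each of its arguments more than once, so the warning about side effects applies to it just as it did to SQUARE().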
