section 2.7: Type Conversions

The conversion rules described here and on page 44 are straightforward, but they're quite important, so you'll need to learn them well. Usually, conversions happen automatically and when you want them to, but not always, so it's important to keep the rules in mind. (Recall the discussion of 5/9 on page 12.)

Deep sentence:

A char is just a small integer, so chars may be freely used in arithmetic expressions.

Whether you treat a ``small integer'' as a character or an integer is pretty much up to you. As we saw earlier, in the ASCII character set, the character '0' has the value 48. Therefore, saying

	int i = '0';

is the same as saying

	int i = 48;

If you print i out as a character, using

	putchar(i);

or

	printf("%c", i);

(the %c format prints characters; see page 13), you'll see the character '0'. If you print it out as a number:

	printf("%d", i);

you'll see the value 48.
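Here is a minimal sketch of the same idea in one place. The helper name show_both is made up for illustration, and the "48" it produces assumes the ASCII character set.

```c
#include <stdio.h>

/* show_both: hypothetical helper, not a library function.
   Formats the same int value twice into buf: once as a
   character (%c) and once as a number (%d). */
int show_both(char *buf, size_t size, int i)
{
	return snprintf(buf, size, "%c %d", i, i);
}
```

Called with '0' on an ASCII machine, this fills buf with the character 0, a space, and the digits 48: one value, two ways of looking at it.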

Most of the time, you'll use whatever notation matches what you're trying to do. If you want the character '0', you'll use '0'. If you want the value 48 (as the number of months in four years, or something), you'll use 48. If you want to print characters, you'll use putchar or printf %c, and if you want to print integers, you'll use printf %d. Occasionally, you'll cross over between thinking of characters as characters and as values, such as in the character-counting program in section 1.6 on page 22, or in the atoi function we'll look at next. (You should never have to know that '0' has the value 48, and you should never have to write code which depends on it.)

page 43

To illustrate the ``schizophrenic'' nature of characters (are they characters, or are they small integer values?), it's useful to look at an implementation of the standard library function atoi. (If you're getting overwhelmed, though, you may skip this example for now, and come back to it later.) The atoi routine converts a string like "123" into an integer having the corresponding value.

As you study the atoi code at the top of page 43, figure out why it does not seem to explicitly check for the terminating '\0' character.

The expression

	s[i] - '0'

is an example of the ``crossing over'' between thinking about a character and its value. Since the value of the character '0' is not zero (and, similarly, the other numeric characters don't have their ``obvious'' values, either), we have to do a little conversion to get the value 0 from the character '0', the value 1 from the character '1', etc. Since the character set values for the digit characters '0' to '9' are contiguous (48-57, if you must know), the conversion involves simply subtracting an offset, and the offset (if you think about it) is simply the value of the character '0'. We could write

	s[i] - 48

if we really wanted to, but that would require knowing what the value actually is. We shouldn't have to know (and it might be different in some other character set), so we can let the compiler do the dirty work by using '0' as the offset (since subtracting '0' is, by definition, the same as subtracting the value of the character '0').

The functions from <ctype.h> are being introduced here without a lot of fanfare. Here is the main loop of the atoi routine, rewritten to use isdigit:

	for (i = 0; isdigit(s[i]); ++i)
		n = 10 * n + (s[i] - '0');
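For reference, here is how that loop might sit inside a complete function. This is only a sketch: the name my_atoi is made up to avoid clashing with the real atoi, and it handles neither leading whitespace nor a sign, nor does it check for overflow.

```c
#include <ctype.h>

/* my_atoi: hypothetical name for a sketch of atoi built on isdigit.
   Converts the leading decimal digits of s to an int. */
int my_atoi(const char *s)
{
	int i, n = 0;

	/* The cast to unsigned char keeps isdigit's argument in the
	   range it is defined for, even if plain char is signed. */
	for (i = 0; isdigit((unsigned char)s[i]); ++i)
		n = 10 * n + (s[i] - '0');   /* character -> digit value */
	return n;
}
```

With this definition, my_atoi("123") gives 123, and my_atoi("42abc") stops at the 'a' and gives 42.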

Don't worry too much about the discussion of signed vs. unsigned characters for now. (Don't forget about it completely, though; eventually, you'll find yourself working with a program where the issue is significant.) For now, just remember:

Use int as the type of any variable which receives the return value from getchar, as discussed in section 1.5.1 on page 16.
If you're ever dealing with arbitrary ``bytes'' of binary data, you'll usually want to use unsigned char.
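As a small illustration of the second point, here is a sketch of a routine that adds up some bytes of binary data. The name byte_sum is made up, and the comment about the range 0 to 255 assumes 8-bit chars.

```c
/* byte_sum: hypothetical helper that adds up n bytes of binary data.
   Because the elements are unsigned char, each one converts to a
   nonnegative value (0..255 with 8-bit chars) before being added. */
unsigned long byte_sum(const unsigned char *data, int n)
{
	unsigned long sum = 0;
	int i;

	for (i = 0; i < n; ++i)
		sum += data[i];
	return sum;
}
```

Had the array been declared as plain char, a byte such as 0xFF might have been treated as a negative number on some machines, and the sum would come out wrong.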

page 44

As we saw in section 2.6 on page 44, relational and logical operators always ``return'' 1 for ``true'' and 0 for ``false.'' However, when C wants to know whether something is true or false, it just looks at whether it's nonzero or zero, so any nonzero value is considered ``true.'' Finally, some functions which return true/false values (the text mentions isdigit) may return ``true'' values of other than 1.

You don't have to worry about these distinctions too much, and you also don't have to worry about the fragment

	d = c >= '0' && c <= '9'

as long as you write conditionals in a sensible way. If you wanted to see whether two variables a and b were equal, you'd never write

	if((a == b) == 1)

(although it would work: the == operator ``returns'' 1 if they're equal). Similarly, you don't want to write

	if(isdigit(c) == 1)

because it's equally silly-looking, and in this case it might not work. Just write things like

	if(a == b)

and

	if(isdigit(c))

and you'll steer clear of most problems. (Make sure, though, that you never try something like if('0' <= c <= '9'), since this wouldn't do at all what it looks like it's supposed to.)
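To see why the chained comparison goes wrong, it may help to write it the way the compiler groups it. The function names below are made up for illustration.

```c
/* chained: what '0' <= c <= '9' actually computes, with the
   grouping written out. The left comparison yields 0 or 1, and
   either of those is <= '9', so the result is always 1. */
int chained(int c)
{
	return ('0' <= c) <= '9';
}

/* in_range: what was presumably meant. */
int in_range(int c)
{
	return '0' <= c && c <= '9';
}
```

So chained('x') is 1 even though 'x' is not a digit, while in_range('x') is 0 as you'd hope.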

The set of implicit conversions on page 44, though informally stated, is exactly the set to remember for now. They're easy to remember if you notice that, as the authors say, ``the `lower' type is promoted to the `higher' type,'' where the ``order'' of the types is

	char < short int < int < long int < float < double < long double

(We won't be using long double, so you don't need to worry about it.) We'll have more to say about these rules on the next page.

Don't worry too much for now about the additional rules for unsigned values, because we won't be using them at first.

Do notice that implicit (automatic) conversions do happen across assignments. It's perfectly acceptable to assign a char to an int or vice versa, or assign an int to a float or vice versa (or any other combination). Obviously, when you assign a value from a larger type to a smaller one, there's a chance that it might not fit. Therefore, compilers will often warn you about such assignments.
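Here is a brief sketch of conversions across assignment, in both directions. The helper names are made up, and some compilers may warn about the first assignment, for the reason just mentioned.

```c
/* to_int: hypothetical helper. Assigning a double to an int
   converts implicitly and discards the fractional part. */
int to_int(double d)
{
	int i = d;
	return i;
}

/* to_double: hypothetical helper. Going the other way, a small
   int value always fits in a double. */
double to_double(int i)
{
	double d = i;
	return d;
}
```

For example, to_int(3.99) is 3, not 4: the conversion truncates rather than rounds.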

page 45

Casts can be a bit confusing at first. A cast is the syntax used to request an explicit type conversion; coercion is just a more formal word for ``conversion.'' A cast consists of a type name in parentheses and is used as a unary operator. You may have used languages which had conversion operators which looked more like function calls:

	integer i = 2;
	floating f = floating(i);	/* not C */
	integer i2 = integer(f);	/* not C */

In C, you accomplish the same thing with casts:

	int i = 2;
	float f = (float)i;
	int i2 = (int)f;

(Actually, in C, we wouldn't need casts in those initializations at all, because conversions between int and float are some of the ones that C performs automatically.)

To understand better how both implicit conversions and explicit casts work, let's study how the implicit conversions would look if we wrote them out explicitly. First we'll declare a few variables of various types:

	char c1, c2;
	int i1, i2;
	long int L1, L2;
	float f1, f2;
	double d1, d2;

Next we'll look at the kinds of conversions which C automatically performs when performing arithmetic on two dissimilar types, or when assigning a value to a dissimilar type. The rules are straightforward: when performing arithmetic on two dissimilar types, C converts one or both sides to a common type; and when assigning a value, C converts it to the type of the variable being assigned to.

If we add a char to an int:

	i2 = c1 + i1;

the fourth rule on page 44 tells us to convert the char to an int, as if we'd written

	i2 = (int)c1 + i1;

If we multiply a long int and a double:

	d2 = L1 * d1;

the second rule tells us to convert the long int to a double, as if we'd written

	d2 = (double)L1 * d1;

An assignment of a char to an int

	i1 = c1;

is as if we'd written

	i1 = (int)c1;

and an assignment of a float to an int

	i1 = f1;

is as if we'd written

	i1 = (int)f1;
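If you'd like to convince yourself that the implicit and explicit forms really are the same, a sketch like this one can be compiled and tried. The function names are made up; the parameter names follow the declarations above.

```c
/* Each pair computes the same value: the cast merely spells out
   the conversion that C performs anyway. */
int add_implicit(char c1, int i1) { return c1 + i1; }
int add_explicit(char c1, int i1) { return (int)c1 + i1; }

double mul_implicit(long int L1, double d1) { return L1 * d1; }
double mul_explicit(long int L1, double d1) { return (double)L1 * d1; }
```

Whatever arguments you pass, add_implicit and add_explicit agree, and so do mul_implicit and mul_explicit.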

Some programmers worry that implicit conversions are somehow unreliable and prefer to insert lots of explicit conversions. I recommend that you get comfortable with implicit conversions--they're quite useful--and don't clutter your code with extra casts.

There are a few places where you do need casts, however. Consider the code

	i1 = 200;
	i2 = 400;
	L1 = i1 * i2;

The product 200 x 400 is 80000, which is not guaranteed to fit into an int. (Remember that an int is only guaranteed to hold values up to 32767.) Since 80000 will fit into a long int, you might think that you're okay, but you're not: the two sides of the multiplication are of the same type, so the compiler doesn't see the need to perform any automatic conversions (none of the rules on page 44 apply). The multiplication is carried out as an int, which overflows with unpredictable results, and only after the damage has been done is the unpredictable value converted to a long int for assignment to L1. To get a multiplication like this to work, you have to explicitly convert at least one of the ints to long int:

	L1 = (long int)i1 * i2;

Now, the two sides of the * are of different types, so they're both converted to long int (by the fifth rule on page 44), and the multiplication is carried out as a long int. If it makes you feel safer, you can use two casts:

	L1 = (long int)i1 * (long int)i2;

but only one is strictly required.
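Packaged as a function, the single-cast form might look like the sketch below. The name long_product is made up for illustration.

```c
/* long_product: hypothetical helper. The cast makes the left
   operand a long int, so the right operand is converted as well
   and the multiplication is carried out as a long int. */
long int long_product(int i1, int i2)
{
	return (long int)i1 * i2;
}
```

Here long_product(200, 400) is 80000 on any machine. On a machine where int happens to be 32 bits, the version without the cast would appear to work too, which is exactly why this kind of bug can go unnoticed until the code is moved somewhere else.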

A similar problem arises when two integers are being divided. The code

	i1 = 1;
	f1 = i1 / 2;

does not set f1 to 0.5; it sets it to 0. Again, the two operands of the / operator are already of the same type (the rules on page 44 still don't apply), so an integer division is performed, which discards any fractional part. (We saw a similar problem in section 1.2 on page 12.) Again, an explicit conversion saves the day:

	f1 = (float)i1 / 2;

Alternatively, in a case like this, you can use a floating-point constant:

	f1 = i1 / 2.0;

In either case, as soon as one of the operands is floating point, the division is carried out in floating point, and you get the result you expect.
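The three forms can be put side by side in a sketch like this one (the function names are made up):

```c
/* Integer division first; the result 0 is then converted to float. */
float half_int(int i1)      { return i1 / 2; }

/* One operand is float, so the division is done in floating point. */
float half_cast(int i1)     { return (float)i1 / 2; }

/* The constant 2.0 is a double, so i1 is converted before dividing. */
float half_constant(int i1) { return i1 / 2.0; }
```

With an argument of 1, half_int gives 0, while half_cast and half_constant both give 0.5.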

Implicit conversions always happen during arithmetic and assignment to variables. The situation is a bit more complicated when functions are being called, however.

The authors use the example of the sqrt function, which is as good an example as any. sqrt accepts an argument of type double and returns a value of type double. If the compiler didn't know that sqrt took a double, and if you called

	sqrt(4);

or

	int n = 4;
	sqrt(n);

the compiler would pass an int to sqrt. Since sqrt expects a double, it will not work correctly if it receives an int. Therefore, it was once always necessary to use explicit conversions in cases like this, by calling

	sqrt((double)4)

or

	sqrt((double)n)

or

	sqrt(4.0)

However, it is now possible, with a function prototype, to tell the compiler what types of arguments a function expects. The prototype for sqrt is

	double sqrt(double);

and as long as a prototype is in effect (``in scope,'' as the cognoscenti would say), you can call sqrt without worrying about conversions. When a prototype is in effect, the compiler performs implicit conversions during function calls (specifically, while passing the arguments) exactly as it does during simple assignments.

Obviously, using prototypes makes for much safer programming, and it is recommended that you use them whenever possible. For the standard library functions (the ones already written for you), you get prototypes automatically when you include the header files which describe sets of library functions. For example, you get prototypes for all of C's built-in math functions by putting the line

	#include <math.h>

at the top of your program. For functions that you write, you can supply your own prototypes, which we'll be learning more about later.
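As a preview, here is a sketch of a prototype at work for a function of our own. The function halve is made up for illustration; it stands in for sqrt so that the example doesn't depend on the math library.

```c
/* Prototype: tells the compiler that halve takes a double. */
double halve(double x);

/* Because the prototype is in scope, the int argument 3 is
   converted to double while being passed, just as it would be
   in an assignment to a double variable. */
double call_with_int(void)
{
	return halve(3);
}

/* The definition itself can come later in the file. */
double halve(double x)
{
	return x / 2;
}
```

With the prototype in place, call_with_int returns 1.5, and no cast is needed at the call.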

However, there are a few situations (we'll talk about them later) where prototypes do not apply, so it's important to remember that function calls are a bit different and that explicit conversions (i.e. casts) may occasionally be required. Don't imagine that prototypes are a panacea.
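One situation of this kind (my own choice of example, not necessarily the one meant above) is a function like printf that takes a variable number of arguments. Its prototype says nothing about the types of the arguments after the format string, so the compiler cannot know that %f wants a double. The helper name format_count in this sketch is made up.

```c
#include <stdio.h>

/* format_count: hypothetical helper. snprintf's prototype ends in
   "...", so no conversion to double happens automatically for n;
   the cast supplies it explicitly. */
int format_count(char *buf, size_t size, int n)
{
	return snprintf(buf, size, "%.1f", (double)n);
}
```

Without the cast, an int would be passed where %f expects a double, and the result would be meaningless.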

page 46

Don't worry about the rand example.
