Q: What's the difference between the ``unsigned preserving'' and ``value preserving'' rules?
A: These rules concern the behavior when an unsigned type must be promoted to a ``larger'' type. Should it be promoted to a larger signed or unsigned type? (To foreshadow the answer, it may depend on whether the larger type is truly larger.)
Under the unsigned preserving (also called ``sign preserving'') rules, the promoted type is always unsigned. This rule has the virtue of simplicity, but it can lead to surprises (see the first example below).
Under the value preserving rules, the conversion depends on the actual sizes of the original and promoted types. If the promoted type is truly larger--which means that it can represent all the values of the original, unsigned type as signed values--then the promoted type is signed. If the two types are actually the same size, then the promoted type is unsigned (as for the unsigned preserving rules).
Since the actual sizes of the types are used in making the determination, the results will vary from machine to machine. On some machines, short int is smaller than int, but on some machines, they're the same size. On some machines, int is smaller than long int, but on some machines, they're the same size.
In practice, the difference between the unsigned and value preserving rules matters most often when one operand of a binary operator is (or promotes to) int and the other one might, depending on the promotion rules, be either int or unsigned int. If one operand is unsigned int, the other will be converted to that type--almost certainly causing an undesired result if its value was negative (again, see the first example below). When the ANSI C Standard was established, the value preserving rules were chosen, to reduce the number of cases where these surprising results occur. (On the other hand, the value preserving rules also reduce the number of predictable cases, because portable programs cannot depend on a machine's type sizes and hence cannot know which way the value preserving rules will fall.)
Here is a contrived example showing the sort of surprise that can occur under the unsigned preserving rules:
	unsigned short us = 10;
	int i = -5;
	if(i > us)
		printf("whoops!\n");

The important issue is how the expression i > us is evaluated. Under the unsigned preserving rules (and under the value preserving rules on a machine where short integers and plain integers are the same size), us is promoted to unsigned int. The usual arithmetic conversions say that when types unsigned int and int meet across a binary operator, both operands are converted to unsigned, so i is converted to unsigned int, as well. The old value of i, -5, is converted to some large unsigned value (65,531 on a 16-bit machine). This converted value is greater than 10, so the code prints ``whoops!''
Under the value preserving rules, on a machine where plain integers are larger than short integers, us is converted to a plain int (and retains its value, 10), and i remains a plain int. The expression is not true, and the code prints nothing. (To see why the values can be preserved only when the signed type is larger, remember that a value like 40,000 can be represented as an unsigned 16-bit integer but not as a signed one.)
Unfortunately, the value preserving rules do not prevent all surprises. The example just presented still prints ``whoops'' on a machine where short and plain integers are the same size. The value preserving rules may also inject a few surprises of their own--consider the code:
	unsigned char uc = 0x80;
	unsigned long ul = 0;
	ul |= uc << 8;
	printf("0x%lx\n", ul);

Before being left-shifted, uc is promoted. Under the unsigned preserving rules, it is promoted to an unsigned int, and the code goes on to print 0x8000, as expected. Under the value preserving rules, however, uc is promoted to a signed int (as long as ints are larger than chars, which is usually the case). The intermediate result uc << 8 goes on to meet ul, which is unsigned long. The signed, intermediate result must therefore be converted as well. On a machine with 16-bit ints and 32-bit longs, the value 0x8000 does not fit in a signed int, so the intermediate result typically comes out negative and is sign-extended when converted, becoming 0xffff8000. On such a machine, the code prints 0xffff8000, which is probably not what was expected. (On machines where int and long are the same size, or where int is wider than 16 bits, the code prints 0x8000 under either set of rules.)
To avoid surprises (under either set of rules, or due to an unexpected change of rules), it's best to avoid mixing signed and unsigned types in the same expression, although as the second example shows, this rule is not always sufficient. You can always use explicit casts to indicate, unambiguously, exactly where and how you want conversions performed; see questions 12.42 and 16.7 for examples. (Some compilers attempt to warn you when they detect ambiguous cases or expressions which would have behaved differently under the unsigned preserving rules, although sometimes these warnings fire too often; see also question 3.18.)
K&R2 Sec. 2.7 p. 44, Sec. A6.5 p. 198, Appendix C p. 260
ISO Sec. 126.96.36.199, Sec. 188.8.131.52, Sec. 184.108.40.206
Rationale Sec. 220.127.116.11
H&S Secs. 6.3.3,6.3.4 pp. 174-177