section 1.5.1: File Copying

page 16

Pay particular attention to the discussion of why the variable that holds getchar's return value is declared as an int rather than a char. The distinction may not seem terribly significant now, but it is important: getchar has to be able to return every possible character value and also EOF, which must be distinct from all of them, and a char doesn't have room for that extra value. If you use a char, the program may seem to work, but it may break down mysteriously later. Always use an int for anything you assign getchar's return value to.
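To make the int-versus-char point concrete, here is a minimal sketch (not from the book) of what can happen when getchar's return value is squeezed into a char. The function names are made up for illustration, and the comments assume the usual situation of 8-bit chars and EOF equal to -1; the C standard only promises that EOF is a negative int.

```c
#include <stdio.h>

/* Hypothetical helper: pretend getchar returned `value`, store it in a
   signed char, and report whether the stored value compares equal to EOF.
   On typical systems the legitimate byte 255 becomes -1 here, so it is
   mistaken for EOF and a copying loop would stop early. */
int eof_test_signed_char(int value)
{
    signed char c = (signed char)value;
    return c == EOF;
}

/* Same experiment with an unsigned char.  EOF itself is squeezed into
   the range 0..255, so the comparison can never succeed and a copying
   loop would never notice end-of-file. */
int eof_test_unsigned_char(int value)
{
    unsigned char c = (unsigned char)value;
    return c == EOF;
}

/* With an int, every character value and EOF all stay distinct. */
int eof_test_int(int value)
{
    int c = value;
    return c == EOF;
}
```

Whether a plain char behaves like the signed or the unsigned version depends on the compiler, which is part of why a program that uses char here can seem to work on one machine and fail on another.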

page 17

The line

	while ((c = getchar()) != EOF)
epitomizes the cryptic brevity which C is notorious for. You may find this terseness infuriating (and you're not alone!), and it can certainly be carried too far, but bear with me for a moment while I defend it.

The simple example on pages 16 and 17 illustrates the tradeoffs well. We have four things to do:

  1. call getchar,
  2. assign its return value to a variable,
  3. test the return value against EOF, and
  4. process the character (in this case, print it again).
We can't eliminate any of these steps. We have to assign getchar's value to a variable (we can't just use it directly) because we have to do two different things with it (test, and print). Therefore, compressing the assignment and test into the same line (as on page 17) is the only good way of avoiding two distinct calls to getchar (as on page 16). You may not agree that the compressed idiom is better for being more compact or easier to read, but the fact that there is now only one call to getchar is a real virtue.
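For comparison, the two loop shapes discussed above might be sketched side by side like this. This is an illustration rather than the book's exact code: the function names are invented, and getc(in) and putc(c, out) stand in for getchar() and putchar(c) so that the functions can be pointed at any stream (getchar() is equivalent to getc(stdin), and putchar(c) to putc(c, stdout)).

```c
#include <stdio.h>

/* Page 16 style: one read before the loop and a second, separate read
   at the bottom of the loop body. */
void copy_two_calls(FILE *in, FILE *out)
{
    int c;

    c = getc(in);
    while (c != EOF) {
        putc(c, out);
        c = getc(in);
    }
}

/* Page 17 style: the assignment is folded into the loop test, so the
   read appears exactly once. */
void copy_one_call(FILE *in, FILE *out)
{
    int c;

    while ((c = getc(in)) != EOF)
        putc(c, out);
}
```

Calling either function as copy_one_call(stdin, stdout) or copy_two_calls(stdin, stdout) from main should behave like the character copying program in the book.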

In a tiny program like this, the repeated call to getchar isn't much of a problem. But in a real program, if the thing being read is at all complicated (not just a single character read with getchar), if the processing is at all complicated (so that the input call before the loop and the input call at the end of the loop become widely separated), and if the way the input is done ever changes, it's all too likely that one of the input calls will get changed but not the other.
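As a rough illustration of the same idiom with a slightly more complicated input call, here is a sketch that reads whole lines with the standard fgets function instead of single characters. The function name number_lines and the MAXLINE limit are assumptions made up for this example.

```c
#include <stdio.h>

#define MAXLINE 256   /* assumed upper bound on line length for this sketch */

/* Copy `in` to `out`, prefixing each line with its line number, and
   return the number of lines read.  The input call appears only once,
   in the loop test, so there is only one place to change if the way
   input is read ever changes.  (A line longer than MAXLINE-1 characters
   would be read in pieces and counted more than once.) */
int number_lines(FILE *in, FILE *out)
{
    char line[MAXLINE];
    int lineno = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        lineno++;
        fprintf(out, "%d: %s", lineno, line);
    }
    return lineno;
}
```

Here the end-of-input signal is fgets's NULL return rather than EOF, but the shape of the loop is the same.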

(Also, note that when an assignment like c = getchar() appears within a larger expression, the surrounding expression receives the same value that is assigned. Using an assignment as a subexpression in this way is perfectly legal and quite common in C.)
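The value of an embedded assignment, and the reason for the extra parentheses around it, can be seen in a small sketch like the following. The function names are invented for illustration, and getc(in) again stands in for getchar() so the functions can read from any stream. Because != binds more tightly than =, leaving the parentheses out changes what gets assigned, a point the book also makes.

```c
#include <stdio.h>

/* With the parentheses: c receives the character itself, and the
   surrounding comparison sees that same value.  *not_eof is set to 1
   if a character was read and to 0 at end-of-file. */
int read_with_parens(FILE *in, int *not_eof)
{
    int c;

    *not_eof = ((c = getc(in)) != EOF);
    return c;
}

/* Without the parentheses: this is parsed as c = (getc(in) != EOF),
   so c receives the result of the comparison (1 or 0), not the
   character that was read. */
int read_without_parens(FILE *in)
{
    int c;

    c = getc(in) != EOF;
    return c;
}
```

Reading a stream that contains just the letter A, the first function would return 'A' and the second would return 1.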

When you run the character copying program, and it begins copying its input (your typing) to its output (your screen), you may find yourself wondering how to stop it. It stops when it receives end-of-file (EOF), but how do you send EOF? The answer depends on what kind of computer you're using. On Unix and Unix-related systems, it's almost always control-D (typed at the beginning of a line). On MS-DOS machines, it's control-Z followed by the RETURN key. Under Think C on the Macintosh, it's control-D, just like Unix. On other systems, you may have to do some research to learn how to send EOF.

(Note, too, that the character you type to generate an end-of-file condition from the keyboard has nothing to do with the EOF value returned by getchar. The EOF value returned by getchar is a code indicating that the input system has detected an end-of-file condition, whether it's reading the keyboard or a file or a magnetic tape or a network connection or anything else.)
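One way to convince yourself of this is a sketch like the one below, which counts how many values getc returns before it reports EOF. The function name is made up for the example. In ASCII, control-D is character code 4 and control-Z is code 26; if those bytes are stored in a file and read back on a Unix-like system (or from a stream opened in binary mode), they should come through as ordinary characters, and EOF should be returned only when the stream actually runs out.

```c
#include <stdio.h>

/* Count the values getc returns from `in` before it reports EOF.
   EOF is an out-of-band int (negative), not one of the byte values
   that can be stored in the stream. */
long count_until_eof(FILE *in)
{
    long n = 0;

    while (getc(in) != EOF)
        n++;
    return n;
}
```

For a stream holding the five bytes 'a', 4, 'b', 26, 'c', this sketch would be expected to count all five, assuming the stream is not an MS-DOS style text-mode stream (which may treat a stored control-Z specially).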

Another excellent thing to know when doing any kind of programming is how to terminate a runaway program. If a program is running forever waiting for input, you can usually stop it by sending it an end-of-file, as above, but if it's running forever without waiting for anything (i.e. if it's in an infinite loop) you'll have to take more drastic measures. Under Unix, control-C will terminate the current program, almost no matter what. Under MS-DOS, control-C or control-BREAK will sometimes terminate the current program, but by default MS-DOS only checks for control-C when it's looking for input, so an infinite loop can be unkillable. There's a DOS command,

	break on
which tells DOS to look for control-C more often, and I recommend using this command if you're doing any programming. (If a program is in a really tight infinite loop under MS-DOS, there may be no way of killing it short of rebooting.) On the Mac, try command-period or command-option-ESCAPE.

Finally, don't be disappointed (as I was) the first time you run the character copying program. You'll type a character, and see it on the screen right away, and assume it's your program working, but it's only your computer echoing every key you type, as it always does. When you hit RETURN, a full line of characters is made available to your program, which it reads all at once, and then copies to the screen (again). In other words, when you run this program, it will probably seem to echo the input a line at a time, rather than a character at a time. You may wonder how a program can read a character right away, without waiting for the user to hit RETURN. That's an excellent question, but unfortunately the answer is rather complicated, and beyond the scope of this introduction. (Among other things, how to read a character right away is one of the things that's not defined by the C language, and it's not defined by any of the standard library functions, either. How to do it depends on which operating system you're using.)



This page by Steve Summit // Copyright 1995, 1996 // mail feedback