section 7.4: Formatted Input -- Scanf

page 157

Somehow we've managed to make it through six chapters without meeting scanf, which it turns out is just as well.

In the examples in this book so far, all input (from the user, or otherwise) has been done with getchar or getline. If we needed to input a number, we did things like

	char line[MAXLINE];
	int number;
	getline(line, MAXLINE);
	number = atoi(line);

Using scanf, we could ``simplify'' this to

	int number;
	scanf("%d", &number);

This simplification is convenient and superficially attractive, and it works, as far as it goes. The problem is that scanf does not work well in more complicated situations. In section 7.1, we said that calls to putchar and printf could be interleaved. The same is not always true of scanf: you can have baffling problems if you try to intermix calls to scanf with calls to getchar or getline (typically because scanf leaves the newline that ended the number unread, so the next getchar or getline sees it).

Worse, it turns out that scanf's error handling is inadequate for many purposes. It tells you whether a conversion succeeded or not (more precisely, it tells you how many conversions succeeded), but it doesn't tell you anything more than that (unless you ask very carefully). Like atoi and atof, scanf stops reading characters when it's processing a %d or %f input and it finds a non-numeric character. Suppose you've prompted the user to enter a number, and the user accidentally types the letter `x'. scanf will return 0, indicating that it couldn't convert a number, but the unconvertible text (the `x') remains on the input stream unless you figure out some other way to remove it.
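
To make the stuck-`x' problem concrete, here is a minimal sketch. The function name read_int, and the choice to use fscanf on an arbitrary FILE * rather than scanf on stdin, are my own assumptions for illustration; nothing like it appears in the book. It shows that when a conversion fails, the caller has to throw away the offending text itself:

```c
#include <stdio.h>

/* read_int is a hypothetical helper, not a library function.
   Returns 1 if an int was converted, 0 if the next input was not
   a number (the rest of that line is then discarded), and EOF at
   end of input. */
int read_int(FILE *fp, int *out)
{
	int c;
	int n = fscanf(fp, "%d", out);

	if (n == 0) {
		/* fscanf left the bad text unread; remove it ourselves */
		while ((c = getc(fp)) != EOF && c != '\n')
			;
	}
	return n;
}
```

If the input is an `x' on one line followed by 42 on the next, the first call returns 0 and the second returns 1 with 42 stored. Without the discarding loop, every call would return 0, because the `x' would never be consumed.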

For these reasons (and several others, which I won't bother to mention) it's generally recommended that scanf not be used for unstructured input such as user prompts. It's much better to read entire lines with something like getline (as we've been doing all along) and then process the line somehow. If the line is supposed to be a single number, you can use atoi or atof to convert it. If the line has more complicated structure, you can use sscanf (which we'll meet in a minute) to parse it. (It's better to use sscanf than scanf because when sscanf fails, you have complete control over what you do next. When scanf fails, on the other hand, you're at the mercy of where in the input stream it has left you.)
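
The read-a-line-then-convert approach might be sketched as follows. Since getline is the book's own function, this sketch assumes the standard fgets as a stand-in; the name get_number and the MAXLINE value are likewise assumptions made for the example.

```c
#include <stdio.h>

#define MAXLINE 100	/* assumed line limit for this sketch */

/* get_number is a hypothetical helper: read one whole line from fp,
   then try to convert it.  Returns 1 on success, 0 if the line does
   not begin with a number, and EOF at end of input. */
int get_number(FILE *fp, int *out)
{
	char line[MAXLINE];

	if (fgets(line, MAXLINE, fp) == NULL)
		return EOF;
	/* the whole line has already been consumed, so a failed
	   conversion leaves nothing behind on the stream */
	return sscanf(line, "%d", out) == 1;
}
```

Notice that no cleanup loop is needed after a failure: the bad line is already gone, and the next call simply starts on the next line. (A line longer than MAXLINE-1 characters would be read in pieces, which a more careful version would have to deal with.)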

With that little diatribe against scanf out of the way, here are a few comments on individual points made in section 7.4.

We've met a few functions (e.g. getline, month_day in section 5.7 on page 111) which return more than one value; the way they do so is to accept a pointer argument that tells them where (in the caller) to write the returned value. scanf is the epitome of such functions: it returns potentially many values (one for each %-specifier in its format string), and for each value converted and returned, it needs a pointer argument.
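
As a small illustration of ``one pointer per converted value,'' here is a sketch using sscanf (so that it can be tried on a string). The wrapper name three_ints is hypothetical. The function's own return value is the count of successful conversions; the converted numbers come back through the pointers.

```c
#include <stdio.h>

/* three_ints is a hypothetical wrapper.  The caller passes the
   addresses of three ints, e.g. three_ints(s, &x, &y, &z); the
   return value says how many of them were actually filled in. */
int three_ints(const char *s, int *a, int *b, int *c)
{
	/* a, b, and c are already pointers here, so no & is needed */
	return sscanf(s, "%d %d %d", a, b, c);
}
```

Called on "1 2 x", this returns 2: the first two ints are stored, and the third is left untouched.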

The statement on page 157 that ``blanks or tabs'' in the format string ``are ignored'' (which is repeated on page 159) is a simplification: in actuality, any whitespace character in the format string (blank, tab, or newline) causes scanf to skip any amount of whitespace (blanks, tabs, newlines, etc.), including none at all, in the input stream.
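
Here is a sketch of the difference a blank in the format string can make. Both function names are hypothetical; the only difference between them is the blank before the comma in the format.

```c
#include <stdio.h>

/* Blank before the comma: any whitespace (or none) in the input
   at that point is skipped before the comma is matched. */
int pair_loose(const char *s, int *a, int *b)
{
	return sscanf(s, "%d ,%d", a, b);
}

/* No blank: the comma must come immediately after the first number. */
int pair_strict(const char *s, int *a, int *b)
{
	return sscanf(s, "%d,%d", a, b);
}
```

On the input "1 ,2", pair_loose converts both numbers and returns 2, while pair_strict returns 1, because the literal comma in its format fails to match the blank in the input. Both return 2 on "1,2".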

A * character in a scanf conversion specifier means something completely different than it does for printf: for scanf, it means to suppress assignment (i.e. for that conversion specifier, there isn't a pointer in the argument list to receive the converted value, so the converted value is discarded). With scanf, there is no direct way of taking a field width from the argument list, as * does for printf.
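
A quick sketch of assignment suppression (the function name first_and_third is made up for the example): the middle number is converted and thrown away, so there are three conversion specifiers but only two pointer arguments.

```c
#include <stdio.h>

/* first_and_third is a hypothetical helper: picks the first and
   third of three ints in s, skipping over the second. */
int first_and_third(const char *s, int *first, int *third)
{
	/* %*d converts an int but stores it nowhere, and it is not
	   counted in the return value */
	return sscanf(s, "%d %*d %d", first, third);
}
```

On "10 20 30" this stores 10 and 30 and returns 2, not 3, since suppressed conversions don't count toward the return value.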

Conversion specifiers like %d and %f automatically skip leading whitespace while looking for something to convert. This means that the format strings "%d %d" and "%d%d" act exactly the same--the whitespace in the first format string causes whitespace to be skipped before the second %d, but the second %d would have skipped that whitespace anyway. (Yet another scanf foible is that the innocuous-looking format string "%d\n" converts a number and then skips whitespace, which means that it will gobble up not only a newline following the number it converts, but any number of newlines or whitespace, and in fact it will keep reading until it finds a non-whitespace character, which it then won't read. This sounds confusing, but so is scanf's behavior when given a format string like "%d\n". The moral is simple: don't use trailing \n's in scanf format strings.)
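
One way to watch the trailing-\n behavior is with the %n specifier, which stores the number of characters consumed so far (and, like %*d, is not counted in the return value). The two function names below are hypothetical, and using sscanf on a string is my own substitution so that the effect can be measured.

```c
#include <stdio.h>

/* Number of characters of s consumed by the format "%d\n",
   or -1 if no number could be converted. */
int consumed_by_d_newline(const char *s)
{
	int value, pos = -1;

	if (sscanf(s, "%d\n%n", &value, &pos) != 1)
		return -1;
	return pos;
}

/* The same measurement for plain "%d". */
int consumed_by_d(const char *s)
{
	int value, pos = -1;

	if (sscanf(s, "%d%n", &value, &pos) != 1)
		return -1;
	return pos;
}
```

Given the string "42\n\n  x", plain "%d" consumes 2 characters, but "%d\n" consumes 6: the number, both newlines, and both blanks, stopping only at the `x'. When the input is an interactive terminal, that means scanf keeps waiting until the user types something that isn't whitespace.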

page 158

Notice that, for scanf, the %e, %f, and %g formats are all the same, and signify conversion of a float value (they accept a pointer argument of type float *). To convert a double, you need to use %le, %lf, or %lg. (This is quite different from the printf family, which uses %e, %f, and %g for both floats and doubles, though all three request different output formats. Furthermore, %le, %lf, and %lg were technically incorrect for printf under the original ANSI C standard, though most compilers accepted them; the later C99 standard defines the l as having no effect there.)
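
The float/double asymmetry can be sketched like this (both function names are hypothetical):

```c
#include <stdio.h>

/* Reads a float and then a double from s; returns the number of
   conversions that succeeded. */
int read_float_and_double(const char *s, float *f, double *d)
{
	/* %f needs a float *, %lf needs a double * */
	return sscanf(s, "%f %lf", f, d);
}

/* With printf, plain %f serves for both: a float argument is
   promoted to double before printf ever sees it. */
void print_float_and_double(float f, double d)
{
	printf("%f %f\n", f, d);
}
```

Getting this wrong in scanf (say, passing a double * for %f) is not diagnosed by the language; the result is simply garbage, although many compilers will warn about it.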

page 159

More precisely, the reason that you don't need to use a & with monthname is that an array, when it appears in an expression like this, is automatically converted to a pointer.
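
A sketch of the same point, modeled loosely on the book's date example (the function name month_of, the buffer size of 20, and the %19s width limit are assumptions of mine):

```c
#include <stdio.h>
#include <string.h>

/* month_of is a hypothetical helper.  If line looks like
   "25 Dec 1988", copies the month name into out (which must have
   room for at least 20 chars) and returns 1; otherwise returns 0. */
int month_of(const char *line, char *out)
{
	int day, year;
	char monthname[20];

	/* day and year are ints, so scanf needs their addresses;
	   monthname is an array, which becomes a char * by itself */
	if (sscanf(line, "%d %19s %d", &day, monthname, &year) != 3)
		return 0;
	strcpy(out, monthname);
	return 1;
}
```

Writing &monthname[0] in place of monthname would pass exactly the same pointer. The 19 in %19s keeps a long word from overflowing the 20-character array.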

The dual-format date conversion example in the middle of page 159 is a nice example of the advantages of calling getline and then sscanf. At the beginning of this section, I said that ``when sscanf fails, you have complete control over what you do next.'' Here, ``what you do next'' is try calling sscanf again, on the very same input string (thus effectively backing up to the very beginning of it), using a different format string, to try parsing the input a different way.
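
The try-one-format-then-another idea might be sketched as a function like this. It is an adaptation, not the book's code: the name classify_date, the enum, and the %19s width limit are all my own assumptions.

```c
#include <stdio.h>

enum date_form { DATE_INVALID, DATE_DAY_MONTH_YEAR, DATE_MM_DD_YY };

/* classify_date is a hypothetical helper: reports which of the two
   date formats (if either) the line matches. */
enum date_form classify_date(const char *line)
{
	int day, month, year;
	char monthname[20];

	/* first attempt: "25 Dec 1988" */
	if (sscanf(line, "%d %19s %d", &day, monthname, &year) == 3)
		return DATE_DAY_MONTH_YEAR;
	/* second attempt starts over at the beginning of the same string */
	if (sscanf(line, "%d/%d/%d", &month, &day, &year) == 3)
		return DATE_MM_DD_YY;
	return DATE_INVALID;
}
```

The second sscanf call works only because the first one consumed nothing permanently: the line is still sitting in memory, untouched. With scanf reading directly from the input stream, the first failed attempt would already have eaten part of the date.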



This page by Steve Summit // Copyright 1995, 1996