Subsections of Arrays, Strings, and Files
Arrays
Arrays in C are, for the most part, the same as arrays in Java or C#. Here are the key differences:
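- Arrays in C are normally declared with a constant size (not a size that comes from user input, for example). C99 added variable-length arrays, but later standards made them optional, and this section does not use them.
- Arrays in C do not have an associated length field that keeps track of the number of slots in the array (you must keep track of this information yourself)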
Declaring
Here is the format for declaring an array:
type name[size];
Here, type is the type of elements you want to store in the array (like int), name is the name of the array, and size is how many slots you want to reserve. Note that size MUST be a constant.
Here is an example:
int nums[10];
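However, the following is not portable because the size is given by a variable. C89 compilers reject it; compilers that support C99 variable-length arrays accept it, but you should not rely on that:
//not portable: size is a variable, not a constant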
int size = 10;
int nums[size];
Initializing
Unlike Java, local arrays in C are not initialized to any value. Instead, each slot in the array holds whatever garbage value was left over in that spot in memory. (Global and static arrays are the exception: they start out filled with zeros.)
Here is how you could initialize all the values in the nums array to 0:
int i;
for (i = 0; i < 10; i++) {
    nums[i] = 0;
}
Notice that the first index in the array is 0, and the last index is size-1. Also, recall that arrays do not have a length field – we must remember that we reserved 10 spaces for nums.
Arrays and Functions
Arrays can be passed to functions just like any other variable. Because arrays don’t have a length field, you will almost always want to pass the size of the array and the array itself. Here’s an example:
#include <stdio.h>

//function prototype: takes an array of ints and its size
void print(int[], int);

int main()
{
    int nums[10];
    int i;
    for (i = 0; i < 10; i++)
    {
        nums[i] = i;
    }
    print(nums, 10);
    return 0;
}

void print(int arr[], int size)
{
    int i;
    for (i = 0; i < size; i++)
    {
        printf("%d\n", arr[i]);
    }
}
Multi-Dimensional Arrays
You can create multi-dimensional arrays in C by specifying extra dimensions at the time of declaration. For example, this declares a 5x10 array of characters:
char array[5][10];
The first dimension is the row and the second dimension is the column. To access an array element, specify the desired row and column number. For example:
//sets element at row 2, column 3 to 'A'
array[2][3] = 'A';
Be Careful!
If you access an array element in Java or C# with an index that is either negative or too big, you will get some kind of array index exception. Those languages will even tell you in what file and on what line the error occurred. C is not as friendly about this mistake. If you access an element with a bad index, such as:
int nums[10];
nums[10] = 0; //10 is past the bounds of the array
Then the behavior is undefined. In practice, one of two things usually happens:
- C will allow you to modify the memory that is just past the end of your array (where spot 10 would be if there were that many spots). This memory might belong to one of your other variables!
- Your program will crash with a segmentation fault (seg fault). You will see this error quite a bit when you get started in C – it means that you tried to access memory that isn’t yours. Unfortunately, the error message does not give you any information about where the problem occurred – you will have to find it yourself.
Strings
We saw before that there is no string type in C. This is true, but you can simulate a string by using an array of characters that is terminated with a special end-of-string character, '\0'.
String Variables
A string variable can be declared and initialized from a string literal as follows:
char str[] = "Hello";
After this line, str references the following characters in memory:
0 | 1 | 2 | 3 | 4 | 5 |
---|---|---|---|---|---|
H | e | l | l | o | \0 |
We could have created the same string like this:
char str[6] = {'H', 'e', 'l', 'l', 'o', '\0'};
Arrays (and strings) are constant memory addresses. We can change values in strings and arrays, but we can’t change the memory address. So, we can do things like this:
str[0] = 'h'; //OK
But we can’t change the memory address (the entire string):
str = "hi"; //Compiler error!
Later in this section, you will see a function called strcpy that copies the characters from one string to another.
String Input and Output
Strings can be inputted and outputted just like any other variable. To print a string, use printf with the %s control string character. To get a string as input, you can use scanf (again with the %s control string character). When reading in a string, scanf will read characters up to but not including the first whitespace it encounters ('\n', ' ', '\t', etc.).
Here’s an example:
char name[10];
printf("Enter your name: "); //Suppose you enter "Fred"
scanf("%s", name);
printf("Hello, %s!\n", name); //Will print "Hello, Fred!"
Notice that when you use scanf to read in a string, you do not need to put an & in front of the string variable name. This is because a string is an array of characters, and arrays are already memory addresses. (We’ll learn more about this in the section on Pointers.)
The trouble with using scanf to input strings is that the function doesn’t check the size of the array when it is reading input. So, if you typed the name “Washington” in the above example (which needs 11 characters of space, counting the '\0'), scanf wouldn’t stop writing once it reached the end of the array. Instead, it would write past the end of the array. This could cause some of your variables to be overwritten, or a segmentation fault. In the worst case, it could be exploited with a buffer overflow attack, where an attacker deliberately supplies input that runs beyond the bounds of the input buffer.
A better choice for reading in strings is the fgets function. Here’s the prototype:
char *fgets(char s[], int size, FILE *stream);
You pass fgets the string buffer (s), the size of the buffer (size), and the stream you’re reading from (use stdin to read regular user input). It returns the string it read (the char * return type is explained in the section on Pointers), or NULL if it was unable to read anything.
It will stop reading user input when either:
- It has read size-1 characters (it needs the last spot for a '\0')
- It has reached the end of the input
- It has reached a newline
Here is the same example using fgets:
char name[10];
printf("Enter your name: "); //Suppose you enter "Fred"
fgets(name, 10, stdin);
printf("Hello, %s", name); //Will print "Hello, Fred"
NOTE: if fgets reaches a newline character before reading size-1 characters, it WILL store the newline as its last character (just before the \0). If the user enters “Fred” in the example above, the name array will hold: {'F', 'r', 'e', 'd', '\n', '\0', (garbage), (garbage), (garbage), (garbage)}.
If you want to remove that \n character, you will need to overwrite it with the end-of-string character (\0). We will see a convenient trick for doing this in the strcspn section below.
Conversions
It is sometimes necessary to convert between strings, ints, and doubles.
From string to int/double
There are two conversion functions from a string to an int or double:
- atoi: converts from a string to an int
- atof: converts from a string to a double
To use any of these functions, you need to add:
#include <stdlib.h>
To the top of the file.
Here is an example of using the conversion functions:
char buff[10];
int num;
double d;
printf("Enter an integer: ");
fgets(buff, 10, stdin); //Suppose you enter "47"
num = atoi(buff); //num = 47
printf("Enter a real number: ");
fgets(buff, 10, stdin); //Suppose you enter "4.75"
d = atof(buff); //d = 4.75
From int/double to string
The easiest way to convert from an int or double to a string is to use the sprintf function, which is part of stdio.h. sprintf works exactly like printf, but lets you “print” to a string instead of to standard out. You can either print a single int or double (thus converting it to a string), or you can print a longer string that mixes variable values with other text. Here is an example of using sprintf:
char buff1[10];
char buff2[40];
int num = 7;
double dec = 14.23;
sprintf(buff1, "%d", num); //buff1 is now "7"
sprintf(buff2, "Decimal value: %.2f.", dec); //buff2 is now "Decimal value: 14.23." (%.2f prints two decimal places)
String Functions
Below is a list of common string functions. To use any of these, you need to add:
#include <string.h>
to the top of the file.
strcat
char *strcat(char str1[], char str2[]);
This function copies the characters in str2 onto the end of str1. It returns the newly concatenated string (although str1 also references the concatenated string).
For example:
char str1[20];
char str2[20];
printf("Enter two words: "); //Suppose you entered "hi hello"
scanf("%s %s", str1, str2);
strcat(str1, str2); //str1 = "hihello", str2 = "hello"
strcmp
int strcmp(char str1[], char str2[]);
This function compares str1 and str2 to see which string comes alphabetically before the other. It returns:
- A number less than 0, if str1 comes alphabetically before str2
- 0, if str1 equals str2
- A number greater than 0, if str1 comes alphabetically after str2
For example:
char str1[20];
char str2[20];
printf("Enter two words: "); //Suppose you entered "hi hello"
scanf("%s %s", str1, str2);
if (strcmp(str1, str2) < 0) {
    printf("%s comes first\n", str1);
}
else if (strcmp(str1, str2) > 0) {
    printf("%s comes first\n", str2);
}
else {
    printf("The strings are equal\n");
}
The code above would print “hello comes first”.
strcpy
char *strcpy(char str1[], char str2[]);
This function copies the characters in str2 into str1, overwriting anything that was already in str1. It returns the newly copied string (although str1 also references the copied string).
For example:
char src[20];
char dest[20];
printf("Enter a word: ");
scanf("%s", src); //Suppose you entered "hello"
strcpy(dest, src); //Now dest also holds "hello"
src[0] = 'B'; //Now src is "Bello", and dest is "hello"
strcspn
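size_t strcspn(char str1[], char str2[]);
This function returns the number of characters that appear in str1 before reaching ANY character from str2. (If the first character in str1 also appears in str2, then strcspn returns 0. If no character from str2 appears in str1, it returns the length of str1.) The return type, size_t, is an unsigned integer type; for short strings the result can safely be stored in an int.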
For example:
char str[20];
int index;
printf("Enter a word: ");//Suppose you entered "hello"
scanf("%s", str);
index = strcspn(str, "la"); //index is 2
//2 characters appear in str before finding any character from "la"
strcspn is especially handy for removing the trailing \n that gets added to strings when using fgets. As we saw earlier in this section, if we do:
char name[10];
printf("Enter your name: ");
fgets(name, 10, stdin);
And enter “Fred”, then the name array will hold {'F', 'r', 'e', 'd', '\n', '\0', (garbage), (garbage), (garbage), (garbage)}. We can use strcspn to find the index of \n and then replace it with a \0:
name[strcspn(name, "\n")] = '\0';
When strcspn gives us the number of characters read before reaching a \n, that IS the index of \n. In the same line, we can replace that position with the end-of-string marker, which effectively deletes the newline from the end of the string.
strlen
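size_t strlen(char str[]);
This function returns the number of characters in str, not counting the '\0' at the end. The return type, size_t, is an unsigned integer type that printf prints with %zu.
For example:
char str[20];
printf("Enter a word: "); //Suppose you entered "hello"
scanf("%s", str);
printf("%zu\n", strlen(str)); //prints 5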
strtok
char *strtok(char str[], char delim[]);
This function returns the first token found in str before the occurrence of any character in delim. (After the first call to strtok, pass NULL as str. This will tell it to continue looking for tokens in the original string.) When no tokens are left, it returns NULL. Note that strtok modifies str: it overwrites each delimiter it finds with a '\0'.
For example:
char buff[200];
char *token; //We'll learn about this notation in "Pointers"
printf("Enter names, separated by commas: ");
//Suppose you entered "Fred,James,Jane,Lynn"
scanf("%s", buff);
token = strtok(buff, ",");
while (token != NULL)
{
printf("%s\n", token);
token = strtok(NULL, ",");
}
The code above will print:
Fred
James
Jane
Lynn
strncpy
char *strncpy(char str1[], char str2[], int n);
This function copies at most the first n characters from str2 to str1, overwriting anything that was already in str1. It returns the newly copied string (although str1 also references the copied string). If str2 holds n or more characters, strncpy does NOT add a '\0' to str1; you must add one yourself.
strncmp
int strncmp(char str1[], char str2[], int n);
This function compares the first n characters in str1 and str2 to see which length-n prefix comes first alphabetically. It returns:
- A number less than 0, if the first n characters in str1 come alphabetically before the first n characters in str2
- 0, if the first n characters in str1 equal the first n characters in str2
- A number greater than 0, if the first n characters in str1 come alphabetically after the first n characters in str2
strrchr
char *strrchr(char str[], char c);
This function finds the LAST occurrence of c in str. It returns the suffix of str that begins with the last occurrence of c, or NULL if c does not appear in str.
strspn
size_t strspn(char str1[], char str2[]);
This function returns the number of characters read in str1 before reaching a character that is NOT in str2.
strstr
char *strstr(char str1[], char str2[]);
This function determines whether str2 is a substring of str1. If str2 is not a substring of str1, it returns NULL. If str2 is a substring of str1, it returns the suffix of str1 beginning with the first occurrence of str2.
Be Careful!
It’s very easy to make a mistake when using strings. Strings are arrays, so you will get in trouble if you try to access memory beyond the end of the array.
For example:
char buff[5];
printf("Enter a word: ");
//Suppose you enter "Hello"
scanf("%s", buff);
scanf will copy the characters 'H', 'e', 'l', 'l', 'o' into the array. However, it will then try to add the end-of-string character, '\0', into the 6th spot in the array. This is past the end of the 5-slot array, so your program may crash with a segmentation fault, or you may overwrite the value of some other variable. A lot of the string functions involve writing to strings, and none of them will handle an out-of-bounds error gracefully.
When you use the following functions, MAKE SURE you have enough memory allocated:
- scanf
- strcpy
- strcat
- strncpy
Files
This section contains information on opening a file, reading from a file, and writing to a file. I only cover how to interact with text files – it is also possible to read from and write to binary files.
Whenever you are doing file I/O, you need to add:
#include <stdio.h>
Opening a File
Before we can interact with a file, we need to open it. The fopen function lets us open files for different kinds of input and output. Here’s the prototype:
FILE* fopen(char filename[], char mode[])
The FILE* return type means that the function is returning the address of a FILE object. We’ll learn more about pointers in the next section. If the file could not be opened, fopen returns NULL.
Here, filename is a string representation of the filename, such as “data.txt”. If filename is not an absolute path, fopen looks for the file relative to the program’s current working directory. The string mode specifies what type of operations you want to do on the file.
Here are the different options for the mode:
Mode | Description |
---|---|
“r” | Open for reading (file must exist) |
“w” | Open for writing (creates file if necessary, erases old data) |
“a” | Open for appending (creates file if necessary) |
“r+” | Open for reading and writing (file must exist) |
“w+” | Open for reading and writing (creates file if necessary, erases old data) |
“a+” | Open for reading and appending (creates file if necessary; writes go to the end of the file) |
For example, we can open the file “data.txt” for reading, and print an error if we were unsuccessful:
FILE *fp = fopen("data.txt", "r");
if (fp == NULL) {
printf("Error opening file\n");
}
After we are done reading from a file or writing to a file, we must close the file with the fclose function. Here’s the prototype:
int fclose(FILE* fp)
To close data.txt, we’d do:
fclose(fp);
Reading from a File
There are two major functions for reading from a file: fscanf and fgets. fgets works exactly like we’ve seen before, except now we specify a FILE* instead of stdin. fscanf works exactly like scanf, except we first specify the FILE*. We’ll start with fscanf:
int fscanf(FILE *stream, char str[], variable addresses...)
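fscanf, like scanf, returns the number of variables that were correctly read in. If it reaches the end of the file before reading anything, the EOF constant is returned. Thus we can compare the return value of fscanf to EOF to see if we’ve reached the end of the file. (Comparing the return value to the number of variables you expect is more robust, because that check also fails on a malformed line instead of looping forever.)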
Suppose the file data.txt looks like this (a bunch of names and ages, each on separate lines):
Bob 20
Jill 15
Tony 17
Lisa 22
We want to read this file, and print something like “Bob is 20 years old” to the console for each person in the file. Here’s how:
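FILE *fp = fopen("data.txt", "r");
char name[20];
int age;
if (fp != NULL) {
    //2 means both the name and the age were read; the loop stops at
    //the end of the file and also on a malformed line
    while (fscanf(fp, "%19s %d", name, &age) == 2) {
        printf("%s is %d years old\n", name, age);
    }
    fclose(fp);
}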
Now, let’s try to do the same thing with the fgets function. Here’s the prototype:
char *fgets(char s[], int size, FILE *stream)
fgets reads a string from a specified file into the s array. The size parameter specifies the size of the array, so fgets will not write past its end. It returns a reference to the string that was read. If no string was read (indicating an error or the end of the file), NULL is returned.
fgets will attempt to read size-1 characters unless it reaches a newline or the end of the file.
Here’s the same example repeated with fgets:
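//requires #include <string.h> (strtok, strcpy) and <stdlib.h> (atoi)
//assumes every line has the form "name age"
FILE *fp = fopen("data.txt", "r");
char name[20];
char buf[30];
int age;
if (fp != NULL) {
    while (fgets(buf, 30, fp) != NULL) {
        //parse the current line: the name is the first token
        char *token = strtok(buf, " ");
        strcpy(name, token);
        //get the age: pass NULL to keep reading the same line
        token = strtok(NULL, " ");
        age = atoi(token);
        printf("%s is %d years old\n", name, age);
    }
    fclose(fp);
}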
As we saw when using fgets to read from stdin, it WILL store the newline character at the end of each string when reading from a file (assuming there is still room in the array). You may want to use strcspn to overwrite the \n with a \0.
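Reading files with fscanf is usually simpler (since it doesn’t involve parsing lines), but it is more error-prone than fgets: one malformed line can leave fscanf stuck or out of step with the rest of the file, whereas fgets reads the file line by line.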
Writing to a File
The primary function for writing to a file is fprintf. This function works exactly like printf, but the first argument is now a FILE*. Here’s the prototype:
int fprintf(FILE* fp, char str[], variables to print...)
Here is an example that will ask the user to input 10 numbers. Each number will be written on a separate line to the file out.txt:
FILE *fp = fopen("out.txt", "w");
if (fp != NULL) {
    int num, i;
    for (i = 0; i < 10; i++) {
        printf("Type a number: ");
        scanf("%d", &num); //scanf needs the address of num
        fprintf(fp, "%d\n", num);
    }
    fclose(fp);
}