Arrays in C: The Fundamental Concepts

An array in C is a fundamental data structure used to store a collection of elements. The single most important characteristic of an array is that all elements must be of the same data type. For example, you can have an array of integers, an array of floating-point numbers, or an array of characters, but you cannot mix these types within a single, standard C array. This homogeneity is a core principle of arrays in C and many other statically-typed languages.

The second key characteristic is that these elements are stored in contiguous memory locations. This means that the elements are placed side-by-side in the computer’s memory, one right after the other, with no gaps. This design choice is not accidental; it is the reason why arrays are so efficient. Because they are in a continuous block, the computer can calculate the exact location of any element very quickly, simply by knowing the starting location, the element size, and the element’s index.

You can locate and access each individual element within the array with the help of its index. The index is a numerical value that specifies the position of an element. In C, arrays are zero-indexed, which means the first element is at index 0, the second element is at index 1, and so on. If an array has a size of 10 elements, the valid indices will be from 0 to 9. Understanding this zero-based numbering is one of the most crucial concepts for new C programmers.

In summary, an array is a fixed-size, contiguous block of memory that holds a sequence of elements of the same data type. The size of the array is fixed, meaning you must specify how many elements the array will hold at the time you declare it, and this size cannot be changed later. This is a defining feature of C arrays and differentiates them from more flexible (but less direct) data structures. We will explore the implications of this fixed size throughout this series.

Why Do We Need Arrays?

To understand the necessity of arrays, let’s first consider the alternative. Imagine you are writing a program to store the grades of five students. Without arrays, you would need to declare a separate variable for each student’s grade. You might write something like int grade1;, int grade2;, int grade3;, int grade4;, and int grade5;. This is manageable for five students, but it is already becoming clumsy.

Now, what if we have a large collection of these variables? Suppose you need to store the grades for a class of 100 students, or the daily high temperature for an entire year. You cannot be expected to declare 100 or 365 unique variables, such as int grade1;, int grade2;, all the way to int grade100;. This approach is not just tedious and time-consuming; it is fundamentally unscalable and makes your code impossible to manage.

Even more problematic than the declaration is the processing of this data. If you wanted to calculate the average grade for the 100 students, you would have to write a single, massive line of code: average = (grade1 + grade2 + grade3 + … + grade100) / 100;. This is a nightmare to write, read, and debug. You cannot use a loop, because each variable has a different name. This is the core problem that arrays are designed to solve.

Here comes the use of arrays in C. An array allows us to store multiple values of the same type under a single variable name. Instead of 100 different int variables, we can declare a single array: int grades[100];. Now, we have one variable name, grades, that represents 100 contiguous memory slots, each capable of holding an integer. This is the power of data aggregation.

With this array, processing the data becomes trivial. To calculate the average, we can use a loop. A simple for loop can iterate from index 0 to 99, accessing each element by its index (grades[0], grades[1], etc.). We can add them to a sum variable inside the loop. This replaces one unmanageable 100-term expression with a simple, 3-line loop. Hence, we use an array in C when we are working with a large number of similar items that we need to process systematically.
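
The loop described above can be sketched as a small function. This is a minimal illustration, not the only way to write it: the name average_grade and the choice to return a double (rather than the integer division shown earlier) are assumptions made for this example.

```c
#include <stddef.h>

/* Returns the arithmetic mean of the first `count` grades.
   Returns 0.0 for an empty array to avoid dividing by zero. */
double average_grade(const int grades[], size_t count)
{
    if (count == 0) {
        return 0.0;
    }

    long sum = 0;
    for (size_t i = 0; i < count; i++) {
        sum += grades[i];   /* visits grades[0] .. grades[count - 1] */
    }
    return (double)sum / (double)count;
}
```

A caller that declared int grades[100]; would invoke it as average_grade(grades, 100). The same function works unchanged for 5 students or 5,000, which is exactly what separate variables could not offer.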

The Core Concept: Contiguous Memory

We have mentioned that arrays are stored in “contiguous memory locations.” This is perhaps the most important technical detail to understand about them. When you declare an array, you are asking the compiler to reserve a single, unbroken block of memory large enough to hold all the elements you requested. If you ask for an array of 10 integers, and an integer on your system takes 4 bytes, you are asking for a 40-byte block of memory.

Let’s visualize this. Imagine the computer’s memory is a long street of houses, and each house has an address. When you declare int arr[5];, the compiler finds five empty, adjacent lots and builds five “integer-sized” houses. If the first house (element arr[0]) is at memory address 1000, the second house (arr[1]) will be at address 1004 (assuming a 4-byte int). The third (arr[2]) will be at 1008, the fourth (arr[3]) at 1012, and the fifth (arr[4]) at 1016.

This contiguous layout is what makes array access so fast. When you ask for arr[3], the computer does not have to search for it. It performs a simple calculation. It takes the starting address of the array (1000), and adds the “offset.” The offset is the index (3) multiplied by the size of the data type (4 bytes). So, 1000 + (3 * 4) = 1012. The computer jumps directly to memory address 1012 and retrieves the value. This is called “random access” and it takes a constant amount of time, O(1), no matter how large the array is.
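
The address arithmetic can be checked from inside a program. The sketch below (the function name byte_offset_of_element is made up for this example) measures how many bytes separate an element from the start of a five-element array. It uses sizeof(int) rather than assuming 4 bytes, since the size of an int varies between systems.

```c
#include <stddef.h>

/* Returns how many bytes separate arr[index] from arr[0]
   in a local array of 5 ints. `index` must be in 0..4. */
size_t byte_offset_of_element(size_t index)
{
    int arr[5] = {10, 20, 30, 40, 50};

    /* Compare the addresses as byte pointers to measure the distance. */
    const unsigned char *start = (const unsigned char *)&arr[0];
    const unsigned char *elem  = (const unsigned char *)&arr[index];

    return (size_t)(elem - start);   /* equals index * sizeof(int) */
}
```

On a system where an int occupies 4 bytes, byte_offset_of_element(3) yields 12, matching the 1000 + (3 * 4) = 1012 calculation above.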

This contrasts with other data structures like linked lists, where elements can be scattered all over memory. In a linked list, to find the fourth element, you must start at the first, follow its pointer to the second, follow its pointer to the third, and finally follow its pointer to the fourth. This is a sequential operation that gets slower as the list gets longer. The contiguous nature of arrays is their primary performance advantage.

Declaring an Array in C

Before you can use an array, you must “declare” it. This declaration tells the C compiler three essential things about the array: the data type of the elements, the name of the array, and the size of the array. The basic syntax for declaring an array in C is given here. Data_type array_name[ array_size ];

Let’s break down each part. The Data_type is any valid C data type. This specifies what kind of items will be stored in the array. This can be int for integers, float for floating-point numbers, char for characters, or even more complex types like structs. All elements in the array must share this exact type.

The array_name is the identifier you will use to refer to the array. It follows the same rules as naming any other variable in C. For example, you might choose a name like grades, temperatures, or sensor_readings to make your code more readable. It is good practice to choose plural nouns for array names, as they represent a collection of items.

The array_size is the most critical part of the declaration. It is a value, enclosed in square brackets [], that specifies exactly how many elements the array can hold. In classic C (C89/90 standard), this size must be a constant integer value, like 10, or a constant expression that the compiler can evaluate, such as a macro defined with #define SIZE 10. This is because the compiler needs to know precisely how much memory to reserve for the array when it compiles the program.

Let us name our array “arr” and declare it using the above syntax. The data type of our array elements will be integer, and the size is 6. The declaration would look like this: int arr[6]; When the compiler sees this line, it will set aside a continuous block of memory large enough for 6 integers. If an int on this system is 4 bytes, this declaration reserves 6 * 4 = 24 bytes of memory.

Understanding Array Indexing

Once an array is declared, its elements are accessed using an “index” (also called a “subscript”). The index is specified by placing an integer expression inside the square brackets [] after the array’s name. For example, to access an element in our array arr, we would write arr[index]. A common point of confusion for beginners is that array indexing in C is “zero-based.” This means the first element of the array is at index 0, not index 1.

For our declared array, int arr[6];, there are six elements. The valid indices for this array are 0, 1, 2, 3, 4, and 5. The first element is arr[0]. The second element is arr[1]. The last element is arr[5]. The size of the array is 6, so the last valid index is always size - 1. This pattern is universal in C.

This zero-based indexing system is not arbitrary. It is a direct consequence of how memory access works. The index is actually an “offset” from the starting address of the array. The name of the array, arr, represents the starting memory address of the entire block. When you write arr[0], you are telling the computer to get the value at an offset of 0 elements from the start. When you write arr[3], you are asking for the value at an offset of 3 elements from the start.

This is why arr[0] is the first element. Attempting to access arr[6] in an array of size 6 is a very common and dangerous error. This index is “out of bounds.” The C compiler and runtime system typically do not stop you from doing this. The C standard calls the result “undefined behavior”: in practice your program will usually touch the memory immediately after the array, which could be holding another variable or just garbage data, but it may also crash or misbehave in less obvious ways. Writing past the end of an array in this way is known as a “buffer overflow,” and it is a source of many bugs and security vulnerabilities.
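
Because C does not check indices for you, one common defensive habit is to check them yourself before touching the array. The helper below is a hypothetical sketch (get_checked is not a standard function); it assumes the caller knows the array’s size and passes it in.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: copies arr[index] into *out and returns true,
   or returns false without reading the array if index is out of range. */
bool get_checked(const int arr[], size_t size, size_t index, int *out)
{
    if (index >= size) {
        return false;          /* e.g. index 6 in an array of size 6 */
    }
    *out = arr[index];
    return true;
}
```

With int arr[6];, a call such as get_checked(arr, 6, 6, &value) returns false instead of reading past the end. The index is a size_t, which is unsigned, so a negative index cannot be passed by accident without a conversion.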

The Static Nature of C Arrays

As we have mentioned, the size of a standard C array must be specified at compile time. This is what we mean when we say arrays in C are “static.” The compiler must know the exact size of the array when it is building the final executable program. This is because standard arrays are typically allocated on the “stack,” a region of memory that is managed automatically as functions are called and returned. The compiler needs to generate instructions to reserve a fixed-size block on the stack for the array.

This fixed-size nature has significant trade-offs. The main advantage is efficiency. Allocation on the stack is extremely fast—it is just a matter of moving a single “stack pointer” by a fixed amount. There is no complex memory management overhead. The memory is also automatically reclaimed when the function exits, which prevents memory leaks.

However, the disadvantages are equally significant. The first is memory wastage. If you are not sure how much data you will need, you must guess a maximum size. If you declare an array int data[1000]; to be safe, but you only end up using 50 elements, you have wasted the memory for the other 950 integers. This can be a serious problem in memory-constrained systems.

The second, and often more severe, disadvantage is the risk of overflow. If you declare int data[100]; but you then try to store 101 elements, you will write data “out of bounds.” This can corrupt other variables on the stack, leading to unpredictable program behavior and crashes. This lack of flexibility is the primary motivation for “dynamic” data structures, which we will compare arrays to in a later part of this series.
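
One way to guard against this kind of overflow is to keep a count of used slots next to the array and refuse writes once it is full. The sketch below is only an illustration: the struct IntBuffer, the function buffer_push, and the CAPACITY macro are names invented for this example.

```c
#include <stdbool.h>
#include <stddef.h>

#define CAPACITY 100   /* assumed fixed size, matching int data[100] */

/* Hypothetical fixed-capacity buffer: the array plus a count of used slots. */
struct IntBuffer {
    int    data[CAPACITY];
    size_t count;
};

/* Appends value if there is room; refuses instead of writing out of bounds. */
bool buffer_push(struct IntBuffer *buf, int value)
{
    if (buf->count >= CAPACITY) {
        return false;
    }
    buf->data[buf->count] = value;
    buf->count++;
    return true;
}
```

The 101st call to buffer_push returns false rather than corrupting whatever lives after the array. The fixed size is still a limit, but exceeding it becomes a condition the program can detect and handle.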

It is important to note that the C99 standard introduced “Variable Length Arrays” (VLAs), which allow you to declare an array using a variable for the size, like int n = 10; int arr[n];. However, VLAs have their own set of rules and risks (like stack overflow if n is too large) and they were controversially made an optional feature in the C11 standard. For this reason, many C programmers stick to the classic fixed-size arrays for maximum portability and safety.

Initialization: Giving Arrays Their First Values

When you declare an array, such as int arr[6];, you have only reserved the memory. You have not specified what values should be stored in those six memory slots. In C, if you declare an array inside a function (a “local” array), its contents are “uninitialized.” This means the memory slots will contain whatever “garbage” values were left over in that part of memory from previous program operations.

Using these garbage values is a common source of bugs, as they can lead to unpredictable calculations. Therefore, it is essential to “initialize” your array, which means giving it a set of starting values. C provides several ways to do this. The most common method is to initialize the array at the same time you declare it, using an “initializer list.” This is a list of values enclosed in curly braces {}.

For example, we can declare and initialize our array of six integers in one line: int arr[6] = {1, 4, 8, 25, 2, 17}; The compiler will see this and place 1 into arr[0], 4 into arr[1], 8 into arr[2], 25 into arr[3], 2 into arr[4], and 17 into arr[5]. This is the clearest and most direct way to set up an array.

A special feature of this syntax is “partial initialization.” What if you provide fewer values than the array’s size? int arr[10] = {1, 2, 3}; In this case, the compiler will initialize the first three elements as specified (arr[0]=1, arr[1]=2, arr[2]=3). The C standard guarantees that all remaining elements in the array will be automatically initialized to zero. This is a very useful and common idiom for creating a large array and ensuring it is “zeroed out.”

If you want to initialize all elements of an array to zero, you can simply write: int arr[10] = {0}; This initializes the first element to 0, and the rule of partial initialization then sets the remaining 9 elements to 0 as well. This is a concise and efficient way to ensure your array starts in a clean, predictable state.

Another convenient syntax is initializing an array without specifying the size. If you provide an initializer list, you can let the compiler count the elements for you by leaving the square brackets empty. int arr[] = {1, 4, 8, 25, 2, 17}; The compiler will see that you provided 6 values, so it will automatically create the array with a size of 6. This is very handy, as it means you can add or remove elements from your initializer list without having to manually update the array size in the declaration.
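
The initialization rules above can be confirmed with two short functions. The function names are made up for this sketch, and the second one uses the sizeof(arr) / sizeof(arr[0]) idiom, which gives the element count of an array that is in scope.

```c
#include <stddef.h>

/* Sums a partially initialized array: only three values are listed,
   so the remaining seven elements are zero. */
int sum_partially_initialized(void)
{
    int arr[10] = {1, 2, 3};
    int sum = 0;
    for (size_t i = 0; i < 10; i++) {
        sum += arr[i];
    }
    return sum;   /* 1 + 2 + 3 + seven zeros */
}

/* Lets the compiler size the array from its initializer list
   and returns the resulting element count. */
size_t inferred_length(void)
{
    int arr[] = {1, 4, 8, 25, 2, 17};
    return sizeof(arr) / sizeof(arr[0]);
}
```

The first function returns 6, because the unlisted elements contribute nothing to the sum. The second also returns 6, this time because the compiler counted six initializers.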

Data Types in Arrays

The rule that an array must hold elements of the same data type is a cornerstone of C’s type system. This property is known as being “homogeneous.” Let’s explore what this means in practice. You can declare an array of any of C’s built-in fundamental types. For example, float prices[100]; creates an array that can hold 100 floating-point numbers, which are numbers with decimal points.

Similarly, double high_precision_pi[50]; creates an array to hold 50 double-precision floating-point numbers, which offer more precision than float. And char name[30]; creates an array of 30 characters. This last example is particularly special in C. An array of characters is the standard way to represent a “string,” or a piece of text. We will dedicate a large section to character arrays in a later part.

The homogeneity rule also applies to user-defined data types. In C, you can create your own complex data types using struct. For example, you could define a struct to represent a student: struct Student { int student_id; float gpa; char name[50]; }; This Student type is now a valid data type in your program. You can create an array of this type to store data for an entire class: struct Student class_roll[30];

This declaration creates a contiguous block of memory large enough to hold 30 Student structures. Each element of the array, such as class_roll[0] or class_roll[1], is a complete Student structure. You can then access the members of that structure using the dot . operator, for example: class_roll[0].student_id = 101; or class_roll[0].gpa = 3.5;.
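
As a sketch of how an array of structures is processed, the function below walks a roll of students and averages their GPAs. The struct matches the definition above; the function name average_gpa is an assumption made for this example.

```c
#include <stddef.h>

struct Student {
    int   student_id;
    float gpa;
    char  name[50];
};

/* Returns the mean GPA of the first `count` students, or 0 for none. */
float average_gpa(const struct Student roll[], size_t count)
{
    if (count == 0) {
        return 0.0f;
    }
    float total = 0.0f;
    for (size_t i = 0; i < count; i++) {
        total += roll[i].gpa;      /* dot operator on each element */
    }
    return total / (float)count;
}
```

Given struct Student class_roll[30];, the call average_gpa(class_roll, 30) visits every element in order. The indexing and loop pattern are exactly the same as for an array of int; only the element type has changed.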

You can also store pointers in an array. int* pointer_array[10]; creates an array that holds 10 pointers to integers. This is a powerful and advanced technique used for many purposes, such as creating an array of strings (which is an array of pointers to characters). The key takeaway is that no matter how simple or complex the data type is, all elements within one array must be of that same type.

Arrays and Memory: A Deeper Look

Let’s quantify the memory usage of an array. Because all elements are the same size and are stored contiguously, the total memory occupied by an array is a simple multiplication: total_bytes = array_size * sizeof(element_type). The sizeof operator in C is a compile-time tool that tells you how many bytes a particular data type (or variable) occupies in memory.

Let’s use our example int arr[6];. To find its total size, we would use sizeof(arr). The compiler, which knows arr is an array of 6 integers and knows sizeof(int) is (for example) 4 bytes, will replace sizeof(arr) with the value 24. This is very useful.

You can also find the size of a single element by using sizeof(int) or, more robustly, sizeof(arr[0]). This gives you the size of the first element, which is the same as all other elements. This leads to a very common and important C idiom for calculating the number of elements in an array, especially when you have used the [] syntax to let the compiler set the size.

Imagine you have int arr[] = {1, 4, 8, 25, 2, 17};. You know the size is 6, but if this list was 100 items long, you would not want to count it. You can have the program calculate the length for you: int length = sizeof(arr) / sizeof(arr[0]); Here, sizeof(arr) is the total size of the array in bytes (24). sizeof(arr[0]) is the size of one element in bytes (4). 24 / 4 = 6. This formula will always give you the correct number of elements in the array. This is critical for writing loops, as it allows your loop to automatically adapt if you change the size of the array’s initializer list.

This sizeof trick, however, comes with a massive warning that we will explore in Part 3. It only works in the same scope where the array was originally declared. If you pass an array to another function, the array “decays” into a pointer. Inside that function, sizeof(arr) will just give you the size of a pointer (e.g., 8 bytes), not the size of the whole array. This is why you must always pass the size of an array to a function as a separate argument.
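
A minimal sketch of both halves of this rule follows. The macro name ARRAY_LEN and the function names are invented for this example; the macro is only meaningful where the real array, not a pointer to it, is in scope.

```c
#include <stddef.h>

/* Element count of a true array; only valid where the array itself
   (not a pointer to it) is in scope. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* The parameter is really `const int *`, so the length must be
   passed in separately. */
long sum_ints(const int values[], size_t length)
{
    long total = 0;
    for (size_t i = 0; i < length; i++) {
        total += values[i];
    }
    return total;
}

/* Caller side: the array is in scope here, so ARRAY_LEN works. */
long sum_example(void)
{
    int arr[] = {1, 4, 8, 25, 2, 17};
    return sum_ints(arr, ARRAY_LEN(arr));
}
```

Inside sum_ints, writing sizeof(values) would measure a pointer, which is why the length travels as its own argument. Computing the length at the call site keeps the loop correct even if the initializer list grows or shrinks.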

Common Pitfalls for Beginners

Arrays are simple in concept but have several common “gotchas” that trip up new C programmers. The most common is the “off-by-one” error. This happens when you misunderstand the zero-based indexing. For an array int arr[10];, the elements are 0 through 9. A beginner will often write a loop that runs from 1 to 10. for (int i = 1; i <= 10; i++) { arr[i] = 0; } This loop is wrong in two ways. It skips arr[0], the first element. Even worse, it tries to write to arr[10], which is out of bounds and will corrupt memory.

The correct loop structure in C for an array of size N is almost always: for (int i = 0; i < N; i++) { … use arr[i] … } This “start at 0, go up to but not including N” pattern is the standard C idiom for iterating over arrays. It correctly accesses indices 0, 1, 2, …, N-1.
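
Here is the corrected form of the faulty zeroing loop, wrapped in a small function so the size is passed in explicitly. The name fill is made up for this sketch.

```c
#include <stddef.h>

/* Sets every element of arr[0 .. n-1] to value. */
void fill(int arr[], size_t n, int value)
{
    for (size_t i = 0; i < n; i++) {   /* i stops before reaching n */
        arr[i] = value;
    }
}
```

For int arr[10];, the call fill(arr, 10, 0) touches indices 0 through 9 and nothing else.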

Another common pitfall is using an uninitialized array. A programmer declares int arr[10]; and then immediately tries to use it in a calculation, like int sum = arr[0] + arr[1];. The program will compile, but the sum will be a meaningless number based on whatever garbage values were in memory. You must always initialize your data before you read it.

A third pitfall is confusing assignment with comparison. This is a general C error but often appears with arrays. A beginner might write if (arr[i] = 5) { … } (using a single =) intending to check if the element is 5. Instead, this assigns the value 5 to arr[i] and the if statement will always evaluate to true (since 5 is non-zero). The correct operator is the “is equal to” operator, ==, as in if (arr[i] == 5) { … }.

Finally, as mentioned, attempting to use sizeof to find the length of an array inside a function is a classic error. It will not work. The solution is to pass the length as an explicit function parameter. Understanding these common mistakes is the first step to avoiding them and writing robust, correct C code.

Recap: Declaration vs. Initialization

In the first part of our series, we established the fundamental concepts of C arrays. We learned that an array is a fixed-size, contiguous block of memory holding elements of a single, homogeneous data type. We also briefly touched on the difference between “declaration” and “initialization.” It is crucial to solidify this distinction, as it is a common source of confusion. A “declaration” is what reserves the memory. int arr[10]; This line is a declaration. It tells the compiler, “I need space for 10 integers,” and the compiler reserves that space (e.g., 40 bytes) on the stack. At this point, the memory is reserved, but its contents are “indeterminate” or “garbage.”

An “initialization,” by contrast, is the act of giving an array its first set of values, at the same time it is declared. int arr[10] = {1, 2, 3}; This is both a declaration and an initialization. It reserves space for 10 integers and populates the first three with 1, 2, and 3, while setting the remaining seven to 0. An “assignment,” on the other hand, happens after declaration. It is the act of changing a value that is already in the array. arr[4] = 99; This is an assignment. You are not initializing the array; you are updating an existing element.

It is important to know that C provides a special, convenient syntax for initialization (the {} list) that is only available at the time of declaration. You cannot use this syntax for assignment later. For example, the following code is illegal in C: int arr[10]; arr = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}; // ILLEGAL The compiler will not accept this. A brace-enclosed list is only valid as an initializer, and an array as a whole cannot appear on the left-hand side of an assignment; in most expressions its name simply converts to the address of its first element. You cannot “assign” a new list to the entire array at once. If you want to populate an array after declaring it, you must do so one element at a time, typically by using a loop, or copy the contents of another array with a library function such as memcpy.
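
Two common ways of filling an array after it has been declared are sketched below: a loop that assigns each element, and a bulk copy from a second array using the standard memcpy function from string.h. The function names are assumptions made for this example.

```c
#include <stddef.h>
#include <string.h>

/* Loop version: assigns 1, 2, ..., n one element at a time. */
void fill_sequence(int dest[], size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dest[i] = (int)i + 1;
    }
}

/* Copy version: arrays cannot be assigned with =, but the bytes of
   one array can be copied into another of the same size. */
void load_defaults(int dest[10])
{
    static const int defaults[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    memcpy(dest, defaults, sizeof defaults);
}
```

Both functions leave a 10-element array holding the values 1 through 10. The copy version assumes the destination has room for at least 10 ints; as always in C, nothing checks that for you.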

Initializing Array In C During Declaration

Let’s take a deeper look at the most common and recommended way to initialize an array: during its declaration. This method is clear, concise, and safe, as it ensures the array never contains garbage data. The syntax, as we have seen, uses a comma-separated list of values enclosed in curly braces {}. Data_type name_of_array [ size ] = { value1, value2, …, valueN };

Let’s look at an example with various data types. For an integer array: int arr[6] = {1, 4, 8, 25, 2, 17}; For a floating-point array: float prices[4] = {1.99, 10.50, 3.14, 0.75}; For a character array: char vowels[5] = {'a', 'e', 'i', 'o', 'u'}; In each case, the number of values in the list matches the size specified in the square brackets. This is the simplest, most direct scenario.

As we covered in Part 1, if you provide fewer initializers than the array size, the remaining elements are automatically set to zero. This is a very powerful feature. int histogram[256] = {0}; This line declares an array of 256 integers. It explicitly initializes the first element, histogram[0], to 0. Because it is a partial initialization, the C standard guarantees that all other elements, from histogram[1] to histogram[255], are also initialized to 0. This is the standard idiom for creating and “zeroing out” an array.

What happens if you provide more initializers than the array size? int arr[3] = {10, 20, 30, 40}; // TOO MANY INITIALIZERS The C standard does not allow this, so the compiler must issue a diagnostic. Some compilers reject the program outright; others, including GCC in its default mode, only warn about excess elements and discard the extra values. Either way, treat the message as an error. It is a helpful safety feature: the compiler recognizes that the size of your initializer list exceeds the declared size of the array and tells you so, rather than letting the mismatch pass unnoticed.

Initializing Array In C Without Size

The compiler diagnostic we just saw highlights a common maintenance problem. What if you have a long list of initial values and you update it frequently? int prime_numbers[10] = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29}; Now, if you want to add the next prime number, 31, you have to remember to update the declaration in two places, the size and the list: int prime_numbers[11] = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31}; Forgetting to change the [10] to [11] would leave the list one value longer than the array, and the compiler would complain. This is annoying and error-prone.

To solve this, C provides a convenient feature: if you are initializing an array during its declaration, you can omit the size in the square brackets. The compiler will automatically count the number of elements in your initializer list and make the array that exact size. Data_type name_of_array [ ] = { value1, value2, …, valueN };

Let’s use our prime number example: int prime_numbers[] = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29}; The compiler will count 10 elements and create an array prime_numbers of size 10. If you later add 31 to the list, the compiler will automatically create an array of size 11. This makes your code much easier to maintain and less error-prone.

This method is highly recommended whenever you are declaring an array with a full list of initial values. It is also often used for character arrays that represent strings. The following two declarations are equivalent: char greeting[6] = "Hello"; char greeting[] = "Hello"; In C, a string literal like "Hello" is treated as a character array that includes a special “null-terminating” character, \0, at the end. This character marks the end of the string. So, "Hello" actually contains 6 characters: 'H', 'e', 'l', 'l', 'o', and '\0'. When you use the empty brackets [], the compiler counts all 6 and sizes the array perfectly.
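
The difference between the six bytes stored and the five visible characters can be checked with sizeof and the standard strlen function. The function name below is invented for this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if the compiler-sized array holds exactly the five
   visible characters plus the terminating '\0', and 0 otherwise. */
int greeting_sizes_match(void)
{
    char greeting[] = "Hello";

    size_t bytes_in_array = sizeof(greeting);   /* includes '\0' */
    size_t visible_chars  = strlen(greeting);   /* stops before '\0' */

    return bytes_in_array == 6 && visible_chars == 5
        && greeting[5] == '\0';
}
```

The array is always one element larger than the string’s length, because strlen does not count the terminator.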

Declaring Array in C with Loops

What if you want to initialize an array, but not with a fixed set of constants? What if you want to populate it based on a formula, for example, the first 100 even numbers? The {} initializer list is a poor fit for this: you would have to type out all 100 values by hand, and for arrays with static storage duration every initializer must be a constant expression. In this scenario, you first declare the array and then use a loop to assign values to its elements. This is the most common method for populating an array at runtime.

First, you declare an array of the required size. Since you are not initializing it, you must provide an explicit size. int even_numbers[100]; At this moment, the even_numbers array exists, but it is filled with garbage values. We must now loop through it and assign the correct value to each “slot.” A for loop is the perfect tool for this, as it is designed to repeat an action a specific number of times. We need to loop 100 times, for indices 0 through 99.

The loop would look like this: for (int i = 0; i < 100; i++) { … } Inside this loop, the variable i will take on the values 0, 1, 2, 3, and so on, all the way up to 99. This variable i is our array index. For each index i, we need to calculate the corresponding even number. The first even number (at index 0) is 0. The second (at index 1) is 2. The third (at index 2) is 4. The formula is value = i * 2.

So, we place this assignment inside our loop: for (int i = 0; i < 100; i++) { even_numbers[i] = i * 2; } When i is 0, even_numbers[0] is set to 0 * 2 = 0. When i is 1, even_numbers[1] is set to 1 * 2 = 2. When i is 99, even_numbers[99] is set to 99 * 2 = 198. After this loop finishes, our array is fully and correctly populated. This technique of “declare, then loop-to-assign” is fundamental.

This method is also used for reading data from an external source, such as user input. If you want to ask the user to enter 5 numbers, you would first declare the array: float values[5]; Then, you would loop to ask for each value: printf("Please enter 5 numbers:\n"); for (int i = 0; i < 5; i++) { scanf("%f", &values[i]); } This loop will pause 5 times, and on each iteration, it will store the user’s input into the next available slot in the array, values[0], values[1], and so on. Robust code should also check the value returned by scanf, because a failed conversion leaves the corresponding element unset.
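
A slightly more defensive version of the input loop is sketched below. It uses fscanf, the standard-library sibling of scanf that reads from any stream, and checks the return value to stop at the first token that is not a number. The function name read_floats is an assumption made for this example.

```c
#include <stdio.h>
#include <stddef.h>

/* Reads up to `capacity` floats from `in` into values[].
   Stops early at end of input or at the first token that is not a
   number, and returns how many values were actually stored. */
size_t read_floats(FILE *in, float values[], size_t capacity)
{
    size_t count = 0;
    while (count < capacity && fscanf(in, "%f", &values[count]) == 1) {
        count++;
    }
    return count;
}
```

To read from the keyboard, pass stdin: size_t got = read_floats(stdin, values, 5);. The returned count tells the rest of the program how many elements of values hold real data, so it never reads a slot that was left unset.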

Accessing Elements in Arrays

We have already seen the syntax for accessing array elements, as it is used in both initialization loops and for updating values. But let’s formalize it. Accessing an element means retrieving the value stored at a specific position in the array. This is done using the array name followed by the index in square brackets []. variable = array_name[index];

Suppose we have our initialized array: int arr[6] = {1, 4, 8, 25, 2, 17}; If we want to get the third element and store it in a new variable, we would use index 2 (since indexing starts at 0). int third_element = arr[2]; After this line, the variable third_element will hold the value 8.

You can use this access syntax anywhere you would use a regular variable. You can print it directly: printf("The first element is: %d\n", arr[0]); This would print “The first element is: 1”. You can also use it in calculations: int sum_of_first_two = arr[0] + arr[1]; This would calculate 1 + 4, and sum_of_first_two would be set to 5.

This ability to quickly retrieve any element is called “random access.” As we discussed in Part 1, this operation is extremely fast (O(1), or constant time) because the computer just does a simple address calculation: start_address + (index * element_size). It does not matter if you are accessing arr[1] or arr[100000]; the time it takes to find the element is the same.

This random access is the primary reason for using an array. It is ideal for situations where you need to look up data by its position. If you have data where the position is meaningful (e.g., “the 5th student,” “the 10th day of the month”), an array is a natural and efficient choice. Quick and easy retrieval of data items is a key advantage.

Update Element inside an Array

Updating an element, or assigning it a new value, uses the exact same syntax as accessing, but it is used on the left-hand side of an assignment operator (=). The array access expression array_name[index] acts as a “modifiable L-value,” meaning it represents a specific memory location that you can write a new value into. The syntax is: array_name[index] = new_value;

Let’s use our array again: int arr[6] = {1, 4, 8, 25, 2, 17}; At this point, arr[0] holds the value 1. If we want to update this element to a new value, say 9, we would write: arr[0] = 9; After this line, the array’s contents are {9, 4, 8, 25, 2, 17}. The original value 1 is overwritten and lost.

This can be done with any valid index. If we want to change the last element, we use index 5: arr[5] = 99; The array’s contents are now {9, 4, 8, 25, 2, 99}.

You can also make the new value dependent on the old value. This is very common. For example, to increment the third element by one: arr[2] = arr[2] + 1; This line first accesses arr[2] (which is 8), adds 1 to it (resulting in 9), and then assigns that new value 9 back into the arr[2] slot. The array is now {9, 4, 9, 25, 2, 99}. This can be written more concisely using C’s increment operators: arr[2]++; This line has the exact same effect.

This ability to update elements in-place is essential. It allows you to modify your data as your program runs. A common example is “sorting” an array. A sorting algorithm works by repeatedly comparing two elements and swapping their positions if they are in the wrong order. This “swap” is just a series of update operations using a temporary variable. We will see a full example of this shortly.

Example of Array in C: Sorting

Let’s take a practical example to understand the concepts of array access, update, and loop-based manipulation. We will write a simple program that takes an array of numbers and sorts them into descending order (from largest to smallest). We will use a basic but easy-to-understand sorting algorithm called “Bubble Sort.” This algorithm works by repeatedly stepping through the list, comparing each pair of adjacent items, and swapping them if they are in the wrong order.

First, let’s include our standard I/O header and define our main function. We will declare and initialize an array of 6 integers. #include <stdio.h> int main() { int arr[6] = {1, 4, 8, 25, 2, 17}; int n = 6; int i, j, temp; We declare n=6 to hold the size, and i and j to be our loop counters. The temp variable will be used for swapping.

Our sort requires two nested loops. The outer loop (i) runs from 0 to n-1. The inner loop (j) runs from i+1 to n-1, so each arr[i] is compared with every element that comes after it (rather than only with its immediate neighbor, as textbook Bubble Sort does). for (i = 0; i < n; i++) { for (j = i + 1; j < n; j++) { … } }

Inside the inner loop, we make our comparison. We want to sort in descending order, so if the element later in the array (arr[j]) is greater than the element earlier (arr[i]), they are in the wrong order and we must swap them. if (arr[j] > arr[i]) { … // swap logic }

To swap arr[i] and arr[j], we need our temp variable. We first store the value of arr[i] in temp. Then, we overwrite arr[i] with the value of arr[j]. Finally, we put the original value (now in temp) into arr[j]. temp = arr[i]; arr[i] = arr[j]; arr[j] = temp; This three-step process is the standard way to swap two variables in C.

After these nested loops have finished, the array will be sorted. We can then add one more loop to print the results. printf("Printing Sorted Element List (Descending):\n"); for (i = 0; i < n; i++) { printf("%d\n", arr[i]); } return 0; } When you run this program, the output will be the heading line followed by the values 25, 17, 8, 4, 2, 1, each on its own line. This example demonstrates every concept we have discussed: declaration, initialization, using loops for traversal, and accessing/updating elements to manipulate the data.

Advantages of this Approach

The loop-based approach to initialization and manipulation is the workhorse of C programming. Its main advantage is its flexibility. It allows you to populate arrays with values that are not known at compile time. These values can come from user input (scanf), a file, a sensor reading, or the result of a complex calculation.

It also allows you to work with arrays that are far too large to initialize manually. Nobody is going to type a 10,000-element initializer list. But you can easily declare int data[10000]; and then use a loop to read 10,000 data points from a file. This scalability is essential for real-world applications.

Furthermore, this approach separates the allocation of memory from the population of data. This can be a useful design pattern. A function might be responsible for creating an array and passing it to another function, which is then responsible for filling it with data. This modularity makes code cleaner and easier to maintain.


Common Errors in Initialization and Access

The most common error is the “off-by-one” error, which we mentioned in Part 1. It is worth repeating. Accessing arr[6] in an array of size 6 is undefined behavior. Your program might crash, it might corrupt other data, or it might appear to work, which is the most dangerous outcome, as the bug will be hidden. Always loop from 0 to size – 1.

Another error is forgetting to initialize an array before reading from it. If you declare a local array int arr[10]; and then immediately try to printf("%d", arr[0]);, the element’s value is indeterminate, and you will typically print whatever garbage happened to be left in that memory. Always make sure an array element has been assigned a value before you access (read) it. The int arr[10] = {0}; trick is your best friend to prevent this.

A more subtle error is related to array types. C will happily let you write float f = arr[0]; even if arr is an int array. The compiler will perform an “implicit type conversion,” turning the integer 1 into a float 1.0. This can sometimes be what you want, but it can also lead to loss of precision if you assign a float to an int, as the decimal part will be truncated.

Finally, remember that you cannot assign arrays to each other. int a[5] = {1, 2, 3, 4, 5}; int b[5]; b = a; // ILLEGAL An array’s name is not a variable that can be reassigned. It is a constant address. If you want to copy the contents of array a into array b, you must copy the elements themselves, typically with a loop. for (int i = 0; i < 5; i++) { b[i] = a[i]; } (The standard library function memcpy, declared in <string.h>, does the same job in a single call.) This is a fundamental concept that we will explore much more in the next part, when we discuss the deep relationship between arrays and pointers.

The Deep Connection: Arrays and Pointers

In the C programming language, there is an intimate and fundamental relationship between arrays and pointers. Understanding this connection is arguably the single most important “aha!” moment for any aspiring C programmer. It clarifies why arrays behave the way they do, especially when they are passed to functions. The most important rule to learn is this: in most contexts, the name of an array “decays” into a pointer to its first element.

Let’s say you have an array: int arr[10]; When you use the name arr in your code (for example, passing it to a function), you are not passing the entire 10-integer block of memory. Instead, the compiler automatically converts arr into a pointer of type int* that holds the memory address of the first element of the array. That is, arr becomes equivalent to &arr[0].

This is why you cannot assign one array to another, as in b = a;. The name on the right decays to an address, and the name on the left is not a modifiable variable that could receive it, so the assignment is illegal. This is also why the [] index operator is just a convenient syntax. The compiler translates your easy-to-read code into pointer arithmetic. When you write arr[3], the compiler internally translates this to *(arr + 3).

This expression, *(arr + 3), means “take the starting address arr, add 3 element sizes to it, and then dereference that new address (get the value at that location).” This is known as “pointer arithmetic.” This connection explains why array access is so fast. It is just a single addition and a memory lookup. It also explains why C arrays are zero-indexed. The first element is at *(arr + 0), which is simply *arr, the value at the starting address.

Pointer Arithmetic Explained

Let’s break down pointer arithmetic, as it is key to understanding arrays. When you add an integer to a pointer, you are not adding that many bytes. You are adding that many elements. The compiler is smart enough to know the size of the data type the pointer points to. Let’s use our int arr[10]; and assume it starts at memory address 1000, and sizeof(int) is 4 bytes.

The name arr decays to a pointer to arr[0], so arr has the value (address) 1000. If you write arr + 1, you are not calculating 1000 + 1 = 1001. The compiler knows arr is an int*, so it calculates 1000 + (1 * sizeof(int)), which is 1000 + (1 * 4) = 1004. This is the memory address of arr[1]. Similarly, arr + 3 means 1000 + (3 * sizeof(int)), or 1000 + (3 * 4) = 1012. This is the memory address of arr[3].

Now, let’s look at the “dereference” operator, *. This operator means “get the value at this address.” *arr is the value at address 1000, which is arr[0]. *(arr + 1) is the value at address 1004, which is arr[1]. *(arr + 3) is the value at address 1012, which is arr[3]. This shows that arr[i] is just “syntactic sugar” (a convenient alternative syntax) for *(arr + i).

This interchangeability is complete. You can even use the pointer syntax with array indexing, though it is very confusing and not recommended: 3[arr] is a perfectly legal, if bizarre, way to write arr[3]. The compiler sees 3[arr] and converts it to *(3 + arr), which is identical to *(arr + 3), which is the same as arr[3]. This is a common “trick” question in C interviews, but it perfectly illustrates that the [] operator is just a commutative addition operation followed by a dereference.

Passing Arrays to Functions

This array-pointer decay is most important when you pass an array to a function. Let’s say you have an array in main and you want to write a function sum_array to calculate the sum of its elements. In main, you declare: int numbers[] = {10, 20, 30, 40, 50}; int total = sum_array(numbers, 5); You call the function, passing the array numbers.

When the sum_array function receives numbers, it does not receive a copy of the entire 5-element array. Copying large arrays would be extremely slow and memory-intensive. Instead, due to array decay, the function receives only a pointer to the first element. The function “signature” (its declaration) must reflect this. You have two ways to write it, and they are exactly identical to the compiler.

The “array syntax” way: int sum_array(int arr[], int size) { … } The “pointer syntax” way: int sum_array(int* arr, int size) { … } Even if you use the first syntax with int arr[], the compiler immediately converts it to int* arr. The empty brackets [] are just a visual clue to the human programmer that you expect a pointer to the start of an array, not just a pointer to a single integer.

The sizeof Trap

The fact that arrays decay to pointers inside functions leads to the most common and dangerous trap for C beginners: the sizeof operator. In Part 1, we learned a trick to get the length of an array: int length = sizeof(arr) / sizeof(arr[0]); This trick ONLY works in the same scope (the same function) where the array was originally declared with a fixed size.

Let’s see what happens if we try to use this inside our sum_array function: int sum_array(int arr[], int size) { int length = sizeof(arr) / sizeof(arr[0]); // THIS IS WRONG … } This will not work. Inside this function, arr is no longer an “array of 5 integers.” It has decayed into a simple int* (a pointer to an integer). sizeof(arr) will not give you the size of the original array (20 bytes). It will give you the size of a pointer on your system, which is typically 4 or 8 bytes. sizeof(arr[0]) will be sizeof(int) (4 bytes). If you are on a 64-bit system (8-byte pointers), length will be calculated as 8 / 4 = 2, regardless of whether the original array had 5 or 5,000 elements.

This is why you must pass the size of the array as a separate argument to the function. Our function call was sum_array(numbers, 5);. We explicitly passed the 5. The correct function implementation would be: int sum_array(int* arr, int size) { int sum = 0; for (int i = 0; i < size; i++) { sum += arr[i]; // or sum += *(arr + i) } return sum; } This function now correctly uses the size parameter to control its loop, not the broken sizeof trick.

Modifying Array Data Within a Function

A critical consequence of “pass-by-pointer” is that the function gets a pointer to the original data, not a copy. When you pass a simple variable like int x to a function, the function gets a copy of x. If the function modifies its copy, the original x in main is unchanged. This is called “pass-by-value.”

But when you pass an array, you are passing the memory address of its first element. The function now has direct access to the original array’s memory. This means any modifications the function makes to the array elements happen in the caller’s array and will be visible back in the main function. Strictly speaking, C still passes the pointer itself by value, but because that pointer refers to the caller’s memory, the effect is the same as “pass-by-reference,” and it is the default behavior for arrays.

Let’s write a function zero_out_array that takes an array and sets all its elements to 0. void zero_out_array(int* arr, int size) { for (int i = 0; i < size; i++) { arr[i] = 0; } }

Now, in our main function: int main() { int numbers[] = {10, 20, 30, 40, 50}; printf("Before: %d\n", numbers[0]); // Prints "Before: 10" zero_out_array(numbers, 5); printf("After: %d\n", numbers[0]); // Prints "After: 0" return 0; } As you can see, the zero_out_array function permanently modified the numbers array that lived in the main function. This is extremely powerful, but also requires caution. You must be aware that any function you pass an array to can modify your original data.

Using const to Protect Array Data

Sometimes, you want to pass an array to a function just to read it (like in our sum_array example) but you want to guarantee that the function does not accidentally modify your data. This is a good practice for writing safe, “read-only” functions. C provides the const keyword for this purpose.

If you add const to the pointer declaration in the function signature, you are telling the compiler that this function is not allowed to change the data the pointer points to. int sum_array(const int* arr, int size) { … } The const keyword here means “arr is a pointer to an integer that is constant.” The integer itself cannot be changed through this pointer.

Now, inside your sum_array function: int sum = 0; for (int i = 0; i < size; i++) { sum += arr[i]; // This is OK, we are only reading. } But what if a programmer makes a mistake and tries to modify the array? for (int i = 0; i < size; i++) { sum += arr[i]; arr[i] = 0; // COMPILER ERROR } The C compiler will see this line and generate an error, something like “assignment of read-only location arr[i].” This is a fantastic safety feature. It allows you to enforce at compile time that your function only reads data and has no side effects.

As a best practice, any time you write a function that takes an array but should not modify it, you should declare the array parameter as const. This makes your code safer and more self-documenting (it tells other programmers your intent). It also lets callers pass arrays that are themselves declared const, which a non-const parameter would not accept without a compiler diagnostic.

Array Name vs. a Real Pointer

So, if the array name arr just decays to a pointer, is it exactly the same as a pointer? Not quite. There are two important exceptions. The first exception, as we have seen, is the sizeof operator. int arr[10]; int* p = arr; Here, sizeof(arr) is 10 * sizeof(int) (e.g., 40 bytes). But sizeof(p) is just the size of a pointer (e.g., 8 bytes). The array name arr is not just a pointer; it is a “non-modifiable l-value” that refers to the entire block of 10 integers, and sizeof is smart enough to know this.

The second exception is the & (address-of) operator. When you use arr (or &arr[0]), you get a pointer to the first element. The type is int* (pointer to an integer). When you use &arr, you get a pointer to the entire array. The type is int (*)[10] (a pointer to an array of 10 integers). This is a very subtle and confusing distinction.

Let’s look at the values: arr (decays to &arr[0]) might have the value (address) 1000. Its type is int*. &arr also has the value (address) 1000. Its type is int (*)[10]. They both point to the same location, but they are of different types. Where does this matter? Pointer arithmetic.

If you take arr + 1, you get 1000 + (1 * sizeof(int)) = 1004. If you take (&arr) + 1, you get 1000 + (1 * sizeof(arr)), which is 1000 + (1 * 40) = 1040. (&arr) + 1 gives you the address of the memory after the entire 10-element array. This distinction is advanced, but it proves that the compiler does, in some contexts, treat the array arr as more than just a simple pointer.

For 99% of C programming, you can safely live by the rule: “An array’s name is a constant pointer to its first element.” Just remember the sizeof trap when passing arrays to functions, and you will be in good shape.

Pointers to Array Elements

You can, of course, create your own pointers that point to elements within an array. This is a very common and powerful technique for traversing an array. Instead of using an integer index i, you can use a pointer that “walks” through the array. int arr[5] = {10, 20, 30, 40, 50}; Let’s create a pointer that points to the beginning of the array. int* p = arr; // Same as int* p = &arr[0]; Now p holds the address of arr[0].

We can also create a pointer that points to the end of the array. This is a common C idiom. int* end = arr + 5; Note that arr + 5 is the address of arr[5], which is one past the end of the array. We are not allowed to dereference this pointer (read or write its value), but we are allowed to have it for comparison purposes.

Now, we can write a loop that uses this pointer instead of an index i. int sum = 0; for (int* p = arr; p < end; p++) { sum += *p; } Let’s trace this.

  1. int* p = arr;: p starts at the address of arr[0].
  2. p < end;: Is p (address of arr[0]) less than end (address of arr[5])? Yes.
  3. sum += *p;: Dereference p (which is arr[0], value 10) and add it to sum. sum is now 10.
  4. p++: This is pointer arithmetic. It increments p to point to the next element. p now points to arr[1].
  5. p < end;: Is p (address of arr[1]) less than end? Yes.
  6. sum += *p;: Dereference p (which is arr[1], value 20). sum is now 30.
  7. p++: p now points to arr[2]. This continues until p points to arr[5], one past the last element. At that point the check p < end compares the address arr + 5 with itself, which is false, so the loop terminates. The sum is 150.

This style of programming is very common in experienced C code. It can be more efficient in some older compilers, and it more closely maps to the underlying machine operations. Both the index-based loop (for (i=0;… )) and the pointer-based loop (for (p=arr;… )) are valid and useful tools to have.

Returning Arrays from Functions

This is another major point of confusion. A function in C cannot return a standard array. This syntax is illegal: int[10] my_function() { … } // ILLEGAL Why? Arrays are not “first-class citizens” in C. You cannot pass them by value, and you cannot return them by value. When you try, they decay to pointers. So, you might think, “Fine, I’ll just return a pointer to the array.” int* my_function() { int arr[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}; return arr; // DANGEROUS AND WRONG }

This is one of the worst and most common errors in C. The array arr was created on the “stack.” When my_function finishes and returns, its entire stack frame is destroyed. This means the arr array no longer exists. The function returns a pointer to a memory location that is now “garbage.” Back in main, using this pointer is undefined behavior: you may read garbage data, the program may crash, or it may appear to work until something else reuses that memory. This is called “returning a pointer to a local variable.”

So how do you get array data out of a function? You use the “pass-by-reference” behavior we saw earlier. Instead of having the function create the array, the caller (e.g., main) creates the array. Then, the caller passes a pointer to that array into the function, and the function’s job is to fill it up.

Correct “return” pattern: void fill_array(int* arr, int size) { // This function "returns" data by filling the provided array for (int i = 0; i < size; i++) { arr[i] = (i + 1) * 10; } } And in main: int main() { int my_data[10]; // 1. Main allocates the memory fill_array(my_data, 10); // 2. Pass a pointer to the function // 3. The function fills main's memory printf("%d\n", my_data[2]); // Prints 30 return 0; } This “caller-allocates” pattern is the standard, safe, and correct way to “return” array data from a function in C. The main function owns the memory, and the fill_array function just populates it.