A Quick Introduction To Data Structures

Suppose you went to a grocery store and bought two items, let’s say a can of beans and a small packet of rice. You went to the checkout counter, paid for the can of beans and rice, and politely refused to take a bag as these are just two items which you can carry one in each hand and walk to your car. Great!

Now imagine you went to the grocery store and bought eggs, bread, apples, spinach, lettuce, carrots, pasta sauce, pasta….. And you go to the checkout counter. Now you definitely need a bag! As you cannot carry all these items with your two hands, even if it is a short walk to your car.

Computer systems are the same!  Our computer programs work on data and logic. For example, in the program to calculate the sum of the numbers 1 and 2, the data would be the numbers 1 and 2 and the logic would be the addition operation. In such small programs, we do not essentially need a data structure. 

//data - Just two numbers, we can manage without a data structure
int one = 1;
int two = 2;

//call the logic
Sum(one, two)

//the logic
int sum(int a, int b){
 return (a+b);
}

But we seldom write such small programs in the real world. Consider a similar program which calculates the sum of all the numbers between 1 and 100. It would get really messy to handle 100 numbers if we start assigning a variable each to every number.

//data of a hundred numbers
int one = 1;
int two = 2;
........
int hundred = 100;

//calling the sum logic 
//since the sum method accepts only two  numbers
// we need to call the method again, and again, and again
int result = sum(one, two)
result = sum(three, result)
result = sum(four, result)
... 
//and so on

Do we really need over a 100 variables just to calculate a sum of numbers?

Well, we can also use a for loop to make this a bit cleaner, since we know the range of numbers we are working on (1-100)

int sum = 0;
For(int i = 1; i <= 100; i++){
 sum = sum + i;
}

But, the input numbers would not always be in the form of a “range” which can be represented by the index variable (i) of the for loop. For example, the input numbers could be 1, 200, 99, 450, 77 and so on.

This is where data structures come into play. Just like bags in a grocery store which help organize your groceries, data structures are nothing but storage units which the computer uses to organize the data in its memory.

Let’s go back to the grocery bag example.

Would you put loose eggs into a bag? No, they will break! That’s why the eggs are packed in a crate – neatly organized so that they may not break. Similarly, when you start putting groceries in a bag, you will not put tomatoes first and then a bottle of pasta sauce on top of it to avoid the tomatoes from squishing. Thus, when we start putting stuff in the bag, we would sub-consciously be taking care of the following:

  1. Every item is inserted in the bag properly so that nothing breaks
  2. Every item can be retrieved from the bag in proper condition
  3. Every item can be easily searched in the bag, without damaging the bag or other items
  4. Every item can be removed from the bag, without damaging the bag or other items

That is exactly how the computer works with data structures. Each “structure” stores data in such a way so that the basic operations of insertion, retrieval, deletion and search can be done easily without breaking the data structure or affecting the data.

Different Data structures store data differently to cater to different operations that can be performed on the data. Some data structures support easy insertion , while others provide fast search. Following are some common data structures that you would encounter in your coding life:

Arrays: A structure which stores  a series of data elements in contiguous locations in memory for easy retrieval

Linked Lists: Like arrays, store a series of data elements. Unlike arrays, does not store in contiguous locations. Data is stored in nodes. Each node has a pointer to point to the memory location of the next node (which has the next data). This allows for better insertion and deletion operations as compared to arrays.

Maps / Dictionaries: Store data as key-value pairs to search data efficiently

Trees/ Graphs: Like linked lists, Trees store data in nodes. Each node is structured in a way that it can have a parent and children. Graphs are similar to trees in the sense that all nodes are connected to form a kind of network. These are advanced data structures and provide the means to solve complex algorithms.

Thus we can see that different structures organize data differently providing their own advantages. This was a super fast introduction to the world of data structures, to give you an understanding of where data structures come from (without intimidating you!).

To summarize:

  • Data structures are nothing but different ways in which the computer organizes the data being used in a program
  • Different operations that can be performed on data structures are: insertion, deletion, retrieval and search of data elements
  • There are many data structures which are being used in real world programs. They all organize data differently, with different strengths when it comes to the different operations
  • Arrays, linked lists, maps, dictionaries, trees and graphs are some examples of data structures.

In my upcoming posts, we will do a deep dive into the various data structures and start solving real world problems using those data structures.

Let’s uncomplicate coding!

K.