Understanding Big O Notation + Frequency Counter Pattern
One of the biggest challenges I faced as a self-taught developer is Data structures and Algorithms. In my several research (asking questions, googling, and reading), I have come to realize that “Efficiency matters a lot” and this is why I decided to study data structures and algorithms again extensively. This time, I am using the learning by doing approach. Thus, I try to understand it from a practical point of view rather than a theoretical point. Without wasting much time, let’s dive into what you’re here for.
In this blog, you’ll learn about the Big O notation with some use cases and how we can optimize one of the examples with “Frequency CounterPattern” for better performance.
The content for the blog is grouped, as shown below:
- Big O Notation
- What problem are we trying to solve?
- Optimizing quadratic to linear function
- Solving the problem with Frequency Counter Pattern
- Comparing the two solutions using Big-O Complexity Chart
Let’s dive straight into it!
- Big O Notation
This is a concept that can be difficult to understand but very helpful if you get the basics. Big O Notation is used in giving an upper bound on a function to within a constant factor. In simple terms, we use Big O notation to determine your cod'syour code'syour code's efficiency or complexity mostly in te,rms of time and space complexity.
Let’s take a look at the different representation of the O and n with some examples for each.
When an algorithm has O(1), the input is independent of time and space. Thus, time and space remain the same regardless of the input size.
In this code snippet, I am adding one to the input (assuming the input is a number). This doesn’t affect the time or space complexity, thus they remain constant.
When an algorithm has O(n), the input is dependent on time and space and increases linearly. Thus, time and space grow in direct proportion to the size of the input.
In this code snippet, the function logs all the elements in the array. If the array size increases, the amount of time to perform this function will increase in direct proportion to the array.
When an algorithm has O(n²), the input is dependent on time and space and increases quadratically. Thus, time and space grow in direct proportion to the square of the size (n*n) of the input.
This code snippet compares two arrays. If they have some elements in them, it returns true else false. I used a “nested loop” to solve this problem, a common O(n²) algorithm. Later in this blog, I used frequency pattern to optimize this solution to have O(n)
When an algorithm has O(2^n), the algorithm time/space complexity doubles if there’s an addition to the input.
Recursion is a good example of O(2^n). This code snippet is one of the ways to solve a Fibonacci question using recursion. The time/space will double with each addition to the input.
I will use the binary search example to explain how the logarithm works. Binary search is a search algorithm used in finding the index of an element in an array. This is done by dividing the search/array into half each time we are looking for the target. With this, the time complexity is proportional to the logarithm of the input size.
2. What problem are we trying to solve?
No, buttat we understand what Big O notation is let’s in this articl in this articleesee how we can apply refactoring one of the code snippets from earlier (isSame). Let’s take a look at the question below:
Given a function that takes two arrays as an argument, determine if the values are the same in both arrays.
In solving this problem, let’s first breakdown the question into smaller parts and make possible test cases.
First of all, let’s assume the function returns true or false if it passes or fails.
From the question, we can make the following test cases with the corresponding return value:
3. Solving the problem in a “naive way...”
The first approach to solving this that came to mind was using nested for loops like the code in O(n²) above.
In the code above,
- I first check if the two params have the same size. If they don’t, then it means it's false.
- While the two params have the same size, I looped through the first array, and within the first array for loop block, I will loop through the second array. This is what is referred to as the “Nested loop”.
- The last thing I did was check if array1 does not include a value from array two at an index (which “j” in the code above). If that’s the case, then the two arrays are not the same, and therefore, I return false else, I return true.
This is a basic solution and would probably be the first solution that comes to mind,, but there are some downsides to this solution. Though this solution works, the efficiency of the code is not the best. Using the Big O notation logic, this code is O(n^2); thus, if n (in our case, the arrays) increases, the time and space complexity will be n*n. This might not be a problem for arrays with small sizes but arrays with larger sizes.
4. Solving the same problem with Frequency Counter Pattern
The second approach to solve this question is using the “Frequency Counter Pattern” as shown in the diagram below:
In the code above:
- I first check if the two params have unequal size and return false because that implies they do not have same values.
- The array in stored in an object where each key (each array index) has a value (the number of times the key appears in the array). Using “for of” loop, I checked if the key already has a pair and increment the value or initialize to 1. This is done for both array1 and array2.
- The last I did was to use “for in” loop to loop through one of the objects, in this case frequencyCounter1. This way, I can check if a key in frequencyCounter1 can not be found in frequencyCounter2. If that’s the case, then that means the objects do not have the same key value pairs. If they have same values and keys, then I return true.
With this solution, we achieve Big-O complexity of O(n).
5. Comparing the two solutions using Big-O Complexity Chart
Let’s take a look at the diagram below to compare the performance difference between these two solutions. From the diagram, the first solution will be in the red zone (Horrible) while the second solution will be in the green zone (Good).
Big O notation is a good concept to know as developer. This will help you determine which solution will be more efficient.
Here are some useful links to explore more about Big O notation
- Big O Cheats Sheet
FINALE! I hope you find this blog useful ☺