Ok, let’s solve the last programming riddle:
Minimum window in String in O(n)
Given a string S and a string T, find the minimum window in S which will contain all the characters in T in complexity O(n)
S = "ADOBECODEBANC" T = "ABC" Minimum window is
What I’ve personally found tricky in this problem was to realize the algorithm. I tried to imagine, without writing, a solution. I tried to imagine the string, the chars, the indexes to keep track of the windows, and so on. This way, I couldn’t even think about a naive O(n^2) solution. I was stuck.
What can we do when we are stuck on a problem? Have a break. Try to free our mind for a little while. Have some fun/relaxing activities and in the end come back to break the problem down, armed with pen and paper. Yes, pen and paper. Drawing an instance of the problem is the best way to catch what we are missing, moreover breaking a problem in smaller part is the best way to focus on different aspects of the problem, without get stuck blinded from the whole complexity.
Let’s start from the smallest unit of the problem question: the window. What exactly is a window in this context? A window in this context is a string itself. It is a subset of the input string S, which contains all the chars in T. An important property of a window is how its edges are defined, this is the keystone of the problem, and as long as I had not realized/formalized it, I could not solve the problem.
The figure above shows two examples of window over S given T. Focusing on the edges of these two windows,we can easily see that they are chars from the set T. So, the window is always bounded by two different characters of T and, if we imagine to scroll the string S from left to right, once we have found all the chars in T, we know that we have found a window and this windows can be calculated as all the chars of S between the “furthest” char of T from the current position and the char of T we just found at the current position (the closest).
So far we have two important rules of our algorithm:
- A window has been found when at least all the chars in T have been found.
- A window can be formalized as the difference between the index of the leftmost char and the rightmost one.
We’re not done yet. Once we found the first window, keeping scrolling the string S, which is the event that makes us realize that we found another window? If we focus on the difference between window 1 and window 2 in the above figure, we can see that the second window is found as soon as a char of T is found again in a different position, in that case B, which becomes the new window’s edge. We can write a third rule then:
- Once we have the first window, every next possible window is found as soon a char of T is found again in a different position.
Wow, without even realizing it, we have written an algorithm to solve the problem! The following figure sums up all we have said so far:
We have the algorithm, but still some details to keep it O(n) are missing.
- How do we perform the “is-in-there” test to check if a character is contained in the T? We need a Set built over a Hast-Table. In this way, we can perform this test in O(1)
- How do we perform the “do-we-have-all-of-them” test to check if we’ve found a window? We need another Set (let’s name it: charsFound) in order to keep track of the number of chars we’ve found. When the size of this set is equal to the size of T: we’ve found a windows.
- And finally, very important, once we’ve found all the chars in T, how do we calculate the window? As we said a window is calculable as the difference between the index of the leftmost char and the rightmost one, which means the smallest index in the “charsFound” set and the biggest one. This Set must contain also the indexes of the chars and it must be sorted by them. In this way we can access to the smallest/biggest index in O(1).
We can now find every windows in the string, but we want to find the smallest one. That’s easy. We can keep a reference to the “currentMinWin” and every time, scrolling the string, we find a new window, we can perform a check on its length and if it is smaller than the currentMinWin’s length, we have a new currentMinWin
Now you know everything to try to write your own solution code. Anyway, if you want, you can have a look at my code written in Java. See you guys