Many times, the problem at hand is not hard, but tedious. You need to observe what works for this particular problem, consider all the edge conditions, then write down the boring grocery list. Hey, wake up, that's why your boss is still paying you.
Problem:
find the smallest element in a rotating array. A rotating array is one by moving the first a few elements to the end of an ascending ordered array. For example, in rotating array [5, 6, 1, 2, 3, 4], the smallest element is 1.
Analysis:
Easy solution is traversal the array then find the smallest one. This approach takes N comparison. It has time complexity O(N) and space complexity O(1). However, your boss want your program to be faster. When you have lots of threads do the search or have to search a huge array, any tiny improvement of the algorithm will be magnified.
The array is almost ordered, we can use this property to improve the algorithm for this particular problem.
first of all, there are many edge cases besides the normal case in the example:
- [5, 6, 1, 2, 3, 4] normal rotating
- [3, 4, 5, 6, 1, 2] normal rotating
- [6, 1, 2, 3, 4, 5] normal rotating
- [7, 1, 2, 3, 4, 5, 6] normal rotating
- [5, 6, 7, 1, 2, 3, 4] normal rotating
- [3, 4, 5, 6, 7, 1, 2] normal rotating
- [1, 2, 3, 4, 5, 6] no rotating
- [0, 1, 1, 1, 1, 1] no rotating, repeating elements
- [1, 1, 0, 1, 1, 1] rotating, repeating elements
- [1, 0, 1, 1, 1, 1] rotating, repeating elements
- [1, 1, 1, 1, 0, 1] rotating, repeating elements
- [1, 1, 1, 1, 1, 0] rotating, repeating elements
- [0, 0, 0, 0, 0, 1] no rotating, repeating elements
- [0, 0, 1, 0, 0, 0] rotating, repeating elements
- [0, 1, 0, 0, 0, 0] rotating, repeating elements
- [0, 0, 0, 0, 1, 0] rotating, repeating elements
- [0, 0, 0, 0, 0, 1] rotating, repeating elements
- [2, 2, 2, 2, 2, 2] all repeating elements
With some observation, they are actually 4 cases:
- [1, 2, 3, 4, 5, 6] [0, 1, 1, 1, 1, 1] [0, 0, 0, 0, 0, 1] no rotating, array[0] < array[length - 1], the smallest element is the first element
- [5, 6, 1, 2, 3, 4] [3, 4, 5, 6, 1, 2] [6, 1, 2, 3, 4, 5] [7, 1, 2, 3, 4, 5, 6] [5, 6, 7, 1, 2, 3, 4] [3, 4, 5, 6, 7, 1, 2] rotating, array[0] > array[length - 1]
- [1, 1, 1, 0, 1, 1] [0, 0, 1, 0, 0, 0] rotating, array[0] == array[length - 1]
- [2, 2, 2, 2, 2, 2] all elements are the same, array[0] == array[length - 1]
Now we know the terrain, let's engage the problem.
Here we can easily detect and solve case 1, we only need to worry about case 2, 3, 4.
Case 2 is a problem of looking for the dividing point between 2 parts. Binary search is good at this kind of task, because at each step, we can eliminate half of the search by checking the current position.
Case 3 and 4 won't benefit from binary search, the array's value is almost the same everywhere. At each binary divide step, you won't find any hint about which half to search, you have to search both halves. So let's just search from left to right, the searching order won't change your chance of finding the special point at each step.
Case 3 and 4 won't benefit from binary search, the array's value is almost the same everywhere. At each binary divide step, you won't find any hint about which half to search, you have to search both halves. So let's just search from left to right, the searching order won't change your chance of finding the special point at each step.
find the value of middle point mid = (low + high)/2
if array[mid] < array[mid - 1] this is the dividing point
if array[mid] < array[mid - 1] this is the dividing point
if array[mid] > array[mid - 1]
if array[mid] > array[low]
dividing point is at the right half
if array[mid] < array[high]
dividing point is at the left half
if array[mid] == array[mid - 1]
search from left to right
The new algorithm improves the performance: when there is no rotation, it skips the search, finishes the calculation in O(1) complexity; when the array elements are normal, it saves time by doing binary search, decreases the time complexity to O(logN); when the array elements are almost the same, it still does O(N) time search.
Let's take another problem with the same nature.
Problem:
Given 2 collections, the element in the collection is a number range. These range are in ascending order. Find the common part of the 2 collections. For example, Given collection {[3, 9], [13, 15]} and {[6, 8], [14, 19]}, the common parts are {[6, 8], [14, 15]}.
Analyze:
we can brutal force. Let's name the two collections as leftC and rightC. We iterate through leftC. For each element, we search overlap in the rightC. The search is to compare the lower bound and higher bound of the element to each elements in the rightC. We need to iterate both leftC and rightC, the time complexity is O(N1N2).
There is a general solution from textbook -- build a balanced interval search tree, then find all intervals that intersects (lo, hi) in the interval search tree. For N intervals, the building cost is O(NlogN), and each search step have time complexity O(RlogN), R is the number of intersections.
So if we build such a balanced interval search tree with rightC. The build cost is O(N2logN2). Then we iterate through the leftC, for each element, we search all the intersects, the cost of these search is O(N1RlogN2).
None of these methods used the fact that "These range are in ascending order. " This fact means the 2 collections have lower entropy, we should be able to find better algorithm than general cases.
First of all, let's get the requirement clear. "These range are in ascending order. "
Let's represent the range elements in a collection as {...[low(i - 1), high(i - 1)], [low(i), high(i)], [low(i + 1), high(i + 1)], ...}
Q: Can the elements in the same collection overlap?
A: no.
Q: Ascending order means low(i) < high(i) or low(i) <= high(i)?
A: low(i) <= high(i)
Q: Can there be repeating elements in the range?
A: no.
Q: "No overlap" means high(i) < low(i + 1)? There is no high(i) == low(i + 1) in any circumstances?
A: correct.
Q: The common parts will be stored in a third collection as ranges, right?
A: yes.
Q: the range boundary are all positive integers?
A: not necessary, but yes.
Q: when there are any empty collection, we output an empty collection as the common parts?
A: yes.
Q: Let's assume the two collections are @Nonnull
A: ok
Now lets start from the leftC's first element lrange0 = [llow0, lhigh0] and rightC's first element rrange0 = [rlow0, rhigh0]. There are many cases, especially those <= and >= edges.
Case 1: lrange0 includes rrange0.
llow0 <= rlow0 <= lhigh0 && llow0 <= rhigh0 <= lhigh0
-------------
-----
-------------
-------------
-------------
------
-------------
---
Case 2: lrange0 overlap rrange0, lrange0 is at left.
llow0 <= rlow0 <= lhigh0 && rhigh0 > lhigh0
-------------
----------------
-------------
----------------------
-------------
-----------------
Case 3: lrange0 overlap rrange0, rrangee0 is at left.
llow0 <= rlow0 <= lhigh0 && rhigh0 > lhigh0
----------------
------------------
----------------
------------------------
---------------
-----------
Case 4: rrange0 includes lrange0.
rlow0 <= llow0 <= rhigh0 && rlow0 <= lhigh0 <= rhigh0
---------
----------------------
----------------------
----------------------
----------
-----------------------
----------
-----------------------
Case 5: lrange0 and rrange0 have no overlap, lrange0 is at left.
lhigh0 < rlow0
-------------
-------------------------------
Case 6: lrange0 and rrange0 have no overlap, rrange0 is at left.
rhigh0 < llow0
------------------------------
---------------
Now we know the terrain, let's engage the problem.
Case 1: lrange0 includes rrange0
if we move to lrange1, we won't find any overlap with rrange0, if we move to rrange1, we might find overlap with lrange0. So we move to rrange1 next.
Case 2: lrange0 overlap rrange0, lrange0 is at left
if we move to rrange1, we won't find any overlap with lrange0, if we move to lrange1, we might find overlap with rrange0. So we move to lrange1 next.
Case 3: lrang0 overlap rrang0, rrange0 is at left
if we move to rrange1, we won't find any overlap with lrange0, if we move to lrange1, we might find overlap with rrange0. So we move to lrange1 next.
Case 3: lrang0 overlap rrang0, rrange0 is at left
if we move to lrange1, we won't find any overlap with rrange0, if we move to rrange1, we might find overlap with lrange0. So we move to rrange1 next.
Case 4: rrange0 includes lrange0
if we move to rrange1, we won't find any overlap with lrange0, if we move to lrange1, we might find overlap with rrange0. So we move to lrange1 next.
if we move to rrange1, we won't find any overlap with lrange0, if we move to lrange1, we might find overlap with rrange0. So we move to lrange1 next.
Case 5: lrange0 and rrange0 have no overlap, lrange0 is at left
if we move to rrange1, we won't find any overlap with lrange0, if we move to lrange1, we might find overlap with rrange0. So we move to lrange1 next.
Case 6: lrange0 and rrange0 have no overlap, rrange0 is at left
if we move to lrange1, we won't find any overlap with rrange0, if we move to rrange1, we might find overlap with lrange0. So we move to rrange1 next.
So you can tell from the above 6 cases, given any leftC's index i and rightC's index j, we can test llow rlow, lhigh, rhigh in order to figure out which case leftC[i] and rightC[j] are in. Once we processed the overlap, we can decide to increase i or increase j according to which case we are currently in. Finally, the exit condition for this state machine is when you exausted either collection's elements.
That is the solution. We only have to iterate each collection once, the time complexity is O(N1 + N2) and there is little extra space needed, the space complexity is O(1).
The take away from the above problem is: the best solution sometimes come from observation and analyzation at an intimacy distance of the problem.
Here are another problem emphasis this point.
Analyze:
The brutal force approach is to iterate through the first array, then search each element in the second and third arrays. The time complexity is O(N1 (N2 + N3)).
However we notice "Each array is in ascending order." Again these arrays have lower entropy, we can expect better solution than usual.
First of all, lets get the requirement clear:
Q: What type of elements in the arrays?
A: positive integers.
Q: Is repeating elements allowed?
A: no.
Q: Is empty array allowed?
A: no.
Analyze:
There are a few cases:
lets represent the three arrays as a1, a2, a3
Case 1: no overlap. a1[high] < a2[low] && a2[high] < a3[low]
------
--------
--------
Case 2: no overlap. a2[high] < a1[low] && a1[high] < a3[low]
--------
------
--------
Case3: no overlap. a3[high] < a2[low] && a2[high] < a1[low]
--------
---------
-------
Case4: overlap. a1[i] == a2[j] && a2[j]== a3[k], next possible matching state is i++, j++, k++
- -
- -
- -
Case5: overlap. a1[i] == a2[j] && a2[j] < a3[k], next possible matching state is i++, j++, k++
- -
- -
- -
Case6: overlap. a1[i] == a3[k] && a1[i] < a2[j], next possible matching state is i++, j++, k++
- -
- -
- -
Case7: overlap. a2[j] == a3[k] && a2[j] < a1[i], next possible matching state is i++, j++, k++
- -
- -
- -
Case 5: overlap. a1[i] < a2[j] && a2[j] < a3[k], next possible matching state is i++
- -
- -
- -
Case 6: overlap. a1[i] < a3[k] && a3[k] < a2[j], next possible matching state is i++
- -
- -
- -
Case 7: overlap. a2[j] < a3[k] && a3[k] < a1[i], next possible matching state is j++
- -
- -
- -
Case 8: overlap. a2[j] < a1[i] && a1[i] < a[k], next possible matching state is j++
- -
- -
- -
Case 9: overlap. a3[k] < a2[j] && a2[j] < a1[i], next possible matching state is k++
- -
- -
- -
Case 10: overlap. a3[k] < a1[i] && a1[i] < a2[j], next possible matching state is k++
- -
- -
- -
Finally, the exiting state is when one of the array elements are exausted.
A clever guy might find a simpler reasoning that covers all the above conditions:
if(a1[i] == a2[j] && a2[j] == a3[k]) {
i ++; j++; k++;
} else if(a1[i] < a2[j]) {
i++; //a1[i] won't be the common element
} else if(a2[j] < a3[k]) { //have a1[i] > a2[j] meets implicitly
j++; //a2[j] won't be the common element
} else { //a3[k] < a2[j] implicitly
k++; //a3[k] won't be the common element
}
The time complexity is O(N1 + N2 + N3).
So you can tell from the above 6 cases, given any leftC's index i and rightC's index j, we can test llow rlow, lhigh, rhigh in order to figure out which case leftC[i] and rightC[j] are in. Once we processed the overlap, we can decide to increase i or increase j according to which case we are currently in. Finally, the exit condition for this state machine is when you exausted either collection's elements.
That is the solution. We only have to iterate each collection once, the time complexity is O(N1 + N2) and there is little extra space needed, the space complexity is O(1).
The take away from the above problem is: the best solution sometimes come from observation and analyzation at an intimacy distance of the problem.
Here are another problem emphasis this point.
Problem:
Given 3 arrays, find their common elements. Each array is in ascending order. For example, given the following array {2, 4, 7, 9, 14, 16} {1, 2, 3, 9, 14, 66} {1, 2, 3, 4, 5, 6, 7, 8, 9, 10} the common elements are {2, 9}Analyze:
The brutal force approach is to iterate through the first array, then search each element in the second and third arrays. The time complexity is O(N1 (N2 + N3)).
However we notice "Each array is in ascending order." Again these arrays have lower entropy, we can expect better solution than usual.
First of all, lets get the requirement clear:
Q: What type of elements in the arrays?
A: positive integers.
Q: Is repeating elements allowed?
A: no.
Q: Is empty array allowed?
A: no.
Analyze:
There are a few cases:
lets represent the three arrays as a1, a2, a3
Case 1: no overlap. a1[high] < a2[low] && a2[high] < a3[low]
------
--------
--------
Case 2: no overlap. a2[high] < a1[low] && a1[high] < a3[low]
--------
------
--------
Case3: no overlap. a3[high] < a2[low] && a2[high] < a1[low]
--------
---------
-------
Case4: overlap. a1[i] == a2[j] && a2[j]== a3[k], next possible matching state is i++, j++, k++
- -
- -
- -
Case5: overlap. a1[i] == a2[j] && a2[j] < a3[k], next possible matching state is i++, j++, k++
- -
- -
- -
Case6: overlap. a1[i] == a3[k] && a1[i] < a2[j], next possible matching state is i++, j++, k++
- -
- -
- -
Case7: overlap. a2[j] == a3[k] && a2[j] < a1[i], next possible matching state is i++, j++, k++
- -
- -
- -
Case 5: overlap. a1[i] < a2[j] && a2[j] < a3[k], next possible matching state is i++
- -
- -
- -
Case 6: overlap. a1[i] < a3[k] && a3[k] < a2[j], next possible matching state is i++
- -
- -
- -
Case 7: overlap. a2[j] < a3[k] && a3[k] < a1[i], next possible matching state is j++
- -
- -
- -
Case 8: overlap. a2[j] < a1[i] && a1[i] < a[k], next possible matching state is j++
- -
- -
- -
Case 9: overlap. a3[k] < a2[j] && a2[j] < a1[i], next possible matching state is k++
- -
- -
- -
Case 10: overlap. a3[k] < a1[i] && a1[i] < a2[j], next possible matching state is k++
- -
- -
- -
Finally, the exiting state is when one of the array elements are exausted.
A clever guy might find a simpler reasoning that covers all the above conditions:
if(a1[i] == a2[j] && a2[j] == a3[k]) {
i ++; j++; k++;
} else if(a1[i] < a2[j]) {
i++; //a1[i] won't be the common element
} else if(a2[j] < a3[k]) { //have a1[i] > a2[j] meets implicitly
j++; //a2[j] won't be the common element
} else { //a3[k] < a2[j] implicitly
k++; //a3[k] won't be the common element
}
The time complexity is O(N1 + N2 + N3).