Site Search:

find longest common substring for 2 strings

Problem

Given 2 strings, find the longest common substring s1 and s2. For example, given "abccade" and "dgcadde", the output should be "cad".

Solution

brutal force approach is to find all the substrings of s2, then test if they are contained in s1. We need a loop for the substring's start position and another loop for substring's end position. If implemented with char comparison, contains need another 2 layers of loops, the total cost is O(MN^3).

We can organize the comparison with so called sliding window to achieve O((M + N)*N) time complexity and O(1) space complexity. The outer loop iterate through len1 + len2, when i < len1, the start position of s2 is fixed and s1's start position changes, otherwise, the start position of s1 is fixed and s2's start position changes. So once i iterate through 0 to len1 + len2, we covered all the scenarios.

public class LongestCommonString {
    public static String getMaxCommon(String s1, String s2) {
        int maxEnd = 0;
        int maxLength = 0;
        int tmpMax = 0;
        int len1 = s1.length();
        int len2 = s1.length();
        int s1Start = 0;
        int s2Start = 0;
        int i, j;
        for(i = 0; i < len1 + len2; i++) {
            s1Start = s2Start = 0;
            tmpMax = 0;
            if(i < len1) {
                s1Start = len1 - i;
            } else {
                s2Start = i - len1;
            }
            for(j = 0; s1Start + j < len1 && s2Start + j < len2; j++) {
                if(s1.charAt(s1Start + j) == s2.charAt(s2Start + j)) {
                    tmpMax ++;
                } else {
                    if(tmpMax > maxLength) {
                        maxLength = tmpMax;
                        maxEnd = s1Start + j;
                    } else {
                        tmpMax = 0;
                    }
                }
            }
            if(tmpMax > maxLength) {
                maxLength = tmpMax;
                maxEnd = s1Start + j;
            }
        }
        return s1.substring(maxEnd - maxLength, maxEnd);
    }
    public static void main(String...args) {
        String s1 = "abccade";
        String s2 = "dgcadde";
        System.out.println(getMaxCommon(s1, s2));
    }
}

Another approach is to use dynamic programming, 
knowing the formula as follows:
f(i + 1, j + 1) = 0 when s1[i + 1] != s2[j + 1]
f(i + 1, j + 1) = f(i, j) + 1 when s1[i + 1] == s2[j + 1]
and the initial condition as follows:
f(0, j) = 0 for all j >=0
f(i, 0) = 0 for all i >=0