Site Search:

Longest Substring with At Most K Distinct Characters

Problem:

Given a string, find the length of the longest substring T that contains at most k distinct characters.

For Example:

Given string "hmhkl" and k = 2
Output: 3
T is "hmh" with length 3, which contains 2 distinct characters.

Solution:

We can solve this problem with sliding window. The left and right pointer initially point to first character. Move the right pointer expands the sliding window and adds new character; while move the left pointer shrink the sliding window and remove old character. We just have to count how many distinct characters are in the sliding window. To achieve that goal, we can use an integer to count the distinct characters and a map to count the repeating times for each character. The map is updated when we add or remove a character, the count is updated when we meet a new character or the map's count for a character decreases to 0. Actually, assuming the string only contains lowercase characters, we can use a size 26 int array to replace the map for the same purpose.

Once we have the count of distinct characters in the sliding window, we can decide when to expand the sliding window and when to shrink the sliding window. When the count is k+1, we need to register a max length then we need to shrink the sliding window until the count is k again.

class Solution {
  public static void main(String...args) {
    System.out.println(longestSub("hmhkl", 2));
    System.out.println(longestSub("klhmhkl", 2));
    System.out.println(longestSub("klhmh", 2));
    System.out.println(longestSub("klhmhklklhmhkl", 2));
    System.out.println(longestSub("bbbbb", 1));
    System.out.println(longestSub("babab", 1));
  }

  public static int longestSub(String str, int k) {
    if(str == null || str.length() == 0)
      return 0;
    int count = 0;
    int[] counter = new int['z' - 'a' + 1]; //26
    int maxLen = 0;
    int tmpMax = 0;
    int l = 0;
    for(int i = 0; i < str.length(); i++) {  //babab
      char c = str.charAt(i); //a
      counter[c - 'a']++;   //b 0, a 1
      tmpMax++;   //2
      if(counter[c - 'a'] == 1) {
        count++;  //2
        if(count > k) {
          maxLen = Math.max(maxLen, tmpMax - 1);  //1
          while(l < i) {
            c = str.charAt(l++); //b
            counter[c - 'a']--;
            tmpMax--;
            if(counter[c - 'a'] == 0) {
              count--;
              break;
            }
          }
        }
      }
    }
    maxLen = Math.max(maxLen, tmpMax);
    return maxLen;
  }
}

The time complexity is O(N), the left and right pointer go through the string only once at most.
The space complexity is O(1), we only has a size 26 array and a few variables for this algorithm.