Site Search:

Sub-String Search algorithm comparison

back>

Sub-String Search

Algorithm time complexity (worst) time complexity (typical) extra space backup in place notes
brute force MN 1.1N 1 yes simple and effective
K-M-P 2N 1.1N MR no other versions trade off worst time performance with space
Boyer-Moore MN N/M R yes can further optimize
Rabin-Karp 7N 7N 1 no Monte Carlo, tiny chance of incorrect
Regular Expression MN ? M yes M is the length of regular expression

We have studied how to search a key in a collection of keys, for example, searching a string in a collection of strings. In order to search, we need to repeatedly compare 2 keys, for example compare 2 Strings. However, we haven't studied how to search a string in one super long string.

Since string is the carrier of almost all modern informations (DNA for example), this simple operation is so common and so important, that mankind never stops searching for better way of doing it.

If the message we need to search is explicit, the cost is typically 1.1N, a little bit more than reading through the the book of N characters. The best we can do is skippy read and find the answer in N/M. If the message we need to search is a riddle, the typical cost is uncertain, the cost is at most MN, we exhaust all possible meaning of M and check if N can match.

The primitive brute force algorithm is very cost effective in most cases, but we can push the limit with cleverly designed DFAs and NFAs.