XYZ CODE: Sub-String Search algorithm comparison

Sub-String Search

Algorithm	time complexity (worst)	time complexity (typical)	extra space	backup in place	notes
brute force	MN	1.1N	1	yes	simple and effective
K-M-P	2N	1.1N	MR	no	other versions trade off worst time performance with space
Boyer-Moore	MN	N/M	R	yes	can further optimize
Rabin-Karp	7N	7N	1	no	Monte Carlo, tiny chance of incorrect
Regular Expression	MN	?	M	yes	M is the length of regular expression

We have studied how to search a key in a collection of keys, for example, searching a string in a collection of strings. In order to search, we need to repeatedly compare 2 keys, for example compare 2 Strings. However, we haven't studied how to search a string in one super long string.

Since string is the carrier of almost all modern informations (DNA for example), this simple operation is so common and so important, that mankind never stops searching for better way of doing it.

If the message we need to search is explicit, the cost is typically 1.1N, a little bit more than reading through the the book of N characters. The best we can do is skippy read and find the answer in N/M. If the message we need to search is a riddle, the typical cost is uncertain, the cost is at most MN, we exhaust all possible meaning of M and check if N can match.

The primitive brute force algorithm is very cost effective in most cases, but we can push the limit with cleverly designed DFAs and NFAs.

XYZ CODE

Sub-String Search algorithm comparison

Sub-String Search

No comments:

Post a Comment