While building a text editor, I needed to implement search, since no text editor would be complete without it! But how would I implement it?
Before giving in to the default, knee-jerk reaction of “use string.IndexOf,” I thought to myself, “hey, wait a minute: regular expressions are supposed to be fast.” In fact, much of their appeal is that they’re efficiently implemented.
What if I used a regular expression to search instead? I could simply put the search text in (properly escaped), and a match would reveal the location of the text.
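To make the idea concrete, here is a minimal sketch in Python (an assumption on my part, since the post doesn't show its code). Python's `re.escape` and `re.search` stand in for the escaping step and for Regexp.Match, and `find_with_regex` is a hypothetical helper name used only for illustration.

```python
import re


def find_with_regex(text, needle):
    """Return the index of the first literal occurrence of needle, or -1.

    Hypothetical helper: re.escape neutralises regex metacharacters so the
    needle is matched as plain text, and the match's start() is its location.
    """
    pattern = re.compile(re.escape(needle))
    match = pattern.search(text)
    return match.start() if match else -1


text = "price list: a+b costs $5 (approx.)"
print(find_with_regex(text, "a+b"))      # same position str.find would report
print(find_with_regex(text, "$5"))       # "$" is treated literally, not as an anchor
print(find_with_regex(text, "missing"))  # -1 when the text is absent
```

Without the escaping step, a search for something like `a+b` would be read as a pattern ("one or more a, then b") and would miss the literal text.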
I decided to run a test. It consisted of searching through 10,000 words’ worth of text for a single word, which occurred only once, at the very end of the text. I ran a for-loop in which I used either IndexOf or a regexp’s Match to find the result.
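A test along these lines could be sketched as follows. This is a Python approximation, not my original harness: `str.find` stands in for IndexOf, a precompiled `re` pattern stands in for the regexp's Match, and the filler text, the `build_text` and `time_search` helpers, and the search word are all made up for illustration. Timings depend heavily on the language and runtime, so don't expect this sketch to reproduce any particular numbers.

```python
import re
import time


def build_text(word_count=10_000, needle="needle"):
    # Filler words, with the needle appearing exactly once at the very end.
    return " ".join(["filler"] * (word_count - 1) + [needle])


def time_search(search, iterations=5000):
    # Call search() repeatedly; return the last position found and the elapsed seconds.
    position = -1
    start = time.perf_counter()
    for _ in range(iterations):
        position = search()
    elapsed = time.perf_counter() - start
    return position, elapsed


needle = "needle"
text = build_text(needle=needle)
pattern = re.compile(re.escape(needle))  # compile once, outside the loop


def regex_search():
    match = pattern.search(text)
    return match.start() if match else -1


find_pos, find_secs = time_search(lambda: text.find(needle))
regex_pos, regex_secs = time_search(regex_search)

print(f"str.find:  position {find_pos}, {find_secs:.3f} s")
print(f"re.search: position {regex_pos}, {regex_secs:.3f} s")
```

Checking that both approaches report the same position is a cheap sanity check that the comparison is apples to apples before reading anything into the timings.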
And the result? The outcome wasn’t surprising, but the size of the difference was: for 5000 iterations of searching, IndexOf clocked in at 18 seconds, and Regexp.Match clocked in at one-quarter of a second.
Wait a sec. That means that Regexp.Match, in this case, is about 72 times faster than IndexOf (18 seconds versus 0.25 seconds). Wow. That’s nearly two orders of magnitude.
Based on this test, I would therefore recommend avoiding IndexOf in favour of Regexp.Match for sizable texts (say, more than 1000 words or so). It might actually save you a ton of time. And of course, you need to properly escape your search string, so that any regular-expression special characters the user enters are treated as literal text.