|
Executive Summary: The non-optimal version of string addition/concatenation in UserTalk - text = text + "b" + "r\n" - is exponentially slower than the optimised version for non-trivial strings. It's worth looking at your own code to see if you can make similar optimisations. The optimised version scales linearly with string length.
Caveat: As usual Amdahl's Law applies and your degree of speedup will be determined by the percentage of your code that is executing such string concatenations.
To while away some time last night and this morning I thought I'd benchmark string addition. One of the techniques I used was incorporated into system.verbs.builtins.radio.html.viewNewsItems in update #0134 - thanks for the acknowledgement Userland. But I hadn't twigged to the optimisation used in update #0144. Namely, the parenthesisation of a (string) sub-expression. Here are the relevant announcements:
system.verbs.builtins.radio.html.viewNewsItems
1/25/2002; 10:40:59 AM (#0144): Another optimization for the News
Aggregator page. In the local "add" procedure, we parenthesize the
text we're adding to htmltext, this triggers an ancient optimization
in the kernel that doesn't do any handle copying on string
assignments of the form x = x + s. Before parenthesizing average
time was 104 ticks, after -- 45! More than double the speed. Worth
doing.
1/24/2002; 7:07:47 PM (#0134): Tweak for performance. Uncached news
page went from approx 220 ticks to approx 40 ticks. Thanks to Duncan
Smeed for the trick to get add () to be so much faster. We'll surely
use this in other places.
So, I hacked together the following benchmark timer. You might find the results interesting. Incidentally, the rationale for using a string of initial length 100,000 bytes is that the News Agggregator generates a page greater in size than that from my (paltry) subscription to 13 news sources. Shorter strings of course may not see the same degree of dramatic speed-up but IMHO any speed-up, no matter how small, is worthwhile. The generation of the 'news' page is much faster than it once was so I was curious to see what sort of contribution each type of optimisation made. Here's the benchmarker. It's not pretty but it does the job.
Notice the 4 different versions of the add() macro. add1() is the usual (non-optimal) version that was origianlly used in Radio. add2() incorporates the 'semicolonic optimisation' used in update #0134. add3() uses the parenthesisation of a (string) sub-expression optimisation but without the semicolonic optimisation. And, finally, add4() uses both optimsation techniques. For comparison purposes, I used an in-line string addition in the body of the inner loop since this was without the overhead of an add() call/return. Since the same 'side-effect' affected the inner for-loop, the non-optimial version of the string addition - text = text + "b" + "r\n" - is exponentially slower than the speed of the optimised version.
Just for kicks I tried a base string of one million characters. The non-optimal version took Tot: 3357 Ave: 335 Min: 288 Max: 379. The optimal version took Tot: 24 Ave: 2 Min: 1 Max: 9, a 150-fold speed-up on average. There was little discernible difference with a 10,000 character string so I tried doubling the length of the string each time:
- 20,000 chars: two-fold improvement
- 40,000 chars: eight-fold improvement
- 80,000 chars: twelve-fold improvement
- 160,000 chars: fifty-fold improvement
Here's the function:
on timer () {
local {
text = string.filledString('a',100000);
i,tc,maxtc,mintc,totaltc,elapsedtc,result=""};
on add1 (s) {
text = text + s + "r\n"};
on add2 (s) {
text = text + s + "r\n";};
on add3 (s) {
text = text + (s + "r\n")};
on add4 (s) {
text = text + (s + "r\n");};
totaltc=clock.ticks();
maxtc=0;
mintc=infinity;
for i = 1 to 10 {
tc = clock.ticks ();
for i = 1 to 100 {add1("b")};
elapsedtc=(clock.ticks () - tc);
if (elapsedtc > maxtc) {
maxtc = elapsedtc};
if (elapsedtc < mintc) {
mintc = elapsedtc}};
totaltc = (clock.ticks() - totaltc);
result = "-> Tot: " + totaltc + "tAve: " + totaltc/10 + "tMin: " + mintc + "tMax: " + maxtc;
clipboard.put(stringType,@result);
dialog.alert (result)};
bundle { //test code
timer ()}
Here are the results:
- for i = 1 to 100 {text = text + "b" + "r\n"}
- -> Tot: 291 Ave: 29 Min: 20 Max: 46
- for i = 1 to 100 {text = text + "b" + "r\n";}
- -> Tot: 213 Ave: 21 Min: 19 Max: 26
- for i = 1 to 100 {text = text + ("b" + "r\n")}
- -> Tot: 132 Ave: 13 Min: 11 Max: 18
- for i = 1 to 100 {text = text + ("b" + "r\n");}
- -> Tot: 8 Ave: 0 Min: 0 Max: 3
- for i = 1 to 100 {add1("b")}
- -> Tot: 274 Ave: 27 Min: 25 Max: 30
- for i = 1 to 100 {add2("b")}
- -> Tot: 142 Ave: 14 Min: 13 Max: 16
- for i = 1 to 100 {add3("b")}
- -> Tot: 148 Ave: 14 Min: 13 Max: 18
- for i = 1 to 100 {add4("b")}
- -> Tot: 13 Ave: 1 Min: 1 Max: 4
|