Confessions of a speed junky: How I made my code faster
Posted by
Brad Wood
Jul 18, 2008 08:33:00 UTC
The past couple of days I've been messing around with a couple of functions, cleaning them up a bit to blog about them. One of them is for color-coding SQL and the other for highlighting differences in two strings. Both are pretty small, but very repetitive in what they do. Depending on the size of the text you are processing, performance varied. Sometimes the code inside was repeated hundreds of thousands of times given a large enough test.
I originally wrote the code on ColdFusion 7 and had been pleased with its behavior, but the speed always left a little to be desired. The first time I ran both of the functions on ColdFusion 8 I was absolutely floored. The same code ran dozens of times faster on a SLOWER server. The only difference: ColdFusion 8. That really is a testament to Adobe and the performance tuning they did in Scorpio. However, I wasn't totally satisfied with that. I wanted to clean up performance on ColdFusion 7 as well.
This code is definitely atypical. It does not represent the way 99% of your application works. Generally it makes no difference if one ColdFusion function takes 20 ms longer than another. When you only call the dang thing twice you don't notice. Everyone's knee-jerk reaction to performance testing is to make up a cfloop that iterates a few hundred thousand times and time the code. Generally that sort of test has absolutely no resemblance to a real life app. In this particular case though, it would actually be a bit more applicable.
I didn't need to write any loops, the code already looped enough. I just needed to create a big enough test to keep the server busy for about 30 seconds or so while I looked at what it was up to. I use SeeFusion from WebApper to monitor my servers and I swear by it-- it is a very useful program. The slower lines of code are accentuated when you are running them a million times. It's kind of like when a drywall finisher turns off the lights and uses one bright spot light to illuminate all the small imperfections for the final sanding. I ran a stack trace several times over while running the code, and certain lines immediately stood out as clear bottlenecks. These finds are probably specific to my application, and I'm not suggesting you make any of these changes yourself, but for the record this is what I changed:
StringBuffer
Right off the bat, I saw a whole bunch of String.concat() going on. By default ColdFusion strings are stored as Java String objects, which are immutable, meaning they cannot be changed after creation. Take the following code:
[code]
<cfscript>
foo = "Test1";
foo = foo & "Test2";
foo = foo & "Test3";
foo = foo & "Test4";
</cfscript>
[/code]
Each time I append text to foo, Java creates a new String holding the concatenated value and throws away the previous one. This is harmless in small amounts, but consumes tons of memory and CPU when done a few million times. If you can get away with it, the solution is to use <cfsavecontent>. This tag creates a buffer in memory and increases its size as needed. If cfsavecontent just won't work then our answer lies in Java! This code creates an instance of a StringBuffer object with an (optional) initial capacity of 50. As text is appended, the StringBuffer grows to accommodate it.
[code]
foo = createObject("java","java.lang.StringBuffer").init(javacast("int",50));
foo.append("Test1");
foo.append("Test2");
foo.append("Test3");
foo.append("Test4");
[/code]
To get your text back out, simply call the toString() method of your StringBuffer object. This brought the largest single gain, cutting the execution time by more than half!
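For comparison, here is a rough sketch of the cfsavecontent approach mentioned earlier, next to a StringBuffer filled in a loop. The loop bounds and variable names are just for illustration, and the tags inside cfsavecontent are kept on one line because any whitespace between them would end up in the captured string.
[code]
<!--- cfsavecontent: output generated inside the tag is captured into the variable --->
<cfsavecontent variable="fooSaved"><cfoutput><cfloop from="1" to="4" index="i">Test#i#</cfloop></cfoutput></cfsavecontent>

<!--- StringBuffer: append in a loop, then convert back to a string --->
<cfscript>
fooBuffer = createObject("java", "java.lang.StringBuffer").init(javaCast("int", 50));
for (i = 1; i lte 4; i = i + 1) {
    fooBuffer.append("Test" & i);
}
fooText = fooBuffer.toString();
</cfscript>
[/code]
Both fooSaved and fooText should end up holding the same concatenated text.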
Compare instead of eq
My string compare function used the eq operator quite a lot, and my stack traces showed a LOT of parseDouble() calls and NumberFormatExceptions being thrown. A nasty little habit of CF7 is to test a string to see if it is numeric. If it is, CF will treat it as a number (which I assume performs better?). If not, an error is thrown by the parseDouble method and caught internally, and the string is then treated as, well, a string. All this casting and error handling was very costly, and I knew my strings would rarely be numbers anyway. I switched over to the compare function, which assumes you are using strings, and got another sizable increase in overall speed.
[code]
<!--- Before --->
<cfif string1 eq string2>
<!--- After --->
<cfif compare(string1,string2) eq 0>
[/code]
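One thing to watch when making this swap: eq ignores case when comparing strings, while compare() is case-sensitive, so the two are not exact equivalents. compareNoCase() keeps the case-insensitive behavior. A minimal sketch (the variable names are made up for illustration):
[code]
<cfscript>
a = "Select";
b = "SELECT";

sameEq      = (a eq b);                      // true: eq ignores case
sameCompare = (compare(a, b) eq 0);          // false: compare() is case-sensitive
sameNoCase  = (compareNoCase(a, b) eq 0);    // true: case-insensitive comparison
</cfscript>
[/code]
If your strings can differ only by case, compareNoCase() is the closer replacement for eq.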
reFind instead of reFindNoCase
This one was small, but measurable. My SQL Color Coder uses some simple regex functions and I noticed my traces were spending a fair amount of time in them-- specifically dealing with changing the case of text. I had used the "NoCase" versions out of habit, but didn't really need case-insensitivity. When I switched from reFindNoCase to reFind and from reReplaceNoCase to reReplace I got another small boost in performance.
Overall, I was able to get the code about 5-12 times faster, which means a lot when it goes from two minutes down to 10 seconds.
Once again, I'll note that this kind of performance tuning is totally unnecessary for most scenarios. In fact I've had people tell me I should NEVER use one function over another because later versions of ColdFusion (or alternative CFML engines) may perform differently. I say Phooey. If you know your app currently runs on a version where a certain method will make your code twice as fast and cut a 5 minute job in half, I say do it. What's the worst that can happen? Your code will run better for now? You can always refactor if necessary on a newer version. In fact I wouldn't recommend trying to tune something that isn't giving you problems anyway-- it's probably fine the way it is.
I'm not going to go as far as to say "Premature optimization is the root of all evil" though. I don't like that quote too awful much because I think it gives people a false sense of security when they code-- as if they can just fling code at the server without really thinking about it and then mop up their mess later. I know that isn't the meaning of it, but I guess I'll save that rant for another blog post.
Larry C. Lyons
In tests I've run comparing string concatenation to cfsavecontent and the Java StringBuffer, cfsavecontent has been the fastest of the three methods. For string concatenation, I've seen times like 27 seconds vs. about 160 ms or less for cfsavecontent, while using the Java StringBuffer was around 420 ms. One caveat: this test was run using CF 7.02 on a WinXP box. Let me know if you'd like a copy of the code and I can send it along.
Here are the results of one test (iteration of 50000 in each case):
String concatenation (string & string): 28297 ms, string length: 650000
cfsavecontent: 156 ms, string length: 650001
Java StringBuffer: 422 ms, string length: 650001
regards, larry
Ben Nadel
I went through a big "Compare" and "CompareNoCase" phase for a while. But I just never got comfortable with the fact that "NOT" meant "Equal"... i.e., that NOT Compare() was the affirmation of equality. Even after months of using it, I would always stumble when reading the code I wrote. I eventually just moved back to using "EQ". It's slower, but my coding is faster :)
Dan Wilson
Brad,
I found this to be a pretty interesting post and a good examination into how things work under the covers. Thanks for sharing.
DW
Brad Wood
@Larry: I'd love to see your test code. I actually made some similar code a while back when the whole csv thing came up on the talk list. Mine also reported on the memory increase. Maybe I can find it again...
@Ben: Yeah, the fact that the compare returns 0 if it matches kind of makes sense since it also looks for greater and less than, but I agree with you-- it does make me do a double take to see "not compare()" in the code.
David Stockton
Regarding Ben's comment:
I definitely agree - reading "compare()" can a) take a lot of getting used to and b) make reading your code slower.
But, as Brad says - you're probably going over an existing piece of slow code, trying to optimize it. In which case the compare() technique should definitely be in your arsenal :)
@Brad - Nice little summary post for people that may not have seen this before.
Dan Sorensen
Thinking about Ben's comments, I wonder if it would simplify matters AND retain the speed gains to encapsulate the fastest methods mentioned within a UDF.
Brad Wood
@Dan S: that's an interesting thought. I'm a little leery of the complication that might add into some otherwise simple code. I guess I did do something similar to that once when I needed a fast script version of cfparam. Scope hunting was killing my isDefined, so I made a UDF that called structKeyExists and called it "structparam". Perhaps I should blog it. At the time it made a pretty big difference in a block of code that got hit pretty heavy each time it was called to create some XML from form fields.
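For anyone curious, a UDF along those lines might look roughly like this. This is a sketch rather than the original function, and the function signature and argument names are assumptions:
[code]
<cfscript>
// Hypothetical sketch of a cfparam-like helper:
// set a default only if the key is missing from the given struct or scope
function structParam(scope, key, defaultValue) {
    if (not structKeyExists(scope, key)) {
        scope[key] = defaultValue;
    }
}

// Example call: default form.firstName to an empty string if it was not submitted
structParam(form, "firstName", "");
</cfscript>
[/code]
Because the struct is named explicitly, structKeyExists only has to check that one struct instead of searching through scopes.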
Dan Sorensen
Well to answer Ben's issue directly, in the case of compare a UDF could be created that would return true or false. It could be very simple:
[code]
if (fastcompare(a, b)) {
    // true
} else {
    // false
}
[/code]
(assuming that someone abstracts compare into fastcompare()).
Even cfsavecontent could be put into a nice UDF.
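A minimal sketch of such a wrapper, using the fastcompare name from above. It is untested for speed, and the extra function call may eat into some of the gain:
[code]
<cfscript>
// Returns true when the two strings match exactly (case-sensitive), false otherwise
function fastcompare(a, b) {
    return compare(a, b) eq 0;
}
</cfscript>
[/code]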
Henry
Why StringBuffer when u can use the newer and better StringBuilder?
http://java.sun.com/docs/books/tutorial/java/data/buffers.html http://java.sun.com/j2se/1.5.0/docs/api/java/lang/StringBuilder.html
As long as you var your variables correctly, you shouldn't have a problem with thread safety.
Brad Wood
@Henry: I would have to look again, but for some reason I was thinking StringBuilder didn't come along until Java 5 or 6. I wanted this code to run on CF7 (JDK 1.4).
Please correct me if I am wrong though.
Larry C. Lyons
@Henry: I'm going to have to have a look at StringBuilder.
Brad Wood
I double checked and StringBuilder was in fact introduced in Java 5. If you are using ColdFusion 8 and aren't concerned with compatibility with CF7, I would recommend trying it out as opposed to StringBuffer.
Peter J. Farrell
FYI, there are some problems using java.lang.StringBuilder on Adobe CF8 on JVM 1.6. It seems that length() and substring() error out saying the methods are missing. Appears to be a problem with the CFJavaProxy in the CFML engine. The problem does not exist on JVM 1.5. So go figure.
Brad Wood
@Peter: Usually when a Java class tells you it can't find a method, it is because you aren't passing in the correct arguments of the correct type. Since Java allows for method overloading (multiple methods of the same name in a class with different parameters/types), if there is a method whose signature is foo(int bar) and you call it with a string (which is the default form of most CF variables) it will tell you it couldn't find the method. What it really means is it couldn't find a method named "foo" which took a string as a parameter. To get around this, use the javaCast() function in ColdFusion to make sure your parameters are being cast to the correct type. Also, make sure you are supplying the correct number of parameters. Check out the Java Docs here: http://java.sun.com/javase/6/docs/api/java/lang/StringBuilder.html
(Note there are two substring methods. Tell Java which one you want based on the parameters you provide.)
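As a rough sketch of the javaCast() approach (the variable names are made up, and this has not been verified against the CF8 on JVM 1.6 setup Peter describes):
[code]
<cfscript>
sb = createObject("java", "java.lang.StringBuilder").init();
sb.append("Hello, world");

// substring(int start): cast the argument so the int overload is matched
tail = sb.substring(javaCast("int", 7));

// substring(int start, int end): two int arguments select the other overload
head = sb.substring(javaCast("int", 0), javaCast("int", 5));
</cfscript>
[/code]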
Hope that helps!