BoxLang's QoQ Is Here, And It's 5x Faster Than Lucee, 17x Faster Than Adobe!
As BoxLang (our new CF-compatible JVM language) nears its final release, we're very pleased to announce that Query of Query support is ready for testing! QoQ often times draws a variety of reactions from people, but it's a really nice feature to run any SQL select you want against 1 or more in-memory queries for the purpose of filtering, aggregating, or joining. One of the biggest complaints is performance, which is why I've performed 2 rounds of performance enhancements to Lucee's QoQ support in the past which I have detailed here and here.
Building on the knowledge and experience I got from overhauling Lucee's QoQ, I incorporated those lessons into the architecture of every part of BoxLang's query object, and QoQ implementation. I'm pleased to have a fully-functional 100% Java implementation of QoQ which supports everything Adobe and Lucee QoQ support and much more. We're not using an HSQLDB fallback-- everything is pure Java and super optimized. I was able to make some big architectural changes since this was being built from the ground-up. I've also incorporated all of the QoQ unit tests from the Lucee TestBox test suite into BoxLang and they are all passing!
- Parallel Java Streams are incorporated everywhere for blazing performance
- SQL data types are tracked at every turn for optimized data operations without needing to cast values
- Data shuffling is minimized by performing as much work as possible on a stream of integers representing the rows of the query to be selected. The final result isn't built until needed
- Short circuit scenarios are detected and optimized (like using LIMIT with no ORDER BY can stop iterating as soon as the required number of rows are acquired)
- An ANTLR parsing grammar based on SQLite is used for fast parsing that is easily customizable and supports many new constructs that Lucee and Adobe don't support like CASE statements as INNER JOINs (more on this in a coming blog post)
- SQL Functions are implemented in a pluggable manner, allowing you to register custom SQL functions at runtime to use directly inside your QoQ
- A minimal locking strategy has been implemented that keeps the query objects thread safe, but with very little contention under multi-threaded use
I feel the need... the need for speed.
I'll put out another post soon detailing the new syntax features BoxLang has incorporated, which really create some substantial distance between what it's capable of and what you get with Adobe and Lucee. This post, however, is all about speed. My goal was to match or exceed the performance of the other engines. My first post on improving Lucee's QoQ used a test suite of a single 30,000 row query object. My second post, used a single query of 1 Million rows. Sadly, I couldn't include Adobe CF in the second post because it's performance at that scale was just dismal. I'm pleased to report that Adobe has made some improvements to their QoQ as it was able to hang with the other engines now, so I'll be including all 3:
- BoxLang 1.0.0-snapshot
- Lucee Server 6.1.1+118
- Adobe ColdFusion 2023.0.12.330713
This time, I'm using 3 query objects (for the join tests). The same data I used in the second post (1 Million rows) and 2 additional smaller queries of around 5 rows each. I'm basically re-using the exact same SQL selects I used in my previous post on Lucee, but I had to "dumb them down" a bit due to Adobe CF not supporting all the same features (like `true` as a literal value, `TOP 1`, table aliases, or the ucase() function). But for the most part, it's all the same as before to keep an even playing field across my tests and not "stack the deck" in a way that favors BoxLang.
The proof of the pudding is in the eating
Ok, time for the tasty stuff. Each set of numbers represents the milliseconds to complete the query, so a lower number is better because it was faster. I ran each select 3 times to warm up the server, recording the value of the last run. BoxLang outran Lucee and Adobe in every single test. Sometimes by a factor of 80 times faster! You may be wondering where I got the numbers "5x faster than Lucee, and 17x faster than Adobe" from in the blog title. What I did was calculate how many times faster BoxLang was in each of the tests (excluding any tests that did not run on all 3 engines) and then averaged those results. I tried to cover a wide variety of different QoQ examples, but it's worth noting the performance gains you see in your app will be a factor of what your SQL uses and how many rows you have in the queries. Lucee, and especially BoxLang do much better at larger queries and I am using 1 Million rows in each test. For QoQs ran against smaller datasets, the performance difference may be closer to only 2x (the difference of 2ms to 4ms, etc) I have a graph at the end which shows the improvement of each engine over a range of data set sizes.
Basic Select
SELECT name, age, upper( email ) as email, department, isContract, yearsEmployed, sickDaysLeft, hireDate, isActive, empID, favoriteColor, yearsEmployed*12 as monthsEmployed FROM employees WHERE age > 20 AND department IN ('Accounting','IT','HR') AND isActive = 1 ORDER BY department, isContract, yearsEmployed desc
BoxLang | Lucee | Adobe CF |
250 ms | 3,192 ms | 1,809 ms |
This is a basic non-grouping, non-aggregate select with a WHERE clause, a couple expressions in the SELECT list, and an ORDER BY. BoxLang is 13x faster than Lucee and 7x faster than Adobe CF.
Basic UNION
SELECT name, age, upper( email ) as email, department, isContract, isActive, empID, favoriteColor, yearsEmployed*12 as monthsEmployed FROM employees where age > 20 AND department = 'HR' AND isActive = 1 UNION SELECT name, age, upper( email ) as email, department, isContract, isActive, empID, favoriteColor, yearsEmployed*12 as monthsEmployed FROM employees where age > 20 AND department = 'Accounting' AND isActive = 0 order by department, name, email, age desc
BoxLang | Lucee | Adobe CF |
253 ms | 1,220 ms | 490 ms |
Here we have two non-grouping, non-aggregates selects UNION'ed together (DISTINCT is implied). BoxLang is 5x faster than Lucee and 2x faster than Adobe CF.
Non-Grouping Aggregate
SELECT max(age) as maxAge, min(age) as minAge, count(1) as theCount FROM employees where department IN ('Accounting','IT') AND isActive = 1
BoxLang | Lucee | Adobe CF |
173 ms | 573 ms | 4,290 ms |
Here we have simple non-grouping, aggregate select. It only outputs one row, so it's already fairly fast. BoxLang is 3x faster than Lucee and 25x faster than Adobe CF.
Grouped Aggregate
SELECT age, department, isActive, isContract, count(1) as theCount FROM employees where age > 20 AND isActive = 1 group by age, department, isActive, isContract HAVING count(1) > 3 ORDER BY age, department, isActive, isContract
BoxLang | Lucee | Adobe CF |
109 ms | 841 ms | 8,485 ms |
This select uses a WHERE, a GROUP BY, a HAVING, an ORDER BY and aggregate functions-- so basically everything! BoxLang is 8x faster than Lucee and 78x faster than Adobe CF. I'm honestly not sure why Adobe is so slow here, but I can say it is consistently slow with large data sets (see the last graph below).
Basic String Concat
SELECT name + department as departmentAge FROM employees
BoxLang | Lucee | Adobe CF |
289 ms | 377 ms | 4,041 ms |
This simply example highlights the benefits of BoxLang internally tracking all data types to optimize operations without needing to test and cast. We also use StringBuilder wherever possible for best performance. BoxLang is the same speed as Lucee and 14x faster than Adobe CF.
Basic LIKE operator
SELECT * FROM employees where name like '%Harry%'
BoxLang | Lucee | Adobe CF |
116 ms | 341 ms | 673 ms |
This is another simple test like above, but a very common use case. BoxLang caches compiled Regex patterns. BoxLang is 3x faster than Lucee and 6x faster than Adobe CF.
2-Table Join
SELECT employees.name, departments.name as department, slogan FROM employees, departments WHERE employees.department = departments.name and employees.isActive = 1 and age > 20 AND department IN ('Accounting','IT','HR') order by employees.name
BoxLang | Lucee | Adobe CF |
596 ms | 2,315 ms | 2,107 ms |
The join tests are new. Lucee and Adobe only support the "comma join" syntax, which is an inherent CROSS JOIN, with the limiting condition in the WHERE clause. Here we have a simple join from our employees table (1 million rows) to our departments table (5 rows). Again, BoxLang was architected in such a way that it doesn't create any intermediate copies of the joined data. Instead, it deals with a Java Stream of int[] arrays representing the intersections of the rows from each table. This is a very lightweight and fast data structure and we only build up the final data at the end. BoxLang is 4x faster than Lucee and 4x faster than Adobe CF.
3-Table Join
SELECT employees.name, departments.name as department, slogan, hexcode FROM employees, departments, colors WHERE employees.department = departments.name and employees.favoriteColor = colors.name and employees.isActive = 1 and age > 20 AND department IN ('Accounting','IT','HR') order by employees.name
INNER JOIN version (BoxLang only)
SELECT e.name, d.name as department, slogan, hexcode FROM employees e inner join departments d on e.department = d.name inner join colors c on e.favoriteColor = c.name WHERE e.isActive = 1 and e.age > 20 AND e.department IN ('Accounting','IT','HR') order by e.name
BoxLang | BoxLang INNER JOIN | Lucee |
939 ms | 546 ms | 2,892 ms |
Here we lose Adobe CF as it has an imposed limit of only 2 tables being joined at a time. Lucee (via HSQLDB) supports more than 2 tables. BoxLang has gone a step further, and also supports ANSI join syntax (INNER JOIN) which is superior as it allows us to optimize our filtering of the rows since we can apply the ON clause of each join as we process them, limiting the Cartesian product the where clause operates on. Adobe CF is not represented here, but we have Lucee and BoxLang (running the "comma join" syntax) and then BoxLang AGAIN (running the INNER JOIN syntax) to show how it's even better. BoxLang's "comma join" is 3x faster than Lucee, and BoxLang's INNER JOIN is 5x faster than Lucee. The result of this test were not included in the average performance times in the blog title since Adobe does not support this query.
Multi-Table UNION
SELECT name, department FROM employees WHERE department = 'Accounting' UNION SELECT name, department FROM employees WHERE department = 'It' UNION SELECT name, department FROM employees WHERE department = 'HR' UNION SELECT name, department FROM employees WHERE department = 'Executive' UNION SELECT name, department FROM employees WHERE department = 'Janitorial' order by department, name
BoxLang | Lucee | Adobe CF |
1,123 ms | 6,989 ms | 2,809 ms |
I have a 2-table UNION above, but I added this test to specifically showcase some optimizations BoxLang implements using using DISTINCT UNIONs. We wait to de-dupe the final result until we reach the last distinct union, preventing unnecessary processing. (UNION is inherently DISTINCT unless you specify UNION ALL) BoxLang is 6x faster than Lucee and 3x faster than Adobe.
Grouped Aggregate Across Different Data Sizes
SELECT age, department, isActive, isContract, count(1) as _count FROM employees where age > 20 AND isActive = 1 group by age, department, isActive, isContract HAVING count(1) > 3 ORDER BY age, department, isActive, isContract
I'm not providing the full data here for brevity, but I'll gladly share it with anyone who is curious.
BoxLang | Lucee | Adobe CF |
You may recognize this query from above. It's a basic grouped, ordered select. I ran that query roughly 50 times on a data set starting at 20 rows and working its way up to just over 1 Million rows. You can see the relative performance stays close under around 20,000 rows, but then things really start to spread out. Adobe's performance appears to get exponentially worse. Lucee slows down linearly, but at a much steeper angle than BoxLang, which stays VERY fast as the data set grows. BoxLang gets up to 10x faster than Lucee and almost 90x faster than Adobe CF! The result of this test were not included in the average performance times in the blog title since we included this query above already.
BoxLang | Lucee | Adobe CF |
Let's zoom in our Y axis a little so we can get a better look at Lucee and BoxLang. While Lucee nears 1 second executions, BoxLang barely breaks 100 ms.
Conclusion
- Adobe CF has gotten better in the past, seeing as how it was so bad 2 years ago, I couldn't even include it in the comparison, so that's a plus for them
- Lucee is overall pretty fast given all the work I've put into it. It also uses parallel Java streams and more internally.
- BoxLang is still beating the socks off of both of them!
- Please give your app a test on BoxLang and report any issues you may have with QoQ or log feature requests.