BoxLang Query Of Query Improvements and Additions
A few days ago, I blogged about how the new Query of Query support in BoxLang has greatly exceeded the performance of Adobe and Lucee's QoQ. If you haven't read that post, you can find it here:
I also said we had added a bunch of nice new features to BoxLang's QoQ, which I'll explain today. We built our QoQ support from scratch using a super fast ANTLR grammar based on SQLite which we customized for our needs. This gave us a great opportunity to incorporate some of those features out of the box.
ANSI JOIN syntax
Lucee and Adobe both support the so-called "table join" syntax where you can list more than one table in your FROM clause with a comma between them. Lucee allows more than two tables as well as table aliases. Adobe not only allows no more than 2 tables, it also doesn't support table aliases, which is a huge bummer. We support not only the ANSI join syntax like INNER JOIN, but we also have NO LIMIT on the number of tables you can join, and of course you can use table aliases.
SELECT u.name, d.name as departmentName FROM qryUsers u INNER JOIN qryDepartments d ON u.departmentID = d.departmentID
The way aliases are resolved is:
- You can specify the whole table name as qryUsers.name
- Or you can specify the table alias as u.name
- Or you can specify just the column name SO LONG AS the column name is only found in one table
ANSI join syntax will produce faster executing queries since we're able to apply the ON clause while we process each join, limiting the number of row combinations the WHERE clause needs applied to.
JOIN types
This is another huge addition-- filling in a big functionality gap CF has had for years.
CROSS JOIN
Lucee and Adobe CF only allow the equivalent of a CROSS JOIN, in which the Cartesian product of both tables is produced and you must filter out the rows you don't want in the WHERE clause. We also support this using the ANSI join sytax
SELECT * FROM table1 t1 CROSS JOIN table2 t2 WHERE t1.col = t2.col
CROSS JOINs cannot have an ON clause.
INNER JOIN
This is the most common join type. It creates a Cartesian product of both tables filtering the intersection based the boolean result of an ON clause.
SELECT c.name as colorName, f.name as fruitName FROM fruit f INNER JOIN colors c ON f.color = c.name
That query will only return rows for colors and fruits which have a match.
LEFT (OUTER) JOIN
LEFT JOIN, or LEFT OUTER JOIN is a type of outer join that contains one row for every row in the first table. Values for the second table will only be available where the intersection of the ON clause matched rows in the second query. Nulls will be present otherwise.
SELECT c.name as colorName, f.name as fruitName FROM fruit f LEFT OUTER JOIN colors c ON f.color = c.name
The above query would return one row for every fruit, whether or not it has a matching color in the colors table.
RIGHT (OUTER) JOIN
RIGHT JOIN or RIGHT OUTER JOIN is a type of outer jon that is the opposite of a LEFT join. It contains one row for every row in the second table. Values for the first table will only be available where the intersection of the ON clause matched rows in the first query. Nulls will be present otherwise.
SELECT c.name as colorName, f.name as fruitName FROM fruit f RIGHT OUTER JOIN colors c ON f.color = c.name
The above query would return one row for every color, whether or not it has a matching fruit in the fruit table.
FULL (OUTER) JOIN
FULL JOIN or FULL OUTER JOIN is a type of OUTER join that combines all rows from both tables. If the ON clause does does match one of the tables, nulls will be present.
SELECT c.name as colorName, f.name as fruitName FROM fruit f FULL OUTER JOIN colors c ON f.color = c.name
That query will return a row for every fruit whether or not it has a matching color, and every color whether or not it has a matching fruit.
LIMIT N and TOP N
Adobe does not support LIMIT or TOP. Lucee supports ONLY TOP. BoxLang supports both!
SELECT TOP 1 * from qryData
or
SELECT * from qryData LIMIT 1
In Adobe CF, you must use the maxRows option in the query, but this can ONLY be applied to the overall select, making it worthless to use with unions or nested selects.
Table aliases
Adobe CF doesn't support any sort of table aliases. Lucee does, and so does BoxLang.
SELECT * FROM qryData d
or
SELECT * FROM qryData as d
Subqueries
This is another super-powerful feature QoQ has been missing. The ability to take an entire select statement, wrap it up in parenthesis and use it as a subquery. It comes in two forms.
FROM or JOIN Subquery
You can replace any table name in a FROM or JOIN clause with a subquery in parentheses followed by a table alias. The select inside the parenthesis can have TOP, WHERE, JOINS, ORDER BY, or UNIONS. Basically anything and everything is possible. We only support non-correlated subqueries-- meaning they are fully contained and don't reference any tables outside of themselves.
SELECT * FROM ( SELECT TOP 1 * FROM qryData WHERE col = 'value' ORDER BY col desc ) d WHERE foo = 'bar' ORDER BY foo
or
SELECT e.*, s.name as supName FROM qryEmployees e INNER JOIN (SELECT * FROM qryEmployees WHERE supervisor = true) s on e.supervisor = s.name
IN or NOT IN Subquery
You can also nest an entire nested select inside the parentheses of an IN or NOT IN statement instead of providing a list of values. No alias is needed in this case, as this query can't be referenced in the SELECT clause. Your subquery must select a single column.
SELECT * FROM qryUsers where dept IN (SELECT departmentName FROM qryDept WHERE departmentType = 'Excecutive' )
CASE Statements
A super-handy feature of most DB's is the CASE statement, which is similar to an if/else or switch statement in CF. There are two flavors of case and we support them both. Case is an expression, so it can be used anywhere, including as the input to a function, WHERE clause, or ORDER BY.
Standard Case
The first CASE statement doesn't take an input expressions, but expects each WHEN to contain a boolean expression which evaluates true or false. The WHEN statements don't need to operate on the same value since they each have their own condition.
SELECT name, CASE WHEN name = 'brad' THEN 'me' WHEN name = 'luis' THEN 'boss' WHEN name = 'jacob' THEN 'Mr. Kansas City' ELSE 'default' END as title FROM qryEmployees
Input Case
This variation of the CASE statement is more like a switch in that you feed it an input expression which is then matched against each WHEN. In this case, each WHEN isn't given a boolean conditional, but instead an expression to compare to the input expression. This helps reduce boilerplate.
SELECT name, CASE name WHEN 'brad' THEN 'me' WHEN 'luis' THEN 'boss' WHEN 'jacob' THEN 'Mr. Kansas City' ELSE 'other' END as title FROM qryEmployees
Custom Function Registration
This is a really sweet addition. If you've ever wished you could inject a bit of custom logic into your SQL like with how SQLServer lets you register custom functions, you can do this now with ANY functional object. This means, any
- UDF
- Closure
- Lambda
- BIF (Built in Function)
- Java method reference
can all be registered with BoxLang to be usable as a custom function right inside your SQL.
Scalar
A scalar function is used to process a single cell from a table and return a single value.
queryRegisterFunction( "reverse", ::reverse ); q = queryExecute( "SELECT reverse( 'Brad' ) as rev", [], { dbType : "query" } );
Here we want to tap into the built in function reverse(), so we pass in a functional wrapper to the BIF. ::reverse is a new BoxLang-only syntax. The equivalent in CF would be this slightly more verbose version:
queryRegisterFunction( "reverse", ( input )=>reverse( input ) );
Aggregate
An aggregate function is used to process zero or more values from a given row at a time. Each argument is passed as an array of values, with nulls removed.
queryRegisterFunction( "arrayToList", ::arrayToList, "varchar", "aggregate" ) q = queryExecute( " SELECT arrayToList( name ) as names FROM qryEmployess ", [], { dbType : "query" } );
Here we pass in a BIF functional wrapper around the arrayToList() BIF to create a command-delimited list of the names. Note, we've also added in a first-class string_agg() and group_concat() function which does the same thing, but this is just a contrived example.
Remember, you don't have to just wrap up BIFs. You can pass any custom closure or UDF you want to, which runs any app logic you wish!
Functions
Adobe supports a pretty limited number of functions. BoxLang supports these functions out-of-the-box. Lucee does as well.
-
Math
-
abs() - absolute value
-
acos() - arccosine
-
asin() - arcsine
-
atan() - arctangent
-
cos() - cosine
-
sin() - sine
-
tan() tangent
-
exp() - returns e raised to the power of a specified number
-
sqrt() - square root
-
ceiling() - rounds up to next closest integer
-
floor() - rounds down to next closest integer
-
-
Null Handling
-
coalesce() - takes N args, returning the first non-null
-
isNull() - takes two args, returning the second if the first is null
-
-
String
-
left() - Take left n chars
-
right() - take right n chars
-
length() - return string length
-
rTrim() - trim whitespace on right side of string
-
lTrim() - trim whitespace on left side of string
-
lcase() - same as lower()
-
ucase() - same as upper()
-
concat() - concatenates any number of args into a single string
-
-
Misc
-
convert() - Same as cast, but allows type to be dynamic. convert( column, "varchar" ) is the same as cast( column as varchar )
-
We've also added the following aggregate functions, which are popular in other DBs
- string_agg() - Concatenate values using specified delimiter
- group_concat() - same as string_agg()
Operators
We've also added the following bitwise operators that neither Adobe or CF support.
- Binary
- ^ - bitwise Exclusive OR (XOR)
- & - bitwise AND
- | - bitwise OR
- Unary
- ! - bitwise NOT
Literals
Finally, we also have support for the following literal values. Lucee supports these as well, but Adobe CF does not.
- true/false
- null
UNION DISTINCT
There are 3 ways you can write a UNION in most databases
- UNION ALL (doesn't de-dupe rows)
- UNION DISTINCT (de-dupes rows)
- UNION (same as DISTINCT)
Adobe only supports UNION and UNION ALL. Lucee and BoxLang both support the UNION DISTINCT syntax as well, which is the same as just UNION, but it's nice to support this syntax as it's common in other DBs.
Conclusion
So, as you can see, there's a lot of new stuff going on in our Query of Query implementation. We're excited about the new features and productivity we've been able to add to take QoQ to the next level. If there's any features Adobe CF or Lucee are supporting that we've missed, please let us know! We've already incorporated all of Lucee's QoQ tests from their test suite into our BoxLang test suite and can confirm they are all already passing on BoxLang. The ANTLR grammar built on an existing community project is very easy to modify and make changes to. In the future, I'd love to see the ability to
- delete rows from a query
- update rows in a query
- insert rows into a query
all via query of queries. The pieces are in place, so it's just a matter of putting them together to make BoxLang the most powerful platform for CFML and BL code in existence!