Chapter 1: Introduction
CQL was designed as a compiler for the SQLite runtime system. SQLite lacks stored procedures but has a rich C runtime interface that allows you to create any kind of control flow mixed with any SQL operations that you might need. However, SQLite’s programming interface is both verbose and error-prone in that small changes in SQL statements can require significant swizzling of the C code that calls them. Additionally, many of the SQLite runtime functions have error codes which must be strictly checked to ensure correct behavior. In practice, it’s easy to get some or all of this wrong.
CQL simplifies this situation by providing a high-level SQL language not unlike the stored procedure forms that are available in client/server SQL solutions and lowering that language to “The C you could have written to do that job using the normal SQLite interfaces.”
As a result, the C generated is generally very approachable but now the source language does not suffer from brittleness due to query or table changes, and CQL always generates correct column indices, nullability checks, error checks, and the other miscellany needed to use SQLite correctly.
CQL is also strongly typed, whereas SQLite is very forgiving with regard to what operations are allowed on which data. Strict type checking is much more reasonable given CQL’s compiled programming model.
NOTE: CQL was created to help solve problems in the building of Meta Platforms’s Messenger application, but this content is free from references to Messenger. The CQL code generation here is done in the simplest mode with the fewest runtime dependencies allowed for illustration.
Getting Started
Before starting this tutorial, make sure you have built the cql
executable first in
Building CG/SQL
The “Hello World” program rendered in CQL looks like this:
-- needed to allow vararg calls to C functions
declare procedure printf no check;
create proc hello()
begin
call printf("Hello, world\n");
end;
This very nearly works exactly as written but we’ll need a little bit of glue to wire it all up.
First, assuming you have
built
cql
, you should have the power to do this:
$ cql --in hello.sql --cg hello.h hello.c
This will produce the C output files hello.c
and hello.h
which can be readily compiled.
However, hello.c will not have a main
– rather it will have a function like this:
...
void hello(void);
...
The declaration of this function can be found in hello.h
.
NOTE:
hello.h
attempts to includecqlrt.h
. To avoid configuring include paths for the compiler, you might keepcqlrt.h
in the same directory as the examples and avoid that complication. Otherwise, you must make arrangements for the compiler to locatecqlrt.h
, either by adding it to anINCLUDE
path or by incorporating some-I
options to aid the compiler in finding the source.
That hello
function is not quite adequate to get a running program, which
brings us to the next step in getting things running. Typically, you have some
kind of client program that will execute the procedures you create in CQL. Let’s
create a simple one in a file we’ll creatively name main.c
.
A very simple CQL main might look like this:
#include <stdlib.h>
#include "hello.h"
int main(int argc, char **argv)
{
hello();
return 0;
}
Now we should be able to do the following:
$ cc -o hello main.c hello.c
$ ./hello
Hello, world
Congratulations, you’ve printed "Hello, world"
with CG/SQL!
Why did this work?
A number of things are going on even in this simple program that are worth discussing:
- the procedure
hello
had no arguments, and did not use the database- therefore its type signature when compiled will be simply
void hello(void);
so we know how to call it - you can see the declaration for yourself by examining the
hello.c
orhello.h
- therefore its type signature when compiled will be simply
- since nobody used a database, we didn’t need to initialize one
- since there are no actual uses of SQLite, we didn’t need to provide that library
- for the same reason, we didn’t need to include a reference to the CQL runtime
- the function
printf
was declared “no check”, so calling it creates a regular C call using whatever arguments are provided, in this case a string - the
printf
function is declared instdio.h
which is pulled in bycqlrt.h
, which appears inhello.c
, so it will be available to call in the generated C code - CQL allows string literals with double quotes, and those literals may have most C escape sequences in them, so the “\n” bit works
- Normal SQL string literals (also supported) use single quotes and do not allow, or need escape characters other than
''
to mean one single quote
- Normal SQL string literals (also supported) use single quotes and do not allow, or need escape characters other than
All of these facts put together mean that the normal, simple linkage rules result in an executable that prints the string “Hello, world” and then a newline.
Variables and Arithmetic
Borrowing once again from examples in “The C Programming Language”, it’s possible to do significant control flow in CQL without reference to databases. The following program illustrates a variety of concepts:
-- needed to allow vararg calls to C functions
declare procedure printf no check;
-- print a conversion table for temperatures from 0 to 300
create proc conversions()
begin
-- not null can be abbreviated with '!'
declare fahr, celsius int!;
-- variable type can be implied
-- these are all int not null (or int!)
let lower := 0; /* lower limit of range */
let upper := 300; /* upper limit of range */
let step := 20; /* step size */
-- this is the canonical SQL assignment syntax
-- but there are shorthand versions available in CQL
set fahr := lower;
while fahr <= upper
begin
-- top level assignment without 'set' is ok
celsius := 5 * (fahr - 32) / 9;
call printf("%d\t%d\n", fahr, celsius);
-- the usual assignment ops are supported
fahr += step;
end;
end;
You may notice that both the SQL style --
line prefix comments and the C style /* */
forms
are acceptable comment forms.
Like C, in CQL all variables must be declared before they are used. They remain in scope until the end of the
procedure in which they are declared, or they are global scoped if they are declared outside of any procedure. The
declarations announce the names and types of the local variables. Importantly, variables stay in scope for the whole
procedure even if they are declared within a nested begin
and end
block.
The most basic types are the scalar or “unitary” types (as they are referred to in the compiler)
type | aliases | notes |
---|---|---|
integer | int | a 32 bit integer |
long | long integer | a 64 bit integer |
bool | boolean | an 8 bit integer, normalized to 0/1 |
real | n/a | a C double |
text | n/a | an immutable string reference |
blob | n/a | an immutable blob reference |
object | n/a | an object reference |
X not null | x! | ! means not null in types |
NOTE: SQLite makes no distinction between integer storage and long integer storage, but the declarations tell CQL whether it should use the SQLite methods for binding and reading 64-bit or 32-bit quantities when using the variable or column so declared.
There will be more notes on these types later, but importantly, all keywords and
names in CQL are case insensitive just like in the underlying SQL language.
Additionally all of the above may be combined with not null
to indicate that a
null
value may not be stored in that variable (as in the example). When
generating the C code, the case used in the declaration becomes the canonical
case of the variable and all other cases are converted to that in the emitted
code. As a result, the C remains case sensitively correct.
The size of the reference types is machine dependent, whatever the local pointer
size is. The non-reference types use machine independent declarations like
int32_t
to get exactly the desired sizes in a portable fashion.
All variables of a reference type are set to NULL
when they are declared,
including those that are declared NOT NULL
. For this reason, all nonnull
reference variables must be initialized (i.e., assigned a value) before anything
is allowed to read from them. This is not the case for nonnull variables of a
non-reference type, however: They are automatically assigned an initial value of
0, and thus may be read from at any point.
The programs execution begins with three assignments:
let lower := 0;
let upper := 300;
let step := 20;
This initializes the variables just like in the isomorphic C code. Statements
are separated by semicolons, just like in C. In the above, the data type of the
variable was inferred because the let
keyword was used.
The table is then printed using a while
loop.
while fahr <= upper
begin
...
end;
The above has the usual meaning, with the statements in the begin/end
block
being executed repeatedly until the condition becomes false.
The body of a begin
/end
block such as the one in the while
statement can
contain one or more statements.
The typical computation of Celsius temperature ensues with this code:
celsius := 5 * (fahr - 32) / 9;
call printf("%d\t%d\n", fahr, celsius);
fahr += step;
This computes the Celsius temperature and then prints it out, moving on to the
next entry in the table. Note that we have started using some shorthand. SET
can be omitted. And the +=
assignment operators are also supported. Top-level
procedure calls can also be made without the call
keyword; that, too, could
have been omitted.
Importantly, the CQL compiler uses the normal SQLite order of operations, which is NOT the same as the C order of operations. As a result, the compiler may need to add parentheses in the C output to get the correct order; or it may remove some parentheses because they are not needed in the C order even though they were in the SQL order.
The printf
call operates as before, with the fahr
and celsius
variables
being passed on to the C runtime library for formatting, unchanged.
NOTE: When calling “unchecked” native functions like
printf
, string literals are simply passed through unchanged as C string literals. No CQL string object is created.
Basic Conversion Rules
As a rule, CQL does not perform its own conversions, leaving that instead to the C compiler. An exception to this is that boolean expressions are normalized to a 0 or 1 result before they are stored.
However, even with no explicit conversions, there are compatibility checks to ensure that letting the C compiler
do the conversions will result in something sensible. The following list summarizes the essential facts/rules as
they might be applied when performing a +
operation.
- the numeric types are bool, int, long, real
- non-numeric types cannot be combined with numerics, e.g. 1 + ‘x’ always yields an error
- any numeric type combined with itself yields the same type
- bool combined with int yields int
- bool or int combined with long yields long
- bool, int, or long combined with real yields real
Preprocessing Features
CQL includes its own pre-processor, it is described in Chapter 18 . While it is still possible to use the C pre-processor in front of CQL as was once necessary this practice is now deprecated. The usual pre-processor features are supported:
- macros (
@macro
) - including files (
@include
) - conditional compilation (
@ifdef
,@ifndef
) - token pasting (
@text
and@id
)