CQL was designed as a compiler for the SQLite runtime system. SQLite lacks stored procedures but has a rich C runtime interface that allows you to create any kind of control flow mixed with any SQL operations that you might need. However, SQLite’s programming interface is both verbose and error-prone in that small changes in SQL statements can require significant swizzling of the C code that calls them. Additionally, many of the SQLite runtime functions have error codes which must be strictly checked to ensure correct behavior. In practice, it’s easy to get some or all of this wrong.
CQL simplifies this situation by providing a high-level SQL language not unlike the stored procedure forms that are available in client/server SQL solutions and lowering that language to “The C you could have written to do that job using the normal SQLite interfaces.”
As a result, the C generated is generally very approachable but now the source language does not suffer from brittleness due to query or table changes, and CQL always generates correct column indices, nullability checks, error checks, and the other miscellany needed to use SQLite correctly.
CQL is also strongly typed, whereas SQLite is very forgiving with regard to what operations are allowed on which data. Strict type checking is much more reasonable given CQL’s compiled programming model.
NOTE: CQL was created to help solve problems in the building of Meta Platforms’s Messenger application, but this content is free from references to Messenger. The CQL code generation here is done in the simplest mode with the fewest runtime dependencies allowed for illustration.
Before starting this tutorial, make sure you have built the cql
executable first in Building CG/SQL
The “Hello World” program rendered in CQL looks like this:
-- needed to allow vararg calls to C functions
declare procedure printf no check;
create proc hello()
begin
call printf("Hello, world\n");
end;
This very nearly works exactly as written but we’ll need a little bit of glue to wire it all up.
First, assuming you have built cql
, you should have the power to do this:
$ cql --in hello.sql --cg hello.h hello.c
This will produce the C output files hello.c
and hello.h
which can be readily compiled.
However, hello.c will not have a main
– rather it will have a function like this:
...
void hello(void);
...
The declaration of this function can be found in hello.h
.
NOTE:
hello.h
attempts to includecqlrt.h
. To avoid configuring include paths for the compiler, you might keepcqlrt.h
in the same directory as the examples and avoid that complication. Otherwise, you must make arrangements for the compiler to locatecqlrt.h
, either by adding it to anINCLUDE
path or by incorporating some-I
options to aid the compiler in finding the source.
That hello
function is not quite adequate to get a running program, which brings us to the next step in getting things running. Typically, you have some kind of client program that will execute the procedures you create in CQL. Let’s create a simple one in a file we’ll creatively name main.c
.
A very simple CQL main might look like this:
#include <stdlib.h>
#include "hello.h"
int main(int argc, char **argv)
{
();
helloreturn 0;
}
Now we should be able to do the following:
$ cc -o hello main.c hello.c
$ ./hello
Hello, world
Congratulations, you’ve printed "Hello, world"
with CG/SQL!
A number of things are going on even in this simple program that are worth discussing:
hello
had no arguments, and did not use the database
void hello(void);
so we know how to call ithello.c
or hello.h
printf
was declared “no check”, so calling it creates a regular C call using whatever arguments are provided, in this case a stringprintf
function is declared in stdio.h
which is pulled in by cqlrt.h
, which appears in hello.c
, so it will be available to call in the generated C code''
to mean one single quoteAll of these facts put together mean that the normal, simple linkage rules result in an executable that prints the string “Hello, world” and then a newline.
Borrowing once again from examples in “The C Programming Language”, it’s possible to do significant control flow in CQL without reference to databases. The following program illustrates a variety of concepts:
-- needed to allow vararg calls to C functions
declare procedure printf no check;
-- print a conversion table for temperatures from 0 to 300
create proc conversions()
begin
-- not null can be abbreviated with '!'
declare fahr, celsius int!;
-- variable type can be implied
-- these are all int not null (or int!)
lower := 0; /* lower limit of range */
let upper := 300; /* upper limit of range */
let := 20; /* step size */
let step
-- this is the canonical SQL assignment syntax
-- but there are shorthand versions available in CQL
set fahr := lower;
while fahr <= upper
begin
-- top level assignment without 'set' is ok
:= 5 * (fahr - 32) / 9;
celsius call printf("%d\t%d\n", fahr, celsius);
-- the usual assignment ops are supported
+= step;
fahr end;
end;
You may notice that both the SQL style --
line prefix comments and the C style /* */
forms are acceptable comment forms.
Like C, in CQL all variables must be declared before they are used. They remain in scope until the end of the procedure in which they are declared, or they are global scoped if they are declared outside of any procedure. The declarations announce the names and types of the local variables. Importantly, variables stay in scope for the whole procedure even if they are declared within a nested begin
and end
block.
The most basic types are the scalar or “unitary” types (as they are referred to in the compiler)
type | aliases | notes |
---|---|---|
integer |
int | a 32 bit integer |
long |
long integer | a 64 bit integer |
bool |
boolean | an 8 bit integer, normalized to 0/1 |
real |
n/a | a C double |
text |
n/a | an immutable string reference |
blob |
n/a | an immutable blob reference |
object |
n/a | an object reference |
X not null |
x! | ! means not null in types |
NOTE: SQLite makes no distinction between integer storage and long integer storage, but the declarations tell CQL whether it should use the SQLite methods for binding and reading 64-bit or 32-bit quantities when using the variable or column so declared.
There will be more notes on these types later, but importantly, all keywords and names in CQL are case insensitive just like in the underlying SQL language. Additionally all of the above may be combined with not null
to indicate that a null
value may not be stored in that variable (as in the example). When generating the C code, the case used in the declaration becomes the canonical case of the variable and all other cases are converted to that in the emitted code. As a result, the C remains case sensitively correct.
The size of the reference types is machine dependent, whatever the local pointer size is. The non-reference types use machine independent declarations like int32_t
to get exactly the desired sizes in a portable fashion.
All variables of a reference type are set to NULL
when they are declared, including those that are declared NOT NULL
. For this reason, all nonnull reference variables must be initialized (i.e., assigned a value) before anything is allowed to read from them. This is not the case for nonnull variables of a non-reference type, however: They are automatically assigned an initial value of 0, and thus may be read from at any point.
The programs execution begins with three assignments:
lower := 0;
let upper := 300;
let := 20; let step
This initializes the variables just like in the isomorphic C code. Statements are separated by semicolons, just like in C. In the above, the data type of the variable was inferred because the let
keyword was used.
The table is then printed using a while
loop.
while fahr <= upper
begin
...
end;
The above has the usual meaning, with the statements in the begin/end
block being executed repeatedly until the condition becomes false.
The body of a begin
/end
block such as the one in the while
statement can contain one or more statements.
The typical computation of Celsius temperature ensues with this code:
:= 5 * (fahr - 32) / 9;
celsius call printf("%d\t%d\n", fahr, celsius);
+= step; fahr
This computes the Celsius temperature and then prints it out, moving on to the next entry in the table. Note that we have started using some shorthand. SET
can be omitted. And the +=
assignment operators are also supported. Top-level procedure calls can also be made without the call
keyword; that, too, could have been omitted.
Importantly, the CQL compiler uses the normal SQLite order of operations, which is NOT the same as the C order of operations. As a result, the compiler may need to add parentheses in the C output to get the correct order; or it may remove some parentheses because they are not needed in the C order even though they were in the SQL order.
The printf
call operates as before, with the fahr
and celsius
variables being passed on to the C runtime library for formatting, unchanged.
NOTE: When calling “unchecked” native functions like
printf
, string literals are simply passed through unchanged as C string literals. No CQL string object is created.
As a rule, CQL does not perform its own conversions, leaving that instead to the C compiler. An exception to this is that boolean expressions are normalized to a 0 or 1 result before they are stored.
However, even with no explicit conversions, there are compatibility checks to ensure that letting the C compiler do the conversions will result in something sensible. The following list summarizes the essential facts/rules as they might be applied when performing a +
operation.
CQL includes its own pre-processor, it is described in Chapter 18. While it is still possible to use the C pre-processor in front of CQL as was once necessary this practice is now deprecated. The usual pre-processor features are supported:
@macro
)@include
)@ifdef
, @ifndef
)@text
and @id
)
The point of using CQL is to facilitate access to a SQLite database, so we’ll switch gears to a slightly more complicated setup. We’ll still keep things fairly simple, but let’s start to use some database features.
NOTE: It is not the intent of this tutorial to also be a primer for the SQLite programming language, which is ably documented on sqlite.org. Please refer to that site for details on the meaning of the SQL statements used here if you are new to SQL.
Suppose we have the following program:
-- needed to allow vararg calls to C functions
declare procedure printf no check;
create table my_data(t text not null);
create proc hello()
begin
insert into my_data(t) values("Hello, world\n");
declare t text not null;
set t := (select * from my_data);
call printf('%s', t);
end;
The above is an interesting little baby program, and it appears as though it would once again print that most famous of salutations, “Hello, world”.
Well, it doesn’t. At least, not yet. Let’s walk through the various things that are going to go wrong as this will teach us everything we need to know about activating CQL from some environment of your choice.
CQL is just a compiler; it doesn’t know how the code it creates will be provisioned any more than, say, clang does. It creates functions with predictable signatures so that they can be called from C just as easily as the SQLite API itself, and using the same currency. Our new version of hello
now requires a database handle because it performs database operations. Also, there are now opportunities for the database operations to fail, and so hello
now provides a return code.
A new minimal main
program might look something like this:
#include <stdlib.h>
#include <sqlite3.h>
#include "hello.h"
int main(int argc, char **argv)
{
*db;
sqlite3 int rc = sqlite3_open(":memory:", &db);
if (rc != SQLITE_OK) {
(1); /* not exactly world class error handling but that isn't the point */
exit}
= hello(db);
rc if (rc != SQLITE_OK) {
(2);
exit}
(db);
sqlite3_closereturn 0;
}
If we re-run CQL and look in the hello.h
output file we’ll see that the declaration of the hello
function is now:
...
extern CQL_WARN_UNUSED cql_code hello(sqlite3 *_Nonnull _db_);
...
This indicates that the database is used and a SQLite return code is provided. We’re nearly there. If you attempt to build the program as before, there will be several link-time errors due to missing functions. Typically, these are resolved by providing the SQLite library to the command line and also adding the CQL runtime. The new command line looks something like this:
$ cc -o hello main.c hello.c cqlrt.c -lsqlite3
$ ./hello
Hello, world
The CQL runtime can be placed anywhere you want it to be, and of course the usual C separate compilation methods can be applied. More on that later.
But actually, that program doesn’t quite work yet. If you run it, you’ll get an error result code, not the message “Hello, world”.
Let’s talk about the final missing bit.
In CQL, a loose piece of Data Definition Language (henceforth DDL) does not actually create or drop anything. In most CQL programs, the normal situation is that “something” has already created the database and put some data in it. You need to tell the CQL compiler about the schema so that it knows what the tables are and what to expect to find in those tables. This is because typically you’re reconnecting to some sort of existing database. So, in CQL, loose DDL simply declares schema, it does not create it. To create schema you have to put the DDL into a procedure you can run. If you do that, then the DDL still serves a declaration, but also the schema will be created when the procedure is executed.
We need to change our program a tiny bit.
-- needed to allow vararg calls to C functions
declare procedure printf no check;
create proc hello()
begin
create table my_data(t text not null);
insert into my_data(t) values("Hello, world\n");
declare t text not null;
set t := (select * from my_data);
call printf('%s', t);
drop table my_data;
end;
If we rebuild the program, it will now behave as expected.
Let’s go over every important line of the new program, starting from main:
int rc = sqlite3_open(":memory:", &db);
This statement gives us an empty, private, in-memory only database to work with. This is the simplest case and it’s still very useful. The sqlite_open
and sqlite_open_v2
functions can be used to create a variety of databases per the SQLite documentation.
We’ll need such a database to use our procedure, and we use it in the call here:
= hello(db); rc
This provides a valid database handle to our procedure. Note that the procedure doesn’t know what database it is supposed to operate on; it expects to be handed a suitable database on a silver platter. In fact, any given procedure could be used with various databases at various times. Just like SQLite, CQL does not enforce any particular database setup; it just uses the provided database.
When hello
runs, we begin with:
create table my_data(t text not null);
This will create the my_data
table with a single column t
, of type text not null
. That will work because we know we’re going to call this with a fresh/empty database. More typically you might do create table if not exists...
or otherwise have a general attach/create phase or something to that effect. We’ll dispense with that here.
Next, we’ll run the insert statement:
insert into my_data(t) values("Hello, world\n");
This will add a single row to the table. Note that we have again used double quotes, meaning that this is a C string literal. This is highly convenient given the escape sequences. Normally SQLite text has the newlines directly embedded in it; that practice isn’t very compiler friendly, hence the alternative.
Next, we declare a local variable to hold our data:
declare t text not null;
Then, we can read back our data:
set t := (select * from my_data);
This form of database reading has very limited usability but it does work for this case and it is illustrative. The presence of (select ...)
indicates to the CQL compiler that the parenthesized expression should be given to SQLite for evaluation according to the SQLite rules. The expression is statically checked at compile time to ensure that it has exactly one result column. In this case the *
is just column t
, and actually it would have been clearer to use t
directly here but then there wouldn’t be a reason to talk about *
and multiple columns. At run time, the select
query must return exactly one row or an error code will be returned. It’s not uncommon to see (select ... limit 1)
to force the issue. But that still leaves the possibility of zero rows, which would be an error. We’ll talk about more flexible ways to read from the database later.
You can declare a variable and assign it in one step with the
LET
keyword, e.g.:= (select * from my_data); let t
The code would normally be written in this way but for discussion purposes, these examples continue to avoid
LET
.
At this point it seems wise to bring up the unusual expression evaluation properties of CQL. CQL is by necessity a two-headed beast. On the one side there is a rich expression evaluation language for working with local variables. Those expressions are compiled into C (or Lua) logic that emulates the behavior of SQLite on the data. It provides complex expression constructs such as IN
and CASE
but it is ultimately evaluated by C execution. Alternately, anything that is inside of a piece of SQL is necessarily evaluated by SQLite itself. To make this clearer let’s change the example a little bit before we move on.
set t := (select "__"||t||' '||1.234 from my_data);
This is a somewhat silly example but it illustrates some important things:
||
concatenation operator is evaluated by SQLite||
operator has lots of complex formatting conversions (such as converting real values to strings) * in fact the conversions are so subtle as to be impossible to emulate in loose C code with any economy, so, like a few other operators, ||
is only supported in the SQLite contextReturning now to our code as written, we see something very familiar:
call printf('%s', t);
Note that we’ve used the single quote syntax here for no particular reason other than illustration. There are no escape sequences here, so either form would suffice. Importantly, the string literal will not create a string object as before, but the text variable t
is, of course, a string reference. Before it can be used in a call to an undeclared function, it must be converted into a temporary C string. This might require allocation in general; that allocation is automatically managed.
Also, note that CQL assumes that calls to “no check” functions should be emitted as written. In this way, you can use printf
even though CQL knows nothing about it.
Lastly, we have:
drop table my_data;
This is not strictly necessary because the database is in memory anyway and the program is about to exit but there it is for illustration.
Now Data Manipulation Language (i.e. insert and select here; and henceforth DML) and DDL might fail for various reasons. If that happens the procedure will goto
a cleanup handler and return the failed return code instead of running the rest of the code. Any temporary memory allocations will be freed and any pending SQLite statements will be finalized. More on that later when we discuss error handling.
With that we have a much more complicated program that prints “Hello, world”
In order to read data with reasonable flexibility, we need a more powerful construction. Let’s change our example again and start using some database features.
declare procedure printf no check;
-- for previty the `create` in `create proc` can be elided
proc hello()begin
-- this time we use the ! short hand for not null
create table my_data(
int! primary key,
pos
txt text!
);
-- you can supply more than one set of values in a single insert
-- but we didn't here.
insert into my_data values(2, 'World');
insert into my_data values(0, 'Hello');
insert into my_data values(1, 'There');
declare C cursor for select * from my_data order by pos;
loop fetch C
begin
-- we elided the 'call' here to show the briefer syntax
"%d: %s\n", C.pos, C.txt);
printf(end;
close C;
drop table my_data;
end;
Reviewing the essential parts of the above.
create table my_data(
integer not null primary key,
pos not null
txt text );
The table now includes a position column to give us some ordering. That is the primary key.
insert into my_data values(2, 'World');
The insert statements provide both columns, not in the printed order. The insert form where the columns are not specified indicates that all the columns will be present, in order; this is more economical to type. CQL will generate errors at compile time if there are any missing columns or if any of the values are not type compatible with the indicated column.
The most important change is here:
declare C cursor for select * from my_data order by pos;
We’ve created a non-scalar variable C
, a cursor over the indicated result set. The results will be ordered by pos
.
loop fetch C
begin
...
end;
This loop will run until there are no results left (it might not run at all if there are zero rows, that is not an error). The FETCH
construct allows you to specify target variables, but if you do not do so, then a synthetic structure is automatically created to capture the projection of the select
. In this case the columns are pos
and txt
. The automatically created storage exactly matches the type of the columns in the select list (which could itself be a tricky calculation). In this case the select
is quite simple and the columns of the result directly match the schema for my_data
. An integer and a string reference. Both not null.
"%d: %s\n", C.pos, C.txt); printf(
The storage for the cursor is given the same names as the columns of the projection of the select, in this case the columns were not renamed in the select clause so pos
and txt
are the fields in the cursor. Double quotes were used in the format string to get the newline in there easily.
close C;
The cursor is automatically released at the end of the procedure, but in this case, we’d like to release it before the drop table
operation occurs. Therefore, an explicit close
statement is included. However, this sort of close
is frequently omitted since automatic cleanup takes care of it.
If you compile and run this program, you’ll get this output:
$ cc -x c -E hello.sql | cql --cg hello.h hello.c
$ cc -o hello main.c hello.c cqlrt.c -lsqlite3
$ ./hello
0: Hello
1: There
2: World
So the data was inserted and then sorted.
We’ve only scratched the surface of what SQLite can do, and almost all DML constructs are supported by CQL. This includes common table expressions, and even recursive versions of the same. But remember, when it comes to DML, the CQL compiler only has to validate the types and figure out what the result shape will be – SQLite always does all the heavy lifting of evaluation. All of this means with remarkably little additional code, the example below from the SQLite documentation can be turned into a CQL stored proc using the constructs we have defined above.
-- needed to allow vararg calls to C functions
declare proc printf no check;
-- proc and procedure can be used interchangeably
procedure mandelbrot()
begin
-- this is basically one giant select statement
declare C cursor for
with recursive
-- x from -2.0 to +1.2
as (select -2.0 union all select x + 0.05 from xaxis where x < 1.2),
xaxis(x)
-- y from -1.0 to +1.0
as (select -1.0 union all select y + 0.1 from yaxis where y < 1.0),
yaxis(y)
as (
m(iter, cx, cy, x, y) -- initial seed iteration count 0, at each of the points in the above grid
select 0 iter, x cx, y cy, 0.0 x, 0.0 y from xaxis, yaxis
union all
-- the next point is always iter +1, same (x,y) and the next iteration of z^2 + c
select iter+1 iter, cx, cy, x*x-y*y + cx x, 2.0*x*y + cy y from m
-- stop condition, the point has escaped OR iteration count > 28
where (m.x*m.x + m.y*m.y) < 4.0 and m.iter < 28
),as (
m2(iter, cx, cy) -- find the last iteration for any given point to get that count
select max(iter), cx, cy from m group by cx, cy
),as (
a(t) -- convert the iteration count to a printable character, grouping by line
select group_concat(substr(" .+*#", 1 + min(iter/7,4), 1), '')
from m2 group by cy
)-- group all the lines together
select rtrim(t) line from a;
-- slurp out the data
loop fetch C
begin
call printf("%s\n", C.line);
end;
end;
The above code uses various SQLite features to generate this text:
$
....#
..#*..
..+####+.
.......+####.... +
..##+*##########+.++++
.+.##################+.
.............+###################+.+
..++..#.....*#####################+.
...+#######++#######################.
....+*################################.
#############################################...
....+*################################.
...+#######++#######################.
..++..#.....*#####################+.
.............+###################+.+
.+.##################+.
..##+*##########+.++++
.......+####.... +
..+####+.
..#*..
....#
+.
The above won’t come up very often, but it does illustrate several things:
WITH RECURSIVE
actually provides a full lambda calculus so arbitrary computation is possibleWITH RECURSIVE
to create table expressions that are sequences of numbers easily, with no reference to any real datasources/demo
directory of the CG/SQL project. Additional demo code is available in Appendix 10
Until this point we’ve only discussed simple kinds of expressions as well as variables and table columns marked with NOT NULL
. These are indeed the easiest types for CQL to work with as they tend to correspond most directly to the types known to C. However, SQL provides for many more types of expressions as well as nullable types, and these require handling in any language that purports to be like SQL.
The usual arithmetic operators apply in CQL:
Example expressions (these are all true)
(1 + 2) * 3 == 9
1 + 2 * 3 == 7
6 / 3 == 2
7 - 5 == 2
6 % 5 == 1
5 / 2.5 == 2
7 & 3 == 2 | 1
1 << 2 == 4
However, before going any further, it’s important to note that CQL is inherently a two-headed beast. Expressions are either evaluated by transpiling to C (like the predicate of an IF statement or a variable assignment) or by sending them to SQLite for evaluation (like expressions inside a SELECT
statement or the WHERE
part of a DELETE
).
CQL evaluation rules are designed to be as similar as possible, but some variance is inevitable because evaluation is done in two fundamentally different ways.
The operator precedence rules in CQL are as follows; the top-most rule binds the most loosely, and the bottom-most rule binds the most tightly:
ASSIGNMENT: := += -= /= *= %= &= |= <<= >>=
LOGICAL_OR: OR
LOGICAL_AND: AND
LOGICAL_NOT: NOT
EQUALITY: = == != <> IS [NOT], [NOT] IN, [NOT] LIKE,
[NOT] MATCH, [NOT] GLOB, [NOT] BETWEEN
INEQUALITY: < <= > >=
BINARY: << >> & |
ADDITION: + -
MULTIPLICATION: * / %
CONCAT: || -> ->>
COLLATE: COLLATE
UNARY: ~ -
SPECIAL: x[] x:y x::y x:::y x.y f(y) CAST CASE ~type~
The above rules are not the same as C’s operator precedence rules! Instead, CQL follows SQLite’s rules. Parentheses are emitted in the C output as needed to force that order.
NOTE: CQL emits minimal parentheses in all outputs. Different parentheses are often needed for SQL output as opposed to C output.
In contrast to C, CQL guarantees a left-to-right order of evaluation for arguments. This applies both to arguments provided to the operators mentioned in the previous section as well as arguments provided to procedures.
CQL needs type information for both variables in the code and columns in the database. Like SQL, CQL allows variables to hold a NULL value, and just as in SQL, the absence of NOT NULL
implies that NULL
is a legal value. Consider these examples:
-- real code should use better names than this :)
create table all_the_nullables(
integer,
i1
b1 bool,long,
l1 real,
r1
t1 text,blob
bl1
);
declare i2 integer;
declare b2 bool;
declare l2 long;
declare r2 real;
declare t2 text;
declare bl2 blob;
All of i1
, i2
, b1
, b2
, l1
, l2
, r1
, r2
, t1
, t2
, and bl1
, bl2
are nullable. In some sense, variables and columns declared nullable (by virtue of the missing NOT NULL
) are the root sources of nullability in the SQL language. That and the NULL
literal. Though there are other sources as we will see.
NOT NULL
could be added to any of these, e.g.
-- real code should use better names than this :)
declare i_nn integer not null;
In fact, NOT NULL
is so common in CQL code that it can be abbreviated with a single !
character. Some languages use ?
to make a type nullable, but since nullable is the default in SQL, CQL opts for the reverse. Hence, the following code is equivalent to the above:
-- "int" is equivalent to "integer" and "!" is equivalent to "not null"
declare i_nn int!;
In the context of computing the types of expressions, CQL is statically typed, and so it must make a decision about the type of any expression based on the type information at hand at compile time. As a result, it handles the static type of an expression conservatively. If the result might be null, then the expression is of a nullable type, and the compiled code will include an affordance for the possibility of a null value at runtime.
The generated code for nullable types is considerably less efficient, and so it should be avoided if that is reasonably possible.
In places where a SQL name is allowed, such as a table name, column name, index name, trigger name, or constraint name, a back-quoted identifier may be used. This allows for the more flexible names supported by SQLite to appear anywhere they might be needed.
Example:
create table `my table` (
integer
`a column` );
Since SQL names “leak” into the language via cursors, other places a SQL name might appear have similar flexibility. For instance, names of variables, and columns in cursors can have exotic names.
When rendered to SQL, the name will be emitted like so:
table] [my
If the name goes to C or Lua, it has to be escaped and rendered like so:
X_aX20table
In this form, non-identifier characters are escaped into hex. This is invisible to users of CQL, but the C or Lua interface to such columns necessarily uses the escaped names. While this is less than perfect, it is the only way to allow access to any legal SQL name.
You can declare and initialize a variable in one step using the LET
form, e.g.
:= 1; LET x
The named variable is declared to be the exact type of the expression on the right. More on expressions in the coming sections. The right side is often a constant in these cases but does not need to be.
:= 1; -- integer not null
LET i := 1L; -- long not null
LET l := "x"; -- text not null
LET t := x IS y; -- bool not null
LET b := x = y; -- bool (maybe not null depending on x/y) LET b
The pseudo-function “nullable” removes not null
from the type of its argument but otherwise does no computation. This can be useful to initialize nullable types.
:= nullable(1); -- nullable integer variable initialized to 1
LET n_i := nullable(1L); -- nullable long variable initialized to 1 LET n_l
The pseudo-function “sensitive” adds @sensitive
to the type of its argument but otherwise does no computation. This also can be useful to initialize nullable types.
:= sensitive(1); -- sensitive nullable integer variable initialized to 1
LET s_i := sensitive(1L); -- sensitive nullable long variable initialized to 1 LET s_l
@RC
special variableCQL also has the special built-in variable @RC
, which refers to the most recent error code returned by a SQLite operation, e.g., 0 == SQLITE_OK
, 1 == SQLITE_ERROR
. @RC
is of type integer not null
. Specifically:
_rc_thrown_n
where n is the catch block number in the procedure.@RC
refers to the above error variable of the innermost catch block the @RC
reference is in.@RC
reference happens outside of any catch block, its value is SQLITE_OK
(i.e., zero).There are a number of literal objects that may be expressed in CQL. These are as follows:
''
to indicate a single quote character (this is just like normal SQLite).some_string
, or, the entire path of the current compiland if some_string
does not occur in the path.@FILE
construct is to provide a partial path to a file for diagnostics that is consistent even if the file is built in various different root paths on different build machines.REAL
and stored as the C type double
.L
are integer literals.L
, are long integer literals.Examples:
1.3 -- real
2L -- long
123456789123 -- long
123 -- integer
0x10 -- hex integer
0x10L -- hex long integer
The use of NULL
always results in a nullable result; however, this literal is special in that it has no storage class. NULL
is neither a numeric nor a string itself but rather mutates into whatever it is first combined with. For instance, NULL + 1
results in a nullable integer. Because NULL
has no primitive type, in some cases where type knowledge is required, you might have to use the CAST()
function to cast NULL
to a specific type, such as CAST(NULL as TEXT)
. This construct guarantees type consistency in cases like SELECT
from different sources combined with UNION ALL
.
NOTE: Constructs like
CAST(NULL as TEXT)
are always rewritten to justNULL
before going to SQLite as the cast is uninteresting except for the type information, which SQLite doesn’t need/use anyway.
NOTE: The trailing cast notation is often helpful for economy here, e.g.
NULL ~TEXT~
is a lot shorter thanCAST(NULL AS TEXT)
and is identical. SQLite will never see the shorthand version, it is converted immediately.
The boolean literals TRUE
and FALSE
(case insensitive) may be used freely. These are the same as the normal literals 0
and 1
except they have type BOOL
. They mix with numbers in the usual ways with the usual promotion rules.
NOTE: Even if the target language is Lua, you can mix and match bools and integers in CQL. The compiler will emit casts if needed.
The C pre-processor can still be combined with CQL, in which case the __FILE__
and __LINE__
directives of the pre-processor may be used to create literals; they will be preprocessed into normal literals.
The use of __FILE__
can give surprising results in the presence of build systems; hence, the existence of @FILE(...)
.
Use of the C-pre-processor is deprecated.
It’s possible to use named constants in CQL with nothing more than the pre-processor features that have already appeared; however, the use of macros in such a way is not entirely satisfactory. For one thing, macros are expanded before semantic analysis, which means CQL can’t provide macro constant values for you in the JSON output, for instance.
To help with this problem, CQL includes specific support for constants. They can be in the form of enumerations of a fixed type or general-purpose ad hoc constants. We’ll see both in the sections to follow.
declare enum business_type integer (
restaurant,
laundromat,= 11+3 /* math added for demo purposes only */
corner_store );
After this enum is declared, this:
select business_type.corner_store;
is the same as this:
select 14;
And that is exactly what SQLite will see: the literal 14
.
You can also use an enum type to define columns whose type is more specific than just integer
, like so:
CREATE TABLE businesses (
name TEXT,type business_type
);
CQL will then enforce that you use the correct enum to access those columns. For example, this is valid:
SELECT * FROM businesses WHERE type = business_type.laundromat;
While this does not type check, even if state.delaware is a numeric code for the state:
SELECT * FROM businesses WHERE type = state.delaware;
Enumerations follow these rules:
= expression
, the first item will be 1
, not 0
).(big = 100.0, medium = big/2, small = medium/2)
.(code = business_type.restaurant)
.enum_name.member_name
elsewhere.With these forms, you get some additional useful output: * The JSON includes the enumerations and their values in their own section. * You can use the @emit_enums
directive to put declarations like this into the .h
file that corresponds to the current compiland.
enum business_type {
= 1,
business_type__restaurant = 2,
business_type__laundromat = 14
business_type__corner_store };
Note that C does not allow for floating-point enumerations, so in case of floating-point values such as:
declare enum floating real (
= 1.0,
one = 2.0,
two = 2.71828,
e = 3.14159
pi );
you get:
// enum floating (floating point values)
#define floating__one 1.000000e+00
#define floating__two 2.000000e+00
#define floating__e 2.718280e+00
#define floating__pi 3.141590e+00
In Lua output the compiler creates a call to cql_emit_constants
like so:
("enum", "business_type", {
cql_emit_constantsrestaurant = 1,
laundromat = 2,
corner_store = 14
})
By default this function adds the indicated constants to the global dictionary _cql
but you can override this to do whatever you like with the constants.
In order to get useful expressions in enumeration values, constant folding and general evaluation were added to the compiler; these expressions work on any numeric type and the literal NULL
. The supported operations include:
+
, -
, *
, /
, %
, |
, &
, <<
, >>
, ~
, and
, or
, not
, ==
, <=
, >=
, !=
, <
, >
, the CAST
operator, and the CASE
forms (including the IIF
function). These operations are enough to make a lot of very interesting expressions, all of which are evaluated at compile time.
Constant folding was added to allow for rich enum
expressions, but there is also the const()
primitive in the language which can appear anywhere a literal could appear. This allows you to do things like:
create table something(
integer default const((1<<16)|0xf) /* again the math is just for illustration */
x );
The const
form is also very useful in macros to force arguments to be constants and so forth.
This const
form ensures that the constant will be evaluated at compile time. The const
pseudo-function can also nest so you can build these kinds of macros from other macros or you can build enum values this way. Anywhere you might need literals, you can use const
.
Constant groups are more general than enumerations but very similar. This statement declares named global constants of any type.
declare const group some_constants (
= cast(5 as integer<job_id>),
my_x = 12.0,
my_y = 'foo'
my_z );
As with enums, referring to my_x
, my_y
, or my_z
after this will cause the appropriate constant value to be inlined into the code. The output code carries no symbolic reference to the constant. However, similary to enums, if you wish the constant to be consumable by external code you can use:
@emit_const_groups some_constant;
This causes
#ifndef const_group_some_constants_defined
#define const_group_some_constants_defined
#define my_x 5
#define my_y 1.200000e+01
#define my_z "foo"
#endif
To go into the header file for C. For Lua you instead get this call:
("const", "some_constants", {
cql_emit_constantsmy_x = 5,
my_y = 1.200000e+01,
my_z = "foo"
})
You can use this call to create whatever constant dictionary or dictionaries you want by overriding cql_emit_constants
. By default they are added to a global dictionary named _cql
A common source of errors in stored procedures is incorrect typing in variables and arguments. For instance, a particular key for an entity might need to be LONG
, LONG NOT NULL
or even LONG NOT NULL @SENSITIVE
. This requires you to diligently get the type right in all the places it appears, and should it ever change, you have to revisit all the places.
To help with this situation, and to make the code a little more self-describing, we added named types to the language. This is a lot like typedef
in the C language. These defintions do not create different incompatible types, but they do let you name types for reuse.
You can now write these sorts of forms:
declare foo_id type long not null;
create table foo(
id foo_id primary key autoincrement,
name text
);
create proc inserter(name_ text, out id foo_id)
begin
insert into foo(id, name) values(NULL, name_);
set id := last_insert_rowid();
end;
declare function func_return_foo_id() foo_id;
declare var foo_id;
Additionally any enumerated type can be used as a type name. e.g.
declare enum thing integer (
thing1,
thing2
);
declare thing_type type thing;
Enumerations always include “not null” in addition to their base type. Enumerations also have a unique “kind” associated; specifically, the above enum has the type integer<thing> not null
. The rules for type kinds are described below.
Any CQL type can be tagged with a “kind”; for instance, real
can become real<meters>
, integer
can become integer<job_id>
. The idea here is that the additional tag, the “kind”, can help prevent type mistakes in arguments, in columns, and in procedure calls. For instance:
create table things(
size real<meters>,
real<seconds>
duration
);
create proc do_something(size_ real<meters>, duration_ real<seconds>)
begin
insert into things(size, duration) values(size_, duration_);
end;
In this situation, you couldn’t accidentally switch the columns in do_something
even though both are real
, and indeed SQLite will only see the type real
for both. If you have your own variables typed real<size>
and real<duration>
, you can’t accidentally do:
call do_something(duration, size);
even though both are real. The type kind won’t match.
Importantly, an expression with no type kind is compatible with any type kind (or none). Hence all of the below are legal.
declare generic real;
set generic := size; -- no kind may accept <meters>
set generic := duration; -- no kind may accept <seconds>
set duration := generic; -- no kind may be stored in <seconds>
Only mixing types where both have a kind, and the kind is different, generates errors. This choice allows you to write procedures that (for instance) log any integer
or any real
, or that return an integer
out of a collection.
These rules are applied to comparisons, assignments, column updates, anywhere and everywhere types are checked for compatibility.
To get the most value out of these constructs, the authors recommend that type kinds be used universally except when the extra compatibility described above is needed (like low-level helper functions).
Importantly, type kind can be applied to object types as well, allowing object<dict>
to be distinct from object<list>
.
At runtime, the kind information is lost. But it does find its way into the JSON output so external tools also get to see the kinds.
Nullability is tracked via CQL’s type system. To understand whether or not an expression will be assigned a nullable type, you can follow these rules; they will hopefully be intuitive if you are familiar with SQL:
The literal NULL
is, of course, always assigned a nullable type. All other literals are nonnull.
In general, the type of an expression involving an operator (e.g., +
, ==
, !=
, ~
, LIKE
, et cetera) is nullable if any of its arguments are nullable. For example, 1 + NULL
is assigned the type INTEGER
, implying nullability. 1 + 2
, however, is assigned the type INTEGER NOT NULL
.
IN
and NOT IN
expressions are assigned a nullable type if and only if their left argument is nullable: The nullability of the right side is irrelevant. For example, "foo" IN (a, b)
will always have the type BOOL NOT NULL
, whereas some_nullable IN (a, b)
will have the type BOOL
.
NOTE: In CQL, the
IN
operator behaves like a series of equality tests (i.e.,==
tests, notIS
tests), andNOT IN
behaves symmetrically. SQLite has slightly different nullability rules forIN
andNOT IN
. This is the one place where CQL has different evaluation rules from SQLite, by design.
The result of IS
and IS NOT
is always of type BOOL NOT NULL
, regardless of the nullability of either argument.
For CASE
expressions, the result is always of a nullable type if no ELSE
clause is given. If an ELSE
is given, the result is nullable if any of the THEN
or ELSE
expressions are nullable.
NOTE: The SQL
CASE
construct is quite powerful: Unlike the Cswitch
statement, it is actually an expression. In this sense, it is rather more like a highly generalized ternarya ? b : c
operator than a C switch statement. There can be arbitrarily many conditions specified, each with their own result, and the conditions need not be constants; typically, they are not.
IFNULL
and COALESCE
are assigned a NOT NULL
type if one or more of their arguments are of a NOT NULL
type.
In most join operations, the nullability of each column participating in the join is preserved. However, in a LEFT OUTER
join, the columns on the right side of the join are always considered nullable; in a RIGHT OUTER
join, the columns on the left side of the join are considered nullable.
As in most other languages, CQL does not perform evaluation of value-level expressions during type checking. There is one exception to this rule: An expression within a const
is evaluated at compilation time, and if its result is then known to be nonnull, it will be given a NOT NULL
type. For example, const(NULL or 1)
is given the type BOOL NOT NULL
, whereas merely NULL or 1
has the type BOOL
.
CQL is able to “improve” the type of some expressions from a nullable type to a NOT NULL
type via occurrence typing, also known as flow typing. There are three kinds of improvements that are possible:
Positive improvements, i.e., improvements resulting from the knowledge that some condition containing one or more AND
-linked IS NOT NULL
checks must have been true:
IF
statements:IF a IS NOT NULL AND c.x IS NOT NULL THEN
-- `a` and `c.x` are not null here
ELSE IF b IS NOT NULL THEN
-- `b` is not null here
END IF;
CASE
expressions:CASE
WHEN a IS NOT NULL AND c.x IS NOT NULL THEN
-- `a` and `c.x` are not null here
WHEN b IS NOT NULL THEN
-- `b` is not null here
ELSE
...
END;
IIF
expressions:IS NOT NULL AND c.x IS NOT NULL,
IIF(a ..., -- `a` and `c.x` are not null here
...
)
SELECT
expressions:SELECT
-- `t.x` and `t.y` are not null here
FROM t
WHERE x IS NOT NULL AND y IS NOT NULL
Negative improvements, i.e., improvements resulting from the knowledge that some condition containing one or more OR
-linked IS NULL
checks must have been false:
IF
statements:IF a IS NULL THEN
...
ELSE IF c.x IS NULL THEN
-- `a` is not null here
ELSE
-- `a` and `c.x` are not null here
END IF;
IF
statements, guard pattern:IF a IS NULL RETURN;
-- `a` is not null here
IF c.x IS NULL THEN
...
THROW;END IF;
-- `a` and `c.x` are not null here
CASE
expressions:CASE
WHEN a IS NULL THEN
...
WHEN c.x IS NULL THEN
-- `a` is not null here
ELSE
-- `a` and `c.x` are not null here
END;
IIF
expressions:IS NULL OR c.x IS NULL,
IIF(a ...,
... -- `a` and `c.x` are not null here
)
Assignment improvements, i.e., improvements resulting from the knowledge that the right side of a statement (or a portion therein) cannot be NULL
:
SET
statements:SET a := 42;
-- `a` is not null here
NOTE: Assignment improvements from
FETCH
statements are not currently supported. This may change in a future version of CQL.
There are several ways in which improvements can cease to be in effect:
The scope of the improved variable or cursor field has ended:
IF a IS NOT NULL AND c.x IS NOT NULL THEN
-- `a` and `c.x` are not null here
END IF;
-- `a` and `c.x` are nullable here
An improved variable was SET
to a nullable value:
IF a IS NOT NULL THEN
-- `a` is not null here
SET a := some_nullable;
-- `a` is nullable here
END IF;
An improved variable was used as an OUT
(or INOUT
) argument:
IF a IS NOT NULL THEN
-- `a` is not null here
CALL some_procedure_that_requires_an_out_argument(a);
-- `a` is nullable here
END IF;
An improved variable was used as a target for a FETCH
statement:
IF a IS NOT NULL THEN
-- `a` is not null here
INTO a;
FETCH c -- `a` is nullable here
END IF;
An improved cursor field was re-fetched:
IF c.x IS NOT NULL THEN
-- `c.x` is not null here
FETCH c;-- `c.x` is nullable here
END IF;
A procedure call was made (which removes improvements from all globals because the procedure may have mutated any of them; locals are unaffected):
IF a IS NOT NULL AND some_global IS NOT NULL THEN
-- `a` and `some_global` are not null here
CALL some_procedure();
-- `a` is still not null here
-- `some_global` is nullable here
END IF;
CQL is generally smart enough to understand the control flow of your program and infer nullability appropriately; here are a handful of examples:
IF some_condition THEN
SET a := 42;
ELSE
THROW;END IF;
-- `a` is not null here because it must have been set to 42
-- if we've made it this far
IF some_condition THEN
SET a := 42;
ELSE
SET a := 100;
END IF;
-- `a` is not null here because it was set to a value of a
-- `NOT NULL` type in all branches and the branches cover
-- all of the possible cases
IF a IS NOT NULL THEN
IF some_condition THEN
SET a := NULL;
ELSE
-- `a` is not null here despite the above `SET` because
-- CQL understands that, if we're here, the previous
-- branch must not have been taken
END IF;
END IF;
IF a IS NOT NULL THEN
WHILE some_condition
BEGIN
-- `x` is nullable here despite `a IS NOT NULL` because
-- `a` was set to `NULL` later in the loop and thus `x`
-- will be `NULL` when the loop repeats
:= a;
LET x SET a := NULL;
...
END;
END IF;
For positive improvements, the check must be exactly of the form IS NOT NULL
; other checks that imply a variable or cursor field must not be null when true have no effect:
IF a > 42 THEN
-- `a` is nullable here
END IF;
NOTE: This may change in a future version of CQL.
For multiple positive improvements to be applied from a single condition, they must be linked by AND
expressions along the outer spine of the condition; uses of IS NOT NULL
checks that occur as subexpressions within anything other than AND
have no effect:
IF
IS NOT NULL AND b IS NOT NULL)
(a OR c IS NOT NULL
THEN
-- `a`, `b`, and `c` are all nullable here
END IF;
For negative improvements, the check must be exactly of the form IS NULL
; other checks that imply a variable or cursor field must not be null when false have no effect:
DECLARE equal_to_null INT;
IF a IS equal_to_null THEN
...
ELSE
-- `a` is nullable here
END IF;
For multiple negative improvements to be applied from a single condition, they must be linked by OR
expressions along the outer spine of the condition; uses of IS NULL
checks that occur as subexpressions within anything other than OR
have no effect:
IF
IS NULL OR b IS NULL)
(a AND c IS NULL
THEN
...
ELSE
-- `a`, `b`, and `c` are all nullable here
END IF;
If possible, it is best to use the techniques described in “Nullability Improvements” to verify that the value of a nullable type is nonnull before using it as such.
Sometimes, however, you may know that a value with a nullable type cannot be null and simply wish to use it as though it were nonnull. The ifnull_crash
and ifnull_throw
“attesting” functions convert the type of an expression to be nonnull and ensure that the value is nonnull with a runtime check. They cannot be used in SQLite contexts because the functions are not known to SQLite, but they can be used in loose expressions. For example:
CREATE PROC square_if_odd(a INT NOT NULL, OUT result INT)
BEGIN
IF a % 2 = 0 THEN
SET result := NULL;
ELSE
SET result := a * a;
END IF;
END;
-- `x` has type `INT`, but we know it can't be `NULL`
:= square_if_odd(3);
let x
-- `y` has type `INT NOT NULL`
:= ifnull_crash(x); let y
Above, the ifnull_crash
attesting function is used to coerce the expression x
to be of type INT NOT NULL
. If our assumptions were somehow wrong, however—and x
were, in fact, NULL
—our program would crash.
As an alternative to crashing, you can use ifnull_throw
. The following two pieces of code are equivalent:
CREATE PROC y_is_not_null(x INT)
BEGIN
:= ifnull_throw(x);
let y END;
CREATE PROC y_is_not_null(x INT)
BEGIN
DECLARE y INT NOT NULL;
IF x IS NOT NULL THEN
SET y := x;
ELSE
THROW;END IF;
END;
CQL supports a variety of expressions, nearly everything from the SQLite world. The following are the various supported operators; they are presented in order from the weakest binding strength to the strongest. Note that the binding order is NOT the same as C, and in some cases, it is radically different (e.g., boolean math)
These appear only in the context of SELECT
statements. The arms of a compound select may include FROM
, WHERE
, GROUP BY
, HAVING
, and WINDOW
. If ORDER BY
or LIMIT ... OFFSET
are present, these apply to the entire UNION.
Example:
select A.x x from A inner join B using(z)
union all
select C.x x from C
where x = 1;
The WHERE
applies only to the second select in the union. And each SELECT
is evaluated before the the UNION ALL
select A.x x from A inner join B using(z)
where x = 3
union all
select C.x x from C
where x = 1
order by x;
The ORDER BY
applies to the result of the union, so any results from the 2nd branch will sort before any results from the first branch (because x
is constrained in both).
Assignment occurs in the UPDATE
statement, in the SET
and LET
statements, and in various implied cases like the +=
operator. In these cases, the left side is a simple target and the right side is a general expression. The expression is evaluated before the assignment.
NOTE: In cases where the left-hand side is an array or property, the assignment is actually rewritten to call a suitable setter function, so it isn’t actually an assignment at all.
Example:
SET x := 1 + 3 AND 4; -- + before AND then :=
The logical OR
operator performs shortcut evaluation, much like the C ||
operator (not to be confused with SQL’s concatenation operator with the same lexeme). If the left side is true, the result is true, and the right side is not evaluated.
The truth table for logical OR
is as follows:
A | B | A OR B |
---|---|---|
0 | 0 | 0 |
0 | 1 | 1 |
0 | NULL | NULL |
1 | 0 | 1 |
1 | 1 | 1 |
1 | NULL | 1 |
NULL | 0 | NULL |
NULL | 1 | 1 |
NULL | NULL | NULL |
The logical AND
operator performs shortcut evaluation, much like the C &&
operator. If the left side is false, the result is false, and the right side is not evaluated.
The truth table for logical AND
is as follows:
A | B | A AND B |
---|---|---|
0 | 0 | 0 |
0 | 1 | 0 |
0 | NULL | 0 |
1 | 0 | 0 |
1 | 1 | 1 |
1 | NULL | NULL |
NULL | 0 | 0 |
NULL | 1 | NULL |
NULL | NULL | NULL |
These are ternary operators. The general forms are:
BETWEEN expr2 AND expr3
expr1 NOT BETWEEN expr2 AND expr3 expr1
Importantly, there is an inherent ambiguity in the language because expr2
or expr3
above could be logical expressions that include AND
. CQL resolves this ambiguity by insisting that expr2
and expr3
be “math expressions” as defined in the CQL grammar. These expressions may not have ungrouped AND
or OR
operators.
Examples:
-- oh hell no (syntax error)
between 1 and 2 and 3;
a
-- all ok
between (1 and 2) and 3;
a between 1 and (2 and 3);
a between 1 and b between c and d; -- binds left to right
a between 1 + 2 and 12 / 2; a
The one operand of logical NOT
must be a numeric. NOT 'x'
is illegal.
!=
, <>
, =
, ==
, LIKE
, GLOB
, MATCH
, REGEXP
, IN
, IS
, IS NOT
These operations do some non-ordered comparison of their two operands: * IS
and IS NOT
never return NULL
, So for instance X IS NOT NULL
gives the natural answer. x IS y
is true if and only if: 1. both x
and y
are NULL
or 2. if they are equal. * The other operators return NULL
if either operand is NULL
and otherwise perform their usual test to produce a boolean. * !=
and <>
are equivalent as are =
and ==
. * Strings and blobs compare equal based on their value, not their identity (i.e., not the string/blob pointer). * Objects compare equal based on their address, not their content (i.e., reference equality). * MATCH
, GLOB
, and REGEXP
are only valid in SQL contexts, LIKE
can be used in any context (a helper method to do LIKE
in C is provided by SQLite, but not the others). * MATCH
, GLOB
, REGEXP
, LIKE
, and IN
may be prefixed with NOT
which reverses their value.
NULL IS NULL -- this is true
NULL == NULL) IS NULL -- this is also true because NULL == NULL is not 1, it's NULL.
(NULL != NULL) IS NULL -- this is also true because NULL != NULL is not 0, it's also NULL.
('xy' NOT LIKE 'z%'` -- this is true
<
, >
, <=
, >=
These operators perform the usual order comparison of their two operands:
NULL
, the result is NULL
.NOTE: CQL uses
strcmp
for string comparison. In SQL expressions, the comparison happens in whatever way SQLite has been configured. Typically, general-purpose string comparison should be done with helper functions that deal with collation and other considerations. This is a very complex topic and CQL is largely silent on it.
<<
, >>
, &
, |
These are the bit-manipulation operations. Their binding strength is very different from C, so beware. Notably, the &
operator has the same binding strength as the |
operator, so they bind left to right, which is utterly unlike most systems. Many parentheses are likely to be needed to correctly codify the usual “or of ands” patterns.
Likewise, the shift operators <<
and >>
have the same strength as &
and |
, which is very atypical. Consider:
1 << 7; -- not ambiguous but unusual meaning, not like C or Lua
x & 1) << 7; -- means the same as the above
(x & 1 << 7) -- probably what you intended x & (
Note that these operators only work on integer and long integer data. If any operand is NULL
, the result is NULL
.
+
, -
These operators perform the typical arithmetic operations. Note that there are no unsigned numeric types, so it’s always signed arithmetic that is performed.
NULL
, the result is NULL
.*
, /
, %
These operators also perform typical arithmetic operations. Note that there are no unsigned numeric types, so it’s always signed arithmetic that is performed.
NULL
, the result is NULL
.EXCEPTION: The
%
operator doesn’t make sense on real values, so real values produce an error.
||
This operator is only valid in a SQL context, it concatenates the text representation of its left and right arguments into text. The arguments can be of any type.
->
and ->>
The JSON extraction operator ->
accepts a JSON text value or JSON blob value as its left argument and a valid JSON path as its right argument. The indicated path is extracted and the result is a new, smaller, piece of JSON text.
The extended extraction operator ->>
will return the selected item as a value rather than as JSON. The SQLite documentation can be helpful here:
Both the
->
and->>
operators select the same subcomponent of the JSON to their left. The difference is that->
always returns a JSON representation of that subcomponent and the->>
operator always returns an SQL representation of that subcomponent.
Importantly, the resulting type of the ->>
operator cannot be known at compile time because it will depend on the path value which could be, and often is, a non-constant expression. Dynamic typing is not a concept available in CQL, as a consequence, in CQL, you must declare the extraction type when using the ->>
operator with syntax:
json_arg ->> ~type_spec~ path_arg
The type_spec can be any valid type like int
, int<foo>
, or a type alias. All the normal type expressions are valid.
NOTE: this form is not a CAST operation, it’s a declaration. The type of the expression will be assumed to be the indicated type. SQLite will not see the
~type~
notation when the expression is evaluated. If a cast is desired simply write one as usual, the trailing cast syntax can be specially convenient when working with JSON.
-
, ~
Unary negation (-
) and bitwise invert (~
) are the strongest binding operators.
~
operator only works on integer types (not text, not real).NULL
, the result is NULL
.The CAST
expression can be written in two ways, the standard form:
CAST( expr AS type)
And equivalently using pipeline notation
type~ expr ~
The trailing ~type~
notation has very strong binding, As shown in the order of operations table, it is even stronger than the unary ~
operator. It is indended to be used in function pipelines in combination with other pipeline operations (the :
family) rather than as part of arithmetic and so forth, hence it has a fairly strong binding (equal to :
).
The ~type~
form is immediatley converted to the standard CAST
form and so SQLite will never see this alternate notation. This form is purely syntatic sugar.
The CASE
expression has two major forms and provides a great deal of flexibility in an expression. You can think of it as an enhanced version of the C ?:
operator.
:= 'y';
let x := case x
let y when 'y' then 1
when 'z' then 2
else 3
end;
In this form, the CASE
expression (x
here) is evaluated exactly once and then compared against each WHEN
clause. Every WHEN
clause must be type-compatible with the CASE
expression. The THEN
expression corresponding to the matching WHEN
is evaluated and becomes the result. If no WHEN
matches, then the ELSE
expression is used. If there is no ELSE
and no matching WHEN
, then the result is NULL
.
If that’s not general enough, there is an alternate form:
:= 'yy';
let y := 'z';
let z := case
let x when y = 'y' then 1
when z = 'z' then 2
else 3
end;
In the second form, where there is no value before the first WHEN
keyword, each WHEN
expression is a separate independent boolean expression. The first one that evaluates to true causes the corresponding THEN
to be evaluated, and that becomes the result. As before, if there is no matching WHEN
clause, then the result is the ELSE
expression if present, or NULL
if there is no ELSE
.
The result types must be compatible, and the best type to hold the answer is selected with the usual promotion rules.
Single values can be extracted from SQLite using an inline SELECT
expression. For instance:
set x_ := (select x from somewhere where id = 1);
The SELECT
statement in question must extract exactly one column, and the type of the expression becomes the type of the column. This form can appear anywhere an expression can appear, though it is most commonly used in assignments. Something like this would also be valid:
if (select x from somewhere where id = 1) == 3 then
...
end if;
The SELECT
statement can, of course, be arbitrarily complex.
Importantly, if the SELECT
statement returns no rows, this will result in the normal error flow. In that case, the error code will be SQLITE_DONE
, which is treated like an error because in this context SQLITE_ROW
is expected as a result of the SELECT
. This is not a typical error code and can be quite surprising to callers. If you’re seeing this failure mode, it usually means the code had no affordance for the case where there were no rows, and probably that situation should have been handled. This is an easy mistake to make, so to avoid it, CQL also supports these more tolerant forms:
set x_ := (select x from somewhere where id = 1 if nothing then -1);
And even more generally, if the schema allows for null values and those are not desired:
set x_ := (select x from somewhere where id = 1 if nothing or null then -1);
Both of these are much safer to use, as only genuine errors (e.g., the table was dropped and no longer exists) will result in the error control flow.
Again, note that:
set x_ := (select ifnull(x, -1) from somewhere where id = 1);
Would not avoid the SQLITE_DONE
error code because “no rows returned” is not at all the same as “null value returned.”
The if nothing or null
form above is equivalent to the following, but it is more economical and probably clearer:
set x_ := (select ifnull(x, -1) from somewhere where id = 1 if nothing then -1);
To compute the type of the overall expression, the rules are almost the same as normal binary operators. In particular:
(select ...)
, the result type is not null if and only if the select result type is not null (see select statement, many cases).(select ... if nothing)
, the result type is not null if and only if both the select result and the default expression types are not null (normal binary rules).(select ... if nothing or null)
, the result type is not null if and only if the default expression type is not null.Finally, the form (select ... if nothing then throw)
is allowed; this form is exactly the same as normal (select ...)
, but it makes explicit that the error control flow will happen if there is no row. Consequently, this form is allowed even if @enforce_strict select if nothing
is in force.
CQL supports the notion of ‘sensitive’ data in a first-class way. You can think of it as very much like nullability; it largely begins by tagging data columns with @sensitive
.
Rather than go through the whole calculus, it’s easier to understand by a series of examples. So let’s start with a table with some sensitive data.
create table with_sensitive(
id integer,
name text @sensitive,integer @sensitive
sens );
The most obvious thing you might do at this point is create a stored procedure that would read data out of that table. Maybe something like this:
create proc get_sensitive()
begin
select id as not_sensitive_1,
+ 1 sensitive_1,
sens as sensitive_2,
name 'x' as not_sensitive_2,
-sens as sensitive_3,
between 1 and 3 as sensitive_4
sens from with_sensitive;
end;
So looking at that procedure, we can see that it’s reading sensitive data, so the result will have some sensitive columns in it.
id
is not sensitive (at least not in this example)sens + 1
is sensitive, math on a sensitive field leaves it sensitivename
is sensitive, it began that way and is unchangedx
is just a string literal, it’s not sensitive-sens
is sensitive, that’s more mathbetween
expression is also sensitiveGenerally, sensitivity is “radioactive” - anything it touches becomes sensitive. This is very important because even a simple-looking expression like sens IS NOT NULL
must lead to a sensitive result or the whole process would be largely useless. It has to be basically impossible to wash away sensitivity.
These rules apply to normal expressions as well as expressions in the context of SQL. Accordingly:
Sensitive variables can be declared:
declare sens integer @sensitive;
Simple operations on the variables are sensitive:
-- this is sensitive (and the same would be true for any other math)
+ 1; sens
The IN
expression gives a sensitive result if anything about it is sensitive:
-- all of these are sensitive
in (1, 2);
sens 1 in (1, sens);
select id in (select sens from with_sensitive)); (
Similarly, sensitive constructs in CASE
expressions result in a sensitive output:
-- not sensitive
case 0 when 1 then 2 else 3 end;
-- all of these are sensitive
case sens when 1 then 2 else 3 end;
case 0 when sens then 2 else 3 end;
case 0 when 1 then sens else 3 end;
case 0 when 1 then 2 else sens end;
Cast operations preserve sensitivity:
-- sensitive result
select cast(sens as INT);
Aggregate functions likewise preserve sensitivity:
-- all of these are sensitive
select AVG(T1.sens) from with_sensitive T1;
select MIN(T1.sens) from with_sensitive T1;
select MAX(T1.sens) from with_sensitive T1;
select SUM(T1.sens) from with_sensitive T1;
select COUNT(T1.sens) from with_sensitive T1;
There are many operators that get similar treatment such as COALESCE
, IFNULL
, IS
and IS NOT
.
Things get more interesting when we come to the EXISTS
operator:
-- sensitive if and only if any selected column is sensitive
exists(select * from with_sensitive)
-- sensitive because "info" is sensitive
exists(select info from with_sensitive)
-- not sensitive because "id" is not sensitive
exists(select id from with_sensitive)
If this is making you nervous, it probably should. We need a little more protection because of the way EXISTS
is typically used. The predicates matter. Consider the following:
-- id is now sensitive because the predicate of the where clause was sensitive
select id from with_sensitive where sens = 1;
-- this expression is now sensitive because id is sensitive in this context
exists(select id from with_sensitive where sens = 1)
In general, if the predicate of a WHERE or HAVING clause is sensitive, then all columns in the result become sensitive.
Similarly, when performing joins, if the column specified in the USING clause is sensitive or the predicate of the ON clause is sensitive, then the result of the join is considered to be all sensitive columns, even if the columns were not sensitive in the schema.
Likewise, a sensitive expression in LIMIT or OFFSET will result in 100% sensitive columns, as these can be used in a WHERE-ish way.
-- join with ON
select T1.id from with_sensitive T1 inner join with_sensitive T2 on T1.sens = T2.sens
-- join with USING
select T1.id from with_sensitive T1 inner join with_sensitive T2 using(sens);
All of these expressions and join propagations are designed to make it impossible to simply wash away sensitivity with a little bit of math.
Now we come to enforcement, which boils down to what assignments or “assignment-like” operations we allow.
If we have these:
declare sens integer @sensitive;
declare not_sens integer;
We can use those as stand-ins for lots of expressions, but the essential calculus goes like this:
-- assigning a sensitive to a sensitive is ok
set sens := sens + 1;
-- assigning not sensitive data to a sensitive is ok
-- this is needed so you can (e.g.) initialize to zero
set sens := not_sens;
-- not ok
set not_sens := sens;
Now these “assignments” can happen in a variety of ways:
Now it’s possible to write a procedure that accepts sensitive things and returns non-sensitive things. This is fundamentally necessary because the proc must be able return (e.g.) a success code, or encrypted data, that is not sensitive. However, if you write the procedure in CQL it, too, will have to follow the assignment rules and so cheating will be quite hard. The idea here is to make it easy to handle sensitive data well and make typical mistakes trigger errors.
With these rules it’s possible to compute the the type of procedure result sets and also to enforce IN/OUT parameters. Since the signature of procedures is conveniently generated with –generate_exports good practices are fairly easy to follow and sensitivity checks flow well into your programs.
This is a brief summary of CQL semantics for reference types – those types that are ref counted by the runtime.
The three reference types are:
TEXT
OBJECT
BLOB
Each of these has their own macro for retain
and release
though all three actually turn into the exact same code in all the current CQL runtime implementations. In all cases the object is expected to be promptly freed when the reference count falls to zero.
in
and inout
arguments are not retained on entry to a stored procout
arguments are assumed to contain garbage and are nulled without retaining on entryout
argument doesn’t have garbage in it, then it is up to you to release
it before you make a callout
argument, CQL will release
the argument variable before the call site, obeying its own contractretains
the object, and then does a release
on the previous objectrelease
first might accidentally free too soon)release
on all local variables when the method exitsout
parameter or a global variableout
, inout
parameters, and global variables work just like local variables except that CQL does not call release
at the end of the procedureStored procedures do not return values, they only have out
arguments and those are well defined as above. Functions however are also supported and they can have either get
or create
semantics
If you declare a function like so:
declare function Getter() object;
Then CQL assumes that the returned object should follow the normal rules above, retain/release will balance by the end of the procedure for locals and globals or out
arguments could retain the object.
If you declare a function like so:
declare function Getter() create text;
then CQL assumes that the function created a new result which it is now responsible for releasing. In short, the returned object is assumed to arrive with a retain count of 1 already on it. When CQL stores this return value it will:
As a result, if you store the returned value in a local variable it will be released when the procedure exits (as usual) or if you instead store the result in a global or an out parameter the result will survive to be used later.
CQL tries to adhere to normal SQL comparison rules but with a C twist.
OBJECT
The object type has no value-based comparison, so there is no <
, >
and so forth.
The following table is useful. Let’s suppose there are exactly two distinct objects ‘X’ and ‘Y’:
result | examples |
|
|
|
|
|
---|---|---|---|---|---|---|
true | X = X |
X <> Y |
Y = Y |
Y <> X |
X IN (X, Y) |
X NOT IN (Y) |
false | X = Y |
X <> X |
Y = X |
Y <> Y |
X NOT IN (X, Y) |
|
null | null = null |
X <> null |
x = null |
null <> null |
Y <> null |
y = null |
true | X is not null |
Y is not null |
null is null |
|||
false | X is null |
Y is null |
null is not null |
null = null
resulting in NULL
is particularly surprising but consistent with the usual SQL rules. And again, as in SQL, the IS
operator returns true for X IS Y
even if both are NULL
.
NOTE: null-valued variables evaluate as above; however, the
NULL
literal generally yields errors if it is used strangely. For example, inif x == NULL
, you get an error. The result is always going to beNULL
, hence falsey. This was almost certainly intended to beif x IS NULL
. Likewise, comparing expressions that are known to beNOT NULL
againstNULL
yields errors. This is also true where an expression has been inferred to beNOT NULL
by control flow analysis.
TEXT
Text has value comparison semantics, but normal string comparison is done only with strcmp
, which is of limited value. Typically, you’ll want to either delegate the comparison to SQLite (with (select x < y)
) or else use a helper function with a suitable comparison mechanism.
For text comparisons, including equality:
result | cases |
---|---|
true | if and only if both operands are not null and the comparison matches (using strcmp) |
false | if and only if both operands are not null and the comparison does not match (using strcmp) |
null | if and only if at least one operand is null |
Example:
'x' < 'y' == true -- because strcmp("x", "y") < 0
As with type object
, the IS
and IS NOT
operators behave similarly to equality and inequality, but never produce a NULL
result. Strings have the same null semantics as object
. In fact strings have all the same semantics as object except that they get a value comparison with strcmp
rather than an identity comparison. With object x == y
implies that x
and y
point to the very same object. With strings, it only means x
and y
hold the same text.
The IN
and NOT IN
operators also work for text using the same value comparisons as above.
Additionally there are special text comparison operators such as LIKE
, MATCH
and GLOB
. These comparisons are defined by SQLite.
BLOB
Blobs are compared by value (equivalent to memcmp
) but have no well-defined ordering. The memcmp
order is deemed not helpful as blobs usually have internal structure hence the valid comparisons are only equality and inequality.
You can use user defined functions to do better comparisons of your particular blobs if needed.
The net comparison behavior is otherwise just like strings.
DECLARE FUNCTION foo() OBJECT;
CREATE PROC foo_user (OUT baz OBJECT)
BEGIN
SET baz := foo();
END;
void foo_user(cql_object_ref _Nullable *_Nonnull baz) {
*(void **)baz = NULL; // set out arg to non-garbage
(baz, foo());
cql_set_object_ref}
DECLARE FUNCTION foo() CREATE OBJECT;
CREATE PROCEDURE foo_user (INOUT baz OBJECT)
BEGIN
DECLARE x OBJECT;
SET x := foo();
SET baz := foo();
END;
void foo_user(cql_object_ref _Nullable *_Nonnull baz) {
= NULL;
cql_object_ref x
(x);
cql_object_release= foo();
x (*baz);
cql_object_release*baz = foo();
:
cql_cleanup(x);
cql_object_release}
DECLARE FUNCTION foo() OBJECT;
CREATE PROCEDURE foo_user (INOUT baz OBJECT)
BEGIN
DECLARE x OBJECT;
SET x := foo();
SET baz := foo();
END;
void foo_user(cql_object_ref _Nullable *_Nonnull baz) {
= NULL;
cql_object_ref x
(&x, foo());
cql_set_object_ref(baz, foo());
cql_set_object_ref
:
cql_cleanup(x);
cql_object_release}
Function calls can be made using a pipeline notation to chain function results together in a way that is clearer. The general syntax is:
...args...) expr : func(
Which is exactly equivalent to
...args...) func(expr,
This can be generalized with the @op
statement, which in general might change the function name to something more specific. For instance.
@op real : call foo as foo_real;
@op real<joules> : call foo as foo_real_joules;
Now has more specific mappings
declare x real<joules>;
-- These are the functions that you might call with pipeline notation
declare function foo(x text) real;
declare function foo_real(x real) real;
declare function foo_real_joules(x real) real;
The mappings and declarations are both required now allow this:
"test":foo() --> foo("test")
5.0:foo() --> foo_real(5.0)
:foo() --> foo_real_joules(x) x
Note that in each case foo
could have included additional arguments which are placed normally.
x:foo(1,2) --> foo_real_joules(x, 1, 2) (this function is not declared)
If there are no additional arguments the ()
can be elided. This is a good way to end a pipeline.
:dump --> dump(x) x
The old ::
and :::
operators are no longer supported. The combination of naming conventions proved impossible to remember. With @op
you can decide how to resolve the naming yourself.
This is a functor like form and it can be enabled using @op
like the other cases. For instance
@op object<list_builder> : functor all as list_builder_add;
The invocation syntax is like the :
operator but with no function name, so it looks kind of like the left argument has become a functor (something you can call like a function). In reality the call is converted using the base name in the @op
directive and the base types of the arguments.
...) expr:(arg1, arg2,
To enable:
expr
) must have a type kind (e.g. object<container>
)@op
directive specifying functor
and all
@op
name plus the types of all the arguments (if any)for instance, with these declarations:
declare function new_builder() create object<list_builder>;
declare function list_builder_add_int(arg1 object<list_builder>, arg2 int!) object<list_builder>;
declare function list_builder_add_int_int(arg1 object<list_builder>, arg2 int!, arg3 int!) object<list_builder>;
declare function list_builder_add_real(arg1 object<list_builder>, arg2 real!) object<list_builder>;
declare function list_builder_to_list(arg1 object<list_builder>) create object<list>;
@op object<list_builder> : functor all as list_builder_add;
@op object<list_builder> : call to_list as list_builder_to_list;
You could write:
list := new_builder():(5):(7.0):(1,2):to_list(); let
This expands to the much less readable:
list :=
LET
list_builder_to_list(
list_builder_add_int_int(
list_builder_add_real(
list_builder_add_int(5), 7.0), 1, 2)); new_builder(),
You can use this form to build helper functions that assemble text, arrays, JSON and many other uses. Anywhere the “fluent” coding pattern is helpful this syntax gives you a very flexible pattern. In the end it’s just rewritten function calls.
The appended type names look like this:
core type | short name |
---|---|
NULL + | null |
BOOL | bool |
INTEGER | int |
LONG INTEGER | long |
REAL | real |
TEXT | text |
BLOB | blob |
OBJECT | object |
CURSOR ++ | cursor |
++ the CURSOR type applies to functions with a CURSOR argument, these are the so called dynamic-cursor arguments
Cast operations also have a pipeline notation:
-- This is the same as let i := cast(foo(x) as int);
:= x :foo() ~int~ ; let i
These operations are particularly useful when working with json data. For instance:
select foo.json : json_extract('x') ~int~ :ifnull(0) as X from foo;
This is much easier to read than the equivalent:
select ifnull(cast(json_extract(foo.json 'x') as int), 0) as X from foo;
In all the cases the expression is rewritten into the normal form so SQLite will never see the pipeline form. No special SQLite support is needed.
This form of the ~
operator has the same binding strength as the :
operator family.
->
<<
>>
||
The @op
form allows the definition of many possible overload combos. For instance the ->
is now mappable to a function call of your choice, this is very useful in pipeline forms.
@op text<xml> : arrow text<xml_path> as extract_xml_path;
Allows this mapping
-> path ---> extract_xml_path(xml, path) xml
The type forms for arrow
should use types in their simplest form, like so:
core type | short name |
---|---|
BOOL | bool |
INTEGER | int |
LONG INTEGER | long |
REAL | real |
TEXT | text |
BLOB | blob |
OBJECT | object |
These are the same forms that are used when adding argument types for the polymorpic pipeline form.
@op
can be used with lshift
rshift
and concat
to remap the <<
, >>
, and ||
operators respectively.
CQL offers structured types in the form of cursors, but applications often need other structured types. In the interest of not encumbering the language and runtime with whatever types given application might need, CQL offers a generic solution for properties and arrays where it will rewrite property and array syntax into function calls.
The rewrite depends on the object type, and in particular the “kind” designation. So for instance consider:
declare function create_container() create object<container> not null;
:= create_container();
let cont := cont.x + 1; cont.x
For this to work we need a function that makes the container:
function create_container() create object<container>;
Now cont.x
might mean field access in a cursor or table access in a select
expression. So, when it is seen, cont.x
will be resolved in the usual ways. If these fail then CQL will look for a suitable mapping using the @op
directive, like so:
@op object<container> : get x as container_get_x;
@op object<container> : set x as container_set_x;
container_set_x
to do setting and container_get_x
to do getting.
Like these maybe:
declare function container_get_x(self object<container> not null) int not null;
declare proc container_set_x(self object<container> not null, value int not null);
With those in place cont.x := cont.x + 1
will be converted into this:
CALL set_object_container_x(cont, get_object_container_x(cont) + 1);
Importantly with this pattern you can control exactly which properties are available. Missing properties will always give errors. For instance cont.y := 1;
results in name not found 'y'
.
This example uses object
as the base type but it can be any type. For instance, if you have a type integer<handle>
that identifies some storage based on the handle value, you can use the property pattern on this. CQL does not care what the types are, so if property access is meaningful on your type then it can be supported.
Additionally, the rewrite can happen in a SQL context or a native context. The only difference is that in a SQL context you would need to create the appropriate SQLite UDF and declare it with declare select function
rather than declare function
.
There is a second choice for properties, where the properties are open-ended. That is where the property name can be anything. If instead of the property specific functions above, we had created these more general functions:
declare function container_get(self object<container>! , field text!) int!;
declare proc container_set(self object<container>!, field text!, value int!);
These functions have the desired property name as a parameter (field
) and so they can work with any field.
@op object<container> : get all as container_get;
@op object<container> : set all as container_set;
With these in place, cont.y
is no longer an error. We get these transforms:
+= 1;
cont.x += 1;
cont.y -- specific functions for 'x'
CALL container_set_x(cont, container_get_x(cont) + 1);
-- generic functions for 'y'
CALL container_set(cont, 'y', container_get(cont, 'y') + 1);
Nearly the same, but more generic. This is great if the properties are open-ended. Perhaps for a dictionary or something representing JSON. Anything like that.
In fact, the property doesn’t have to be a compile-time constant. Let’s look at array syntax next.
Array access is just like the open-ended property form with two differences:
'x'] += 1; cont[
You get exactly the same transform if you set up the array mapping:
@op object<container> : array get container_get;
@op object<container> : array set container_set;
To yield:
CALL container_set(cont, 'x', container_get(cont, 'x') + 1);
The index type is not contrained so you can make several mappings with different signatures. For instance:
1,2] := cont[3,4] + cont[5,6]; cont[
Could be supported by these functions:
declare function cont_get_ij(self object<container>!, i int!, j int!) int!;
declare proc cont_set_ij(self object<container>!, i int!, j int!, value int!);
@op object<container> : array get as cont_get_ij;
@op object<container> : array set as cont_set_ij;
This results in:
CALL cont_set_ij(cont, 1, 2, cont_get_ij(cont, 3, 4) + cont_get_ij(cont, 5, 6));
This is a very simple transform that allows for the creation of any array-like behavior you might want.
In addition to the function pattern shown above, you can use the “proc as func” form to get values, like so:
create proc container(self object<container>!, field text!, out value int!)
begin
value := something_using_container_and_field;
end;
And with this form, you could write all of the property management or array management in CQL directly.
However, it’s often the case that you want to use native data structures to hold your things-with-properties or your arrays. If that’s the case, you can write them in C as usual, using the normal interface types to CQL that are defined in cqlrt.h
. The exact functions you need to implement will be emitted into the header file output for C. For Lua, things are even easier; just write matching Lua functions in Lua. It’s all primitives or dictionaries in Lua.
In C, you could also implement some or all of the property reading functions as macros. You could add these macros to your copy of cqlrt.h
(it’s designed to be changed) or you could emit them with @echo c, "#define stuff\n";
Finally, the no check
version of the functions or procedures can also be used. This will let you use a var-arg list, for instance, in your arrays, which might be interesting. Variable indices can be used in very flexible array builder forms.
Another interesting aspect of the no check
version of the APIs is that the calling convention for such functions is a little different in C (in Lua it’s the same). In C, the no check
forms most common target is the printf
function. But printf
accepts C strings not CQL text objects. This means any text argument must be converted to a normal C string before making the call. But it also means that string literals pass through unchanged into the C!
For instance:
call printf("Hello world\n");
becomes:
("Hello world\n"); printf
So likewise, if your array or property getters are no check
, then cont.x := 1;
becomes maybe container_set(cont, "x", 1)
. Importantly, the C string literal "x"
will fold across any number of uses in your whole program, so there will only be one copy of the literal. This gives great economy for the flexible type case and it is actually why no check
functions were added to the language, rounding out all the no check
flavors.
no check
functions and procedures for your getters and setters for maximum economy. If you also implement the functions on the C side as macros, or inline functions in your cqlrt.h
path, then array and property access can be very economical. There need not be any actual function calls by the time the code runs.
Though we’ve already introduced examples of procedures, let’s now go over some of the additional aspects we have not yet illustrated, like arguments and control-flow.
Consider this procedure:
create procedure copy_integer(in arg1 integer not null, out arg2 integer not null)
begin
set arg2 := arg1;
end;
arg1
has been declared as in
. This is the default: in arg1 integer not null
, and arg1 integer not null
mean the exact same thing.
arg2
, however, has been declared as out
. When a parameter is declared using out
, arguments for it are passed by reference. This is similar to by-reference arguments in other languages; indeed, they compile into a simple pointer reference in the generated C code.
Given that arg2
is passed by reference, the statement set arg2 := arg1;
actually updates a variable in the caller. For example:
declare x int not null;
call copy_integer(42, x);
-- `x` is now 42
It is important to note that values cannot be passed into a procedure via an out
parameter. In fact, out
parameters are immediately assigned a new value as soon as the procedure is called:
out
parameters are set to null
.out
parameters of a non-reference type (e.g., integer
, long
, bool
, etc.) are set to their default values (0
, 0.0
, false
, etc.).out
parameters of a reference type (e.g., blob
, object
, and text
) are set to null
as there are no default values for reference types. They must, therefore, be assigned a value within the procedure so that they will not be null
when the procedure returns. CQL enforces this.In addition to in
and out
parameters, there are also inout
parameters. inout
parameters are, as one might expect, a combination of in
and out
parameters: The caller passes in a value as with in
parameters, but the value is passed by reference as with out
parameters.
inout
parameters allow for code such as the following:
integer not null)
proc times_two(inout arg begin
-- note that a variable in the caller is both read from and written to
+= arg; -- this is the same as set arg := arg + arg;
arg end;
:= 2;
let x -- this is the same as call times_two(x)
times_two(x); -- `x` is now 4
The usual call
syntax is used to invoke a procedure. It returns no value, but it can have any number of out
arguments.
declare scratch integer not null;
call copy_integer(12, scratch);
== 12; -- true scratch
Let’s go over the most essential bits of control flow.
The CQL IF
statement has no syntactic ambiguities at the expense of being somewhat more verbose than many other languages. In CQL, the ELSE IF
portion is baked into the IF
statement, so what you see below is logically a single statement.
create proc checker(foo integer, out result integer not null)
begin
if foo = 1 then
:= 1;
result else if foo = 2 then
:= 3;
result else
:= 5;
result end if;
end;
What follows is a simple procedure that counts down its input argument.
declare procedure printf no check;
create proc looper(x integer not null)
begin
while x > 0
begin
call printf('%d\n', x);
-= 1;
x end;
end;
The WHILE
loop has additional keywords that can be used within it to better control the loop. A more general loop might look like this:
declare procedure printf no check;
create proc looper(x integer not null)
begin
while 1
begin
-= 1;
x if x < 0 then
leave;else if x % 100 = 0 then
continue;
else if x % 10 = 0 then
call printf('%d\n', x);
end if;
end;
end;
Let’s go over this peculiar loop:
while 1
begin
...
end;
This is an immediate sign that there will be an unusual exit condition. The loop will never end without one because 1
will never be false.
if x < 0 then
leave;
Now here we’ve encoded our exit condition a bit strangely: we might have done the equivalent job with a normal condition in the predicate part of the while
statement but for illustration anyway, when x becomes negative leave
will cause us to exit the loop. This is like break
in C.
else if x % 100 = 0 then
continue;
This bit says that on every 100th iteration, we go back to the start of the loop. So the next bit will not run, which is the printing.
else if x % 10 = 0 then
call printf('%d\n', x);
end if;
Finishing up the control flow, on every 10th iteration we print the value of the loop variable.
The FOR
statement provides a bit more convenience over WHILE
and that’s pretty much it. The usual thing that goes wrong with a simple WHILE
loop is that you end up forgetting to add the iteration increment and then you debug an infinite loop. To help avoid this FOR
lets you specify the loop end action in the same place as the condition. Looking at the previous example, with FOR
it looks like this:
declare procedure printf no check;
create proc looper(x int!)
begin
for x > 0; x -= 1;
begin
'%d\n', x);
printf(end;
end;
Here we’ve also used some of the more modern forms, eliding the call
.
FOR
very general, any number of statements can follow the condition so for instance
for x < 5 and y < 5; x += 1; y += 1;
begin
...
end;
Any statements may appear in the list, even things that don’t seem very loop-like. You could fetch a cursor or something.
The FOR
construct is similar to while, you might think of it as:
--- for <condition> ; <loop update statements; ... ;> ; begin <main body> end;
while <condition>
begin
<main body>
continue_will_go_here:<loop update statements; ... ;>
end;
Essentially it’s the same as putting all those statements at the end except that continue
will go to those statements instead of the top of the loop.
There is no difference in code quality. The only advantage is that the all the loop info is together.
This is possible:
let x := 5; for x < 10; x += 1;
begin
...
end;
It would be very simple to macroize FOR
to do some simple loop cases like:
@macro(stmt_list) _for!(id! expr, c1! expr, c2! expr, s! stmt_list)
begin
id! := c1!;
for id! <= c2! ; id! += 1;
begin
s!;end;
end;
proc foo()begin
int;
var x
1, 5,
_for!(x, begin
"%d\n", x);
printf(end);
end;
For is useful for this because all the loop pieces are together so it’s a little easier to macroize.
The CQL SWITCH
is designed to map to the C switch
statement for better code generation and also to give us the opportunity to do better error checking. SWITCH
is a statement like IF
, not an expression like CASE..WHEN..END
, so it combines with other statements. The general form looks like this:
SWITCH switch-expression [optional ALL VALUES]
WHEN expr1, expr2, ... THEN
[statement_list]WHEN expr3, ... THEN
[statement_list]WHEN expr4 THEN
NOTHING
ELSE
[statement_list]END;
integer not null
or long integer not null
)WHEN
expressions [expr1, expr2, etc.] are made from constant integer expressions (e.g. 5
, 1+7
, 1<<2
, or my_enum.thing
)WHEN
expressions must be compatible with the switch expression (long constants cannot be used if the switch expression is an integer)WHEN
clauses must be unique (after evaluation)LEAVE
keyword exits the SWITCH
prematurely, just like break
in C
LEAVE
is not required before the next WHEN
C
, if fall-through ever comes to SWITCH
it will be explicitNOTHING
is used after THEN
it means there is no code for that case, which is useful with ALL VALUES
(see below)ELSE
clause is optional and works just like default
in C
, covering any cases not otherwise explicitly listedALL VALUES
then:
WHEN
values must cover every value of the enum
_
are by convention considered pseudo values and do not need to be coveredWHEN
values not in the enumELSE
clause (it would defeat the point of listing ALL VALUES
which is to get an error if new values come along)Some more complete examples:
:= get_something();
let x switch x
when 1,1+1 then -- constant expressions ok
set y := 'small';
-- other stuff
when 3,4,5 then
set y := 'medium';
-- other stuff
when 6,7,8 then
set y := 'large';
-- other stuff
else
set y := 'zomg enormous';
end;
declare enum item integer (
= 0, pencil, brush,
pen
paper, canvas,
_count
);
:= get_item(); -- returns one of the above
let x
switch x all values
when item.pen, item.pencil then
call write_something();
when item.brush then nothing
-- itemize brush but it needs no code
when item.paper, item.canvas then
call setup_writing();
end;
Using THEN NOTHING
allows the compiler to avoid emitting a useless break
in the C code. Hence, that choice is better/clearer than when brush then leave;
.
Note that the presence of _count
in the enum will not cause an error in the above because it starts with _
.
The C
output for this statement will be a C
switch statement.
This example illustrates catching an error from some DML, and recovering rather than letting the error cascade up. This is the common “upsert” pattern (insert or update)
declare procedure printf no check;
create procedure upsert_foo(id_ integer, t_ text)
begin
tryinsert into foo(id, t) values(id_, t_);
catch
tryupdate foo set t = t_ where id = id_;
catchcall printf("Error code %d!\n", @rc);
throw;end;
end;
end;
Once again, let’s go over this section by section:
tryinsert into foo(id, t) values(id_, t_);
catch
Normally if the insert
statement fails, the procedure will exit with a failure result code. Here, instead, we prepare to catch that error.
catch
tryupdate foo set t = t_ where id = id_;
catch
Now, having failed to insert, presumably because a row with the provided id
already exists, we try to update that row instead. However that might also fail, so we wrap it in another try. If the update fails, then there is a final catch block:
catchcall printf("Error code %d!\n", @rc);
throw;end;
Here we see a usage of the @rc
variable to observe the failed error code. In this case, we simply print a diagnostic message and then use the throw
keyword to rethrow the previous failure (exactly what is stored in @rc
). In general, throw
will create a failure in the current block using the most recent failed result code from SQLite (@rc
) if it is an error, or else the general SQLITE_ERROR
result code if there is no such error. In this case, the failure code for the update
statement will become the result code of the current procedure.
This leaves only the closing markers:
end;
end;
If control flow reaches the normal end of the procedure it will return SQLITE_OK
.
The calling convention for CQL stored procedures often requires that the procedure return a result code from SQLite. This makes it impossible to write a procedure that returns a result like a C function, as the result position is already used for the error code. You can get around this problem by using out
arguments as your return results. So, for instance, this version of the Fibonacci function is possible.
-- this works, but it is awkward
create procedure fib (in arg integer not null, out result integer not null)
begin
if (arg <= 2) then
set result := 1;
else
declare t integer not null;
call fib(arg - 1, result);
call fib(arg - 2, t);
set result := t + result;
end if;
end;
The above works, but the notation is very awkward.
CQL has a “procedures as functions” feature that tries to make this more pleasant by making it possible to use function call notation on a procedure whose last argument is an out
variable. You simply call the procedure like it was a function and omit the last argument in the call. A temporary variable is automatically created to hold the result and that temporary becomes the logical return of the function. For semantic analysis, the result type of the function becomes the type of the out
argument.
-- rewritten with function call syntax
create procedure fib (in arg integer not null, out result integer not null)
begin
if (arg <= 2) then
set result := 1;
else
set result := fib(arg - 1) + fib(arg - 2);
end if;
end;
This form is allowed when:
out
(neither in
nor inout
are acceptable)select
statement or out
statement (more on these later)call
form. Any failures can be caught with try/catch
as usual. The “procedure as function” feature is really only syntatic sugar for the “awkward” form above, but it does allow for slightly better generated C code.
In the previous chapters we have used cursor variables without fully discussing them. Most of the uses are fairly self-evident but a more exhaustive discussion is also useful.
First there are three types of cursors, as we will see below.
A statement cursor is based on a SQL SELECT
statement. A full example might look like this:
-- elsewhere
create table xy_table(x integer, y integer);
declare C cursor for select x, y from xy_table;
When compiled, this will result in creating a SQLite statement object (type sqlite_stmt *
) and storing it in a variable called C_stmt
. This statement can then be used later in various ways.
Here’s perhaps the simplest way to use the cursor above:
declare x, y integer;
into x, y; fetch C
This will have the effect of reading one row from the results of the query into the local variables x
and y
.
These variables might then be used to create some output such as:
/* note use of double quotes so that \n is legal */
call printf("x:%d y:%d\n", ifnull(x, 0), ifnull(y,0));
More generally, there the cursor may or may not be holding fetched values. The cursor variable C
can be used by itself as a boolean indicating the presence of a row. So a more complete example might be
if C then
call printf("x:%d y:%d\n", ifnull(x, 0), ifnull(y,0));
else
call printf("nada\n");
end if
And even more generally
loop fetch C into x, y
begin
call printf("x:%d y:%d\n", ifnull(x, 0), ifnull(y,0));
end;
The last example above reads all the rows and prints them.
Now if the table xy_table
had instead had dozens of columns, those declarations would be very verbose and error prone, and frankly annoying, especially if the table definition was changing over time.
To make this a little easier, there are so-called ‘automatic’ cursors. These happen implicitly and include all the necessary storage to exactly match the rows in their statement. Using the automatic syntax for the above looks like so:
declare C cursor for select * from xy_table;
fetch C;if C then
call printf("x:%d y:%d\n", ifnull(C.x, 0), ifnull(C.y,0));
end if;
or the equivalent loop form:
declare C cursor for select * from xy_table;
loop fetch C
begin
call printf("x:%d y:%d\n", ifnull(C.x, 0), ifnull(C.y,0));
end;
All the necessary local state is automatically created, hence “automatic” cursor. This pattern is generally preferred, but the loose variables pattern is in some sense more general.
In all the cases if the number or type of variables do not match the select statement, semantic errors are produced.
The purpose of value cursors is to make it possible for a stored procedure to work with structures as a unit rather than only field by field. SQL doesn’t have the notion of structure types, but structures actually appear pretty directly in many places. Generally we call these things “Shapes” and there are a variety of source for shapes including:
SELECT
statementLet’s first start with how you declare a value cursor. It is providing one of the shape sources above.
So:
declare C cursor like xy_table;
declare C cursor like select 1 a, 'x' b;
declare C cursor like (a integer not null, b text not null);
declare C cursor like my_view;
declare C cursor like my_other_cursor;
declare C cursor like my_previously_declared_stored_proc;
declare C cursor like my_previously_declared_stored_proc arguments;
Any of those forms define a valid set of columns – a shape. Note that the select
example in no way causes the query provided to run. Instead, the select statement is analyzed and the column names and types are computed. The cursor gets the same field names and types. Nothing happens at run time.
The last two examples assume that there is a stored procedure defined somewhere earlier in the same translation unit and that the procedure returns a result set or has arguments, respectively.
In all cases the cursor declaration makes a cursor that could hold the indicated result. That result can then be loaded with FETCH
or emitted with OUT
or OUT UNION
which will be discussed below.
Once we have declared a value cursor we can load it with values using FETCH
in its value form. Here are some examples:
Fetch from compatible values:
from values(1,2); fetch C
Fetch from a call to a procedure that returns a single row:
from call my_previously_declared_stored_proc(); fetch C
Fetch from another cursor:
from D; fetch C
In this last case if D is a statement cursor it must also be “automatic” (i.e. it has the storage). This form lets you copy a row and save it for later. For instance, in a loop you could copy the current max-value row into a value cursor and use it after the loop, like so:
declare C cursor for select * from somewhere;
declare D cursor like C;
loop fetch C
begin
if (not D or D.something < C.something) then
from C;
fetch D end if;
end;
After the loop, D either empty because there were no rows (thus if D
would fail) or else it has the row with the maximum value of something
, whatever that is.
Value cursors are always have their own storage, so you could say all value cursors are “automatic”.
And as we saw above, value cursors may or may not be holding a row.
declare C cursor like xy_table;
if not C then
call printf("this will always print because C starts empty\n");
end if;
When you call a procedure you may or may not get a row as we’ll see below.
The third type of cursor is a “result set” cursor but that won’t make any sense until we’ve discussed result sets a little which requires OUT
and/or OUT UNION
and so we’ll go on to those statements next. As it happens, we are recapitulating the history of cursor features in the CQL language by exploring the system in this way.
This form allows any kind of declaration, for instance:
declare C cursor like ( id integer not null, val real, flag boolean );
This wouldn’t really give us much more than the other forms, however typed name lists can include LIKE in them again, as part of the list. Which means you can do this kind of thing:
declare C cursor like (like D, extra1 real, extra2 bool)
You could then load that cursor like so:
fetch C from values (from D, 2.5, false);
and now you have D plus 2 more fields which maybe you want to output.
Importantly this way of doing it means that C always includes D, even if D changes over time. As long as the extra1
and extra2
fields don’t conflict names it will always work.
Value cursors were initially designed to create a convenient way for a procedure to return a single row from a complex query without having a crazy number of OUT
parameters. It’s easiest to illustrate this with an example.
Suppose you want to return several variables, the “classic” way to do so would be a procedure like this:
create proc get_a_row(
integer not null,
id_ out got_row bool not null,
out w integer not null,
out x integer,
out y text not null,
out z real)
begin
declare C cursor for
select w, x, y, z from somewhere where id = id_;
into w, x, y, z;
fetch C set got_row := C;
end;
This is already verbose, but you can imagine the situation gets very annoying if get_a_row
has to produce a couple dozen column values. And of course you have to get the types exactly right. And they might evolve over time. Joy.
On the receiving side you get to do something just as annoying:
declare w integer not null
declare x integer;
declare y text;
declare z real;
declare got_row bool not null;
call get_a_row(id, got_row, w, x, y, z);
Using the out
statement we get the equivalent functionality with a much simplified pattern. It looks like this:
create proc get_a_row(id_ integer not null)
begin
declare C cursor for
select w, x, y, z from somewhere where id = id_;
fetch C;out C;
end;
To use the new procedure you simply do this:
declare C cursor like get_a_row;
from call get_a_row(id); fetch C
In fact, originally you did the two steps above in one statement and that was the only way to load a value cursor. Later, the calculus was generalized. The original form still works:
declare C cursor fetch from call get_a_row(id);
The OUT
statement lets you return a single row economically and lets you then test if there actually was a row and if so, read the columns. It infers all the various column names and types so it is resilient to schema change and generally a lot less error prone than having a large number of out
arguments to your procedure.
Once you have the result in a value cursor you can do the usual cursor operations to move it around or otherwise work with it.
The use of the LIKE
keyword to refer to groups of columns spread to other places in CQL as a very useful construct, but it began here with the need to describe a cursor shape economically, by reference.
The semantics of the OUT
statement are that it always produces one row of output (a procedure can produce no row if an out
never actually rans but the procedure does use OUT
).
If an OUT
statement runs more than once, the most recent row becomes the result. So the OUT
statement really does mirror having one out
variable for each output column. This was its intent and procedures that return at most, or exactly, one row are very common so it works well enough.
However, in general, one row results do not suffice; you might want to produce a result set from various sources, possibly with some computation as part of the row creation process. To make general results, you need to be able to emit multiple rows from a computed source. This is exactly what OUT UNION
provides.
Here’s a (somewhat contrived) example of the kind of thing you can do with this form:
create proc foo(n integer not null)
begin
declare C cursor like select 1 value;
:= 0;
let i while i < n
begin
-- emit one row for every integer
from values(i);
fetch C out union C;
set i := i + 1;
end;
end;
In foo
above, we make an entire result set out of thin air. It isn’t a very interesting result, but of course any computation would have been possible.
This pattern is very flexible as we see below in bar
where we merge two different data streams.
create table t1(id integer, stuff text, [other things too]);
create table t2(id integer, stuff text, [other things too]);
create proc bar()
begin
declare C cursor for select * from t1 order by id;
declare D cursor for select * from t2 order by id;
fetch C;
fetch D;
-- we're going to merge these two queries
while C or D
begin
-- if both have a row pick the smaller id
if C and D then
if C.id < D.id then
out union C;
fetch C;else
out union D;
fetch D;end if;
else if C then
-- only C has a row, emit that
out union C;
fetch C;else
-- only D has a row, emit that
out union D;
fetch D;end if;
end;
end;
Just like foo
, in bar
, each time OUT UNION
runs a new row is accumulated.
Now, if you build a procedure that ends with a SELECT
statement CQL automatically creates a fetcher function that does something like an OUT UNION
loop – it loops over the SQLite statement for the SELECT
and fetches each row, materializing a result.
With OUT UNION
you take manual control of this process, allowing you to build arbitrary result sets. Note that either of C
or D
above could have been modified, replaced, skipped, normalized, etc. with any kind of computation. Even entirely synthetic rows can be computed and inserted into the output as we saw in foo
.
Now that we have OUT UNION
it makes sense to talk about the final type of cursor.
OUT UNION
makes it possible to create arbitrary result sets using a mix of sources and filtering. Unfortunately this result type is not a simple row, nor is it a SQLite statement. This meant that neither of the existing types of cursors could hold the result of a procedure that used OUT UNION
. – CQL could not itself consume its own results.
To address this hole, we need an additional cursor type. The syntax is exactly the same as the statement cursor cases described above but, instead of holding a SQLite statement, the cursor holds a result set pointer and the current and maximum row numbers. Stepping through the cursor simply increments the row number and fetches the next row out of the rowset instead of from SQLite.
Example:
-- reading the above
create proc reader()
begin
declare C cursor for call bar();
loop fetch C
begin
call printf("%d %s\n", C.id, C.stuff); -- or whatever fields you need
end;
end;
If bar
had been created with a SELECT
, UNION ALL
, and ORDER BY
to merge the results, the above would have worked with C
being a standard statement cursor, iterating over the union. Since foo
produces a result set, CQL transparently produces a suitable cursor implementation behind the scenes, but otherwise the usage is the same.
Note this is a lousy way to simply iterate over rows; you have to materialize the entire result set so that you can just step over it. Re-consuming like this is not recommended at all for production code, but it is ideal for testing result sets that were made with OUT UNION
which otherwise would require C/C++ to test. Testing CQL with CQL is generally a lot easier.
LIKE
formsThere are lots of cases where you have big rows with many columns, and there are various manipulations you might need to do.
What follows is a set of useful syntactic sugar constructs that simplify handling complex rows. The idea is that pretty much anywhere you can specify a list of columns you can instead use the LIKE x
construct to get the columns as they appear in the shape x
– which is usually a table or a cursor.
It’s a lot easier to illustrate with examples, even though these are, again, a bit contrived.
First we need some table with lots of columns – usually the column names are much bigger which makes it all the more important to not have to type them over and over, but in the interest of some brevity, here is a big table:
create table big (
id integer primary key,
integer unique,
id2 integer,
a integer,
b integer,
c integer,
d integer,
e integer
f );
This example showcases several of the cursor and shape slicing features by emitting two related rows:
create proc foo(id_ integer not null)
begin
-- this is the shape of the result we want -- it's some of the columns of "big"
-- note this query doesn't run, we just use its shape to create a cursor
-- with those columns.
declare result cursor like select id, b, c, d from big;
-- fetch the main row, specified by id_
-- main row has all the fields, including id2
declare main_row cursor for select * from big where id = id_;
fetch main_row;
-- now fetch the result columns out of the main row
-- `like result` here means to use the names of the result cursor
-- to index into the columns of the main_row cursor, and then
-- and store them in `result`
from cursor main_row(like result);
fetch result
-- this is our first result row
out union result;
-- now we want the related row, but we only need two columns
-- from the related row, 'b' and 'c'
declare alt_row cursor for select b, c from big where big.id2 = main_row.id2;
fetch alt_row;
-- update some of the fields in 'result' from the `alt_row`
update cursor result(like alt_row) from cursor alt_row;
-- and emit the modified result, so we've generated two rows
out union result;
end;
Now let’s briefly discuss what is above. The two essential parts are:
fetch result from cursor main_row(like result);
and
update cursor result(like alt_row) from cursor alt_row;
In the first case what we’re saying is that we want to load the columns of result
from main_row
but we only want to take the columns that are actually present in result
. So this is a narrowing of a wide row into a smaller row. In this case, the smaller row, result
, is what we want to emit. We needed the other columns to compute alt_row
.
The second case, what we’re saying is that we want to update result
by replacing the columns found in alt_row
with the values in alt_row
. So in this case we’re writing a smaller cursor into part of a wider cursor. Note that we used the update cursor
form here because it preserves all other columns. If we used fetch
we would be rewriting the entire row contents, using NULL
if necessary, and that is not desired here.
Here is the rewritten version of the above procedure; this is what ultimately gets compiled into C.
CREATE PROC foo (id_ INTEGER NOT NULL)
BEGIN
DECLARE result CURSOR LIKE SELECT id, b, c, d FROM big;
DECLARE main_row CURSOR FOR SELECT * FROM big WHERE id = id_;
FETCH main_row;
id, b, c, d)
FETCH result(FROM VALUES(main_row.id, main_row.b, main_row.c, main_row.d);
OUT UNION result;
DECLARE alt_row CURSOR FOR SELECT b, c FROM big WHERE big.id2 = main_row.id2;
FETCH alt_row;
UPDATE CURSOR result(b, c) FROM VALUES(alt_row.b, alt_row.c);
OUT UNION result;
END;
Of course you could have typed the above directly but if there are 50 odd columns it gets old fast and is very error prone. The sugar form is going to be 100% correct and will require much less typing and maintenance.
Finally, while I’ve shown both LIKE
forms separately, they can also be used together. For instance:
update cursor C(like X) from cursor D(like X);
The above would mean, “move the columns that are found in X
from cursor D
to cursor C
”, presuming X
has columns common to both.
Many of the examples used the FETCH
statement in a sort of demonstrative way that is hopefully self-evident but the statement has many forms and so it’s worth going over them specifically. Below we’ll use the letters C
and D
for the names of cursors. Usually C
;
A cursor declared in one of these forms:
declare C cursor for select * from foo;
declare C cursor for call foo();
(foo might end with a select
or use out union
)is either a statement cursor or a result set cursor. In either case the cursor moves through the results. You load the next row with:
FETCH C
, orFETCH C into x, y, z;
In the first form C
is said to be automatic in that it automatically declares the storage needed to hold all its columns. As mentioned above, automatic cursors have storage for their row.
Having done this fetch you can use C as a scalar variable to see if it holds a row, e.g.
declare C cursor for select * from foo limit 1;
fetch C;if C then
-- bingo we have a row
call printf("%s\n", C.whatever);
end if
You can easily iterate, e.g.
declare C cursor for select * from foo;
loop fetch C
begin
-- one time for every row
call printf("%s\n", C.whatever);
end;
Automatic cursors are so much easier to use than explicit storage that explicit storage is rarely seen. Storing to out
parameters is one case where explicit storage actually is the right choice, as the out
parameters have to be declared anyway.
A value cursor is declared in one of these ways:
declare C cursor fetch from call foo(args)
foo
must be a procedure that returns one row with OUT
declare C cursor like select 1 id, "x" name;
declare C cursor like X;
A value cursor is always automatic; it’s purpose is to hold a row. It doesn’t iterate over anything but it can be re-loaded in a loop.
fetch C
or fetch C into ...
is not valid on such a cursor, because it doesn’t have a source to step through.The canonical way to load such a cursor is:
fetch C from call foo(args);
foo
must be a procedure that returns one row with OUT
fetch C(a,b,c...) from values(x, y, z);
The first form is in some sense the origin of the value cursor. Value cursors were added to the language initially to provide a way to capture the single row OUT
statement results, much like result set cursors were added to capture procedure results from OUT UNION
. In the first form, the cursor storage (a C struct) is provided by reference as a hidden out parameter to the procedure and the procedure fills it in. The procedure may or may not use the OUT
statement in its control flow, as the cursor might not hold a row. You can use if C then ...
as before to test for a row.
The second form is more interesting as it allows the cursor to be loaded from arbitrary expressions subject to some rules:
NOT NULL
columns will get a valueWith this form, any possible valid cursor values could be set, but many forms of updates that are common would be awkward. So there are various forms of syntactic sugar that are automatically rewritten into the canonical form. See the examples below:
fetch C from values(x, y, z)
fetch C from arguments
C
was also rewritten into C(a,b,c,..)
etc.fetch C from arguments like C
like C
would generate)C
has columns a
and b
then there must exist formals named a
or a_
and b
or b_
, in any positionfetch C(a,b) from cursor D(a,b)
fetch C(a,b) from values(D.a, D.b);
That most recent form doesn’t seem like it saves much, but recall the first rewrite:
fetch C from cursor D
fetch C from D
can be used if the cursors have the exact same column names and types; it also generates slightly better code and is a common caseIt is very normal to want to use only some of the columns of a cursor; these LIKE
forms do that job. We saw some of these forms in an earlier example.
fetch C from cursor D(like C)
D
is presumed to be “bigger” than C
, in that it has all of the C
columns and maybe more. The like C
expands into the names of the C
columns so C
is loaded from the C
part of D
fetch C(a, b, g) from values (D.a, D.b, D.g)
D
might have had fields c, d, e, f
which were not used because they are not in C
.The symmetric operation, loading some of the columns of a wider cursor can be expressed neatly:
fetch C(like D) from cursor D
like D
expands into the columns of D
causing the cursor to be loaded with what’s in D
and NULL
(if needed)fetch C(x, y) from values(D.x, D.y)
LIKE
can be used in both places, for instance suppose E
is a shape that has a subset of the rows of both C
and D
. You can write a form like this:
fetch C(like E) from cursor D(like E)
E
and copy them from D to C.As is mentioned above, the fetch
form means “load an entire row into the cursor”. This is important because “half loaded” cursors would be semantically problematic. However there are many cases where you might like to amend the values of an already loaded cursor. You can do this with the update
form.
update cursor C(a,b,..) from values(1,2,..);
update cursor C(like D) from D
is possible; it is in fact the use-case for which this was designed.It’s often desirable to treat bundles of arguments as a unit, or cursors as a unit, especially when calling other procedures. The shape patterns above are very helpful for moving data between cursors, and the database. These can be rounded out with similar constructs for procedure definitions and procedure calls as follows.
First we’ll define some shapes to use in the examples. Note that we made U
using T
.
create table T(x integer not null, y integer not null, z integer not null);
create table U(like T, a integer not null, b integer not null);
We haven’t mentioned this before but the implication of the above is that you can use the LIKE
construct inside a table definition to add columns from a shape.
We can also use the LIKE
construct to create procedure arguments. To avoid conflicts with column names, when used this way the procedure arguments all get a trailing underscore appended to them. The arguments will be x_
, y_
, and z_
as we can see if the following:
create proc p1(like T)
begin
call printf("%d %d %d\n", x_, y_, z_);
end;
Shapes can also be used in a procedure call, as showed below. This next example is obviously contrived, but of course it generalizes. It is exactly equivalent to the above.
create proc p2(like T)
begin
call printf("%d %d %d\n", from arguments);
end;
Now we might want to chain these things together. This next example uses a cursor to call p1
.
create proc q1()
begin
declare C cursor for select * from T;
loop fetch C
begin
/* this is the same as call p(C.x, C.y, C.z) */
call p1(from C);
end;
end;
The LIKE
construct allows you to select some of the arguments, or some of a cursor to use as arguments. This next procedure has more arguments than just T
. The arguments will be x_
, y_
, z_
, a_
, b_
. But the call will still have the T
arguments x_
, y_
, and z_
.
create proc q2(like U)
begin
/* just the args that match T: so this is still call p(x_, y_, z_) */
call p1(from arguments like T);
end;
Or similarly, using a cursor.
create proc q3(like U)
begin
declare C cursor for select * from U;
loop fetch C
begin
/* just the columns that match T so this is still call p(C.x, C.y, C.z) */
call p1(from C like T);
end;
end;
Note that the from
argument forms do not have to be all the arguments. For instance you can get columns from two cursors like so:
call something(from C, from D)
All the varieties can be combined but of course the procedure signature must match. And all these forms work in function expressions as well as procedure calls.
e.g.
set x := a_function(from C);
Since these forms are simply syntatic sugar, they can also appear inside of function calls that are in SQL statements. The variables mentioned will be expanded and become bound variables just like any other variable that appears in a SQL statement.
Note the form x IN (from arguments)
is not supported at this time, though this would be a relatively easy addition.
There are many cases where stored procedures require complex arguments using data shapes that come from the schema, or from other procedures. As we have seen the LIKE
construct for arguments can help with this, but it has some limitations. Let’s consider a specific example to study:
create table Person (
id text primary key,
not null,
name text not null,
address text real
birthday );
To manage this table we might need something like this:
create proc insert_person(like Person)
begin
insert into Person from arguments;
end;
As we have seen, the above expands into:
create proc insert_person(
not null,
id_ text not null,
name_ text not null,
address_ text real)
birthday_ begin
insert into Person(id, name, address, birthday)
values(id_, name_, address_, birthday_);
end;
It’s clear that the sugared version is a lot easier to reason about than the fully expanded version, and much less prone to errors as well.
This much is already helpful, but just those forms aren’t general enough to handle the usual mix of situations. For instance, what if we need a procedure that works with two people? A hypothetical insert_two_people
procedure cannot be written with the forms we have so far.
To generalize this the language adds the notion of named argument bundles. The idea here is to name the bundles which provides a useful scoping. Example:
create proc insert_two_people(p1 like Person, p2 like Person)
begin
-- using a procedure that takes a Person args
call insert_person(from p1);
call insert_person(from p2);
end;
or alternatively
create proc insert_two_people(p1 like Person, p2 like Person)
begin
-- inserting a Person directly
insert into Person from p1;
insert into Person from p2;
end;
The above expands into:
create proc insert_two_people(
not null,
p1_id text not null,
p1_name text not null,
p1_address text real,
p1_birthday not null,
p2_id text not null,
p2_name text not null,
p2_address text real)
p2_birthday begin
insert into Person(id, name, address, birthday)
values(p1_id, p1_name, p1_address, p1_birthday);
insert into Person(id, name, address, birthday)
values(p2_id, p2_name, p2_address, p2_birthday);
end;
Or course different named bundles can have different types – you can create and name shapes of your choice. The language allows you to use an argument bundle name in all the places that a cursor was previously a valid source. That includes insert
, fetch
, update cursor
, and procedure calls. You can refer to the arguments by their expanded name such as p1_address
or alternatively p1.address
– they mean the same thing.
Here’s another example showing a silly but illustrative thing you could do:
create proc insert_lotsa_people(P like Person)
begin
-- make a cursor to hold the arguments
declare C cursor like P;
-- convert arguments to a cursor
from P;
fetch C
-- set up to patch the cursor and use it 20 times
:= 0;
let i while i < 20
begin
update cursor C(id) from values(printf("id_%d", i));
insert into Person from C;
set i := i + 1;
end;
end;
The above shows that you can use a bundle as the source of a shape, and you can use a bundle as a source of data to load a cursor. After which you can do all the usual value cursor things. Of course in this case the value cursor was redundant, we could just as easily have done something like this:
set P_id := printf("id_%d", i);
insert into Person from P;
set i := i + 1;
NOTE: the CQL JSON output includes extra information about procedure arguments if they originated as part of a shape bundle do identify the shape source for tools that might need that information.
The select list of a SELECT
statement already has complex syntax and functionality, but it is a very interesting place to use shapes. To make it possible to use shape notations and not confuse that notation with standard SQL the @COLUMNS
construct was added to the language. This allows for a sugared syntax for extracting columns in bulk.
The @COLUMNS
clause is like of a generalization of the select T.*
with shape slicing and type-checking. The forms are discussed below:
This is the simplest form, it’s just like T.*
:
-- same as A.*
select @columns(A) from ...;
-- same as A.*, B.*
select @columns(A, B) from ...;
This allows you to choose some of the columns of one table of the FROM clause.
-- the columns of A that match the shape Foo
select @columns(A like Foo) from ...;
-- get the Foo shape from A and the Bar shape from B
select @columns(A like Foo, B like Bar) from ...;
Here we do not specify a particular table that contains the columns, they could come from any of the tables in the FROM clause.
--- get the Foo shape from anywhere in the join
select @columns(like Foo) from ...;
-- get the Foo and Bar shapes, from anywhere in the join
select @columns(like Foo, like Bar) from ...;
This pattern can be helpful for getting part of a shape.
-- get the a and b from the Foo shape only
select @columns(like Foo(a,b))
This pattern is great for getting almost all of a shape (e.g. all but the pk).
-- get the Foo shape except the a and b columns
select @columns(like Foo(-a, -b))
This form allows you to slice out a few columns without defining a shape, you simply list the exact columns you want.
-- T1.x and T2.y plus the Foo shape
select @columns(T1.x, T2.y, like Foo) from ...;
Its often the case that there are duplicate column names in the FROM
clause. For instance, you could join A
to B
with both having a column pk
. The final result set can only have one column named pk
, the distinct clause helps you to get distinct column names. In this context distinct
is about column names, not values.
-- removes duplicate column names
-- e.g. there will be one copy of 'pk'
select @columns(distinct A, B) from A join B using(pk);
-- if both Foo and Bar have an (e.g.) 'id' field you only get one copy
select @columns(distinct like Foo, like Bar) from ...;
If a specific column is mentioned it is always included, but later expressions that are not a specific column will avoid that column name.
-- if F or B has an x it won't appear again, just T.x
select @columns(distinct T.x, F like Foo, B like Bar) from F, B ..;
Of course this is all just sugar, so it all compiles to a column list with table qualifications – but the syntax is very powerful. You can easily narrow a wide table, or fuse joins that share common keys without creating conflicts.
-- just the Foo columns
select @columns(like Foo) from Superset_Of_Foo_From_Many_Joins_Even;
-- only one copy of 'pk'
select @columns(distinct A,B,C) from
A join B using (pk) join C using (pk);
And of course you can define shapes however you like and then use them to slice off column chucks of your choice. There are many ways to build up shapes from other shapes. For instance, you can declare procedures that return the shape you want and never actually create the procedure – a pattern is very much like a shape “typedef”. E.g.
declare proc shape1() (x integer, y real, z text);
declare proc shape2() (like shape1, u bool, v bool);
With this combination you can easily define common column shapes and slice them out of complex queries without having to type the columns names over and over.
What follows are the rules for columns that are missing in an INSERT
, or FETCH
statement. As with many of the other things discussed here, the forms result in automatic rewriting of the code to include the specified dummy data. So SQLite will never see these forms.
Two things to note: First, the dummy data options described below are really only interesting in test code, it’s hard to imagine them being useful in production code. Second, none of what follows applies to the update cursor
statement because its purpose is to do partial updates on exactly the specified columns and we’re about to talk about what happens with the columns that were not specified.
When fetching a row all the columns must come from somewhere; if the column is mentioned or mentioned by rewrite then it must have a value mentioned, or a value mentioned by rewrite. For columns that are not mentioned, a NULL value is used if it is legal to do so. For example, fetch C(a) from values(1)
might turn into fetch C(a,b,c,d) from values (1, NULL, NULL, NULL)
In addition to the automatic NULL you may add the annotation @dummy_seed([long integer expression])
. If this annotation is present then:
This construct is hugely powerful in a loop to create many complete rows with very little effort, even if the schema change over time.
declare i integer not null;
declare C like my_table;
set i := 0;
while (i < 20)
begin
-- the id is fully specified the rest come from the dummy
id) from values(i+10000) @dummy_seed(i);
fetch C(insert into my_table from cursor C;
end;
Now in this example we don’t need to know anything about my_table
other than that it has a column named id
. This example shows several things:
insert
statement as well as fetch
insert into my_table(id) values(i+10000) @dummy_seed(i)
would have worked too with no cursor at all
The @dummy_seed
form can be modified with @dummy_nullables
, this indicates that rather than using NULL for any nullable value that is missing, CQL should use the seed value. This overrides the default behavior of using NULL where columns are needed. Note the NULL filling works a little differently on insert statements. Since SQLite will provide a NULL if one is legal the column doesn’t have to be added to the list with a NULL value during rewriting, it can simply be omitted, making the statement smaller.
Finally for insert
statement only, SQLite will normally use the default value of a column if it has one, so there is no need to add missing columns with default values to the insert statement. However if you specify @dummy_defaults
then columns with a default value will instead be rewritten and they will get _seed_
as their value.
Some examples. Suppose columns a
, b
, c
are not null; m
, n
are nullable; and x
, y
have defaults.
-- as written
insert into my_table(a) values(7) @dummy_seed(1000)
-- rewrites to
insert into my_table(a, b, c) values(7, 1000, 1000);
-- as written
insert into my_table(a) values(7) @dummy_seed(1000) @dummy_nullables
-- rewrites to
insert into my_table(a, b, c, m, n) values(7, 1000, 1000, 1000, 1000);
-- as written
insert into my_table(a) values(7) @dummy_seed(1000) @dummy_nullables @dummy_defaults
-- rewrites to
insert into my_table(a, b, c, m, n, x, y) values(7, 1000, 1000, 1000, 1000, 1000, 1000);
The sugar features on fetch
, insert
, and update cursor
are as symmetric as possible, but again, dummy data is generally only interesting in test code. Dummy data will continue to give you valid test rows even if columns are added or removed from the tables in question so it’s a nice toil saver.
Generalized Cursor Lifetime refers to capturing a Statement Cursor in an object so that it can used more flexibly. Wrapping something in an object is often called “boxing”. Since Generalized Cursor Lifetime is a mouthful we’ll refer to it as “boxing” from here forward. The symmetric operation “unboxing” refers to converting the boxed object back into a cursor.
The normal cursor usage pattern is by far the most common, a cursor is created directly with something like these forms:
declare C cursor for select * from shape_source;
declare D cursor for call proc_that_returns_a_shape();
At this point the cursor can be used normally as follows:
loop fetch C
begin
-- do stuff with C
end;
Those are the usual patterns and they allow statement cursors to be consumed sort of “up” the call chain from where the cursor was created. But what if you want some worker procedures that consume a cursor? There is no way to pass your cursor down again with these normal patterns alone.
To generalize the patterns, allowing, for instance, a cursor to be returned as an out parameter or accepted as an in parameter you first need to declare an object variable that can hold the cursor and has a type indicating the shape of the cursor.
To make an object that can hold a cursor:
declare obj object<T cursor>;
Where T
is the name of a shape. It can be a table name, or a view name, or it can be the name of the canonical procedure that returns the result. T should be some kind of global name, something that could be accessed with #include
in various places. Referring to the examples above, choices for T
might be shape_source
the table or proc_that_returns_a_shape
the procedure.
NOTE: A key purpose of the
interface
keyword is to create shapes. e.g.
id integer not null, name text); interface my_shape (
The declared interface my_shape
can be used to help define columns, arguments, etc. Anywhere a shape goes.
To create the boxed cursor, first declare the object variable that will hold it and then set the object from the cursor. Note that in the following example the cursor C
must have the shape defined by my_shape
or an error is produced. The type of the object is crucial because, as we’ll see, during unboxing that type defines the shape of the unboxed cursor.
-- recap: declare the box that holds the cursor (T changed to my_shape for this example)
declare box_obj object<my_shape cursor>;
-- box the cursor into the object (the cursor shape must match the box shape)
set box_obj from cursor C;
The variable box_obj
can now be passed around as usual. It could be stored in a suitable out
variable or it could be passed to a procedure as an in
parameter. Then, later, you can “unbox” box_obj
to get a cursor back. Like so:
-- unboxing a cursor from an object, the type of box_obj defines the type of the created cursor
declare D cursor for box_obj;
These primitives allow cursors to be passed around with general purpose lifetime.
Example:
-- consumes a cursor
create proc cursor_user(box_obj object<my_shape cursor>)
begin
-- the cursors shape will be my_shape, matching box_obj
cursor C for box_obj;
loop fetch C
begin
-- do something with C
end;
end;
-- captures a cursor and passes it on
create proc cursor_boxer()
begin
declare C cursor for select * from something_like_my_shape;
declare box_obj object<my_shape cursor>
set box from cursor C; -- produces error if shape doesn't match
call cursor_user(box_obj);
end;
Importantly, once you box a cursor the underlying SQLite statement’s lifetime is managed by the box object with normal retain/release semantics. The box and underlying statement can be released simply by setting all references to it to null as usual.
With this pattern it’s possible to, for instance, create a cursor, box it, consume some of the rows in one procedure, do some other stuff, and then consume the rest of the rows in another different procedure.
Important Notes:
sqlite3_stmt *
for a long time with all the same problems, because that is exactly is happeningCQL generally doesn’t see the whole world in one compilation. In this way it’s a lot more like, say, the C compiler than it is like, say, Java or C# or something like that. This means several things:
To make this possible there are a few interesting features
Stored procedures defined in another file can be declared to CQL in various ways for each major type of stored procedure. These are covered in the sections below.
DECLARE PROCEDURE foo(id integer, out name text not null);
This introduces the symbol name without providing the body. This has important variations.
DECLARE PROCEDURE foo(id integer, out name text not null) USING TRANSACTION;
Most procedures you write will use SQLite in some fashion, maybe a select
or something. The USING TRANSACTION
annotation indicates that the proc in question uses the database and therefore the generated code will need a database connection in-argument and it will return a SQLite error code.
If the procedure in question is going to use select
or call
to create a result set, the type of that result set has to be declared. An example might look like this:
DECLARE PROC with_result_set () (id INTEGER NOT NULL,
name TEXT,LONG INTEGER,
rate type INTEGER,
size REAL);
This says that the procedure takes no arguments (other than the implicit database connection) and it has an implicit out-argument that can be read to get a result set with the indicated columns: id, name, rate, type, and size. This form implies USING TRANSACTION
.
If the procedure emits a cursor with the OUT
statement to produce a single row then it can be declared as follows:
DECLARE PROC with_result_set () OUT (id INTEGER NOT NULL,
name TEXT,LONG INTEGER,
rate type INTEGER,
size REAL);
This form can have USING TRANSACTION
or not, since it is possible to emit a row with a value cursor and never use the database. See the previous chapter for details on the OUT
statement.
If the procedure emits many rows with the OUT UNION
statement to produce a full result set then it can be declared as follows:
DECLARE PROC with_result_set () OUT UNION (id INTEGER NOT NULL,
name TEXT,LONG INTEGER,
rate type INTEGER,
size REAL);
This form can have USING TRANSACTION
or not, since it is possible to emit a rows with a value cursor and never use the database. See the previous chapter for details on the OUT UNION
statement.
To avoid errors, the declarations for any given file can be automatically created by adding something like --generate_exports
to the command line. This will require an additonal file name to be passed in the --cg
portion to capture the exports.
That file can then be used with #include
when you combine the C pre-processor with CQL as is normally done.
Nomenclature is perhaps a bit weird here. You use --generate_exports
to export the stored procedure declarations from the translation units. Of course those exported symbols are what you then import in some other module. Sometimes this output file is called foo_imports.sql
because those exports are of course exactly what you need to import foo
. You can use whatever convention you like of course, CQL doesn’t care. The full command line might look something like this:
cql --in foo.sql --cg foo.h foo.c foo_imports.sql --generate_exports
Using the pre-processor you can get declarations from elsewhere with a directive like this:
#include "foo_imports.sql"
Here are some more examples directly from the CQL test cases; these are all auto-generated with --generate_exports
.
DECLARE PROC test (i INTEGER NOT NULL);
DECLARE PROC out_test (OUT i INTEGER NOT NULL, OUT ii INTEGER);
DECLARE PROC outparm_test (OUT foo INTEGER NOT NULL) USING TRANSACTION;
DECLARE PROC select_from_view () (id INTEGER NOT NULL, type INTEGER);
DECLARE PROC make_view () USING TRANSACTION;
DECLARE PROC copy_int (a INTEGER, OUT b INTEGER);
DECLARE PROC complex_return ()
NOT NULL,
(_bool BOOL INTEGER NOT NULL,
_integer LONG INTEGER NOT NULL,
_longint REAL NOT NULL,
_real NOT NULL,
_text TEXT
_nullable_bool BOOL);
DECLARE PROC outint_nullable (
OUT output INTEGER,
OUT result BOOL NOT NULL)
USING TRANSACTION;
DECLARE PROC outint_notnull (
OUT output INTEGER NOT NULL,
OUT result BOOL NOT NULL)
USING TRANSACTION;
DECLARE PROC obj_proc (OUT an_object OBJECT);
DECLARE PROC insert_values (
INTEGER NOT NULL,
id_ INTEGER)
type_ USING TRANSACTION;
So far we’ve avoided discussing the generated C code in any details but here it seems helpful to show exactly what these declarations correspond to in the generated C to demystify all this. There is a very straightforward conversion.
void test(cql_int32 i);
void out_test(
*_Nonnull i,
cql_int32 *_Nonnull ii);
cql_nullable_int32
(
cql_code outparm_test*_Nonnull _db_,
sqlite3 *_Nonnull foo);
cql_int32
(
cql_code select_from_view_fetch_results*_Nonnull _db_,
sqlite3 *_Nonnull result_set);
select_from_view_result_set_ref _Nullable
(sqlite3 *_Nonnull _db_);
cql_code make_view
void copy_int(cql_nullable_int32 a, cql_nullable_int32 *_Nonnull b);
(
cql_code complex_return_fetch_results*_Nonnull _db_,
sqlite3 *_Nonnull result_set);
complex_return_result_set_ref _Nullable
(
cql_code outint_nullable*_Nonnull _db_,
sqlite3 *_Nonnull output,
cql_nullable_int32 *_Nonnull result);
cql_bool
(
cql_code outint_notnull*_Nonnull _db_,
sqlite3 *_Nonnull output,
cql_int32 *_Nonnull result);
cql_bool
void obj_proc(
*_Nonnull an_object);
cql_object_ref _Nullable
(
cql_code insert_values*_Nonnull _db_,
sqlite3 ,
cql_int32 id_); cql_nullable_int32 type_
As you can see, these declarations use exactly the normal SQLite types and so it is very easy to declare a procedure in CQL and then implement it yourself in straight C, simply by conforming to the contract.
Importantly, SQLite does not know anything about CQL stored procedures, or anything at all about CQL really so CQL stored procedure names cannot be used in any way in SQL statements. CQL control flow like the call
statement can be used to invoke other procedures and results can be captured by combing the OUT
statement and a DECLARE CURSOR
construct but SQLite is not involved in those things. This is another place where the inherent two-headed nature of CQL leaks out.
Finally, this is a good place to reinforce that procedures with any of the structured result types (select
, out
, out union
) can be used with a suitable cursor.
create procedure get_stuff()
begin
select * from stuff;
end;
Can be used in two interesting ways:
create procedure meta_stuff(meta bool)
begin
if meta then
call get_stuff();
else
call get_other_stuff();
end if;
end;
Assuming that get_stuff
and get_other_stuff
have the same shape, then this procedure simply passes on one or the other’s result set unmodified as its own return value.
But you could do more than simply pass on the result.
create procedure meta_stuff(meta bool)
begin
declare C cursor for call get_stuff(); -- or get_meta_stuff(...)
loop fetch C
begin
-- do stuff with C
-- may be out union some of the rows after adjustment even
end;
end;
Here we can see that we used the procedure to get the results and then process them directly somehow.
And of course the result of an OUT
can similarly be processed using a value cursor, as previously seen.
Most of this tutorial is about the CQL language itself but here we must diverge a bit. The purpose of the result set feature of CQL is to create a C interface to SQLite data. Because of this there are a lot of essential details that require looking carefully at the generated C code. Appendix 2 covers this code in even more detail but here it makes sense to at least talk about the interface.
Let’s say we have this simple stored procedure:
create table foo(id integer not null, b bool, t text);
create proc read_foo(id_ integer not null)
begin
select * from foo where id = id_;
end;
We’ve created a simple data reader: this CQL code will cause the compiler to generate helper functions to read the data and materialize a result set.
Let’s look at the public interface of that result set now considering the most essential pieces.
/* this is almost everything in the generated header file */
#define read_foo_data_types_count 3
(
cql_result_set_type_decl, \
read_foo_result_set);
read_foo_result_set_ref
extern cql_int32 read_foo_get_id(read_foo_result_set_ref
, cql_int32 row);
_Nonnull result_setextern cql_bool read_foo_get_b_is_null(read_foo_result_set_ref
, cql_int32 row);
_Nonnull result_setextern cql_bool read_foo_get_b_value(read_foo_result_set_ref
, cql_int32 row);
_Nonnull result_setextern cql_string_ref _Nullable read_foo_get_t(
,
read_foo_result_set_ref _Nonnull result_set);
cql_int32 rowextern cql_int32 read_foo_result_count(read_foo_result_set_ref
);
_Nonnull result_setextern cql_code read_foo_fetch_results(sqlite3 *_Nonnull _db_,
*_Nonnull result_set,
read_foo_result_set_ref _Nullable );
cql_int32 id_#define read_foo_row_hash(result_set, row) \
cql_result_set_get_meta((cql_result_set_ref)(result_set))->\
rowHash((cql_result_set_ref)(result_set), row)
#define read_foo_row_equal(rs1, row1, rs2, row2) \
cql_result_set_get_meta((cql_result_set_ref)(rs1)) \
->rowsEqual( \
(cql_result_set_ref)(rs1), row1, \
(cql_result_set_ref)(rs2), row2)
Let’s consider some of these individually now
(
cql_result_set_type_decl,
read_foo_result_set); read_foo_result_set_ref
This declares the data type for read_foo_result_set
and the associated object reference read_foo_result_set_ref
. As it turns out, the underlying data type for all result sets is the same, and only the shape of the data varies.
extern cql_code read_foo_fetch_results(sqlite3 *_Nonnull _db_,
*_Nonnull result_set,
read_foo_result_set_ref _Nullable ); cql_int32 id_
The result set fetcher method gives you a read_foo_result_set_ref
if it succeeds. It accepts the id_
argument which it will internally pass along to read_foo(...)
. The latter function provides a sqlite3_stmt*
which can then be iterated in the fetcher. This method is the main public entry point for result sets.
Once you have a result set, you can read values out of it.
extern cql_int32 read_foo_result_count(read_foo_result_set_ref
); _Nonnull result_set
That function tells you how many rows are in the result set.
For each row you can use any of the row readers:
extern cql_int32 read_foo_get_id(read_foo_result_set_ref
, cql_int32 row);
_Nonnull result_setextern cql_bool read_foo_get_b_is_null(read_foo_result_set_ref
, cql_int32 row);
_Nonnull result_setextern cql_bool read_foo_get_b_value(read_foo_result_set_ref
, cql_int32 row);
_Nonnull result_setextern cql_string_ref _Nullable read_foo_get_t(
,
read_foo_result_set_ref _Nonnull result_set); cql_int32 row
These let you read the id
of a particular row, and get a cql_int32
or you can read the nullable boolean, using the read_foo_get_b_is_null
function first to see if the boolean is null and then read_foo_get_b_value
to get the value. Finally the string can be accessed with read_foo_get_t
. As you can see, there is a simple naming convention for each of the field readers.
NOTE: The compiler has runtime arrays that control naming conventions as well as using CamelCasing. Additional customizations may be created by adding new runtime arrays into the CQL compiler.
Finally, also part of the public interface, are these macros:
#define read_foo_row_hash(result_set, row)
#define read_foo_row_equal(rs1, row1, rs2, row2)
These use the CQL runtime to hash a row or compare two rows from identical result set types. Metadata included in the result set allows general purpose code to work for every result set. Based on configuration, result set copying methods can also be generated. When you’re done with a result set you can use the cql_release(...)
method to free the memory.
Importantly, all of the rows from the query in the stored procedure are materialized immediately and become part of the result set. Potentially large amounts of memory can be used if a lot of rows are generated.
The code that actually creates the result set starting from the prepared statement is always the same. The essential parts are:
First, a constant array that holds the data types for each column.
uint8_t read_foo_data_types[read_foo_data_types_count] = {
| CQL_DATA_TYPE_NOT_NULL, // id
CQL_DATA_TYPE_INT32 , // b
CQL_DATA_TYPE_BOOL, // t
CQL_DATA_TYPE_STRING};
All references are stored together at the end of the row, so we only need the count of references and the offset of the first one to do operations like cql_retain
or cql_release
on the row.
#define read_foo_refs_offset cql_offsetof(read_foo_row, t) // count = 1
Lastly we need metadata to tell us count of columns and the offset of each column within the row.
static cql_uint16 read_foo_col_offsets[] = { 3,
(read_foo_row, id),
cql_offsetof(read_foo_row, b),
cql_offsetof(read_foo_row, t)
cql_offsetof};
Using the above we can now write this fetcher
CQL_WARN_UNUSED cql_code
read_foo_fetch_results(
sqlite3 *_Nonnull _db_,
read_foo_result_set_ref _Nullable *_Nonnull result_set,
cql_int32 id_)
{
sqlite3_stmt *stmt = NULL;
cql_profile_start(CRC_read_foo, &read_foo_perf_index);
// we call the original procedure, it gives us a prepared statement
cql_code rc = read_foo(_db_, &stmt, id_);
// this is everything you need to know to fetch the result
cql_fetch_info info = {
.rc = rc,
.db = _db_,
.stmt = stmt,
.data_types = read_foo_data_types,
.col_offsets = read_foo_col_offsets,
.refs_count = 1,
.refs_offset = read_foo_refs_offset,
.rowsize = sizeof(read_foo_row),
.crc = CRC_read_foo,
.perf_index = &read_foo_perf_index,
};
// this function does all the work, it cleans up if .rc is an error code.
return cql_fetch_all_results(&info, (cql_result_set_ref *)result_set);
}
OUT UNION
The out
keyword was added for writing procedures that produce a single row result set. With that, it became possible to make any single row result you wanted, assembling it from whatever sources you needed. That is an important case as single row results happen frequently and they are comparatively easy to create and pass around using C structures for the backing store. However, it’s not everything; there are also cases where full flexibility is needed while producing a standard many-row result set. For this we have out union
which was discussed fully in Chapter 5. Here we’ll discuss the code generation behind that.
Here’s an example from the CQL tests:
create proc some_integers(start integer not null, stop integer not null)
begin
declare C cursor like select 1 v, 2 v_squared, "xx" some_text;
declare i integer not null;
set i := start;
while (i < stop)
begin
from values (i, i*i, printf("%d", i));
fetch C(v, v_squared, junk) out union C;
set i := i + 1;
end;
end;
In this example the entire result set is made up out of thin air. Of course any combination of this computation or data-access is possible, so you can ultimately make any rows you want in any order using SQLite to help you as much or as little as you need.
Virtually all the code pieces to do this already exist for normal result sets. The important parts of the output code look like this in your generated C.
We need a buffer to hold the rows we are going to accumulate; We use cql_bytebuf
just like the normal fetcher above.
// This bit creates a growable buffer to hold the rows
// This is how we do all the other result sets, too
;
cql_bytebuf _rows_(&_rows_); cql_bytebuf_open
We need to be able to copy the cursor into the buffer and retain any internal references
// This bit is what you get when you "out union" a cursor "C"
// first we +1 any references in the cursor then we copy its bits
cql_retain_row(C_); // a no-op if there is no row in the cursor
if (C_._has_row_) cql_bytebuf_append(&_rows_, (const void *)&C_, sizeof(C_));
Finally, we make the rowset when the procedure exits. If the procedure is returning with no errors the result set is created, otherwise the buffer is released. The global some_integers_info
has constants that describe the shape produced by this procedure just like the other cases that produce a result set.
cql_results_from_data(_rc_,
&_rows_,
&some_integers_info,
(cql_result_set_ref *)_result_set_);
The operations here are basically the same ones that will happen inside of the standard helper cql_fetch_all_results
, the difference, of course, is that you write the loop manually and therefore have full control of the rows as they go in to the result set.
In short, the overhead is pretty low. What you’re left with is pretty much the base cost of your algorithm. The cost here is very similar to what it would be for any other thing that make rows.
Of course, if you make a million rows, well, that would burn a lot of memory.
Here’s a fairly simple example illustrating some of these concepts including the reading of rowsets.
-- hello.sql:
create proc hello()
begin
create table my_data(
integer not null primary key,
pos not null
txt text
);
insert into my_data values(2, 'World');
insert into my_data values(0, 'Hello');
insert into my_data values(1, 'There');
select * from my_data order by pos;
end;
And this main code to open the database and access the procedure:
// main.c
#include <stdlib.h>
#include <sqlite3.h>
#include "hello.h"
int main(int argc, char **argv)
{
*db;
sqlite3 int rc = sqlite3_open(":memory:", &db);
if (rc != SQLITE_OK) {
(1); /* not exactly world class error handling but that isn't the point */
exit}
;
hello_result_set_ref result_set= hello_fetch_results(db, &result_set);
rc if (rc != SQLITE_OK) {
("error: %d\n", rc);
printf(2);
exit}
= hello_result_count(result_set);
cql_int32 result_count
for(cql_int32 row = 0; row < result_count; row++) {
= hello_get_txt(result_set, row);
cql_string_ref text (ctext, text);
cql_alloc_cstr("%d: %s\n", row, ctext);
printf(ctext, text);
cql_free_cstr}
(result_set);
cql_result_set_release
(db);
sqlite3_close}
From these pieces you can make a working example like so:
# ${cgsql} refers to the root directory of the CG-SQL sources
#
cql --in hello.sql --cg hello.h hello.c
cc -o hello -I ${cgsql}/sources main.c hello.c ${cgsql}/sources/cqlrt.c -lsqlite3
./hello
Additional demo code is available in Appendix 10
There are many cases where you might want to nest one result set inside of another one. In order to do this ecomomically you must be able to run a parent query and a child query and then link the child rows to the parent rows. One way to do this is of course to run one query for each “child” but then you end up with O(n)
child queries and if there are sub-children it would be O(n*m)
and so forth. What you really want to do here is something more like a join, only without the cross-product part of the join. Many systems have such features, sometimes they are called “chaptered rowsets” but in any case there is a general need for such a thing.
To reasonably support nested results sets the CQL language has to be extended a variety of ways, as discussed below.
Here are some things that happened along the way that are interesting.
One of the first problems we run into thinking about how a CQL program might express pieces of a rowset and turn them into child results is that a program must be able to hash a row, append row data, and extract a result set from a key. These are the essential operations required. In order to do anything at all with a child rowset, a program must be able to describe its type. Result sets must appear in the type system as well as in the runtime.
To address this we use an object type with a special “kind”, similar to how boxed statements are handled. A result set has a type that looks like this: object <proc_name set>
. Here proc_name
must the the name of a procedure that returns a result set and the object will represent a result set with the corresponding columns in it.
In addition to creating result set types, the language must be able to express cursors that capture the necessary parent/child column. These are rows with all of the parent columns plus additional columns for the child rows (note that you can have more than one child result set per parent). So for instance you might have a list of people, and one child result might be the details of the schools they attended and another could be the details of the jobs they worked.
To accomplish this kind of shape, the language must be able to describe a new output row is that is the same as the parent but includes columns for the the child results, too. This is done using a cursor declaration that comes from a typed name list. An example might be:
declare C cursor like (id integer, name text);
Importantly, such constructs include the ability to reference existing shapes by name. So we might create a cursor we need like so:
declare result cursor like (like parent_proc, child_result object<child_proc set>);
Where the above indicates all the parent columns plus a child result set. Or more than one child result set if needed.
In addition, the language needs a way to conveniently declare a cursor that is only some of the columns of an existing cursor. In particular, nested result sets require us to extract the columns that link the parent and child result sets. The columns we will “join” on. To accomplish this the language extends the familiar notion:
declare D cursor like C;
To the more general form:
declare pks cursor like C(pk1, pk2);
Which chooses just the named fields from C
and makes a cursor with only those. In this case this primary key fields, pk1
and pk2
. Additionally, for completeness, we add this form:
declare vals cursor like C(-pk1, -pk2);
To mean the cursor vals should have all the columns of C
except pk1
and pk2
i.e. all the “values”.
Using any number of intermediate construction steps, and maybe some declare X type ...
statements, any type can be formed from existing shapes by adding and removing columns.
Having done the above we can load a cursor that has just the primary keys with the usual form
from C(like pks); fetch pks
Which says we want to load pks
from the fields of C
, but using only the columns of pks
. That operation is of course going to be an exact type match by construction.
In order to express the requisite parent/child join, the language must be able to express operations like “hash a cursor” (any cursor) or “store this row into the appropriate partition”. The language provides no way to write functions that can take any cursor and dynamically do things to it based on type information, but:
The minimum requirement is that the language must be able to declare a functions that takes a generic cursor argument and to call such functions a generic cursor construct that has the necessary shape info. This form does the job:
declare function cursor_hash(C cursor) long not null;
And it can be used like so:
hash := cursor_hash(C); -- C is any cursor let
When such a call is made the C function cursor_hash
is passed a so-called “dynamic cursor” pointer which includes:
With this information you can (e.g.) generically do the hash by applying a hash to each field and then combining all of those hashes. This kind of function works on any cursor and all the extra data about the shape that’s needed to make the call is static, so really the cost of the call stays modest. Details of the dynamic cursor type are in cqlrt_common.h
and there are many example functions now in the cqlrt_common.c
file.
Three helper functions are used to do the parent/child join, they are:
DECLARE FUNC cql_partition_create ()
CREATE OBJECT<partitioning> NOT NULL;
DECLARE FUNC cql_partition_cursor (
OBJECT<partitioning> NOT NULL,
part key CURSOR,
value CURSOR)
NOT NULL;
BOOL
DECLARE FUNC cql_extract_partition (
OBJECT<partitioning> NOT NULL,
part key CURSOR)
CREATE OBJECT NOT NULL;
The first function makes a new partitioning.
The second function hashes the key columns of a cursor (specified by the key argument) and appends the values provided in the second argument into a bucket for that key. By making a pass over the child rows a procedure can easily create a partitioning with each unique key combo having a buffer of all the matching rows.
The third function is used once the partitioning is done. Given a key again, this time from the parent rows, a procedure can get the buffer it had accumulated and then make a result set out of it and return that.
Note that the third function returns a vanilla object type because it could be returning a result set of any shape so a cast is required for correctness.
Using the features mentioned above a developer could now join together any kind of complex parent and child combo as needed, but the result would be a lot of error-prone code, To avoid this CQL adds language sugar to do such partitionings automatically and type-safely, like so:
-- parent and child defined elsewhere
declare proc parent(x integer not null) (id integer not null, a integer, b integer);
declare proc child(y integer not null) (id integer not null, u text, v text);
-- join together parent and child using 'id'
-- example x_, y_ arguments for illustration only
create proc parent_child(x_ integer not null, y_ integer not null)
begin
out union call parent(x_) join call child(y_) using (id);
end;
The generated code is simple enough, even though there’s a good bit of it. But it’s a useful exercise to look at it once. Comments added for clarity.
CREATE PROC parent_child (x_ INTEGER NOT NULL, y_ INTEGER NOT NULL)
BEGIN
DECLARE __result__0 BOOL NOT NULL;
-- we need a cursor to hold just the key of the child row
DECLARE __key__0 CURSOR LIKE child(id);
-- we need our partitioning object (there could be more than one per function
-- so it gets a number, likewise everything else gets a number
:= cql_partition_create();
LET __partition__0
-- we invoke the child and then iterate its rows
DECLARE __child_cursor__0 CURSOR FOR CALL child(y_);
LOOP FETCH __child_cursor__0
BEGIN
-- we extract just the key fields (id in this case)
id) FROM VALUES(__child_cursor__0.id);
FETCH __key__0(
-- we add this child to the partition using its key
SET __result__0 := cql_partition_cursor(__partition__0, __key__0, __child_cursor__0);
END;
-- we need a shape for our result, it is the columns of the parent plus the child rowset
DECLARE __out_cursor__0 CURSOR LIKE (id INTEGER NOT NULL, a INTEGER, b INTEGER,
OBJECT<child SET> NOT NULL);
child1
-- now we call the parent and iterate it
DECLARE __parent__0 CURSOR FOR CALL parent(x_);
LOOP FETCH __parent__0
BEGIN
-- we load the key values out of the parent this time, same key fields
id) FROM VALUES(__parent__0.id);
FETCH __key__0(
-- now we create a result row using the parent columns and the child result set
id, a, b, child1) FROM VALUES(__parent__0.id, __parent__0.a, __parent__0.b, cql_extract_partition(__partition__0, __key__0));
FETCH __out_cursor__0(
-- and then we emit that row
OUT UNION __out_cursor__0;
END;
END;
This code iterates the child once and the parent once and only has two database calls, one for the child and one for the parent. And this is enough to create parent/child result sets for the most common examples.
While the above is probably the most common case, a developer might also want to make a procedure call for each parent row to compute the child. And, more generally, to work with result sets from procedure calls other than iterating them with a cursor.
The iteration pattern:
declare C cursor for call foo(args);
is very good if the data is coming from (e.g.) a select statement and we don’t want to materialize all of the results if we can stream instead. However, when working with result sets the whole point is to create materialized results for use elsewhere.
Since we can express a result set type with object<proc_name set>
the language also includes the ability to call a procedure that returns a result set and capture that result. This yields these forms:
declare child_result object<child set>;
set child_result := child(args);
or better still:
:= child(args); let child_result
And more generally, this examples shows a manual iteration:
declare proc parent(x integer not null) (id integer not null, a integer, b integer);
declare proc child(id integer not null) (id integer not null, u text, v text);
create proc parent_child(x_ integer not null, y_ integer not null)
begin
-- the result is like the parent with an extra column for the child
declare result cursor like (like parent, child object<child set>);
-- call the parent and loop over the results
declare P cursor for call parent(x_);
loop fetch P
begin
-- compute the child for each P and then emit it
from values(from P, child(P.id));
fetch result out union result;
end;
end;
After the sugar is applied to expand the types out, the net program is the following:
DECLARE PROC parent (x INTEGER NOT NULL) (id INTEGER NOT NULL, a INTEGER, b INTEGER);
DECLARE PROC child (id INTEGER NOT NULL) (id INTEGER NOT NULL, u TEXT, v TEXT);
CREATE PROC parent_child (x_ INTEGER NOT NULL, y_ INTEGER NOT NULL)
BEGIN
DECLARE result CURSOR LIKE (id INTEGER NOT NULL, a INTEGER, b INTEGER,
child OBJECT<child SET>);
DECLARE P CURSOR FOR CALL parent(x_);
LOOP FETCH P
BEGIN
id, a, b, child) FROM VALUES(P.id, P.a, P.b, child(P.id));
FETCH result(OUT UNION result;
END;
END;
Note the LIKE
and FROM
forms are make it a lot easier to express this notion of just adding one more column to the result. The code for emitting the parent_child
result doesn’t need to specify the columns of the parent or the columns of the child, only that the parent has at least the id
column. Even that could have been removed.
This call could have been used instead:
from values(from P, child(from P like child arguments)); fetch result
That syntax would result in using the columns of P that match the arguments of child
– just P.id
in this case. But if there were many such columns the sugar would be easier to understand and much less error prone.
cql_object_ref
as their C data type. This isn’t wrong exactly but it would mean that a cast would be required in every use case on the native side, and it’s easy to get the cast wrong. So the result type of column getters is adjusted to be a child_result_set_ref
instead of just cql_object_ref
where child
is the name of the child procedure.
CQL stored procs have a very simple contract so it is easy to declare procedures and then implement them in regular C; the C functions just have to conform to the contract. However, CQL procedures have their own calling conventions and this makes it very inconvenient to use external code that is not doing database things and wants to return values. Even a random number generator or something would be difficult to use because it could not be called in the context of an expression. To allow for this CQL adds declared functions
In another example of the two-headed nature of CQL, there are two ways to declare functions. As we have already seen you can make function-like procedures and call them like functions simply by making a procedure with an out
parameter. However, there are also cases where it is reasonable to make function calls to external functions of other kinds. There are three major types of functions you might wish to call.
These functions are written in regular C and provide for the ability to do operations on in-memory objects. For instance, you could create functions that allow you to read and write from a dictionary. You can declare these functions like so:
declare function dict_get_value(dict object, key_ text not null) text;
Such a function is not known to SQLite and therefore cannot appear in SQL statements. CQL will enforce this.
The above function returns a text reference, and, importantly, this is a borrowed reference. The dictionary is presumably holding on to the reference and as long as it is not mutated the reference is valid. CQL will retain this reference as soon as it is stored and release it automatically when it is out of scope. So, in this case, the dictionary continues to own the object.
It is also possible to declare functions that create objects. Such as this example:
declare function dict_create() create object;
This declaration tells CQL that the function will create a new object for our use. CQL does not retain the provided object, rather assuming ownership of the presumably one reference count the object already has. When the object goes out of scope it is released as usual.
If we also declare this procedure:
declare procedure dict_add(
object not null,
dict not null,
key_ text value text not null);
then with this family of declarations we could write something like this:
create proc create_and_init(out dict object not null)
begin
set dict := dict_create();
call dict_add(dict, "k1", "v1");
call dict_add(dict, "k2", "v2");
if (dict_get_value(dict, "k1") == dict__get_value(dict, "k2")) then
call printf("insanity has ensued\n");
end if;
end;
NOTE: Ordinary scalar functions may not use the database in any way. When they are invoked they will not be provided with the database pointer and so they will be unable to do any database operations. To do database operations, use regular procedures. You can create a function-like-procedure using the
out
convention discussed previously.
SQLite includes the ability to add new functions to its expressions using sqlite3_create_function
. In order to use this function in CQL, you must also provide its prototype definition to the compiler. You can do so following this example:
declare select function strencode(t text not null) text not null;
This introduces the function strencode
to the compiler for use in SQL constructs. With this done you could write a procedure something like this:
create table foo(id integer, t text);
create procedure bar(id_ integer)
begin
select strencode(T1.t) from foo T1 where T1.id = id_;
end;
This presumably returns the “encoded” text, whatever that might be. Note that if sqlite3_create_function
is not called before this code runs, a run-time error will ensue. Just as CQL must assume that declared tables really are created, it also assumes that declared function really are created. This is another case of telling the compiler in advance what the situation will be at runtime.
SQLite allows for many flexible kinds of user defined functions. CQL doesn’t concern itself with the details of the implementation of the function, it only needs the signature so that it can validate calls.
Note that SQL Scalar Functions cannot contain object
parameters. To pass an object
, you should instead pass the memory address of this object using a LONG INT
parameter. To access the address of an object
at runtime, you should use the ptr()
function. See the notes section below for more information.
See also: Create Or Redefine SQL Functions.
More recent versions of SQLite also include the ability to add table-valued functions to statements in place of actual tables. These functions can use their arguments to create a “virtual table” value for use in place of a table. For this to work, again SQLite must be told of the existence of the table. There are a series of steps to make this happen beginning with sqlite3_create_module
which are described in the SQLite documents under “The Virtual Table Mechanism Of SQLite.”
Once that has been done, a table-valued function can be defined for most object types. For instance it is possible to create a table-valued function like so:
declare select function dict_contents(dict object not null)
not null, v text not null); (k text
This is just like the previous type of select function but the return type is a table shape. Once the above has been done you can legally write something like this:
create proc read_dict(dict object not null, pattern text)
begin
if pattern is not null then
select k, v from dict_contents(dict) T1 where T1.k LIKE pattern;
else
select k, v from dict_contents(dict);
end if;
end;
This construct is very general indeed but the runtime set up for it is much more complicated than scalar functions and only more modern versions of SQLite even support it.
Certain SQL functions like json_extract
are variadic (they accept variable number of arguments). To use such functions within CQL, you can declare a SQL function to have untyped parameters by including the NO CHECK
clause instead of parameter types.
For example:
declare select function json_extract no check text;
This is also supported for SQL table-valued functions:
declare select function table_valued_function no check (t text, i int);
NOTE: currently the
NO CHECK
clause is not supported for non SQL Ordinary Scalar Functions.
Some of the SQLite builtin functions are hard-coded; these are the functions that have semantics that are not readily captured with a simple prototype. Other SQLite functions can be declared with declare select function ...
and then used.
CQL’s hard-coded builtin list includes:
count
max
min
sum
total
avg
group_concat
abs
changes
char
coalesce
concat
concat_ws
format
glob
hex
ifnull
iif
instr
last_insert_rowid
length
like
likelihood
likely
load_extension
lower
ltrim
max
min
nullif
octet_length
printf
quote
random
randomblob
replace
round
rtrim
sign
soundex
sqlite_compileoption_get
sqlite_compileoption_used
sqlite_offset
sqlite_source_id
sqlite_version
substr
substring
total_changes
trim
typeof
unhex
unicode
unlikely
upper
zeroblob
row_number
rank
dense_rank
percent_rank
cume_dist
ntile
lag
lead
first_value
last_value
nth_value
json
jsonb
json_array
jsonb_array
json_array_length
json_error_position
json_extract
jsonb_extract
json_insert
json_replace
json_set
jsonb_insert
jsonb_replace
jsonb_set
json_remove
jsonb_remove
json_object
jsonb_object
json_patch
jsonb_patch
json_pretty
json_type
json_valid
json_quote
json_extract
and jsonb_extract
are peculiar because they do not always return the same type. Since CQL has to assume something it assumes that json_extract
will return TEXT
and jsonb_extract
will return a BLOB
. Importantly, CQL does not add any casting operations into the SQL unless they are explicitly added which means in some sense SQLite does not “know” that CQL has made a bad assumption, or any assumption. In many cases, even most cases, a specific type is expected, this is a great time to use the pipeline cast notation to “force” the conversion.
select json_extract('{ "x" : 0 }', '$.x') ~int~ as X;
This is exactly the same as
select CAST(json_extract('{ "x" : 0 }', '$.x') as int) as X;
json_group_array
jsonb_group_array
json_group_object
jsonb_group_object
The two table functions are readily declared if they are needed like so:
DECLARE SELECT FUNC json_each NO CHECK
key BLOB, value BLOB, type TEXT, atom BLOB, id INT, parent INT, fullkey TEXT, path TEXT);
(
DECLARE SELECT FUNC json_tree NO CHECK
key BLOB, value BLOB, type TEXT, atom BLOB, id INT, parent INT, fullkey TEXT, path TEXT); (
NOTE: key, value, and atom can be any type and will require a cast operation similar to
json_extract
, see the notes above.
These can be used to create an object<cql_box>
from the various primitives. This can then be stored generically in something that holds objects. The unbox methods can be used to extract the original value.
cql_box_blob
cql_box_bool
cql_box_get_type
cql_box_int
cql_box_long
cql_box_object
cql_box_real
cql_box_text
cql_unbox_blob
cql_unbox_bool
cql_unbox_int
cql_unbox_long
cql_unbox_object
cql_unbox_real
cql_unbox_text
These functions have pipeline aliases :box
, :type
and :to_int
, :to_bool
, etc.
Dyanmic Cursor Functions
These functions all work with an unspecified cursor format. These accept a so-called dynamic cursor.
cql_cursor_format
returns a string with the names and values of every field in the cursor, useful for debuggingcql_cursor_column_count
return the number of columns in the cursorcql_cursor_column_type
returns the type of the column using CQL_DATA_TYPE_*
constantscql_cursor_get_*
returns a column of the indicated type at the indicated index, the type can be bool, int, long, real, text, blob, or objectPipeline syntax is availabe for these, you can use C:format
, C:count
, C:type(i)
, C:to_bool(i)
, C:to_int(i)
etc.
Each of the following returns an object<cql_TYPE_dictionary
where TYPE
is the indicated type.
cql_string_dictionary_create
– values are strings (text
)cql_blob_dictionary_create
– values are blobscql_object_dictionary_create
– values are any object
esp. boxed valuescql_long_dictionary_create
– values are long
cql_real_dictionary_create
– values are real
The add
functions return true
if an object was added, false
if it was replaced.
cql_string_dictionary_add
– add a string (text
)cql_blob_dictionary_add
– add a blob
cql_object_dictionary_add
– add an object
cql_long_dictionary_add
– add a long
cql_real_dictionary_add
– add a real
The find
functions return null
if there is no such value or else the stored value. Each function requires the dictionary and the key to find. The key is always a string (i.e. text
).
cql_string_dictionary_find
cql_blob_dictionary_find
cql_object_dictionary_find
cql_long_dictionary_find
– returns nullable longcql_real_dictionary_find
– returns nullable realThe pipeline syntax dict:add(key, value)
works for all of the above. Similarly, dict:find(key)
works. Array forms dict[key] := value
and x := dict[value]
also work and result in the same calls. The long form name is really only needed to create a dictionary.
As with dictionaries there are some simple built in lists. These have limited functionality but they are very handy for short term storage.
Each of the following returns an object<cql_TYPE_list
where TYPE
is the indicated type.
cql_string_list_create
– values are strings (text
)cql_blob_list_create
– values are blob
cql_long_list_create
– values are long
cql_real_list_create
– values are real
The list functions are limited to add
get_at
set_at
and count
These functions append the indicated value to the end of the list.
cql_string_list_add
– append a string (text
)cql_blob_list_add
– append a blob
cql_long_list_add
– append a long
cql_real_list_add
– append a real
Getters accept the list and an index, the index must be within bounds.
cql_string_list_get_at
– gets the string (text
) at the indicated indexcql_blob_list_get_at
– gets the blob
at the indicated indexcql_long_list_get_at
– gets the long
at the indicated indexcql_real_list_get_at
– gets the real
at the indicated indexSetters accept the list, the index and a value, the index must be within bounds.
cql_string_list_set_at
– sets the string (text
) at the indicated indexcql_blob_list_set_at
– sets the blob
at the indicated indexcql_long_list_set_at
– sets the long
at the indicated indexcql_real_list_set_at
– sets the real
at the indicated indexAnd, finally, the item count functions, in each case this just gives the count of items in the list.
cql_string_list_count
cql_blob_list_count
cql_long_list_count
cql_real_list_count
Lists can use pipeline notation such as:
list := cql_string_list_create():add("hello"):add("goodbye");
let 2 == list.count);
EXPECT!("hello" == list[0]);
EXPECT!("goodbye" == list[1]);
EXPECT!(list[0] := "salut";
"salut" == list[0]); EXPECT!(
Note: Lists use property notation for their
count
. This could have been a:count
function also generating the same count but by convention we only use:foo
when the operation is more complicated than a simple property fetch.@op cql_long_list : call count as cql_long_list_count;
would add:count
. The default is@op cql_long_list : get count as cql_long_list_count;
which allows.count
. The@op
directive is discussed below.
nullable
sensitive
ptr
These are not real functions but rather notations to the compiler.
Nullable
casts an operand to the nullable version of its type and otherwise does nothing. This cast might be useful if you need an exact type match in a situation. It is stripped from any generated SQL and generated C so it has no runtime effect at all other than the indirect consequences of changing the storage class of its operand.
Sensitive
casts an operand to the sensitive version of its type and otherwise does nothing. This cast might be useful if you need an exact type match in a situation. It is stripped from any generated SQL and generated C so it has no runtime effect at all other than the indirect consequences of changing the storage class of its operand.
Ptr
is used to cause a reference type variable to be bound as a long integer to SQLite. This is a way of giving object pointers to SQLite UDFs. Not all versions of Sqlite support binding object variables, so passing memory addresses is the best we can do on all versions.
cql_throw
cql_throw
can be used to generate an exception with whatever code you desire.
declare const group error_codes (
= 1,
ARG_ERROR = 2,
OTHER_ERROR = 3
ETC_ERROR
);
int)
proc foo(x begin
if x < 0 or x > 5 then
cql_throw(ARG_ERROR);end;
end;
As we saw in Chapter 3 certain operators become function calls after transformation. In particular the :
, []
, .
, and ->
operators can be mapped into functions. To enable this transform you declare the function you want to invoke normally and then you provide an @op
directive that redirects the operator to the function. There are examples in the section on Pipeline Notation.
Here we will review the various forms so that all the @op
patterns are easily visible together:
# | @op directive |
expression | replacement |
---|---|---|---|
1 | no declaration required | expr:func(...) |
func(expr, ...) |
2 | @op T : call func as your_func; |
expr:func(...) |
your_func(expr, ...) |
3 | @op T<kind> : call func as func_kind; |
expr:func(...) |
func_kind(expr, ...) |
4 | @op T<kind> : get foo as get_foo; |
expr.foo |
get_foo(expr) |
5 | @op T<kind> : set foo as set_foo; |
expr.foo := x |
set_foo(expr, x) |
6 | @op T<kind> : get all as getter; |
expr.foo |
getter(expr, 'foo') |
7 | @op T<kind> : set all as setter; |
expr.foo := x |
setter(expr, 'foo', x) |
8 | @op T<kind> : array get as a_get; |
expr[x,y] |
a_get(expr, x, y') |
9 | @op T<kind> : array set as a_set; |
expr[x,y] := z |
a_set(expr, x, y, z) |
10 | @op T<kind> : functor all as f; |
expr:(1, 2.0) |
f_int_real(expr, 1, 2.0) |
11 | @op T<kind> : arrow all as arr1; |
left->right |
arr1(left, right) |
12 | @op T1<kind> : arrow T2 as arr2; |
left->right |
arr2(left, right) |
13 | @op T1<kind> : arrow T2<kind> as arr3; |
left->right |
arr3(left, right) |
14 | @op cursor : call foo as foo_bar; |
C:foo(...) |
foo_bar(C, ...) |
15 | @op null : call foo as foo_null;' | null:foo(…)| foo_null(null, …)` |
Now let’s briefly go over each of these forms. In all cases the transform is only applied if expr
is of type T
. No type conversion is applied at this point, however only the base type must match so the transformation will be applied regardless of the nullability or sensitivity of expr
. If expr
is of a suitable type the transform is applied and the call is then checked for errors as usual. Based on the type of replacement function an implicit conversion might then be required. Note that the types of any additional arguments are not considered when deciding to do the transform but they can cause errors after the transform has been applied. After the transform replacement expression, including all arguments, are type checked as usual and errors could result from arguments not being compatible with the transform. This is no different than if you had written func(expr1, expr2, etc..)
with some of the arguments being not appropriate for func
.
With no declaration expr:func()
is always replaced with func(expr)
. If there are no arguments expr:func
may be used for brevity it is no different than expr:func()
.
Here a call pipelined call to func
with expr
matching T
becomes a normal call to your_func
.
This form is a special case of (2). CQL first looks for a match with the “kind”, if there is one that is used preferrably. There are examples in Pipeline Notation. This lets you have a generic conversion and more specific conversions if needed. e.g. you might have formatting for any int
but you have special formatting for int<task_id>
.
This form defines a specific property getter. Only types with a kind can have such getters so declaring a transform with a T
that has no kind is useless and likely will produce errors at some point. The getter is type checked as usual after the replacement.
This form defines a specific property setter. Only types with a kind can have such setters so declaring a transform with a T
that has no kind is useless and likely will produce errors at some point. The setter is type checked as usual after the replacement.
This form declares a generic “getter”, the property being fetched becomes a string argument. This is useful if you have a bag of propreties of the same type and a generic “get” function. Note that specific properties are consulted first (i.e. rule 4).
This form declares a generic “setter”, the property being set becomes a string argument. This is useful if you have a bag of propreties of the same type and a generic “set” function. Note that specific properties are consulted first (i.e. rule 5).
This form defines a transform for array-like read semantics. A matching array operation is turned into a function call and all the array indices become function arguments. Only the type of the expression being indexed is considered when deciding to do the transform. As usual, the replacement is checked and errors could result if the function is not suitable.
This form defines a transform for array-like write semantics. A matching array operation is turned into a function call and all the array indices become function arguments, including the value to set. Only the type of the expression being indexed is considered when deciding to do the transform. As usual, the replacement is checked and errors could result if the function is not suitable.
This form allows for a “functor-like” syntax where there is no function name provided. The name in the @op
directive becomes the base name of the replacement function. The base type names of all the arguments (but not expr
) are included in the replacement. As always the type of expr
must match the directive. The replacement could generate errors if a function is missing (e.g. you have no f_int_real
variant) or if the arguments are not type compatible (e.g. if the signature or the f_int_real
variant isn’t actually int
and real
).
The replacement system is flexible enough to allow arbitary operators to be replaced. At this point only ->
is supported and it is specified by “arrow”. The replacement happens if the left argument is exactly T
. In this form the right argumentcan be any thing. The result of the replacement is type checked as usual.
This is just like (11) except that the type of the right argument has been partially specified, it must have the indicated base type T2, such as int
, real
, etc. If this form matches the replacement takes precedence over (11). The result of the replacement is type checked as usual.
This is just like (12) except that the type and kind of the right argument has been specified, it must have the indicated base type T2 and the indicated kind. If this form matches the replacement takes precedence over (12). The result of the replacement is type checked as usual.
This form allows you to create pipeline functions on cursor types. The replacement function will be declared with a dynamic cursor as the first argument. The result of the replacement is type checked as usual.
This form allows you to create pipeline functions on the null literal. The replacement function will get null as its first argument and this can be accepted by any nullable type. This form is of limited use as it only is triggered by null literals and these are only likely to appear in a pipeline in the context of a macro. The result of the replacement is type checked as usual.
In addition to “arrow” the identifiers “lshift”, “rshift” and “concat” maybe used to similarly remap <<
, >>
, and ||
respectively. The same rules otherwise apply. Note that the fact that these operators have different binding strengths can be very useful in building fluent-style pipelines.
Other operators may be added in the future, they would follow the patterns for rules 11, 12, and 13 with only the “arrow” keyword varying. You could imagine “add”, “sub”, “mult” etc. for other operators.
These are discussed in greater detail in the Pipeline Notation section. However, by way of motivational examples here are some possible transforms in table form.
Original | Replacement |
---|---|
"test":foo() |
foo("test") |
5.0:foo() |
foo_real(5.0) |
joules:foo() |
foo_real_joules(joules) |
x:dump |
dump(x) |
new_builder():(5):(7):to_list |
tolist(add_int(add_int(b(), 5), 7)) |
x:ifnull(0) |
ifnull(x, 0) |
x:nullable |
nullable(x) |
x:n |
nullable(x) |
xml -> path |
extract_xml_path(xml, path) |
cont.y += 1 |
set_y(cont, get_y(cont) + 1) |
cont.z += 1 |
set(cont, "z", get(cont, "z") + 1) |
a[u,v] += 1 |
set_uv(a, u, v, get_uv(a, u, v) + 1) |
<div style="page-break-after: always;"></div>
<!--- @generated by make_guide.sh -->
# Chapter 9: Statements Summary and Error Checking
<!---
-- Copyright (c) Meta Platforms, Inc. and affiliates.
--
-- This source code is licensed under the MIT license found in the
-- LICENSE file in the root directory of this source tree.
-->
The following is a brief discussion of the major statement types and the
semantic rules that CQL enforces for each of the statements. A detailed
discussion of SQL statements (the bulk of these) is beyond the scope of
this document and you should refer to the SQLite documentation for most
details. However, in many cases CQL does provide additional enforcement
and it is helpful to describe the basic checking that happens for each
fragment of CQL. A much more authoritative list of the things CQL checks
for can be inferred from the error documentation. "Tricky" errors have
examples and suggested remediation.
### The Primary SQL Statements
These are, roughly, the statements that involve the database. In each case
we will review the major verifications that happen for each of the statements.
Because they are directly forwarded to SQLite for execution the details
of how they work are literally the SQLite details. So in this document
all we really need to discuss is the additional validations.
#### The `SELECT` Statement
Sqlite reference: https://www.sqlite.org/lang_select.html
Verifications:
* the mentioned tables exist and have the mentioned columns
* the columns are type compatible in their context
* any variables in the expressions are compatible in context
* aggregate functions are used only in places where aggregation makes sense
* column and table names are unambiguous
* compound selects (e.g. with UNION) are type-consistent in all the fragments
* the projection of a select has unique column labels if they are used
#### Details of `SELECT *`
`SELECT *` is special in that it creates its own result type by assembling
all the columns of the result of the `FROM` clause. CQL rewrites these
column names into a new `SELECT` with the specific columns explicitly listed.
While this makes the program slightly bigger, it means that logically deleted columns
are never present in results because `SELECT *` won't select them and attempting
to use a logically deleted column results in an error.
Also, in the event that the schema changes after the program was compiled
the `*` will not include those new columns. It will refer to the columns
as they existed in the schema at the time the program was compiled. This
vastly improves schema and program interoperability.
#### The `CREATE TABLE` Statement
Sqlite reference: https://www.sqlite.org/lang_createtable.html
Unlike the other parts of DDL the compiler actually deeply cares about tables.
It has to extract all the columns and column types out of the table. This
information is necessary to properly validate all the other statements.
Verifications:
* Verify a unique table name
* No duplicate column names
* The data is recorded to be emitted in JSON and schema output
* Correctness of constraints, see below
* Other details do not participate in further semantic analysis
See [below](#the-create-virtual-table-statement) for `CREATE VIRTUAL TABLE` which has many special considerations.
##### The `UNIQUE KEY` Clause
* A column so marked becomes a valid target for the `references` part of a foreign key
* If it is a separate constraint (i.e. not part of the column definition) then its name must be unique if it has one
##### The `FOREIGN KEY` Clause
* The referenced target must be a valid table
* The referenced columns must exist and be either a unique key or a primary key
* If it is a separate constraint (i.e. not part of the column definition) then its name must be unique if it has one
##### The `PRIMARY KEY` Clause
* A column so marked becomes a valid target for the `references` part of a foreign key
* The column(s) implicitly become `not null`
* If it is a separate constraint (i.e. not part of the column definition) then its name must be unique if it has one
##### AUTOINCREMENT
* A column marked auto-increment must be declared as either `integer primary key` or `long integer primary key`
* If `long integer` (or one of its aliases) is specified then the SQLite output will be changed to `integer`
* This special transform is necessary because SQLite insists on *exactly* `integer primary key autoincrement`
* A Sqlite `integer` can hold a 64 bit value, so `long integer` in this context only refers to:
* how the column should be fetched, and
* how big the variable that holds the result should be
##### The `CHECK` Clause
* The `CHECK` clause must be validated after the entire table has been processed
* The clause may appear as part of a column definition early in the table and refer to columns later in the table
* The result of the check clause must be a valid numeric expression
#### The `CREATE INDEX` Statement
Sqlite reference: https://www.sqlite.org/lang_createindex.html
The compiler doesn't really do anything with indices but it does validate that they
make sense:
* it should have unique name
* it should reference a table that exists
* the columns in the index should match columns in the table
* indexes on expressions have valid expressions
#### The `CREATE VIEW` Statement
Sqlite reference: https://www.sqlite.org/lang_createview.html
Create view analysis is very simple because the `select` analysis does
the heavy lifting. All the compiler has to do is validate that the view
is unique, then validate the select statement as usual.
Like most other top level select statements, in a view the select must
have these two additional properties:
* every column must have a name (implied or otherwise)
* no column may be of type `NULL`
* any NULL literal converted to some type exact type with a CAST.
* i.e. `create view foo as select NULL n` is not valid
#### The `CREATE TRIGGER` Statement
Sqlite reference: https://www.sqlite.org/lang_createtrigger.html
The create trigger statement more complex than many. Validations include:
* The trigger name must be unique
* For `insert` the "new.*" table is available in expressions/statement
* For `delete` the "old.*" table is available in expressions/statements
* For `update` both are available
* If optional columns present in the `update`, they must be unique/valid
* The `when` expression must evaluate to a numeric
* The statement list must be error free with the usual rules plus new/old
* The `raise` function may be used inside a trigger (NYI)
* The table name must be a table (not a view) UNLESS the trigger type is `INSTEAD OF`
* Select statements inside the statement block do not count as returns for the procedure and that includes the create trigger
#### The `DROP TABLE` Statement
Sqlite reference: https://www.sqlite.org/lang_droptable.html
Validations:
* the table name must have been previously mentioned in a `create table` statement
* recall that DDL that is not in a procedure only declares schema it doesn't create it, therefore you can declare an entity you intend to drop
#### The `DROP VIEW` Statement
Sqlite reference: https://www.sqlite.org/lang_dropview.html
Validations:
* the view name must have been previously mentioned in a `create view` statement
* recall that DDL that is not in a procedure only declares schema it doesn't create it, therefore you can declare an entity you intend to drop
#### The `DROP INDEX` Statement
Sqlite reference: https://www.sqlite.org/lang_dropindex.html
* the index name must have been previously mentioned in a `create index` statement
* recall that DDL that is not in a procedure only declares schema it doesn't create it, therefore you can declare an entity you intend to drop
#### The `DROP TRIGGER` Statement
Sqlite reference: https://www.sqlite.org/lang_droptrigger.html
* the index name must have been previously mentioned in a `create trigger` statement
* recall that DDL that is not in a procedure only declares schema it doesn't create it, therefore you can declare an entity you intend to drop
#### The `RAISE` Function
Sqlite reference: https://sqlite.org/syntax/raise-function.html
CQL validates that `RAISE` is being used in the context of a trigger
and that it has valid arguments
#### The `ALTER TABLE ADD COLUMN` Statement
Sqlite reference: https://sqlite.org/lang_altertable.html
CQL supports only `alter table add column`, validations include:
* the table must exist
* the column definition of the new column must be valid
* all the normal rules for column definitions in create table are checked
* e.g., foreign keys exist if present, check constraints are valid expressions
* no auto-increment columns may be added
* added columns must be either nullable or have a default value
* the added column should already exist in the declared version of the table, see below
The CQL compiler expects that the declared form of a table is complete, which means
it already includes any column that the user intends to add. This is important
to create a singular view of the schema throughout the compilation. The purpose
of `alter table add column` is then in some sense to make the reality of the
physical schema match the declared schema. This is what the built in schema upgrader
does and likewise any schema alteration that a user might want to do should be
similar -- make the real schema match the declared schema. This puts CQL in the
strange position of validating that the added column is _already declared_ and
that the details match.
>NOTE: `Alter` statements are typically used in the context of migration,
>so it is possible the table or column mentioned is condemned in a future
>version. The compiler still has to run the intervening upgrade steps
>so basically the validation must ignore the current "deadness" of the table
>or column as in context it might be "not dead yet". This will be more obvious
>in the context of the schema maintenance features. (q.v.)
#### The `DELETE` Statement
SQLite reference: https://sqlite.org/lang_delete.html
Verifications:
* the table name refers to an existing table
* if present, the `WHERE` clause is a valid numeric expression
#### The `UPDATE` Statement
SQLite reference: https://sqlite.org/lang_update.html
* the table name refers to an existing table
* the updated columns exist and are not duplicated
* every column update expression is valid and type compatible with the column it updates
* additional clauses such as `WHERE`, `LIMIT`, etc. are numeric
* the `ORDER BY` clause is a valid ordering clause (no duplicate columns)
* the `RETURNING` clause is not supported at this time
* the `FROM` clause is supported and validated as in the select statement
* the evaluation proceeds as though the target table had an unrestricted join to the `FROM` clause
* it's normal to include a `WHERE` clause so as to not get a cross product
The example in the SQLite documentation makes the `FROM` semenatics clear:
```SQL
UPDATE inventory
SET quantity = quantity - daily.amt
FROM (SELECT sum(quantity) AS amt, itemId FROM sales GROUP BY 2) AS daily
WHERE inventory.itemId = daily.itemId;
inventory
was joined against daily
(a nested select) and the join condition appears in the WHERE
clause.
The following sugared version of the UPDATE
statement is also supported:
UPDATE some_table SET (a,b,c) = (1, 2, "xx") WHERE ...;
This form at first appears to be less good than the usual form in that the clarity of which column is getting which value is gone however it creates symmetry with the INSERT
statement and like the INSERT
statement the column names or values may be generated from LIKE
and FROM
forms as described in Chapter 5. These forms allow for bundles of arguments or columns of cursors to be easily updated. Consider this example:
UPDATE something(LIKE C) = (FROM C) WHERE id = 12;
The usual shape forms are supported, so C
could be ARGUMENTS
or LOCALS
etc.
As with all the other sugared syntax forms, the statement is automatically converted to the normal form. SQLite will never see the sugar, nor indeed to later stages of the compiler.
INSERT
StatementSQLite reference: https://sqlite.org/lang_insert.html
Verifications:
@dummy_seed
is be used to generate missing column values, (Chapter 12 covers this in greater detail)Note that the column names or values may be generated from LIKE
and FROM
forms as described in Chapter 5. These forms allow for bundles of arguments or columns of cursors to be easily inserted.
The following sugared syntax is also supported:
INSERT INTO somewhere USING
1 foo,
2 bar,
"xx" baz;
This form is much less error prone than the equivalent (below) because the correspondence between columns is readily visible.
INSERT INTO somewhere(foo, bar, baz) VALUES(1,2,"xx");
The sugared version is converted into the normal version.
THROW
StatementThrow
can appear in any statement context, so it is always valid.
BEGIN TRANSACTION
StatementSQLite reference: https://www.sqlite.org/lang_transaction.html
Begin Transaction
can appear in any statement context, so it is always valid.
COMMIT TRANSACTION
StatementSQLite reference: https://www.sqlite.org/lang_transaction.html
Commit Transaction
can appear in any statement context, so it is always valid.
ROLLBACK TRANSACTION
StatementSQLite reference: https://www.sqlite.org/lang_transaction.html
Rollback transaction
can appear in any statement context, but if you’re using the format where you rollback to a particular save point, then the compiler verifies that it has seen the save point name in a previous Savepoint
statement.
SAVEPOINT
StatementSQLite reference: https://www.sqlite.org/lang_savepoint.html
The Savepoint
an appear in any statement context. The save point name is recorded, so that the compiler can verify it in a rollback
. This is statement is like a weak declaration of the save point name.
RELEASE SAVEPOINT
StatementSQLite reference: https://www.sqlite.org/lang_savepoint.html
Release Savepoint
can appear in any statement context. The compiler verifies that it has seen the save point name in a previous Savepoint
statement.
PROCEDURE SAVEPOINT
StatementA common pattern is to have a save point associated with a particular procedure. The save point’s scope is the same as the procedure’s scope. More precisely
create procedure foo()
begin
savepoint
proc begin
-- your code
end;
end;
becomes:
create procedure foo()
begin
savepoint @proc; -- @proc is always the name of the current procedure
try-- your code
savepoint @proc;
release
catchrollback transaction to savepoint @proc;
savepoint @proc;
release
throw;end;
end;
This form is more than syntactic sugar because there are some interesting rules:
proc savepoint
form must be used at the top level of the procedure, hence no leave
or continue
may escape itbegin
/end
the return
form may not be used; you must use rollback return
or commit return
(see below)throw
may be used to return an error as usualproc savepoint
may be used again, at the top level, in the same procedure, if there are, for instance, several sequential stagesproc savepoint
could call another such procedure, or a procedure that manipulates savepoints in some other wayROLLBACK RETURN
StatementThis form may be used only inside of a proc savepoint
block. It indicates that the savepoint should be rolled back and then the procedure should return. It is exactly equivalent to:
rollback transaction to savepoint @proc;
savepoint @proc;
release return; -- wouldn't actually be allowed inside of proc savepoint; see note below
NOTE: to avoid errors, the loose
return
above is not actually allowed inside ofproc savepoint
– you must userollback return
orcommit return
.
COMMIT RETURN
StatementThis form may be used only inside of a proc savepoint
block. It indicates that the savepoint should be released and then the procedure should return. It is exactly equivalent to:
savepoint @proc;
release return; -- wouldn't actually be allowed inside of proc savepoint; see note below
Of course this isn’t exactly a commit, in that there might be an outer savepoint or outer transaction that might still be rolled back, but it is commited at its level of nesting, if you will. Or, equivalently, you can think of it as merging the savepoint into the transaction in flight.
NOTE: to avoid errors, the loose
return
above is not actually allowed inside ofproc savepoint
and you must userollback return
orcommit return
.
CREATE VIRTUAL TABLE
StatementSqlite reference: https://sqlite.org/lang_createvtab.html
The SQLite CREATE VIRTUAL TABLE
form is problematic for CQL because:
So in this area CQL departs from the standard syntax to this form:
create virtual table virt_table using my_module [(module arguments)] as (
id integer not null,
name text );
The part after the AS
is used by CQL as a table declaration for the virtual table. The grammar for that section is exactly the same as a normal CREATE TABLE
statement. However, that part is not transmitted to SQLite. When the virtual table is created, SQLite sees only the part it cares about, which is the part before the AS
.
In order to have strict parsing rules, the module arguments follow one of these forms:
@attribute
form, orarguments following
create virtual table virt_table using my_module as (
id integer not null,
name text );
becomes (to SQLite)
CREATE VIRTUAL TABLE virt_table USING my_module;
NOTE: empty arguments
USING my_module()
are not allowed in the SQLite docs but do seem to work in SQLite. We take the position that no args should be formatted with no parentheses, at least for now.
create virtual table virt_table using my_module(foo, 'goo', (1.5, (bar, baz))) as (
id integer not null,
name text );
CREATE VIRTUAL TABLE virt_table USING my_module(foo, "goo", (1.5, (bar, baz)));
This form allows for very flexible arguments but not totally arbitrary arguments, so it can still be parsed and validated.
This case recognizes the popular choice that the arguments are often the actual schema declaration for the table in question. So:
create virtual table virt_table using my_module(arguments following) as (
id integer not null,
name text );
becomes
CREATE VIRTUAL TABLE virt_table USING my_module(
id INTEGER NOT NULL,
name TEXT );
The normalized text (keywords capitalized, whitespace normalized) of the table declaration in the as
clause is used as the arguments.
Virtual tables go into their own section in the JSON and they include the module
and moduleArgs
entries; they are additionally marked isVirtual
in case you want to use the same processing code for virtual tables as normal tables. The JSON format is otherwise the same, although some things can’t happen in virtual tables (e.g., there is no TEMP
option so "isTemp"
must be false in the JSON.)
For purposes of schema processing, virtual tables are on the @recreate
plan, just like indices, triggers, etc. This is the only option since the alter table
form is not allowed on a virtual table.
Semantic validation enforces “no alter statements on virtual tables” as well as other things like no indices, and no triggers, since SQLite does not support any of those things.
CQL supports the notion of eponymous virtual tables. If you intend to register the virtual table’s module in this fashion, you can use create virtual table @eponymous ...
to declare this to CQL. The only effect this has is to ensure that CQL will not try to drop this table during schema maintenance as dropping such a table is an invalid operation. In all other ways, the fact that the table is eponymous makes no difference.
Finally, because virtual tables are always on the @recreate
plan, you may not have foreign keys that reference virtual tables. Such keys seem like a bad idea in any case.
These are the statements which form the language of procedures, and do not involve the database.
CREATE PROCEDURE
StatementSemantic analysis of stored procedures is fairly easy at the core:
In addition, while processing the statement: * we determine if it uses the database; this will change the emitted signature of the proc to include a sqlite3 *db
input argument and it will return a sqlite error code (e.g. SQLITE_OK
) * select statements that are loose in the proc represent the “return” of that select; this changes the signature to include a sqlite3_stmt **pstmt
parameter corresponding to the returned statement
IF
StatementThe top level if node links the initial condition with a possible series of else_if nodes and then the else node. Each condition is checked for validity. The conditions must be valid expressions that can each be converted to a boolean.
SET
StatementThe set statement is for variable assignment. We just validate that the target exists and is compatible with the source. Cursor variables cannot be set with simple assignment and CQL generates errors if you attempt to do so.
LET
StatementLet combines a DECLARE
and a SET
. The variable is declared to be the exact type of the right hand side. All the validations for DECLARE
and SET
are applicable, but there is no chance that the variable will not be compatible with the expression. The expression could still be erroneous in the first place. The variable could be a duplicate.
SWITCH
StatementThe SWITCH
form requires a number of conditions to successfully map down to a C
switch
statement. These are:
integer not null
or long integer not null
)
WHEN
expressions must be losslessly promotable to the type of the switch-expressionWHEN
clauses must be uniqueALL VALUES
is present then:
WHEN
values must cover every value of the enum except those beginning with ’_’WHEN
values not in the enumELSE
clauseDECLARE PROCEDURE
StatementThere are three forms of this declaration: * a regular procedure with no DML * e.g. declare proc X(id integer);
* a regular procedure that uses DML (it will need a db parameter and returns a result code) * e.g. declare proc X(id integer) using transaction;
* a procedure that returns a result set, and you provide the result columns * e.g. declare proc X(id integer) : (A bool not null, B text);
The main validations here are that there are no duplicate parameter names, or return value columns.
DECLARE FUNCTION
StatementFunction declarations are similar to procedures; there must be a return type (use proc if there is none). The DECLARE SELECT FUNCTION
form indicates a function visible to SQLite; other functions are usable in the call
statement.
DECLARE
Variable StatementThis declares a new local or global variable that is not a cursor. The type is computed with the same helper that is used for analyzing column definitions. Once we have the type we walk the list of variable names, check them for duplicates and such (see above) and assign their type. The canonical name of the variable is defined here. If it is later used with a different casing the output will always be as declared. e.g. declare Foo integer; set foo = 1;
is legal but the output will always contain the variable written as Foo
.
DECLARE
Cursor StatementThere are two forms of the declare cursor, both of which allow CQL to infer the exact type of the cursor. * declare foo cursor for select etc.
* the type of the cursor is the net struct type of the select list * declare foo cursor for call proc();
* proc must be statement that produces a result set via select (see above) * the type of the cursor is the struct of the select returned by the proc * note if there is more than one loose select in the proc they must match exactly * cursor names have the same rules regarding duplicates as other variables With this in mind, both cases simply recurse on either the select or the call and then pull out the structure type of that thing and use it for the cursor’s shape. If the call
is not semantically valid according to the rules for calls or the select
is not semantically valid, then of course this declaration will generate errors.
DECLARE
Value Cursor StatementThis statement declares a cursor that will be based on the return type of a procedure. When using this form the cursor is also fetched, hence the name. The fetch result of the stored proc will be used for the value. At this point, we use its type only. * the call must be semantically valid * the procedure must return an OUT parameter (not a result set) * the cursor name must be unique
WHILE
StatementWhile semantic analysis is super simple. * the condition must be numeric * the statement list must be error-free * loop_depth is increased allowing the use of interior leave/continue
LOOP
StatementLoop analysis is just as simple as “while” – because the loop_stmt literally has an embedded fetch, you simply use the fetch helper to validate that the fetch is good and then visit the statement list. Loop depth is increased as it is with while.
CALL
StatementThere are three ways that a call can happen: * signatures of procedures that we know in full: * call foo(); * declare cursor for call foo(); * some external call to some outside function we don’t know * e.g. call printf(‘hello, world’);
The cursor form can be used if and only if the procedure has a loose select or a call to a procedure with a loose select. In that case, the procedure will have a structure type, rather than just “ok” (the normal signature for a proc). If the user is attempting to do the second case, cursor_name will be set and the appropriate verification happens here.
NOTE: Recursively calling fetch cursor is not really doable in general because at the point in the call we might not yet know that the method does in fact return a select. You could make it work if you put the select before the recursive call.
Semantic rules: * for all cases each argument must be error-free (no internal type conflicts) * for known procs * the call has to have the correct number of arguments * if the formal is an out parameter the argument must be a variable * the type of the variable must be an exact type match for the formal * non-out parameters must be type-compatible, but exact match is not required
DECLARE OUT CALL
StatementThis form is syntactic sugar and corresponds to declaring any OUT
parameters of the CALL
portion that are not already declared as the exact type of the OUT
parameter. This is intended to save you from declaring a lot of variables just so that you can use them as OUT
arguments.
Since any variables that already exist are not re-declared, there are no additional semantic rules beyond the normal call except that it is an error to use this form if no OUT
variables needed to be declared.
FETCH
StatementThe fetch statement has two forms:
fetch C into var1, var2, var3 etc.
fetch C;
The second form is the so-called automatic cursor form.
In the first form the variables of the cursor must be assignment compatible with declared structure type of the cursor and the count must be correct. In the second form, the codegen will implicitly create local variables that are exactly the correct type, but we’ll cover that later. Since no semantic error is possible in that case, we simply record that this is an automatic cursor and then later we will allow the use of C.field during analysis. Of course “C” must be a valid cursor.
CONTINUE
StatementWe just need to ensure that continue
is inside a loop
or while
.
LEAVE
StatementWe only need to ensure that leave
is inside a loop
, while
or switch
.
TRY/CATCH
StatementsNo analysis needed here other than that the two statement lists are ok.
CLOSE
CURSOR StatementFor close cursor
, we just validate that the name is in fact a cursor and it is not a boxed cursor. Boxed cursor lifetime is managed by the box object so manually closing it is not allowed. Instead, the usual reference-counting semantics apply; the boxed cursor variable typically falls out of scope and is released, or is perhaps set to NULL to release its reference early.
OUT
CURSOR StatementFor out cursor
, we first validate that the name is a cursor then we set the output type of the procedure we’re in accordingly.
The program’s control/ the overall meaning of the program / or may give the compiler specific directives as to how the program should be compiled.
@ATTRIBUTE
NotationZero or more miscellaneous attributes can be added to any statement, or any column definition. See the misc_attrs
node in the grammar for the exact placement. Each attribute notation has the general form:
@attribute
(namespace : attribute-name ) or@attribute
(namespace : attribute-name = attribute-value)The namespace portion is optional and attributes with special meaning to the compiler are all in the cql:
namespace. e.g. @attribute(cql:private). This form is so common that the special abbreviation [[foo]]`` can be used instead of
@attribute(cql:foo)and all the builtin attributes are in the
cqlnamespace so we can always use
[[foo]]`. All examples in the documentation use this form.
Since the attribute-values can nest it’s possible to represent arbitrarily complex data types in an attribute. You can even represent a LISP program.
By convention, CQL lets you define “global” attributes by applying them to a global variable of type object
ending with the suffix “database”.
The main usage of global attributes is as a way to propagate configurations into the JSON output (q.v.).
Examples:
-- "global" attributes
@attribute(attribute_1 = "value_1")
@attribute(attribute_2 = "value_2")
declare database object;
-- cql:private marking on a method to make it private, using simplified syntax
private]]
[[
proc foo()begin
end;
@ECHO
StatementEcho is valid in any top level contexts.
@PREVIOUS SCHEMA
StatementBegins the region where previous schema will be compared against what has been declared before this directive for alterations that could not be upgraded.
@SCHEMA_UPGRADE_SCRIPT
StatementWhen upgrading the DDL, it’s necessary to emit create table statements for the original version of the schema. These create statements may conflict with the current version of the schema. This attribute tells CQL to
@SCHEMA_UPGRADE_VERSION
StatementFor sql stored procedures that are supposed to update previous schema versions you can use this attribute to put CQL into that mindset. This will make the columns hidden for the version in question rather than the current version. This is important because older schema migration procedures might still refer to old columns. Those columns truly exist at that schema version.
@ENFORCE_STRICT
StatementSwitch to strict mode for the indicated item. The choices and their meanings are:
CAST
indicates that casts that would be a no-op should instead generate errors (e.g. cast(1 as int)
)CURSOR HAS ROW
indicates that cursors with storage must first be tested to see if they have a row before non-null fields are accessedENCODE CONTEXT COLUMN
indicates that an encode context must be specified in when vault_sensitive
is usedENCODE CONTEXT TYPE
type specifies the required type of encode context columnsFOREIGN KEY ON DELETE
indicates there must be some ON DELETE
action in every FKFOREIGN KEY ON UPDATE
indicates there must be some ON UPDATE
action in every FKINSERT SELECT
indicates that insert with SELECT
for values may not include top level joins (avoiding a SQLite bug)INSERT SELECT
indicates that insert with select may not include joins (**)IS TRUE`` indicates that
IS TRUE
IS FALSE
IS NOT TRUE
IS NOT FALSE` may not be used (*)JOIN
indicates only ANSI style joins may be used, and “from A,B” is rejectedPROCEDURE
indicates no calls to undeclared procedures (like loose printf calls)SELECT IF NOTHING
indicates (select ...)
expressions must include an IF NOTHING
clause if they have a FROM
partSIGN FUNCTION
indicates that the sign
function may not be used (*)TABLE FUNCTION
indicates table valued functions cannot be used on left/right joins (avoiding a SQLite bug)TRANSACTION
indicates no transactions may be started, committed, or abortedUPDATE FROM
indicates that the update
with a from
clause may not be used (*)UPSERT
indicates no upsert statement may be used (*)WINDOW FUNCTION
indicates no window functions may be used (*)WITHOUT ROWID
indicates WITHOUT ROWID
may not be usedThe items marked with * are present so that features can be disabled to target downlevel versions of SQLite that may not have those features.
The items marked with ** are present so that features can be disabled to target downlevel versions of SQLite where this feature has bugs.
Most of the strict options were discovered via “The School of Hard Knocks”, they are all recommended except for the (*) items where the target SQLite version is known.
See the grammar details for exact syntax.
@ENFORCE_NORMAL
StatementTurn off strict enforcement for the indicated item.
@ENFORCE_PUSH
StatementPush the current strict settings onto the enforcement stack. This does not change the current settings.
@ENFORCE_POP
StatementPop the previous current strict settings from the enforcement stack.
@ENFORCE_RESET
StatementTurns off all the strict modes. Best used immediately after @ENFORCE_PUSH
.
@DECLARE_SCHEMA_REGION
StatementA schema region is a partitioning of the schema such that it only uses objects in the same partition or one of its declared dependencies. One schema region may be upgraded independently from any others (assuming they happen such that dependents are done first.)
Here we validate:
@BEGIN_SCHEMA_REGION
StatementEntering a schema region makes all the objects that follow part of that region. It also means that all the contained objects must refer to only pieces of schema that are in the same region or a dependent region. Here we validate that region we are entering is in fact a valid region and that there isn’t already a schema region.
@END_SCHEMA_REGION
StatementLeaving a schema region puts you back in the default region. Here we check that we are in a schema region.
@EMIT_ENUMS
StatementDeclared enumarations can be voluminous and it is undesirable for every emitted .h
file to contain every enumeration. To avoid this problem you can emit enumeration values of your choice using @emit_enums x, y, z
which places the named enumerations into the .h
file associated with the current translation unit. If no enumerations are listed, all enums are emitted.
NOTE: generated enum definitions are protected by #ifndef X ... #endif
so multiple definitions are harmless and hence you can afford to use @emit_enums
for the same enum in several translations units, if desired.
NOTE: Enumeration values also appear in the JSON output in their own section.
@EMIT_CONSTANTS
StatementThis statement is entirely analogous to the the @EMIT_ENUMS
except that the parameters are one or more constant groups. In fact constants are put into groups precisely so that they can be emitted in logical bundles (and to encourage keeping related constants together). Placing @EMIT_CONSTANTS
causes the C version of the named groups to go into the current .h
file.
NOTE: Global constants also appear in the JSON output in their own section.
These items appear in a variety of places and are worthy of discussion. They are generally handled uniformly.
In each case we walk the entire list and do the type inference on each argument. Note that this happens in the context of a function call, and depending on what the function is, there may be additional rules for compatibility of the arguments with the function. The generic code doesn’t do those checks, there is per-function code that handles that sort of thing.
At this stage the compiler computes the type of each argument and makes sure that, independently, they are not bogus.
If a procedure is returning a select statement then we need to attach a result type to the procedure’s semantic info. We have to do some extra validation at this point, especially if the procedure already has some other select that might be returned. The compiler ensures that all the possible select results are are 100% compatible.
Every name is checked in a series of locations. If the name is known to be a table, view, cursor, or some other specific type of object then only those name are considered. If the name is more general a wider search is used.
Among the places that are considered:
Discriminators can appear on any type, int
, real
, object
, etc.
Where there is a discriminator the compiler checks that (e.g.) object<Foo>
only combines with object<Foo>
or object
. real<meters>
only combines with real<meters>
or real
. In this way its not possible to accidentally add meters
to kilograms
or to store an int<task_id>
where an int<person_id>
is required.
CASE
ExpressionThere are two parts to this: the “when” expression and the “then” expression. We compute the aggregate type of the “when” expressions as we go, promoting it up to a larger type if needed (e.g. if one “when” is an int and the other is a real, then the result is a real). Likewise, nullability is computed as the aggregate. Note that if nothing matches, the result is null, so we always get a nullable resultm unless there is an “else” expression. If we started with case expression, then each “when” expression must be comparable to the case expression. If we started with case when xx then yy; then each case expression must be numeric (typically boolean).
BETWEEN
ExpressionBetween requires type compatibility between all three of its arguments. Nullability follows the usual rules: if any might be null then the result type might be null. In any case, the result’s core type is BOOL.
CAST
ExpressionFor cast expressions we use the provided semantic type; the only trick is that we preserve the extra properties of the input argument. e.g. CAST does not remove NOT NULL
.
COALESCE
FunctionCoalesce requires type compatibility between all of its arguments. The result is a not null type if we find a not null item in the list. There should be nothing after that item. Note that ifnull
and coalesce
are really the same thing except ifnull
must have exactly two arguments.
IN
AND NOT IN
ExpressionsThe in predicate is like many of the other multi-argument operators. All the items must be type compatible. Note that in this case the nullablity of the items does not matter, only the nullability of the item being tested. Note that null in (null) is null, not true.
Aggregate functions can only be used in certain places. For instance they may not appear in a WHERE
clause.
User defined function - this is an external function. There are a few things to check:
There are a few things to check:
A top level expression defines the context for that evaluation. Different expressions can have constraints. e.g. aggregate functions may not appear in the WHERE
clause of a statement. There are cases where expression nesting can happen. This nesting changes the evaluation context accordingly, e.g. you can put a nested select in a where clause and that nested select could legally have aggregates. Root expressions keep a stack of nested contexts to facilitate the changes.
A table factor is one of three things:
select * from X
(select X,Y from..) as T2
select * from (X, Y, Z)
Each of these has its own rules discussed elsewhere.
USING
ClauseWhen specifying joins, one of the alternatives is to give the shared columns in the join e.g. select * from X inner join Y using (a,b). This method validates that all the columns are present on both sides of the join, that they are unique, and they are comparable. The return code tells us if any columns had SENSITIVE data. See the Special Note on JOIN…USING below
ON
ClauseThe most explicit join condition is a full expression in an ON clause this is like select a,b from X inner join Y on X.id = Y.id;
The on expression should be something that can be used as a bool, so any numeric will do. The return code tells us if the ON condition used SENSITIVE data.
Table valued functions can appear anywhere a table is allowed. The validation rules are:
### Special Note on the select *
and select T.*
forms
The select *
construct is very popular in many codebases but it can be unsafe to use in production code because, if the schema changes, the code might get columns it does not expect. Note the extra columns could have appeared anywhere in the result set because the *
applies to the entire result of the FROM
clause, joins and all, so extra columns are not necessarily at the end and column ordinals are not preserved. CQL mitigates this situation somewhat with some useful constraints/features:
select *
, and indeed in any query, the column names of the select must be unique, this is because:
select *
or a select T.*
CQL will automatically expand the *
into the actual logical columns that exist in the schema at the time the code was compiled
Expanding the *
at compile time means Sqlite cannot see anything that might tempt it to include different columns in the result than CQL saw at compile time.
With this done we just have to look at the places a select *
might appear so we can see if it is safe (or at least reasonably safe) to use *
and, by extension of the same argument, T.*
.
In an EXISTS
or NOT EXISTS
clause like where not exists (select * from x)
*
select *
is not even expanded in this case.In a statement that produces a result set like select * from table_or_view
*
In a cursor statement like declare C cursor for select * from table_or_view
there are two cases here:
Automatic Fetch fetch C;
Manual Fetch: fetch C into a, b, c;
So considering the cases above we can conclude that auto expanding the *
into the exact columns present in the compile-time schema version ensures that any incompatible changes result in compile time errors. Adding columns to tables does not cause problems even if the code is not recompiled. This makes the *
construct much safer, if not perfect, but no semantic would be safe from arbitrary schema changes without recompilation. At the very least here we can expect a meaningful runtime error rather than silently fetching the wrong columns.
CQL varies slightly from SQLite in terms of the expected results for joins if the USING syntax is employed. This is not the most common syntax (typically an ON clause is used) but Sqlite has special rules for this kind of join.
Let’s take a quick look. First some sample data:
create table A( id integer, a text, b text);
create table B( id integer, c text, d text);
insert into A values(1, 'a1', 'b1');
insert into B values(1, 'c1', 'd1');
insert into A values(2, 'a2', 'b2');
insert into B values(2, 'c2', 'd2');
Now let’s look at the normal join; this is our reference:
select * from A T1 inner join B T2 on T1.id = T2.id;
result:
1|a1|b1|1|c1|d1
2|a2|b2|2|c2|d2
As expected, you get all the columns of A, and all the columns of B. The ‘id’ column appears twice.
However, with the USING
syntax:
select * T1 inner join B T2 using (id);
result:
1|a1|b1|c1|d1
2|a2|b2|c2|d2
The id
column is now appearing exactly once. However, the situation is not so simple as that. It seems that the *
expansion has not included two copies of id
but the following cases show that both copies of id
are still logically in the join.
select T1.*, 'xxx', T2.* from A T1 inner join B T2 using (id);
result:
1|a1|b1|xxx|1|c1|d1
2|a2|b2|xxx|2|c2|d2
The T2.id
column is part of the join, it just wasn’t part of the *
In fact, looking further:
select T1.id, T1.a, T1.b, 'xxx', T2.id, T2.c, T2.d from A T1 inner join B T2 using (id);
result:
1|a1|b1|xxx|1|c1|d1
2|a2|b2|xxx|2|c2|d2
There is no doubt, T2.id
is a valid column and can be used in expressions freely. That means the column cannot be removed from the type calculus.
Now in CQL, the *
and T.*
forms are automatically expanded; SQLite doesn’t see the *
. This is done so that if any columns have been logically deleted they can be elided from the result set. Given that this happens, the *
operator will expand to ALL the columns. Just the same as if you did T1.*
and T2.*
.
As a result, in CQL, there is no difference between the USING
form of a join and the ON
form of a join.
select *
form could possibly be different, so in most cases this ends up being moot anyway. Typically, you can’t use *
in the presence of joins because of name duplication and ambiguity of the column names of the result set. CQL’s automatic expansion means you have a much better idea exactly what columns you will get - those that were present in the schema you declared. The CQL @COLUMNS
function (q.v.) can also help you to make good select lists more easily.
CQL has a lot of schema knowledge already and so it’s well positioned to think about schema upgrades and versioning.
It seemed essential to be able to record changes to the schema over time so CQL got an understanding of versioning. This lets you do things like:
To use cql in this fashion, the sequence will be something like the below. See Appendix 1 for command line details.
cql --in input.sql --rt schema_upgrade --cg schema_upgrader.sql \
--global_proc desired_upgrade_proc_name
There are three basic flavors of annotation
@create(version [, migration proc])
@delete(version [, migration proc])
@recreate
They have various constraints:
@create
and @delete
can only be applied to tables and columns@recreate
can only be applied to tables (nothing else needs it anyway)@recreate
cannot mix with @create
or @delete
@recreate
can include a group name as in @recreate(musketeers)
; if a group name is specified then all the tables in that group are recreated if any of them changeIndices, Views, and Triggers are always “recreated” (just like tables can be) and so neither the @recreate
nor the @create
annotations are needed (or allowed). However when an Index, View, or Trigger is retired it must be marked with @delete
so that it isn’t totally forgotten but can be deleted anywhere it might still exist. Note that when one of these items is deleted, the definition is not used as it will only be dropped anyway. The simplest creation of the object with the correct name will do the job as a tombstone.
e.g. create view used_to_be_fabulous as select 1 x @delete(12);
suffices to drop the used_to_be_fabulous
view in version 12 no matter how complicated it used to be. Its CREATE VIEW
will not be emitted into the upgrade procedure in any case. Similarly, trivial indices and triggers of the correct name can be used for the tombstone.
In addition, if there is some data migration that needs to happen at a particular schema version that isn’t associated with any particular change in schema, you can run an ad hoc migrator at any time. The syntax for that is @schema_ad_hoc_migration(version, migration proc);
. Ad hoc migrations are the last to run in any given schema version; they happen after table drop migrations.
@create
declares that the annotated object first appeared in the indicated version, and at that time the migration proc needs to be executed to fill in default values, denormalize values, or whatever the case may be.
@delete
declares that the annotated object disappeared in the indicated version, and at that time the migration proc needs to be executed to clean up the contents, or potentially move them elsewhere.
@recreate
declares that the annotated object can be dropped and recreated when it changes because there is no need to preserve its contents during an upgrade. Such objects may be changed arbitrarily from version to version.
@recreate
table may have @create
or @delete
(these aren’t needed anyway)
@recreate
never have deprecated columns (since @delete
isn’t allowed on their columns)NOTE: all annotations are suppressed from generated SQL. SQLite never sees them.
NOTE: looking at the annotations it is possible to compute the logical schema at any version, especially the original schema – it’s what you get if you disregard all
@delete
entirely (don’t delete) and then remove anything marked with@create
directives.
Not all migrations are possible in a sensible fashion, therefore CQL enforces certain limitations:
@create
)@delete
)@delete
) like tablesMoving from one schema version to another is done in an orderly fashion with the migration proc taking these essential steps in this order:
the cql_schema_facets
table is created if needed – this records the current state of the schema
the last known schema hash is read from the cql_schema_facets
tables (it is zero by default)
if the overall schema hash code matches what is stored, processing stops; otherwise an upgrade ensues
all known views are dropped (hence migration procs won’t see them!)
any index that needs to change is dropped (this includes items marked @delete
or indices that are different than before)
all known triggers are dropped (hence they will not fire during migration!)
the current schema version is extracted from cql_schema_facets
(it is zero by default)
if the current schema version is zero, then the original versions of all the tables are created
if the current schema version is <= 1 then
cql_schema_facets
cql_schema_facets
as it happens so it is not repeated
the above process is repeated for all schema versions up to the current version
all tables that are marked with @recreate
are re-created if necessary
drop
it and create the new version.all indices that changed and were not marked with @delete
are re-created
all views not marked with @delete
are re-created
all triggers not marked with @delete
are re-installed
the current schema hash is written to the cql_schema_facets
table
Here’s an example of a schema directly from the test cases:
-- crazy amount of versioning here
create table foo(
id integer not null,
rate long integer @delete(5),
rate_2 long integer @delete(4, DeleteRate2Proc),
id2 integer default 12345 @create(4, CreateId2Proc),
name text @create(5),
name_2 text @create(6)
);
-- much simpler table, lots of stuff added in v2.
-- note v1 is the first new version and v0 is base version
create table table2(
id integer not null,
name1 text @create(2, CreateName1Proc),
name2 text @create(2, CreateName2Proc),
name3 text @create(2), -- no proc
name4 text @create(2) -- no proc
);
create table added_table(
id integer not null,
name1 text,
name2 text @create(4)
) @create(3) @delete(5);
-- this view is present in the output
create view live_view as select * from foo;
-- this view is also present in the output
create view another_live_view as select * from foo;
-- this view is not present in the output
create view dead_view as select * from foo @delete(2);
-- this index is present
create index index_still_present on table2(name1, name2);
-- this index is going away
create index index_going_away on table2(name3) @delete(3);
-- this is a simple trigger, and it's a bit silly but that doesn't matter
create trigger trigger_one
after insert on foo
begin
delete from table2 where table2.id = new.id;
end;
This schema has a LOT of versioning… you can see tables and columns appearing in versions 2 through 6. There is a lot of error checking happening.
With just those annotations you can automatically create the following upgrade script which is itself CQL (and hence has to be compiled). Notice that this code is totally readable!
The script has been split into logical pieces to make it easier to explain what’s going on.
-- ...copyright notice... possibly generated source tag... elided to avoid confusion
-- no columns will be considered hidden in this script
-- DDL in procs will not count as declarations
@SCHEMA_UPGRADE_SCRIPT;
Schema upgrade scripts need to see all the columns even the ones that would be logically deleted in normal mode. This is so that things like alter table add column
can refer to real columns and drop table
can refer to a table that shouldn’t even be visible. Remember in CQL the declarations tell you the logical state of the universe and DLL mutations are expected to create that condition, so you should be dropping tables that are marked with @delete
CQL stores the current state of the universe in this table.
-- schema crc -7714030317354747478
The schema crc is computed by hashing all the schema declarations in canonical form. That’s everything in this next section.
CQL uses a set of four functions to manage a dictionary. The implementation is in cqlrt_common.c
but it’s really just a simple hash table that maps from a string key to a number. This functionality was added because over time the facets table can get pretty big and running a SQL query every time to read a single integer is not economical.
-- declare facet helpers--
DECLARE facet_data TYPE LONG<facet_data> not null;
DECLARE test_facets facet_data;
DECLARE FUNCTION cql_facets_new() facet_data;
DECLARE PROCEDURE cql_facets_delete(facets facet_data);
DECLARE FUNCTION cql_facet_add(facets facet_data, facet TEXT NOT NULL, crc LONG NOT NULL) BOOL NOT NULL;
DECLARE FUNCTION cql_facet_find(facets facet_data, facet TEXT NOT NULL) LONG NOT NULL;
Wherein all the necessary objects are declared…
-- declare sqlite_master --
CREATE TABLE sqlite_master (
type TEXT NOT NULL,
NOT NULL,
name TEXT NOT NULL,
tbl_name TEXT INTEGER NOT NULL,
rootpage NOT NULL
sql TEXT );
The sqlite_master
table is built-in but it has to be introduced to CQL so that we can query it. Like all the other loose DDL declarations here there is no code generated for this. We are simply declaring tables. To create code you have to put the DDL in a proc. Normally DDL in procs also declares the table but since we may need the original version of a table created and the final version declared we have @schema_upgrade_script
to help avoid name conflicts.
-- declare full schema of tables and views to be upgraded --
CREATE TABLE foo(
id INTEGER NOT NULL,
LONG INT @DELETE(5),
rate LONG INT @DELETE(4, DeleteRate2Proc),
rate_2 INTEGER DEFAULT 12345 @CREATE(4, CreateId2Proc),
id2 5),
name TEXT @CREATE(6)
name_2 TEXT @CREATE(
);
CREATE TABLE table2(
id INTEGER NOT NULL,
2, CreateName1Proc),
name1 TEXT @CREATE(2, CreateName2Proc),
name2 TEXT @CREATE(2),
name3 TEXT @CREATE(2)
name4 TEXT @CREATE(
);
CREATE TABLE added_table(
id INTEGER NOT NULL,
name1 TEXT,4)
name2 TEXT @CREATE(3) @DELETE(5); ) @CREATE(
NOTE: all the tables are emitted including all the annotations. This lets us do the maximum validation when we compile this script.
CREATE VIEW live_view AS
SELECT *
FROM foo;
CREATE VIEW another_live_view AS
SELECT *
FROM foo;
CREATE VIEW dead_view AS
SELECT *
FROM foo @DELETE(2);
These view declarations do very little. We only need the view names so we can legally drop the views. We create the views elsewhere.
CREATE INDEX index_still_present ON table2 (name1, name2);
CREATE INDEX index_going_away ON table2 (name3) @DELETE(3);
Just like views, these declarations introduce the index names and nothing else.
CREATE TRIGGER trigger_one
AFTER INSERT ON foo
BEGIN
DELETE FROM table2 WHERE table2.id = new.id;
END;
We have only the one trigger; we declare it here.
-- facets table declaration --
CREATE TABLE IF NOT EXISTS test_cql_schema_facets(
NOT NULL PRIMARY KEY,
facet TEXT LONG INTEGER NOT NULL
version );
This is where we will store everything we know about the current state of the schema. Below we define a few helper procs for reading and writing that table and reading sqlite_master
-- saved facets table declaration --
CREATE TEMP TABLE test_cql_schema_facets_saved(
NOT NULL PRIMARY KEY,
facet TEXT LONG INTEGER NOT NULL
version );
We will snapshot the facets table at the start of the run so that we can produce a summary of the changes at the end of the run. This table will hold that snapshot.
NOTE: the prefix “test” was specified when this file was built so all the methods and tables begin with
test_
.
-- helper proc for testing for the presence of a column/type
CREATE PROCEDURE test_check_column_exists(table_name TEXT NOT NULL,
NOT NULL,
decl TEXT OUT present BOOL NOT NULL)
BEGIN
SET present := (SELECT EXISTS(SELECT * FROM sqlite_master
WHERE tbl_name = table_name AND sql GLOB decl));
END;
check_column_exists
inspects sqlite_master
and returns true if a column matching decl
exists.
-- helper proc for creating the schema version table
CREATE PROCEDURE test_create_cql_schema_facets_if_needed()
BEGIN
CREATE TABLE IF NOT EXISTS test_cql_schema_facets(
NOT NULL PRIMARY KEY,
facet TEXT LONG INTEGER NOT NULL
version
);END;
Here we actually create the cql_schema_facets
table with DDL inside a proc. In a non-schema-upgrade script the above would give a name conflict.
-- helper proc for saving the schema version table
CREATE PROCEDURE test_save_cql_schema_facets()
BEGIN
DROP TABLE IF EXISTS test_cql_schema_facets_saved;
CREATE TEMP TABLE test_cql_schema_facets_saved(
NOT NULL PRIMARY KEY,
facet TEXT LONG INTEGER NOT NULL
version
);INSERT INTO test_cql_schema_facets_saved
SELECT * FROM test_cql_schema_facets;
END;
The save_sql_schema_facets
procedure simply makes a snapshot of the current facets table. Later we use this snapshot to report the differences by joining these tables.
-- helper proc for setting the schema version of a facet
CREATE PROCEDURE test_cql_set_facet_version(_facet TEXT NOT NULL,
LONG INTEGER NOT NULL)
_version BEGIN
INSERT OR REPLACE INTO test_cql_schema_facets (facet, version)
VALUES(_facet, _version);
END;
-- helper proc for getting the schema version of a facet
CREATE PROCEDURE test_cql_get_facet_version(_facet TEXT NOT NULL,
out _version LONG INTEGER NOT NULL)
BEGIN
TRYSET _version := (SELECT version FROM test_cql_schema_facets
WHERE facet = _facet LIMIT 1 IF NOTHING -1);
CATCHSET _version := -1;
END;
END;
The two procedures cql_get_facet_version
and cql_set_facet_version
do just what you would expect. Note the use of try
and catch
to return a default value if the select fails.
There are two additional helper procedures that do essentially the same thing using a schema version index. These two methods exist only to avoid unnecessary repeated string literals in the output file which cause bloat.
-- helper proc for getting the schema version CRC for a version index
CREATE PROCEDURE test_cql_get_version_crc(_v INTEGER NOT NULL, out _crc LONG INTEGER NOT NULL)
BEGIN
SET _crc := cql_facet_find(test_facets, printf('cql_schema_v%d', _v));
END;
-- helper proc for setting the schema version CRC for a version index
CREATE PROCEDURE test_cql_set_version_crc(_v INTEGER NOT NULL,
LONG INTEGER NOT NULL)
_crc BEGIN
INSERT OR REPLACE INTO test_cql_schema_facets (facet, version)
VALUES('cql_schema_v'||_v, _crc);
END;
As you can see, these procedures are effectively specializations of cql_get_facet_version
and cql_set_facet_version
where the facet name is computed from the integer.
Triggers require some special processing. There are so-called “legacy” triggers that crept into the system. These begin with tr__
and they do not have proper tombstones. In fact some are from early versions of CQL before they were properly tracked. To fix any old databases that have these in them, we delete all triggers that start with tr__
.
NOTE: we have to use the
GLOB
operator to do this, because_
is theLIKE
wildcard.
-- helper proc to reset any triggers that are on the old plan --
DECLARE PROCEDURE cql_exec_internal(sql TEXT NOT NULL) USING TRANSACTION;
CREATE PROCEDURE test_cql_drop_legacy_triggers()
BEGIN
DECLARE C CURSOR FOR SELECT name from sqlite_master
WHERE type = 'trigger' AND name GLOB 'tr__*';
LOOP FETCH C
BEGIN
call cql_exec_internal(printf('DROP TRIGGER %s;', C.name));
END;
END;
The ‘baseline’ or ‘v0’ schema is unannotated (no @create
or @recreate
). The first real schema management procedures are for creating and dropping these tables.
CREATE PROCEDURE test_cql_install_baseline_schema()
BEGIN
CREATE TABLE foo(
id INTEGER NOT NULL,
rate LONG_INT,
rate_2 LONG_INT
);
CREATE TABLE table2(
id INTEGER NOT NULL
);
END;
-- helper proc for dropping baseline tables before installing the baseline schema
CREATE PROCEDURE test_cql_drop_baseline_tables()
BEGIN
DROP TABLE IF EXISTS foo;
DROP TABLE IF EXISTS table2;
END;
The next section declares the migration procedures that were in the schema. These are expected to be defined elsewhere.
-- declared upgrade procedures if any
DECLARE proc CreateName1Proc() USING TRANSACTION;
DECLARE proc CreateName2Proc() USING TRANSACTION;
DECLARE proc CreateId2Proc() USING TRANSACTION;
DECLARE proc DeleteRate2Proc() USING TRANSACTION;
The code below will refer to these migration procedures. We emit a declaration so that we can use the names in context.
NOTE:
USING TRANSACTION
when applied to a proc declaration simply means the proc will access the database so it needs to be provided with asqlite3 *db
parameter.
-- drop all the views we know
CREATE PROCEDURE test_cql_drop_all_views()
BEGIN
DROP VIEW IF EXISTS live_view;
DROP VIEW IF EXISTS another_live_view;
DROP VIEW IF EXISTS dead_view;
END;
-- create all the views we know
CREATE PROCEDURE test_cql_create_all_views()
BEGIN
CREATE VIEW live_view AS
SELECT *
FROM foo;
CREATE VIEW another_live_view AS
SELECT *
FROM foo;
END;
View migration is done by dropping all views and putting all views back.
NOTE:
dead_view
was not created, but we did try to drop it if it existed.
-- drop all the indices that are deleted or changing
CREATE PROCEDURE test_cql_drop_all_indices()
BEGIN
IF cql_facet_find(test_facets, 'index_still_present_index_crc') != -6823087563145941851 THEN
DROP INDEX IF EXISTS index_still_present;
END IF;
DROP INDEX IF EXISTS index_going_away;
END;
-- create all the indices we need
CREATE PROCEDURE test_cql_create_indices()
BEGIN
IF cql_facet_find(test_facets, 'index_still_present_index_crc') != -6823087563145941851 THEN
CREATE INDEX index_still_present ON table2 (name1, name2);
CALL test_cql_set_facet_version('index_still_present_index_crc', -6823087563145941851);
END IF;
END;
Indices are processed similarly to views, however we do not want to drop indices that are not changing. Therefore we compute the CRC of the index definition. At the start of the script any indices that are condemned (e.g. index_going_away
) are dropped as well as any that have a new CRC. At the end of migration, changed or new indices are (re)created using cql_create_indices
.
- drop all the triggers we know
CREATE PROCEDURE test_cql_drop_all_triggers()
BEGIN
CALL test_cql_drop_legacy_triggers();
DROP TRIGGER IF EXISTS trigger_one;
END;
-- create all the triggers we know
CREATE PROCEDURE test_cql_create_all_triggers()
BEGIN
CREATE TRIGGER trigger_one
AFTER INSERT ON foo
BEGIN
DELETE FROM table2 WHERE table2.id = new.id;
END;
END;
Triggers are always dropped before migration begins and are re-instated quite late in the processing as we will see below.
To avoid selecting single rows out of the facets table repeatedly we introduce this procedure whose job is to harvest the facets table and store it in a dictionary. The helpers that do this were declared above. You’ve already seen usage of the facets in the code above.
CREATE PROCEDURE test_setup_facets()
BEGIN
TRYSET test_facets := cql_facets_new();
DECLARE C CURSOR FOR SELECT * from test_cql_schema_facets;
LOOP FETCH C
BEGIN
:= cql_facet_add(test_facets, C.facet, C.version);
LET added END;
CATCH-- if table doesn't exist we just have empty facets, that's ok
END;
END;
The main script orchestrates everything. There are inline comments for all of it. The general order of events is:
These operations are done in test_perform_needed_upgrades
* drop all views * drop condemned indices * fetch the current schema version * if version 0 then install the baseline schema (see below) * for each schema version with changes do the following: * create any tables that need to be created in this version * add any columns that need to be added in this version * run migration procs in this order: * create table * create column * delete trigger * delete view * delete index * delete column * delete table * drop any tables that need to be dropped in this version * mark schema upgraded to the current version so far, and proceed to the next version * each partial step is also marked as completed so that it can be skipped if the script is run again * create all the views * (re)create any indices that changed and are not dead * set the schema CRC to the current CRC
That’s it… the details are below.
CREATE PROCEDURE test_perform_upgrade_steps()
BEGIN
DECLARE column_exists BOOL NOT NULL;
DECLARE schema_version LONG INTEGER NOT NULL;
-- dropping all views --
CALL test_cql_drop_all_views();
-- dropping condemned or changing indices --
CALL test_cql_drop_all_indices();
-- dropping condemned or changing triggers --
CALL test_cql_drop_all_triggers();
---- install baseline schema if needed ----
CALL test_cql_get_version_crc(0, schema_version);
IF schema_version != -9177754326374570163 THEN
CALL test_cql_install_baseline_schema();
CALL test_cql_set_version_crc(0, -9177754326374570163);
END IF;
---- upgrade to schema version 2 ----
CALL test_cql_get_version_crc(2, schema_version);
IF schema_version != -6840158498294659234 THEN
-- altering table table2 to add column name1 TEXT;
CALL test_check_column_exists('table2', '*[( ]name1 TEXT*', column_exists);
IF NOT column_exists THEN
ALTER TABLE table2 ADD COLUMN name1 TEXT;
END IF;
-- altering table table2 to add column name2 TEXT;
CALL test_check_column_exists('table2', '*[( ]name2 TEXT*', column_exists);
IF NOT column_exists THEN
ALTER TABLE table2 ADD COLUMN name2 TEXT;
END IF;
-- altering table table2 to add column name3 TEXT;
CALL test_check_column_exists('table2', '*[( ]name3 TEXT*', column_exists);
IF NOT column_exists THEN
ALTER TABLE table2 ADD COLUMN name3 TEXT;
END IF;
-- altering table table2 to add column name4 TEXT;
CALL test_check_column_exists('table2', '*[( ]name4 TEXT*', column_exists);
IF NOT column_exists THEN
ALTER TABLE table2 ADD COLUMN name4 TEXT;
END IF;
-- data migration procedures
IF cql_facet_find(test_facets, 'CreateName1Proc') = -1 THEN
CALL CreateName1Proc();
CALL test_cql_set_facet_version('CreateName1Proc', 2);
END IF;
IF cql_facet_find(test_facets, 'CreateName2Proc') = -1 THEN
CALL CreateName2Proc();
CALL test_cql_set_facet_version('CreateName2Proc', 2);
END IF;
CALL test_cql_set_version_crc(2, -6840158498294659234);
END IF;
---- upgrade to schema version 3 ----
CALL test_cql_get_version_crc(3, schema_version);
IF schema_version != -4851321700834943637 THEN
-- creating table added_table
CREATE TABLE IF NOT EXISTS added_table(
id INTEGER NOT NULL,
name1 TEXT
);
CALL test_cql_set_version_crc(3, -4851321700834943637);
END IF;
---- upgrade to schema version 4 ----
CALL test_cql_get_version_crc(4, schema_version);
IF schema_version != -6096284368832554520 THEN
-- altering table added_table to add column name2 TEXT;
CALL test_check_column_exists('added_table', '*[( ]name2 TEXT*', column_exists);
IF NOT column_exists THEN
ALTER TABLE added_table ADD COLUMN name2 TEXT;
END IF;
-- altering table foo to add column id2 INTEGER;
CALL test_check_column_exists('foo', '*[( ]id2 INTEGER*', column_exists);
IF NOT column_exists THEN
ALTER TABLE foo ADD COLUMN id2 INTEGER DEFAULT 12345;
END IF;
-- logical delete of column rate_2 from foo; -- no ddl
-- data migration procedures
IF cql_facet_find(test_facets, 'CreateId2Proc') = -1 THEN
CALL CreateId2Proc();
CALL test_cql_set_facet_version('CreateId2Proc', 4);
END IF;
IF cql_facet_find(test_facets, 'DeleteRate2Proc') = -1 THEN
CALL DeleteRate2Proc();
CALL test_cql_set_facet_version('DeleteRate2Proc', 4);
END IF;
CALL test_cql_set_version_crc(4, -6096284368832554520);
END IF;
---- upgrade to schema version 5 ----
CALL test_cql_get_version_crc(5, schema_version);
IF schema_version != 5720357430811880771 THEN
-- altering table foo to add column name TEXT;
CALL test_check_column_exists('foo', '*[( ]name TEXT*', column_exists);
IF NOT column_exists THEN
ALTER TABLE foo ADD COLUMN name TEXT;
END IF;
-- logical delete of column rate from foo; -- no ddl
-- dropping table added_table
DROP TABLE IF EXISTS added_table;
CALL test_cql_set_version_crc(5, 5720357430811880771);
END IF;
---- upgrade to schema version 6 ----
CALL test_cql_get_version_crc(6, schema_version);
IF schema_version != 3572608284749506390 THEN
-- altering table foo to add column name_2 TEXT;
CALL test_check_column_exists('foo', '*[( ]name_2 TEXT*', column_exists);
IF NOT column_exists THEN
ALTER TABLE foo ADD COLUMN name_2 TEXT;
END IF;
CALL test_cql_set_version_crc(6, 3572608284749506390);
END IF;
CALL test_cql_create_all_views();
CALL test_cql_create_all_indices();
CALL test_cql_create_all_triggers();
CALL test_cql_set_facet_version('cql_schema_version', 6);
CALL test_cql_set_facet_version('cql_schema_crc', -7714030317354747478);
END;
We have one more helper that will look for evidence that we’re trying to move backwards to a previous schema version. This is not supported. This procedure also arranges for the original facet versions to be saved and it proceduces a difference in facets after the upgrade is done.
CREATE PROCEDURE test_perform_needed_upgrades()
BEGIN
-- check for downgrade --
IF cql_facet_find(test_facets, 'cql_schema_version') > 6 THEN
SELECT 'downgrade detected' facet;
ELSE
-- save the current facets so we can diff them later --
CALL test_save_cql_schema_facets();
CALL test_perform_upgrade_steps();
-- finally produce the list of differences
SELECT T1.facet FROM
test_cql_schema_facets T1LEFT OUTER JOIN test_cql_schema_facets_saved T2
ON T1.facet = T2.facet
WHERE T1.version is not T2.version;
END IF;
END;
This is the main function for upgrades, it checks only the master schema version. This function is separate so that the normal startup path doesn’t have to have the code for the full upgrade case in it. This lets linker order files do a superior job (since full upgrade is the rare case).
CREATE PROCEDURE test()
BEGIN
DECLARE schema_crc LONG INTEGER NOT NULL;
-- create schema facets information table --
CALL test_create_cql_schema_facets_if_needed();
-- fetch the last known schema crc, if it's different do the upgrade --
CALL test_cql_get_facet_version('cql_schema_crc', schema_crc);
IF schema_crc <> -7714030317354747478 THEN
TRYCALL test_setup_facets();
CALL test_perform_needed_upgrades();
CATCHCALL cql_facets_delete(test_facets);
SET test_facets := 0;
THROW;END;
CALL cql_facets_delete(test_facets);
SET test_facets := 0;
ELSE
-- some canonical result for no differences --
SELECT 'no differences' facet;
END IF;
END;
We had no temporary tables in this schema, but if there were some they get added to the schema after the upgrade check.
A procedure like this one is generated:
CREATE PROCEDURE test_cql_install_temp_schema()
BEGIN
CREATE TEMP TABLE tempy(
id INTEGER
);END;
This entry point can be used any time you need the temp tables. But normally it is automatically invoked.
---- install temp schema after upgrade is complete ----
CALL test_cql_install_temp_schema();
That logic is emitted at the end of the test procedure.
Schema Regions are designed to let you declare your schema in logical regions whose dependencies are specified. It enforces the dependencies you specify creating errors if you attempt to break the declared rules. Schema regions allow you to generate upgrade scripts for parts of your schema that can compose and be guaranteed to remain self-consistent.
In many cases schema can be factored into logical and independent islands. This is desireable for a number of reasons:
These all have very real applications:
This creates basically three schema regions:
There must be a separate upgrade script for both the island databases and yet a different one for the “cross-db” database
If you’re making a library with database support, your customers likely want to be able to create databases that have only features they want; you will want logical parts within your schema that can be separated for cleanliness and distribution.
Schema Regions let you create logical groupings, you simply declare the regions you want and then start putting things into those regions. The regions form a directed acyclic graph – just like C++ base classes. You create regions like this:
@declare_schema_region root;
@declare_schema_region extra using root;
The above simply declares the regions – it doesn’t put anything into them. In this case we now have a root
region and an extra
region. The root
schema items will not be allowed to refer to anything in extra
.
Without regions, you could also ensure that the above is true by putting all the extra
items afer the root
in the input file but things can get more complicated than that in general, and the schema might also be in several files, complicating ordering as the option. Also, relying on order could be problematic as it is quite easy to put things in the wrong place (e.g. add a new root
item after the extra
items). Making this a bit more complicated, we could have:
@declare_schema_region feature1 using extra;
@declare_schema_region feature2 using extra;
@declare_schema_region everything using feature1, feature2;
And now there are many paths to root
from the everything
region; that’s ok but certainly it will be tricky to do all that with ordering.
An illustrative example, using the regions defined above:
@begin_schema_region root;
create table main(
id integer,
name text
);
create view names as select name from main order by name;
@end_schema_region;
@begin_schema_region extra;
create table details(
id integer references main(id),
details text
);
create proc get_detail(id_ integer)
begin
select T1.id, T1.details, T2.name from details T1
inner join main T2 on T1.id = T2.id
where T1.id = id_;
end;
@end_schema_region;
@begin_schema_region feature1;
create table f1(
id integer references details(id),
f1_info text
);
create proc get_detail(id_ integer)
begin
select T1.id, T1.details, T2.name, f1_info from details T1
inner join f T2 on T1.id = T2.id
inner join f1 on f1.id = T1.id
where T1.id = id_;
end;
@end_schema_region;
@begin_schema_region feature2;
-- you can use details, and main but not f1
@end_schema_region;
With the structure above specified, even if a new contribution to the root
schema appears later, the rules enforce that this region cannot refer to anything other than things in root
. This can be very important if schema is being included via #include
and might get pulled into the compilation in various orders. A feature area might also have a named public region that others things can depend on (e.g. some views) and private regions (e.g. some tables, or whatever).
Schema regions do not provide additional name spaces – the names of objects should be unique across all regions. In other words, regions do not hide or scope entity names; rather they create errors if inappropriate names are used.
Case 1: The second line will fail semantic validation because table A
already exists
-- obvious standard name conflict
create table A (id integer);
create table A (id integer, name text);
Case 2: This fails for the same reason as case #1. Table A
already exists
@declare_region root;
-- table A is in no region
create table A (id integer);
@begin_region root:
-- this table A is in the root region, still an error
create table A (id integer, name text);
@end_region;
Case 3: Again fails for the same reason as case #1. Table A
already exist in region extra
, and you cannot define another table with the same name in another region.
@declare_region root;
@declare_region extra;
@begin_region extra;
-- so far so good
create table A (id integer);
@end_region;
@begin_region root;
-- no joy, this A conflicts with the previous A
create table A (id integer, name text);
@end_region;
Really the visibility rules couldn’t be anything other than the above, as SQLite has no knowledge of regions at all and so any exotic name resolution would just doom SQLite statements to fail when they finally run.
"... LIKE <table>"
statementThe rules above are enforced for all constructs except for where the syntactic sugar ... LIKE <table>
forms, which can happen in a variety of statements. This form doesn’t create a dependence on the table (but does create a dependence on its shape). When CQL generates output, the LIKE
construct is replaced with the actual names of the columns it refers to. But these are independent columns, so this is simply a keystroke saver. The table (or view, cursor, etc.) reference will be gone in any output SQL so this isn’t a real dependency on the existence of the mentioned table or shape at run time.
The cases below will succeed.
@declare_region root;
create table A (...);
create view B (....);
create procedure C {...}
@begin_region root;
create table AA(LIKE A);
create table BB(LIKE B);
create table CC(LIKE C);
@end_region;
NOTE: this exception may end up causing maintenance problems and so it might be revisited in the future.
When creating upgrade scripts, using the --rt schema_upgrade
flags you can add region options --include_regions a b c
and --exclude_regions d e f
per the following:
Included regions:
Excluded regions:
Example: Referring to the regions above you might do something like this
# All of these also need a --global_proc param for the entry point but that's not relevant here
cql --in schema.sql --cg shared.sql --rt schema_upgrade --include_regions extra
cql --in schema.sql --cg f1.cql --rt schema_upgrade --include_regions feature1 --exclude_regions extra
cql --in schema.sql --cg f2.cql --rt schema_upgrade --include_regions feature2 --exclude_regions extra
The first command generates all the shared schema for regions root
and extra
because extra
contains root
The second command declares all of root
and extra
so that the feature1
things can refer to them, however the upgrade code for these shared regions is not emitted. Only the upgrade for schema in feature1
is emitted. feature2
is completely absent. This will be ok because we know feature1
cannot depend on feature2
and extra
is assumed to be upgraded elsewhere (such as in the previous line).
The third command declares all of root
and extra
so that the feature2
things can refer to them, however the upgrade code for these shared regions is not emitted. Only the upgrade for schema in feature2
is emitted. feature1
is completely absent.
NOTE: in the above examples, CQL is generating more CQL to be compiled again (a common pattern). The CQL upgrade scripts need to be compiled as usual to produce executable code. Thus the output of this form includes the schema declarations and executable DDL.
For schema that is not in any region you might imagine that it is a special region <none>
that depends on everything. So basically you can put anything there. Schema that is in any region cannot ever refer to schema that is in <none>
.
When upgrading, if any include regions are specified then <none>
will not be emitted at all. If you want an upgrader for just <none>
this is possible with an assortment of exclusions. You can always create arbitrary grouping regions to make this easier. A region named any
that uses all other regions would make this simple.
In general, best practice is that there is no schema in <none>
, but since most SQL code has no regions some sensible meaning has to be given to DDL before it gets region encodings.
Given the above we note that some schema regions correspond to the way that we will deploy the schema. We want those bundles to be safe to deploy but to in order to be so we need a new notion – a deployable region. To make this possible CQL includes the following:
@declare_deployable_region
Because of the above, each deployable region is in fact a well defined root for the regions it contains. The deployable region becomes the canonical way in which a bundle of regions (and their content) is deployed and any given schema item can be in only one deployable region.
As we saw above, regions are logical groupings of tables/views/etc such that if an entity is in some region R
then it is allowed to only refer to the things that R
declared as dependencies D1
, D2
, etc. and their transitive closures. You can make as many logical regions as you like and you can make them as razor thin as you like; they have no physical reality but they let you make as many logical groups of things as you might want.
Additionally, when we’re deploying schema you generally need to do it in several pieces. E.g. if we have tables that go in an in-memory database then defining a region that holds all the in-memory tables makes it easy to, say, put all those in-memory tables into a particular deployment script.
Now we come to the reason for deployable regions. From CQL’s perspective, all regions are simply logical groups; some grouping is then meaningful to programmers but has no physical reality. This means you’re free to reorganize tables etc. as you see fit into new or different regions when things should move. Only, that’s not quite true. The fact that we deploy our schema in certain ways means while most logical moves are totally fine, if you were to move a table from, say, the main database region to the in-memory region you would be causing a major problem. Some installations may already have the table in the main area and there would be nothing left in the schema to tell CQL to drop the table from the main database – the best you can hope for is the new location gets a copy of the table the old location keeps it and now there are name conflicts forever.
So, the crux of the problem is this: We want to let you move schema freely between logical regions in whatever way makes sense to you, but once you pick the region you are going to deploy in, you cannot change that.
To accomplish this, CQL needs to know that some of the regions are deployable regions and there have to be rules to make it all makes sense. Importantly, every region has to be contained in at most one deployable region.
Since the regions form a DAG we must create an error if any region could ever roll up to two different deployable regions. The easiest way to describe this rule is “no peeking” – the contents of a deployable region are “private” they can refer to each other in any DAG shape but outside of the deployable region you can only refer to its root. So you can still compose them but each deployable region owns a well-defined covering. Note that you can make as many fine-grained deployable regions as you want; you don’t actually have to deploy them separately, but you get stronger rules about the sharing when you do.
Here’s an example:
Master Deployment 1
Feature 1 (Deployable)
logical regions for feature 1
Core (Deployable)
logical regions for core
Feature 2 (Deployable)
logical regions for feature 2
Core
...
Master Deployment 2
Feature 1 (Deployable)
...
Feature 3 (Deployable)
logical regions for feature 3
In the above:
NOTE: deployable regions for Feature 1, 2, and 3 aren’t actually deployed alone, but they are adding enforcement that makes the features cleaner
Because of how upgrades work, “Core” could have its own upgrader. Then when you create the upgrader for Master Deployment 1 and 2, you can specify “exclude Core” in which case those tables are assumed to be updated independently. You could create as many or as few independently upgrade-able things with this pattern. Because regions are not allowed to “peek” inside of a deployable region, you can reorganize your logical regions without breaking other parts of the schema.
The above constructs create a good basis for creating and composing regions, but a key missing aspect is the ability to hide internal details in the logical groups. This becomes increasingly important as your desire to modularize schema grows; you will want to have certain parts that can change without worrying about breaking others and without fear that there are foreign keys and so forth referring to them.
To accomplish this, CQL provides the ability to compose schema regions with the optional private
keyword. In the following example there will be three regions creatively named r1
, r2
, and r3
. Region r2
consumes r1
privately and therefore r3
is not allowed to use things in r1
even though it consumes r2
. When creating an upgrade script for r3
you will still need (and will get) all of r2
and r1
, but from a visibility perspective r3
can only directly depend on r2
.
@declare_schema_region r1;
@declare_schema_region r2 using r1 private;
@declare_schema_region r3 using r2;
@begin_schema_region r1;
create table r1_table(id integer primary key);
@end_schema_region;
@begin_schema_region r2;
create table r2_table(id integer primary key references r1_table(id));
@end_schema_region;
@begin_schema_region r3;
-- this is OK
create table r3_table_2(id integer primary key references r2_table(id));
-- this is an error, no peeking into r1
create table r3_table_1(id integer primary key references r1_table(id));
@end_schema_region;
As expected r2
is still allowed to use r1
because your private regions are not private from yourself. So you may think it’s easy to work around this privacy by simply declaring a direct dependency on r1 wherever you need it.
@declare_schema_region my_sneaky_region using r1, other_stuff_I_need;
That would seem to make it all moot. However, this is where deployable regions come in. Once you bundle your logical regions in a deployable region there’s no more peeking inside the the deployable region. So we could strengthen the above to:
@declare_deployable_region r2 using r1 private;
Once this is done it becomes an error to try to make new regions that peek into r2
; you have to take all of r2
or none of it – and you can’t see the private parts. Of course you can do region wrapping at any level so you can have as many subgroups as you like, whatever is useful. You can even add additional deployable regions that aren’t actually deployed to get the “hardened” grouping at no cost.
So, in summary, to get true privacy, first make whatever logical regions you like that are helpful. Put privacy where you need/want it. Import logical regions as much as you want in your own bundle of regions. Then wrap that bundle up in a deployable region (they nest) and then your private regions are safe from unwanted usage.
Any significant library that is centered around a database is likely to accrue significant amounts of schema to support its features. Often users of the library don’t want all its features and therefore don’t want all of its schema. CQL’s primary strategy is to allow the library author to divide the schema into regions and then the consumer of the library may generate a suitable schema deployer that deploys only the desired regions. You simply subscribe to the regions you want.
The @unsub
construct deals with the unfortunate situation of over-subscription. In the event that a customer has subscribed to regions that it turns out they don’t need, or if indeed the regions are not fine-grained enough, they may wish to (possibly much later) unsubscribe from particular tables or entire regions that they previously had included.
Unfortunately it’s not so trivial as to simply remove the regions after the fact. The problem is that there might be billions of devices that already have the undesired tables and are paying the initialization costs for them. Affirmatively removing the tables is highly desirable and that means a forward-looking annotation is necessary to tell the upgrader to generate DROP
statements at some point. Furthermore, a customer might decide at some point later that now is the time they need the schema in question, so resubcription also has to be possible.
To accomplish this we add the following construct:
@unsub(table_name);
The effects of a valid @unsub
are as follows:
@create
, then “DROP IF EXISTS table_name” is emitted into the upgrade steps for version_number@recreate
the table is unconditionally dropped as though it had been deletedThe compiler ensures that the directives are valid and stay valid.
@delete
REFERENCES
CAUTION: The legacy
@resub
directive is now an error; Resubscription is accomplished by simply removing the relevant@unsub
directive(s).
Unsubscriptions may be removed when they are no longer desired in order to resubscribe as long as this results in a valid chain of foreign keys.
These validations are sufficient to guarantee a constistent logical history for unsubscriptions.As we saw in the previous chapter, CQL includes powerful schema management tools for creating automatic upgrade scripts for your databases. However, not all schema alterations are possible after-the-fact and so CQL also includes schema comparison tools to help you avoid problems as you version your schema over time.
You can compare the previous version of a schema with the current version to do additional checks such as:
When checking @recreate
tables against the previous schema version for errors, these checks are done:
@recreate
tables to appear with no @create
needed@recreate
but not back@recreate
to @create
at the current schema version@delete
at the current schema version@create
or @delete
state to @recreate
All of these are statically checked.
To use these tools, you must run CQL in a mode where it has both the proposed and existing schema in its input stream, then it can provide suitable errors if any unsupported change is about to happen.
The normal way that you do previous schema validation is to create an input file that provides both schema.
This file may look something like this:
-- prev_check.sql
create table foo(
id integer,
1)
new_field text @create(
);
@previous_schema;
create table foo(
id integer
);
So, here the old version of foo
will be validated against the new version and all is well. A new nullable text field was added at the end.
In practice these comparisons are likely to be done in a somewhat more maintainable way, like so:
-- prev_check.sql
"table1.sql"
#include "table2.sql"
#include "table3.sql"
#include
@previous_schema;
"previous.sql" #include
Now importantly, in this configuration, everything that follows the @previous_schema
directive does not actually contribute to the declared schema. This means the --rt schema
result type will not see it. Because of this, you can do your checking operation like so:
cc -E -x c prev_check.sql | cql --cg new_previous_schema.sql --rt schema
The above command will generate the schema in new_previous_schema and, if this command succeeds, it’s safe to replace the existing previous.sql
with new_previous_schema
.
NOTE: you can bootstrap the above by leaving off the
@previous_schema
and what follows to get your first previous schema from the command above.
Now, as you can imagine, comparing against the previous schema allows many more kinds of errors to be discovered. What follows is a large chunk of the CQL tests for this area taken from the test files themselves. For easy visibility I have brought each fragment of current and previous schema close to each other and I show the errors that are reported. We start with a valid fragment and go from there.
create table foo(
id integer not null,
long int @delete(5, deletor),
rate long int @delete(4),
rate_2 integer @create(4),
id2 5),
name text @create(6)
name_2 text @create(
);-------
create table foo(
id integer not null,
long int @delete(5, deletor),
rate long int @delete(4),
rate_2 integer @create(4),
id2 5),
name text @create(6)
name_2 text @create( );
The table foo
is the same! It doesn’t get any easier than that.
create table t_create_version_changed(id integer) @create(1);
-------
create table t_create_version_changed(id integer) @create(2);
at sem_test_prev.sql:15 : in str : current create version not equal to
Error previous create version for 't_create_version_changed'
You can’t change the version a table was created in. Here the new schema says it appeared in version 1. The old schema says 2.
create table t_delete_version_changed(id integer) @delete(1);
-------
create table t_delete_version_changed(id integer) @delete(2);
at sem_test_prev.sql:18 : in str : current delete version not equal to
Error previous delete version for 't_delete_version_changed'
You can’t change the version a table was deleted in. Here the new schema says it was gone in version 1. The old schema says 2.
-- t_not_present_in_new_schema is gone
-------
create table t_not_present_in_new_schema(id integer);
at sem_test_prev.sql:176 : in create_table_stmt : table was present but now it
Error not exist (use @delete instead) 't_not_present_in_new_schema' does
So here t_not_present_in_new_schema
was removed, it should have been marked with @delete
. You don’t remove tables.
create view t_became_a_view as select 1 id @create(6);
-------
create table t_became_a_view(id integer);
at sem_test_prev.sql:24 : in create_view_stmt : object was a table but is now a
Error view 't_became_a_view'
Tables can’t become views…
create table t_created_in_wrong_version(id integer) @create(1);
-------
create table t_created_in_wrong_version(id integer);
at sem_test_prev.sql:27 : in str : current create version not equal to previous
Error create version for 't_created_in_wrong_version'
Here a version annotation is added after the fact. This item was already in the base schema.
create table t_was_correctly_deleted(id integer) @delete(1);
-------
create table t_was_correctly_deleted(id integer);
No errors here, just a regular delete.
create table t_column_name_changed(id_ integer);
-------
create table t_column_name_changed(id integer);
at sem_test_prev.sql:33 : in str : column name is different between previous and current schema 'id_' Error
You can’t rename columns. We could support this but it’s a bit of a maintenance nightmare and logical renames are possible easily without doing physical renames.
create table t_column_type_changed(id real);
-------
create table t_column_type_changed(id integer);
at sem_test_prev.sql:36 : in str : column type is different between previous and current schema 'id' Error
You can’t change the type of a column.
create table t_column_attribute_changed(id integer not null);
-------
create table t_column_attribute_changed(id integer);
at sem_test_prev.sql:39 : in str : column type is different between previous and current schema 'id' Error
Change of column attributes counts as a change of type.
create table t_column_delete_version_changed(id integer, id2 integer @delete(1));
-------
create table t_column_delete_version_changed(id integer, id2 integer @delete(2));
at sem_test_prev.sql:42 : in str : column current delete version not equal to previous delete version 'id2' Error
You can’t change the delete version after it has been set.
create table t_column_create_version_changed(id integer, id2 integer @create(1));
-------
create table t_column_create_version_changed(id integer, id2 integer @create(2));
at sem_test_prev.sql:45 : in str : column current create version not equal to previous create version 'id2' Error
You can’t change the create version after it has been set.
create table t_column_default_value_changed(id integer, id2 integer not null default 2);
-------
create table t_column_default_value_changed(id integer, id2 integer not null default 1);
at sem_test_prev.sql:48 : in str : column current default value not equal to previous default value 'id2' Error
You can’t change the default value after the fact. There’s no alter statement that would allow this even though it does make some logical sense.
create table t_column_default_value_ok(id integer, id2 integer not null default 1);
-------
create table t_column_default_value_ok(id integer, id2 integer not null default 1);
No change. No error here.
create table t_additional_attribute_present(a int not null, b int, primary key (a,b));
-------
create table t_additional_attribute_present(a int not null, b int, primary key (a,b));
No change. No error here.
create table t_additional_attribute_mismatch(a int not null, primary key (a));
-------
create table t_additional_attribute_mismatch(a int not null, b int, primary key (a,b));
at sem_test_prev.sql:57 : in pk_def : a table facet is different in the previous
Error and current schema
This is an error because the additional attribute does not match the previous schema.
create table t_columns_removed(id integer);
-------
create table t_columns_removed(id integer, id2 integer);
at sem_test_prev.sql:255 : in col_def : items have been removed from the table
Error than marked with @delete 't_columns_removed' rather
You can’t remove columns from tables. You have to mark them with @delete
instead.
create table t_attribute_added(a int not null, primary key (a));
-------
create table t_attribute_added(a int not null);
at sem_test_prev.sql:63 : in pk_def : table has a facet that is different in the
Error previous and current schema 't_attribute_added'
Table facets like primary keys cannot be added after the fact. There is no way to do this in sqlite.
@create
create table t_additional_column(a int not null, b int);
-------
create table t_additional_column(a int not null);
at sem_test_prev.sql:66 : in col_def : table has columns added without marking
Error 't_additional_column' them @create
If you add a new column like b
above you have to mark it with @create
in a suitable version.
@create
(ok)create table t_additional_column_ok(a int not null, b int @create(2), c int @create(6));
-------
create table t_additional_column_ok(a int not null, b int @create(2));
Column properly created. No errors here.
create TEMP table t_becomes_temp_table(a int not null, b int);
-------
create table t_becomes_temp_table(a int not null, b int);
at sem_test_prev.sql:72 : in create_table_stmt : table create statement attributes
Error than previous version 't_becomes_temp_table' different
Table became a TEMP table, there is no way to generate an alter statement for that. Not allowed.
create table t_new_table_ok(a int not null, b int) @create(6);
-------
-- no previous version
No errors here; this is a properly created new table.
create table t_new_table_no_annotation(a int not null, b int);
-------
-- no previous version
at sem_test_prev.sql:85 : in create_table_stmt : new table must be added with
Error @create(6) or later 't_new_table_no_annotation'
This table was added with no annotation. It has to have an @create and be at least version 6, the current largest.
create table t_new_table_stale_annotation(a int not null, b int) @create(2);
-------
-- no previous version
at sem_test_prev.sql:91 : in create_table_stmt : new table must be added with
Error @create(6) or later 't_new_table_stale_annotation'
The schema is already up to version 6. You can’t then add a table in the past at version 2.
@create
and @delete
create table t_new_table_create_and_delete(a int not null, b int @create(6) @delete(7));
-------
create table t_new_table_create_and_delete(a int not null);
at sem_test_prev.sql:96 : in col_def : table has newly added columns that are
Error both @create and @delete 't_new_table_create_and_delete' marked
Adding a column in the new version and marking it both create and delete is … weird… don’t do that. Technically you can do it (sigh) but it must be done one step at a time.
@create
correctlycreate table t_new_legit_column(a int not null, b int @create(6));
-------
create table t_new_legit_column(a int not null);
No errors here; new column added in legit version.
create table with_create_migrator(id integer) @create(1, ACreateMigrator);
-------
create table with_create_migrator(id integer) @create(1);
at sem_test_prev.sql:104 : in str : @create procedure changed in object
Error 'with_create_migrator'
You can’t add a create migration proc after the fact.
create table with_create_migrator(id integer) @create(1, ACreateMigrator);
-------
create table with_create_migrator(id integer) @create(1, ADifferentCreateMigrator);
at sem_test_prev.sql:104 : in str : @create procedure changed in object
Error 'with_create_migrator'
You can’t change a create migration proc after the fact.
create table with_delete_migrator(id integer) @delete(1, ADeleteMigrator);
-------
create table with_delete_migrator(id integer) @delete(1);
at sem_test_prev.sql:107 : in str : @delete procedure changed in object
Error 'with_delete_migrator'
You can’t add a delete migration proc after the fact.
create table with_delete_migrator(id integer) @delete(1, ADeleteMigrator);
-------
create table with_delete_migrator(id integer) @delete(1, ADifferentDeleteMigrator);
at sem_test_prev.sql:107 : in str : @delete procedure changed in object
Error 'with_delete_migrator'
You can’t change a delete migration proc after the fact.
create table view_becomes_a_table(id int);
-------
create view view_becomes_a_table as select 1 X;
at sem_test_prev.sql:110 : in create_table_stmt : object was a view but is now a
Error table 'view_becomes_a_table'
Converting views to tables is not allowed.
--- no matching view in current schema
-------
create view view_was_zomg_deleted as select 1 X;
at sem_test_prev.sql:333 : in create_view_stmt : view was present but now it does
Error not exist (use @delete instead) 'view_was_zomg_deleted'
Here the view was deleted rather than marking it with @delete
, resulting in an error.
create view view_was_temp_but_now_it_is_not as select 1 X;
-------
create temp view view_was_temp_but_now_it_is_not as select 1 X;
at sem_test_prev.sql:339 : in create_view_stmt : TEMP property changed in new
Error schema for view 'view_was_temp_but_now_it_is_not'
A temp view became a view. This flag is not allowed to change. Side note: temp views are weird.
create view view_with_different_create_version as select 1 X @create(3);
-------
create view view_with_different_create_version as select 1 X @create(2);
at sem_test_prev.sql:116 : in str : current create version not equal to previous
Error create version for 'view_with_different_create_version'
You can’t change the create version of a view after the fact.
--- no matching index in current schema
-------
create index this_index_was_deleted_with_no_annotation on foo(id);
at sem_test_prev.sql:349 : in create_index_stmt : index was present but now it
Error not exist (use @delete instead) 'this_index_was_deleted_with_no_annotation' does
You have to use @delete
on indices to remove them correctly.
create view view_created_with_no_annotation as select 1 X;
-------
--- there is no previous version
at sem_test_prev.sql:122 : in create_view_stmt : new view must be added with
Error @create(6) or later 'view_created_with_no_annotation'
You have to use @create
on views to create them correctly.
create index this_index_has_a_changed_attribute on foo(id) @create(2);
-------
create index this_index_has_a_changed_attribute on foo(id) @create(1);
at sem_test_prev.sql:125 : in str : current create version not equal to previous
Error create version for 'this_index_has_a_changed_attribute'
You can’t change the @create
version of an index.
@create
annotationcreate index this_index_was_created_with_no_annotation on foo(id);
-------
--- there is no previous version
at sem_test_prev.sql:130 : in create_index_stmt : new index must be added with
Error @create(6) or later 'this_index_was_created_with_no_annotation'
You have to use @create
on indices to make new ones.
create table create_column_migrate_test(
id int,
int @create(2, ChangedColumnCreateMigrator)
id2
);-------
create table create_column_migrate_test(
id int,
int @create(2, PreviousColumnCreateMigrator)
id2
);
at sem_test_prev.sql:136 : in str : column @create procedure changed 'id2' Error
You can’t change the @create
migration stored proc on columns.
create table delete_column_migrate_test(
id int,
int @delete(2, ChangedColumnDeleteMigrator)
id2
);-------
create table delete_column_migrate_test(
id int,
int @delete(2, PreviousColumnDeleteMigrator)
id2
);
at sem_test_prev.sql:142 : in str : column @delete procedure changed 'id2' Error
You can’t change the @delete
migration stored proc on columns.
NOTE: in addition to these errors, there are many more that do not require the previous schema which are also checked (not shown here). These comprise things like making sure the delete version is greater than the create version on any item. There is a lot of “sensibility checking” that can happen without reference to the previous schema.
CQL includes a number of features to make it easier to create what you might call “Test” procedures. These primarily are concerned with loading up the database with dummy data, and/or validating the result of normal procedures that query the database. There are several interesting language features in these dimensions.
Test code can be needlessly brittle, especially when creating dummy data; any column changes typically cause all sorts of data insertion code to need to be repaired. In many cases the actual data values are completely uninteresting to the test – any values would do. There are several strategies you can use to get good dummy data into your database in a more maintainable way.
The simplest form uses a variant of the insert statement that fills in any missing columns with a seed value. An example might be something like the below:
create proc dummy_user()
begin
insert into users () values () @dummy_seed(123)
@dummy_nullables @dummy_defaults;end;
This statement causes all values including columns that are nullable or have a default value to get the value 123
for any numeric type and 'column_name_123'
for any text.
If you omit the @dummy_nullables
then any nullable fields will be null as usual. And likewise if you omit @dummy_defaults
then any fields with a default value will use that value as usual. You might want any combination of these for your tests (null values are handy in your tests and default behavior is also handy.)
The @dummy_seed
expression provided can be anything that resolves to a non-null integer value, so it can be pretty flexible. You might use a while
loop to insert a bunch of rows with the seed value being computed from the while
loop variable.
The form above is sort of like insert * into table
in that it is giving dummy values for all columns but you can also specify some of the columns while using the seed value for others. Importantly, you can specify values you particularly want to control either for purposes of creating a more tailored test or because you need them to match existing or created rows in a table referenced by a foreign key.
As an example:
insert into users (id) values (1234) @dummy_seed(123)
@dummy_nullables @dummy_defaults;
will provide dummy values for everything but the id
column.
WITH RECURSIVE
Sometimes what you want to do is create a dummy result set without necessarily populating the database at all. If you have code that consumes a result set of a particular shape, it’s easy enough to create a fake result set with a pattern something like this:
create procedure dummy_stuff(lim integer not null)
begin
WITH RECURSIVE
AS (
dummy(x) SELECT 1
UNION ALL
SELECT x+1 FROM dummy WHERE x < lim)
SELECT
id,
x "name_%d", x) name,
printf(cast(x % 2 as bool) is_cool,
* 1.3 as rate,
x
x etc1,
x etc2FROM dummy;
end;
The first part of the above creates a series of numbers from 1 to lim
. The second uses those values to create dummy columns. Any result shape can be generated in this fashion.
You get data like this from the above:
1|name_1|1|1.3|1|1
2|name_2|0|2.6|2|2
3|name_3|1|3.9|3|3
4|name_4|0|5.2|4|4
5|name_5|1|6.5|5|5
6|name_6|0|7.8|6|6
7|name_7|1|9.1|7|7
8|name_8|0|10.4|8|8
9|name_9|1|11.7|9|9
10|name_10|0|13.0|10|10
The result of the select statement is itself quite flexible and if more dummy data is what you wanted, this form can be combined with INSERT ... FROM SELECT...
to create dummy data in real tables. And of course once you have a core query you could use it in a variety of ways, combined with cursors or any other strategy to select
out pieces and insert
them into various tables.
If you need an API to create very flexible dummy data with values of your choice you can use temporary tables and a series of helper procedures.
First, create a table to hold the results. You can of course make this table however you need to but the like
construct in the table creation is especially helpful; it creates columns in the table that match the name and type of the named object. For instance like my_proc
is shorthand for the column names ands of the shape that my_proc
returns. This is perfect for emulating the results of my_proc
.
create proc begin_dummy()
begin
drop table if exists my_dummy_data;
-- the shape of my_dummy_data matches the columns
-- returned by proc_I_want_to_emulate
create temp table my_dummy_data(
like proc_I_want_to_emulate;
);end;
Next, you will need a procedure that accepts and writes a single row to your temp table. You can of course write this all explicitly but the testing support features provide more support to make things easier; In this example, arguments of the procedure will exactly match the output of the procedure we emulating, one argument for each column the proc returns. The insert
statement gets its values from the arguments.
create proc add_dummy(like proc_I_want_to_emulate)
begin
insert into my_dummy_data from arguments;
end;
This allows you to create the necessary helper methods automatically even if the procedure changes over time.
Next we need a procedure to get our result set.
create proc get_dummy()
begin
select * from my_dummy_data;
end;
And finally, some cleanup.
create proc cleanup_dummy()
begin
drop table if exists my_dummy_data;
end;
Again the temp table could be combined with INSERT INTO ...FROM SELECT...
to create dummy data in real tables.
Wrapping your insert
statements in try/catch
can be very useful if there may be dummy data conflicts. In test code searching for a new suitable seed is pretty easy. Alternatively
set seed := 1 + (select max(id) from foo);
could be very useful. Many alternatives are also possible.
The dummy data features are not suitable for use in production code, only tests. But the LIKE features are generally useful for creating contract-like behavior in procs and there are reasonable uses for them in production code.
Here’s a more complicated example that can be easily rewritten using the sugar features. This method is designed to return a single-row result set that can be used to mock a method. I’ve replaced the real fields with ‘f1, ’f2’ etc.
CREATE PROCEDURE test_my_subject(
LONG INTEGER NOT NULL,
f1_ NOT NULL,
f2_ TEXT INTEGER NOT NULL,
f3_ LONG INTEGER NOT NULL,
f4_
f5_ TEXT,
f6_ TEXT,
f7_ TEXT,NOT NULL,
f8_ BOOL
f9_ TEXT,
f10_ TEXT
)BEGIN
DECLARE data_cursor CURSOR LIKE my_subject;
FETCH data_cursor()FROM VALUES (f1_, f2_, f3_, f4_, f5_, f6_, f7_, f8_, f9_, f10);
OUT data_cursor;
END;
This can be written much more maintainably as:
CREATE PROCEDURE test_my_subject(like my_subject)
BEGIN
DECLARE C CURSOR LIKE my_subject;
FROM ARGUMENTS;
FETCH C OUT C;
END;
Naturally, real columns have much longer names and there are often many more than 10.
Some of the patterns described above are so common that CQL offers a mechanism to automatically generate those test procedures.
The attributes dummy_table, dummy_insert, and dummy_select can be used together to create and populate temp tables.
Example:
To create a dummy row set for sample_proc
, add the [[autotest]]
attribute with dummy_table, dummy_insert, and dummy_select values.
create table foo(
id integer not null,
not null
name text
);
=(dummy_table, dummy_insert, dummy_select)]]
[[autotestcreate proc sample_proc(foo int)
begin
select * from Foo;
end;
dummy_table
generates procedures for creating and dropping a temp table with the same shape as sample_proc
.
CREATE PROC open_sample_proc()
BEGIN
CREATE TEMP TABLE test_sample_proc(LIKE sample_proc);
END;
CREATE PROC close_sample_proc()
BEGIN
DROP test_sample_proc;
END;
The dummy_insert
attribute generates a procedure for inserting into the temp table.
CREATE PROC insert_sample_proc(LIKE sample_proc)
BEGIN
INSERT INTO test_sample_proc FROM ARGUMENTS;
END;
The dummy_select
attribute generates procedures for selecting from the temp table.
CREATE PROC select_sample_proc()
BEGIN
SELECT * FROM test_sample_proc;
END;
It’s interesting to note that the generated test code does not ever need to mention the exact columns it is emulating because it can always use like
, *
, and from arguments
in a generic way.
When compiled, the above will create C methods that can create, drop, insert, and select from the temp table. They will have the following signatures:
CQL_WARN_UNUSED cql_code open_sample_proc(
sqlite3 *_Nonnull _db_);
CQL_WARN_UNUSED cql_code close_sample_proc(
sqlite3 *_Nonnull _db_);
CQL_WARN_UNUSED cql_code insert_sample_proc(
sqlite3 *_Nonnull _db_,
cql_int32 id_,
cql_string_ref _Nonnull name_);
CQL_WARN_UNUSED cql_code select_sample_proc_fetch_results(
sqlite3 *_Nonnull _db_,
select_sample_proc_result_set_ref _Nullable *_Nonnull result_set);
In some cases, using four APIs to generate fake data can be verbose. In the case that only a single row of data needs to be faked, the dummy_result_set attribute can be more convenient.
Example:
=(dummy_result_set]])
[[autotestcreate proc sample_proc()
begin
select id from Foo;
end;
Will generate the following procedure
CREATE PROC generate_sample_proc_row(LIKE sample_proc)
BEGIN
DECLARE curs CURSOR LIKE sample_proc;
FROM ARGUMENTS;
FETCH curs OUT curs;
END;
Which generates this C API:
void generate_sample_proc_row_fetch_results(
*_Nonnull result_set,
generate_sample_proc_row_rowset_ref _Nullable ,
string_ref _Nonnull foo_int64_t bar_);
These few test helpers are useful in a variety of scenarios and can save you a lot of typing and maintenance. They evolve automatically as the code changes, always matching the signature of the attributed procedure.
The most flexible test helper is the dummy_test
form. This is far more advanced than the simple helpers above. While the choices above were designed to help you create fake result sets pretty easily, dummy_test
goes much further letting you set up arbitrary schema and data so that you can run your procedure on actual data. The dummy_test
code generator uses the features above to do its job and like the other autotest options, it works by automatically generating CQL code from your procedure definition. However, you get a lot more code in this mode.
It’s easiest to study an example so let’s begin there.
To understand dummy_test
we’ll need a more complete example, so we start with this simple two-table schema with a trigger and some indices. To this we add a very small procedure that we might want to test.
create table foo(
id integer not null primary key,
name text
);
create table bar(
id integer not null primary key references foo(id),
data text
);
create index foo_index on foo(name);
create index bar_index on bar(data);
create temp trigger if not exists trigger1
before delete on foo
begin
delete from foo where name = 'this is so bogus';
end;
[[autotest=(
dummy_table,
dummy_insert,
dummy_select,
dummy_result_set,
(dummy_test, (bar, (data), ('plugh'))))
]]
create proc the_subject()
begin
select * from bar;
end;
As you can see, we have two tables, foo
and bar
; the foo
table has a trigger; both foo
and bar
have indices. This schema is very simple, but of course it could be a lot more complicated, and real cases typically are.
The procedure we want to test is creatively called the_subject
. It has lots of test attributes on it. We’ve already discussed dummy_table
, dummy_insert
, dummy_select
, and dummy_result_set
above but as you can see they can be mixed in with dummy_test
. Now let’s talk about dummy_test
. First you’ll notice that annotation has additional sub-attributes; the attribute grammar is sufficiently flexible such that, in principle, you could represent an arbitrary LISP program, so the instructions can be very detailed. In this case, the attribute provides table and column names, as well as sample data. We’ll discuss that when we get to the population code.
First let’s dispense with the attributes we already discussed – since we had all the attributes, the output will include those helpers, too. Here they are again:
-- note that the code does not actually call the test subject
-- this declaration is used so that CQL will know the shape of the result
DECLARE PROC the_subject () (id INTEGER NOT NULL, data TEXT);
CREATE PROC open_the_subject()
BEGIN
CREATE TEMP TABLE test_the_subject(LIKE the_subject);
END;
CREATE PROC close_the_subject()
BEGIN
DROP TABLE test_the_subject;
END;
CREATE PROC insert_the_subject(LIKE the_subject)
BEGIN
INSERT INTO test_the_subject FROM ARGUMENTS;
END;
CREATE PROC select_the_subject()
BEGIN
SELECT * FROM test_the_subject;
END;
CREATE PROC generate_the_subject_row(LIKE the_subject)
BEGIN
DECLARE curs CURSOR LIKE the_subject;
FROM ARGUMENTS;
FETCH curs OUT curs;
END;
That covers what we had before, so, what’s new? Actually, quite a bit. We’ll begin with the easiest:
CREATE PROC test_the_subject_create_tables()
BEGIN
CREATE TABLE IF NOT EXISTS foo(
id INTEGER NOT NULL PRIMARY KEY,
name TEXT
);CREATE TABLE IF NOT EXISTS bar(
id INTEGER NOT NULL PRIMARY KEY REFERENCES foo (id),
data TEXT
);END;
Probably the most important of all the helpers, test_the_subject_create_tables
will create all the tables you need to run the procedure. Note that in this case, even though the subject code only references bar
, CQL determined that foo
is also needed because of the foreign key.
The symmetric drop procedure is also generated:
CREATE PROC test_the_subject_drop_tables()
BEGIN
DROP TABLE IF EXISTS bar;
DROP TABLE IF EXISTS foo;
END;
Additionally, in this case there were triggers and indices. This caused the creation of helpers for those aspects.
CREATE PROC test_the_subject_create_indexes()
BEGIN
CREATE INDEX bar_index ON bar (data);
CREATE INDEX foo_index ON foo (name);
END;
CREATE PROC test_the_subject_create_triggers()
BEGIN
CREATE TEMP TRIGGER IF NOT EXISTS trigger1
BEFORE DELETE ON foo
BEGIN
DELETE FROM foo WHERE name = 'this is so bogus';
END;
END;
CREATE PROC test_the_subject_drop_indexes()
BEGIN
DROP INDEX IF EXISTS bar_index;
DROP INDEX IF EXISTS foo_index;
END;
CREATE PROC test_the_subject_drop_triggers()
BEGIN
DROP TRIGGER IF EXISTS trigger1;
END;
If there are no triggers or indices, the corresponding create/drop methods will not be generated.
With these helpers available, when writing test code you can then choose if you want to create just the tables, or the tables and indices, or tables and indices and triggers by invoking the appropriate combination of helper methods. Since all the implicated triggers and indices are automatically included, even if they change over time, maintenance is greatly simplified.
Note that in this case the code simply reads from one of the tables, but in general the procedure under test might make modifications as well. Test code frequently has to read back the contents of the tables to verify that they were modified correctly. So these additional helper methods are also included:
CREATE PROC test_the_subject_read_foo()
BEGIN
SELECT * FROM foo;
END;
CREATE PROC test_the_subject_read_bar()
BEGIN
SELECT * FROM bar;
END;
These procedures will allow you to easily create result sets with data from the relevant tables which can then be verified for correctness. Of course if more tables were implicated, those would have been included as well.
As you can see, the naming always follows the convention test_[YOUR_PROCEDURE]_[helper_type]
Finally, the most complicated helper is the one that used that large annotation. Recall that we provided the fragment (dummy_test, (bar, (data), ('plugh'))))
to the compiler. This fragment helped to produce this last helper function:
CREATE PROC test_the_subject_populate_tables()
BEGIN
INSERT OR IGNORE INTO foo(id) VALUES(1) @dummy_seed(123);
INSERT OR IGNORE INTO foo(id) VALUES(2) @dummy_seed(124)
@dummy_nullables @dummy_defaults;
INSERT OR IGNORE INTO bar(data, id) VALUES('plugh', 1) @dummy_seed(125);
INSERT OR IGNORE INTO bar(id) VALUES(2) @dummy_seed(126)
@dummy_nullables @dummy_defaults;END;
In general the populate_tables
helper will fill all implicated tables with at least two rows of data. It uses the dummy data features discussed earlier to generate the items using a seed. Recall that if @dummy_seed
is present in an insert
statement then any missing columns are generated using that value, either as a string, or as an integer (or true/false for a boolean). Note that the second of the two rows that is generated also specifies @dummy_nullables
and @dummy_defaults
. This means that even nullable columns, and columns with a default value will get the non-null seed instead. So you get a mix of null/default/explicit values loaded into your tables.
Of course blindly inserting data doesn’t quite work. As you can see, the insert code used the foreign key references in the schema to figure out the necessary insert order and the primary key values for foo
were automatically specified so that they could then be used again in bar
.
Lastly, the autotest attribute included explicit test values for the table bar
, and in particular the data
column has the value 'plugh'
. So the first row of data for table bar
did not use dummy data for the data
column but rather used 'plugh'
.
In general, the dummy_test
annotation can include any number of tables, and for each table you can specify any of the columns and you can have any number of tuples of values for those columns.
NOTE: if you include primary key and/or foreign key columns among the explicit values, it’s up to you to ensure that they are valid combinations. SQLite will complain as usual if they are not, but the CQL compiler will simply emit the data you asked for.
Generalizing the example a little bit, we could use the following:
(dummy_test, (foo, (name), ('fred'), ('barney'), ('wilma'), ('betty')),
(bar, (id, data), (1, 'dino'), (2, 'hopparoo'))))
to generate this population:
CREATE PROC test_the_subject_populate_tables()
BEGIN
INSERT OR IGNORE INTO foo(name, id) VALUES('fred', 1) @dummy_seed(123);
INSERT OR IGNORE INTO foo(name, id) VALUES('barney', 2) @dummy_seed(124)
@dummy_nullables @dummy_defaults;
INSERT OR IGNORE INTO foo(name, id) VALUES('wilma', 3) @dummy_seed(125);
INSERT OR IGNORE INTO foo(name, id) VALUES('betty', 4) @dummy_seed(126)
@dummy_nullables @dummy_defaults;
INSERT OR IGNORE INTO bar(id, data) VALUES(1, 'dino') @dummy_seed(127);
INSERT OR IGNORE INTO bar(id, data) VALUES(2, 'hopparoo') @dummy_seed(128)
@dummy_nullables @dummy_defaults;
END;
And of course if the annotation is not flexible enough, you can write your own data population.
The CQL above results in the usual C signatures. For instance:
(sqlite3 *_Nonnull _db_); CQL_WARN_UNUSED cql_code test_the_subject_populate_tables
So, it’s fairly easy to call from C/C++ test code or from CQL test code.
Generally it’s not possible to compute table usages that come from called procedures. This is because to do so you need to see the body of the called procedure and typically that body is in a different translation – and is therefore not available. A common workaround for this particular problem is to create a dummy procedure that explicitly uses all of the desired tables. This is significantly easier than creating all the schema manually and still gets you triggers and indices automatically. Something like this:
=(dummy_test]])
[[autotestcreate proc use_my_stuff()
begin
:= select 1 from t1, t2, t3, t4, t5, t6, etc..;
let x end;
use_my_stuff
simply and directly lists the desired tables. Using this approach you can have one set of test helpers for an entire unit rather than one per procedure. This is often desirable and the maintenance is not too bad. You just use the use_my_stuff
test helpers everywhere.
To help facilitate additional tools that might want to depend on CQL input files further down the toolchain, CQL includes a JSON output format for SQL DDL as well as stored procedure information, including special information for a single-statement DML. “Single-statement DML” refers to those stored procedures that consist of a single insert
, select
, update
, or delete
. Even though such procedures comprise just one statement, good argument binding can create very powerful DML fragments that are re-usable. Many CQL stored procedures are of this form (in practice maybe 95% are just one statement.)
To use CQL in this fashion, the sequence will be something like the below. See Appendix 1 for command line details.
cql --in input.sql --rt json_schema --cg out.json
The output contains many different sections for the various types of entities that CQL can process. There is a full description of the possible outputs available in diagram form
In the balance of this chapter we’ll deal with the contents of the sections and their meaning rather than the specifics of the format, which are better described with the grammar above.
The “tables” section has zero or more tables, each table is comprised of these fields:
Example:
@attribute(an_attribute=(1,('foo', 'bar')))
CREATE TABLE foo(
id INTEGER,
name TEXT );
generates:
{
"name" : "foo",
"CRC" : "-1869326768060696459",
"isTemp" : 0,
"ifNotExists" : 0,
"withoutRowid" : 0,
"isAdded" : 0,
"isDeleted" : 0,
"isRecreated": 0,
"indices" : [ "foo_name" ],
"attributes" : [
{
"name" : "an_attribute",
"value" : [1, ["foo", "bar"]]
}
],
"columns" : [
{
"name" : "id",
"type" : "integer",
"isNotNull" : 0,
"isAdded" : 0,
"isDeleted" : 0,
"isPrimaryKey" : 0,
"isUniqueKey" : 0,
"isAutoIncrement" : 0
},
{
"name" : "name",
"type" : "text",
"isNotNull" : 0,
"isAdded" : 0,
"isDeleted" : 0,
"isPrimaryKey" : 0,
"isUniqueKey" : 0,
"isAutoIncrement" : 0
}
],
"primaryKey" : [ ],
"primaryKeySortOrders" : [ ],
"foreignKeys" : [
],
"uniqueKeys" : [
],
"checkExpressions" : [
]
}
Region Information can appear on many entities, it consists of two optional elements:
Miscellaneous attributes can be present on virtual every kind of entity. They are optional. The root node introduces the attributes:
Each attribute is a name and value pair:
[[attribute]]
form is used, it is expanded into the normal cql:attribute
form in the outputEach attribute value can be:
Since the attribute values can nest it’s possible to represent arbitrarily complex data types in an attribute.
While the most common use case for attributes is to be attached to other entities (e.g., tables, columns), CQL also lets you define “global” attributes, which are included in the top level attributes
section of the JSON output. To specify global attributes you declare a variable of type object
ending with the suffix database
and attach attributes to it. CQL will merge together all the attributes from all the variables ending with database
and place them in the attributes
section of the JSON output.
Global attributes give you a way to add global configuration information into the CQL JSON output. You can, for instance, include these attributes in some root file that you #include
in the rest of your CQL code, and by doing this, these attributes will be visible in any generated JSON for those files.
Example:
@attribute(attribute_1 = "value_1")
@attribute(attribute_2 = "value_2")
declare database object;
@attribute(attribute_3 = "value_3")
declare some_other_database object;
Generates:
{
"attributes": [
{
"name": "attribute_1",
"value": "value_1"
},
{
"name": "attribute_2",
"value": "value_2"
},
{
"name": "attribute_3",
"value": "value_3"
}
]
}
Foreign keys appear only in tables, the list of keys contains zero or more entries of this form:
Unique keys appear only in tables, the list of keys contains zero or more entries of this form:
Check Expressions appear only in tables, the list of keys contains zero or more entries of this form:
?
items in the check expressionThe checkExprArgs will almost certainly be the empty list []
. In the exceedingly rare situation that the table in question was defined in a procedure and some of parts of the check expression were arguments to that procedure then the check expression is not fully known until that procedure runs and some of its literals will be decided at run time. This is an extraordinary choice but technically possible.
Columns are themselves rather complex, there are 1 or more of them in each table. The table will have a list of records of this form:
The “virtualTables” section is very similar to the “tables” section with zero or more virtual table entries.
Virtual table entries are the same as table entries with the following additions:
The JSON schema for these items was designed to be as similar as possible so that typically the same code can handle both with possibly a few extra tests of the isVirtual field.
The views section contains the list of all views in the schema, it is zero or more view entires of this form.
NOTE: The use of unbound expressions in a view would be truly extraordinary so selectArgs is essentially always going to be an empty list.
Example:
CREATE VIEW MyView AS
SELECT *
FROM foo
Generates:
{
"name" : "MyView",
"CRC" : "5545408966671198580",
"isTemp" : 0,
"isDeleted" : 0,
"projection" : [
{
"name" : "id",
"type" : "integer",
"isNotNull" : 0
},
{
"name" : "name",
"type" : "text",
"isNotNull" : 0
}
],
"select" : "SELECT id, name FROM foo",
"selectArgs" : [ ],
"fromTables" : [ "foo" ],
"usesTables" : [ "foo" ]
}
A projection defines the output shape of something that can return a table-like value such as a view or a procedure.
The projection consists of a list of one or more projected columns, each of which is:
int<job_id>
the kind is “job_id”)The dependencies section appears in many entities, it indicates things that were used by the object and how they were used. Most of the fields are optional, some fields are impossible in some contexts (e.g. inserts can happen inside of views).
The indices section contains the list of all indices in the schema, it is zero or more view entires of this form:
Example:
create index foo_name on foo(name);
Generates:
{
"name" : "foo_name",
"CRC" : "6055860615770061843",
"table" : "foo",
"isUnique" : 0,
"ifNotExists" : 0,
"isDeleted" : 0,
"columns" : [ "name" ],
"sortOrders" : [ "" ]
}
The next several sections:
All provide information about various types of procedures. Some “simple” procedures that consist only of the type of statement correspond to their section (and some other rules) present additional information about their contents. This can sometimes be useful. All the sections define certain common things about procedures so that basic information is available about all procedures. This is is basically the contents of the “general” section which deals with procedures that have a complex body of which little can be said.
The queries section corresponds to the stored procedures that are a single SELECT statement with no fragments.
The fields of a query record are:
Example:
create proc p(name_ text)
begin
select * from foo where name = name_;
end;
Generates:
{
"name" : "p",
"definedInFile" : "x",
"definedOnLine" : 3,
"args" : [
{
"name" : "name_",
"argOrigin" : "name_",
"type" : "text",
"isNotNull" : 0
}
],
"fromTables" : [ "foo" ],
"usesTables" : [ "foo" ],
"projection" : [
{
"name" : "id",
"type" : "integer",
"isNotNull" : 0
},
{
"name" : "name",
"type" : "text",
"isNotNull" : 0
}
],
"statement" : "SELECT id, name FROM foo WHERE name = ?",
"statementArgs" : [ "name_" ]
}
Procedure arguments have several generalities that don’t come up very often but are important to describe. The argument list of a procedure is 0 or more arguments of the form:
int<job_id>
it’s “job_id”An example of a simple argument was shown above, if we change the example a little bit to use the argument bundle syntax (even though it’s overkill) we can see the general form of argOrigin.
Example:
create proc p(a_foo like foo)
begin
select * from foo where name = a_foo.name or id = a_foo.id;
end;
Generates:
{
"name" : "p",
"definedInFile" : "x",
"definedOnLine" : 3,
"args" : [
{
"name" : "a_foo_id",
"argOrigin" : "a_foo foo id",
"type" : "integer",
"isNotNull" : 0
},
{
"name" : "a_foo_name",
"argOrigin" : "a_foo foo name",
"type" : "text",
"isNotNull" : 0
}
],
"fromTables" : [ "foo" ],
"usesTables" : [ "foo" ],
"projection" : [
{
"name" : "id",
"type" : "integer",
"isNotNull" : 0
},
{
"name" : "name",
"type" : "text",
"isNotNull" : 0
}
],
"statement" : "SELECT id, name FROM foo WHERE name = ? OR id = ?",
"statementArgs" : [ "a_foo_name", "a_foo_id" ]
}
Note the synthetic names a_foo_id
and a_foo_name
the argOrigin indicates that the bundle name is a_foo
which could have been anything, the shape was foo
and the column in foo
was id
or name
as appropriate.
The JSON is often used to generate glue code to call procedures from different languages. The argOrigin can be useful if you want to codegen something other normal arguments in your code.
The general insert section corresponds to the stored procedures that are a single INSERT statement with no fragments. The fields of a general insert record are:
General inserts does not include the inserted values because they are not directly extractable in general. This form is used if one of these is true:
WITH
clauseIf fragments are in use then even “generalInsert” cannot capture everything and “general” must be used (see below).
Example:
create proc p()
begin
insert into foo values (1, "foo"), (2, "bar");
end;
Generates:
{
"name" : "p",
"definedInFile" : "x",
"args" : [
],
"insertTables" : [ "foo" ],
"usesTables" : [ "foo" ],
"table" : "foo",
"statement" : "INSERT INTO foo(id, name) VALUES(1, 'foo'), (2, 'bar')",
"statementArgs" : [ ],
"statementType" : "INSERT",
"columns" : [ "id", "name" ]
}
The vanilla inserts section can be used for procedures that just insert a single row. This is a very common case and if the JSON is being used to drive custom code generation it is useful to provide the extra information. The data in this section is exactly the same as the General Inserts section except that includes the inserted values. The “values” property has this extra information.
Each value in the values list corresponds 1:1 with a column and has this form:
Example:
create proc p(like foo)
begin
insert into foo from arguments;
end;
Generates:
{
"name" : "p",
"definedInFile" : "x",
"definedOnLine" : 3,
"args" : [
{
"name" : "name_",
"argOrigin" : "foo name",
"type" : "text",
"isNotNull" : 0
},
{
"name" : "id_",
"argOrigin" : "foo id",
"type" : "integer",
"isNotNull" : 0
}
],
"insertTables" : [ "foo" ],
"usesTables" : [ "foo" ],
"table" : "foo",
"statement" : "INSERT INTO foo(id, name) VALUES(?, ?)",
"statementArgs" : [ "id_", "name_" ],
"statementType" : "INSERT",
"columns" : [ "id", "name" ],
"values" : [
{
"value" : "?",
"valueArgs" : [ "id_" ]
},
{
"value" : "?",
"valueArgs" : [ "name_" ]
}
]
}
The updates section corresponds to the stored procedures that are a single UPDATE statement with no fragments. The fields of an update record are:
Example:
create proc p(like foo)
begin
update foo set name = name_ where id = id_;
end;
Generates:
{
"name" : "p",
"definedInFile" : "x",
"definedOnLine" : 3,
"args" : [
{
"name" : "name_",
"argOrigin" : "foo name",
"type" : "text",
"isNotNull" : 0
},
{
"name" : "id_",
"argOrigin" : "foo id",
"type" : "integer",
"isNotNull" : 0
}
],
"updateTables" : [ "foo" ],
"usesTables" : [ "foo" ],
"table" : "foo",
"statement" : "UPDATE foo SET name = ? WHERE id = ?",
"statementArgs" : [ "name_", "id_" ]
}
The deletes section corresponds to the stored procedures that are a single DELETE statement with no fragments. The fields of a delete record are exactly the same as those of update. Those are the basic fields needed to bind any statement.
Example:
create proc delete_proc (name_ text)
begin
delete from foo where name like name_;
end;
Generates:
{
"name" : "delete_proc",
"definedInFile" : "x",
"definedOnLine" : 3,
"args" : [
{
"name" : "name_",
"argOrigin" : "name_",
"type" : "text",
"isNotNull" : 0
}
],
"deleteTables" : [ "foo" ],
"usesTables" : [ "foo" ],
"table" : "foo",
"statement" : "DELETE FROM foo WHERE name LIKE ?",
"statementArgs" : [ "name_" ]
}
And finally the section for procedures that were encountered that are not one of the simple prepared statement forms. The principle reasons for being in this category are:
The fields of a general procedure are something like a union of update and delete and query but with no statement info. The are as follows:
The result contract is at most one of these:
A procedure that does not produce a result set in any way will set none of these and have no projection entry.
Example:
create proc with_complex_args (inout arg real)
begin
set arg := (select arg+1 as a);
select "foo" bar;
end;
Generates:
{
"name" : "with_complex_args",
"definedInFile" : "x",
"definedOnLine" : 1,
"args" : [
{
"binding" : "inout",
"name" : "arg",
"argOrigin" : "arg",
"type" : "real",
"isNotNull" : 0
}
],
"usesTables" : [ ],
"projection" : [
{
"name" : "bar",
"type" : "text",
"isNotNull" : 1
}
],
"hasSelectResult" : 1,
"usesDatabase" : 1
}
The complex form of the arguments allows for an optional “binding”
Note that atypical binding forces procedures into the “general” section.
Example
declare interface interface1 (id integer);
Generates:
{
"name" : "interface1",
"definedInFile" : "x.sql",
"definedOnLine" : 1,
"projection" : [
{
"name" : "id",
"type" : "integer",
"isNotNull" : 0
}
]
}
The declareProcs
section contains a list of procedure declaractions. Each declaration is of the form:
The declareNoCheckProcs
describes procedures declared like so:
DECLARE PROC Foo NO CHECK
Such procedures carry on the name and attributes
The declareFuncs
section contains a list of function declarations, Each declaration is of the form:
declare function dict_create() create object;
)There are also sections for declareNoCheckFuncs
, declareSelectFuncs
, and declareNoCheckSelectFuncs
.
args
tagcreatesObject
tag (they can’t create objects)projection
instead of a returnType
if they are table-valuedThe regions section contains a list of all the region definitions. Each region is of the form:
There are more details on regions and the meaning of these terms in Chapter 10.
This section lists all of the declared ad hoc migrations. Each entry is of the form:
Exactly one of:
There are more details on ad hoc migrations in Chapter 10.
This section list all the enumeration types and values. Each entry is of the form:
Each enumeration value is of the form:
Example:
declare enum an_enumeration integer ( x = 5, y = 12 );
Generates:
{
"name" : "an_enumeration",
"type" : "integer",
"isNotNull" : 1,
"values" : [
{
"name" : "x",
"value" : 5
},
{
"name" : "y",
"value" : 12
}
]
}
This section list all the constant groups and values. Each entry is of the form:
Each constant value is of the form:
Example:
declare const group some_constants (
= cast(5 as integer<job_id>),
x = 12.0,
y = 'foo'
z );
Generates:
{
"name" : "some_constants",
"values" : [
{
"name" : "x",
"type" : "integer",
"kind" : "job_id",
"isNotNull" : 1,
"value" : 5
},
{
"name" : "y",
"type" : "real",
"isNotNull" : 1,
"value" : 1.200000e+01
},
{
"name" : "z",
"type" : "text",
"isNotNull" : 1,
"value" : "foo"
}
]
}
This section list all the schema subscriptions in order of appearance. Each entry is of the form:
This section is a little more complicated than it needs to be becasue of the legacy/deprecated @resub
directive. At this point only the table name is relevant. The version is always 1 and the type is always “unsub”.
Example:
@unsub(foo);
Generates:
{
"type" : "unsub",
"table" : "foo",
"version" : 1
}
Shared fragments allows you to reuse and compose SQL queires in a safe (type checked) and efficient manner. They are based on Common Table Expressions (CTEs), so some basic knowledge of that is recommended before using Shared Fragments.
You can think of shared fragments as being somewhat like a parameterized view, but the parameters are both value parameters and type parameters. In Java or C#, a shared fragments might have had an invocation that looked something like this: my_fragment(1,2)<table1, table2>
.
It’s helpful to consider a real example:
AS (
split_text(tok) WITH RECURSIVE
AS (
splitter(tok,rest) SELECT
'' tok,
|| ',', '') rest
IFNULL( some_variable_ UNION ALL
SELECT
substr(rest, 1, instr(rest, ',') - 1) tok,
substr(rest, instr(rest, ',') + 1) rest
FROM splitter
WHERE rest != ''
)SELECT tok from splitter where tok != '';
)
This text might appear in dozens of places where a comma separated list needs to be split into pieces and there is no good way to share the code between these locations. CQL is frequently used in conjunction with the C-pre-processor so you could come up with something using the #define construct but this is problematic for several reasons:
None of this is any good but the desire to create helpers like this is real both for correctness and for performance.
To make these things possible, we introduce the notion of shared fragments. We need to give them parameters and the natural way to create a select statement that is bindable in CQL is the procedure. So the shape we choose looks like this:
[[shared_fragment]]value TEXT)
PROC split_text(BEGIN
WITH RECURSIVE
AS (
splitter(tok,rest) SELECT
'' tok,
value || ',', '') rest
IFNULL( UNION ALL
SELECT
substr(rest, 1, instr(rest, ',') - 1) tok,
substr(rest, instr(rest, ',') + 1) rest
FROM tokens
WHERE rest != ''
)SELECT tok from splitter where tok != '';
END;
The introductory attribute [[shared_fragment]]
indicates that the procedure is to produce no code, but rather will be inlined as a CTE in other locations. To use it, we introduce the ability to call a procedure as part of a CTE declaration. Like so:
WITH
as (call split_text('x,y,z'))
result(v) select * from result;
Once the fragment has been defined, the statement above could appear anywhere, and of course the text 'x,y,z'
need not be constant. For instance:
value TEXT)
PROC print_parts(BEGIN
DECLARE C CURSOR FOR
WITH
as (CALL split_text('x,y,z'))
result(v) SELECT * from result;
LOOP FETCH C
BEGIN
CALL printf("%s\n", C.v);
END;
END;
Fragments are also composable, so for instance, we might also want some shared code that extracts comma separated numbers. We could do this:
[[shared_fragment]]value TEXT)
PROC ids_from_string(BEGIN
WITH
as (CALL split_text(value))
result(v) SELECT CAST(v as LONG) as id from result;
END;
Now we could write:
value TEXT)
PROC print_ids(BEGIN
DECLARE C CURSOR FOR
WITH
id) as (CALL ids_from_string('1,2,3'))
result(SELECT * from result;
LOOP FETCH C
BEGIN
CALL printf("%ld\n", C.id);
END;
END;
Of course these are very simple examples but in principle you can use the generated tables in whatever way is necessary. For instance, here’s a silly but illustrative example:
/* This is a bit silly */
value TEXT)
PROC print_common_ids(BEGIN
DECLARE C CURSOR FOR
WITH
id) as (CALL ids_from_string('1,2,3')),
v1(id) as (CALL ids_from_string('2,4,6'))
v2(SELECT * from v1
INTERSECT
SELECT * from v2;
LOOP FETCH C
BEGIN
CALL printf("%ld\n", C.id);
END;
END;
With a small amount of dynamism in the generation of the SQL for the above, it’s possible to share the body of v1 and v2. SQL will of course see the full expansion but your program only needs one copy no matter how many times you use the fragment anywhere in the code.
So far we have illustrated the “parameter” part of the flexibility. Now let’s look at the “generics” part; even though it’s overkill for this example, it should still be illustrative. You could imagine that the procedure we wrote above ids_from_string
might do something more complicated, maybe filtering out negative ids, ids that are too big, or that don’t match some pattern, whatever the case might be. You might want these features in a variety of contexts, maybe not just starting from a string to split.
We can rewrite the fragment in a “generic” way like so:
[[shared_fragment]]
PROC ids_from_string_table()BEGIN
WITH
source(v) LIKE (select "x" v)
SELECT CAST(v as LONG) as id from source;
END;
Note the new construct for a CTE definition: inside a fragment we can use “LIKE” to define a plug-able CTE. In this case we used a select
statement to describe the shape the fragment requires. We could also have used a name source(*) LIKE shape_name
just like we use shape names when describing cursors. The name can be any existing view, table, a procedure with a result, etc. Any name that describes a shape.
Now when the fragment is invoked, you provide the actual data source (some table, view, or CTE) and that parameter takes the role of “values”. Here’s a full example:
value TEXT)
PROC print_ids(BEGIN
DECLARE C CURSOR FOR
WITH
*) as (CALL split_text(value)),
my_data(id) as (CALL ids_from_string_table() USING my_data AS source)
my_numbers(SELECT id from my_numbers;
LOOP FETCH C
BEGIN
CALL printf("%ld\n", C.id);
END;
END;
We could actually rewrite the previous simple id fragment as follows:
[[shared_fragment]]value TEXT)
PROC ids_from_string(BEGIN
WITH
as (CALL split_text(value))
tokens(v) id) as (CALL ids_from_string_table() USING tokens as source)
ids(SELECT * from ids;
END;
And actually we have a convenient name we could use for the shape we need so we could have used the shape syntax to define ids_from_string_table
.
[[shared_fragment]]
PROC ids_from_string_table()BEGIN
WITH
source(*) LIKE split_text
SELECT CAST(tok as LONG) as id from source;
END;
These examples have made very little use of the database but of course normal data is readily available, so shared fragments can make a great way to provide access to complex data with shareable, correct code. For instance, you could write a fragment that provides the ids of all open businesses matching a name from a combination of tables. This is similar to what you could do with a VIEW
plus a WHERE
clause but:
In short, shared fragments can help with the composition of any complicated kinds of queries. If you’re producing an SDK to access a data set, they are indispensible.
When creating a fragment the following rules are enforced:
Usage of a fragment is always introduced by a “call” to the fragment name in a CTE body. When using a fragment the following rules are enforced.
(SELECT ...)
expressions
USING my_data AS values
the actual columns in my_data
must be the same as in the values
parametervalues(*) as (CALL something() using source as source)
- here the caller’s “source” takes the value of the fragment’s “source”, which is not a true conflictUSING _source AS _source
makes it clear that a parameter is being forwarded and _source
is not likely to conflict with real table or view namesNote that when shared fragments are used, the generated SQL has the text split into parts, with each fragment and its surroundings separated, therefore the text of shared fragments is shared(!) between usages if normal linker optimizations for text folding are enabled (common in production code.)
Shared fragments use dynamic assembly of the text to do the sharing but it is also possible to create alternative texts. There are many instances where it is desirable to not just replace parameters but use, for instance, an entirely different join sequence. Without shared fragments, the only way to accomplish this is to fork the desired query at the topmost level (because SQLite has no internal possibility of “IF” conditions.) This is expensive in terms of code size and also cognitive load because the entire alternative sequences have to be kept carefully in sync. Macros can help with this but then you get the usual macro maintenance problems, including poor diagnostics. And of course there is no possibility to share the common parts of the text of the code if it is forked.
However, conditional shared fragments allow forms like this:
[[shared_fragment]]
PROC ids_from_string(val TEXT)BEGIN
IF val IS NULL OR val IS '' THEN
SELECT 0 id WHERE 0; -- empty result
ELSE
WITH
as (CALL split_text(val))
tokens(v) id) as (CALL ids_from_string_table() USING tokens as source)
ids(SELECT * from ids;
END IF;
END;
Now we can do something like:
*) AS (CALL ids_from_string(str)) ids(
In this case, if the string val
is empty then SQLite will not see the complex comma splitting code, and instead will see the trivial case select 0 id where 0
. The code in a conditional fragment might be entirely different between the branches removing unnecessary code, or swapping in a new experimental cache in your test environment, or anything like that.
When a condiitional is specified without an else clause, the fragment would return a result with no rows if none of the specified conditionals are truthy.
For example:
[[shared_fragment]]NOT NULL)
PROC maybe_empty(cond BOOL BEGIN
IF cond THEN
SELECT 1 a, 2 b, 3 c;
END IF;
END;
Internally, this is actually equivalent to the following:
[[shared_fragment]]NOT NULL)
PROC maybe_empty(cond BOOL BEGIN
IF cond THEN
SELECT 1 a, 2 b, 3 c;
ELSE
SELECT NOTHING;
END IF;
END;
The SELECT NOTHING
expands to the a query that returns no rows, like this:
SELECT 0,0,0 WHERE 0; -- number of columns match the query returned by the main conditional.
The generalization is simply this:
With this additional flexibility a wide variety of SQL statements can be constructed economically and maintainability. Importantly, consumers of the fragments need not deal with all these various alternate possibilities but they can readily create their own useful combinations out of building blocks.
Ultimately, from SQLite’s perspective, all of these shared fragment forms result in nothing more complicated than a chain of CTE expressions.
See Appendix 8 for an extensive section on best practices around fragments and common table expressions in general.
TIP: If you use CQL’s query planner on shared fragments with conditionals, the query planner will only analyze the first branch by default. You need to use
[[query_plan_branch={an_integer}]]
to modify the behavior. Read Query Plan Generation for details.
The shared fragment system also has the ability to create re-usable expression-style fragments giving you something like SQL inline functions. These do come with some performance cost so they should be used for larger fragments. In many systems a simple shared fragment would not compete well with an equivalent #define
. Expression fragments shine when:
From a raw performance perspective, the best you can hope for with any of the fragment approaches is a “tie” on speed compared do directly inlining equivalent SQL or using a macro to do the same. However, from a correctness and space perspective it is very possible to come out ahead. It’s fair to say that expression fragments have the greatest overhead compared to the other types and so they are best used in cases where the size benefits are going to be important.
An expression fragment is basically a normal shared fragment with no top-level FROM
clause that generates a single column. A typical one might look like this:
-- this isn't very exciting because regular max would do the job
[[shared_fragment]]integer, y integer)
proc max_func(x begin
select case when x >= y then x else y end;
end;
The above can be used in the context of a SQL statement like so:
select max_func(T1.column1, T1.column2) the_max from foo T1;
The consequence of the above is that the body of max_func
is inlined into the generated SQL. However, like the other shared fragments, this is done in such a way that the text can be shared between instances so you only pay for the cost of the text of the SQL in your program one time, no matter how many time you use it.
In particular, for the above, the compiler will generate the following SQL:
select (
select case when x >= y then x else y end
from (select T1.column1 x, column2 y))
But each line will be its own string literal, so, more accurately, it will concatenate the following three strings:
"select (", // string1
" select case when x >= y then x else y end", // string2
" from (select T1.column1 x, column2 y))" // string3
Importantly, string2
is fixed for any given fragment. The only thing that changes is string3
, i.e., the arguments. In any modern C compilation system, the linker will unify the string2
literal across all translation units so you only pay for the cost of that text one time. It also means that the text of the arguments appears exactly one time, no matter how complex they are. For these benefits, we pay the cost of the select wrapper on the arguments. If the arguments are complex that “cost” is negative. Consider the following:
select max_func((select max(T.m) from T), (select max(U.m) from U))
A direct expansion of the above would result in something like this:
case when (select max(T.m) from T) >= (select max(U.m) from U)
then (select max(T.m) from T)
else (select max(U.m) from U)
end;
The above could be accomplished with a simple #define
style macro. However, the expression fragment generates the following code:
select (
select case when x >= y then x else y end
from select (select max(T.m) from T) x, (select max(U.m) from U) y))
Expression fragments can nest, so you could write:
[[shared_fragment]]integer, y integer, z integer)
proc max3_func(x begin
select max_func(x, max_func(y, z));
end;
Again, this particular example is a waste because regular max
would already do the job better.
To give another example, common mappings from one kind of code to another using case/when can be written and shared this way:
-- this sort of thing happens all the time
[[shared_fragment]]integer not null)
proc remap(x begin
select case x
when 1 then 1001
when 2 then 1057
when 3 then 2010
when 4 then 2011
else 9999
end;
end;
In the following:
select remap(T1.c), remap(T2.d), remap(T3.e) from T1, T2, T3... etc.
The text for remap
will appear three times in the generated SQL query but only one time in your binary.
FROM
, WHERE
, HAVING
, etc. – the result is an expressionSELECT
which could then have all the usual SELECT
elementsA simpler syntax might have been possible but expression fragments are only interesting in SQL contexts where (among other things) normal procedure and function calls are not available. So the select
keyword makes it clear to the coder and the compiler that the expression will be evaluated by SQLite and the rules for what is allowed to go in the expression are the SQLite rules.
The fragment has no FROM
clause because we’re trying to produce an expression, not a table-value with one column. If you want a table-value with one column, the original shared fragments solution already do exactly that. Expression fragments give you a solution for sharing code in, say, the WHERE
clause of a larger select statement.
Commpared to something like
case when (x) >= (y) then x else y end; #define max_func(x,y)
The macro does give you a ton of flexibility, but it has many problems:
CQL offers a way to run SQLite’s EXPLAIN QUERY PLAN
command for all the SQL statements used in a CQL file using a set of special compilation steps.
Generating query plans is inherently complicated. Any given stored procedure might include many SQL statements, each of which has a plan. To get the plans, those statements must be executed in the appropriate mode. In order for them to execute, whatever tables, views, and user-defined functions they use must exist. The statements can have any number of parameters, those have to be swapped out because they might come from anywhere. When run in --rt query_plan
mode, the compiler accomplishes all of this by analyzing the original code and creating entirely new code. Running this new code creates the schema and, with the appropriate transforms, runs all the SQL statements in the original to give us the query plans. The process requires many steps as we’ll see below.
TIP: The following steps are used in
./repl/go_query_plan.sh
, you can run it to get a quick demonstration of this feature in action. The rest of the section explains how query plan generation works and some of its quirks.
To execute query plans for a given CQL file, the following commands need to be run:
CQL_FILE= # The CQL file to compile
CQL_ROOT_DIR= # Path to cql directory
CQL=$CQL_ROOT_DIR/out/cql
# Generate Query Plan Script
$CQL --in $CQL_FILE --rt query_plan --cg go-qp.sql
# Compile and link CQL artifacts, with a main C file query_plan_test.c
$CQL --in go-qp.sql --cg go-qp.h go-qp.c --dev
cc -I$CQL_ROOT_DIR -I. -c $CQL_ROOT_DIR/query_plan_test.c go-qp.c
cc -I$CQL_ROOT_DIR -I. -O -o go_query_plan go-qp.o query_plan_test.o $CQL_ROOT_DIR/cqlrt.c -lsqlite3
# Run and generate query plans
./go_query_plan
In order to flexibily create query plans, CQL uses --rt query_plan
to generate a second CQL script that returns query plans for each SQL statement used in a given file.
The CQL repository provides the file query_plan_test.c
that can be used as the “main” function, otherwise you can make your own.
NOTE: >When compiling the CQL file generated by
--rt query_plan
, the--dev
flag is required.
CQL’s query planner generator modifies the usage of the following features to allow SQLite run EXPLAIN QUERY PLAN
successfully:
CAUTION: Generating query plans for CQL files that use table valued functions, or virtual tables results in the creation of similarly named “normal” tables which will be using in the query instead of the virtual table or table -valued function. This means that the query plan output won’t be quite right, especially if these items had complex interiors with nested SQLite queries, but it also means the code that generates the query plans does not need to have all of your native bindings for the relevant modules and functions which make it possible to actually run the queries out of context in a standalone build.
Variables used in SQL statements are stubbed into constants. The exact value varies depending on the type of the variable, but it is always equivalent to some form of "1"
.
...
SELECT *
FROM my_table
WHERE id = x;
...
...
EXPLAIN QUERY PLAN
SELECT *
FROM my_table
WHERE my_table.id = nullable(1);
...
Read Functions on details about Function Types.
Since the implementation of UDFs in a CQL file do not affect SQLite query plans, CQL’s query plan script automatically creates no-op stubs to be used instead.
Read CQL Query Fragments on details about shared fragments
Only one branch of a conditional is chosen for query plan analysis. By default this will be the first branch, which is the initial SELECT
statement following the IF
conditional. The branch to analyze can be configured with the cql:query_plan_branch
@attribute
Here’s an example of cql:query_plan_branch
being used:
[[shared_fragment]]=1]]
[[query_plan_branchCREATE PROC frag2(y int)
BEGIN
IF y == 2 THEN
SELECT 10 b;
ELSE IF y == -1 THEN
SELECT 20 b;
ELSE
SELECT 30 b;
END IF;
END;
EXPLAIN QUERY PLAN
SELECT 20 b;
cql:query_plan_branch=1
selects the second branch. Providing cql:query_plan_branch=2
instead would yield the ELSE
clause SELECT 30 b
. cql:query_plan_branch=0
would yield SELECT 10 b
, which is the same as the default behaviour.
There are two large features that involve complex uses of blobs that are supported in the default runtime. These are Blob Storage and Backed Tables, the latter of which uses key and value blobs for generic schema.
The general idea here is that you might want to take arbitrary data that is in a cursor, which is to say any database shape at all, and convert it into a single blob. You could use this blob to store composite data in a single column in the database or to send your data over some kind of wire transport in a generic fashion. The idea of blob storage is to provide a reslient way to do this that is platform neutral (i.e. you can store the blob on one system and recover it on another system with different endian order). Blob storage allows limited extension of the blob shape over time, allowing you to add new nullable columns at the end of your shape and still recover your data from older blobs.
In SQL/CQL, the main way you define structures, especially those that you want to maintain, is with tables. Hence we introduce notation like this:
[[blob_storage]]create table news_info(
who text,
what text,long -- timestamp of some kind
`when` );
The blob_storage attribute
indicates that the table we’re about to define here is not really going to be a materialized table. As a result, you will not be able to (e.g.) DROP
the table or SELECT
from it, and there will be no schema upgrade for it should you request one. However, the usual schema maintenance rules still apply (See Chapter 10 and Chapter 11) which help you to create compatible versions of this structure. For instance, new columns can be added only at the end, and only if they are nullable. Here we add source
to the schema in a hypothetical “version 6”.
[[blob_storage]]create table news_info(
who text,
what text,long -- timestamp of some kind
`when` source text @create(6)
);
NOTE: schema versions move forward globally in the schema, not locally in one table; this implies there are versions 1-5 elsewhere, not shown.
Additionally, since the storage is not backed by SQLite and therefore lacks its constraint system, default values and constraints are not allowed in a table marked with cql:blob_storage
; it’s just data. Similarly, triggers, views, and indices may not use the table marked as blob storage. It isn’t really a table.
Blob storage goes in a blob field, but recall CQL has discriminated types so we can use a form like this:
create table info(
id long primary key,
blob<news_info>
news_info );
From a SQL perspective news_info
is just a blob, you can only apply blob operations to it in the context of a query. This means WHERE
clauses that partially constraint the blob contents are not generally possible (though you could do it if you write suitable UDFs and Backed Tables actually generalize this if generic schema support is what is desired). Blob storage is really about moving the whole blob around so it’s apprpropriate when you want to crack the blob in code, and not so much in database operations.
You can use the to_blob
pipeline function on a cursor to get a blob. In the below C:to_blob
becomes simply cql_cursor_to_blob(C)
.
create proc make_blob(like news_info, out result blob<news_info>)
begin
-- declare the cursor
cursor C like news_info;
-- load it from loose arguments
from arguments;
fetch c -- set the blob variable
set result := C:to_blob;
END;
The above declares a cursor, loads it from argument values, and converts it to a blob. Of course any of the usual cursor building forms can be used to power your blob creation, you just do one serialization at the end. The above is assembling a blob from arguments but you could equally make the blob from data.
create proc get_news_info(id_ long not null, out result blob<news_info>)
begin
-- use our @columns sugar syntax for getting just news_info columns from
-- a table with potentially lots of stuff (or an error if it's missing columns)
cursor C for
select @columns(like news_info) from some_source_of_info where info.id = id_;
fetch c;set result := C:to_blob;
END;
There are many cursor fetch forms, including dummy data forms and other interesting bits of sugar. You can fetch a cursor from arguments, from other cursors, and even combinations. Because cursors are the source of new blobs, any of these data acquistion forms are viable and convenient sources of data with which to make blobs.
Again, the normal way that you work with records in CQL is by creating suitable cursors. Such cursors can be economically accessed on a field-by-field basis. What we need is a way to easily recreate a cursor from a blob so we can read the data values. To do this use this form:
-- get the blob from somewhere (b will be of type blob<news_info>)
:= (select news_info from info where id = id_ if nothing null);
let b
-- create a suitable cursor with the same shape
cursor C like b;
-- load the cursor (note that this can throw an exception if the blob is corrupt)
:from_blob(b);
C
-- now use C.who, C.what, etc.
Once the data is loaded in a cursor it is very economical to access on a field-by-field basis, and, since the deserialization of the blob happened all at once, that can also be economical.
NOTE: The runtime cannot assume that the blob is well formed, it could be coming from anywhere. If only for security reasons it must assume the blob is “hostile”. Hence the decoding validates the shape, internal lengths, and so forth. Therefore there are many ways conversion might fail.
Once you have the cursor you can do any of the usual data operations; you could even make new blobs with different combinations by slicing the cursor fields using the LIKE
operator. You can return the cursor with OUT
, or OUT UNION
, or pass the blob fields as arguments to functions using the FROM
forms. The cracked blob is fully usable for all the usual CQL things you might need to do. Importantly, blob storage can contain other blobs so you can nest shapes as needed.
The blob data must be able to evolve over time, so each blob has to be self-describing. We must also be able to throw an exception if an incorrect or invalid blob is used when loading a cursor, so the blob has to contain the following:
varint
/VLQ
format after zigzag
encoding (same article)As with many other features, it’s possible to replace the (de)serialization with code of your choice by supplying your own runtime methods.
Storage types that are going to be persisted in the database or go over a wire-protocol should be managed like schema with the usual validation rules. On the other hand, formats that will be used only transiently in memory can be changed at whim from version to version. As mentioned above, the design specifically considers cases where a new client discovers an old-format blob (with fewer columns) and, the reverse, cases where an old client recieves a datagram from a new client with too many columns. Any customizations should consider these same cases.
The code samples below illustrate some of the more common blob operations that are likely to come up.
[[blob_storage]]create table news_info(
who text,
what text,long -- timestamp of some kind
`when`
);
-- a place where the blob appears in storage
create table some_table(
integer,
x integer,
y blob<news_info>
news_blob
);
-- a procedure that creates a news_info blob from loose args
create proc make_blob(like news_info, out result blob<news_info>)
begin
cursor C like news_info;
from arguments;
fetch C := C:to_blob;
result end;
-- a procedure that cracks the blob, creating a cursor
create proc crack_blob(data blob<news_info>)
begin
cursor C like news_info;
:from_blob(data);
Cout C;
end;
-- A procedure that cracks the blob into loose args if needed
-- the OUT statement was created specifically to allow you to
-- avoid this sort mass OUT awfulness consider that the blob
-- might have dozens of columns, this quickly gets unweildy.
-- But, as an illustration, here is explicit extraction.
create proc crack_blob_to_vars(
data blob<news_info>,
out who text,
out what text,
out `when` long) -- when is a keyword, `when` is not.
begin
cursor C like news_info;
:from_blob(data);
C:= C.who;
who := C.what;
what := C.`when`;
`when` end;
-- we're going to have a result with x/y columns in it
-- this shape lets us select those out easily using LIKE
interface my_basic_columns (int,
x int
y
);
-- this defines a shape for the result we want
-- we're never actually defining this procedure
interface my_result_shape (like my_basic_columns,
like news_info
);
create proc select_and_crack(whatever bool not null)
begin
cursor C for select * from some_table where whatever;
loop fetch c
begin
-- crack the blob in c into a new cursor 'news'
declare news cursor like news_info;
from blob c.news_blob;
fetch news
-- assemble the result we want from the parts we have
declare result cursor like my_result_shape;
from values (from c like my_basic_columns, from news);
fetch result
-- emit one row, some came from the original data some cracked
-- callers will never know that some data was encoded in a blob
out union result;
end;
end;
These forms are no longer supported:
-- loading a blob from a cursor
set a_blob from cursor C;
-- new supported forms, these are all the same thing
let b := C:to_blob;
C:to_blob(b);
let b := cql_cursor_to_blob(C);
cql_cursor_to_blob(C, b);
-- loading a cursor from a blob
fetch C from blob b;
-- new supported forms, these are all the same thing
C:from_blob(b);
cql_cursor_from_blob(C, b);
Most production databases include some tables that are fairly generic, they use maybe a simple key-value combination to store some simple settings or something like that. In the course of feature development this kind of thing comes up pretty often and in large client applications there are many small features that need a little bit of state.
It’s easy enough to model whatever state you need with a table or two but this soon results in an explosion of tiny tables. In some cases there are only a few rows of configuration data and indeed the situation can be so bad that the text of the schema for the little state table is larger than the sum of all the data you will ever store there. This situation is a bit tragic because SQLite has initialization cost associated with each table. So these “baby tables” are really not paying for themselves at all.
What we’d like to do is use some kind of generic table as the backing store for many of these small tables while preserving type safety. The cost of access might be a bit higher but since data volumes are expected to be low anyway this would be a good trade-off. And we can have as many as we like. In some cases the state doesn’t even need to be persisted, so we’re talking about tables in an in-memory database. Here low cost of initialization is especially important. And lastly, if your project has dozens or even hundreds of small features like this, the likelihood that all of them are even used in any one session is quite low and so again, having a low fixed cost for the schema is a good thing. No need to create a few hundred in-memory tables on the off chance that they are needed.
First you need a place to store the data. We using “backing” tables to store the data for “backed” tables. And we define a backing table in the usual way. A simple backing table is just a key/value store and looks like this:
[[backing_table]]CREATE TABLE backing(
BLOB PRIMARY KEY,
k BLOB NOT NULL
v );
The backing_table
attribute indicates that the table we’re about to define is to be used for backing storage. At present it is signficantly restricted. It has to have exactly two columns, both of which are blobs, one is the key and one is the value. It should be either “baseline” schema (no @create annotation) or annotated with @create
as it is expected to be precious data. @recreate
is an error. If it’s an in-memory table then versioning is somewhat moot but really the backing store schema is not supposed to change over time, that’s the point.
As showwn below there are additional attributes that can be put on the backing table to define the functions that will be used to store data there.
In future versions we expect to allow some number of additional physical columns which can be used by the backed tables (discussed below) but for now only the simple key/value pattern is allowed. The columns can have any names but as they will be frequently used short names like “k” and “v” are recommended.
Backed tables look like this:
=backing]]
[[backed_byCREATE TABLE backed(
id INTEGER PRIMARY KEY,
NOT NULL,
name TEXT REAL
bias
);
=backing]]
[[backed_byCREATE TABLE backed2(
id INTEGER PRIMARY KEY,
NOT NULL
name TEXT );
The backed_by
attribute indicates that the table we’re about to define is not really going to be its own table. As a result, you will not be able to (e.g.) DROP
the table or CREATE INDEX
or CREATE TRIGGER
on it, and there will be no schema upgrade for it should you request one with --rt schema_upgrade
. The table may not contain constraints as there would be no way to enforce them, but they may have default values. As compensation for these restrictions, backed tables can be changed freely and have no associated physical schema cost.
NOTE: Adding new not null columns creatives effectively a new backed table, any previous data will seem “lost”. See below.
To understand how reading works, imagine that we had a VIEW for each backed table which simply reads the blobs out of the backing store and then extracts the backed columns using some blob extraction functions. This would work, but then we’d be trading view schema for table schema so the schema savings we’re trying to achieve would be lost.
We can’t use an actual VIEW but CQL already has something very “view like” – the shared fragment structure. So what CQL does instead of views is to automatically create a shared fragment just like the view we could have made. The shared fragment looks like this:
[[shared_fragment]]CREATE PROC _backed ()
BEGIN
SELECT
rowid,
id) AS id,
cql_blob_get(T.k, backed.AS name,
cql_blob_get(T.v, backed.name) AS bias
cql_blob_get(T.v, backed.bias) FROM backing AS T
WHERE cql_blob_get_type(backed, T.k) = 2105552408096159860L;
END;
Two things to note:
First, this fragment has the right shape, but the shared fragment doesn’t directly call blob extraction UDFs. Rather it uses indirect functions like cql_blob_get
. The point of these helpers is to make the actual blob functions configurable. The default runtime includes an implementation of extration functions with the default names, but you can create whatever blob format you want by defining suitable functions. You can even have different encodings in different backed tables.
Second, there is a type code embedded in the procedure. The type code is a hash of the type name and the names and types of all the not-null fields in the backed table. The hash is arbitrary but repeatable, any system can compute the same hash and find the records they want without having to share headers. The actual hash function is included in the open source but it’s just a SHA256 reduced to 64 bits with some name canonicalization. The JSON output also includes the relevant hashes so you can easily consume them without even having to know the hash function.
As a second example, the expanded code for backed2
is shown below:
[[shared_fragment]]CREATE PROC _backed2 ()
BEGIN
SELECT
rowid,
id) AS id,
cql_blob_get(T.k, backed2.AS name
cql_blob_get(T.v, backed2.name) FROM backing AS T
WHERE cql_blob_get_type(backed, T.k) = -1844763880292276559L;
END;
As you can see it’s very similar – the type hash is different and of course it has different columns but the pattern should be clear.
The type hash, which is also sometimes called the record type, is designed to stay fixed over time even if you add new optional fields. Contrariwise, if you change the name of the type or if you add new not null fields the type identity is considered to be changed and any data you have in the backing table will basically be ignored because the type hash will not match. This is lossy but safe. More complicated migrations from one type shape to another are possible by introducing a new backed type and moving data.
cql_blob_get
and cql_blob_get_type
By default cql_blob_get
turns into either bgetkey
or bgetval
depending on if you are reading from the key blob or the value blob. This is controlled by attributes on the backing table. All users of that table will apply the same functions.
= bgetkey]]
[[get_key = bgetval]]
[[get_val = bgetkey_type]] [[get_type
The key can no optional fields, it is formed from the primary key of the backing table. As a result it’s possible to index the fields by number and infer their types. Values, on the other hand, are expected to have nullable fields so some might be missing. They are indexed using a hash of the field name and the type, similar to the type code that identifies the backed table in the storage.
These attributes control offsets vs. codes:
-- omit and keys use offsets by default
[[use_key_codes]] -- omit and values use codes by default [[use_val_offsets]]
Here the offset means the zero based ordinal of the column in the key or the value. The order is the order that they appear in the table definition, there might be gaps because keys and values could interleave, each gets its own ordinals.
=something]]
[[backed_bycreate table foo(
int, -- ordinal 0 for keys
x int, -- ordinal 0 for values
a int -- ordinal 1 for values
b int, -- ordinal 1 for keys
y int -- ordinal 2 for values
c primary key (x,y)
);
NOTE: The blob format for keys must be canonical in the sense that the same values always produce the same blob, even after replacements, so if you use a field id based blob it will be important to always store the fields in the same order. Contrariwise use of offsets for the value blob indicates that your particular value blob will have fixed storage that is addressable by offset. This may be suitable for your needs, for instance maybe you only ever need to store 0-3 integers in the value.
The attributes [[json]]
and [[jsonb]]
cause the compiler to ignore the other backing attributes and use a JSON array for the keys (with the leading column being the type) and a JSON. This requires that the SQLite you are using supports the JSON functions, in particular json_array
, json_object
, json_set
, and the ->>
extraction operator. Some stock versions of Linux do not have them but any recent SQLite amalgamation will have these features.
Armed with these basic transforms we can already do a simple transform to make select statement work. Suppose CQL sees:
cursor C for select * from backed;
The compiler can make this select statement work with a simple transform:
cursor C FOR WITH
*) AS (CALL _backed())
backed (SELECT *
FROM backed;
Now remember _backed
was the automatically created shared fragment. Basically, when the compiler encounters a select statement that mentions any backed table it adds a call to the corresponding shared fragment in the WITH
clause, creating a `WITH`` clause if needed. This effectively creates necessary “view”. And, because we’re using the shared fragment form, all users of this fragment will share the text of the view. So there is no schema and the “view” text of the backed table appears only once in the binary. More precisely we get this after full expansion:
WITH
rowid, id, name, bias) AS (
backed (SELECT
rowid,
0), -- 0 is offset of backed.id in key blob
bgetkey(T.k, -6639502068221071091L), -- note hash of backed.name
bgetval(T.v, -3826945563932272602L) -- note hash of backed.bias
bgetval(T.v, FROM backing AS T
WHERE bgetkey_type(T.k) = 2105552408096159860L)
SELECT rowid, id, name, bias
FROM backed;
Now with this in mind we can see that it would be very beneficial to also add this:
[[deterministic]]declare select function bgetkey_type(b blob) long;
CREATE INDEX backing_index ON backing(bgetkey_type(k));
or more cleanly:
CREATE INDEX backing_index ON backing(cql_blob_get_type(backing, k));
Either of these results in a computed index on the row type stored in the blob. Other physical indices might be helpful too and these can potentially be shared by many backed tables, or used in partial indicies.
NOTE: Your type function can be named something other than the default
bgetkey_type
.
Now consider a slightly more complex example:
select T1.* from backed T1 join backed2 T2 where T1.id = T2.id;
This becomes:
WITH
backed (rowid, id, name, bias) AS (CALL _backed()),
backed2 (rowid, id, name) AS (CALL _backed2())
SELECT T1.*
FROM backed AS T1
INNER JOIN backed2 AS T2
WHERE T1.id = T2.id;
Now even though two different backed tables will be using the backing store, the original select “just works” once the CTE’s have been added. All the compiler had to do was add both backed table fragments. Even if backed
was joined against itself, that would also just work.
It will be useful to consider a simple example such as:
insert into backed values (1, "n001", 1.2), (2, "n002", 3.7);
This statement has to insert into the backing storage, converting the various values into key and value blobs. The compiler uses a simple transform to do this job as well. The above becomes:
WITH
id, name, bias) AS (
_vals (VALUES(1, "n001", 1.2), (2, "n002", 3.7)
)INSERT INTO backing(k, v) SELECT
id, backed.id),
cql_blob_create(backed, V.
cql_blob_create(backed,
V.name, backed.name,
V.bias, backed.bias)FROM _vals AS V;
Again the compiler is opting for a transform that is universal and here the issue is that the data to be inserted can be arbitrarily complicated. It might include nested select expressions, value computations, or any similar thing. In this particular case the data is literal values but in general the values could be anything.
To accompodate this possiblity the compiler’s transform takes the original values and puts them in its own CTE _vals
. It then generates a new insert statement targetting the backing store by converting _vals into two blobs – one for the key and one for the value. There is only the one place it needs to do this for any given insert
statement no matter now many items are inserted or how complex the insertion computation might be.
The compiler uses cql_blob_create
as a wrapper to that can expand to a user configured function with optional hash codes and mandatory field types. The default configuration that corresponds to this:
= bcreatekey]]
[[create_key = bcreateval]] [[create_val
The final SQL for an insert operation looks like this:
WITH
id, name, bias) AS (
_vals (VALUES(1, "n001", 1.2), (2, "n002", 3.7)
)INSERT INTO backing(k, v) SELECT
2105552408096159860, V.id, 1), -- type 1 is integer, offset implied
bcreatekey(2105552408096159860,
bcreateval(-6639502068221071091, V.name, 4, -- hash as before, type 4 is text,
-3826945563932272602, V.bias, 3) -- hash as before, type 3 is real,
FROM _vals AS V
As can be seen, both blobs have the same overall type code (2105552408096159860) as in the select case. The key blob is configured for to use offsets and the argument positions give the implied offset. In contrast the value blob is using hash codes (offset
was not specified). This configuration is typical.
A more complex insert works just as well:
insert into backed
select id+10, name||'x', bias+3 from backed where id < 3;
The above insert statement is a bit of a mess. It’s taking some of the backed data and using that data to create new backed data. But the simple transform above works just as before. We add the needed backed
CTE for the select and create _vals
like before.
WITH
*) AS (CALL _backed()),
backed (id, name, bias) AS (
_vals (SELECT id + 10, name || 'x', bias + 3
FROM backed
WHERE id < 3
)INSERT INTO backing(k, v)
SELECT
id, backed.id),
cql_blob_create(backed, V.
cql_blob_create(backed, V.name, backed.name, V.bias, backed.bias)FROM _vals AS V;
Looking closely at the above we see a few things:
cql_blob_create
will expand as before (not shown)backed(*)
as usual_vals
once again just has the exact unchanged insert clauseinsert into backing(k, v)
part is identical, the same recipe always worksAgain, we begin with a simple example:
delete from backed where id = 7;
The compiler requires a transformation that is quite general and while this case is simple the where condition could be very complicated. Fortunately there is such a simple transform:
WITH
*) AS (CALL _backed())
backed (DELETE FROM backing
WHERE rowid IN (
SELECT rowid
FROM backed
WHERE id = 7
);
All the compiler has to do here is:
_backed
CTEWHERE
clause into a subordinate SELECT
that gives us the rowids to delete.With the backed table in scope, any WHERE
clause works. If other backed tables are mentioned, the compiler simply adds those as usual.
Below is a more complicated delete, it’s a bit crazy but illustrative:
delete from backed where
id in (select id from backed2 where name like '%x%');
So this is using rows in backed2
to decide which rows to deleted in backed
. The same simple transform works directly.
WITH
*) AS (CALL _backed2()),
backed2 (*) AS (CALL _backed())
backed (DELETE FROM backing
WHERE rowid IN (
SELECT rowid
FROM backed
WHERE id IN (
SELECT id FROM backed2 WHERE name LIKE '%x%'
) );
What happened:
WHERE
clause went directly into the body of the rowid selectbacked
was used as before but now we also need backed2
The delete pattern does not need any additional cql helpers beyond what we’ve already seen.
The update
statement is the most complicated of all the DML forms, it requires all the transforms from the previous statements plus one additional transform.
First, the compiler requires two more blob helpers that are configurable. By default they look like this:
= bupdatekey]]
[[update_key = bupdateval]]; [[update_val
These are used to replace particular columns in a stored blob. Now let’s start with a very simple update to see now it all works:
update backed set name = 'foo' where id = 5;
Fundamentally we need to do these things:
where
clauseApplying all of the above, we get a transform like the following:
WITH
*) AS (CALL _backed())
backed (UPDATE backing
SET v = cql_blob_update(v, 'foo', backed.name)
WHERE rowid IN (SELECT rowid
FROM backed
WHERE id = 5);
Looking into the details:
backed
rowsWHERE
clause moved into a WHERE rowid
sub-select just like in the DELETE
casek
and v
very much like the INSERT
case, except we used an update helper that takes the current blob and creates a new blob to store
The above gives a update statement that is almost working. The remaining problem is that it is possible to use the existing column values in the update expressions and there is no way to use our backed
CTE to get those values in that context since the final update has to be all relative to the backing table.
Let’s look at another example to illustrate this last wrinkle:
update backed set name = name || 'y' where bias < 5;
This update basically adds the letter ‘y’ to the name of some rows. This is a silly example but this kind of thing happens in many contexts that are definitely not silly. To make these cases work the reference to name
inside of the set expression has to change. We end up with something like this:
WITH
*) AS (CALL _backed())
backed (UPDATE backing
SET v = cql_blob_update(v,
|| 'y',
cql_blob_get(v, backed.name)
backed.name)WHERE rowid IN (SELECT rowid
FROM backed
WHERE bias < 5);
Importantly the reference to name
in the set expression was changed to cql_blob_get(v, backed.name)
– extracting the name from the value blob. After which it is appended with ‘y’ as usual.
The rest of the pattern is just as it was after the first attempt above, in fact literally everything else is unchanged. It’s easy to see that the WHERE
clause could be arbitrarily complex with no difficulty.
NOTE: Since the
UPDATE
statement has noFROM
clause only the fields in the target table might need to be rewritten, so in this casename
,id
, andbias
were possible but onlyname
was mentioned.
After the cql_blob_get
and cql_blob_update
are expanded the result looks like this:
WITH
rowid, id, name, bias) AS (
backed (SELECT
rowid,
0),
bgetkey(T.k, -6639502068221071091L),
bgetval(T.v, -3826945563932272602L)
bgetval(T.v, FROM backing AS T
WHERE bgetkey_type(T.k) = 2105552408096159860L
)UPDATE backing
SET v =
bupdateval(
v,-6639502068221071091L, bgetval(v, -6639502068221071091L) || 'y', 4
)WHERE rowid IN (SELECT rowid
FROM backed
WHERE bias < 5);
NOTE: The blob update function for the value blob requires the original blob, the hash or offset to update, the new value, and the type of the new value. The blob update function for the key blob is the nearly same (blob, hash/offset, value) but the type is not required since the key blob necessarily has all the fields present because they are necessarily not null. Therefore the type codes are already all present and so the type of every column is known. The value blob might be missing nullable values hence their type might not be stored/known.
Normally backed tables are used without having to know the details of the transforms and the particulars of how each of the helper UDFs is invoked but for illustration purposes we can make another small example that shows a few more variations that might be created. In this examples keys and values need to be mutated.
=backing]]
[[backed_bycreate table sample(
name text,long,
state long,
prev_state primary key(name, state)
);
This update mixes all kinds of values around…
update sample
set state = state + 1, prev_state = state
where name = 'foo';
And the final output will be:
WITH
sample (rowid, name, state, prev_state) AS (
SELECT
rowid,
0),
bgetkey(T.k, 1),
bgetkey(T.k, -4464241499905806900)
bgetval(T.v, FROM backing AS T
WHERE bgetkey_type(T.k) = 3397981749045545394
)SET
= bupdatekey(k, bgetkey(k, 1) + 1, 1),
k = bupdateval(v, -4464241499905806900, bgetkey(k, 1), 2)
v WHERE rowid IN (SELECT rowid
FROM sample
WHERE name = 'foo');
As expected the bupdatekey
call gets the column offset (1) but not the type code (2). bupdateval
gets a hash code and a type.
If you want to refer to your blob functions in your own code, such as for indices you’ll also need to do something like this:
declare select function bgetkey_type(b blob) long;
[[deterministic]] declare select function bgetval_type(b blob) long;
[[deterministic]] declare select function bgetkey(b blob, iarg integer) long; -- polymorphic
[[deterministic]] declare select function bgetval(b blob, iarg integer) long; -- polymorphic
[[deterministic]] declare select function bcreateval no check blob;
[[deterministic]] declare select function bcreatekey no check blob;
[[deterministic]] declare select function bupdateval no check blob;
[[deterministic]] declare select function bupdatekey no check blob; [[deterministic]]
bgetval
and bgetkey
are not readily declarable generally because their result is polymorphic so it’s preferable to use cql_blob_get
like the compiler does (see examples above) which then does the rewrite for you. But it is helpful to have a UDF declaration for each of the above, especially if you want the --rt query_plan
output to work seamlessly. Typically bgetval
or bgetval
would only be needed in the context of a create index
statement and cql_blob_get
can be used instead in that case.
NOTE: it’s possible to rewrite
CREATE INDEX
on a backed table into and index on the backing table but this has not yet been implemented.
This example has the whole set.
-- this makes it a backing table
[[backing_table]] = bgetkey_type]] -- the function to get the type out of the key blob
[[get_type = bgetkey]] -- the function to get a column out of the key blob
[[get_key = bgetval]] -- the function to get a column out of the value blob
[[get_val = bcreatekey]] -- the function to create a key blob
[[create_key = bcreateval]] -- the function to create a value blob
[[create_val = bupdatekey]] -- the function to update a key blob
[[update_key = bupdateval]] -- the function to update a value blob
[[update_val -- if present key functions use a column code, by default key funcs use column offsets
[[use_key_codes]] -- if present value functions use a column offset, by default value funcs use column codes
[[use_val_offsets]] -- ignores the above and targets json_* instead (don't combine it)
[[json]] -- ignores the above and targets jsonb_* instead (don't combine it)
[[jsonb]] create table foo(
blob primary key,
k blob
v );
This feature was removed from the compiler on 2024/12/28
The general idea of “vaulted” values was that some fields in your database might contain sensitive data and you might not want to make this data visible to all the code in your whole project. The data could be sensitive for a number of reasons, maybe it’s PII, or maybe it’s confidential for some other reason. The idea here was that you can’t very well limit access to this data in a lower layer like CQL can provide because it’s likely that the functions accessing the data at this layer are in the business of processing this data. However, you can make it so that an additional step is needed “decode” the data so that it’s apparent where such data is being used and with some kind of “key” you could make it so that code that has no access to the keys cannot possibly display sensitive data. This isn’t foolproof but it is a good line of defense for privacy.
However, nobody was actually using this feature and indeed if you want to do this the best way is not to rely on the CQL code generation directly but rather to wrap results just like we do for say JNI access. Indeed the python used to do JNI wrappers could easily be modified to encode sensitive fields directly in that API. This is true for any access API, even the native one. Doing it with the CQL JSON and your own python gives you flexibility to do your encoding your way. It’s best left out of the hands of the lowest level codegen. Again just like the JNI choice, or the dotnet choice.
So, we deleted a ton of code in favor of a hopefully much smaller amount of python for those who want this feature (which seems to be nobody at the moment).
Note: We may add an example python vaulter in the future.
Pre-processing features are a recent introduction to the CQL language; previously any pre-processing functionality was provided by running the C Pre-Processor over the input file before processing. The practice of using cc -E
or the equivalent was deprecated because:
cc -E
often gives specious warningsTo address these problems CQL introduced pre-processing features including structured macros. That is, macros that describe the sort of thing they intend to produce and the kinds of things they consume. This allows for reasonable syntax and type checking and much better error reporting. To this we also add @include
to import code and @ifdef
/@ifndef
for conditionals.
Users can “define” conditional compilation switches using --defines x y z
on the command line. Additionally, if --rt foo
is specified then __rt__foo
will be defined.
NOTE: that if
-cg
is specified but no--rt
is specified then the default is--rt c
and so__rt__c
will be defined.
The syntax for conditional compilation is the familiar:
@ifdef name
... this code will be processed if "name" is defined
@else
... this code will be processed if "name" is not defined
@endif
The @else
is optional and the construct nests as one would expect. @ifndef
is also available and simply reverses the sense of the condition.
In order to avoid “rude” macro patterns with @ifdef
in the middle of SQL or expressions, this construct is itself a statement in the grammar. In order to create conditional expression pieces or other internal pieces you should use one of the macro types and define it two or more ways. These design choices were made to avoid the weird token pasting that inevitably results if pre-processing is allowed everywhere. Conditional cte_tables
, select_core
and even just expr
are highly flexible and give clear composition in the code with no weird syntax.
Note that in CQL even the code that is conditionally compiled out must at least parse correctly. Semantic analysis does not run, and indeed often there would be conflicts if it did, but the code must at least be correct enough to parse.
Conditionally choosing one of several macro implementations for use later in the code is a very powerful way to get conditionality throughout your code cleanly. @ifdef
can only appear inside of statement list (stmt_list
) macros becasue @ifdef
is a statement so it can’t appear in expressions and query fragments. Hence the most powerful pattern is:
@ifdef something
...) foo! begin choice1 end;
@macro(@else
...) foo! begin choice2 end;
@macro(@endif
-- foo! is now conditionally defined
Pulling in common headers is also supported. The syntax is
@include "any/path/foo.sql"
In addition to the current directory any paths specified with --include_paths x y z
will be checked.
Like the @ifdef
forms, @include
can only appear at the statement level, so it cannot be used to do exotic token pasting like often happens with #include
. Furthermore, it must appear at the top of files, so it’s a lot more like the import features of other languages than it is like the C pre-Processor token stream. Once normal statements begin further includes are not possible, each file, gets an include section and a statements section. Note that @ifdef
and include do not compose. Again, @include
is more like an import. If you need conditionals the included item should conditionally produce declarations and possibly macros. This means that file dependencies are consistent regardless of conditionals.
A typical macro declaration might look like this:
@MACRO(STMT_LIST) assert!(exp! expr)
begin
if not exp! then
call printf("assertion failed\n");
end if;
end;
-- Usage
< bar); assert!(foo
This example is a macro that produces a statement list (stmt_list
), so it can be used in the places where a statement list can appear. Every macro definition specifies the nature of the thing it produces, which limits the places that it can appear.
The nature of macros means that while you may get an error for using the wrong macro type in the wrong location, you cannot get syntax errors due to the replacement. Of course semantic errors are still possible, so for instance maybe the macro references a table that doesn’t exist or maybe the table doesn’t have certain necessary columns. Such errors are possible, but the macro is sure to expand correctly.
Any errors are reported on the lines of the macro not where the macro is used.
The example in the introduction is a macro that produces a statement list. It can be used anywhere a statement list would be valid. The full list of macro types is as follows:
Type | Notes |
---|---|
cte_tables | part of the contents of a WITH expression |
expr | any expression |
query_parts | something that goes in a FROM clause |
select_core | one or more select statement that can be unioned |
select_expr | one or more select named expressions |
stmt_list | one more statements |
Here are examples that illustrate the various macro types, in alphabetical order with examples. The names of the macro types are the same as the same structure in the grammar so the definition is easy to spot.
cte_tables
@macro(cte_tables) foo!()
begin
as (select 1, 2),
x(a,b) as (select 3, 4)
y(d,e) end;
-- all or part of the cte tables in the with clause
with foo!() select * from x join y;
expr
@macro(expr) pi!()
begin
3.14159
end;
Macro arguments can have the same types as macros themselves and expressions are a common choice as we saw in the assert macro.
@macro(expr) max!(a! expr, b! expr)
begin
case when a! > b! then a! else b! end
end;
max!(not 3, 1=2);
-- this generates
CASE WHEN (NOT 3) > (1 = 2) THEN NOT 3
ELSE 1 = 2
END;
Order of operations isn’t a problem with CQL macros, no extra parentheses are needed for arguments and so forth since the macro drops directly into the syntax tree. This means that in the above even though NOT
is lower precedence than >
, the correct expression is generated with no extra effort for the coder. If the expanded syntax tree is rendered as text with --echo
and --exp
like in the above, any necessary parentheses are added, but the tree is always the right shape for the arguments.
query_parts
A query part macro generates “something you could put in a from
clause”. It could be the whole from
clause or it could be one or more of the joined tables.
@macro(query_parts) foo!(x! expr)
begin
inner join bar on foo.a == bar.a and foo.x == x!
foo end;
select * from foo!(y);
-- this generates
SELECT *
FROM foo
INNER JOIN bar ON foo.a = bar.a AND foo.x = y;
select_core
A select core macro generates “something you could union”. It’s the part of a select statement that comes before order by
. If you’re trying to make a macro that assembles parts of a set of results which are then unioned and ordered, this is what you need.
@macro(select_core) foo!()
begin
select x, y from T
union all
select x, y from U
end;
foo!()union all
select x, y from V
order by x;
-- this generates
SELECT x, y FROM T
UNION ALL
SELECT x, y FROM U
UNION ALL
SELECT x, y FROM V
ORDER BY x;
A select_core
macro can be a useful way to specify a set of tables and values while leaving filtering and sorting open for customization.
select_expr
A select expression macro can let you codify certain common columns and alises that might might want to select. Such as:
@macro(select_expr) foo!()
begin
+ T1.y as A, T2.u / T2.v * 100 as pct
T1.x end;
select foo!() from X as T1 join Y as T2 on T1.id = T2.id;
--- this generates
SELECT T1.x + T1.y AS A, T2.u / T2.v * 100 AS pct
FROM X AS T1
INNER JOIN Y AS T2 ON T1.id = T2.id;
If certain column extractions are common you can easily make a macro that lets you pull out the columns you want. This can be readily generalized. This becomes very useful when it’s normal to extract (e.g.) the same 20 columns from various queries.
@MACRO(SELECT_EXPR) foo!(t1! EXPR, t2! EXPR)
BEGIN
+ t1!.y AS A, t2!.u / t2!.v * 100 AS pct
t1!.x END;
select foo!(X, Y) from X join Y on X.id = Y.id;
-- this generates
SELECT X.x + X.y AS A, Y.u / Y.v * 100 AS pct
FROM X
INNER JOIN Y ON X.id = Y.id;
In this second case we have provided the table names as arguments rather than hard coding them.
stmt_list
We began with a statement list macro before many of the concepts had been introduced. Let’s revisit where we started.
@MACRO(STMT_LIST) assert!(exp! expr)
begin
if not exp! then
call printf("assertion failed\n");
end if;
end;
1 == 1);
assert!(
-- this generates
IF NOT 1 = 1 THEN
CALL printf("assertion failed\n");
END IF;
Recall that in SQL order of operations NOT
is very weak. This is in contrast to many other languages where !
binds quite strongly. But as it happens we don’t have to care. The expression would have been evaluated correctly regardless of the binding strength of what surrounds the macro because the replacement is in the AST not in the text.
This rounds out all of the macro types.
In order to avoid language ambiguity and to allow macro fragments like a cte_table
in unusual locations. The code must specify the type of the macro argument. Expressions are the defaul type, the others use a function-like syntax to do the job.
Type | Syntax |
---|---|
cte_tables | with( x(*) as (select 1 x, 2 y)) |
expr | no keyword required just foo!(x) |
query_parts | from(U join V) |
select_core | rows(select * from U union all select * from V) |
select_expr | select(1 x, 2 y) |
stmt_list | begin statement1; statement2; end |
With these forms the type of macro argument is unambiguous and can be immediately checked against the macro requirements.
Note that when using a
select_core
macro or macro argument in source it is necessary to doROWS(name!)
. This is an unfortunate but unavoidable concession to the grammar tools.
Note that none of the macro args requires qualifications when used in a macro argument context because they can always be type checked later, therefore foo!(a!, b!, c!) always works. The other macro types require their wrappings to have clean grammar.
And example with all of the types:
@macro(stmt_list) mondo1!(
a! expr,
b! query_parts,
c! select_core,
d! select_expr,
e! cte_tables,
f! stmt_list)begin
-- macros can be used without qualification in @ID and @TEXT
set zz := @text(a!, b!, c!, d!, e!, f!);
end;
@macro(stmt_list) mondo2!(
a! expr,
b! query_parts,
c! select_core,
d! select_expr,
e! cte_tables,
f! stmt_list)begin
-- arguments can be forwarded unambigously
mondo1!(a!, b!, c!, d!, e!, f!);if a! then -- an expression (the most common)
-- a statement list (next most common)
f!; else
-- these are the parts of a query that you might
-- want to macroize
with e! -- cte tables
select d! -- select expressions
from b! -- query parts
union all
rows(c!); -- select core
end if;
end;
-- and this is how you encode each type
mondo2!(1+2,
from(x join y),
rows(select 1 from foo union select 2 from bar),
select(20 xx),
with(f(*) as (select 99 from yy)),
begin let qq := 201; end
);
Variable | Notes |
---|---|
@LINE |
The current line number |
@MACRO_LINE |
The line where macro expansion began |
@MACRO_FILE |
The file where macro expansion began |
@MACRO_LINE
and @MACRO_FILE
are useful for providing error information that refer to the source file that used the macro rather than the macro itself. Like in an assert
macro. @LINE
is useful to report problems in the macro itself, like an invariant.
We can make the assert macro better still:
@MACRO(STMT_LIST) assert!(exp! expr)
begin
if not exp! then
call printf("%s:%d assertion failed: %s\n",
exp!));
@MACRO_FILE, @MACRO_LINE, @TEXT(end if;
end;
1 == 1);
assert!(
-- this generates
IF NOT 1 = 1 THEN
CALL printf("%s:%d assertion failed: %s\n", 'myfile.sql', 9, "1 = 1");
END IF;
Note that here the macro was invoked on line 9 of myfile.sql. @LINE
would have been much less useful, reporting the line of the printf
.
You can create new tokens in one of three ways:
@TEXT
will create a string from one or more parts.@ID
will make a new identifier from one or more parts
@TMP
will make an identifier just like @ID
but it will include a tmp_nnn
prefix where nnn
is unique to the macro expansion it appears in. This lets you safely create temporary variables for use in a macro body without fear of conflict with other expansions of the same macro. @TMP(_foo)
creates tmp_nnn_foo
so you can make as many such temporaries as you need. @TMP
is otherwise identical to @ID
.This can be used to create very powerful code-helpers. Consider this code, which is very similar to the actual test helpers in the compiler’s test cases.
@MACRO(stmt_list) EXPECT!(pred! expr)
begin
if not pred! then
throw;end if;
end;
@MACRO(stmt_list) TEST!(name! expr, body! stmt_list)
begin
create procedure @ID("test_", name!)()
begin
trybody!;
catchcall printf("Test failed %s\n", @TEXT(name!));
throw;end;
end;
end;
TEST!(try_something,BEGIN
1 == 1);
EXPECT!(END);
--- This generates:
CREATE PROC test_try_something ()
BEGIN
TRYIF NOT 1 = 1 THEN
THROW;END IF;
CATCHCALL printf("Test failed %s\n", "try_something");
THROW;END;
END;
And of course additional diagnostics can be readily added (and they are present in the real code). For instance all of the tricks used in the assert!
macro would be helpful in the expect!
macro.
There is no macro form that can stand in for a type name. However, identifiers are legal types and so @ID(...)
is an excellent construct for creating type names from expressions like strings or identifiers. In general, @ID(...)
can allow you to use an expression macro where an expression is not legal but a name is.
For instance:
@macro(stmt_list) make_var!(x! expr, t! expr)
begin
declare @id(t!,"_var") @id(t!);
set @id(t!,"_var") := x!;
end;
-- declares real_var as an real and stores 1+5 in it
1+5, "real");
make_var!(
--- this generates
DECLARE real_var real;
SET real_var := 1 + 5;
Since @id(...)
can go anywhere an identifier can go, it is not only suitable for use for type names but also for procedure names, table names – any identifer. As mentioned above @id
will generate an error if the expression does not make a legal normal identifier.
Sometimes type can be inferred, and indeed it might make sense to infer it. For instance part of a test macro might look like this:
-- this uses @ID (via @TMP) and @TEXT for errors
@macro(stmt_list) expect_equal!(x! expr, y! expr)
begin
let @tmp(x) := x!; -- evaluate only once
let @tmp(y) := y!; -- evaluate only once
if @tmp(x) IS MOT !tmp(y) then
printf("%s IS NOT %s\nleft: %s\nright:%s", @TEXT(x!), @TEXT(y!), @tmp(x), @tmp(y));
end;
end;
The pipeline syntax expr:macro_name!(args...)
can be used instead of macro_name!(expr, args)
. This allows macros to appear in expressions written in the fluent style. Note that args
may be empty in which case the form can be written expr:macro_name!()
or the even shorter expr:macro_name!
both of which become macro_name!(expr)
"%s", x:fmt!); printf(
CQL has a variety of command line (CLI) options but many of them are only interesting for cql development. Nonetheless this is a comprehensive list:
NOTE: CQL is often used after the c pre-processor is run so this kind of invocation is typical:
cc -E -x c foo.sql | cql [args]
Example:
cql --in test.sql
Example:
cql --in sem_test.sql --sem
cql --in sem_test.sql --sem --ast >sem_ast.out
Example
cql --in test.sql --echo >test.out # test.out is "equivalent" to test.sql
Example:
cql --dot --in dottest.sql
Example:
cql --in foo.sql --cg foo.h foo.c
Example:
cql --in test.sql --nolines --cg foo.h foo.c
NOTE: different result types require a different number of output files with different meanings
--test
is included--echo
this is kind of like cc -E
it can give you a view of what happened after the pre-processing--ast
it is a useful test tool (its primary function)These are the various outputs the compiler can produce.
#include "foo.h"
into the C output instead of #include "cqlrt.h"
foo_exports.sql
includes procedure declarations for the contents of foo.sql
foo.h
from foo.c
--rt schema_upgrade
or --rt schema
--rt schema_upgrade
What follows is taken from a grammar snapshot with the tree building rules removed. It should give a fair sense of the syntax of CQL (but not semantic validation).
These are in order of priority lowest to highest
"UNION ALL" "UNION" "INTERSECT" "EXCEPT"
":=" "+=" "-=" "*=" "/=" "%=" "|=" "&=" "<<=" ">>="
"OR"
"AND"
"NOT"
"BETWEEN" "NOT BETWEEN" "<>" "!=" '=' "==" "LIKE" "NOT LIKE" "GLOB" "NOT GLOB" "MATCH" "NOT MATCH" "REGEXP" "NOT REGEXP" "IN" "NOT IN" "IS NOT" "IS" "IS TRUE" "IS FALSE" "IS NOT TRUE" "IS NOT FALSE"
"ISNULL" "NOTNULL"
'<' '>' ">=" "<="
"<<" ">>" '&' '|'
'+' '-'
'*' '/' '%'
"||" "->" "->>" ':' '.' '[' '~'
"COLLATE"
"UMINUS"
"@INCLUDE quoted-filename" "end of included file"
"BEGIN"
"FOR"
NOTE: The above varies considerably from the C binding order!!!
Literals:
ID /* a name */
STRLIT /* a string literal in SQL format e.g. 'it''s sql' */
CSTRLIT /* a string literal in C format e.g. "hello, world\n" */
BLOBLIT /* a blob literal in SQL format e.g. x'12ab' */
INTLIT /* integer literal */
LONGLIT /* long integer literal */
REALLIT /* floating point literal */
"@ATTRIBUTE" "@BEGIN_SCHEMA_REGION" "@COLUMNS" "@CREATE"
"@DECLARE_DEPLOYABLE_REGION" "@DECLARE_SCHEMA_REGION"
"@DELETE" "@DUMMY_SEED" "@ECHO" "@ELSE" "@EMIT_CONSTANTS"
"@EMIT_ENUMS" "@EMIT_GROUP" "@ENDIF" "@END_SCHEMA_REGION"
"@ENFORCE_NORMAL" "@ENFORCE_POP" "@ENFORCE_PUSH"
"@ENFORCE_RESET" "@ENFORCE_STRICT" "@EPONYMOUS" "@FILE"
"@ID" "@IFDEF" "@IFNDEF" "@KEEP_TABLE_NAME_IN_ALIASES"
"@LINE" "@MACRO" "@MACRO_FILE" "@MACRO_LINE" "@OP"
"@PREVIOUS_SCHEMA" "@PROC" "@RC" "@RECREATE"
"@SCHEMA_AD_HOC_MIGRATION" "@SCHEMA_UPGRADE_SCRIPT"
"@SCHEMA_UPGRADE_VERSION" "@SENSITIVE" "@TEXT" "@TMP"
"@UNSUB" "ABORT" "ACTION" "ADD" "AFTER" "ALL" "ALTER"
"ARGUMENTS" "AS" "ASC" "AUTOINCREMENT" "BEFORE" "BLOB" "BY"
"CALL" "CASCADE" "CASE" "CAST" "CATCH" "CHECK" "CLOSE"
"COLUMN" "COMMIT" "CONST" "CONSTRAINT" "CONTEXT COLUMN"
"CONTEXT TYPE" "CONTINUE" "CREATE" "CROSS" "CTE_TABLES"
"CURRENT ROW" "CURSOR HAS ROW" "CURSOR" "DECLARE" "DEFAULT"
"DEFERRABLE" "DEFERRED" "DELETE" "DESC" "DISTINCT"
"DISTINCTROW" "DO" "DROP" "ELSE IF" "ELSE" "ENCODE" "END"
"ENUM" "EXCLUDE CURRENT ROW" "EXCLUDE GROUP" "EXCLUDE NO
OTHERS" "EXCLUDE TIES" "EXCLUSIVE" "EXISTS" "EXPLAIN"
"EXPR" "FAIL" "FETCH" "FILTER" "FIRST" "FOLLOWING" "FOR
EACH ROW" "FOREIGN" "FROM" "FUNC" "FUNCTION" "GROUP"
"GROUPS" "HAVING" "HIDDEN" "IF" "IGNORE" "IMMEDIATE"
"INDEX" "INITIALLY" "INNER" "INOUT" "INSERT" "INSTEAD"
"INT" "INTEGER" "INTERFACE" "INTO" "JOIN" "KEY" "LAST"
"LEAVE" "LEFT" "LET" "LIMIT" "LONG" "LONG_INT"
"LONG_INTEGER" "LOOP" "NO" "NOT DEFERRABLE" "NOTHING"
"NULL" "NULLS" "OBJECT" "OF" "OFFSET" "ON CONFLICT" "ON"
"OPEN" "ORDER" "OUT" "OUTER" "OVER" "PARTITION" "PRECEDING"
"PRIMARY" "PRIVATE" "PROC" "PROCEDURE" "QUERY PLAN"
"QUERY_PARTS" "RAISE" "RANGE" "REAL" "RECURSIVE"
"REFERENCES" "RELEASE" "RENAME" "REPLACE" "RESTRICT"
"RETURN" "RETURNING" "RIGHT" "ROLLBACK" "ROWID" "ROWS"
"SAVEPOINT" "SELECT" "SELECT_CORE" "SELECT_EXPR" "SET"
"SIGN FUNCTION" "STATEMENT" "STMT_LIST" "SWITCH" "TABLE"
"TEMP" "TEXT" "THEN" "THROW" "TO" "TRANSACTION" "TRIGGER"
"TRY" "TYPE" "TYPE_CHECK" "UNBOUNDED" "UNIQUE" "UPDATE"
"UPSERT" "USING" "VALUES" "VAR" "VIEW" "VIRTUAL" "WHEN"
"WHERE" "WHILE" "WINDOW" "WITH" "WITHOUT"
Note that in many cases the grammar is more generous than the overall language and errors have to be checked on top of this, often this is done on purpose because even when it’s possible it might be very inconvenient to do checks with syntax. For example the grammar cannot enforce non-duplicate ids in id lists, but it could enforce non-duplicate attributes in attribute lists. It chooses to do neither as they are easily done with semantic validation. Thus the grammar is not the final authority on what constitutes a valid program but it’s a good start.
program: top_level_stmts
;
top_level_stmts:
/* nil */
| include_stmts
| stmt_list
| include_stmts stmt_list
;
include_section: "@INCLUDE quoted-filename" top_level_stmts "end of included file"
;
include_stmts:
include_section
| include_section include_stmts
;
opt_stmt_list:
/* nil */
| stmt_list
;
macro_ref:
"ID" '!'
| "ID" '!' '(' opt_macro_args ')'
;
stmt_list:
stmt
| stmt_list stmt
;
stmt:
misc_attrs any_stmt ';'
| ifdef_stmt
| ifndef_stmt
;
expr_stmt: expr
;
any_stmt:
alter_table_add_column_stmt
| expr_stmt
| begin_schema_region_stmt
| begin_trans_stmt
| call_stmt
| close_stmt
| commit_return_stmt
| commit_trans_stmt
| const_stmt
| continue_stmt
| create_index_stmt
| create_proc_stmt
| create_table_stmt
| create_trigger_stmt
| create_view_stmt
| create_virtual_table_stmt
| declare_deployable_region_stmt
| declare_enum_stmt
| declare_const_stmt
| declare_group_stmt
| declare_func_stmt
| declare_select_func_stmt
| declare_out_call_stmt
| declare_proc_no_check_stmt
| declare_proc_stmt
| declare_interface_stmt
| declare_schema_region_stmt
| declare_vars_stmt
| declare_forward_read_cursor_stmt
| declare_fetched_value_cursor_stmt
| declare_type_stmt
| delete_stmt
| drop_index_stmt
| drop_table_stmt
| drop_trigger_stmt
| drop_view_stmt
| echo_stmt
| emit_enums_stmt
| emit_group_stmt
| emit_constants_stmt
| end_schema_region_stmt
| enforce_normal_stmt
| enforce_pop_stmt
| enforce_push_stmt
| enforce_reset_stmt
| enforce_strict_stmt
| explain_stmt
| select_nothing_stmt
| fetch_call_stmt
| fetch_stmt
| fetch_values_stmt
| guard_stmt
| if_stmt
| insert_stmt
| leave_stmt
| let_stmt
| loop_stmt
| macro_def_stmt
| op_stmt
| out_stmt
| out_union_stmt
| out_union_parent_child_stmt
| previous_schema_stmt
| proc_savepoint_stmt
| release_savepoint_stmt
| return_stmt
| rollback_return_stmt
| rollback_trans_stmt
| savepoint_stmt
| select_stmt
| schema_ad_hoc_migration_stmt
| schema_unsub_stmt
| schema_upgrade_script_stmt
| schema_upgrade_version_stmt
| set_stmt
| switch_stmt
| throw_stmt
| trycatch_stmt
| update_cursor_stmt
| update_stmt
| upsert_stmt
| while_stmt
| for_stmt
| keep_table_name_in_aliases_stmt
;
explain_stmt:
"EXPLAIN" opt_query_plan explain_target
;
opt_query_plan:
/* nil */
| "QUERY PLAN"
;
explain_target: select_stmt
| begin_trans_stmt
| commit_trans_stmt
| delete_stmt
| drop_index_stmt
| drop_table_stmt
| drop_trigger_stmt
| drop_view_stmt
| insert_stmt
| update_stmt
| upsert_stmt
;
previous_schema_stmt:
"@PREVIOUS_SCHEMA"
;
schema_upgrade_script_stmt:
"@SCHEMA_UPGRADE_SCRIPT"
;
schema_upgrade_version_stmt:
"@SCHEMA_UPGRADE_VERSION" '(' "integer-literal" ')'
;
set_stmt:
"SET" sql_name ":=" expr
| "SET" sql_name "FROM" "CURSOR" name
| "SET" sql_name '[' arg_list ']' ":=" expr
;
let_stmt:
"LET" sql_name ":=" expr
;
const_stmt:
"CONST" sql_name ":=" expr
;
version_attrs_opt_recreate:
/* nil */
| "@RECREATE" opt_delete_plain_attr
| "@RECREATE" '(' name ')' opt_delete_plain_attr
| version_attrs
;
opt_delete_plain_attr:
/* nil */
| "@DELETE"
;
opt_version_attrs:
/* nil */
| version_attrs
;
version_attrs:
"@CREATE" version_annotation opt_version_attrs
| "@DELETE" version_annotation opt_version_attrs
;
opt_delete_version_attr:
/* nil */
| "@DELETE" version_annotation
;
drop_table_stmt:
"DROP" "TABLE" "IF" "EXISTS" sql_name
| "DROP" "TABLE" sql_name
;
drop_view_stmt:
"DROP" "VIEW" "IF" "EXISTS" sql_name
| "DROP" "VIEW" sql_name
;
drop_index_stmt:
"DROP" "INDEX" "IF" "EXISTS" sql_name
| "DROP" "INDEX" sql_name
;
drop_trigger_stmt:
"DROP" "TRIGGER" "IF" "EXISTS" sql_name
| "DROP" "TRIGGER" sql_name
;
create_virtual_table_stmt: "CREATE" "VIRTUAL" "TABLE" opt_vtab_flags sql_name
"USING" name opt_module_args
"AS" '(' col_key_list ')' opt_delete_version_attr ;
opt_module_args: /* nil */
| '(' misc_attr_value_list ')'
| '(' "ARGUMENTS" "FOLLOWING" ')'
;
create_table_prefix_opt_temp:
"CREATE" opt_temp "TABLE" ;
create_table_stmt:
create_table_prefix_opt_temp opt_if_not_exists sql_name '(' col_key_list ')' opt_no_rowid version_attrs_opt_recreate
;
opt_temp:
/* nil */
| "TEMP"
;
opt_if_not_exists:
/* nil */
| "IF" "NOT" "EXISTS"
;
opt_no_rowid:
/* nil */
| "WITHOUT" "ROWID"
;
opt_vtab_flags:
/* nil */
| "IF" "NOT" "EXISTS"
| "@EPONYMOUS"
| "@EPONYMOUS" "IF" "NOT" "EXISTS"
| "IF" "NOT" "EXISTS" "@EPONYMOUS"
;
col_key_list:
col_key_def
| col_key_def ',' col_key_list
;
col_key_def:
col_def
| pk_def
| fk_def
| unq_def
| check_def
| shape_def
;
check_def:
"CONSTRAINT" name "CHECK" '(' expr ')'
| "CHECK" '(' expr ')'
;
shape_exprs:
shape_expr ',' shape_exprs
| shape_expr
;
shape_expr:
sql_name
| '-' sql_name
;
shape_def:
shape_def_base
| shape_def_base '(' shape_exprs ')'
;
shape_def_base:
"LIKE" sql_name
| "LIKE" name "ARGUMENTS"
;
sql_name:
name
| "`quoted identifier`"
;
misc_attr_key:
name
| name ':' name
;
cql_attr_key:
name
| name ':' name
;
misc_attr_value_list:
misc_attr_value
| misc_attr_value ',' misc_attr_value_list
;
misc_attr_value:
sql_name
| any_literal
| const_expr
| '(' misc_attr_value_list ')'
| '-' num_literal
| '+' num_literal
;
misc_attr:
"@ATTRIBUTE" '(' misc_attr_key ')'
| "@ATTRIBUTE" '(' misc_attr_key '=' misc_attr_value ')'
| '[' '[' cql_attr_key ']' ']'
| '[' '[' cql_attr_key '=' misc_attr_value ']' ']'
;
misc_attrs:
/* nil */
| misc_attr misc_attrs
;
col_def:
misc_attrs sql_name data_type_any col_attrs
;
pk_def:
"CONSTRAINT" sql_name "PRIMARY" "KEY" '(' indexed_columns ')' opt_conflict_clause
| "PRIMARY" "KEY" '(' indexed_columns ')' opt_conflict_clause
;
opt_conflict_clause:
/* nil */
| conflict_clause
;
conflict_clause:
"ON CONFLICT" "ROLLBACK"
| "ON CONFLICT" "ABORT"
| "ON CONFLICT" "FAIL"
| "ON CONFLICT" "IGNORE"
| "ON CONFLICT" "REPLACE"
;
opt_fk_options:
/* nil */
| fk_options
;
fk_options:
fk_on_options
| fk_deferred_options
| fk_on_options fk_deferred_options
;
fk_on_options:
"ON" "DELETE" fk_action
| "ON" "UPDATE" fk_action
| "ON" "UPDATE" fk_action "ON" "DELETE" fk_action
| "ON" "DELETE" fk_action "ON" "UPDATE" fk_action
;
fk_action:
"SET" "NULL"
| "SET" "DEFAULT"
| "CASCADE"
| "RESTRICT"
| "NO" "ACTION"
;
fk_deferred_options:
"DEFERRABLE" fk_initial_state
| "NOT DEFERRABLE" fk_initial_state
;
fk_initial_state:
/* nil */
| "INITIALLY" "DEFERRED"
| "INITIALLY" "IMMEDIATE"
;
fk_def:
"CONSTRAINT" sql_name "FOREIGN" "KEY" '(' sql_name_list ')' fk_target_options
| "FOREIGN" "KEY" '(' sql_name_list ')' fk_target_options
;
fk_target_options:
"REFERENCES" sql_name '(' sql_name_list ')' opt_fk_options
;
unq_def:
"CONSTRAINT" sql_name "UNIQUE" '(' indexed_columns ')' opt_conflict_clause
| "UNIQUE" '(' indexed_columns ')' opt_conflict_clause
;
opt_unique:
/* nil */
| "UNIQUE"
;
indexed_column:
expr opt_asc_desc
;
indexed_columns:
indexed_column
| indexed_column ',' indexed_columns
;
create_index_stmt:
"CREATE" opt_unique "INDEX" opt_if_not_exists sql_name "ON" sql_name '(' indexed_columns ')' opt_where opt_delete_version_attr
;
name:
"ID"
| "ABORT"
| "ACTION"
| "ADD"
| "AFTER"
| "ALTER"
| "ASC"
| "@ID" '(' text_args ')'
| "@TMP" '(' text_args ')'
| "AUTOINCREMENT"
| "BEFORE"
| "CASCADE"
| "COLUMN"
| "CREATE"
| "CTE_TABLES"
| "DEFAULT"
| "DEFERRABLE"
| "DEFERRED"
| "DELETE"
| "DESC"
| "DROP"
| "ENCODE"
| "EXCLUSIVE"
| "EXPLAIN"
| "EXPR"
| "FAIL"
| "FETCH"
| "FIRST"
| "FOLLOWING"
| "GROUPS"
| "HIDDEN"
| "IGNORE"
| "IMMEDIATE"
| "INDEX"
| "INITIALLY"
| "INSTEAD"
| "INTO"
| "KEY"
| "LAST"
| "NULLS"
| "OUTER"
| "PARTITION"
| "PRECEDING"
| "PRIVATE"
| "QUERY_PARTS"
| "RANGE"
| "REFERENCES"
| "RELEASE"
| "RENAME"
| "REPLACE"
| "RESTRICT"
| "ROWID"
| "SAVEPOINT"
| "SELECT_CORE"
| "SELECT_EXPR"
| "STATEMENT"
| "STMT_LIST"
| "TABLE"
| "TEMP"
| "TEXT"
| "TRANSACTION"
| "TRIGGER"
| "TYPE"
| "VIEW"
| "VIRTUAL"
| "WITHOUT"
;
loose_name:
name
| "CALL"
| "SET"
| "BOOL"
| "INT"
| "LONG"
| "REAL"
| "BLOB"
| "OBJECT"
| "RIGHT"
| "LEFT"
;
loose_name_or_type:
loose_name
| "ALL"
| loose_name '<' loose_name '>'
;
opt_sql_name:
/* nil */
| sql_name
;
name_list:
name
| name ',' name_list
;
sql_name_list:
sql_name
| sql_name ',' sql_name_list
;
opt_name_list:
/* nil */
| name_list
;
opt_sql_name_list:
/* nil */
| sql_name_list
;
cte_binding_list:
cte_binding
| cte_binding ',' cte_binding_list
;
cte_binding: name name
| name "AS" name
;
col_attrs:
/* nil */
| not_null opt_conflict_clause col_attrs
| "PRIMARY" "KEY" opt_conflict_clause col_attrs
| "PRIMARY" "KEY" opt_conflict_clause "AUTOINCREMENT" col_attrs
| "DEFAULT" '-' num_literal col_attrs
| "DEFAULT" '+' num_literal col_attrs
| "DEFAULT" num_literal col_attrs
| "DEFAULT" const_expr col_attrs
| "DEFAULT" str_literal col_attrs
| "COLLATE" name col_attrs
| "CHECK" '(' expr ')' col_attrs
| "UNIQUE" opt_conflict_clause col_attrs
| "HIDDEN" col_attrs
| "@SENSITIVE" col_attrs
| "@CREATE" version_annotation col_attrs
| "@DELETE" version_annotation col_attrs
| fk_target_options col_attrs
;
version_annotation:
'(' "integer-literal" ',' name ')'
| '(' "integer-literal" ',' name ':' name ')'
| '(' "integer-literal" ')'
;
opt_kind:
/* nil */
| '<' name '>'
;
data_type_numeric:
"INT" opt_kind
| "INTEGER" opt_kind
| "REAL" opt_kind
| "LONG" opt_kind
| "BOOL" opt_kind
| "LONG" "INTEGER" opt_kind
| "LONG" "INT" opt_kind
| "LONG_INT" opt_kind
| "LONG_INTEGER" opt_kind
;
data_type_any:
data_type_numeric
| "TEXT" opt_kind
| "BLOB" opt_kind
| "OBJECT" opt_kind
| "OBJECT" '<' name "CURSOR" '>'
| "OBJECT" '<' name "SET" '>'
| "ID"
| "@ID" '(' text_args ')'
;
not_null: "NOT" "NULL" | '!'
;
data_type_with_options:
data_type_any
| data_type_any not_null
| data_type_any "@SENSITIVE"
| data_type_any "@SENSITIVE" not_null
| data_type_any not_null "@SENSITIVE"
;
str_literal:
str_chain
;
str_chain:
str_leaf
| str_leaf str_chain
;
str_leaf:
"sql-string-literal"
| "c-string-literal"
;
num_literal:
"integer-literal"
| "long-literal"
| "real-literal"
| "TRUE"
| "FALSE"
;
const_expr:
"CONST" '(' expr ')'
;
any_literal:
str_literal
| num_literal
| "NULL"
| "@FILE" '(' str_literal ')'
| "@LINE"
| "@MACRO_LINE"
| "@MACRO_FILE"
| "@PROC"
| "@TEXT" '(' text_args ')'
| "sql-blob-literal"
;
text_args:
text_arg
| text_arg ',' text_args
;
text_arg : expr ;
raise_expr:
"RAISE" '(' "IGNORE" ')'
| "RAISE" '(' "ROLLBACK" ',' expr ')'
| "RAISE" '(' "ABORT" ',' expr ')'
| "RAISE" '(' "FAIL" ',' expr ')'
;
opt_distinct:
/* nil */
| "DISTINCT"
;
simple_call:
loose_name '(' opt_distinct arg_list ')' opt_filter_clause
| "GLOB" '(' opt_distinct arg_list ')' opt_filter_clause
| "LIKE" '(' opt_distinct arg_list ')' opt_filter_clause
;
call:
simple_call
| basic_expr ':' simple_call
| basic_expr ':' loose_name
| basic_expr ':' '(' arg_list ')'
| basic_expr ':' "ID" '!'
| basic_expr ':' "ID" '!' '(' opt_macro_args ')'
;
basic_expr:
name
| "`quoted identifier`"
| macro_ref
| '*'
| "@RC"
| basic_expr '.' sql_name
| basic_expr '.' '*'
| any_literal
| const_expr
| '(' expr ')'
| call
| window_func_inv
| raise_expr
| '(' select_stmt ')'
| '(' select_stmt "IF" "NOTHING" expr ')'
| '(' select_stmt "IF" "NOTHING" "OR" "NULL" expr ')'
| '(' select_stmt "IF" "NOTHING" "OR" "NULL" "THEN" expr ')'
| '(' select_stmt "IF" "NOTHING" "OR" "NULL" "THEN" "THROW" ')'
| '(' select_stmt "IF" "NOTHING" "OR" "NULL" "THROW" ')'
| '(' select_stmt "IF" "NOTHING" "THEN" expr ')'
| '(' select_stmt "IF" "NOTHING" "THEN" "THROW"')'
| '(' select_stmt "IF" "NOTHING" "THROW"')'
| "EXISTS" '(' select_stmt ')'
| "CASE" expr case_list "END"
| "CASE" expr case_list "ELSE" expr "END"
| "CASE" case_list "END"
| "CASE" case_list "ELSE" expr "END"
| "CAST" '(' expr "AS" data_type_any ')'
| "TYPE_CHECK" '(' expr "AS" data_type_with_options ')'
| basic_expr '[' arg_list ']'
| basic_expr '~' data_type_any '~'
| basic_expr "->" basic_expr
| basic_expr "->>" '~' data_type_any '~' basic_expr
;
math_expr:
basic_expr
| math_expr '&' math_expr
| math_expr '|' math_expr
| math_expr "<<" math_expr
| math_expr ">>" math_expr
| math_expr '+' math_expr
| math_expr '-' math_expr
| math_expr '*' math_expr
| math_expr '/' math_expr
| math_expr '%' math_expr
| math_expr "IS NOT TRUE"
| math_expr "IS NOT FALSE"
| math_expr "ISNULL"
| math_expr "NOTNULL"
| math_expr "IS TRUE"
| math_expr "IS FALSE"
| '-' math_expr
| '+' math_expr
| '~' math_expr
| "NOT" math_expr
| math_expr '=' math_expr
| math_expr "==" math_expr
| math_expr '<' math_expr
| math_expr '>' math_expr
| math_expr "<>" math_expr
| math_expr "!=" math_expr
| math_expr ">=" math_expr
| math_expr "<=" math_expr
| math_expr "NOT IN" '(' opt_expr_list ')'
| math_expr "NOT IN" '(' select_stmt ')'
| math_expr "IN" '(' opt_expr_list ')'
| math_expr "IN" '(' select_stmt ')'
| math_expr "LIKE" math_expr
| math_expr "NOT LIKE" math_expr
| math_expr "MATCH" math_expr
| math_expr "NOT MATCH" math_expr
| math_expr "REGEXP" math_expr
| math_expr "NOT REGEXP" math_expr
| math_expr "GLOB" math_expr
| math_expr "NOT GLOB" math_expr
| math_expr "BETWEEN" math_expr "AND" math_expr
| math_expr "NOT BETWEEN" math_expr "AND" math_expr
| math_expr "IS NOT" math_expr
| math_expr "IS" math_expr
| math_expr "||" math_expr
| math_expr "COLLATE" name
;
expr:
math_expr
| expr "AND" expr
| expr "OR" expr
| expr ":=" expr
| expr "+=" expr
| expr "-=" expr
| expr "/=" expr
| expr "*=" expr
| expr "%=" expr
| expr "&=" expr
| expr "|=" expr
| expr "<<=" expr
| expr ">>=" expr
;
case_list:
"WHEN" expr "THEN" expr
| "WHEN" expr "THEN" expr case_list
;
arg_expr: expr
| shape_arguments
;
arg_exprs:
arg_expr
| arg_expr ',' arg_exprs
;
arg_list:
/* nil */
| arg_exprs
;
opt_expr_list:
/* nil */
| expr_list
;
expr_list:
expr
| expr ',' expr_list
;
shape_arguments:
"FROM" name
| "FROM" name shape_def
| "FROM" "ARGUMENTS"
| "FROM" "ARGUMENTS" shape_def
;
column_calculation:
"@COLUMNS" '(' col_calcs ')'
| "@COLUMNS" '(' "DISTINCT" col_calcs ')'
;
col_calcs:
col_calc
| col_calc ',' col_calcs
;
col_calc:
sql_name
| shape_def
| sql_name shape_def
| sql_name '.' sql_name
;
cte_tables:
cte_table
| cte_table ',' cte_tables
;
cte_decl:
name '(' sql_name_list ')'
| name '(' '*' ')'
| name
;
shared_cte:
call_stmt
| call_stmt "USING" cte_binding_list
;
cte_table:
cte_decl "AS" '(' select_stmt ')'
| cte_decl "AS" '(' shared_cte')'
| '(' call_stmt ')'
| '(' call_stmt "USING" cte_binding_list ')'
| cte_decl "LIKE" '(' select_stmt ')'
| cte_decl "LIKE" sql_name
| macro_ref
;
with_prefix:
"WITH" cte_tables
| "WITH" "RECURSIVE" cte_tables
;
with_select_stmt:
with_prefix select_stmt_no_with
;
select_nothing_stmt:
"SELECT" "NOTHING"
;
select_stmt:
with_select_stmt
| select_stmt_no_with
;
select_stmt_no_with:
select_core_list opt_orderby opt_limit opt_offset
;
select_core_list:
select_core
| select_core compound_operator select_core_list
;
values:
'(' insert_list ')'
| '(' insert_list ')' ',' values
;
select_core:
"SELECT" select_opts select_expr_list opt_from_query_parts opt_where opt_groupby opt_having opt_select_window
| "ROWS" '(' macro_ref ')'
| "VALUES" values
;
compound_operator:
"UNION"
| "UNION ALL"
| "INTERSECT"
| "EXCEPT"
;
window_func_inv:
simple_call "OVER" window_name_or_defn
;
opt_filter_clause:
/* nil */
| "FILTER" '(' opt_where ')'
;
window_name_or_defn: window_defn
| name
;
window_defn:
'(' opt_partition_by opt_orderby opt_frame_spec ')'
;
opt_frame_spec:
/* nil */
| frame_type frame_boundary_opts frame_exclude
;
frame_type:
"RANGE"
| "ROWS"
| "GROUPS"
;
frame_exclude:
/* nil */
| "EXCLUDE NO OTHERS"
| "EXCLUDE CURRENT ROW"
| "EXCLUDE GROUP"
| "EXCLUDE TIES"
;
frame_boundary_opts:
frame_boundary
| "BETWEEN" frame_boundary_start "AND" frame_boundary_end
;
frame_boundary_start:
"UNBOUNDED" "PRECEDING"
| expr "PRECEDING"
| "CURRENT ROW"
| expr "FOLLOWING"
;
frame_boundary_end:
expr "PRECEDING"
| "CURRENT ROW"
| expr "FOLLOWING"
| "UNBOUNDED" "FOLLOWING"
;
frame_boundary:
"UNBOUNDED" "PRECEDING"
| expr "PRECEDING"
| "CURRENT ROW"
;
opt_partition_by:
/* nil */
| "PARTITION" "BY" expr_list
;
opt_select_window:
/* nil */
| window_clause
;
window_clause:
"WINDOW" window_name_defn_list
;
window_name_defn_list:
window_name_defn
| window_name_defn ',' window_name_defn_list
;
window_name_defn:
name "AS" window_defn
;
region_spec:
name
| name "PRIVATE"
;
region_list:
region_spec ',' region_list
| region_spec
;
declare_schema_region_stmt:
"@DECLARE_SCHEMA_REGION" name
| "@DECLARE_SCHEMA_REGION" name "USING" region_list
;
declare_deployable_region_stmt:
"@DECLARE_DEPLOYABLE_REGION" name
| "@DECLARE_DEPLOYABLE_REGION" name "USING" region_list
;
begin_schema_region_stmt:
"@BEGIN_SCHEMA_REGION" name
;
end_schema_region_stmt:
"@END_SCHEMA_REGION"
;
schema_unsub_stmt:
"@UNSUB" '(' sql_name ')'
;
schema_ad_hoc_migration_stmt:
"@SCHEMA_AD_HOC_MIGRATION" version_annotation
| "@SCHEMA_AD_HOC_MIGRATION" "FOR" "@RECREATE" '(' name ',' name ')'
;
emit_enums_stmt:
"@EMIT_ENUMS" opt_name_list
;
emit_group_stmt:
"@EMIT_GROUP" opt_name_list
;
emit_constants_stmt:
"@EMIT_CONSTANTS" name_list
;
opt_from_query_parts:
/* nil */
| "FROM" query_parts
;
opt_where:
/* nil */
| "WHERE" expr
;
opt_groupby:
/* nil */
| "GROUP" "BY" groupby_list
;
groupby_list:
groupby_item
| groupby_item ',' groupby_list
;
groupby_item:
expr
;
opt_asc_desc:
/* nil */
| "ASC" opt_nullsfirst_nullslast
| "DESC" opt_nullsfirst_nullslast
;
opt_nullsfirst_nullslast:
/* nil */
| "NULLS" "FIRST"
| "NULLS" "LAST"
;
opt_having:
/* nil */
| "HAVING" expr
;
opt_orderby:
/* nil */
| "ORDER" "BY" orderby_list
;
orderby_list:
orderby_item
| orderby_item ',' orderby_list
;
orderby_item:
expr opt_asc_desc
;
opt_limit:
/* nil */
| "LIMIT" expr
;
opt_offset:
/* nil */
| "OFFSET" expr
;
select_opts:
/* nil */
| "ALL"
| "DISTINCT"
| "DISTINCTROW"
;
select_expr_list:
select_expr
| select_expr ',' select_expr_list
;
select_expr:
expr opt_as_alias
| column_calculation
;
opt_as_alias:
/* nil */
| as_alias
;
as_alias:
"AS" sql_name
| sql_name
;
query_parts:
table_or_subquery_list
| join_clause
;
table_or_subquery_list:
table_or_subquery
| table_or_subquery ',' table_or_subquery_list
;
join_clause:
table_or_subquery join_target_list
;
join_target_list:
join_target
| join_target join_target_list
;
table_or_subquery:
sql_name opt_as_alias
| '(' select_stmt ')' opt_as_alias
| '(' shared_cte ')' opt_as_alias
| table_function opt_as_alias
| '(' query_parts ')'
| macro_ref opt_as_alias
;
join_type:
/* nil */
| "LEFT"
| "RIGHT"
| "LEFT" "OUTER"
| "RIGHT" "OUTER"
| "INNER"
| "CROSS"
;
join_target: join_type "JOIN" table_or_subquery opt_join_cond
;
opt_join_cond:
/* nil */
| join_cond
;
join_cond:
"ON" expr
| "USING" '(' name_list ')'
;
table_function:
name '(' arg_list ')'
;
create_view_stmt:
"CREATE" opt_temp "VIEW" opt_if_not_exists sql_name "AS" select_stmt opt_delete_version_attr
| "CREATE" opt_temp "VIEW" opt_if_not_exists sql_name '(' name_list ')' "AS" select_stmt opt_delete_version_attr
;
delete_stmt:
delete_stmt_plain
| delete_stmt_plain returning_suffix
| with_prefix delete_stmt_plain
| with_prefix delete_stmt_plain returning_suffix
;
delete_stmt_plain:
"DELETE" "FROM" sql_name opt_where
;
opt_insert_dummy_spec:
/* nil */
| "@DUMMY_SEED" '(' expr ')' dummy_modifier
;
dummy_modifier:
/* nil */
| "@DUMMY_NULLABLES"
| "@DUMMY_DEFAULTS"
| "@DUMMY_NULLABLES" "@DUMMY_DEFAULTS"
| "@DUMMY_DEFAULTS" "@DUMMY_NULLABLES"
;
insert_stmt_type:
"INSERT" "INTO"
| "INSERT" "OR" "REPLACE" "INTO"
| "INSERT" "OR" "IGNORE" "INTO"
| "INSERT" "OR" "ROLLBACK" "INTO"
| "INSERT" "OR" "ABORT" "INTO"
| "INSERT" "OR" "FAIL" "INTO"
| "REPLACE" "INTO"
;
opt_column_spec:
/* nil */
| '(' opt_sql_name_list ')'
| '(' shape_def ')'
;
column_spec:
'(' sql_name_list ')'
| '(' shape_def ')'
;
from_shape:
"FROM" "CURSOR" name opt_column_spec
| "FROM" name opt_column_spec
| "FROM" "ARGUMENTS" opt_column_spec
;
insert_stmt_plain:
insert_stmt_type sql_name opt_column_spec select_stmt opt_insert_dummy_spec
| insert_stmt_type sql_name opt_column_spec from_shape opt_insert_dummy_spec
| insert_stmt_type sql_name "DEFAULT" "VALUES"
| insert_stmt_type sql_name "USING" select_stmt
| insert_stmt_type sql_name "USING" expr_names opt_insert_dummy_spec
;
returning_suffix: "RETURNING" select_expr_list
insert_stmt:
insert_stmt_plain
| insert_stmt_plain returning_suffix
| with_prefix insert_stmt_plain
| with_prefix insert_stmt_plain returning_suffix
;
insert_list_item:
expr
| shape_arguments
;
insert_list:
/* nil */
| insert_list_item
| insert_list_item ',' insert_list
;
basic_update_stmt:
"UPDATE" opt_sql_name "SET" update_list opt_from_query_parts opt_where
;
update_stmt:
update_stmt_plain
| update_stmt_plain returning_suffix
| with_prefix update_stmt_plain
| with_prefix update_stmt_plain returning_suffix
;
update_stmt_plain:
"UPDATE" sql_name "SET" update_list opt_from_query_parts opt_where opt_orderby opt_limit
| "UPDATE" sql_name "SET" column_spec '=' '(' insert_list ')' opt_from_query_parts opt_where opt_orderby opt_limit
;
update_entry:
sql_name '=' expr
;
update_list:
update_entry
| update_entry ',' update_list
;
upsert_stmt:
upsert_stmt_plain
| upsert_stmt_plain returning_suffix
| with_prefix upsert_stmt_plain
| with_prefix upsert_stmt_plain returning_suffix
upsert_stmt_plain:
insert_stmt_plain "ON CONFLICT" conflict_target "DO" "NOTHING"
| insert_stmt_plain "ON CONFLICT" conflict_target "DO" basic_update_stmt
;
update_cursor_stmt:
"UPDATE" "CURSOR" name opt_column_spec "FROM" "VALUES" '(' insert_list ')'
| "UPDATE" "CURSOR" name opt_column_spec from_shape
| "UPDATE" "CURSOR" name "USING" expr_names
;
conflict_target:
/* nil */
| '(' indexed_columns ')' opt_where
;
function: "FUNC" | "FUNCTION"
;
declare_out_call_stmt:
"DECLARE" "OUT" call_stmt
;
declare_enum_stmt:
"DECLARE" "ENUM" name data_type_numeric '(' enum_values ')'
;
enum_values:
enum_value
| enum_value ',' enum_values
;
enum_value:
name
| name '=' expr
;
declare_const_stmt:
"DECLARE" "CONST" "GROUP" name '(' const_values ')'
;
declare_group_stmt:
"DECLARE" "GROUP" name "BEGIN" simple_variable_decls "END"
;
simple_variable_decls:
declare_vars_stmt ';'
| declare_vars_stmt ';' simple_variable_decls
;
const_values:
const_value
| const_value ',' const_values
;
const_value: name '=' expr
;
declare_select_func_stmt:
"DECLARE" "SELECT" function name '(' params ')' data_type_with_options
| "DECLARE" "SELECT" function name '(' params ')' '(' typed_names ')'
| "DECLARE" "SELECT" function name "NO" "CHECK" data_type_with_options
| "DECLARE" "SELECT" function name "NO" "CHECK" '(' typed_names ')'
;
declare_func_stmt:
"DECLARE" function loose_name '(' func_params ')' data_type_with_options
| "DECLARE" function loose_name '(' func_params ')' "CREATE" data_type_with_options
| "DECLARE" function loose_name "NO" "CHECK" data_type_with_options
| "DECLARE" function loose_name "NO" "CHECK" "CREATE" data_type_with_options
;
procedure: "PROC" | "PROCEDURE"
;
declare_proc_no_check_stmt:
"DECLARE" procedure loose_name "NO" "CHECK"
;
declare_proc_stmt:
"DECLARE" procedure loose_name '(' func_params ')'
| "DECLARE" procedure loose_name '(' func_params ')' '(' typed_names ')'
| "DECLARE" procedure loose_name '(' func_params ')' "USING" "TRANSACTION"
| "DECLARE" procedure loose_name '(' func_params ')' "OUT" '(' typed_names ')'
| "DECLARE" procedure loose_name '(' func_params ')' "OUT" '(' typed_names ')' "USING" "TRANSACTION"
| "DECLARE" procedure loose_name '(' func_params ')' "OUT" "UNION" '(' typed_names ')'
| "DECLARE" procedure loose_name '(' func_params ')' "OUT" "UNION" '(' typed_names ')' "USING" "TRANSACTION"
;
declare_interface_stmt:
"DECLARE" "INTERFACE" name '(' typed_names ')'
| "INTERFACE" name '(' typed_names ')'
;
create_proc_stmt:
"CREATE" procedure loose_name '(' params ')' "BEGIN" opt_stmt_list "END"
| procedure loose_name '(' params ')' "BEGIN" opt_stmt_list "END"
;
inout:
"IN"
| "OUT"
| "INOUT"
;
typed_name:
sql_name data_type_with_options
| shape_def
| name shape_def
;
typed_names:
typed_name
| typed_name ',' typed_names
;
func_param:
param
| name "CURSOR"
;
func_params:
/* nil */
| func_param
| func_param ',' func_params
;
param:
sql_name data_type_with_options
| inout sql_name data_type_with_options
| shape_def
| name shape_def
;
params:
/* nil */
| param
| param ',' params
;
declare_value_cursor:
"DECLARE" name "CURSOR" shape_def
| "CURSOR" name shape_def
| "DECLARE" name "CURSOR" "LIKE" select_stmt
| "CURSOR" name "LIKE" select_stmt
| "DECLARE" name "CURSOR" "LIKE" '(' typed_names ')'
| "CURSOR" name "LIKE" '(' typed_names ')'
;
row_source: select_stmt | explain_stmt | insert_stmt | delete_stmt | update_stmt | upsert_stmt | call_stmt
;
declare_forward_read_cursor_stmt:
"DECLARE" name "CURSOR" "FOR" row_source
| "CURSOR" name "FOR" row_source
| "DECLARE" name "CURSOR" "FOR" expr
| "CURSOR" name "FOR" expr
;
declare_fetched_value_cursor_stmt:
"DECLARE" name "CURSOR" "FETCH" "FROM" call_stmt
| "CURSOR" name "FETCH" "FROM" call_stmt
;
declare_type_stmt:
"DECLARE" name "TYPE" data_type_with_options
| "TYPE" name data_type_with_options
;
declare_vars_stmt:
"DECLARE" sql_name_list data_type_with_options
| "VAR" name_list data_type_with_options
| declare_value_cursor
;
call_stmt: "CALL" loose_name '(' arg_list ')'
;
for_stmt: "FOR" expr ';' stmt_list "BEGIN" opt_stmt_list "END"
;
while_stmt:
"WHILE" expr "BEGIN" opt_stmt_list "END"
;
switch_stmt:
"SWITCH" expr switch_case switch_cases
| "SWITCH" expr "ALL" "VALUES" switch_case switch_cases
;
switch_case:
"WHEN" expr_list "THEN" stmt_list
| "WHEN" expr_list "THEN" "NOTHING"
;
switch_cases:
switch_case switch_cases
| "ELSE" stmt_list "END"
| "END"
;
loop_stmt:
"LOOP" fetch_stmt "BEGIN" opt_stmt_list "END"
;
leave_stmt:
"LEAVE"
;
return_stmt:
"RETURN"
;
rollback_return_stmt:
"ROLLBACK" "RETURN"
;
commit_return_stmt:
"COMMIT" "RETURN"
;
throw_stmt:
"THROW"
;
trycatch_stmt:
"BEGIN" "TRY" opt_stmt_list "END" "TRY" ';' "BEGIN" "CATCH" opt_stmt_list "END" "CATCH"
| "TRY" opt_stmt_list "CATCH" opt_stmt_list "END"
;
continue_stmt:
"CONTINUE"
;
fetch_stmt:
"FETCH" name "INTO" name_list
| "FETCH" name
;
fetch_values_stmt:
"FETCH" name opt_column_spec "FROM" "VALUES" '(' insert_list ')' opt_insert_dummy_spec
| "FETCH" name opt_column_spec from_shape opt_insert_dummy_spec
| "FETCH" name "USING" expr_names opt_insert_dummy_spec
;
expr_names:
expr_name
| expr_name ',' expr_names
;
expr_name: expr as_alias
;
fetch_call_stmt:
"FETCH" name opt_column_spec "FROM" call_stmt
;
close_stmt:
"CLOSE" name
;
out_stmt:
"OUT" name
;
out_union_stmt:
"OUT" "UNION" name
;
out_union_parent_child_stmt:
"OUT" "UNION" call_stmt "JOIN" child_results
;
child_results:
child_result
| child_result "AND" child_results
;
child_result:
call_stmt "USING" '(' name_list ')'
| call_stmt "USING" '(' name_list ')' "AS" name
;
if_ending: "END" "IF" | "END" ;
if_stmt:
"IF" expr "THEN" opt_stmt_list opt_elseif_list opt_else if_ending
;
opt_else:
/* nil */
| "ELSE" opt_stmt_list
;
elseif_item:
"ELSE IF" expr "THEN" opt_stmt_list
;
elseif_list:
elseif_item
| elseif_item elseif_list
;
opt_elseif_list:
/* nil */
| elseif_list
;
control_stmt:
commit_return_stmt
| continue_stmt
| leave_stmt
| return_stmt
| rollback_return_stmt
| throw_stmt
guard_stmt:
"IF" expr control_stmt
;
transaction_mode:
/* nil */
| "DEFERRED"
| "IMMEDIATE"
| "EXCLUSIVE"
;
begin_trans_stmt:
"BEGIN" transaction_mode "TRANSACTION"
| "BEGIN" transaction_mode
;
rollback_trans_stmt:
"ROLLBACK"
| "ROLLBACK" "TRANSACTION"
| "ROLLBACK" "TO" savepoint_name
| "ROLLBACK" "TRANSACTION" "TO" savepoint_name
| "ROLLBACK" "TO" "SAVEPOINT" savepoint_name
| "ROLLBACK" "TRANSACTION" "TO" "SAVEPOINT" savepoint_name
;
commit_trans_stmt:
"COMMIT" "TRANSACTION"
| "COMMIT"
;
proc_savepoint_stmt: procedure "SAVEPOINT" "BEGIN" opt_stmt_list "END"
;
savepoint_name:
"@PROC"
| name
;
savepoint_stmt:
"SAVEPOINT" savepoint_name
;
release_savepoint_stmt:
"RELEASE" savepoint_name
| "RELEASE" "SAVEPOINT" savepoint_name
;
echo_stmt:
"@ECHO" name ',' str_literal
| "@ECHO" name ',' "@TEXT" '(' text_args ')'
;
alter_table_add_column_stmt:
"ALTER" "TABLE" sql_name "ADD" "COLUMN" col_def
;
create_trigger_stmt:
"CREATE" opt_temp "TRIGGER" opt_if_not_exists trigger_def opt_delete_version_attr
;
trigger_def:
sql_name trigger_condition trigger_operation "ON" sql_name trigger_action
;
trigger_condition:
/* nil */
| "BEFORE"
| "AFTER"
| "INSTEAD" "OF"
;
trigger_operation:
"DELETE"
| "INSERT"
| "UPDATE" opt_of
;
opt_of:
/* nil */
| "OF" name_list
;
trigger_action:
opt_foreachrow opt_when_expr "BEGIN" trigger_stmts "END"
;
opt_foreachrow:
/* nil */
| "FOR EACH ROW"
;
opt_when_expr:
/* nil */
| "WHEN" expr
;
trigger_stmts:
trigger_stmt
| trigger_stmt trigger_stmts
;
trigger_stmt:
trigger_update_stmt ';'
| trigger_insert_stmt ';'
| trigger_delete_stmt ';'
| trigger_select_stmt ';'
;
trigger_select_stmt:
select_stmt_no_with
;
trigger_insert_stmt:
insert_stmt
;
trigger_delete_stmt:
delete_stmt
;
trigger_update_stmt:
basic_update_stmt
;
enforcement_options:
"FOREIGN" "KEY" "ON" "UPDATE"
| "FOREIGN" "KEY" "ON" "DELETE"
| "JOIN"
| "UPSERT" "STATEMENT"
| "WINDOW" function
| "WITHOUT" "ROWID"
| "TRANSACTION"
| "SELECT" "IF" "NOTHING"
| "INSERT" "SELECT"
| "TABLE" "FUNCTION"
| "IS TRUE"
| "CAST"
| "SIGN FUNCTION"
| "CURSOR HAS ROW"
| "UPDATE" "FROM"
| "AND" "OR" "NOT" "NULL" "CHECK"
;
enforce_strict_stmt:
"@ENFORCE_STRICT" enforcement_options
;
enforce_normal_stmt:
"@ENFORCE_NORMAL" enforcement_options
;
enforce_reset_stmt:
"@ENFORCE_RESET"
;
enforce_push_stmt:
"@ENFORCE_PUSH"
;
enforce_pop_stmt:
"@ENFORCE_POP"
;
keep_table_name_in_aliases_stmt:
"@KEEP_TABLE_NAME_IN_ALIASES"
expr_macro_def:
"@MACRO" '(' "EXPR" ')' name '!' '(' opt_macro_formals ')'
stmt_list_macro_def:
"@MACRO" '(' "STMT_LIST" ')' name '!' '(' opt_macro_formals ')'
;
query_parts_macro_def:
"@MACRO" '(' "QUERY_PARTS" ')' name '!' '(' opt_macro_formals ')'
;
cte_tables_macro_def:
"@MACRO" '(' "CTE_TABLES" ')' name '!' '(' opt_macro_formals ')'
;
select_core_macro_def:
"@MACRO" '(' "SELECT_CORE" ')' name '!' '(' opt_macro_formals ')'
;
select_expr_macro_def:
"@MACRO" '(' "SELECT_EXPR" ')' name '!' '(' opt_macro_formals ')'
;
op_stmt: "@OP" data_type_any ':' loose_name loose_name_or_type "AS" loose_name
| "@OP" "CURSOR" ':' loose_name loose_name_or_type "AS" loose_name
| "@OP" "NULL" ':' loose_name loose_name_or_type "AS" loose_name
;
ifdef: "@IFDEF" name
;
ifndef: "@IFNDEF" name
;
elsedef: "@ELSE"
;
endif: "@ENDIF"
;
ifdef_stmt:
ifdef opt_stmt_list elsedef opt_stmt_list endif
| ifdef opt_stmt_list endif
;
ifndef_stmt:
ifndef opt_stmt_list elsedef opt_stmt_list endif
| ifndef opt_stmt_list endif
;
macro_def_stmt:
expr_macro_def "BEGIN" expr "END"
| stmt_list_macro_def "BEGIN" stmt_list "END"
| query_parts_macro_def "BEGIN" query_parts "END"
| cte_tables_macro_def "BEGIN" cte_tables "END"
| select_core_macro_def "BEGIN" select_core_list "END"
| select_expr_macro_def "BEGIN" select_expr_list "END"
;
opt_macro_args:
/* nil */
| macro_args
;
macro_arg:
expr
| "BEGIN" stmt_list "END"
| "FROM" '(' query_parts ')'
| "WITH" '(' cte_tables ')'
| "ROWS" '(' select_core_list ')'
| "SELECT" '(' select_expr_list ')'
;
macro_args:
macro_arg
| macro_arg ',' macro_args
;
opt_macro_formals:
/* nil */
| macro_formals
;
macro_formals:
macro_formal
| macro_formal ',' macro_formals
;
macro_formal: name '!' macro_type
;
macro_type:
"EXPR"
| "STMT_LIST"
| "QUERY_PARTS"
| "CTE_TABLES"
| "SELECT_CORE"
| "SELECT_EXPR"
;
The control directives are those statements that begin with @
and they are distinguished from other statements because they influence the compiler rather than the program logic. Some of these are of great importance and discussed elsewhere.
The complete list (as of this writing) is:
@ENFORCE_STRICT
@ENFORCE_NORMAL
FOREIGN KEY ON UPDATE
: all FK’s must choose some ON UPDATE
strategyFOREIGN KEY ON DELETE
: all FK’s must choose some ON DELETE
strategyPROCEDURE
: all procedures must be declared before they are called (eliminating the vanilla C
call option)JOIN
: all joins must be ANSI style, the form FROM A,B
is not allowed (replace with A INNER JOIN B
WINDOW FUNC
: window functions are disallowed (useful if targeting old versions of SQLite)UPSERT STATEMENT
: the upsert form is disallowed (useful if targeting old versions of SQLite)@SENSITIVE
* marks a column or variable as ‘sensitive’ for privacy purposes, this behaves somewhat like nullability (See Chapter 3) in that it is radioactive, contaminating anything it touches * the intent of this annotation is to make it clear where sensitive data is being returned or consumed in your procedures * this information appears in the JSON output for further codegen or for analysis (See Chapter 13)
@DECLARE_SCHEMA_REGION
@DECLARE_DEPLOYABLE_REGION
@BEGIN_SCHEMA_REGION
@END_SCHEMA_REGION
@SCHEMA_AD_HOC_MIGRATION
* Allows for the creation of a ad hoc migration step at a given schema version, (See Chapter 10)
@ECHO
* Emits text into the C output stream, useful for emiting things like function prototypes or preprocessor directives * e.g. `echo C, ‘#define foo bar’
@RECREATE
@CREATE
@DELETE
* used to mark the schema version where an object is created or deleted, or alternatively indicate the the object is always dropped and recreated when it changes (See Chapter 10)
@SCHEMA_UPGRADE_VERSION
* used to indicate that the code that follows is part of a migration script for the indicated schema version * this has the effect of making the schema appear to be how it existed at the indicated version * the idea here is that migration procedures operate on previous versions of the schema where (e.g.) some columns/tables hadn’t been deleted yet
@PREVIOUS_SCHEMA
* indicates the start of the previous version of the schema for comparison (See Chapter 11)
@SCHEMA_UPGRADE_SCRIPT
* CQL emits a schema upgrade script as part of its upgrade features, this script declares tables in their final form but also creates the same tables as they existed when they were first created * this directive instructs CQL to ignore the incompatible creations, the first declaration controls * the idea here is that the upgrade script is in the business of getting you to the finish line in an orderly fashion and some of the interim steps are just not all the way there yet * note that the upgrade script recapitulates the version history, it does not take you directly to the finish line, this is so that all instances get to the same place the same way (and this fleshes out any bugs in migration)
@DUMMY_NULLABLES
@DUMMY_DEFAULTS
@DUMMY_SEED
* these control the creation of dummy data for insert
and fetch
statements (See Chapters 5 and 12)
@FILE
* a string literal that corresponds to the current file name with a prefix stripped (to remove build lab junk in the path)
@ATTRIBUTE
* the main purpose of @attribute
is to appear in the JSON output so that it can control later codegen stages in whatever way you deem appropriate * the nested nature of attribute values is sufficiently flexible than you could encode an arbitrary LISP program in an attribute, so really anything you might need to express is possible * there are a number of attributes known to the compiler which are listed below (complete as of this writing)
cql:autodrop=(table1, table2, ...)
when present the indicated tables, which must be temp tables, are dropped when the results of the procedure have been fetched into a rowsetcql:identity=(column1, column2, ...)
the indicated columns are used to create a row comparator for the rowset corresponding to the procedure, this appears in a C macro of the form procedure_name_row_same(rowset1, row1, rowset2, row2)
cql:suppress_getters
the annotated procedure should not emit its related column getter functions.
cql:emit_setters
the annotated procedure should emit setter functions so that result set columns can be mutated, this can be quite useful for business logic but it is more costlycql:suppress_result_set
the annotated procedure should not emit its related “fetch results” function.
cql:suppress_getters
; since there is no result set, getters would be redundant.OUT UNION
procedure cannot have a suppressed result set since all such a procedure does is produce a result set. This attribute is ignored for out union procedures.cql:private
the annotated procedure will be static in the generated C
static
it cannot be called from other modules and therefore will not go in any CQL exports file (that would be moot since you couldn’t call it).cql:suppress_result_set
since only CQL code in the same translation unit could possibly call it and hence the result set procedure is useless to other C code.cql:generate_copy
the code generation for the annotated procedure will produce a [procedure_name]_copy
function that can make complete or partial copies of its result set.cql:shared_fragment
is used to create shared fragments (See Chapter 14)cql:no_table_scan
for query plan processing, indicates that attributed table should never be table scanned in any plan (for better diagnostics)cql:ok_table_scan=([t1], [t2], ...)
indicates that the attributed procedure scans the indicated tables and that’s not a problem. This helps to suppress errors in expensive search functions that are known to scan big tables.cql:autotest=([many forms])
declares various autotest features (See Chapter 12)cql:query_plan_branch=[integer]
is used by the query plan generator to determine which conditional branch to use in query plan analysis when a shared fragment that contains an IF
statement is used. (See Chapter 15)cql:alias_of=[c_function_name]
are used on function declarations to declare a function or procedure in CQL that calls a function of a different name. This is intended to used for aliasing native (C) functions. Both the aliased function name and the original function name may be declared in CQL at the same time. Note that the compiler does not enforce any consistency in typing between the original and aliased functions.cql:backing_table
used to define a key value store table (docs need, there is only a brief wiki article)cql:backed_by=[a_backing_table]
specifies that the attributed table should have its data stored in the specified backing table, (docs needed, there is only a brief wiki article)cql:blob_storage
the attributed table isn’t really a physical table, it specifies the layout of a serializable blob, see this introductory articlecql:deterministic
when applied to a `declare select function`` indicates that the function is determinisitic and hence ok to use in indices.cql:implements=[interface]
interfaces may be declared with declare interface
and are basically a normal CQL shape. This annotation specifies that the annotated procedure has all the needed columns to encode the indicated shape. It may be used more than once to indicate several shapes are supported. The requirements are validated by the compiler but they do not affect code generation at all other than the JSON file (see Chapter 13). The intent here is to allow downstream code generators like you might have for Java and Objective C to incorporate the interface into the signature of result sets and define the interface as needed. The C and Lua output do not have any such notions. The present Objective C output (--rt objc
) doesn’t support interfaces and is likely to be replace by a python equivalent based on the JSON output that does. Generally, --rt objc
was a mistake as was `–rt java``, but the latter has been removed. Interfaces are highly useful even without codegen just for declarative purposes.cql:java_package=[a name]
this is not used by the compiler but it can be handy to apply to various items to give java code generators a hint where the code should go or where it comes from. Any support for this needs to be in the your java code generator. A sample is in the java_demo
directory.cql:try_is_proc_body
(See Initialization Improvements), this indicates that the annotated try block should be considered the entire body of the procedure for intialization purposes. In particular, it ensures that all parameters of the current procedure have been initialized by the end of the `TRY`` and prevents this check from happening again at the end of the procedure. This is needed is that for various reasons sometimes we need to wrap certain stored procedures in a try/catch such that custom error handling or logging can be implemented. In doing so, however, they can break our normal assumptions about things like initialization of OUT parameters in the errant case and that must be ok.cql:vault_sensitive
or cql:vault_sensitive([columns]...)
cql:vault_sensitive(context, [columns]...)
these forms indicate that @sensitive
columns should get additional encoding. These columns are marked with CQL_DATA_TYPE_ENCODED
in the result set metadata. This causes cql_copy_encoder
to be called to make an abstract encoder from a database instance and then functions like cql_encode_text
to be called to encode each column. To make sure of these attributes you will need a custom cqlrt.c
with suitable encodings for your application.cql:from_recreate
is used to mark tables that transitioned from @recreate
to @create
cql:module_must_not_be_deleted_see_docs_for_CQL0392
must be added to an @delete
attribute on a virtual table to remind the developer that the module for the virtual table must never be deleted elsewise database upgrades will fail. See CQL0392
cql:alt_prefix=prefix
this may be applied to a procedure definition or declaration to override the prefix found in rt_common.c
, this is really only interesting if there is a prefix in the first place so the default configuration of rt_common.c
doesn’t make this very interesting. However it is possible to change .symbol_prefix
(Meta does) and having done so you might want to change it to something else.cql:custom_type_for_encoded_column
the objective C output has the option of using different type for strings that have been encoded because they are sensitive. This attribute is placed on a procedure and it causes all strings in result sets to be declared with cql_string_ref_encode
from the objc result type. The type is ocnfigured with RT_STRING_ENCODE
which is cql_string_ref_encode
by default. This is deprecated because the entire --rt objc
feature is slated for destruction to be replaced with python that processes the JSON like the Java case.
integer math operators like << >> & and | are not compatible with real-valued arguments
Most arithmetic operators (e.g. +, -, *) do not work on objects. Basically comparison is all you can do.
Most arithmetic operators (e.g. +, -, *) do not work on objects. Basically comparison is all you can do.
Most arithmetic operators (e.g. +, -, *) do not work on blobs. Basically comparison is all you can do.
Most arithmetic operators (e.g. +, -, *) do not work on blobs. Basically comparison is all you can do.
Most arithmetic operators (e.g. +, -, *) do not work on strings. Basically comparison is all you can do.
Most arithmetic operators (e.g. +, -, *) do not work on strings. Basically comparison is all you can do.
The expression type indicated by subject required a TEXT as the next item and found something else. This could be a binary operator, part of a CASE expression, the parts of an IN expression or any other place where several expressions might need to be compatible with each other.
The expression type indicated by subject required an OBJECT as the next item and found something else. This could be a binary operator, part of a CASE expression, the parts of an IN expression or any other place where several expressions might need to be compatible with each other.
The expression type indicated by subject required a BLOB as the next item and found something else. This could be a binary operator, part of a CASE expression, the parts of an IN expression or any other place where several expressions might need to be compatible with each other.
The expression type indicated by subject required a numeric as the next item and found something else. This could be a binary operator, part of a CASE expression, the parts of an IN expression or any other place where several expressions might need to be compatible with each other.
Here assign/copy can be the simplest case of assigning to a local variable or an OUT parameter but this error also appears when calling functions. You should think of the IN arguments as requiring that the actual argument be assignable to the formal variable and OUT arguments requiring that the formal be assignable to the actual argument variable.
Here assign/copy can be the simplest case of assigning to a local variable or an OUT parameter but this error also appears when calling functions. You should think of the IN arguments as requiring that the actual argument be assignable to the formal variable and OUT arguments requiring that the formal be assignable to the actual argument variable.
Many SQL clauses require a numeric expression such as WHERE/HAVING/LIMIT/OFFSET. This expression indicates the expression in the given context is not a numeric.
When this error is produced it means the result of the join would have the same table twice with no disambiguation between the two places. The conflicting name is provided. To fix this, make an alias both tables. e.g.
SELECT T1.id AS parent_id, T2.id AS child_id
FROM foo AS T1
INNER JOIN foo AS T2 ON T1.id = T2.parent_id;
@delete
instead) ‘index’The named index is in the previous schema bit it is not in the current schema. All entities need some kind of tombstone in the schema so that they can be correctly deleted if they are still present.
An index with the indicated name already exists.
The table part of a CREATE INDEX statement was not a valid table name.
A table contains two constraints with the same name.
The table in a foreign key REFERENCES clause is not a valid table.
The indicated foreign key has at least one column with a different type than corresponding column in the table it references. This usually means that you have picked the wrong table or column in the foreign key declaration.
The number of column in the foreign key must be the same as the number of columns specified in the foreign table. This usually means a column is missing in the REFERENCES part of the declaration.
When using the cql_cursor_to_blob
function or its equivalent shorthand C:to_blob
the cursor must be declared like so cursor C like table_name
where table_name
is the name of a table marked for blob storage ([[blob storage]]
).
If this is not the case then the type of the resulting blob cannot be inferred. If you do not want/need the type to be inferred you can use the form cql_cursor_to_blob(a_cursor, a_blob)
or equivalently a_cursor:to_blob(a_blob)
in which case no inferred type is required.
In an @create
or @delete
annotation, the version number must be > 0. This error usually means there is a typo in the version number.
There can only be one @create
, @delete
, or @recreate
annotation for any given table/column. More than one @create
is redundant. This error usually means the @create
was cut/paste to make an @delete
and then not edited or something like that.
The indicated migration procedure e.g. the foo in @create(5, foo)
appears in another annotation. Migration steps should happen exactly once. This probably means the annotation was cut/paste and the migration proc was not removed.
When a foreign key is specified in the column definition it is the entire foreign key. That means the references part of the declaration can only be for that one column. If you need more columns, you have to declare the foreign key independently.
SQLite is very fussy about autoincrement columns. The column in question must be either a LONG INTEGER or an INTEGER and it must be PRIMARY KEY. In fact, CQL will rewrite LONG INTEGER into INTEGER because only that exact form is supported, but SQLite INTEGERs can hold LONG values so that’s ok. Any other autoincrement form results in this error.
This error indicates a pattern like “id text not null not null” was found. The same attribute shouldn’t appear twice.
In a column definition, the column can only be marked with at most one of PRIMARY KEY or UNIQUE
The SQLite ALTER TABLE ADD COLUMN statement is used to add new columns to the schema. This statement puts the columns at the end of the table. In order to make the CQL schema align as closely as possible to the actual sqlite schema you will get you are required to add columns where SQLite will put them. This will help a lot if you ever connect to such a database and start doing select * from <somewhere with creates>
@delete
, ‘column’If the table is using the @recreate
plan then you can add and remove columns (and other things freely) you don’t need to mark columns with @create
or @delete
just add/remove them. This error prevents the build up of useless annotations.
Any new column added to a schema must have a default value or be nullable so that its initial state is clear and so that all existing insert statements do not have to be updated to include it. Either make the column nullable or give it a default value.
Similarly, any column being deleted must be nullable or have a default value. The column can’t actually be deleted (not all versions of SQLite support this) so it will only be “deprecated”. But if the column is not null and has no default then it would be impossible to write a correct insert statement for the table with the deleted column.
As a consequence you can neither add nor remove columns that are not null and have no default.
You can’t @delete
a column in a version before it was even created. Probably there is a typo in one or both of the versions.
The indicated column is being deleted in a version that is before the table it is found in was even created. Probably there is a typo in the delete version.
The indicated column is being deleted in a version that is after the table has already been deleted. This would be redundant. Probably one or both have a typo in their delete version.
<=
the table create version ‘column’The indicated column is being created in a version that is before the table it is found in was even created. Probably there is a typo in the delete version.
>=
the table delete version ‘column’The indicated column is being created in a version that that is after it has already been supposedly deleted. Probably there is a typo in one or both of the version numbers.
The indicated column is the second column to be marked with AUTOINCREMENT in its table. There can only be one such column.
The OBJECT data type is only for use in parameters and local variables. SQLite has no storage for object references. The valid data types include INTEGER
, LONG INTEGER
, REAL
, BOOL
, TEXT
, BLOB
The indicated operator can only be used to compare two strings.
The indicated operator can only be used to compare two strings.
The MATCH operator is a complex sqlite primitive. It can only appear within SQL expressions. See the CQL documentation about it being a two-headed-beast when it comes to expression evaluation.
None of the unary math operators e.g. ‘-’ and ‘~’ allow blobs as an operand.
None of the unary math operators e.g. ‘-’ and ‘~’ allow objects as an operand.
None of the unary math operators e.g. ‘-’ and ‘~’ allow strings as an operand.
The ’*’ special operator can only appear in the COUNT function. e.g.:
select count(*) from some_table
It is not a valid function argument in any other context.
Select statements of the form select 1, 'foo';
are valid but select '*';
is not. The *
shortcut for columns only makes sense if there is something to select from. e.g. select * from some_table;
is valid. Similarly T.*
makes no sense without a from clause and the @columns(…) construction requires columns.
The indicated table was used in a select statement like select T.* from ...
but no such table was present in the FROM
clause.
Referring to the select statement on the failing line, that select statement was used in a context where all the columns must have a name. Examples include defining a view, a cursor, or creating a result set from a procedure. The failing code might look something like this. select 1, 2 B;
it needs to look like this select 1 A, 2 B
;
In some contexts the type of a constant is used to imply the type of the expression. The NULL literal cannot be used in such contexts because it has no specific type.
In a SELECT statement the NULL literal has no type. If the type of the column cannot be inferred then it must be declared specifically.
In a LET statement, the same situation arises LET x := NULL;
doesn’t specify what type ‘x’ is to be.
You can fix this error by changing the NULL
to something like CAST(NULL as TEXT)
.
A common place this problem happens is in defining a view or returning a result set from a stored procedure. In those cases all the columns must have a name and a type.
If a stored procedure might return one of several result sets, each of the select statements it might return must have the same number of columns. Likewise, if several select results are being combined with UNION
or UNION ALL
they must all have the same number of columns.
If a stored procedure might return one of several result sets, each of the select statements must have the same column names for its result. Likewise, if several select results are being combined with UNION
or UNION ALL
they must all have the same column names.
This is important so that there can be one unambiguous column name for every column for group of select statements.
e.g.
select 1 A, 2 B
union
select 3 A, 4 C;
Would provoke this error. In this case the error would report that the problem was in column 2 and that error was ‘B’ vs. ‘C’
The referenced name is the name of a local or a global in the same scope as the name of a column. This can lead to surprising results as it is not clear which name takes priority (previously the variable did rather than the column, now it’s an error).
Example:
create proc foo(id integer)
begin
-- this is now an error, in all cases the argument would have been selected
select id from bar T1 where T1.id != id;
end;
To correct this, rename the local/global. Or else pick a more distinctive column name, but usually the local is the problem.
The referenced table is marked recreate so it must be in the same recreate group as the current table or in a recreate group that does not introduce a cyclic foreign key dependency among recreate groups. Otherwise, the referenced table might be recreated away leaving all the foreign key references in current table as orphans.
So we check the following: If the referenced table is marked recreate then any of the following result in CQL0060
The referenced table is a recreate table and one of the 4 above conditions was not met. Either don’t reference it or else put the current table and the referenced table into the same recreate group.
In a stored proc with multiple possible selects providing the result, all of the columns of all the selects must be an exact type match.
e.g.
if x then
select 1 A, 2 B
else
select 3 A, 4.0 B;
end if;
Would provoke this error. In this case ‘B’ would be regarded as the offending column and the error is reported on the second B.
In a stored proc with multiple possible selects providing the result, all of the columns of all the selects must be an exact type match. This error indicates that the specified column differs by nullability.
If the procedure is using SELECT to create a result set it cannot also use the OUT keyword to create a one-row result set.
SQLite doesn’t understand object references, so that means you cannot try to use a variable or parameter of type object inside of a SQL statement.
e.g.
create proc foo(X object)
begin
select X is null;
end;
In this example X is an object parameter, but even to use X for an is null
check in a select statement would require binding an object which is not possible.
On the other hand this compiles fine.
create proc foo(X object, out is_null bool not null)
begin
set is_null := X is null;
end;
This is another example of XQL being a two-headed beast.
There is more than one variable/column with indicated name in equally near scopes. The most common reason for this is that there are two column in a join with the same name and that name has not been qualified elsewhere.
e.g.
SELECT A
FROM (SELECT 1 AS A, 2 AS B) AS T1
INNER JOIN (SELECT 1 AS A, 2 AS B) AS T2;
There are two possible columns named A
. Fix this by using T1.A
or T2.A
.
@recreate
, its indices must be in its schema region ‘index_name’If a table is marked @recreate
that means that when it changes it is dropped and created during schema maintenance. Of course when it is dropped its indices are also dropped. So the indices must also be recreated when the table changes. So with such a table it makes no sense to have indices that are in a different schema region. This can only work if they are all always visible together.
Tables on the @create
plan are not dropped so their indices can be maintained separately. So they get a little extra flexibility.
To fix this error move the offending index into the same schema region as the table. And probably put them physically close for maintenance sanity.
The code is trying to access fields in the named cursor but the automatic field generation form was not used so there are no such fields.
e.g.
declare a integer;
declare b integer;
declare C cursor for select 1 A, 2 B;
into a, b; -- C.A and C.B not created (!)
fetch C if (C.A) then -- error
...
end if;
Correct usage looks like this:
declare C cursor for select 1 A, 2 B;
-- automatically creates C.A and C.B
fetch C; if (C.A) then
...
end if;
The indicated field is not a valid field in a cursor expression.
e.g.
declare C cursor for select 1 A, 2 B;
-- automatically creates C.A and C.B
fetch C; if (C.X) then -- C has A and B, but no X
...
end if;
The indicated name could not be resolved in the scope in which it appears. Probably there is a typo. But maybe the name you need isn’t available in the scope you are trying to use it in.
Two expressions of type object are holding a different object type e.g.
declare x object<Foo>;
declare y object<Bar>;
set x := y;
Here the message would report that ‘Bar’ is incompatible. The message generally refers to the 2nd object type as the first one was ok by default then the second one caused the problem.
The BETWEEN operator works on numerics and strings only.
The BETWEEN operator works on numerics and strings only.
The CAST function does highly complex and subtle conversions, including date/time functions and other things. It’s not possibly to emulate this accurately and there is no sqlite helper to do the job directly from a C call. Consequently it’s only supported in the context of CQL statements. It can be used in normal expressions by using the nested SELECT
form (select ...)
There must be at least two arguments in a call to coalesce
.
The ifnull
function requires exactly two arguments.
Adding a NULL literal to IFNULL
or COALESCE
is a no-op. It’s most likely an error.
In an IFNULL
or COALESCE
call, only the last argument may be known to be not null. If a not null argument comes earlier in the list, then none of the others could ever be used. That is almost certainly an error. The most egregious form of this error is if the first argument is known to be not null in which case the entire IFNULL
or COALESCE
can be removed.
The (select…) option for IN
or NOT IN
only makes sense in certain expression contexts. Other uses are most likely errors. It cannot appear in a loose expression because it fundamentally requires sqlite to process it.
The indicated function was called with the wrong number of arguments. There are various functions supported each with different rules. See the SQLite documentation for more information about the specified function.
Many functions can only appear in certain contexts. For instance most aggregate functions are limited to the select list or the HAVING clause. They cannot appear in, for instance, a WHERE, or ON clause. The particulars vary by function.
The indicated aggregate function was used in a select statement with no tables. For instance
select MAX(7);
Doesn’t make any sense.
The indicated macro appeared in a context in which that type of macro is not allowed. For instance attempting to use a statement list macro where an expression belongs results in this error.
Macro errors include a detailed trace of the path to the macro with the problem. The problem could be nested quite deeply.
The indicated macro reference does not refer to an actual macro or macro argument. There is probably a typo in the name of the macro.
Macro errors include a detailed trace of the path to the macro with the problem. The problem could be nested quite deeply.
The argument at the named position is requried to be one of the named types in the named function.
This message is used for a variety of argument type mismatches in function invocation.
The result of the concatenation of an @ID
construct is not a valid identifier. To be a valid identifier:
The result of the invalid concatenation is included in the message.
Macro errors include a detailed trace of the path to the macro with the problem. The problem could be nested quite deeply.
A macro invocation tried to satisfy the indicated formal name with an argument that was the wrong type. For instance attempting to provide a statement list as an argument to a macro that expected an expression.
Macro errors include a detailed trace of the path to the macro with the problem. The problem could be nested quite deeply.
The indicated macro was used with not enough or too many arguments as indicated by the message (it will be one or the other).
Sometimes this message looks surprising because there are actually mismatched making more arguments look like fewer. Check the parens and the total count.
Macro errors include a detailed trace of the path to the macro with the problem. The problem could be nested quite deeply.
External C functions declared with declare function ...
are not for use in sqlite. They may not appear inside statements.
SQLite user defined functions (or builtins) declared with declare select function
may only appear inside of sql statements. In the case of user defined functions they must be added to sqlite by the appropriate C APIs before they can be used in CQL stored procs (or any other context really). See the sqlite documentation on how to add user defined functions. Create Or Redefine SQL Functions
object<T SET>
has a T that is not a procedure with a result set, ‘name’The data type object<T SET>
refers to the shape of a result set of a particular procedure. In this case the indicated name is not such a procedure.
The most likely source of this problem is that there is a typo in the indicated name. Alternatively the name might be a valid shape like a cursor name or some other shape name but it’s a shape that isn’t coming from a procedure.
object<T SET>
has a T that is not a public procedure with a result set, ‘name’The data type object<T SET>
refers to the shape of a result set of a particular procedure. In this case the indicated procedure name was tagged with either the cql:private
attribute or the cql:suppress_result_set
attribute.
Either of these attributes will make it impossible to actually use this result set type. They must be removed.
SQLite only supports this kind of control flow in the context of triggers, certain trigger predicates might need to unconditionally fail and complex logic can be implemented in this way. However this sort of thing is not really recommended. In any case this is not a general purpose construct.
Only forms with a string as the second argument are supported by SQLite.
The indicated function is not implemented in CQL. Possibly you intended to declare it with declare function
as an external function or declare select function
as a sqlite builtin. Note not all sqlite builtins are automatically declared.
The indicated name is neither a table nor a view. It is possible that the table/view is now deprecated with @delete
and therefore will appear to not exist in the current context.
In the JOIN ... USING(x,y,z)
form, all the columns in the using clause must appear on both sides of the join. Here the indicated name is not present on the left side of the join.
In the JOIN ... USING(x,y,z)
form, all the columns in the using clause must appear on both sides of the join. Here the indicated name is not present on the right side of the join.
In the JOIN ... USING(x,y,z)
form, all the columns in the using clause must appear on both sides of the join and have the same data type. Here the data types differ in the named column.
The HAVING
clause makes no sense unless there is also a GROUP BY
clause. SQLite enforces this as does CQL.
In a WITH
clause, the indicated common table name was defined more than once.
In a WITH
clause the indicated common table expression doesn’t include enough column names to capture the result of the select
statement it is associated with.
e.g.
WITH foo(a) as (SELECT 1 A, 2 B) ...`
The select statement produces two columns the foo
declaration specifies one.
In a WITH
clause the indicated common table expression has more column names than the select
expression it is associated with.
e.g.
WITH foo(a, b) as (SELECT 1) ... `
The select statement produces one column the foo
declaration specifies two.
The indicated table or view must be unique in its context. The version at the indicated line number is a duplicate of a previous declaration.
@delete
instead) ‘name’During schema validation, CQL found a view that used to exist but is now totally gone. The correct procedure is to mark the view with @delete
(you can also make it stub with the same name to save a little space). This is necessary so that CQL can know what views should be deleted on client devices during an upgrade. If the view is eradicated totally there would be no way to know that the view should be deleted if it exists.
Converting a view into a table, or otherwise creating a table with the same name as a view is not legal.
@delete
instead) ‘name’During schema validation, CQL found a trigger that used to exist but is now totally gone. The correct procedure is to mark the trigger with @delete
(you can also make it stub with the same name to save a little space). This is necessary so that CQL can know what triggers should be deleted on client devices during an upgrade. If the trigger is eradicated totally there would be no way to know that the trigger should be deleted if it exists. That would be bad.
Attempting to declare that an object has been deleted before it was created is an error. Probably there is a typo in one or both of the version numbers of the named object.
The indicated table was not declared anywhere. Note that CQL requires that you declare all tables you will work with, even if all you intend to do with the table is drop it. When you put a CREATE TABLE
statement in global scope this only declares a table, it doesn’t actually create the table. See the documentation on DDL for more information.
The object named in a DROP TABLE
statement must be a table, not a view.
The indicated view was not declared anywhere. Note that CQL requires that you declare all views you will work with, even if all you intend to do with the view is drop it. When you put a CREATE VIEW
statement in global scope this only declares a view, it doesn’t actually create the view. See the documentation on DDL for more information.
The object named in a DROP VIEW
statement must be a view, not a table.
The indicated index was not declared anywhere. Note that CQL requires that you declare all indices you will work with, even if all you intend to do with the index is drop it. When you put a CREATE INDEX
statement in global scope this only declares an index, it doesn’t actually create the index. See the documentation on DDL for more information.
The indicated trigger was not declared anywhere. Note that CQL requires that you declare all triggers you will work with, even if all you intend to do with the trigger is drop it. When you put a CREATE TRIGGER
statement in global scope this only declares a trigger, it doesn’t actually create the trigger. See the documentation on DDL for more information.
The indicated table was previously marked with @create
indicating it has precious content and should be upgraded carefully. The current schema marks the same table with @recreate
meaning it has discardable content and should be upgraded by dropping it and recreating it. This transition is not allowed. If the table really is non-precious now you can mark it with @delete
and then make a new similar table with @recreate
. This really shouldn’t happen very often if at all. Probably the error is due to a typo or wishful thinking.
The indicated table was previously marked with @create
at some version (x) and now it is being created at some different version (y !=x ). This not allowed (if it were then objects might be created in the wrong/different order during upgrade which would cause all kinds of problems).
The indicated table was previously marked with @delete
at some version (x) and now it is being deleted at some different version (y != x). This not allowed (if it were then objects might be deleted in the wrong/different order during upgrade which would cause all kinds of problems).
@delete
procedure changed in object ‘table_name’The @delete
attribute can optional include a “migration proc” that is run when the upgrade happens. Once set, this proc can never be changed.
@create
procedure changed in object ‘table_name’The @create
attribute can optional include a “migration proc” that is run when the upgrade happens. Once set, this proc can never be changed.
Since there is no sqlite operation that allows for columns to be renamed, attempting to rename a column is not allowed.
NOTE: you can also get this error if you remove a column entirely, or add a column in the middle of the list somewhere.
Since columns (also) cannot be reordered during upgrade, CQL expects to find all the columns in exactly the same order in the previous and new schema. Any reordering, or deletion could easily look like an erroneous rename. New columns must appear at the end of any existing columns.
It is not possible to change the data type of a column during an upgrade, SQLite provides no such options. Attempting to do so results in an error. This includes nullability.
The indicated column was previously marked with @create
at some version (x) and now it is being created at some different version (y !=x ). This not allowed (if it were then objects might be created in the wrong/different order during upgrade which would cause all kinds of problems).
The indicated column was previously marked with @delete
at some version (x) and now it is being deleted at some different version (y != x). This not allowed (if it were then objects might be deleted in the wrong/different order during upgrade which would cause all kinds of problems).
@delete
procedure changed ‘name’The @delete
attribute can optional include a “migration proc” that is run when the upgrade happens. Once set, this proc can never be changed.
@create
procedure changed ‘name’The @create
attribute can optional include a “migration proc” that is run when the upgrade happens. Once set, this proc can never be changed.
The default value of a column may not be changed in later versions of the schema. There is no SQLite operation that would allow this.
@delete
instead) ‘table’During schema validation, CQL found a table that used to exist but is now totally gone. The correct procedure is to mark the table with @delete
. This is necessary so that CQL can know what tables should be deleted on client devices during an upgrade. If the table is eradicated totally there would be no way to know that the table should be deleted if it exists. That would be bad.
The indicated object was a table in the previous schema but is now a view in the current schema. This transformation is not allowed.
The indicated column changed in one of its more exotic attributes, examples:
FOREIGN KEY
rules changed in some wayPRIMARY KEY
status changedUNIQUE
status changedBasically the long form description of the column is now different and it isn’t different in one of the usual way like type or default value. This error is the catch all for all the other ways a column could change such as “the FK rule for what happens when an update fk violation occurs is now different” – there are dozens of such errors and they aren’t very helpful anyway.
@delete
‘column_name’During schema validation, CQL found a column that used to exist but is now totally gone. The correct procedure is to mark the column with @delete
. This is necessary so that CQL can know what columns existed during any version of the schema, thereby allowing them to be used in migration scripts during an upgrade. If the column is eradicated totally there would be no way to know that the exists, and should no longer be used. That would be bad.
Of course @recreate
tables will never get this error because they can be altered at whim.
@create
‘column_name’The indicated column was added but it was not marked with @create
. The table in question is not on the @recreate
plan so this is an error. Add a suitable @create
annotation to the column declaration.
@create
and @delete
‘column_name’The indicated column was simultaneously marked @create
and @delete
. That’s surely some kind of typo. Creating a column and deleting it in the same version is weird.
The indicated table has changes in one of its non-column features. These changes might be:
None of these are allowed to change. Of course @recreate
tables will never get this error because they can be altered at whim.
The error indicates that the table has had some stuff removed from it. The “stuff” might be:
Since there is no way to change any of the constraints after the fact, they may not be changed at all if the table is on the @create
plan. Of course @recreate
tables will never get this error because they can be altered at whim.
The error indicates that the table has had some stuff added to it. The “stuff” might be:
Since there is no way to change any of the constraints after the fact, they may not be changed at all if the table is on the @create
plan. Of course @recreate
tables will never get this error because they can be altered at whim.
The ‘flags’ on the CREATE TABLE
statement changed between versions. These flags capture the options like theTEMP
in CREATE TEMP TABLE
and the IF NOT EXISTS
. Changing these is not allowed.
Trigger names may not be duplicated. Probably there is copy/pasta going on here.
In a CREATE TRIGGER
statement, the indicated name is neither a table or a view. Either a table or a view was expected in this context.
In a CREATE TRIGGER
statement, the named target of the trigger was a view but the trigger type is not INSTEAD OF
. Only INSTEAD OF
can be applied to views because views are not directly mutable so none of the other types make sense. e.g. there can be no delete operations, on a view, so BEFORE DELETE
or AFTER DELETE
are not really a thing.
The indicated object is a temporary. Since temporary do not survive sessions it makes no sense to try to version them for schema upgrade. They are always recreated on demand. If you need to remove one, simply delete it entirely, it requires no tombstone.
The indicated column is part of a temporary table. Since temp tables do not survive sessions it makes no sense to try to version their columns for schema upgrade. They are always recreated on demand.
SQLite uses its ROWID
internally for AUTOINCREMENT
columns. Therefore WITHOUT ROWID
is not a possibility if AUTOINCREMENT
is in use.
In a CREATE TABLE
statement, the indicated column was defined twice. This is probably a copy/pasta issue.
The indicated table has more than one column with the PRIMARY KEY
attribute or multiple PRIMARY KEY
constraints, or a combination of these things. You’ll have to decide which one is really intended to be primary.
In an ALTER TABLE
statement, the table to be altered is actually a view. This is not allowed.
In an ALTER TABLE
statement, the table to be altered was not defined, or perhaps was marked with @delete
and is no longer usable in the current schema version.
NOTE:
ALTER TABLE
is typically not used directly; the automated schema upgrade script generation system uses it.
In an ALTER TABLE
statement, the attributes on the column may not include @create
or @delete
. Those annotations go on the columns declaration in the corresponding CREATE TABLE
statement.
NOTE:
ALTER TABLE
is typically not used directly; the automated schema upgrade script generation system uses it.
In an ALTER TABLE
statement, the attributes on the column may not include AUTOINCREMENT
. SQLite does not support the addition of new AUTOINCREMENT
columns.
NOTE:
ALTER TABLE
is typically not used directly; the automated schema upgrade script generation system uses it.
In an ALTER TABLE
statement the attributes on the named column must include a default value or else the column must be nullable. This is so that SQLite knows what value to put on existing rows when the column is added and also so that any existing insert statements will not suddenly all become invalid. If the column is nullable or has a default value then the existing insert statements that don’t specify the column will continue to work, using either NULL or the default.
NOTE:
ALTER TABLE
is typically not used directly; the automated schema upgrade script generation system uses it.
@create
, exact name match required ‘column_name’In CQL loose schema is a declaration, it does not actually create anything unless placed inside of a procedure. A column that is added with ALTER TABLE
is not actually declared as part of the schema by the ALTER
. Rather the schema declaration is expected to include any columns you plan to add. Normally the way this all happens is that you put @create
notations on a column in the schema and the automatic schema upgrader then creates suitable ALTER TABLE
statements to arrange for that column to be added. If you manually write an ALTER TABLE
statement it isn’t allowed to add columns at whim; in some sense it must be creating the reality already described in the declaration. This is exactly what the automated schema upgrader does – it declares the end state and then alters the world to get to that state.
It’s important to remember that from CQL’s perspective the schema is fixed for any given compilation, so runtime alterations to it are not really part of the type system. They can’t be. Even DROP TABLE
does not remove the table from type system – it can’t – the most likely situation is that you are about to recreate that same table again for another iteration with the proc that creates it.
This particular error is saying that the column you are trying to add does not exist in the declared schema.
NOTE:
ALTER TABLE
is typically not used directly; the automated schema upgrade script generation system uses it.
In CQL loose schema is a declaration, it does not actually create anything unless placed inside of a procedure. A column that is added with ALTER TABLE
is not actually declared as part of the schema by the ALTER
. Rather the schema declaration is expected to include any columns you plan to add. Normally the way this all happens is that you put @create
notations on a column in the schema and the automatic schema upgrader then creates suitable ALTER TABLE
statements to arrange for that column to be added. If you manually write an ALTER TABLE
statement it isn’t allowed to add columns at whim; in some sense it must be creating the reality already described in the declaration. This is exactly what the automated schema upgrader does – it declares the end state and then alters the world to get to that state.
It’s important to remember that from CQL’s perspective the schema is fixed for any given compilation, so runtime alterations to it are not really part of the type system. They can’t be. Even DROP TABLE
does not remove the table from type system – it can’t – the most likely situation is that you are about to recreate that same table again for another iteration with the proc that creates it.
This particular error is saying that the column you are trying to add exists in the declared schema, but its definition is different than you have specified in the ALTER TABLE
statement.
NOTE:
ALTER TABLE
is typically not used directly; the automated schema upgrade script generation system uses it.
In an IF
statement the condition (predicate) must be a numeric. The body of the IF
runs if the value is not null and not zero.
In a DELETE
statement, the indicated table does not exist. Probably it’s a spelling mistake, or else the table has been marked with @delete
and may no longer be used in DELETE
statements.
In a DELETE
statement, the target of the delete must be a table, but the indicated name is a view.
In an UPDATE
statement, you can only specify any particular column to update once.
e.g. UPDATE coordinates set x = 1, x = 3;
will produce this error. UPDATE coordinates set x = 1, y = 3;
might be correct.
This error is most likely caused by a typo or a copy/pasta of the column names, especially if they were written one per line.
In an UPDATE
statement, the target table does not exist. Probably it’s a spelling mistake, or else the table has been marked with @delete
and may no longer be used in UPDATE
statements.
In an UPDATE
statement, the target of the update must be a table but the name of a view was provided.
The INSERT
statement statement supports the notion of synthetically generated values for dummy data purposes. A ‘seed’ integer is used to derive the values. That seed (in the @seed()
position) must be a non-null integer.
The most common reason for this error is that the seed is an input parameter and it was not declared NOT NULL
.
In an INSERT
statement of the form INSERT INTO foo(a, b, c) VALUES(x, y, z)
the number of values (x, y, z) must be the same as the number of columns (a, b, c). Note that there are many reasons you might not have to specify all the columns of the table but whichever columns you do specify should have values.
In an INSERT
statement such as INSERT INTO foo(a,b,c) VALUES(x,yz)
this error is indicating that there is a column in foo
(the one indicated in the error) which was not in the list (i.e. not one of a, b, c) and that column is neither nullable, nor does it have a default value. In order to insert a row a value must be provided. To fix this include the indicated column in your insert statement.
Adding an index to a virtual table isn’t possible, the virtual table includes whatever indexing its module provides, no further indexing is possible.
From the SQLite documentation: “One cannot create additional indices on a virtual table. (Virtual tables can have indices but that must be built into the virtual table implementation. Indices cannot be added separately using CREATE INDEX statements.)”
In an INSERT
statement attempting to insert into the indicated table name is not possible because there is no such table. This error might happen because of a typo, or it might happen because the indicated table has been marked with @delete
and is logically hidden.
In an INSERT
statement attempting to insert into the indicated name is not possible because that name is a view not a table. Inserting into views is not supported.
Adding a trigger to a virtual table isn’t possible.
From the SQLite documentation: “One cannot create a trigger on a virtual table.”
Several statements support the FROM ARGUMENTS
sugar format like INSERT INTO foo(a,b,c) FROM ARGUMENTS
which causes the arguments of the current procedure to be used as the values. This error is complaining that you have used this form but the statement does not occur inside of a procedure so there can be no arguments. This form does not make sense outside of any procedure.
This is not supported by SQLite.
From the SQLite documentation: “One cannot run ALTER TABLE … ADD COLUMN commands against a virtual table.”
Cursors come in two flavors. There are “statement cursors” which are built from something like this:
declare C cursor for select * from foo;
fetch C;-- or --
into a, b, c; fetch C
That is, they come from a SQLite statement and you can fetch values from that statement. The second type comes from procedural values like this.
declare C cursor like my_table;
from values(1, 2, 3); fetch C
In the second example C
’s data type will be the same as the columns in my_table
and we will fetch its values from 1,2,3
– this version has no database backing at all, it’s just data.
This error says that you declared the cursor in the first form (with a SQL statement) but then you tried to fetch it using the second form, the one for data. These forms don’t mix. If you need a value cursor for a row you can copy data from one cursor into another.
In a value cursor, declared something like this:
declare C cursor like my_table;
from values(1, 2, 3); fetch C
The type of the cursor ( in this case from my_table
) requires a certain number of columns, but that doesn’t match the number that were provided in the values.
To fix this you’ll need to add/remove values so that the type match.
In a value cursor, declared something like this:
declare C cursor like my_table;
from values(1, 2, 3); fetch C(a,b,c)
This error is saying that there is some other field in the table ‘d’ and it was not specified in the values. Nor was there a usable dummy data for that column that could be used. You need to provide a value for the missing column.
An insert/delete/update statement lacking the RETURNING clause does not produce any result therefore it is not suitable for use to fill a cursor or other such things that need results. With the RETURNING clause the same statement is, in very real way, behaving like a special SELECT. It can therefore be used in a cursor.
The indicated name was used in a context where an enumerated type name was expected but there is no such type.
Perhaps the enum was not included (missing a #include) or else there is a typo.
The operand of the CAST
expression is already the type that it is being cast to. The cast will do nothing but waste space in the binary and make the code less clear. Remove it.
In a scoped name list, like the columns of a cursor (for a fetch), or the columns of a particular table (for an index) a name appeared that did not belong to the universe of legal names. Trying to make a table index using a column that is not in the table would produce this error. There are many instances where a list of names belongs to some limited scope.
In a scoped name list, like the columns of a cursor (for a fetch), or the columns of a particular table (for an index) a name appeared twice in the list where the names must be unique. Trying to make a table index using the same column twice would produce this error.
In a SET
statement, the target of the assignment is not a valid variable name in that scope.
In a SET
statement, the target of the assignment is a cursor variable, you cannot assign to a cursor variable.
In a parameter list for a function or a procedure, the named parameter appears more than once. The formal names for function arguments must be unique.
There can only be one migration rule for a table or group, the indicated item already has such an action. If you need more than one migration action you can create a containing procedure that dispatches to other migrators.
Global constants must be either a combination other constants for numeric expressions or else string literals. The indicated expression was not one of those.
This can happen if the expression uses variables, or has other problems that prevent it from evaluating, or if a function is used that is not supported.
In an argument list, the LIKE
construct was used to create arguments that are the same as the return type of the named procedure. However the named procedure does not produce a result set and therefore has no columns to mimic. Probably the name is wrong.
Any shared fragment can have only one statement. There are three valid forms – IF/ELSE, WITH … SELECT, and SELECT.
This error indicates the named procedure, which is a shared fragment, has more than one statement.
In a procedure that returns a result either with a loose SELECT
statement or in a place where the result of a SELECT
is captured with a FETCH
statement the named column appears twice in the projection of the SELECT
in question. The column names must be unique in order to have consistent cursor field names or consistent access functions for the result set of the procedure. One instance of the named column must be renamed with something like select T1.foo first_foo, T2.foo second_foo
.
In a cql:autodrop
annotation, the given name is unknown entirely.
In a cql:autodrop
annotation, the given name is not a table (it’s probably a view).
In a cql:autodrop
annotation, the given name is a table but it is not a temp table. The annotation is only valid on temp tables, it’s not for “durable” tables.
The CREATE PROCEDURE
statement may not appear inside of another stored procedure. The named procedure appears in a nested context.
In a CREATE PROCEDURE
statement, the given name conflicts with an already declared function (DECLARE FUNCTION
or DECLARE SELECT FUNCTION
). You’ll have to choose a different name.
In a CREATE PROCEDURE
statement, the indicated name already corresponds to a created (not just declared) stored procedure. You’ll have to choose a different name.
N
for proc ‘name’The named procedure was declared as a schema migration procedure in an @create
or @delete
annotation for schema version N
. In order to correctly type check such a procedure it must be compiled in the context of schema version N
. This restriction is required so that the tables and columns the procedure sees are the ones that existed in version N
not the ones that exist in the most recent version as usual.
To create this condition, the procedure must appear in a file that begins with the line:
@schema_upgrade_version <N>;
And this declaration must come before any CREATE TABLE
statements. If there is no such declaration, or if it is for the wrong version, then this error will be generated.
The named procedure was declared as a schema migration procedure in an @create
or @delete
annotation, however the procedure does not have any DML in it. That can’t be right. Some kind of data reading and writing is necessary.
The named procedure was previously declared with a DECLARE PROCEDURE
statement but when the CREATE PROCEDURE
was encountered, it did not match the previous declaration.
In a context with a typed name list (e.g. id integer, t text
) the named column occurs twice. Typed name lists happen in many contexts, but a common one is the type of the result in a declared procedure statement or declared function statement.
A DECLARE FUNCTION
statement for the named function is happening inside of a procedure. This is not legal. To correct this move the declaration outside of the procedure.
The named function in a DECLARE FUNCTION
statement conflicts with an existing declared or created procedure. One or the other must be renamed to resolve this issue.
The named function in a DECLARE FUNCTION
statement conflicts with an existing declared function, or it was declared twice. One or the other declaration must be renamed or removed to resolve this issue.
A DECLARE PROCEDURE
statement for the named procedure is itself happening inside of a procedure. This is not legal. To correct this move the declaration outside of the procedure.
The named procedure in a DECLARE PROCEDURE
statement conflicts with an existing declared function. One or the other declaration must be renamed or removed to resolve this issue.
The named procedure was previously declared with a DECLARE PROCEDURE
statement. Now there is another declaration and it does not match the previous declaration
In a DECLARE
statement, a variable of the same name already exists in that scope. Note that CQL does not have block level scope, all variables are procedure level, so they are in scope until the end of the procedure. To resolve this problem, either re-use the old variable if appropriate or rename the new variable.
In a DECLARE
statement, the named variable is a global (declared outside of any procedure) and has the same name as a table or view. This creates a lot of confusion and is therefore disallowed. To correct the problem, rename the variable. Global variables generally are problematic, but sometimes necessary.
In a DECLARE
statement that declares a CURSOR FOR CALL
the procedure that is being called does not produce a result set with the SELECT
statement. As it has no row results it is meaningless to try to put a cursor on it. Probably the error is due to a copy/pasta of the procedure name.
In a DECLARE
statement that declares a CURSOR LIKE
another cursor, the indicated name is a variable but it is not a cursor, so we cannot make another cursor like it. Probably the error is due to a typo in the ‘like_name’.
In an INSERT
or FETCH
statement using the form FROM ARGUMENTS(LIKE [name])
The shape [name]
had columns that did not appear in as arguments to the current procedure. Maybe arguments are missing or maybe the name in the like
part is the wrong name.
In a DECLARE
statement that declares a CURSOR LIKE
some other name, the indicated name is not the name of any of the things that might have a valid shape to copy, like other cursors, procedures, tables, or views. Probably there is a typo in the name.
In the DECLARE [cursor_name] CURSOR FETCH FROM CALL <something>
form, the code is trying to create the named cursor by calling a procedure that doesn’t actually produce a single row result set with the OUT
statement. The procedure is valid (that would be a different error) so it’s likely that the wrong procedure is being called rather than being an outright typo. Or perhaps the procedure was changed such that it no longer produces a single row result set.
This form is equivalent to:
DECLARE [cursor_name] LIKE procedure;
FROM CALL procedure(args); FETCH [cursor_name]
It’s the declaration that’s failing here, not the call.
The indicated name appeared in a context where the name of a cursor was expected, but the name does not refer to a cursor.
There are many contexts where a list of names appears in the CQL grammar and the list must not contain duplicate names. Some examples are:
JOIN ... USING(x,y,z,...)
clauseFETCH [cursor] INTO x,y,z...
statementCTE(x,y,z,...) as (SELECT ...)
@declare_schema_region <name> USING x,y,z,...
The indicated name was duplicated in such a context.
In a procedure call, the indicated parameter of the procedure is an OUT or INOUT parameter but the call site doesn’t have a variable in that position in the argument list.
Example:
declare proc foo(out x integer);
-- the constant 1 cannot be used in the out position when calling foo
call foo(1); '
A shared fragment will be expanded into the body of a SQL select statement, as such it can have no side-effects such as out arguments.
In a procedure call, the indicated parameter is in an ‘out’ position, it is a viable local variable but it is not an exact type match for the parameter. The type of variable used to capture out parameters must be an exact match.
declare proc foo(out x integer);
create proc bar(out y real)
begin
call foo(y); -- y is a real variable, not an integer.
end;
The above produces:
out parameter: arg must be an exact type match
CQL0209: proc integer; found real) 'y' (expected
(expected expected_type; found actual_type) ‘variable_name’
In a procedure call, the indicated parameter is in an ‘out’ position, it is a viable local variable of the correct type but the nullability does not match. The type of variable used to capture out parameters must be an exact match.
declare proc foo(out x integer not null);
create proc bar(out y integer)
begin
call foo(y); -- y is nullable but foo is expecting not null.
end;
The above produces:
out parameter: arg must be an exact type match (even nullability)
CQL0210: proc integer notnull; found integer) 'y' (expected
In a function call, the target of the function call was a procedure, procedures can be used like functions but their last parameter must be marked out
. That will be the return value. In this case the last argument was not marked as out
and so the call is invalid.
Example:
declare proc foo(x integer);
create proc bar(out y integer)
begin
set y := foo(); -- foo does not have an out argument at the end
end;
In a procedure call to the named procedure, not enough arguments were provided to make the call. One or more arguments may have been omitted or perhaps the called procedure has changed such that it now requires more arguments.
In a procedure call to the named procedure, the target of the call had compilation errors. As a consequence this call cannot be checked and therefore must be marked in error, too. Fix the errors in the named procedure.
The named procedure results a result set, either with the SELECT
statement or the OUT
statement. However it is being called from outside of any procedure. Because of this, its result cannot then be returned anywhere. As a result, at the global level the result must be capture with a cursor.
Example:
create proc foo()
begin
select * from bar;
end;
call foo(); -- this is not valid
declare cursor C for call foo(); -- C captures the result of foo, this is ok.
In a FETCH
statement of the form FETCH [cursor]
or FETCH [cursor] INTO
the named cursor is a value cursor. These forms of the FETCH
statement apply only to statement cursors.
Example:good
-- value cursor shaped like a table
declare C cursor for select * from bar;
--ok, C is fetched from the select results
fetch C;
Example: bad
-- value cursor shaped like a table
declare C cursor like bar;
-- invalid, there is no source for fetching a value cursor
fetch C;-- ok assuming bar is made up of 3 integers
from values(1,2,3); fetch C
In a FETCH
statement, the indicated name, which is supposed to be a cursor, is not in fact a valid name at all.
Probably there is a typo in the name. Or else the declaration is entirely missing.
In a FETCH [cursor] INTO [variables]
the number of variables specified did not match the number of columns in the named cursor. Perhaps the source of the cursor (a select statement or some such) has changed.
The CONTINUE
statement may only appear inside of looping constructs. CQL only has two LOOP FETCH ...
and WHILE
The LEAVE
statement may only appear inside of looping constructs or the switch statement.
CQL has two loop types: LOOP FETCH ...
and WHILE
and of course the SWITCH
statement.
The errant LEAVE
statement is not in any of those.
In a ROLLBACK
statement that is rolling back to a named savepoint, the indicated savepoint was never mentioned before. It should have appeared previously in a SAVEPOINT
statement. This probably means there is a typo in the name.
In a RELEASE SAVEPOINT
statement that is rolling back to a named savepoint, the indicated savepoint was never mentioned before. It should have appeared previously in a SAVEPOINT
statement. This probably means there is a typo in the name.
The statement form OUT [cursor_name]
makes a procedure that returns a single row result set. It doesn’t make any sense to do this outside of any procedure because there is no procedure to return that result. Perhaps the OUT
statement was mis-placed.
The statement form OUT [cursor_name]
makes a procedure that returns a single row result set that corresponds to the current value of the cursor. If the cursor never held values directly then there is nothing to return.
Example:
declare C cursor for select * from bar;
out C; -- error C was never fetched
declare C cursor for select * from bar;
into x, y, z;
fetch C -- error C was used to load x, y, z so it's not holding any data
out C;
declare C cursor for select * from bar;
-- create storage in C to hold bar columns (e.g. C.x, C,y, C.z)
fetch C;-- ok, C holds data
out C;
Inside of a WITH clause you can create a CTE by calling a shared fragment like so:
WITH
my_shared_something(*) AS (CALL shared_proc(5))
SELECT * from my shared_something;
or you can use a nested select expression like
SELECT * FROM (CALL shared_proc(5)) as T;
However shared_proc
must define a shareable fragment, like so:
[[shared_fragment]]
create proc shared_proc(lim_ integer)
begin
select * from somewhere limit lim_;
end;
Here the target of the CALL is not a shared fragment.
The @previous_schema
directive says that any schema that follows should be compared against what was declared before this point. This gives CQL the opportunity to detect changes in schema that are not supportable.
The previous schema directive must be outside of any stored procedure.
Example:
@previous_schema; -- ok here
create proc foo()
begin
schema; -- nope
@previous end;
The @schema_upgrade_script
directive tells CQL that the code that follows is intended to upgrade schema from one version to another. This kind of script is normally generated by the --rt schema_upgrade
option discussed elsewhere. When processing such a script, a different set of rules are used for DDL analysis. In particular, it’s normal to declare the final versions of tables but have DDL that creates the original version and more DDL to upgrade them from wherever they are to the final version (as declared). Ordinarily these duplicate definitions would produce errors. This directive allows those duplications.
This error is reporting that the directive happened inside of a stored procedure, this is not allowed.
Example:
@schema_upgrade_script; -- ok here
create proc foo()
begin
-- nope
@schema_upgrade_script; end;
The @schema_upgrade_script
directive tells CQL that the code that follows is intended to upgrade schema from one version to another. This kind of script is normally generated by the --rt schema_upgrade
option discussed elsewhere. When processing such a script, a different set of rules are used for DDL analysis. In particular, it’s normal to declare the final versions of tables but have DDL that creates the original version and more DDL to upgrade them from wherever they are to the final version (as declared). Ordinarily these duplicate definitions would produce errors. This directive allows those duplications.
In order to do its job properly the directive must come before any tables are created with DDL. This error tells you that the directive came too late in the stream. Or perhaps there were two such directives and one is late in the stream.
When authoring a schema migration procedure that was previously declared in an @create
or @delete
directive, the code in that procedure expects to see the schema as it existed at the version it is to migrate. The @schema_upgrade_version
directive allows you to set the visible schema version to something other than the latest. There can only be one such directive.
This error says that the version you are trying to view is not a positive integer version (e.g version -2)
When authoring a schema migration procedure that was previously declared in an @create
or @delete
directive, the code in that procedure expects to see the schema as it existed at the version it is to migrate. The @schema_upgrade_version
directive allows you to set the visible schema version to something other than the latest. There can only be one such directive.
This error says that a second @schema_upgrade_version
directive has been found.
When authoring a schema migration procedure that was previously declared in an @create
or @delete
directive, the code in that procedure expects to see the schema as it existed at the version it is to migrate. The @schema_upgrade_version
directive allows you to set the visible schema version to something other than the latest. There can only be one such directive.
This error says that the @schema_upgrade_version
directive was found inside of a stored procedure. This is not allowed.
When authoring a schema migration procedure that was previously declared in an @create
or @delete
directive, the code in that procedure expects to see the schema as it existed at the version it is to migrate. The @schema_upgrade_version
directive allows you to set the visible schema version to something other than the latest. There can only be one such directive.
This error says that the @schema_upgrade_version
directive came after tables were already declared. This is not allowed, the directive must come before any DDL.
In a SELECT
expression like set x := (select id from bar)
the select statement must return exactly one column as in the example provided. Note that a runtime error will ensue if the statement returns zero rows, or more than one row, so this form is very limited. To fix this error change your select statement to return exactly one column. Consider how many rows you will get very carefully also, that cannot be checked at compile time.
When authoring a schema migration procedure that was previously declared in an @create
or @delete
directive that procedure will be called during schema migration with no context available. Therefore, the schema migration proc is not allowed to have any arguments.
The named procedure has the autodrop
annotation (to automatically drop a temporary table) but the procedure in question doesn’t return a result set so it has no need of the autodrop feature. The purpose that that feature is to drop the indicated temporary tables once all the select results have been fetched.
In a CALL
statement, or a function call, the named procedure takes fewer arguments than were provided. This error might be due to some copy/pasta going on or perhaps the argument list of the procedure/function changed to fewer items. To fix this, consult the argument list and adjust the call accordingly.
The named procedure has the autodrop
annotation (to automatically drop a temporary table) but the procedure in question doesn’t even use the database at all, much less the named table. This annotation is therefore redundant.
@enforce_strict
has been use to enable strict foreign key enforcement. When enabled every foreign key must have an action for the ON UPDATE
rule. You can specify NO ACTION
but you can’t simply leave the action blank.
@enforce_strict
has been use to enable strict foreign key enforcement. When enabled every foreign key must have an action for the ON DELETE
rule. You can specify NO ACTION
but you can’t simply leave the action blank.
The [[identity=(col1, col2, ...)]]
form has been used to list the identity columns of a stored procedures result set. These columns must exist in the result set and they must be unique. The indicated column name is not part of the result of the procedure that is being annotated.
The [[vault_sensitive=(col1, col2, ...]]
form has been used to list the columns of a stored procedures result set. These columns must exist in the result set. The indicated column name will be encoded if they are sensitive and the cursor that produced the result_set is a DML.
The [[identity=(col1, col2,...)]]
form has been used to list the identity columns of a stored procedures result set. These columns must exist in the result set and they must be unique. In this case, the named procedure doesn’t even return a result set. Probably there is a copy/pasta going on. The identity attribute can likely be removed.
The SQLite ||
operator has complex string conversion rules making it impossible to faithfully emulate. Since there is no helper function for doing concatenations, CQL choses to support this operator only in contexts where it will be evaluated by SQLite. That is, inside of some SQL statement.
Examples:
declare X text;
set X := 'foo' || 'bar'; -- error
set X := (select 'foo' || 'bar'); -- ok
If concatenation is required in some non-sql context, use the (select ..)
expression form to let SQLite do the evaluation.
There is an explicit (set x := y
) or implicit assignment (e.g. conversion of a parameter) where the storage for the target is a smaller numeric type than the expression that is being stored. This usually means a variable that should have been declared LONG
is instead declared INTEGER
or that you are typing to pass a LONG to a procedure that expects an INTEGER
We explicitly do not wish to support string concatenation for blobs that holds non-string data. If the blob contains string data, make your intent clear by converting it to string first using CAST
before doing the concatenation.
In a @declare_schema_region
statement one of the USING regions is not a valid region name. Or in @begin_schema_region
the region name is not valid. This probably means there is a typo in your code.
The indicated region was previously defined, it cannot be redefined.
Another @begin_schema_region
directive was encountered before the previous @end_schema_region
was found.
An @end_schema_region
directive was encountered but there was no corresponding @begin_schema_region
directive.
All of the *_schema_region
directives must be used at the top level of your program, where tables are typically declared. They do not belong inside of procedures. If you get this error, move the directive out of the procedure near the DDL that it affects.
The indicated identifier appears in the context of a table, it is a function, but it is not a table-valued function. Either the declaration is wrong (use something like declare select function foo(arg text) (id integer, t text)
) or the name is wrong. Or both.
In a select statement, there is a reference to the indicated table-valued-function. For instance:
-- the above error happens if my_function has not been declared
-- as a table valued function
select * from my_function(1,2,3);
However , my_function
has not been declared as a function at all. A correct declaration might look like this:
declare select function my_function(a int, b int, c int)
int, y text); (x
Either there is a typo in the name or the declaration is missing, or both…
An @PROC literal was used outside of any procedure. It cannot be resolved if it isn’t inside a procedure.
@schema_upgrade_version
was usedWhen authoring a schema migration script (a stored proc named in an @create
or @delete
annotation) you must create that procedure in a file that is marked with @schema_upgrade_verison
specifying the schema version it is upgrading. If you do this, then the proc (correctly) only sees the schema as it existed at that version. However that makes the schema unsuitable for analysis using @previous_schema
because it could be arbitrarily far in the past. This error prevents you from combining those features. Previous schema validation should only happen against the current schema.
The indicated function (usually min or max) only works on strings and numerics. NULL
literals, blobs, or objects are not allowed in this context.
The form:
SET [name] FROM CURSOR [cursor_name]
Is used to wrap a cursor in an object so that it can be returned for forwarded. This is the so-called “boxing” operation on the cursor. The object can then be “unboxed” later to make a cursor again. However the point of this is to keep reading forward on the cursor perhaps in another procedure. You can only read forward on a cursor that has an associated SQLite statement. That is the cursor was created with something like this
DECLARE [name] CURSOR FOR SELECT ... | CALL ...
If the cursor isn’t of this form it’s just values, you can’t move it forward and so “boxing” it is of no value. Hence not allowed. You can return the cursor values with OUT
instead.
The LIKE [procedure]
ARGUMENTS` form creates a shape for use in a cursor or procedure arguments.
The indicated name is a procedure with no arguments so it cannot be used to make a shape.
You can enable strict enforcement of joins to avoid the form
select * from A, B;
which sometimes confuses people (the above is exactly the same as
select * from A inner join B on 1;
Usually there are constraints on the join also in the WHERE clause but there don’t have to be.
@enforce_strict join
turns on this mode.
The new unique key must have at least one column that is not in a previous key AND it must not have all the columns from any previous key.
e.g:
create table t1 (
int,
a long,
b
c text,real,
d UNIQUE (a, b),
UNIQUE (a, b, c), -- INVALID (a, b) is already unique key
UNIQUE (b, a), -- INVALID (b, a) is the same as (a, b)
UNIQUE (c, d, b, a), -- INVALID subset (b, a) is already unique key
UNIQUE (a), -- INVALID a is part of (a, b) key
UNIQUE (a, c), -- VALID
UNIQUE (d), -- VALID
UNIQUE (b, d) -- VALID
);
If you are calling a procedure that returns a value cursor (using OUT
) then you accept that cursor using the pattern
DECLARE C CURSOR FETCH FROM CALL foo(...);
The pattern
DECLARE C CURSOR FOR CALL foo(...);
Is used for procedures that provide a full select
statement.
Note that in the former cause you don’t then use fetch
on the cursor. There is at most one row anyway and it’s fetched for you so a fetch would be useless. In the second case you fetch as many rows as there are and/or you want.
select * from foo offset 1;
Is not supported by SQLite. OFFSET
may only be used if LIMIT
is also present. Also, both should be small because offset is not cheap. There is no way to do offset other than to read and ignore the indicated number of rows. So something like offset 1000
is always horrible.
If you’re creating a table t2 with foreign keys on table t1, then the set of t1’s columns reference in the foreign key statement for table t2 should be:
t1
e.g:create table t1(a text primary key);
create table t2(a text primary key, foreign key(a) references t1(a));
t1
e.g:create table t1(a text unique);
create table t2(a text primary key, foreign key(a) references t1(a));
t1
e.g:create table t1(a text, b int, unique(a, b));
create table t2(a text, b int, foreign key(a, b) references t1(a, b));
t1
e.g:create table t1(a text, b int, primary key(a, b));
create table t2(a text, b int, foreign key(a, b) references t1(a, b));
t1
e.g:create table t1(a text, b int);
create unique index unq on t1(a, b);
create table t2(a text, b int, foreign key(a, b) references t1(a, b));
In a cql:autotest
annotation, the given dummy_test info (table name, column name, column value) has incorrect format.
In a cql:autotest
annotation, the given table name for dummy_test attribute does not exist.
In a cql:autotest
annotation, the given column name for dummy_test attribute does not exist.
In a cql:autotest
annotation, the given column value’s type for dummy_test attribute does not match the column type.
In a cql:autotest
annotation, the format is incorrect.
In a cql:autotest
annotation, the given attribute name is not valid.
If you’re doing an UPSERT on table T
, the columns listed in the conflict target should be: - A primary key in T
- A unique key in T
- A group of unique key in T
- A group of primary key in T
- A unique index in T
When the INSERT
statement to which the UPSERT is attached takes its values from a SELECT
statement, there is a potential parsing ambiguity. The SQLite parser might not be able to tell if the ON
keyword is introducing the UPSERT or if it is the ON
clause of a join. To work around this, the SELECT
statement should always include a WHERE
clause, even if that WHERE
clause is just WHERE 1
(always true).
NOTE: The CQL parser doesn’t have this ambiguity because it treats “ON CONFLICT” as a single token so this is CQL reporting that SQLite might have trouble with the query as written. e.g:
insert into foo select id from bar where 1 on conflict(id) do nothing;
The UPDATE statement of and UPSERT should not include the table name because the name is already known from the INSERT statement part of the UPSERT. e.g:
insert into foo select id from bar where 1 on conflict(id) do update set id=10;
The UPDATE statement should always include a table name except if the UPDATE statement is part of an UPSERT statement. e.g:
update foo set id=10;
insert into foo(id) values(1) do update set id=10;
The INSERT statement part of an UPSERT statement can only uses INSERT INTO ...
e.g:
insert into foo(id) values(1) on conflict do nothing;
insert into foo(id) values(1) on conflict do update set id=10;
@schema_ad_hoc_migration
must provide both a version number and a migrate procedure name. This is unlike the other version directives like @create
where the version number is optional. This is because the whole point of this migrator is to invoke a procedure of your choice.
In @schema_ad_hoc_migration
you cannot change the version number of the directive once it has been added to the schema because this could cause inconsistencies when upgrading.
You can change the body of the method if you need to but this is also not recommended because again there could be inconsistencies. However careful replacement and compensation is possible. This is like going to 110% on the reactor… possible, but not recommended.
An @schema_ad_hoc_migration
cannot be removed because it could cause inconsistencies on upgrade.
You can change the body of the method if you need to but this is also not recommended because again there could be inconsistencies. However careful replacement and compensation is possible. This is like going to 110% on the reactor… possible, but not recommended.
@enforce_strict
has been use to enable strict upsert statement enforcement. When enabled all sql statement should not use the upsert statement. This is because sqlite version running in some iOS and Android version is old. Upsert statement was added to sqlite in the version 3.24.0 (2018-06-04).
<deployable_region>
not into the middle: <error_region>
Deployable regions have an “inside” that is in some sense “private”. In order to keep the consistent (and independently deployable) you can’t peek into the middle of such a region, you have to depend on the root (i.e. <deployable_region>
itself). This allows the region to remain independently deployable and for its internal logical regions to be reorganized in whatever manner makes sense.
To fix this error probably you should change error_region
so that it depends directly on deployable_region
--dev
mode because its result set may vary between sqlite versionsThe EXPLAIN statement is intended for debugging and analysis only. It helps engineer understand how Sqlite will execute their query and the cost attached to it. SQLite can and does change the shape of the results of the explain statement and so it should not be used in production code. This is why this statement is only available in --dev
mode in CQL indicating “not for production” output.
CQL only support [EXPLAIN QUERY PLAN stmt] sql statement.
Not all SQLite builtin function can be used as a window function.
Window name referenced in the select list should be defined in the Window clause of the same select statement.
Window name defined in the window clause of a select statement should always be used within that statement.
In this form:
insert into YourTable() FROM your_cursor;
The ()
means no columns are being specified, the cursor will never be used. The only source of columns is maybe dummy data (if it was specified) or the default values or null. In no case will the cursor be used. If you really want this use FROM VALUES()
and don’t implicate a cursor or an argument bundle.
The cursor in question has no storage associated with it. It was loaded with something like:
fetch C into x, y, z;
You can only use a cursor as a source of data if it was fetched with its own storage like
fetch C
This results in a structure for the cursor. This gives you C.x
, C.y
, C.z
etc.
If you fetched the cursor into variables then you have to use the variables for any inserting.
The named shape was used in a fetch statement but the number of columns fetched is smaller than the number required by the statement we are processing.
If you need to use the cursor plus some other data then you can’t use this form, you’ll have to use each field individually like
from values(C.x, C.y, C.z, other_stuff)
The shape with too few fields might be the source or the target of the statement.
The argument of the function should be an integer.
The second argument of the function should be an integer between 0 and INTEGER_MAX.
The first and third arguments of the function have to be of the same type because the third argument provide a default value in cause the first argument is NULL.
The second argument of the function must be and integer between 1 and INTEGER_MAX.
The keyword DISTINCT can only be used with one argument in an aggregate function.
Only aggregated functions and user defined functions can use the keyword DISTINCT. Others type of functions are not allowed to use it.
There are basically two checks here both of which have to do with the “nesting level” at which the return
occurs.
A loose return
statement (not in a procedure) is meaningless so that produce an error. There is nothing to return from.
If the return statement is not inside of an “if” or something like that then it will run unconditionally. Nothing should follow the return (see CQL0308) so if we didn’t fall afoul of CQL0308 and we’re at the top level then the return is the last thing in the proc, in which case it is totally redundant.
Both these situations produce an error.
Control flow will exit the containing procedure after a return
statement, so any statements that follow in its statement list will certainly not run. So the return statement must be the last statement, otherwise there are dead/unreachable statements which is most likely done by accident.
To fix this probably the things that came after the return should be deleted. Or alternately there was a condition on the return that should have been added but wasn’t, so the return should have been inside a nested statement list (like the body of an if
maybe).
The indicated table was newly added – it is not present in the previous schema. However the version number it was added at is in the past. The new table must appear at the current schema version or later. That version is provided in the error message.
To fix this, change the @create
annotation on the table to be at the indicated version or later.
The indicated column was newly added – it is not present in the previous schema. However the version number it was added at is in the past. The new column must appear at the current schema version or later. That version is provided in the error message.
To fix this, change the @create
annotation on the table to be at the indicated version or later.
An object may not move between deployment regions, because users of the schema will depend on its contents. New objects can be added to a deployment region but nothing can move from one region to another. The indicated object appears to be moving.
@enforce_strict
has been use to enable strict window function enforcement. When enabled all sql statement should not invoke window function. This is because sqlite version running in some iOS version is old. Window function was added to SQLite in the version 3.25.0 (2018-09-15).
CQL (currently) limits use of blob literals to inside of SQL fragments. There’s no easy way to get a blob constant variable into the data section so any implementation would be poor. These don’t come up very often in any case so this is a punt basically. You can fake it with (select x’1234’) which makes it clear that you are doing something expensive. This is not recommended. Better to pass the blob you need into CQL rather than cons it from a literal. Within SQL it’s just text and SQLite does the work as usual so that poses no problems. And of course non-literal blobs (as args) work find and are bound as usual.
CQL built-in function does not require a select function declaration. You can used it directly in your SQL statement.
Columns on a table must have default value or be nullable in order to use INSERT INTO <table>
DEFAULT VALUES statement.
INSERT
statement with DEFAULT VALUES
can not be used in a upsert statement. This form is not supported by SQLite.
The indicated name is an index or a trigger. These objects may not have a migration script associated with them when they are deleted.
The reason for this is that both these types of objects are attached to a table and the table itself might be deleted. If the table is deleted it becomes impossible to express even a tombstone for the deleted trigger or index without getting errors. As a consequence the index/trigger must be completely removed. But if there had been a migration procedure on it then some upgrade sequences would have run it, but others would not (anyone who upgraded after the table was deleted would not get the migration procedure). To avoid this problem, migration procedures are not allowed on indices and triggers.
If you get this error it means that there is a typo in the name of the procedure you are trying to call, or else the declaration for the procedure is totally missing. Maybe a necessary #include
needs to be added to the compiland.
Previously if you attempted to call an unknown CQL would produce a generic function call. If you need to do this, especially a function with varargs, then you must declare the function with something like:
DECLARE PROCEDURE printf NO CHECK;
This option only works for void functions. For more complex signatures check DECLARE FUNCTION
and DECLARE SELECT FUNCTION
. Usually these will require a simple wrapper to call from CQL.
In all cases there must be some kind of declaration,to avoid mysterious linker failures or argument signature mismatches.
In a foreign key, we enforce the following rules:
@recreate
tables can see any version they like, if the name is in scope that’s good enoughNormal processing can’t actually get into this state because if you tried to create the referencing table with the smaller version number first you would get errors because the name of the referenced table doesn’t yet exist. But if you created them at the same time and you made a typo in the version number of the referenced table such that it was accidentally bigger you’d create a weirdness. So we check for that situation here and reject it to prevent that sort of typo.
If you see this error there is almost certainly a typo in the version number of the referenced table; it should be fixed.
The values for the attribute ok_table_scan
can only be names.
CQL attributes can have a variety of values but in this case the attribute refers to the names of tables so no other type of attribute is reasonable.
The names provided to ok_table_scan
attribute should be names of existing tables.
The attribute indicates tables that are ok to scan in this procedure even though they are typically not ok to scan due to ‘no_table_scan’. Therefore the attribute must refer to an existing table. There is likely a typo in the the table name that needs to be corrected.
The attribute attribute_name
doesn’t take a value.
When marking a statement with [[<attribute_name>]]
there is no need for an attribute value.
The attribute_name
attribute can only be assigned to specific statements.
The marking [[<attribute_name>]]
only makes sense on specific statement. It’s likely been put somewhere strange, If it isn’t obviously on the wrong thing, look into possibly how the source is after macro expansion.
The ok_table_scan
can only be placed on a create procedure statement.
The marking [[ok_table_scan=...]]
indicates that the procedure may scan the indicated tables. This marking doesn’t make sense on other kinds of statements.
Dummy insert feature makes only sense when it’s used in a VALUES clause that is not part of a compound select statement.
VALUES clause requires at least a value for each of the values list. Empty values list are not supported.
The number of values for each values list in VALUES clause should always be the same.
To avoid name conflicts in the upgrade script, migration procedures are not allowed to end in ’_crc’ this suffix is reserved for internal use.
@enforce_strict
has been used to enable strict WITHOUT ROWID
enforcement. When enabled no CREATE TABLE statement can have WITHOUT ROWID clause.
The named procedure has a call that uses the FROM ARGUMENTS pattern but it doesn’t have any arguments. This is almost certainly a cut/paste from a different location that needs to be adjusted.
The argument for the CQL builtin function cql_get_blob_size should always be of type blob
Functions like ifnull_crash
only make sense if the argument is nullable. If it’s already not null the operation is uninteresting/redundant.
The most likely cause is that the function call in question is vestigial and you can simply remove it.
object<T cursor>
where T is a valid shape name ‘variable’It’s possible to take the statement associated with a statement cursor and store it in an object variable. Using the form:
declare C cursor for X;
The object variable ‘X’ must be declared as follows:
declare X object<T cursor>;
Where T
refers to a named object with a shape, like a table, a view, or a stored procedure that returns a result set. This type T
must match the shape of the cursor exactly i.e. having the column names and types.
The reverse operation, storing a statement cursor in a variable is also possible with this form:
set X from cursor C;
This has similar constraints on the variable X
.
This error indicates that the variable in question (X
in this example) is not a typed object variable so it can’t be the source of a cursor, or accept a cursor.
See Chapter 5 of the CQL Programming Language.
The indicated function was declared with DECLARE SELECT FUNCTION
meaning it is to be used in the context of SQLite statements. However, SQLite doesn’t understand functions that return type object at all. Therefore declaration is illegal.
When working with pointer type through SQLite it is often possibly to encode the object as an long integer assuming it can pass through unchanged with no retain/release semantics or any such thing. If that is practical you can move objects around by returning long integers.
Collation order really only makes sense on text fields. Possibly blob fields but we’re taking a stand on blob for now. This can be relaxed later if that proves to be a mistake. For now, only text
In a CREATE TABLE statement, the indicated column name came after a constraint. SQLite expects all the column definitions to come before any constraint definitions. You must move the offending column definition above the constraints.
The ROLLBACK RETURN
and COMMIT RETURN
forms are only usable inside of a PROC SAVEPOINT
block because they rollback or commit the savepoint that was created at the top level.
The indicated statement may only appear inside procedure and not nested. The classic example of this is the PROC SAVEPOINT
form which can only be used at the top level of procedures.
The normal RETURN
statement cannot be used inside of PROC SAVEPOINT
block, you have to indicate if you want to commit or rollback the savepoint when you return. This makes it impossible to forget to do so which is in some sense the whole point of PROC SAVEPOINT
.
The constant expression could not be evaluated. This is most likely because it includes an operator that is not supported or a function call which is not support. Very few functions can be used in constant expressions The supported functions include iif
, which is rewritten; abs
; ifnull
, nullif
, and coalesce
.
While processing a declare enum
statement the indicated member of the enum appeared twice.
This is almost certainly a copy/paste of the same enum member twice.
While processing a declare enum
statement the indicated member of the enum could not be evaluated as a constant expression.
There could be a non-constant in the expression or there could be a divide-by-zero error.
The two described declare enum
statements have the same name but they are not identical.
The error output contains the full text of both declarations to compare.
The indicated member is not part of the enumeration.
A DECLARE ENUM
statement for the named enum is happening inside of a procedure. This is not legal.
To correct this move the declaration outside of the procedure.
If a type name is used twice then the two or more declarations must be identical. The conflicting types will be included in the output and printed in full.
The indicated name is not a valid type name.
Return data type in a create function definition can only be TEXT, BLOB or OBJECT.
These are the only reference types and so CREATE makes sense only with those types. An integer, for instance, can’t start with a +1 reference count.
In order to ensure that SQLite will parse HIDDEN as part of the type it has to come before any other attributes like NOT NULL.
This limitation is due to the fact that CQL and SQLite use slightly different parsing approaches for attributes and in SQLite HIDDEN isn’t actually an attribute. The safest place to put the attribute is right after the type name and before any other attributes as it is totally unambiguous there so CQL enforces this.
Each @enforce_pop
should match an @enforce_push
, but there is nothing to pop on the stack now.
@enforce_strict transaction
has been used, while active no transaction operations are allowed. Savepoints may be used. This is typically done to prevent transactions from being used in any ad hoc way because they don’t nest and typically need to be used with some “master plan” in mind.
In the indicated type declaration, the indicated attribute was specified twice. This is almost certainly happening because the line in question looks like this declare x type_name not null;
but type_name
is already not null
.
@enforce_strict select if nothing
has been enabled. This means that select expressions must include if nothing then throw
(the old default) if nothing then [value]
or if nothing or null then [value]
. This options exists because commonly the case where a row does not exist is not handled correctly when (select ...)
is used without the if nothing
options.
If your select expression uses a built-in aggregate function, this check may not be enforced because they can always return a row. But there are exceptions. The check is still enforced when one of the following is in the expression: - a GROUP BY
clause - a LIMIT
that evaluates to less than 1, or is a variable - an OFFSET
clause - You have a min
or max
function with more than 1 argument. Those are scalar functions.
This form allows for error control of (select…) expressions. But SQLite does not understand the form at all, so it can only appear at the top level of expressions where CQL can strip it out. Here are some examples:
good:
set x := (select foo from bar where baz if nothing then 0);
if (select foo from bar where baz if nothing then 1) then ... end if;
bad:
select foo from bar where (select something from somewhere if nothing then null);
delete from foo where (select something from somewhere if nothing then 1);
Basically if you are already in a SQL context, the form isn’t usable because SQLite simply doesn’t understand if nothing at all. This error makes it so that you’ll get a build time failure from CQL rather than a run time failure from SQLite.
There is an unfortunate memory leak in older versions of SQLite (research pending on particular versions, but 3.28.0 has it). It causes this pattern to leak:
-- must be autoinc table
create table x (
integer primary key autoincrement
pk
);
-- any join will do (this is a minimal repro)
insert into x
select NULL pk from
select 1) t1 inner join (select 1) t2; (
You can workaround this with a couple of fairly simple rewrites. This form is probably the cleanest.
with
as (select .. anything you need)
cte (pk) insert into x
select * from cte;
Simply wrapping your desired select in a nested select also suffices. So long as the top level is simple.
insert into x
select * from (
select anything you need....
);
This error is generated by @enforce_strict table function
. It is there to allow safe use of Table Valued Functions (TVFs) even though there was a bug in SQLite prior to v 3.31.0 when joining against them. The bug appears when the TVF is on the right of a left join. For example:
select * from foo left join some_tvf(1);
In this case the join becomes an INNER join even though you wrote a left join. Likewise
select * from some_tvf(1) right join foo;
Becomes an inner join even though you wrote a right join. The same occurs when a TVF is on either side of a cross join.
The workaround is very simple. You don’t want the TVF to be the target of the join directly. Instead:
with tvf_(*) as (select * from some_tvf(1))
select * from foo left join tvf_;
OR
select * from foo left join (select * from some_tvf(1));
It is always the case that SELECT ... IF NOTHING OR NULL THEN NULL
is equivalent to SELECT ... IF NOTHING THEN NULL
. As such, do not do this:
select foo from bar where baz if nothing or null then null
Do this instead:
select foo from bar where baz if nothing then null
Attempting to check if some value x
is NULL via x = NULL
or x == NULL
, or isn’t NULL via x <> NULL
or x != NULL
, will always produce NULL regardless of the value of x
. Instead, use x IS NULL
or x IS NOT NULL
to get the expected boolean result.
CQL found a redundant select operation (e.g., set x := (select NULL);
).
There is no need to write a select expression that always evaluates to NULL. Simply use NULL instead (e.g., set x := NULL;
).
@recreate
to @create
must use @create(nn,cql:from_recreate)
‘table name’The indicated table is moving from @recreate
to @create
meaning it will now be schema managed in an upgradable fashion. When this happens end-user databases might have some stale version of the table from a previous installation. This stale version must get a one-time cleanup in order to ensure that the now current schema is correctly applied. The cql:from_recreate
annotation does this. It is required because otherwise there would be no record that this table “used to be recreate” and therefore might have an old version still in the database.
A correct table might look something like this:
create table correct_migration_to_create(
id integer primary key,
t text7, cql:from_recreate); ) @create(
The indicated name is a valid built-in migration procedure but it is not valid on this kind of item. For instance cql:from_recreate
can only be applied to tables.
Certain schema migration steps are built-in. Currently the only one is cql:from_recreate
for moving to @create
from @recreate
. Others may be added in the future. The cql:
prefix ensures that this name cannot conflict with a valid user migration procedure.
In a SWITCH
statement each expression each expression in a WHEN
clause must be made up of constants and simple numeric math operations. See the reference on the const(..)
expression for the valid set.
It’s most likely that a variable or function call appears in the errant expression.
The SWITCH
statement can only switch over integers or long integers. It will be translated directly to the C switch statement form. TEXT
, REAL
, BLOB
, BOOL
, and OBJECT
cannot be used in this way.
The WHEN
expression evaluates to a LONG INTEGER
but the expression in the SWITCH
is INTEGER
.
The ALL VALUES
form of switch means that: * the SWITCH
expression is an enumerated type * the WHEN
cases will completely cover the values of the enum
If you allow the ELSE
form then ALL VALUES
becomes meaningless because of course they are all covered. So with ALL VALUES
there can be no ELSE
.
You can list items that have no action with this form:
WHEN 10, 15 THEN NOTHING -- explicitly do nothing in these cases so they are still covered
No code is generated for such cases.
Either there were no WHEN
clauses at all, or they were all WHEN ... THEN NOTHING
so there is no actual code to execute. You need to add some cases that do work.
In a SWITCH
statement all of the values in the WHEN
clauses must be unique. The indicated errant entry is a duplicate.
In a SWITCH
statement with ALL VALUES
specified the switch expression was not an enumerated type. ALL VALUES
is used to ensure that there is a case for every value of an enumerated type so this switch cannot be so checked. Either correct the expression, or remove ALL VALUES
.
In a SWITCH
statement with ALL VALUES
specified the errant enum member did not appear in any WHEN
clause. All members must be specified when ALL VALUES
is used.
In a SWITCH
statement with ALL VALUES
specified the errant integer value appeared in in a WHEN
clause. This value is not part of the members of the enum. Note that enum members that begin with ’_’ are ignored as they are, by convention, considered to be pseudo-members. e.g. in declare enum v integer (v0 = 0, v1 =1, v2 =2, _count = 3)
_count
is a pseudo-member.
The errant entry should probably be removed. Alternatively, ALL VALUES
isn’t appropriate as the domain of the switch is actually bigger than the domain of the enumeration. One of these changes must happen.
The purpose of the DECLARE OUT
form is to automatically declare the out parameters for that procedure.
This cannot be done if the type of the procedure is not yet known.
The DECLARE OUT CALL
form was used, but the procedure has no OUT
arguments that need any implicit declaration. Either they have already all been declared or else there are no OUT
arguments at all, or even no arguments of any kind.
When a cursor is boxed—i.e., wrapped in an object—the lifetime of the box and underlying statement are automatically managed via reference counting. Accordingly, it does not make sense to manually call CLOSE on such a cursor as it may be retained elsewhere. Instead, to allow the box to be freed and the underlying statement to be finalized, set all references to the cursor to NULL.
Note: As with all other objects, boxed cursors are automatically released when they fall out of scope. You only have to set a reference to NULL if you want to release the cursor sooner, for some reason.
When the schema upgrader runs, if the virtual table is deleted it will attempt to do DROP TABLE IF EXISTS
on the indicated table. This table is a virtual table. SQLite will attempt to initialize the table even when you simply try to drop it. For that to work the module must still be present. This means modules can never be deleted! This attribute is here to remind you of this fact so that you are not tempted to delete the module for the virtual table when you delete the table. You may, however, replace it with a shared do-nothing stub.
The attribute itself does nothing other than hopefully cause you to read this documentation.
CHECK
expressions and partial indexes (CREATE INDEX
with a WHERE
clause) require that the expressions be deterministic. User defined functions may or may not be deterministic.
Use [[deterministic) on a UDF declaration (declare select function…]] to mark it deterministic and allow its use in an index.
SQLite does not allow the use of correlated subqueries or other embedded select statements inside of a CHECK expression or the WHERE clauses of a partial index. This would require additional joins on every such operation which would be far too expensive.
A table valued function should be used like a table e.g.
-- this is right
select * from table_valued_func(5);
Not like a value e.g.
-- this is wrong
select table_valued_func(5);
-- this is also wrong
select 1 where table_valued_func(5) = 3;
If you are putting DDL inside of a procedure then that is going to run regardless of any @create
, @delete
, or @recreate
attributes;
DDL in entires do not get versioning attributes, attributes are reserved for schema declarations outside of any procedure.
This error is about either a trigger or an index. In both cases you are trying to use @delete
on the index/trigger but the table that the named object is based on is itself deleted, so the object is an orphan. Because of this, the orphaned object doesn’t need, or no longer needs, an @delete
tombstone because when the table is dropped, all of its orphaned indices and triggers will also be dropped.
To fix this error, remove the named object entirely rather than marking it @delete
.
Note: if the index/trigger was previously deleted and now the table is also deleted, it is now safe to remove the index/trigger @delete
tombstone and this error reminds you to do so.
When specifying an ORDER BY
for a compound select, you may only order by indices (e.g., 3
) or names (e.g., foo
) that correspond to an output column, not by the result of an arbitrary expression (e.g., foo + bar
).
For example, this is allowed:
SELECT x, y FROM t0 UNION ALL select x, y FROM t1 ORDER BY y
The equivalent using an index is also allowed:
SELECT x, y FROM t0 UNION ALL select x, y FROM t1 ORDER BY 2
This seemingly equivalent version containing an arbitrary expression, however, is not:
SELECT x, y FROM t0 UNION ALL select x, y FROM t1 ORDER BY 1 + 1;
The indicated table changed from @recreate
to @create
but it did so in a past schema version. The change must happen in the current schema version. That version is indicated by the value of nn.
To fix this you can change the @create
annotation so that it matches the number in this error message
The indicated operator has been suppressed with @enforce_strict is true
because it is not available on older versions of sqlite.
The construct:
DECLARE PROCEDURE printf NO CHECK;
Is used to tell CQL about an external procedure that might take any combination of arguments. The canonical example is printf
. All the arguments are converted from CQL types to basic C types when making the call (e.g. TEXT variables become temporary C strings). Once a procedure has been declared in this way it can’t then also be declared as a normal CQL procedure via CREATE
or DECLARE PROCEDURE
. Likewise a normal procedure can’t be redeclared with the NO CHECK
pattern.
If a procedure has no known type—that is, it was originally declared with NO CHECK
, and has not been subsequently re-declared with DECLARE FUNCTION
or DECLARE SELECT FUNCTION
—it is not possible to use it in an expression. You must either declare the type before using it or call the procedure outside of an expression via a CALL
statement:
DECLARE PROCEDURE some_external_proc NO CHECK;
-- This works even though `some_external_proc` has no known type
-- because we're using a CALL statement.
CALL some_external_proc("Hello!");
DECLARE FUNCTION some_external_proc(t TEXT NOT NULL) INT NOT NULL;
-- Now that we've declared the type, we can use it in an expression.
:= some_external_proc("Hello!"); let result
A common mistake with substr is to assume it uses zero based indices like C does. It does not. In fact the result when using 0 as the second argument is not well defined. If you want the first n
characters of a string you use substr(haystack, 1, n)
.
CQL 0407 available for re-use
If the left side of an IS NULL
or IS NOT NULL
expression is of a NOT NULL
type, the answer will always be the same (FALSE
or TRUE
, respectively). Such a check often indicates confusion that may lead to unexpected behavior (e.g., checking, incorrectly, if a cursor has a row via cursor IS NOT NULL
).
NOTE: Cursor fields of cursors without a row and uninitialized variables of a NOT NULL reference type are exceptions to the above rule: Something may be NULL even if it is of a NOT NULL type in those cases. CQL will eventually eliminate these exceptions. In the cursor case, one can check whether or not a cursor has a row by using the cursor-as-boolean-expression syntax (e.g.,
IF cursor THEN ... END IF;
,IF NOT cursor ROLLBACK RETURN;
, et cetera). In the uninitialized variables case, writing code that checks for initialization is not recommended (and, indeed, use before initialization will soon be impossible anyway): One should simply always initialize the variable.
The same flag cannot be used more than once per substitution within a format string.
It is not sensible to use both the +
flag and the space flag within the same substitution (e.g., %+ d
) as it is equivalent to just using the +
flag (e.g., %+d
).
The flag used (-
or 0
) for a substitution within a format string does not make sense unless accompanied by a width (e.g., %-10d
).
The use of the l
length specifier within a format string, e.g. %ld
, has no effect in SQLite. If the argument is to be a LONG
, use ll
instead (e.g., %lld
). If the argument is to be an INTEGER
, simply omit the length specifier entirely (e.g., %d
).
Length specifiers are only for use with integer type specifiers (e.g. %lld
) and the !
flag is only for use with non-integer type specifiers (e.g. %!10s
and %!f
). It therefore makes no sense to use both within the same substitution.
The type specifier used is accepted by SQLite, but it would be either useless or unsafe if used within the context of CQL.
The type specifier used within the format string is not known to SQLite.
The type specifier provided does not make sense given one or more flags that appear within the same substitution. For example, it makes no sense to have a substitution like %+u
: the +
indicates the sign of the number will be shown, while the u
indicates the number will be shown as an unsigned integer.
The type specifier provided cannot be used with a length specifier. For example, %lls
makes no sense because ll
only makes sense with integer types and s
is a type specifier for strings.
The format string ends with a substitution that is incomplete. This can be the case if a format string ends with a %
(e.g., "%d %s %"
). If the intent is to have a literal %
printed, use %%
instead (e.g., “%d %s %%”`).
The first argument to the function must be a string literal.
More arguments were provided to the function than its format string indicates are necessary. The most likely cause for this problem is that the format string is missing a substitution.
Fewer arguments were provided to the function than its format string indicates are necessary. The most likely cause for this problem is that an argument was accidentally omitted.
If a procedure has an INOUT
parameter, it cannot be used as a function: It may only be called via a CALL
statement.
For a procedure to be used as a function, it must have exactly one OUT
parameter, and that parameter must be the last parameter of the procedure. In all other cases, procedures with one or more OUT
parameters may only be called via a CALL
statement.
When a variable is passed as an OUT
or INOUT
argument, it may not be used as another argument within the same procedure call. It can, however, be used within a subexpression of another argument. For example:
CREATE PROC some_proc(IN a TEXT, OUT b TEXT)
BEGIN
...
END
DECLARE t TEXT;
-- This is NOT legal.
CALL some_proc(t, t);
-- This, however, is perfectly fine.
CALL some_proc(some_other_proc(t), t);
When creating a shared fragment you can specify “table parameters” by defining their shape like so:
[[shared_fragment]]
create proc shared_proc(lim_ integer)
begin
with source(*) LIKE any_shape
select * from source limit lim_;
end;
However this LIKE form only makes sense withing a shared fragment, and only as a top level CTE in such a fragment. So either:
In a CALL clause to access a shared fragment there is a duplicate table name in the USING portion.
Example:
my_cte(*) AS (call my_fragment(1) USING something as param1, something_else as param1),
Here param1
is supposed to take on the value of both something
and something_else
. Each parameter may appear only once in the USING
clause.
In a CALL clause to access a shared fragment there are table bindings but the shared fragment that is being called does not have any table bindings. e.g.:
[[shared_fragment]]create proc my_fragment(lim integer not null)
begin
select * from a_location limit lim;
end;
-- here we try to use my_fragment with table parameter but it has none
with
*) AS (call my_fragment(1) USING something as param)
my_cte(select * from my_cte;
In a CALL clause to access a shared fragment the table bindings are missing a table parameter.
Example:
[[shared_fragment]]create proc my_fragment(lim integer not null)
begin
with source LIKE source_shape
select * from source limit lim;
end;
-- here we try to use my_fragment but no table was specified to play the role of "source"
with
*) AS (call my_fragment(1))
my_cte(select * from my_cte;
In a CALL clause to access a shared fragment the table bindings refer to a table parameter that does not exist.
Example:
[[shared_fragment]]create proc my_fragment(lim integer not null)
begin
with source LIKE source_shape
select * from source limit lim;
end;
-- here we try to use my_fragment but there is a table name "soruce" that doesn't match source
with
*) AS (call my_fragment(1) USING something as soruce)
my_cte(select * from my_cte;
In a CALL clause to access a shared fragment the table bindings are trying to use a table that has the wrong number of columns. The column count, names, and types must be compatible. Extra columns for instance are not allowed because they might create ambiguities that were not present in the shared fragment. e.g.:
[[shared_fragment]]create proc my_fragment(lim integer not null)
begin
with source LIKE (select 1 x, 2 y)
select * from source limit lim;
end;
-- here we try to use my_fragment but we provided 3 columns not 2
with
*) AS (select 1 x, 2 y, 3 z),
my_source(*) AS (call my_fragment(1) USING my_source as source)
my_cte(select * from my_cte;
Here my_fragment
wants a source
table with 2 columns (x, y). But 3 were provided.
In a CALL clause to access a shared fragment the table bindings are trying to use a table that is missing a required column. e.g.:
[[shared_fragment]]create proc my_fragment(lim integer not null)
begin
with source LIKE (select 1 x, 2 y)
select * from source limit lim;
end;
-- here we try to use my_fragment but we passed in a table with (w,x) not (x,y)
with
*) AS (select 1 w, 2 x),
my_source(*) AS (call my_fragment(1) USING my_source as source)
my_cte(select * from my_cte;
The indicated name is the name of a shared fragment, these fragments may be used inside of SQL code (e.g. select statements) but they have no meaning in a normal call outside of a SQL statement. e.g.:
[[shared_fragment]]create proc my_fragment(lim integer not null)
begin
select * from somewhere limit lim;
end;
call my_fragment();
Here my_fragment
is being used like a normal procedure. This is not valid. A correct use of a fragment might look something like this:
with
call my_fragment())
(select * from my_fragment;
In a SQLite SELECT
expression, WHERE
, GROUP BY
, HAVING
, and WINDOW
clauses see the columns of the FROM
clause before they see any aliases in the expression list. For example, assuming some table t
has columns x
and y
, the following two expressions are equivalent:
SELECT x AS y FROM t WHERE y > 100
SELECT x AS y FROM t WHERE t.y > 100
In the first expression, the use of y > 100
makes it seem as though the y
referred to could be the y
resulting from x as y
in the expression list, but that is not the case. To avoid such confusion, CQL requires the use of the qualified form t.y > 100
instead.
Unlike many databases (e.g., PostgreSQL and SQL Server), SQLite allows the aliases of a SELECT
expression list to be referenced from clauses that are evaluated before the expression list. It does this by replacing all such alias references with the expressions to which they are equivalent. For example, assuming t
does not have a column x
, the following two expressions are equivalent:
SELECT a + b AS x FROM t WHERE x > 100
SELECT a + b AS x FROM t WHERE a + b > 100
This can be convenient, but it is also error-prone. As mentioned above, the above equivalency only holds if x
is not a column in t
: If x
is a column in t
, the WHERE
clause would be equivalent to t.x > 100
instead, and there would be no syntactically obvious way to know this without first manually determining all of the columns present in t
.
To avoid such confusion, CQL disallows referencing expression list aliases from WHERE
, GROUP BY
, HAVING
, and WINDOW
clauses altogether. Instead, one should simply use the expression to which the alias is equivalent (as is done in the second example above).
The name of a common table expression may not shadow a previously declared table or view. To rectify the problem, simply use a different name.
The variable indicated must be initialized before it is used because it is of a reference type (BLOB
, OBJECT
, or TEXT
) that is also NOT NULL
.
CQL is usually smart enough to issue this error only in cases where initialization is truly lacking. Be sure to verify that the variable will be initialized before it is used for all possible code paths.
The parameter indicated must be initialized before the procedure returns because it is of a reference type (BLOB
, OBJECT
, or TEXT
) that is also NOT NULL
.
CQL is usually smart enough to issue this error only in cases where initialization is truly lacking. Be sure to verify that the parameter will be initialized both before the end of the procedure and before all cases of RETURN
and ROLLBACK RETURN
. (Initialization before THROW
is not required.)
The indicated procedure is one of the fragment types but has an empty body. This is not valid for any fragment type.
Example:
[[shared_fragment]]
create proc my_fragment(lim integer not null)
begin
/* something has to go here */
end;
A shared fragment may consist of just one SELECT statement (including WITH…SELECT) or it can be an IF/ELSE statement that has a series of compatible select statements. There are no other valid options.
In a shared fragment with conditionals the top level statement is an “IF”. All of the statement lists in the IF must have exactly one valid select statement. This error indicates that a statement list has the wrong number or type of statement.
This error will be followed by additional diagnostic information about the call chain that is problematic. For instance:
Procedure innermost has a different CTE that is also named foo
The above originated from CALL inner USING foo AS source
The above originated from CALL middle USING foo AS source
The above originated from CALL outer USING foo AS source
This indicates that you are trying to call outer
which in turn calls middle
which in turn called inner
. The conflict happened when the foo
parameter was passed in to inner
because it already has a CTE named foo
that means something else.
The way to fix this problem is to rename the CTE in probably the outermost call as that is likely the one you control. Renaming it in the innermost procedure might also be wise if that procedure is using a common name likely to conflict.
It is wise to name the CTEs in shared fragments such that they are unlikely to eclipse outer CTEs that will be needed as table parameters.
The attribute cql:try_is_proc_body
cannot be used with any values (e.g., cql:try_is_proc_body=(...)
).
The purpose of cql:try_is_proc_body
is to indicate that a particular TRY
block contains what should be considered to be the true body of the procedure. As it makes no sense for a procedure to have multiple bodies, cql:try_is_proc_body
must appear only once within any given procedure.
By definition, an eponymous virtual table has the same name as its module. If you use the @eponymous notation on a virtual table, you must also make the module and table name match.
The indicated table was on the recreate plan and was then deleted by adding an @delete(version)
attribute.
However, the previous @recreate
annotation was removed. This would make the table look like it was a baseline table that had been deleted, and it isn’t. To correctly drop a table on the @recreate
you leave the recreate directive as it was and simply add @delete
. No version information is required because the table is on the recreate plan anyway.
Example:
create table dropping_this
(
f1 integer,
f2 text
) @recreate(optional_group) @delete;
This error indicates that the @recreate(optional_group)
annotation was removed. You should put it back.
The indicated table was marked for blob storage or is a backed table. In both cases there is no physical schema associated with it so unsubscribe does not make any sense there. If it’s a backed table perhaps the intent was to remove the backing table?
When using a shared fragment like an expression, the shared fragment must consist of a simple SELECT without a FROM clause. That SELECT, however, may contain a nested SELECT expression which, itself, may have a FROM clause.
Additional constraints:
Certain built-in functions like COUNT
can be used with DISTINCT
or FILTER
options like so:
select count(distinct ...);
select sum(...) filter(where ...) over (...)
These options are not valid when calling a procedure as a function and so they generate errors if used.
Due to an enabled enforcement (e.g., @enforce_strict sign function;
), the indicated function may not be used within SQL because it is not supported on old versions of SQLite.
The CQL forms SET [blob] FROM CURSOR [cursor]
and FETCH [cursor] FROM [blob]
require that the blob variable be declared with a type kind and the type of the blob matches a suitable table.
In this case the blob was declared like so:
DECLARE blob_var blob<table_name>
But the named table table_name
is not a table.
The CQL forms SET [blob] FROM CURSOR [cursor]
and FETCH [cursor] FROM [blob]
require that the blob variable be declared with a type kind and the type of the blob matches a suitable table.
In this case the blob was declared like so:
DECLARE blob_name blob;
But it must be:
DECLARE blob_name blob<table_name>;
Where table_name
is a suitable table.
The CQL forms SET [blob] FROM CURSOR [cursor]
and FETCH [cursor] FROM [blob]
require that the blob variable be declared with a type kind and the type of the blob matches a suitable table.
In this case the blob was declared like:
DECLARE blob_var blob<view_name>
Where the named type view_name
is a view, not a table.
The CQL forms SET [blob] FROM CURSOR [cursor]
and FETCH [cursor] FROM [blob]
require that the blob variable be declared with a type kind and the type of the blob matches a suitable table.
In this case the blob was declared like:
DECLARE blob_var blob<table_name>
but the indicated table is missing the necessary attribute [[blob_storage]]
.
This attribute is necessary so that CQL can enforce additional rules on the table to ensure that it is viable for blob storage. For instance, the table can have no primary key, no foreign keys, and may not be used in normal SQL statements.
The indicated table has been marked with [[blob_storage]]
. This means that it isn’t a real table – it will have no SQL schema. Since it’s only a storage shape, it cannot be used in normal operations that use tables such as DROP TABLE
, CREATE INDEX
, or inside of SELECT
statements.
The CREATE TABLE
construct is used to declare a blob storage type because it’s the natural way to define a structure in SQL and also because the usual versioning rules are helpful for such tables. But otherwise, blob storage isn’t really a table at all.
The indicated table was marked with [[blob_storage]]
. This indicates that the table is going to be used to define the shape of blobs that could be stored in the database. It isn’t going to be a “real” table.
There are a number of reasons why a table might not be a valid as blob storage.
For instance:
This error indicates that one of these items is present. The specific cause is included in the text of the message.
If a cursor has a field of a nonnull reference type (e.g., TEXT NOT NULL
), it is necessary to verify that the cursor has a row before accessing the field (unless the cursor has been fetched in such a way that it must have a row, e.g., via FETCH ... FROM VALUES
or LOOP FETCH
). The reason for this is that, should the cursor not have a row, the field will be NULL
despite the nonnull type.
Assume we have the following:
create table t (x text not null);
declare proc requires_text_notnull(x text not null);
The following code is illegal:
declare c cursor for select * from t;
fetch c;-- ILLEGAL because `c` may not have a row and thus
-- `c.x` may be `NULL`
call requires_text_notnull(c.x);
To fix it, the cursor must be verified to have a row before the field is accessed:
declare c cursor for select * from t;
fetch c;if c then
-- legal due to the above check
call requires_text_notnull(c.x);
end if;
Alternatively, one can perform a “negative” check by returning (or using another control flow statement) when the cursor does not have a row:
declare c cursor for select * from t;
fetch c;if not c then
call some_logging_function("no rows in t");
return;
end if;
-- legal as we would have returned if `c` did not
-- have a row
call requires_text_notnull(c.x);
If you are sure that a row must be present, you can throw to make that explicit:
declare c cursor for select * from t;
fetch c;if not c throw;
-- legal as we would have thrown if `c` did not
-- have a row
call requires_text_notnull(c.x);
A DECLARE GROUP
statement for the named enum is happening inside of a procedure. This is not legal.
To correct this, move the declaration outside of the procedure.
The two described DECLARE GROUP
statements have the same name but they are not identical.
The error output contains the full text of both declarations to compare.
The indicated name was used in a context where a variable group name was expected but there is no such group.
Perhaps the group was not included (missing an #include) or else there is a typo.
:=
must be a nameThe assignment syntax may be generalized at some point to support things like arrays but for now only simple names may appear on the left of the :=
operator.
The indicated name is not a valid table or view.
When using a shared fragment like an expression, the shared fragment may contain a nested select with a WITH statement that calls other fragments, like this:
[[shared_fragment]]create proc expression_frag()
begin
select (
with
call frag())
(select frag.col from frag
...
) valend;
However the nested fragment(s) cannot take any arguments. The inner fragment must be moved outside to allow the fragment be used like a sql function.
In order to use a shared fragment the compiler must see the full body of the fragment, this is because the fragment will be inlined into the SQL in which it appears. As a consequence it makes no sense to try to apply the attribute to a procedure declaration. e.g.:
-- incorrect, the compiler needs to see the whole body of the shared fragment
[[shared_fragment]]declare proc x() (x integer);
create proc y()
begin
with (call x())
select * from x;
end;
Instead provide the whole body of the fragment. e.g.:
[[shared_fragment]]create proc x()
begin
select 1 x; -- the procedure body must be present
end;
In an @unsub directive, the indicated table/view has already been deleted. It can no longer be managed via subscriptions.
object<something>
‘operator’Array operations from a type like object<foo>
generate calls to
get_from_object_foo(index)
or set_in_object_foo(index, value)
NOTE: There can be more than one index if desired e.g.,
foo[a, b]
.
In order to do this and create sensible unique names the thing that has array-like behavior has to have a type kind.
NOTE: This works even for things that are primitive types. For instance you could use array notation to get optional fields from a task id even if the task id is an integer.
int<task_id> not null
can have an helper functionget_from_int_task_id(index integer);
and it “just works”.
NOTE: Arrays can work in a SQL context if the appropriate
select functions
are defined. Array syntax is only sugar.
Using the dot (.) operator can also map to set_object_foo
or get_object_foo
and likewise requires a type kind.
Expressions can be top level statements and a statement like x = 5;
is technically legal but this is almost certainly supposed to be x := 5
. To avoid that problem we make the former an error. Note that in SQLite and CQL =
and ==
are the same thing. :=
is assignment.
In an @unsub directive, the indicated table/view has already been unsubscribed. It doesn’t need another unsubscription.
This error indicates that you are attempting to @unsub a table/view while there are still other tables/views that refer to it (e.g. by FK). You must @unsub all of those as well in order to safely @unsub the present table/view. All such dependencies will be listed. Note that some of those might, in turn, have the same issue. In short, a whole subtree has to be removed in order to do this operation safely.
For generality *
can appear in many expression contexts but when it is used it must appear by itself.
For instance, select *, * from foo
is not supported.
A function declared with declare select function
will be called by SQLite – it has no possibilty of having a call-by-reference out argument. Therefore these are disallowed. Both out
and in out
forms generate this error and may not be used.
In a DECLARE INTERFACE
statement, the given name conflicts with an already declared function (DECLARE FUNCTION
or DECLARE SELECT FUNCTION
). You’ll have to choose a different name.
In a DECLARE INTERFACE
statement, the indicated name already corresponds to a created or declared stored procedure. You’ll have to choose a different name.
The interface was previously declared with a DECLARE INTERFACE
statement but when subsequent DECLARE INTERFACE
was encountered, it did not match the previous declaration.
A DECLARE INTERFACE
statement is happening inside of a procedure. This is not legal. To correct this move the declaration outside of the procedure.
In a CREATE PROCEDURE
/ DECLARE PROCEDURE
statement, the given name conflicts with an already declared interface (DECLARE INTERFACE
). You’ll have to choose a different name.
Interface with the name provided in cql:implements
attribute does not exist
The indicated table was marked with [[backing_table]]
. This indicates that the table is going to be used to as a generic storage location stored in the database.
There are a number of reasons why a table might not be a valid as backing storage.
For instance:
This error indicates that one of these items is present. The specific cause is included in the text of the message.
Procedure should return all columns defined by the interface (and possibly others). The columns may be returned in any order.
Procedure should return at least all columns defined by the interface and column type should be the same.
The same function cannot be declared as a function with unchecked parameters with the NO CHECK
clause and then redeclared with typed parameters, or vice versa.
--- Declaration of an external function foo with unchecked parameters.
DECLARE SELECT FUNCTION foo NO CHECK t text;
...
--- A redeclaration of foo with typed paramters. This would be invalid if the previous declaration exists.
DECLARE SELECT FUNCTION foo() t text;
Make sure the redeclaration of the function is consistent with the original declaration, or remove the redeclaration.
The indicated table was marked with [[backing_table]]
. This indicates that the table is going to be used to as a generic storage location stored in the database.
There are a number of reasons why a table might not be a valid as backing storage. For instance:
This error indicates that one of these items is present. The specific cause is included in the text of the message.
When declaring a backed table, you must specify the physical table that will hold its data. The backed table is marked with [[backed_by=table_name)
. The backing table is marked with @attribute(cql:backing]]
. The backing and backed_by attributes applies extra checks to tables to ensure they are suitable candidates.
This error indicates that the named table is not marked as a backed table.
The named table is a backed table, but it does not have the indicated column.
The database blob access functions cql_blob_get
, cql_blob_create
, cql_blob_update
all allow you to specify the backed table name and column you are trying to read/create/update. The named function was called with a table.column combination where the table is not a backed table, hence the call is invalid. Note that normally this error doesn’t happen because these functions are typically called by CQL itself as part of the rewriting process for backed tables. However it is possible to use them manually, hence they are error checked.
When using the cql_blob_create
helper function, the first argument must be a valid backed table (i.e. one that was marked with [[backed_by=some_backing_table)]]
. The type signature of this table is used to create a hash valid for the type of the blob that is created. This error indicates that the first argument is not even an identifier, much less a table name that is backed. There are more specific errors if the table is not found or the table is not backed. Note that normally this error doesn’t happen because this functions is typically called by CQL itself as part of the rewriting process for backed tables. However it is possible to cql_blob_create
manually, hence it is error checked.
There are several cases here:
Assigment expressions are only allowed so as to make the SET keyword optional for readability.
Why this limitation?
First, control flow in expressions with side effects is weird and SQL has a lot of this:
case when something then x := 5 else z := 2 end;
Would wreak havoc with the logic for testing for un-initialized variables. This is solvable with work.
Second, the normal way chained assignment works alters the type as you go along so this form:
:= b := 1; a
Would give an error if b
is nullable and a
is not nullable which is very bizaree indeed.
For these reasons, at least for now, :=
in expressions is just a convenience feature to let you skip the SET
keyword which makes code a bit more readable.
The other cases involve ‘’ and ’T.’ which are only allowed where column or argument replacement is implied by them. A loose ‘’ like ’ := 5;’ is just wrong.
The indicated table name was marked as backed storage. Therefore it does not have a physical manifestation, and therefore it cannot be used in an index or in a trigger. You may be able to get the index or trigger you want by creating an index on the backing storage and then using the blob access functions to index a column or check colulmns in a trigger. For instance this index is pretty normal:
[[backing_table]]
create table backing (
k blob primary key,
v blob not null
);
create index backing_type_index on backing(cql_blob_get_type(k));
This gives you a useful index on the type field of the blob for all backed tables that use backing_table
.
But generally, physical operations like indices, triggers, and drop are not applicable to backed tables.
When selecting columns from a shape you can use this form
LIKE some_shape(name1, name2)
to extract the named columns or this form
LIKE some_shape(-name1, -name2)
to extract everything but the named columns. You can’t mix the positive and negative forms
An expression that is supposed to select some columns from a shape such as
LIKE some_shape(-name1, -name2)
ended up removing all the columns from some_shape
.
A common case for conditional shared fragments is that there are rows that should be optionally included. The normal way this is handled is to have a condition like this
IF something THEN
SELECT your_data;
ELSE
SELECT dummy data WHERE 0;
END IF;
The problem here is that dummy_data could be complex and involve a lot of typing to get nothing. To make this less tedious CQL allows:
IF something THEN
SELECT your_data;
ELSE
SELECT NOTHING;
END IF;
However this is the only place SELECT NOTHING is allowed. It must be:
Any violation results in the present error.
SQLite supports an extended format of the update statement with a FROM clause. At this time backed tables cannot be updated using this form. This is likely to change fairly soon.
@enforce_strict
has been use to enable strict update enforcement. When enabled update statements may not include a FROM clause. This is done if the code expects to target SQLite version 3.33 or lower.
cql:alias_of
attributes may only be used in DECLARE FUNC
statements or DECLARE PROC
statements.
cql:alias_of
must have a string argument to indicate the underlying function name that the aliased function references. For example:
=foo]]
[[alias_ofdeclare function bar() int
All subsequent calls to bar()
in CQL will call the foo()
function.
In a CASE
statement each WHEN
expression must not be a constant NULL. For example:
declare hex text;
set hex := case color
when NULL then "#FFFFFF" -- error
when "red" then "#FF0000"
when "green" then "#00FF00"
when "blue" then "#0000FF"
else "#000000"
end;
When the case expression color
evaluates to NULL, it does not match with an expression that evaluates to NULL. Consequently, the CASE statement will default to the ELSE clause, provided it is defined.
When you declare variables with the const
syntax, they cannot be re-assigned a new value (e.g. with a set
statement, are being passed to an out argument).
Declare these variables with a let
statement instead if you would like to mutate them.
The indicated argument or operand is expected to be JSON in the form of either text or a blob. will indicate which argument number or operand that is incorrect.
The indicated argument or operand is expected to be a json path in text form. The error will indicate which argument number or operand that is incorrect.
When using the pipeline syntax with no function name e.g.
1):(2):(3); foo:(
The left argument of the pipeline, foo
in this example, must have a type “kind” because the pipeline uses the “kind” to generate the name of the function it will call.
object<builder>;
var foo := new_builder();
foo := foo:(1);
let u
-- becomes
:= object_builder_int(foo, 1); let u
Without the:(2):(3); builder
the name isn’t unique enough to be useful.
foo
) then it can be chained as in the first example foo:(1):(2):(3)
.
What follows is taken from the JSON validation grammar with the tree building rules removed.
json_schema: '{'
'"tables"' ':' '[' opt_tables ']' ','
'"virtualTables"' ':' '[' opt_virtual_tables ']' ','
'"views"' ':' '[' opt_views ']' ','
'"indices"' ':' '[' opt_indices ']' ','
'"triggers"' ':' '[' opt_triggers ']' ','
'"attributes"' ':' '[' opt_attribute_list ']' ','
'"queries"' ':' '[' opt_queries ']' ','
'"inserts"' ':' '[' opt_inserts ']' ','
'"generalInserts"' ':' '[' opt_inserts_general ']' ','
'"updates"' ':' '[' opt_updates ']' ','
'"deletes"' ':' '[' opt_deletes ']' ','
'"general"' ':' '[' opt_generals ']' ','
'"declareProcs"' ':' '[' opt_declare_procs']' ','
'"declareNoCheckProcs"' ':' '[' opt_declare_no_check_procs']' ','
'"declareFuncs"' ':' '[' opt_declare_funcs']' ','
'"declareNoCheckFuncs"' ':' '[' opt_declare_no_check_funcs']' ','
'"declareSelectFuncs"' ':' '[' opt_declare_select_funcs']' ','
'"declareNoCheckSelectFuncs"' ':' '[' opt_declare_no_check_select_funcs']' ','
'"interfaces"' ':' '[' opt_interfaces ']' ','
'"regions"' ':' '[' opt_regions ']' ','
'"adHocMigrationProcs"' ':' '[' opt_ad_hoc_migrations ']' ','
'"enums"' ':' '[' opt_enums ']' ','
'"constantGroups"' ':' '[' opt_const_groups ']' ','
'"subscriptions"' ':' '[' opt_subscriptions ']'
'}'
;
BOOL_LITERAL: '0' | '1'
;
opt_tables: | tables
;
tables: table | table ',' tables
;
opt_backing_details: | '"isBacking"' ':' '1' ',' | '"isBacked"' ':' '1' ',' '"typeHash"' ':' num_literal ','
;
opt_type_hash: | '"typeHash"' ':' num_literal ','
;
table: '{'
'"name"' ':' STRING_LITERAL ','
'"schema"' ':' STRING_LITERAL ','
'"crc"' ':' STRING_LITERAL ','
'"isTemp"' ':' BOOL_LITERAL ','
'"ifNotExists"' ':' BOOL_LITERAL ','
'"withoutRowid"' ':' BOOL_LITERAL ','
'"isAdded"' ':' BOOL_LITERAL ','
opt_added_version
'"isDeleted"' ':' BOOL_LITERAL ','
opt_deleted_version
'"isRecreated"' ':' BOOL_LITERAL ','
opt_recreate_group_name
opt_unsub_version
opt_backing_details
opt_region_info
opt_table_indices
opt_attributes
'"columns"' ':' '[' columns ']' ','
'"primaryKey"' ':' '[' opt_column_names ']' ','
'"primaryKeySortOrders"' ':' '[' opt_sort_order_names ']' ','
opt_primary_key_name
'"foreignKeys"' ':' '[' opt_foreign_keys ']' ','
'"uniqueKeys"' ':' '[' opt_unique_keys ']' ','
'"checkExpressions"' ':' '[' opt_check_expressions ']'
'}'
;
opt_primary_key_name: | '"primaryKeyName"' ':' STRING_LITERAL ','
;
opt_virtual_tables: | virtual_tables
;
virtual_tables: virtual_table | virtual_table ',' virtual_tables
;
virtual_table: '{'
'"name"' ':' STRING_LITERAL ','
'"schema"' ':' STRING_LITERAL ','
'"crc"' ':' STRING_LITERAL ','
'"isTemp"' ':' '0' ','
'"ifNotExists"' ':' BOOL_LITERAL ','
'"withoutRowid"' ':' BOOL_LITERAL ','
'"isAdded"' ':' BOOL_LITERAL ','
opt_added_version
'"isDeleted"' ':' BOOL_LITERAL ','
opt_deleted_version
'"isRecreated"' ':' BOOL_LITERAL ','
opt_region_info
'"isVirtual"' ':' '1' ','
'"isEponymous"' ':' BOOL_LITERAL ','
'"module"' ':' STRING_LITERAL ','
opt_module_args
opt_attributes
'"columns"' ':' '[' columns ']' ','
'"primaryKey"' ':' '[' opt_column_names ']' ','
'"primaryKeySortOrders"' ':' '[' opt_sort_order_names ']' ','
'"foreignKeys"' ':' '[' opt_foreign_keys ']' ','
'"uniqueKeys"' ':' '[' opt_unique_keys ']' ','
'"checkExpressions"' ':' '[' opt_check_expressions ']'
'}'
;
opt_module_args: | '"moduleArgs"' ':' STRING_LITERAL ','
;
opt_added_version: | '"addedVersion"' ':' any_integer ',' opt_added_migration_proc
;
opt_added_migration_proc: | '"addedMigrationProc"' ':' STRING_LITERAL ','
;
opt_unsub_version: | '"unsubscribedVersion"' ':' any_integer ','
;
opt_deleted_version: | '"deletedVersion"' ':' any_integer ',' opt_deleted_migration_proc
;
opt_deleted_migration_proc: | '"deletedMigrationProc"' ':' STRING_LITERAL ','
;
opt_recreate_group_name: | '"recreateGroupName"' ':' STRING_LITERAL ','
;
opt_index_names: | index_names
;
index_names: STRING_LITERAL | STRING_LITERAL ',' index_names
;
opt_arg_names: | arg_names
;
arg_names: STRING_LITERAL | STRING_LITERAL ',' arg_names
;
opt_column_names: | column_names
;
column_names: STRING_LITERAL | STRING_LITERAL ',' column_names
;
opt_table_names: | table_names
;
table_names: STRING_LITERAL | STRING_LITERAL ',' table_names
;
opt_view_names: | view_names
;
view_names: STRING_LITERAL | STRING_LITERAL ',' view_names
;
opt_procedure_names: | procedure_names
;
procedure_names: STRING_LITERAL | STRING_LITERAL ',' procedure_names
;
opt_sort_order_names: | sort_order_names
;
sort_order_names: STRING_LITERAL | STRING_LITERAL ',' sort_order_names
;
columns: column | column ',' columns
;
column: '{'
'"name"' ':' STRING_LITERAL ','
opt_attributes
'"type"' ':' STRING_LITERAL ','
opt_kind
opt_is_sensitive
'"isNotNull"' ':' BOOL_LITERAL ','
'"isAdded"' ':' BOOL_LITERAL ','
opt_added_version
'"isDeleted"' ':' BOOL_LITERAL ','
opt_deleted_version
opt_default_value
opt_collate
opt_check_expr
opt_type_hash
'"isPrimaryKey"' ':' BOOL_LITERAL ','
'"isUniqueKey"' ':' BOOL_LITERAL ','
'"isAutoIncrement"' ':' BOOL_LITERAL
'}'
;
opt_collate : | '"collate"' ':' STRING_LITERAL ','
;
opt_check_expr: | '"checkExpr"' ':' STRING_LITERAL ',' '"checkExprArgs"' ':' '[' opt_arg_names ']' ','
;
opt_default_value: | '"defaultValue"' ':' any_literal ','
;
opt_foreign_keys : | foreign_keys
;
opt_kind: | '"kind"' ':' STRING_LITERAL ','
;
opt_is_sensitive: | '"isSensitive"' ':' '1' ','
;
foreign_keys : foreign_key | foreign_key ',' foreign_keys
;
foreign_key : '{'
opt_name
'"columns"' ':' '[' column_names ']' ','
'"referenceTable"' ':' STRING_LITERAL ','
'"referenceColumns"' ':' '[' column_names ']' ','
'"onUpdate"' ':' STRING_LITERAL ','
'"onDelete"' ':' STRING_LITERAL ','
'"isDeferred"' ':' BOOL_LITERAL
'}'
;
opt_unique_keys : | unique_keys
;
unique_keys : unique_key | unique_key ',' unique_keys
;
unique_key: '{'
opt_name
'"columns"' ':' '[' column_names ']' ','
'"sortOrders"' ':' '[' sort_order_names ']'
'}'
;
opt_check_expressions: | check_expressions
;
check_expressions: check_expression | check_expression ',' check_expressions
;
check_expression: '{'
opt_name
'"checkExpr"' ':' STRING_LITERAL ','
'"checkExprArgs"' ':' '[' ']'
'}'
;
opt_name: | '"name"' ':' STRING_LITERAL ','
;
opt_table_indices: | table_indices
;
table_indices: '"indices"' ':' '[' opt_index_names ']' ','
;
opt_attributes: | attributes
;
attributes: '"attributes"' ':' '[' attribute_list ']' ','
;
opt_attribute_list: | attribute_list
;
attribute_list: attribute | attribute ',' attribute_list
;
attribute: '{'
'"name"' ':' STRING_LITERAL ','
'"value"' ':' attribute_value
'}'
;
attribute_array: '[' opt_attribute_value_list ']'
;
opt_attribute_value_list: | attribute_value_list
;
attribute_value_list: attribute_value | attribute_value ',' attribute_value_list
;
attribute_value: any_literal | attribute_array
;
any_integer: BOOL_LITERAL | INT_LITERAL
;
any_literal: BOOL_LITERAL |
INT_LITERAL | '-' INT_LITERAL |
LONG_LITERAL | '-' LONG_LITERAL |
REAL_LITERAL | '-' REAL_LITERAL |
STRING_LITERAL | NULL_LITERAL
;
num_literal: BOOL_LITERAL |
INT_LITERAL | '-' INT_LITERAL |
LONG_LITERAL | '-' LONG_LITERAL |
REAL_LITERAL | '-' REAL_LITERAL
;
opt_views: | views
;
views: view | view ',' views
;
view: '{'
'"name"' ':' STRING_LITERAL ','
'"crc"' ':' STRING_LITERAL ','
'"isTemp"' ':' BOOL_LITERAL ','
'"isDeleted"' ':' BOOL_LITERAL ','
opt_deleted_version
opt_region_info
opt_attributes
projection
'"select"' ':' STRING_LITERAL ','
'"selectArgs"' ':' '[' ']' ','
dependencies
'}'
;
opt_region_info: | '"region"' ':' STRING_LITERAL ',' | '"region"' ':' STRING_LITERAL ',' '"deployedInRegion"' ':' STRING_LITERAL ','
;
opt_projection: | projection
;
projection: '"projection"' ':' '[' projected_columns ']' ','
;
projected_columns: projected_column | projected_column ',' projected_columns
;
projected_column: '{'
'"name"' ':' STRING_LITERAL ','
'"type"' ':' STRING_LITERAL ','
opt_kind
opt_is_sensitive
'"isNotNull"' ':' BOOL_LITERAL
'}'
;
opt_indices: | indices
;
indices: index | index ',' indices
;
index: '{'
'"name"' ':' STRING_LITERAL ','
'"crc"' ':' STRING_LITERAL ','
'"table"' ':' STRING_LITERAL ','
'"isUnique"' ':' BOOL_LITERAL ','
'"ifNotExists"' ':' BOOL_LITERAL ','
'"isDeleted"' ':' BOOL_LITERAL ','
opt_deleted_version
opt_region_info
opt_partial_index_where
opt_attributes
'"columns"' ':' '[' column_names ']' ','
'"sortOrders"' ':' '[' sort_order_names ']'
'}'
;
opt_partial_index_where: | '"where"' ':' STRING_LITERAL ','
;
opt_triggers: | triggers
;
triggers: trigger | trigger ',' triggers
;
trigger: '{'
'"name"' ':' STRING_LITERAL ','
'"crc"' ':' STRING_LITERAL ','
'"target"' ':' STRING_LITERAL ','
'"isTemp"' ':' BOOL_LITERAL ','
'"ifNotExists"' ':' BOOL_LITERAL ','
'"isDeleted"' ':' BOOL_LITERAL ','
opt_deleted_version
before_after_instead ','
delete_insert_update ','
opt_for_each_row
opt_when_expr
'"statement"' ':' STRING_LITERAL ','
'"statementArgs"' ':' '[' opt_arg_names ']' ','
opt_region_info
opt_attributes
dependencies
'}'
;
before_after_instead: '"isBeforeTrigger"' ':' '1' | '"isAfterTrigger"' ':' '1' | '"isInsteadOfTrigger"' ':' '1'
;
delete_insert_update: '"isDeleteTrigger"' ':' '1' | '"isInsertTrigger"' ':' '1' | '"isUpdateTrigger"' ':' '1'
;
opt_for_each_row: | '"forEachRow"' ':' BOOL_LITERAL ','
;
opt_when_expr: | '"whenExpr"' ':' STRING_LITERAL ',' '"whenExprArgs"' ':' '[' opt_arg_names ']' ','
;
dependencies: opt_insert_tables
opt_update_tables
opt_delete_tables
opt_from_tables
opt_uses_procedures
opt_uses_views
'"usesTables"' ':' '[' opt_table_names ']'
;
opt_uses_views: | '"usesViews"' ':' '[' opt_view_names ']' ','
;
opt_insert_tables: | '"insertTables"' ':' '[' opt_table_names ']' ','
;
opt_update_tables: | '"updateTables"' ':' '[' opt_table_names ']' ','
;
opt_delete_tables: | '"deleteTables"' ':' '[' opt_table_names ']' ','
;
opt_from_tables: | '"fromTables"' ':' '[' opt_table_names ']' ','
;
opt_uses_procedures : | '"usesProcedures"' ':' '[' opt_procedure_names ']' ','
;
opt_queries: | queries ;
queries: query | query ',' queries ;
query: '{'
'"name"' ':' STRING_LITERAL ','
'"definedInFile"' ':' STRING_LITERAL ','
'"definedOnLine"' ':' INT_LITERAL ','
'"args"' ':' '[' opt_args ']' ','
dependencies ','
opt_region_info
opt_attributes
projection
'"statement"' ':' STRING_LITERAL ','
'"statementArgs"' ':' '[' opt_arg_names ']'
'}'
;
opt_args: | args
;
args: arg | arg ',' args
;
arg: '{'
'"name"' ':' STRING_LITERAL ','
'"argOrigin"' ':' STRING_LITERAL ','
'"type"' ':' STRING_LITERAL ','
opt_kind
opt_is_sensitive
'"isNotNull"' ':' BOOL_LITERAL
'}'
;
opt_inserts: | inserts
;
inserts: insert | insert ',' inserts
;
insert : '{' insert_details ',' '"values"' ':' '[' opt_values ']' '}'
;
opt_inserts_general: | inserts_general
;
inserts_general: insert_general | insert_general ',' inserts_general
;
insert_details:
'"name"' ':' STRING_LITERAL ','
'"definedInFile"' ':' STRING_LITERAL ','
'"definedOnLine"' ':' INT_LITERAL ','
'"args"' ':' '[' opt_args ']' ','
dependencies ','
opt_region_info
opt_attributes
'"table"' ':' STRING_LITERAL ','
'"statement"' ':' STRING_LITERAL ','
'"statementArgs"' ':' '[' opt_arg_names ']' ','
'"statementType"' ':' STRING_LITERAL ','
'"columns"' ':' '[' column_names ']'
insert_general : '{' insert_details '}'
;
opt_values: | values
;
values: value | value ',' values
;
value: '{'
'"value"' ':' STRING_LITERAL ','
'"valueArgs"' ':' '[' opt_arg_names ']'
'}'
;
opt_updates: | updates
;
updates: update | update ',' updates
;
update : '{'
'"name"' ':' STRING_LITERAL ','
'"definedInFile"' ':' STRING_LITERAL ','
'"definedOnLine"' ':' INT_LITERAL ','
'"args"' ':' '[' opt_args ']' ','
dependencies ','
opt_region_info
opt_attributes
'"table"' ':' STRING_LITERAL ','
'"statement"' ':' STRING_LITERAL ','
'"statementArgs"' ':' '[' opt_arg_names ']'
'}'
;
opt_deletes: | deletes
;
deletes: delete | delete ',' deletes
;
delete : '{'
'"name"' ':' STRING_LITERAL ','
'"definedInFile"' ':' STRING_LITERAL ','
'"definedOnLine"' ':' INT_LITERAL ','
'"args"' ':' '[' opt_args ']' ','
dependencies ','
opt_region_info
opt_attributes
'"table"' ':' STRING_LITERAL ','
'"statement"' ':' STRING_LITERAL ','
'"statementArgs"' ':' '[' opt_arg_names ']'
'}'
;
opt_generals: | generals
;
generals: general | general ',' generals
;
general: '{'
'"name"' ':' STRING_LITERAL ','
'"definedInFile"' ':' STRING_LITERAL ','
'"definedOnLine"' ':' INT_LITERAL ','
'"args"' ':' '[' opt_complex_args ']' ','
dependencies ','
opt_regions
opt_attributes
opt_projection
opt_result_contract
'"usesDatabase"' ':' BOOL_LITERAL
'}'
;
opt_result_contract: | '"hasSelectResult"' ':' '1' ',' | '"hasOutResult"' ':' '1' ',' | '"hasOutUnionResult"' ':''1' ','
;
opt_complex_args: | complex_args
;
complex_args: complex_arg | complex_arg ',' complex_args
;
complex_arg: '{'
binding
'"name"' ':' STRING_LITERAL ','
opt_arg_origin
'"type"' ':' STRING_LITERAL ','
opt_kind
opt_is_sensitive
'"isNotNull"' ':' BOOL_LITERAL
'}'
;
binding: | '"binding"' ':' '"inout"' ',' | '"binding"' ':' '"out"' ','
;
opt_arg_origin: | arg_origin
;
arg_origin: '"argOrigin"' ':' STRING_LITERAL ','
;
opt_enums: | enums
;
enums: enum | enum ',' enums
;
enum: '{'
'"name"' ':' STRING_LITERAL ','
'"type"' ':' STRING_LITERAL ','
'"isNotNull"' ':' '1' ','
'"values"' ':' '[' enum_values ']'
'}'
;
enum_values: enum_value | enum_value ',' enum_values
;
enum_value: '{'
'"name"' ':' STRING_LITERAL ','
'"value"' ':' num_literal
'}'
;
opt_declare_procs: | declare_procs
;
declare_procs: declare_proc | declare_proc ',' declare_procs
declare_proc: '{'
'"name"' ':' STRING_LITERAL ','
'"args"' ':' '[' opt_complex_args ']' ','
opt_attributes
opt_projection
'"usesDatabase"' ':' BOOL_LITERAL
'}'
;
opt_declare_no_check_procs: | declare_no_check_procs
;
declare_no_check_procs: declare_no_check_proc | declare_no_check_proc ',' declare_no_check_procs
;
declare_no_check_proc: '{' '"name"' ':' STRING_LITERAL ',' '"attributes"' ':' '[' attribute_list ']' '}' | '{' '"name"' ':' STRING_LITERAL '}'
;
opt_declare_funcs: | declare_funcs
;
declare_funcs: declare_func | declare_func ',' declare_funcs
;
opt_declare_select_funcs: | declare_select_funcs
;
declare_select_funcs: declare_select_func | declare_select_func ',' declare_select_funcs
;
declare_select_func:
'{' '"name"' ':' STRING_LITERAL ',' '"args"' ':' '[' opt_complex_args ']' ',' opt_attributes return_type '}'
| '{' '"name"' ':' STRING_LITERAL ',' '"args"' ':' '[' opt_complex_args ']' ',' opt_attributes '"projection"' ':' '[' projected_columns ']' '}'
;
opt_declare_no_check_funcs: | declare_no_check_funcs
;
declare_no_check_funcs: declare_no_check_func | declare_no_check_func ',' declare_no_check_funcs
;
opt_declare_no_check_select_funcs: | declare_no_check_select_funcs
;
declare_no_check_select_funcs: declare_no_check_select_func | declare_no_check_select_func ',' declare_no_check_select_funcs
;
declare_no_check_select_func:
'{' '"name"' ':' STRING_LITERAL ',' opt_attributes return_type '}'
| '{' '"name"' ':' STRING_LITERAL ',' opt_attributes '"projection"' ':' '[' projected_columns ']' '}'
;
declare_func: '{'
'"name"' ':' STRING_LITERAL ','
'"args"' ':' '[' opt_complex_args ']' ','
opt_attributes
return_type ','
'"createsObject"' ':' BOOL_LITERAL
'}'
;
declare_no_check_func: '{'
'"name"' ':' STRING_LITERAL ','
opt_attributes
return_type ','
'"createsObject"' ':' BOOL_LITERAL
'}'
;
return_type: '"returnType"' ':' '{'
'"type"' ':' STRING_LITERAL ','
opt_kind
opt_is_sensitive
'"isNotNull"' ':' BOOL_LITERAL
'}'
;
opt_interfaces: | interfaces
;
interfaces: interface | interface ',' interfaces
;
interface: '{'
'"name"' ':' STRING_LITERAL ','
'"definedInFile"' ':' STRING_LITERAL ','
'"definedOnLine"' ':' INT_LITERAL ','
opt_attributes
'"projection"' ':' '[' projected_columns ']'
'}'
;
opt_subscriptions: | subscriptions
;
subscriptions: subscription | subscription ',' subscriptions
;
subscription: '{'
'"type"' ':' STRING_LITERAL ','
'"table"' ':' STRING_LITERAL ','
opt_region_info
'"version"' ':' any_integer
'}'
;
opt_const_groups: | const_groups
;
const_groups: const_group | const_group ',' const_groups
;
const_group: '{'
'"name"' ':' STRING_LITERAL ','
'"values"' ':' '[' const_values ']'
'}'
;
const_values: const_value | const_value ',' const_values
;
const_value: '{'
'"name"' ':' STRING_LITERAL ','
'"type"' ':' STRING_LITERAL ','
opt_kind
'"isNotNull"' ':' BOOL_LITERAL ','
'"value"' ':' num_literal
'}'
| '{'
'"name"' ':' STRING_LITERAL ','
'"type"' ':' STRING_LITERAL ','
opt_kind
'"isNotNull"' ':' BOOL_LITERAL ','
'"value"' ':' STRING_LITERAL
'}'
;
opt_regions: | regions
;
regions: region | region ',' regions
;
region: '{'
'"name"' ':' STRING_LITERAL ','
'"isDeployableRoot"' ':' BOOL_LITERAL ','
'"deployedInRegion"' ':' STRING_LITERAL ','
'"using"' ':' '[' opt_region_names ']' ','
'"usingPrivately"' ':' '[' opt_bool_list ']'
'}'
;
opt_region_names: | region_names
;
region_names: STRING_LITERAL | STRING_LITERAL ',' region_names
;
opt_bool_list: | bool_list
;
bool_list: BOOL_LITERAL | BOOL_LITERAL ',' bool_list
;
opt_ad_hoc_migrations: | ad_hoc_migrations
;
ad_hoc_migrations: ad_hoc_migration | ad_hoc_migration ',' ad_hoc_migrations
;
ad_hoc_migration: '{'
'"name"' ':' STRING_LITERAL ','
'"crc"' ':' STRING_LITERAL ','
opt_attributes
'"version"' ':' any_integer
'}'
| '{'
'"name"' ':' STRING_LITERAL ','
'"crc"' ':' STRING_LITERAL ','
opt_attributes
'"onRecreateOf"' ':' STRING_LITERAL
'}'
;
What follows is a series of examples intended to illustrate the most important features of the CQL language. This appendix was significantly influenced by a similar article on Python at https://learnxinyminutes.com/docs/python/
Also of interest: * http://sqlite.org * https://learnxinyminutes.com/docs/sql
And with no further delay, CQL in 20 minutes…
-- Single line comments start with two dashes
/* C style comments also work
*
* C pre-processor features like #include and #define are generally available
* CQL is typically run through the C pre-processor before it is compile.
*/
/**********************************************************
* 1. Primitive Datatypes and Operators
*********************************************************/
-- You have numbers
3 -- an integer
3L -- a long integer
3.5 -- a real literal
0x10 -- 16 in hex
-- Math is what you would expect
1 + 1 --> 2
8 - 1 --> 7
10 * 2 --> 20
35.0 / 5 --> 7.0
-- Modulo operation, same as C and SQLite
7 % 3 --> 1
-7 % 3 --> -1
7 % -3 --> 1
-7 % 3 --> -1
-- Bitwise operators bind left to right like in SQLite
1 | 4 & 3 --> 1 (not 0)
-- Enforce precedence with parentheses
1 + 3 * 2 --> 7
(1 + 3) * 2 --> 8
-- Use true and false for bools, nullable bool is possible
true --> how to true
false --> how to false
null --> null means "unknown" in CQL like SQLite
-- Negate with not
not true --> false
not false --> true
not null --> null (not unknown is unknown)
-- Logical Operators
1 and 0 --> 0
0 or 1 --> 1
0 and x --> 0 and x not evaluated
1 or x --> 1 and x not evaluated
-- Remember null is "unknown"
null or false --> null
null or true --> true
null and false --> false
null and true --> null
-- Non-zero values are truthy
0 --> false
4 --> true
-6 --> true
0 and 2 --> 0 (false)
-5 or 0 --> 1 (true)
-- Equality is == or =
1 == 1 --> true
1 = 1 --> true (= and == are the same thing)
2 == 1 --> false
-- Note that null is not equal to anything (like SQL)
null == 1 --> null (hence not true)
null == null --> null (hence not true)
"x" == "x" --> true
-- IS lets you compare against null
1 IS 1 --> true
2 IS 1 --> false
null IS 1 --> false
null IS null --> true (Unknown is Unknown? Yes it is!)
"x" IS "x" --> true
-- x IS NOT y is the same as NOT (x IS y)
1 IS NOT 1 --> false
2 IS NOT 1 --> true
null IS NOT 1 --> true
null IS NOT null --> false
"x" IS NOT "x" --> false
-- Inequality is != or <>
1 != 1 --> false
2 <> 1 --> true
null != 1 --> null
null <> null --> null
-- More comparisons
1 < 10 --> true
1 > 10 --> false
2 <= 2 --> true
2 >= 2 --> true
10 < null --> null
-- To test if a value is in a range
1 < 2 and 2 < 3 --> true
2 < 3 and 3 < 2 --> false
-- BETWEEN makes this look nicer
2 between 1 and 3 --> true
3 between 2 and 2 --> false
-- Strings are created with "x" or 'x'
"This is a string.\n" -- can have C style escapes (no embedded nulls)
"Th\x69s is a string.\n" -- even hex literals
'This isn''t a C style string' -- use '' to escape single quote ONLY
/**********************************************************
* 2. Simple Variables
*********************************************************/
-- CQL can call simple libc methods with a no-check declaration
-- we'll need this for later examples so we can do something
-- with our expressions (i.e. print them)
declare procedure printf no check;
call printf("I'm CQL. Nice to meet you!\n");
-- Variables are declared with DECLARE.
-- Keywords and identifiers are not case sensitive.
declare x integer not null;
-- You can call it X, it is the same thing.
set X := 0;
-- All variables begin with a null value if allowed, else a zero value.
declare y integer not null;
if y == 0 then
call printf("Yes, this will run.\n");
end if;
-- A nullable variable (i.e. not marked with not null) is initialized to null
declare z real;
if z is null then
call printf("Yes, this will run.\n");
end if;
-- The various types
declare a_blob blob;
declare a_string text;
declare a_real real;
declare an_int integer;
declare a_long long;
declare an_object object;
-- There are some typical SQL synonyms
declare an_int int;
declare a_long long integer;
declare a_long long int;
declare a_long long_int;
-- The basic types can be tagged to make them less miscible
declare m real<meters>;
declare kg real<kilos>;
set m := kg; -- error!
-- Object variables can also be tagged so that they are not mixed-up easily
declare dict object<dict> not null;
declare list object<list> not null;
set dict := create_dict(); -- an external function that creates a dict
set dict := create_list(); -- error
set list := create_list(); -- ok
set list := dict; -- error
-- Implied type initialization
LET i := 1; -- integer not null
LET l := 1L; -- long not null
LET t := "x"; -- text not null
LET b := x IS y; -- bool not null
LET b := x = y; -- bool (maybe not null depending on x/y)
-- The psuedo function "nullable" converts the type of its arg to the nullable
-- version of the same thing.
LET n_i := nullable(1); -- nullable integer variable initialized to 1
LET l_i := nullable(1L); -- nullable long variable initialized to 1
/**********************************************************
* 3. Control Flow
*********************************************************/
-- Just make a variable
declare some_var integer not null;
set some_var := 5
-- Here is an IF statement
if some_var > 10 then
call printf("some_var is totally bigger than 10.\n")
else if some_var < 10 then -- else if is optional
call printf("some_var is smaller than 10.\n")
else -- else is optional
call printf("some_var is indeed 10.\n")
end if;
-- WHILE loops iterate as usual
declare i integer not null;
set i := 0;
while i < 5
begin
call printf("%d\n", i);
set i := i + 1;
end;
-- Use LEAVE to end a loop early
declare i integer not null;
set i := 0;
while i < 500
begin
if i >= 5 then
-- we are not going to get anywhere near 500
leave;
end if;
call printf("%d\n", i);
set i := i + 1;
end;
-- Use CONTINUE to go back to the loop test
declare i integer not null;
set i := 0;
while i < 500
begin
set i := i + 1;
if i % 2 then
-- Note: we to do this after "i" is incremented!
-- to avoid an infinite loop
continue;
end if;
-- odd numbers will not be printed because of continue above
call printf("%d\n", i);
end;
/**********************************************************
* 4. Complex Expression Forms
*********************************************************/
-- Case is an expression, so it is more like the C "?:" operator
-- than a switch statement. It is like "?:" on steroids.
case i -- a switch expression is optional
when 1 then "one" -- one or more cases
when 2 then "two"
else "other" -- else is optional
end;
-- Case with no common expression is a series of independent tests
case
when i == 1 then "i = one" -- booleans could be completely unrelated
when j == 2 then "j = two" -- first match wins
else "other"
end;
-- If nothing matches the cases, the result is null.
-- The following expression yields null because 7 is not 1.
case 7 when 1 then "one" end
-- Case is just an expression, so it can nest
case X
when 1
case y when 1 "x:1 y:1"
else "x:1 y:other"
end
else
case when z == 1 "x:other z:1"
else "x:other z:other"
end
end;
-- IN is used to test for membership
5 IN (1, 2, 3, 4, 5) --> true
7 IN (1, 2) --> false
null in (1, 2, 3) --> null
null in (1, null, 3) --> null (null == null is not true)
7 NOT IN (1, 2) --> true
null not in (null, 3) --> null
/**********************************************************
* 4. Working with and "getting rid of" null
*********************************************************/
-- Null can be annoying, you might need a not null value.
-- In most operations null is radioactive:
null + x --> null
null * x --> null
null == null --> null
-- IS and IS NOT always return 0 or 1
null is 1 -> 0
1 is not null -> 1
-- COALESCE returns the first non null arg, or the last arg if all were null.
-- If the last arg is not null, you get a non null result for sure.
-- The following is never null, but it's false if either x or y is null
COALESCE(x==y, false) -> thought excercise: how is this different than x IS y?
-- IFNULL is coalesce with 2 args only (COALESCE is more general)
IFNULL(x, -1) --> use -1 if x is null
-- The reverse, NULLIF, converts a sentinel value to unknown, more exotic
NULLIF(x, -1) --> if x is -1 then use null
-- the else part of a case can get rid of nulls
CASE when x == y then 1 else 0 end; --> true iff x = y and neither is null
-- CASE can be used to give you a default value after various tests
-- The following expression is never null; "other" is returned if x is null.
CASE when x > 0 then "pos" when x < 0 then "neg" else "other" end;
-- You can "throw" out of the current procedure (see exceptions below)
declare x integer not null;
set x := ifnull_throw(nullable_int_expr); -- returns non null, throws if null
-- If you have already tested the expression then control flow analysis
-- improves its type to "not null". Many common check patterns are recognized.
if nullable_int_expr is not null then
-- nullable_int_expression is known to be not null in this context
set x := nullable_int_expr;
end if;
/**********************************************************
* 5. Tables, Views, Indices, Triggers
*********************************************************/
-- Most forms of data definition language DDL are supported.
-- "Loose" DDL (outside of any procedure) simply declares
-- schema, it does not actually create it; the schema is assumed to
-- exist as you specified.
create table T1(
id integer primary key,
t text,
r real
);
create table T2(
id integer primary key references T1(id),
l long,
b blob
);
-- CQL can take a series of schema declarations (DDL) and
-- automatically create a procedure that will materialize
-- that schema and even upgrade previous versions of the schema.
-- This system is discussed in Chapter 10 of The Guide.
-- To actually create tables and other schema you need
-- procedures that look like the below:
create proc make_tables()
begin
create table T1 if not exists (
id integer primary key,
t text,
r real
);
end;
-- Views are supported
create view V1 as (select * from T1);
-- Triggers are supported
create trigger if not exists trigger1
before delete on T1
begin
delete from T2 where id = old.id;
end;
-- Indices are supported
create index I1 on T1(t);
create index I2 on T1(r);
-- The various drop forms are supported
drop index I1;
drop index I2;
drop view V1;
drop table T2;
drop table T1;
-- A complete discussion of DDL is out of scope, refer to sqlite.org
/**********************************************************
* 6. Selecting Data
*********************************************************/
-- We will use this scratch variable in the following examples
declare rr real;
-- First observe CQL is a two-headed language
set rr := 1+1; -- this is evaluated in generated C code
set rr := (select 1+1); -- this expresion goes to SQLite; SQLite does the addition
-- CQL tries to do most things the same as SQLite in the C context
-- but some things are exceedingly hard to emulate correctly.
-- Even simple looking things such as:
set rr := (select cast("1.23" as real)); --> rr := 1.23
set rr := cast("1.23" as real); --> error (not safe to emulate SQLite)
-- In general, numeric/text conversions have to happen in SQLite context
-- because the specific library that does the conversion could be and usually
-- is different than the one CQL would use. It would not do to give different answers
-- in one context or another so those conversions are simply not supported.
-- Loose concatenation is not supported because of the implied conversions.
-- Loose means "not in the context of a SQL statement".
set r := 1.23;
set r := (select cast("100"||r as real)); --> 1001.23 (a number)
set r := cast("100"||r as real); --> error, concat not supported in loose expr
-- A simple insertion
insert into T1 values (1, "foo", 3.14);
-- Finally, reading from the database
set r := (select r from T1 where id = 1); --> r = 3.14
-- The (select ...) form requires the result to have at least one row.
-- You can use IF NOTHING forms to handle other cases such as:
set r := (select r from T1
where id = 2
if nothing -1); --> r = -1
-- If the SELECT statement might return a null result you can handle that as well
set r := (select r from T1
where id = 2
if nothing or null -1); --> r = -1
-- With no IF NOTHING clause, lack of a row will cause the SELECT expression to throw
-- an exception. IF NOTHING THROW merely makes this explicit.
set r := (select r from T1 where id = 2 if nothing throw); --> will throw
/**********************************************************
* 6. Procedures, Results, Exceptions
*********************************************************/
-- Procedures are a list of statements that can be executed, with arguments.
create proc hello()
begin
call printf("Hello, world\n");
end;
-- IN, OUT, and INOUT parameters are possible
create proc swizzle(x integer, inout y integer, out z real not null)
begin
set y := x + y; -- any computation you like
-- bizarre way to compute an id but this is just an illustration
set z := (select r from T1 where id = x if nothing or null -1);
end;
-- Procedures like "hello" (above) have a void signature -- they return nothing
-- as nothing can go wrong. Procedures that use the database like "swizzle" (above)
-- can return an error code if there is a problem.
-- "will_fail" (below) will always return SQLITE_CONSTRAINT, the second insert
-- is said to "throw". In CQL exceptions are just result codes.
create proc will_fail()
begin
insert into T1 values (1, "x", 1);
insert into T1 values (1, "x", 1); --> duplicate key
end;
-- DML that fails generates an exception and
-- exceptions can be caught. Here is a example:
create proc upsert_t1(
id_ integer primary key,
t_ text,
r_ real
)
begin
try
-- try to insert
insert into T1(id, t, r) values (id_, t_, r_);
catch
-- if the insert fails, try to update
update T1 set t = t_, r = r_ where id = id_;
end;
end;
-- Shapes can be very useful in avoiding boilerplate code
-- the following is equivalent to the above.
-- More on shapes later.
create proc upsert_t1(LIKE t1) -- my args are the same as the columns of T1
begin
try
insert into T1 from arguments
catch
update T1 set t = t_, r = r_ where id = id_;
end;
end;
-- You can (re)throw an error explicitly.
-- If there is no current error you get SQLITE_ERROR
create proc upsert_wrapper(LIKE t1) -- my args are the same as the columns of T1
begin
if r_ > 10 then throw end if; -- throw if r_ is too big
call upsert_t1(from arguments);
end;
-- Procedures can also produce a result set.
-- The compiler generates the code to create this result set
-- and helper functions to read rows out of it.
create proc get_low_r(r_ real)
begin
-- optionally insert some rows or do other things
select * from T1 where T1.r <= r_;
end;
-- A procedure can choose between various results, the choices must be compatible.
-- The last "select" to run controls the ultimate result.
create proc get_hi_or_low(r_ real, hi_not_low bool not null)
begin
-- trying to do this with one query would result in a poor plan, so
-- instead we use two economical queries.
if hi_not_low then
select * from T1 where T1.r >= r_;
else
select * from T1 where T1.r <= r_;
end if;
end;
-- Using IF to create to nice selects above is a powerful thing.
-- SQLite has no IF, if we tried to create a shared query we get
-- something that does not use indices at all. As in the below.
-- The two-headed CQL beast has its advantages!
select * from T1 where case hi_not_low then T1.r >= r_ else T1.r <= r_ end;
-- You can get the current return code and use it in your CATCH logic.
-- This upsert is a bit better than the first:
create proc upsert_t1(LIKE t1) -- my args are the same as the columns of T1
begin
try
insert into T1 from arguments
catch
if @rc == 19 /* SQLITE_CONSTRAINT */ then
update T1 set t = t_, r = r_ where id = id_;
else
throw; -- rethrow, something bad happened.
end if;
end;
end;
-- By convention, you can call a procedure that has an OUT argument
-- as its last argument using function notation. The out argument
-- is used as the return value. If the called procedure uses the
-- database then it could throw which causes the caller to throw
-- as usual.
create proc fib(n integer not null, out result integer not null)
begin
set result := case n <= 2 then 1 else fib(n-1) + fib(n-2) end;
end;
/**********************************************************
* 7. Statement Cursors
*********************************************************/
-- Statement cursors let you iterate over a select result.
-- Here we introduce cursors, LOOP and FETCH.
create proc count_t1(r_ real, out rows_ integer not null)
begin
declare rows integer not null; -- starts at zero guaranteed
declare C cursor for select * from T1 where r < r_;
loop fetch C -- iterate until fetch returns no row
begin
-- goofy code to illustrate you can process the cursor
-- in whatever way you deem appropriate
if C.r < 5 then
rows := rows + 1; -- count rows with C.r < 5
end if;
end;
set rows_ := rows;
end;
-- Cursors can be tested for presence of a row
-- and they can be closed before the enumeration is finished.
-- As before the below is somewhat goofy example code.
create proc peek_t1(r_ real, out rows_ integer not null)
begin
/* rows_ is set to zero for sure! */
declare C cursor for select * from T1 where r < r_ limit 2;
open C; -- this is no-op, present because other systems have it
fetch C; -- fetch might find a row or not
if C then -- cursor name as bool indicates presence of a row
set rows_ = rows_ + (C.r < 5);
fetch C;
set rows_ = rows_ + (C and C.r < 5);
end if;
close C; -- cursors auto-closed at end of method but early close possible
end;
-- The FETCH...INTO form can be used to fetch directly into variables
fetch C into id_, t_, r_; --> loads named locals instead of C.id, C.t, C.r
-- A procedure can be the source of a cursor
declare C cursor for call get_low_r(3.2); -- valid cursor source
-- OUT can be used to create a result set that is just one row
create proc one_t1(r_ real)
begin
declare C cursor for select * from T1 where r < r_ limit 1;
fetch C;
out C; -- emits a row if we have one, no row is ok too, empty result set.
end;
/**********************************************************
* 8. Value Cursors, Out, and Out Union
*********************************************************/
-- To consume a procedure that uses "out" you can declare a value cursor.
-- By itself such as cursor does not imply use of the database, but often
-- the source of the cursor uses the database. In this example
-- consume_one_t1 uses the database because of the call to one_t1.
create proc consume_one_t1()
begin
-- a cursor whose shape matches the one_t1 "out" statement
declare C cursor like one_t1;
-- load it from the call
fetch C from call one_t1(7);
if C.r > 10 then
-- use values as you see fit
call printf("woohoo");
end if;
end;
-- You can do the above in one step with the compound form:
declare C cursor fetch from call one_t1(7); -- declare and fetch
-- Value cursors can come from anywhere and can be a procedure result
create proc use_t1_a_lot()
begin
-- T1 is the same shape as one_t1, this will work, too
declare C cursor like T1;
fetch C from call one_t1(7); -- load it from the call
-- some arbitrary logic might be here
-- load C again with different args
fetch C from call one_t1(12); -- load it again
-- some arbitrary logic might be here
-- now load C yet again with explicit args
fetch C using
1 id,
"foo" t,
8.2 r;
-- now return it
out C;
end;
-- Make a complex result set one row at a time
create proc out_union_example()
begin
-- T1 is the same shape as one_t1, this will work, too
declare C cursor like T1;
-- load it from the call
fetch C from call one_t1(7);
-- note out UNION rather than just out, indicating potentially many rows
out union C;
-- load it again with different args
fetch C from call one_t1(12);
out union C;
-- do something, then maybe load it again with explicit args
fetch C using
1 id,
"foo" t,
8.2 r;
out union C;
-- we have generated a 3 row result set
end;
-- Consume the above
create proc consume_result()
begin
declare C cursor for call out_union_example();
loop fetch C
begin
-- use builtin cql_cursor_format to make the cursor into a string
call printf("%s\n", cql_cursor_format(C)); --> prints every column and value
end;
end;
/**********************************************************
* 9. Named Types and Enumerations
*********************************************************/
-- Create a simple named types
declare my_type type integer not null; -- make an alias for integer not null
declare i my_type; -- use it, "i" is an integer
-- Mixing in type kinds is helpful
declare distance type real<meters>; -- e.g., distances to be measured in meters
declare time type real<seconds>; -- e.g., time to be measured in seconds
declare job_id type long<job_id>;
declare person_id type long<person_id>;
-- With the above done
-- * vars/cols of type "distance" are incompatible with those of type "time"
-- * vars/cols of types job_id are incompatible with person_id
-- This is true even though the underlying type is the same for both!
-- ENUM declarations can have any numeric type as their base type
declare enum implement integer (
pencil, -- values start at 1 unless you use = to choose something
pen, -- the next enum gets previous + 1 as its value (2)
brush = 7 -- with = expression you get the indicated value
);
-- The above also implicitly does this
declare implement type integer<implement> not null;
-- Using the enum -- simply use dot notation
declare impl implement;
set impl := implement.pen; -- value "2"
-- You can emit an emum into the current .h file we are going to generate.
-- Do not put this directive in an include file, you want it to go to one place.
-- Instead, pick one compiland that will "own" the emission of the enum.
-- C code can then #include that one .h file.
@emit_enums implement;
/**********************************************************
* 10. Shapes and Their Uses
*********************************************************/
-- Shapes first appeared to help define value cursors like so:
-- A table or view name defines a shape
declare C cursor like T1;
-- The result of a proc defines a shape
declare D cursor like one_t1;
-- A dummy select statement defines a shape (the select does not run)
-- this one is the same as (x integer not null, y text not null)
declare E cursor like select 1 x, "2" y;
-- Another cursor defines a shape
declare F cursor like C;
-- The arguments of a procedure define a shape. If you have
-- create proc count_t1(r_ real, out rows_ integer not null) ...
-- the shape will be:
-- (r_ real, rows_ integer not null)
declare G cursor like count_t1 arguments;
-- A loaded cursor can be used to make a call
call count_t1(from G); -- the args become G.r_, G.rows_
-- A shape can be used to define a procedures args, or some of the args
-- In the following "p" will have arguments:s id_, t_, and r_ with types
-- matching table T1.
-- Note: To avoid ambiguity, an _ was added to each name!
create proc p(like T1)
begin
-- do whatever you like
end;
-- The arguments of the current procedure are a synthetic shape
-- called "arguments" and can used where other shapes can appear.
-- For instance, you can have "q" shim to "p" using this form:
create proc q(like T1, print bool not null)
begin
-- maybe pre-process, silly example
set id_ := id_ + 1;
-- shim to p
call p(from arguments); -- pass my args through, whatever they are
-- maybe post-process, silly example
set r_ := r_ - 1;
if print then
-- convert args to cursor
declare C like q arguments;
fetch C from arguments;
call printf("%s\n", cql_cursor_format(C)); --> prints every column and value
end if;
-- insert a row based on the args
insert into T1 from arguments;
end;
-- You an use a given shape more than once if you name each use.
-- This would be more exciting if T1 was like a "person" or something.
create proc r(a like T1, b like T1)
begin
call p(from a);
call p(from b);
-- you can refer to a.id, b.id etc.
declare C like a;
fetch C from a;
call printf("%s\n", cql_cursor_format(C));
fetch C from b;
call printf("%s\n", cql_cursor_format(C));
end;
-- Shapes can be subsetted, for instance in the following example
-- only the arguments that match C are used in the FETCH.
fetch C from arguments(like C);
-- Fetch the columns of D into C using the cursor D for the data source.
-- Other columns get default values.
fetch C(like D) from D;
-- Use the D shape to load C, dummy values for the others.
-- In this example, dummy_seed means use the provided value, 11, for
-- any numerics that are not specified (not in D) and and use
-- "col_name_11" for any strings/blobs. This pattern is useful in test code
-- to create dummy data, hence the name.
fetch C(like D) from D @dummy_seed(11);
-- Use the Z shape to control which fields are copied.
-- Use the dummy value even if the field is nullable and null would have be ok.
fetch C(like Z) from D(like Z) @dummy_seed(11) @dummy_nullables;
-- The above patterns also work for insert statements
-- The shape constraints are generally useful. The dummy data
-- sources are useful for inserting test data.
insert into T1(like Z) from D(like Z) @dummy_seed(11) @dummy_nullables;
-- We'll need this dummy procedure some_shape so we can use its return
-- value in the examples that follow. We will never actual create this
-- proc, we only declare it to define the shape, so this is kind of like
-- a typedef.
declare proc some_shape() (x integer not null, y integer not null, z integer not null);
-- You can make a helper procedure to create test args that are mostly constant
-- or computable.
create get_foo_args(X like some_shape, seed_ integer not null)
begin
declare C cursor like foo arguments;
-- any way of loading C could work this is one
fetch C(like X) from X @dummy_seed(seed_);
out C;
end;
-- Now we can use the "get_foo_args" to get full set of arguments for "foo" and then
-- call "foo" with those arguments. In this example we're providing
-- some of the arguments explicitly, "some_shape" is the part of the args that
-- needs to manually vary in each test iteration, the rest of the arguments will
-- be dummy values. There could be zillions of args in either category.
-- In the below "some_shape" is going to get the manual values 1, 2, 3 while 100
-- will be the seed for the dummy args.
declare foo_args cursor fetch from call get_foo_args(1,2,3, 100);
call foo(from foo_args);
/**********************************************************
* 11. INSERT USING and FETCH USING
*********************************************************/
-- This kind of thing is a pain
insert into foo(a, b, c, d, e, f, g)
values(1, 2, 3, null, null, 5, null);
-- Instead, write this form:
insert into foo USING
1 a, 2 b, 3 c, null d, null e, 5 f, null g;
-- The FETCH statement can also be "fetch using"
declare C cursor like foo;
fetch C USING
1 a, 2 b, 3 c, null d, null e, 5 f, null g;
If you’ve read this far you know more than most now. :)
These are a few of the antipatterns I’ve seen while travelling through various CQL source files. They are in various categories.
Refer also to Appendix 8: Best Practices.
For these examples let’s create a couple of tables we might need for examples
CREATE TABLE foo (
id integer primary key,
name text
);
CREATE TABLE bar (
id integer primary key,
real
rate );
DECLARE v LONG NOT NULL;
SET v := 1;
better:
:= 1L; -- long literals have the L suffix like in C LET v
Similarly:
DECLARE v REAL NOT NULL;
SET v := 1;
better:
:= 1.0; -- use scientific notation or add .0 to make a real literal LET v
Redundant casts fatten the code and don’t really add anything to readability. Sometimems it’s necessary to cast NULL to a particular type so that you can be sure that generated result set has the right data type, but most of the casts below are not necessary.
SELECT
CAST(foo.id as INTEGER) as id,
CAST(foo.name as TEXT) as name,
CAST(NULL as REAL) as rate
FROM foo
UNION ALL
SELECT
CAST(bar.id as INTEGER) as id,
CAST(NULL as TEXT) as name,
CAST(bar.rate as REAL) as rate
FROM bar
Better [at some point, redundant casts actually started generating errors.]
SELECT
id,
foo.
foo.name,CAST(NULL as REAL) as rate
FROM foo
UNION ALL
SELECT
id,
bar.CAST(NULL as TEXT) as name,
bar.rateFROM bar
It’s possible to do the following to make this even cleaner:
-- somewhere central
CAST(NULL as TEXT)
#define NULL_TEXT CAST(NULL as REAL)
#define NULL_REAL CAST(NULL as INTEGER)
#define NULL_INT CAST(NULL as LONG) #define NULL_LONG
Then you can write
SELECT
id,
foo.
foo.name,as rate
NULL_REAL FROM foo
UNION ALL
SELECT
id,
bar.as name,
NULL_TEXT
bar.rateFROM bar
TRUE and FALSE can be used as boolean literals.
SQLite doesn’t care about the type but CQL will get the type information it needs to make the columns of type BOOL
SELECT
id,
foo.
foo.name,as rate,
NULL_REAL TRUE as has_name, -- this is a bit artificial but you get the idea
FALSE as has_rate
FROM foo
UNION ALL
SELECT
id,
bar.as name,
NULL_TEXT
bar.rate,FALSE as has_name,
TRUE as has_rate
FROM bar
It’s easy to get carried away with the power of CASE
expressions, I’ve seen this kind of thing:
CAST(CASE WHEN foo.name IS NULL THEN 0 ELSE 1 END AS BOOL)
But this is simply
IS NOT NULL foo.name
In general, if your case alternates are booleans a direct boolean expression would have served you better.
Somtimes there’s clamping or filtering going on in a case statement
CAST(CASE WHEN foo.name > 'm' THEN foo.name ELSE NULL END AS TEXT)
Here the CAST
is not needed at all so we could go to
CASE WHEN foo.name > 'm' THEN foo.name ELSE NULL END
NULL
is already the default value for the ELSE
clause so you never need ELSE NULL
So better:
CASE WHEN foo.name > 'm' THEN foo.name END
Consider
SELECT *
FROM foo
WHERE foo.name IS NOT NULL AND foo.name > 'm';
There’s no need to test for NOT NULL
here, the boolean will result in NULL
if foo.name
is null which is not true so the WHERE
test will fail.
Better:
SELECT *
FROM foo
WHERE foo.name > 'm';
In this statement we do not want to have a null result for the boolean expression
SELECT
id,
name,CAST(IFNULL(name > 'm', 0) AS BOOL) AS name_bigger_than_m
FROM FOO;
So now we’ve made several mistakes. We could have used the usual FALSE
defintion to avoid the cast. But even that would have left us with an IFNULL that’s harder to read. Here’s a much simpler formulation:
SELECT
id,
name,> 'm' IS TRUE AS name_bigger_than_m
name FROM FOO;
Even without the TRUE
macro you could do IS 1
above and still get a result of type BOOL NOT NULL
IS
when it makes sense to do soThis kind of boolean expression is also verbose for no reason
IS NOT NULL AND rate = 20 rate
In a WHERE
clause probably rate = 20
suffices but even if you really need a NOT NULL BOOL
result the expression above is exactly what the IS
operator is for. e.g.
IS 20 rate
The IS
operator is frequently avoided except for IS NULL
and IS NOT NULL
but it’s a general equality operator with the added semantic that it never returns NULL
. NULL IS NULL
is true. NULL IS [anything not null]
is false.
Consider
SELECT foo.id,
foo.name,
bar.rateFROM foo
LEFT JOIN bar ON foo.id = bar.id
WHERE bar.rate > 5;
This is no longer a left join because the WHERE
clause demands a value for at least one column from bar
.
Better:
SELECT foo.id,
foo.name,
bar.rateFROM foo
INNER JOIN bar ON foo.id = bar.id
WHERE bar.rate > 5;
This is a brief discussion of every statement type and some general best practices for that statement. The statements are in mostly alphabetical order except related statements were moved up in the order to make logical groups.
Refer also to Appendix 7: Anti-patterns.
ALTER TABLE ADD COLUMN
CREATE INDEX
CREATE PROC
CREATE TABLE
CREATE TRIGGER
CREATE VIEW
CREATE VIRTUAL TABLE
DROP INDEX
DROP TABLE
DROP TRIGGER
DROP VIEW
These statements almost never appear in normal procedures and generally should be avoided. The normal way of handling schema in CQL is to have one or more files declare all the schema you need and then let CQL create a schema upgrader for you. This means you’ll never manually drop tables or indices etc. The create
declarations with their annotations will totally drive the schema.
Any ad hoc DDL is usually a very bad sign. Test code is an obvious exception to this as it often does setup and teardown of schema to set up things for the test.
@SCHEMA_AD_HOC_MIGRATION
This is a special upgrade step that should be taken at the version indicated in the statement. These can be quite complex and even super important but should not be used lightly. Any migration procedure has to be highly tolerant of a variety of incoming schema versions and previous partial successes. In any case this directive should not appear in normal code. It should be part of the schema DDL declarations.
BEGIN TRANSACTION
COMMIT TRANSACTION
ROLLBACK TRANSACTION
Transactions do not nest and most procedures do not know the context in which they will be called, so the vast majority of procedures will not and should not actually start transactions. You can only do this if you know, somehow, for sure, that the procedure in question is somehow a “top level” procedure. So generally, don’t use these statements.
SAVEPOINT
ROLLBACK TO SAVEPOINT
RELEASE SAVEPOINT
PROC SAVEPOINT
COMMIT RETURN
ROLLBACK RETURN
Savepoints are the preferred tool for having interim state that can be rolled back if needed. You can use ad hoc savepoints, just give your save point and name then use RELEASE SAVEPOINT
to commit it, or else ROLLBACK TO SAVEPOINT
followed by a RELEASE
to abort it. Note that you always RELEASE
savepoints in both the rollback and the commit case.
Managing savepoints can be tricky, especially given the various error cases. They combine nicely with TRY CATCH
to do this job. However, even that is a lot of boilerplate. The best way to use savepoints is with PROC SAVEPOINT BEGIN
.. END
;
When you use PROC SAVEPOINT
, a savepoint is created for you with the name of your procedure. When the block exits the savepoint is released (committed). However you also get an automatically generated try/catch block which will rollback the savepoint if anything inside the block were to invoke THROW
. Also, you may not use a regular RETURN
inside this block, you must use either ROLLBACK RETURN
or COMMIT RETURN
. Both of these directly indicate the fate of the automatically generated statement when they run. This gives you useful options to early-out (with no error) while keeping or abandoning any work in progress. Of course you can use THROW
to return an error and abandon the work in progress.
@ENFORCE_NORMAL
@ENFORCE_POP
@ENFORCE_PUSH
@ENFORCE_RESET
@ENFORCE_STRICT
CQL allows you to specify a number of useful options such as “do not allow Window Functions” or “all foreign keys must choose some update or delete strategy”. These additional enforcements are designed to prevent errors. Because of this they should be established once, somewhere central and they should be rarely if ever overridden. For instance @ENFORCE_NORMAL WINDOW FUNCTION
would allow you to use window functions again, but this is probably a bad idea. If strict mode is on, disallowing them, that probably means your project is expected to target versions of SQLite that do not have window functions. Overriding that setting is likely to lead to runtime errors.
In general you don’t want to see these options in most code.
@PREVIOUS_SCHEMA
CQL can ensure that the current schema is compatible with the previous schema, meaning that an upgrade script could reasonably be generated to go from the previous to the current. This directive demarks the start of the previous schema section when that validation happens. This directive is useless except for creating that schema validation so it should never appear in normal procedures.
@BEGIN_SCHEMA_REGION
@DECLARE_DEPLOYABLE_REGION
@DECLARE_SCHEMA_REGION
@END_SCHEMA_REGION
CQL allows you to declare arbitrary schema regions and limit what parts of the schema any given region may consume. This helps you to prevent schema from getting entangled. There is never a reason to use this directives inside normal procedures; They should appear only in your schema declaration files.
@SCHEMA_UPGRADE_SCRIPT
@SCHEMA_UPGRADE_VERSION
The @SCHEMA_UPGRADE_SCRIPT
directive is only used by CQL itself to declare that the incoming file is an autogenerated schema upgrade script. These scripts have slightly different rules for schema declaration that are not useful outside of such scripts. So you should never use this.
@SCHEMA_UPGRADE_VERSION
on the other hand is used if you are creating a manual migration script. You need this script to run in the context of the schema version that it affects. Use this directive at the start of the file to do so. Generally manual migration scripts are to be avoided so hopefully this directive is rarely if ever used.
@ECHO
This directive emits plain text directly into the compiler’s output stream. It can be invaluable for adding new runtime features and for ensuring that (e.g.) additional #include
or #define
directives are present in the output but you can really break things by over-using this feature. Most parts of the CQL output are subject to change so any use of this should be super clean. The intended use was, as mentioned, to allow an extra #include
in your code so that CQL could call into some library. Most uses of this combine with DECLARE FUNCTION
or DECLARE PROCEDURE
to declare an external entity.
DECLARE ENUM
@EMIT_ENUMS
Avoid embedded constants whenever possible. Instead declare a suitable enumeration. Use @EMIT_ENUMS Some_Enum
to get the enumeration constants into the generated .h file for C. But be sure to do this only from one compiland. You do not want the enumerations in every .h file. Choose a single .sql file (not included by lots of other things) to place the @EMIT_ENUMS
directive. You can make a file specifically for this purpose if nothing else is serviceable.
CLOSE
OPEN
The OPEN
statement is a no-op, SQLite has no such notion. It was included because it is present in MYSQL
and other variants and its inclusion can ease readability sometimes. But it does nothing. The CLOSE
statement is normally not necessary because all cursors are closed at the end of the procedure they are declared in (unless they are boxed, see below). You only need CLOSE
if you want to close a global cursor (which has no scope) or if you want to close a local cursor “sooner” because waiting to the end of the procedure might be a very long time. Using close more than once is safe, the second and later close operations do nothing.
CALL
THROW
TRY CATCH
Remember that if you call a procedure and it uses THROW
or else uses some SQL that failed, this return code will cause your code to THROW
when the procedure returns. Normally that’s exactly what you want, the error will ripple out and some top-level CATCH
will cause a ROLLBACK
and the top level callers sees the error. If you have your own rollback needs be sure to install your own TRY
/CATCH
block or else use PROC SAVEPOINT
as above to do it for you.
Inside of a CATCH
block you can use the special variable @RC
to see the most recent return code from SQLite.
CONTINUE
LEAVE
RETURN
These work as usual but beware, you can easily use any of these to accidentally leave a block with a savepoint or transaction and you might skip over the ROLLBACK
or COMMIT
portions of the logic. Avoid this problem by using PROC SAVEPOINT
.
DECLARE FUNCTION
DECLARE SELECT FUNCTION
DECLARE PROCEDURE
The best practice is to put any declarations into a shared header file which you can #include
in all the places it is needed. This is especially important should you have to forward declare a procedure. CQL normally provides exports for all procedures so you basically get an automatically generated and certain-to-be-correct #include
file. But, if the procedures are being compiled together then an export file won’t have been generated yet at the time you need it; To work around this you use the DECLARE PROCEDURE
form. However, procedure declarations are tricky; they include not just the type of the arguments but the types of any/all of the columns in any result set the procedure might have. This must not be wrong or callers will get unpredictable failures.
The easiest way to ensure it is correct is to use the same trick as you would in C – make sure that you #include
the declaration the in the translation unit with the definition. If they don’t match there will be an error.
A very useful trick: the error will include the exact text of the correct declaration. So if you don’t know it, or are too lazy to figure it out; simply put ANY
declaration in the shared header file and then paste in the correct declaration from the error. should the definition ever change you will get a compilation error which you can again harvest to get the correct declaration.
In this way you can be sure the declarations are correct.
Functions have no CQL equivalent, but they generally don’t change very often. Use DECLARE FUNCTION
to allow access to some C code that returns a result of some kind. Be sure to add the CREATE
option if the function returns a reference that the caller owns.
Use DECLARE SELECT FUNCTION
to tell CQL about any User Defined Functions you have added to SQLite so that it knows how to call them. Note that CQL does not register those UDFs, it couldn’t make that call lacking the essential C information required to do so. If you find that you are getting errors when calling a UDF the most likely reason for the failure is that the UDF was declared but never registered with SQLite at runtime. This happens in test code a lot – product code tends to have some central place to register the UDFs and it normally runs at startup, e.g. right after the schema is upgraded.
DELETE
INSERT
SELECT
UPDATE
UPSERT
These statements are the most essential and they’ll appear in almost every procedure. There are a few general best practices we can go over.
Try to do as much as you can in one batch rather than iterating; e.g.
DELETE
statement that deletes one row if you can avoid it, write a delete statement that deletes all you need to deleteSELECT
statement that fetches one row, try to fetch all the rows you need with one selectMake sure UPSERT
is supported on the SQLite system you are using, older versions do not support it
Don’t put unnecessary casts in your SELECT
statements, they just add fat
Don’t use CASE
/WHEN
to compute a boolean, the boolean operations are more economical (e.g. use IS
)
Don’t use COUNT
if all you need to know is whether a row exists or not, use EXISTS
Don’t use GROUP BY
, ORDER BY
, or DISTINCT
on large rowsets, the sort is expensive and it will make your SELECT
statements write to disk rather than just read
Always use the INSERT INTO FOO USING
form of the INSERT
statement, it’s much easier to read than the standard form and compiles to the same thing
DECLARE OUT CALL
DECLARE
LET
SET
These are likely to appear all over as well. If you can avoid a variable declaration by using LET
then do so; The code will be more concise and you’ll get the exact variable type you need. This is the same as var x = foo();
in other languages. Once the variable is declared use SET
.
You can save yourself a lot of declarations of OUT
variables with DECLARE OUT CALL
. That declaration form automatically declares the OUT
variables used in the call you are about to make with the correct type. If the number of arguments changes you just have to add the args you don’t have to also add new declarations.
The LIKE
construct can be used to let you declare things whose type is the same as another thing. Patterns like DECLARE ARGS CURSOR LIKE FOO ARGUMENTS
save you a lot of typing and also enhance correctness. There’s a whole chapter dedicated to “shapes” defined by LIKE
.
EXPLAIN
Explain can be used in front of other queries to generate a plan. The way SQLite handles this is that you fetch the rows of the plan as usual. So basically EXPLAIN
is kind of like SELECT QUERY PLAN OF
. This hardly ever comes up in normal coding. CQL has an output option where it will generate code that gives you the query plan for a procedures queries rather than the normal body of the procedure.
FETCH
UPDATE CURSOR
The FETCH
statement has many variations, all are useful at some time or another. There are a few helpful guidelines.
FETCH USING
form (as you would with INSERT INTO USING
) because it is less error proneFETCH INTO
is generally a bad idea, you’ll have to declare a lot of variables, instead just rely on automatic storage in the cursor e.g. fetch my_cursor
rather than fetch my_cursor into a, b, c
UPDATE CURSOR
, this can let you adjust values or apply defaultsIF
LOOP
SWITCH
WHILE
These are your bread and butter and they will appear all over.
TIP: Use the
ALL VALUES
variant of switch whenever possible to ensure that you haven’t missed any cases.
OUT
OUT UNION
If you know you are producing exactly one row OUT
is more economical than SELECT
If you need complete flexibility on what rows to produce (e.g. skip some, add extras, mutate some) then OUT UNION
will give you that, use it only when needed, it’s more expensive than just SELECT
To understand what kinds of things you can reasonably do with fragments, really you just have to understand the things that you can do with common table expressions or CTEs. For those who don’t know, CTEs are the things you declare in the WITH clause of a SELECT statement. They’re kind of like local views. Well, actually, they are exactly like local views.
Query fragments help you to define useful CTEs so basically what you can do economically in a CTE directly determines what you can do economically in a fragment.
To demonstrate some things that happen with CTEs we’re going to use these three boring tables.
create table A
(id integer primary key,
not null
this text
);
create table B
(id integer primary key,
not null
that text
);
create table C
(id integer primary key,
not null
other text );
Let’s start with a very simple example, the first few examples are like control cases.
explain query plan
select * from A
inner join B on B.id = A.id;
QUERY PLAN
--SCAN TABLE A
|--SEARCH TABLE B USING INTEGER PRIMARY KEY (rowid=?) \
OK as we can see A
is not constrained so it has to be scanned but B
isn’t scanned, we use its primary key for the join. This is the most common kind of join: a search based on a key of the table you are joining to.
Let’s make it a bit more realistic.
explain query plan
select * from A
inner join B on B.id = A.id
where A.id = 5;
QUERY PLAN
--SEARCH TABLE B USING INTEGER PRIMARY KEY (rowid=?)
|--SEARCH TABLE A USING INTEGER PRIMARY KEY (rowid=?) \
Now A
is constrained by the WHERE
clause so we can use its index and then use the B
index. So we get a nice economical join from A
to B
and no scans at all.
Now suppose we try this with some CTE replacements for A
and B
. Does this make it worse?
explain query plan
with
id, this) as (select * from A),
AA(id, that) as (select * from B)
BB(select * from AA
left join BB on BB.id = AA.id
where AA.id = 5;
QUERY PLAN
--SEARCH TABLE A USING INTEGER PRIMARY KEY (rowid=?)
|--SEARCH TABLE B USING INTEGER PRIMARY KEY (rowid=?) \
The answer is a resounding no. The CTE AA
was not materialized it was expanded in place, as was the CTE BB
. We get exactly the same query plan. Now this means that the inner expressions like select * from A
could have been fragments such as:
[[shared_fragment]]create proc A_()
begin
select * from A;
end;
[[shared_fragment]]create proc B_()
begin
select * from B;
end;
explain query plan
with
call A_()), -- short for A_(*) AS (call A_())
(call B_()) -- short for B_(*) AS (call B_())
(select * from A_
left join B_ on B_.id = A_.id
where A_.id = 5;
NOTE: I’ll use the convention that
A_
is the fragment proc that could have generated the CTEAA
, likewise withB_
and so forth.
The above will expand into exactly what we had before and hence will have the exactly same good query plan. Of course this is totally goofy, why make a fragment like that – it’s just more typing. Well now lets generalize the fragments just a bit.
[[shared_fragment]]create proc A_(experiment bool not null)
begin
-- data source might come from somewhere else due to an experiment
if not experiment then
select * from A;
else
select id, this from somewhere_else;
end if;
end;
[[shared_fragment]]create proc B_()
begin
-- we don't actually refer to "B" if the filter is null
if b_filter is not null then
-- applies b_filter if specified
select * from B where B.other like b_filter;
else
-- generates the correct shape but zero rows of it
select null as id, null as that where false;
end if;
end;
create proc getAB(
integer not null,
id_ not null,
experiment bool
b_filter text)begin
with
call A_(experiment)),
(call B_(b_filter))
(select * from A_
left join B_ on B_.id = A_.id
where A_.id = id_;
end;
The above now has 4 combos economically encoded and all of them have a good plan. Importantly though, if b_filter
is not specified then we don’t actually join to B
. The B_
CTE will have no reference to B
, it just has zero rows.
Now lets look at some things you don’t want to do.
Consider this form:
explain query plan
with
id, this) as (select * from A),
AA(id, that) as (select A.id, B.that from A left join B on B.id = A.id)
BB(select * from AA
left join BB on BB.id = AA.id
where AA.id = 5;
QUERY PLAN
--SEARCH TABLE A USING INTEGER PRIMARY KEY (rowid=?)
|--SEARCH TABLE A USING INTEGER PRIMARY KEY (rowid=?)
|--SEARCH TABLE B USING INTEGER PRIMARY KEY (rowid=?) \
Note that here we get 3 joins. Now a pretty cool thing happened here – even though the expression for BB
does not include a WHERE
clause SQLite has figured out the AA.id
being 5 forces A.id
to be 5 which in turn gives a constraint on BB
. Nice job SQLite. If it hadn’t been able to figure that out then the expansion of BB
would have resulted in a table scan.
Still, 3 joins is bad when we only need 2 joins to do the job. What happened? Well, when we did the original fragments with extensions and stuff we saw this same pattern in fragment code. Basically the fragment for BB
isn’t just doing the B
things it’s restarting from A
and doing its own join to get B
. This results in a wasted join. And it might result in a lot of work on the A
table as well if the filtering was more complex and couldn’t be perfectly inferred.
You might think, “oh, no problem, I can save this, I’ll just refer to AA
instead of A
in the second query.”
This does not help (but it’s going in the right direction):
explain query plan
with
id, this) as (select * from A),
AA(id, that) as (select AA.id, B.that from AA left join B on B.id = AA.id)
BB(select * from AA
left join BB on BB.id = AA.id
where AA.id = 5;
QUERY PLAN
--SEARCH TABLE A USING INTEGER PRIMARY KEY (rowid=?)
|--SEARCH TABLE A USING INTEGER PRIMARY KEY (rowid=?)
|--SEARCH TABLE B USING INTEGER PRIMARY KEY (rowid=?) \
In terms of fragments the anti-pattern is this.
[[shared_fragment]]create proc B_()
begin
select B.* from A left join B on B.id = A.id;
end;
The above starts the query for B
again from the root. You can save this, the trick is to not try to generate just the B
columns and then join them later. You can get a nice data flow going with chain of CTEs.
explain query plan
with
id, this) as (select * from A),
AA(id, this, that) as (select AA.*, B.that from AA left join B on B.id = AA.id)
AB(select * from AB
where AB.id = 5;
QUERY PLAN
--SEARCH TABLE A USING INTEGER PRIMARY KEY (rowid=?)
|--SEARCH TABLE B USING INTEGER PRIMARY KEY (rowid=?) \
And we’re right back to the perfect plan. The good form creates a CTE chain where we only need the result of the final CTE. A straight line of CTEs each depending on the previous one results in a excellent data flow.
In terms of fragments this is now:
[[shared_fragment]]create proc A_()
begin
select * from A;
end;
[[shared_fragment]]create proc AB_()
begin
with
call A_)
(select A_.*, B.that from A_ left join B on B.id = A_.id
end;
with (call AB_())
select * from AB_ where AB_.id = 5;
For brevity we haven’t included the possibility of using conditional fragments (i.e. IF
statements) the same good query plan could be generated in that way.
We can generalize AB_
so that it doesn’t know where the base data is coming from and can be used in more cases.
[[shared_fragment]]create proc A_()
begin
select * from A;
end;
[[shared_fragment]]create proc AB_()
begin
with
source(*) like A -- you must provide some source that is the same shape as A
select source.*, B.that from source left join B on B.id = source.id
end;
with
call A_())
(call AB_() using A_ as source)
(select * from AB_ where AB_.id = 5;
Again this results in a nice straight chain of CTEs and even though the where clause is last the A
table is constrained properly.
It’s important not to fork the chain in the query plan, if that happens then whatever came before the fork must be materialized for use in both branches. That can be quite bad because then the filtering might come after the materialization. This is an example that is quite bad.
explain query plan
with
id, this) as (select * from A),
AA(id, that) as (select AA.id, B.that from AA left join B on B.id = AA.id),
BB(id, other) as (select AA.id, C.other from AA left join C on C.id = AA.id)
CC(select * from AA
left join BB on BB.id = AA.id
left join CC on CC.id = AA.id
where AA.id = 5;
QUERY PLAN
--MATERIALIZE 2
|--SCAN TABLE A
| |--SEARCH TABLE B USING INTEGER PRIMARY KEY (rowid=?)
| \--MATERIALIZE 3
|--SCAN TABLE A
| |--SEARCH TABLE C USING INTEGER PRIMARY KEY (rowid=?)
| \--SEARCH TABLE A USING INTEGER PRIMARY KEY (rowid=?)
|--SCAN SUBQUERY 2
|--SEARCH SUBQUERY 3 USING AUTOMATIC COVERING INDEX (id=?) \
Things have gone poorly here. As you can see A
is now scanned twice. and there are many more joins. We could make this a lot better by moving the A
condition all the way up into the first CTE. With fragments that would just mean creating something like:
[[shared_fragment]]create proc A_(id_)
begin
select * from A where A.id = id_;
end;
At least then if we have to materialize we’ll get only one row. This could be a good thing to do universally, but it’s especially important if you know that forking in the query shape is mandatory for some reason.
A better pattern might be this:
explain query plan
with
id, this) as (select * from A),
AA(id, this, that) as (select AA.*, B.that from AA left join B on B.id = AA.id),
AB(id, this, that, other) as (select AB.*, C.other from AB left join C on C.id = AB.id)
ABC(select * from ABC
where ABC.id = 5;
QUERY PLAN
--SEARCH TABLE A USING INTEGER PRIMARY KEY (rowid=?)
|--SEARCH TABLE B USING INTEGER PRIMARY KEY (rowid=?)
|--SEARCH TABLE C USING INTEGER PRIMARY KEY (rowid=?) \
Here we’ve just extended the chain. With shared fragments you could easily build an AB_
proc as before and then build an ABC_
proc either by calling AB_
directly or by having a table parameter that is LIKE AB_
.
Both cases will give you a great plan.
So the most important things are:
These few rules will go far in helping you to create shapes.
One last thing, without shared fragments, if you wanted to create a large join the normal way, maybe 10 tables or so, then you have to type that join into your file and it would be very much in your face. Shared fragments might hide that join from you in a nice easy-to-use fragment. Which you might then decide you want to use the fragment 3 times… And now with a tiny amount of code you have 30 joins.
The thing is shared fragments make it easy to generate a lot of SQL. It’s not bad that shared fragments make things easy, but with great power comes great responsibility, so give a care as to what it is you are assembling. Understanding your fragments, especially any big ones, will help you to create great code.This is a brief discussion of the CQL Amalgam and its normal usage patterns.
The amalgam is not much more than a concatenation of all of the .h
files and the .c
files into one big cql_amalgam.c
file. With that files you can trivially make the compiler with a single cc cql_amalgam.c
command.
Because of this simplicity, the amalgam gives you a convenient way to consume the compiler in a different/unique build environment and, if desired, strip it down to just the parts you want while controlling its inputs more precisely than you can with just a command line tool.
If you want to snapshot a particular CQL compiler the easiest way to do so is to use make an amalgam at that version and then check in the output cql_amalgam.c
. Just like with SQLite and its amalgam.
There are many other uses of the amalgam: * It’s possible to tweak the amalgam with a little pre-processing to make a windows binary, last time I attempted this it took about 10 minutes * trying to get the whole build to work on Windows is a lot harder * The amalgam is readily consumed by emscripten to create WASM * This along with Lua and SQLite in WASM resulted in a fully online playground * Meta builds a VSCode extension that hosts the actual compiler in WASM in VSCode for error checking * You can make any kind of tool of your own that wants to consume the AST or the output parts * You can invoke the CQL compiler without launching a new process from inside your environment
Generally the amalgam makes it easier for you to host the CQL compiler in some new environment.
The amalgam has to include the results of bison and flex, so a normal build must run first. The simplest way to build it starting from the sources
directory is:
make
./make_amalgam.sh
The result goes in out/cql_amalgam.c
. It can then be built using cc
with whatever flags you might desire. With a few -D
directives it can readily be compiled with Microsoft C and it also works with Emscripten (emcc
) basically unchanged. Clang and Gcc of course also work.
The standard test script test.sh
builds the amalgam and attempts to compile it as well, which ensures that the amalgam can at least compile at all times.
Of course you can do whatever tests you might like by simply compiling the amalgam as is and then using it to compile things. But importantly the test script test.sh
can test the amalgam build like so:
test.sh --use_amalgam
This runs all the normal tests using the binary built from the amalgam rather than the normal binary.
Normal CQL development practices result in this happening pretty often so the amalgam tends to stay in good shape. The code largely works in either form with very few affordances for the amalgam build needed. Most developers don’t even think about the amalgam build flavor; to a first approximation “it just works”.
To use the amalgam you’ll want to do something like this:
#define CQL_IS_NOT_MAIN 1
// Suppresses a bunch of warnings because the code
// is in an #include context
// PR's to remove these are welcome :D
#pragma clang diagnostic ignored "-Wnullability-completeness"
#include "cql_amalgam.c"
void go_for_it(const char *your_buffer) {
= yy_scan_string(your_buffer);
YY_BUFFER_STATE my_string_buffer
// Note: "--in" is irrelevant because the scanner is
// going to read from the buffer above.
//
// If you don't use yy_scan_string, you could use "--in"
// to get data from a file.
int argc = 4;
char *argv[] = { "cql", "--cg", "foo.h", "foo.c" };
(argc, argv);
cql_main(my_string_buffer);
yy_delete_buffer}
So the general pattern is:
Most amalgam functions are static
to avoid name conflicts. You will want to create your own public functions such as go_for_it
above that use the amalgam in all the ways you desire.
You’ll want to avoid calling any internal functions other than cql_main
because they are liable to change.
NOTE: The amalgam is C code not C++ code. Do not attempt to use it inside of an
extern "C"
block in a C++ file. It won’t build. If you want a C++ API, expose the C functions you need and write a wrapper class.
The amalgam includes the following useful #ifdef
options to allow you to customize it.
If this symbol is defined then cql_main
will not be redefined to be main
.
As the comments in the source say:
#ifndef CQL_IS_NOT_MAIN
// Normally CQL is the main entry point. If you are using CQL
// in an embedded fashion then you want to invoke its main at
// some other time. If you define CQL_IS_NOT_MAIN then cql_main
// is not renamed to main. You call cql_main when you want.
#define cql_main main
#endif
Set this symbol so that you own main and cql_main is called at your pleasure.
The amalgam includes the normal #include
directives needed to make it compile, things like stdio and such. In your situation these headers may not be appropriate. If CQL_NO_SYSTEM_HEADERS
is defined then the amalgam will not include anything; you can then add whatever headers you need before you include the amalgam.
The amalgam includes a set of recommended directives for warnings to suppress and include. If you want to make other choices for these you can suppress the defaults by defining CQL_NO_DIAGNOSTIC_BLOCK
; you can then add whatever diagnostic pragmas you want/need.
The amalgam uses cql_emit_error
to write its messages to stderr. The documentation is included in the code which is attached here. If you want the error messages to go somewhere else, define cql_emit_error
as the name of your error handling function. It should accept a const char *
and record that string however you deem appropriate.
#ifndef cql_emit_error
// CQL "stderr" outputs are emitted with this API.
//
// You can define it to be a method of your choice with
// "#define cql_emit_error your_method" and then your method
// will get the data instead. This will be whatever output the
// compiler would have emitted to stderr. This includes
// semantic errors or invalid argument combinations. Note that
// CQL never emits error fragments with this API; you always
// get all the text of one error. This is important if you
// are filtering or looking for particular errors in a test
// harness or some such.
//
// You must copy the memory if you intend to keep it. "data" will
// be freed.
//
// Note: you may use cql_cleanup_and_exit to force a failure from
// within this API but doing so might result in unexpected cleanup
// paths that have not been tested.
void cql_emit_error(const char *err) {
(stderr, "%s", err);
fprintfif (error_capture) {
(error_capture, "%s", err);
bprintf}
}
#endif
Typically you would #define cql_emit_error your_error_function
before you include the amalgam and then define your_error_function elsewhere in that file (before or after the amalgam is included are both fine).
The amalgam uses cql_emit_output
to write its messages to stdout. The documentation is included in the code which is attached here. If you want the standard output to go somewhere else, define cql_emit_output
as the name of your output handling function. It should accept a const char *
and record that string however you deem appropriate.
#ifndef cql_emit_output
// CQL "stdout" outputs are emitted (in arbitrarily small pieces)
// with this API.
//
// You can define it to be a method of your choice with
// "#define cql_emit_output your_method" and then your method will
// get the data instead. This will be whatever output the
// compiler would have emitted to stdout. This is usually
// reformated CQL or semantic trees and such -- not the normal
// compiler output.
//
// You must copy the memory if you intend to keep it. "data" will
// be freed.
//
// Note: you may use cql_cleanup_and_exit to force a failure from
// within this API but doing so might result in unexpected cleanup
// paths that have not been tested.
void cql_emit_output(const char *msg) {
("%s", msg);
printf}
#endif
Typically you would #define cql_emit_output your_output_function
before you include the amalgam and then define your_error_function elsewhere in that file (before or after the amalgam is included are both fine).
If you still want normal file i/o for your output but you simply want to control the placement of the output (such as forcing it to be on some virtual drive) you can replace this function by defining cql_open_file_for_write
.
If all you need to do is control the origin of the FILE *
that is written to, you can replace just this function.
#ifndef cql_open_file_for_write
// Not a normal integration point, the normal thing to do is
// replace cql_write_file but if all you need to do is adjust
// the path or something like that you could replace
// this method instead. This presumes that a FILE * is still ok
// for your scenario.
FILE *_Nonnull cql_open_file_for_write(
const char *_Nonnull file_name)
{
FILE *file;
if (!(file = fopen(file_name, "w"))) {
("unable to open %s for write\n", file_name);
cql_error(1);
cql_cleanup_and_exit}
return file;
}
#endif
Typically you would #define cql_open_file_for_write your_open_function
before you include the amalgam and then define your_open_function elsewhere in that file (before or after the amalgam is included are both fine).
The amalgam uses cql_write_file
to write its compilation outputs to the file system. The documentation is included in the code which is attached here. If you want the compilation output to go somewhere else, define cql_write_file
as the name of your output handling function. It should accept a const char *
for the file name and another for the data to be written. You can then store those compilation results however you deem appropriate.
#ifndef cql_write_file
// CQL code generation outputs are emitted in one "gulp" with this
// API. You can define it to be a method of your choice with
// "#define cql_write_file your_method" and then your method will
// get the filename and the data. This will be whatever output the
// compiler would have emitted to one of it's --cg arguments.
// You can then write it to a location of your choice.
// You must copy the memory if you intend to keep it. "data" will
// be freed.
// Note: you *may* use cql_cleanup_and_exit to force a failure
// from within this API. That's a normal failure mode that is
// well-tested.
void cql_write_file(
const char *_Nonnull file_name,
const char *_Nonnull data)
{
FILE *file = cql_open_file_for_write(file_name);
(file, "%s", data);
fprintf(file);
fclose}
#endif
Typically you would #define cql_write_file your_write_function
before you include the amalgam and then define your_write_function elsewhere in that file (before or after the amalgam is included are both fine).
When you include the amalgam, you get everything by default. You may, however, only want some limited subset of the compiler’s functions in your build.
To customize the amalgam, there are a set of configuration pre-processor options. To opt-in to configuration, first define CQL_AMALGAM_LEAN
. You then have to opt-in to the various pieces you might want. The system is useless without the parser, so you can’t remove that; but you can choose from the list below.
The options are:
CQL_AMALGAM_LEAN
: enable lean mode; this must be set or you get everythingCQL_AMALGAM_GEN_SQL
: the echoing features (required)CQL_AMALGAM_CG_COMMON
: common code generator pieces (required)CQL_AMALGAM_SEM
: semantic analysis (required)CQL_AMALGAM_CG_C
: C codegenCQL_AMALGAM_CG_LUA
: Lua codegenCQL_AMALGAM_JSON
: JSON schema outputCQL_AMALGAM_OBJC
: Objective-C code genCQL_AMALGAM_QUERY_PLAN
: the query plan creatorCQL_AMALGAM_SCHEMA
: the assorted schema output typesCQL_AMALGAM_TEST_HELPERS
: test helper outputCQL_AMALGAM_UNIT_TESTS
: some internal unit tests, which are pretty much needed by nobodyNote that -DCQL_AMALGAM_LEAN -DCQL_AMALGAM_GEN_SQL -DCQL_AMALGAM_SEM -DCQL_AMALGAM_CG_COMMON
is the minimal set of slices. See the /release/test_amalgam.sh
script to find the other valid options. But basically you can add anything to the minimum set. If you don’t add -DCQL_AMALGAM_LEAN
you get everything.
The amalgam will use malloc/calloc for its allocations and it is designed to release all memory it has allocated when cql_main returns control to you, even in the face of error.
Internal compilation errors result in an assert
failure leading to an abort. This is not supposed to ever happen but there can always be bugs. Normal errors just prevent later phases of the compiler from running so you might not see file output, but rather just error output. In all cases things should be cleaned up.
This is a working example that shows all of the basic DML statements and the call patterns to access them. The code also includes the various helpers you can use to convert C types to CQL types.
todo.sql
-- This is a simple schema for keeping track of tasks and whether or not they have been completed
-- this serves to both declare the table and create the schema
create proc todo_create_tables()
begin
create table if not exists tasks(
not null,
description text default false not null
done bool
);end;
-- adds a new not-done task
create proc todo_add(task TEXT NOT null)
begin
insert into tasks values(task, false);
end;
-- gets the tasks in inserted order
create proc todo_tasks()
begin
select rowid, description, done from tasks order by rowid;
end;
-- updates a given task by rowid
create proc todo_setdone_(rowid_ integer not null, done_ bool not null)
begin
update tasks set done = done_ where rowid == rowid_;
end;
-- deletes a given task by rowid
create proc todo_delete(rowid_ integer not null)
begin
delete from tasks where rowid == rowid_;
end;
main.c
#include <stdlib.h>
#include <sqlite3.h>
#include "todo.h"
int main(int argc, char **argv)
{
/* Note: not exactly world class error handling but that isn't the point */
// create a db
*db;
sqlite3 int rc = sqlite3_open(":memory:", &db);
if (rc != SQLITE_OK) {
(1);
exit}
// make schema if needed (in memory databases always begin empty)
= todo_create_tables(db);
rc if (rc != SQLITE_OK) {
(2);
exit}
// add some tasks
const char * const default_tasks[] = {
"Buy milk",
"Walk dog",
"Write code"
};
for (int i = 0; i < 3; i++) {
// note we make a string reference from a c string here
= cql_string_ref_new(default_tasks[i]);
cql_string_ref dtask = todo_add(db, dtask);
rc (dtask); // and then dispose of the reference
cql_string_releaseif (rc != SQLITE_OK) {
(3);
exit}
}
// mark a task as done
= todo_setdone_(db, 1, true);
rc if (rc != SQLITE_OK) {
(4);
exit}
// delete a row in the middle, rowid = 2
= todo_delete(db, 2);
rc if (rc != SQLITE_OK) {
(5);
exit}
// select out some results
;
todo_tasks_result_set_ref result_set= todo_tasks_fetch_results(db, &result_set);
rc if (rc != SQLITE_OK) {
("error: %d\n", rc);
printf(6);
exit}
// get result count
= todo_tasks_result_count(result_set);
cql_int32 result_count
// loop to print
for (cql_int32 row = 0; row < result_count; row++) {
// note "get" semantics mean that a ref count is not added
// if you want to keep the string you must "retain" it
= todo_tasks_get_description(result_set, row);
cql_string_ref text = todo_tasks_get_done(result_set, row);
cql_bool done = todo_tasks_get_rowid(result_set, row);
cql_int32 rowid
// convert to c string format
(ctext, text);
cql_alloc_cstr("%d: rowid:%d %s (%s)\n",
printf, rowid, ctext, done ? "done" : "not done");
row(ctext, text);
cql_free_cstr}
// done with results, free the lot
(result_set);
cql_result_set_release
// and close the database
(db);
sqlite3_close}
# ${cgsql} refers to the root of the CG/SQL repo
% cql --in todo.sql --cg todo.h todo.c
% cc -o todo -I${cqsql}/sources main.c todo.c ${cgsql}/sources/cqlrt.c -lsqlite3
Note that rowid 2 has been deleted, the leading number is the index in the result set. The rowid is of course the database rowid.
% ./todo
0: rowid:1 Buy milk (done)
1: rowid:3 Write code (not done)
This system as it appears in the sources here is designed to get some basic SQLite scenarios working but the runtime systems that are packaged here are basic, if only for clarity. There are some important things you should think about improving or customizing for your production environment. Here’s a brief list.
The reference counting solution in the stock CQLRT
implementation is single threaded. This might be ok, in many environments only one thread is doing all the data access. But if you plan to share objects between threads this is something you’ll want to address. CQLRT
is designed to be replacable. In fact there is another version included in the distribution cqlrt_cf
that is more friendly to iOS and CoreFoundation. This alternate version is an excellent demonstration of what is possible. There are more details in Internals Part 5: CQL Runtime
SQLite statement management includes the ability to reset and re-prepare statements. This is an important performance optimization but the stock CQLRT
does not take advantage of this. This is for two reasons: first, simplicity, and secondly (though more importantly), any kind of statement cache would require a caching policy and this simple CQLRT
cannot possibly know what might consitute a good policy for your application.
The following three macros can be defined in your cqlrt.h
and they can be directed at a version that keeps a cache of your choice.
#ifndef cql_sqlite3_exec
#define cql_sqlite3_exec(db, sql) sqlite3_exec((db), (sql), NULL, NULL, NULL)
#endif
#ifndef cql_sqlite3_prepare_v2
#define cql_sqlite3_prepare_v2(db, sql, len, stmt, tail) sqlite3_prepare_v2((db), (sql), (len), (stmt), (tail))
#endif
#ifndef cql_sqlite3_finalize
#define cql_sqlite3_finalize(stmt) sqlite3_finalize((stmt))
#endif
As you might expect, prepare
creates a statement or else returns one from the cache. When the finalize
API is called the indicated statement can be returned to the cache or discarded. The exec
API does both of these operations, but also, recall that exec
can get a semicolon separated list of statements. Your exec
implementation will have to use SQLite’s prepare functions to split the list and get prepared statements for part of the string. Alternately, you could choose not to cache in the exec
case.
As you can see in cqlrt_cf
, there is considerable ability to define what the basic data types mean. Importantly, the reference types text
, blob
, and object
can become something different (e.g., something already supported by your environment). For instance, on Windows you could use COM or .NET types for your objects. All object references are substantially opaque to CQLRT
; they have comparatively few APIs that are defined in the runtime: things like getting the text out of the string reference and so forth.
In addition to the basic types and operations you can also define a few helper functions that allow you to create some more complex object types. For instance, list, set, and dictionary creation and management functions can be readily created and then you can declare them using the DECLARE FUNCTION
language features. These objects will then be whatever list, set, or dictionary they need to be in order to interoperate with the rest of your environment. You can define all the data types you might need in your CQLRT
and you can employ whatever threading model and locking primitives you need for correctness.
The CQLRT
interface includes some helper macros for logging. These are defined as no-ops by default but, of course, they can be changed.
#define cql_contract assert
#define cql_invariant assert
#define cql_tripwire assert
#define cql_log_database_error(...)
#define cql_error_trace()
cql_contract
and cql_invariant
are for fatal errors. They both assert something that is expected to always be true (like assert
) with the only difference being that the former is conventionally used to validate preconditions of functions.
cql_tripwire
is a slightly softer form of assert that should crash in debug builds but only log an error in production builds. It is generally used to enforce a new condition that may not always hold with the goal of eventually transitioning over to cql_contract
or cql_invariant
once logging has demonstrated that the tripwire is never hit. When a fetch_results
method is called, a failure results in a call to cql_log_database_error
. Presently the log format is very simple. The invocation looks like this:
(info->db, "cql", "database error"); cql_log_database_error
The logging facility is expected to send the message to wherever is appropriate for your environment. Additionally it will typically get the failing result code and error message from SQLite, however these are likely to be stale. Failed queries usually still require cleanup and so the SQLite error codes be lost because (e.g.) a finalize
has happened, clearing the code. You can do better if, for instance, your runtime caches the results of recent failed prepare
calls. In any case, what you log and where you log it is entirely up to you.
The cql_error_trace
macro is described in Internals Part 3 It will typically invoke printf
or fprintf
or something like that to trace the origin of thrown exceptions and to get the error text from SQLite as soon as possible.
An example might be:
#define cql_error_trace() fprintf(stderr, "error %d in %s %s:%d\n", _rc_, _PROC_, __FILE__, __LINE_)
Typically the cost of all these diagnostics is too high to include in production code so this is turned on when debugging failures. But you can make that choice for yourself.
The file rt.c
defines the common result types, additional result types can be readily added to this file.
The C data type rtdata
includes many text fragments that directly control the code generation. If you want to make your generated code look more like say CoreFoundation you can define an rtdata
that will do the job. This will mean a lot of your generated code won’t require the #defines
for the CQL types, it can use your runtime directly. You can also enable things like Pascal casing for procedure names and a common prefix on procedure names if those are useful in your environment. However, the system is designed so that such changes aren’t necessary. The data types in cqlrt.h
are enough for any remapping, additional changes with rtdata
are merely cosmetic.
The CQLRT
macros are very powerful, they allow you to target almost any runtime with a C API. The cqlrt_cf
version is a good example of the sorts of changes you can make.
Concurrency and Statement Caching are not supported in the basic version for cqlrt.h
. If this is important to you you might want to customize for that.
Helper functions for additional data types can be added, and they can be unique to your runtime.
There are tracing macros to help with debugability. Providing some useful versions of those can be of great help in production environments.Each of the pipeline shortcuts below can be extracted from cql.y
to get an always current list (grep @op cql.y
). They are largely self explanatory as they always map some simple shortcut to a standard cql function with each @op
directive defining one such pipeline shortcut. The form of these is discussed more generally in Chapter 8. Below are the particulars of the builtin @op
directives.
Note:
@op
directives can be overridden and therefore customized to whatever is locally helpful. The function calls resulting from rewrites directed by@op
are still semantically checked so there are no additional type-safety issues for doing so.
There is a shortcut for the boxing operation for each primitive type
Examples: * let obj := expr:box;
Details:
@op bool : call box as cql_box_bool;
@op int : call box as cql_box_int;
@op long : call box as cql_box_long;
@op real : call box as cql_box_real;
@op text : call box as cql_box_text;
@op blob : call box as cql_box_blob;
@op object : call box as cql_box_object;
There is a shortcut to (try to) unbox for each primitive type.
Examples: * nullable_bool := my_box:to_bool;
* notnull_long := my_box:to_long:ifnull_throw;
Details:
@op object<cql_box> : call to_bool as cql_unbox_bool;
@op object<cql_box> : call to_int as cql_unbox_int;
@op object<cql_box> : call to_long as cql_unbox_long;
@op object<cql_box> : call to_real as cql_unbox_real;
@op object<cql_box> : call to_text as cql_unbox_text;
@op object<cql_box> : call to_blob as cql_unbox_blob;
@op object<cql_box> : call to_object as cql_unbox_object;
@op object<cql_box> : call type as cql_box_get_type;
These are for reading values out of a cursor, most these are really only useful in the context of a macro because that is the only way you could polymorphically pass a cursor around.
Examples: * let column_count := C:count;
* let column_type := C:type(n);
* let an_integer := C:get_int(n);
(could be null)
Details:
@op cursor : call count as cql_cursor_column_count;
@op cursor : call type as cql_cursor_column_type;
@op cursor : call get_bool as cql_cursor_get_bool;
@op cursor : call get_int as cql_cursor_get_int;
@op cursor : call get_long as cql_cursor_get_long;
@op cursor : call get_real as cql_cursor_get_real;
@op cursor : call get_text as cql_cursor_get_text;
@op cursor : call get_blob as cql_cursor_get_blob;
@op cursor : call get_object as cql_cursor_get_object;
These two shortcuts convert cursor contents to a blob or the reverse, extracting the blob into the cursor. Both have failure modes.
Examples: * let my_blob := C:to_blob
* C:to_blob(my_blob)
* C:from_blob(my_blob)
Details:
@op cursor : call to_blob as cql_cursor_to_blob;
@op cursor : call from_blob as cql_cursor_from_blob;
Creates a string representation of the cursor with field names and values, useful for debugging.
Examples: * printf("C is: %s\n", C:format);
Details:
@op cursor : call format as cql_cursor_format;
Reports the first difference between two cursors, or null if none. The first provides the column name and the second provides the differing column and values all in text form. Blob values are not emitted, only the length.
Examples: * printf("first difference: %s", C1:diff_val(C2));
Details:
@op cursor : call diff_col as cql_cursor_diff_col;
@op cursor : call diff_val as cql_cursor_diff_val;
@op cursor : call diff_index as cql_cursor_diff_index;
Hashes all fields of the cursor to give one overall hash.
Examples: * let hash := C:hash;
Details:
@op cursor : call hash as cql_cursor_hash;
Compares two cursors for equality.
Examples: * if not C:equals(D) throw;
Details:
@op cursor : call equals as cql_cursors_equal;
List operations for each primitive type, examples: * list[index] := value;
(will not grow the list) * value := list[index];
* list:add(value);
* n_items := list:count;
Details:
The details are totally symmetric for each list type offering add
, count
, and array get/set.
Long (can also hold int and bool)
@op cql_long_list : array set as cql_long_list_set_at;
@op cql_long_list : array get as cql_long_list_get_at;
@op cql_long_list : call add as cql_long_list_add;
@op cql_long_list : get count as cql_long_list_count;
Real
@op cql_real_list : array set as cql_real_list_set_at;
@op cql_real_list : array get as cql_real_list_get_at;
@op cql_real_list : call add as cql_real_list_add;
@op cql_real_list : get count as cql_real_list_count;
String
@op cql_string_list : array set as cql_string_list_set_at;
@op cql_string_list : array get as cql_string_list_get_at;
@op cql_string_list : call add as cql_string_list_add;
@op cql_string_list : get count as cql_string_list_count;
Blob
@op cql_blob_list : array set as cql_blob_list_set_at;
@op cql_blob_list : array get as cql_blob_list_get_at;
@op cql_blob_list : call add as cql_blob_list_add;
@op cql_blob_list : get count as cql_blob_list_count;
Object
@op cql_object_list : array set as cql_object_list_set_at;
@op cql_object_list : array get as cql_object_list_get_at;
@op cql_object_list : call add as cql_object_list_add;
@op cql_object_list : get count as cql_object_list_count;
There are dictionaries for each primitive type with simple dictionary shortcuts.
Examples: * dict:find(x)
* dict:add(x)
* dict[x] := val
* val := dict[x]
Details:
The details are totally symmetric for each dictionary type offering add
, find
and array get/set.
Long (can also hold int and bool)
@op object<cql_long_dictionary> : call add as cql_long_dictionary_add;
@op object<cql_long_dictionary> : call find as cql_long_dictionary_find;
@op object<cql_long_dictionary> : array set as cql_long_dictionary_add;
@op object<cql_long_dictionary> : array get as cql_long_dictionary_find;
Real
@op object<cql_real_dictionary> : call add as cql_real_dictionary_add;
@op object<cql_real_dictionary> : call find as cql_real_dictionary_find;
@op object<cql_real_dictionary> : array set as cql_real_dictionary_add;
@op object<cql_real_dictionary> : array get as cql_real_dictionary_find;
String
@op object<cql_string_dictionary> : call add as cql_string_dictionary_add;
@op object<cql_string_dictionary> : call find as cql_string_dictionary_find;
@op object<cql_string_dictionary> : array set as cql_string_dictionary_add;
@op object<cql_string_dictionary> : array get as cql_string_dictionary_find;
Blob
@op object<cql_blob_dictionary> : call add as cql_blob_dictionary_add;
@op object<cql_blob_dictionary> : call find as cql_blob_dictionary_find;
@op object<cql_blob_dictionary> : array set as cql_blob_dictionary_add;
@op object<cql_blob_dictionary> : array get as cql_blob_dictionary_find;
Object
@op object<cql_object_dictionary> : call add as cql_object_dictionary_add;
@op object<cql_object_dictionary> : call find as cql_object_dictionary_find;
@op object<cql_object_dictionary> : array set as cql_object_dictionary_add;
@op object<cql_object_dictionary> : array get as cql_object_dictionary_find;