Skip to content

IslandSQL/IslandSQL

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

IslandSQL

Introduction

IslandSQL is an ANTLR 4 based parser for grammars used in SQL files. The parser requires a Java Virtual Machine supporting version 8 or newer and is available on Maven Central.

Scope

Database Management Systems

The following table shows the DBMS and their grammar versions in scope:

DBMS Grammar Version HTML Reference (live) PDF Reference (snapshot)
OracleDB SQL*Plus 23c User's Guide and Reference PDF
SQLcl 23.4 Users's Guide PDF
SQL 23c Language Reference PDF
PL/SQL 23c Language Reference PDF
PostgreSQL psql 16.2 psql — PostgreSQL interactive terminal PDF
SQL 16.2 The SQL Language PDF
PL/pgSQL 16.2 PL/pgSQL — SQL Procedural Language PDF

The HTML reference shows the latest version of the document. However, the latest snapshot version in PDF format represents the version that was used to define the grammar.

Statements

The current islands of interests are:

Statement Notes
call complete statement
delete complete statement
explain plan complete statement
insert complete statement
lock table complete statement
merge complete statement
select complete statement (embedded PL/SQL only as list of tokens)
update complete statement

Tokens that are not part of the statements listed above are preserved as hidden tokens. As a result, the token stream represents the complete input (file).

IslandSQL Grammar

The syntax diagrams of the IslandSQL grammar are produced by RR and can be found here.

SQL:2023 / ISO/IEC 9075:2023

The IslandSQL grammar is based on the grammars of the DBMSs in scope and not on the SQL:2023 standard. Nevertheless, it is interesting to consult the standard for constructs that are implemented differently in the DBMSs.

The table below shows links to the freely available BNF of the SQL:2023 standard. Furthermore, it contains links to some chosen root elements in the derived syntax diagrams of grammars produced by RR.

Please note that not all freely available parts of the standard contain a grammar definition in BNF.

BNF Derived Syntax Diagram
Part 2: Foundation (SQL/Foundation) preparable_statement, embedded_SQL_statement, direct_SQL_statement
Part 3: Call-Level Interface (SQL/CLI) CLI_routine
Part 4: Persistent stored modules (SQL/PSM) SQL_schema_definition_statement, SQL_schema_manipulation_statement, SQL_control_statement
Part 9: Management of External Data (SQL/MED) SQL_schema_definition_statement, SQL_schema_manipulation_statement
Part 10: Object language bindings (SQL/OLB) statement_or_declaration
Part 13: SQL Routines and types using the Java TM programming language (SQL/JRT) user-defined_type_body
Part 14: XML-Related Specifications (SQL/XML) table_primary, aggregate_function
Part 15: Multidimensional arrays (SQL/MDA) table_primary
Part 16: Property Graph Queries (SQL/PGQ) SQL_schema_definition_statement, SQL_schema_manipulation_statement, table_primary

Limitations

Common Principle

If a SQL script runs without errors, but IslandSQL reports parse errors, then we consider this to be a bug. Exceptions are documented here.

Wrapped PL/SQL Code

Code that has been wrapped with the wrap utility can be installed in the target database. However, wrapped code is ignored by the IslandSQL grammar.

Keywords as Identifiers

The grammar allows the use of keywords as identifiers. This makes the grammar robust and supports the fact OracleDB allows the use of keywords in various places.

However, there are cases where this leads to an unexpected parse tree, even if no keywords as identifiers are used. Here's an example:

select *
  from emp
  left join dept on emp.deptno = dept.deptno;

In this case left is treated as a table alias of emp, since join dept on emp.deptno = dept.deptno is a valid innerCrossJoinClause and the priority of the evaluation in ANTLR4 matches the order in the grammar.

Solving this issue is not simple, especially since OracleDB allows the use of left or right as valid table names and table aliases. Here's an another example:

with
   right as (
      select * from emp
   )
select *
  from right
 right join dept on right.deptno = dept.deptno;

In this example OracleDB selects 15 rows (an empty emp for deptno 40). The token right on the last line is therefore treated as part of the outerJoinClause by OracleDB and not as a table alias.

Prohibiting keywords as identifiers in certain places could lead to parse errors for working SQL. Therefore, the production of a false parse tree due to the support of keywords as identifiers is considered acceptable.

Dynamic Grammar of SQL*Plus and psql

The following commands affect the grammar and are not interpreted by IslandSQL. The IslandSQL grammar is built on the default settings. As a result other values lead to errors.

However, the following psql meta commands are supported as alternative to a semicolon:

Unterminated Statements

The grammar expects an SQL statement to end with a semicolon, unless it is the last statement in a file. Here is a valid example:

select 42 as result;
select 'fourty-two' as result

Removing the semicolon in the scripts will result in a parse error.

SQL*Plus Substitution Variables

Substitution Variables can contain arbitrary text. They are replaced before the execution of a script. The IslandSQL grammar provides limited support for substitution variables. They can be used in places where a sqlName is valid.

Here's an example of a supported usage:

lock table &table_name in exclusive mode wait &seconds;

And here's an example of an unsupported usage:

lock table dept in &lock_mode mode nowait;

The grammar expects certain keywords at the position of &lock_mode. Hence, this usage is not supported.

Variables in psql

Variables in psql can contain arbitrary text. They are replaced before the execution of a script. The IslandSQL grammar provides limited support for these variables. They can be used in places where a sqlName is valid.

Here's an example of a supported usage:

\set schema public
select e.*, :'schema' as schema from :schema.emp as e;

And here's an example of an unsupported usage:

\set schema public.
select e.*, :'schema' as schema from :schema emp as e;

The grammar expects an identifier to not contain a period. Hence, this usage is not supported.

External Table Access Parameters

The access_parameters clause used in inline_external_table or modified_external_table is driver specific. You can pass this access parameters as string or as a subquery returning a CLOB or embed the driver specific parameters directly. All variants are supported.

However, when you embed the drivers specific parameters directly, the parameters are parsed as a list of tokens. We do not plan to implement the driver specific grammars. See also:

PostgreSQL Bitwise XOR Operator #

In OracleDB an unquoted identifier can contain a # character. In PostgreSQL this is not allowed. The IslandSQL grammar supports identifiers containing #. As a result, in some cases, an expression containing a # is interpreted as an identifier instead of a bitwise exclusive OR expression in IslandSQL.

Here are some examples:

Query PostgreSQL IslandSQL Notes
select a#b from t; Bitwise XOR expression Identifier a#b No whitespace around operator
select a # b from t; Bitwise XOR expression Bitwise XOR expression Whitespace around operator
select a #b from t; Bitwise XOR expression Bitwise XOR expression Identifier cannot start with a #
select 1#2; Bitwise XOR expression Bitwise XOR expression Identifier cannot start with a digit

PostgreSQL Custom Operators

Integrating custom operators in a generic way will most likely lead to conflicts with existing expressions and conditions. Therefore, the IslandSQL grammar does not support custom operators.

However, the custom operators provided by the PostGIS extension are included in the grammar.

License

IslandSQL is licensed under the Apache License, Version 2.0. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0.