4th January 2017

Accessing External Data Sources

As of Rel version 3.003, it’s easy to read external data sources like CSV files, XLS/XLSX spreadsheets, tables in Microsoft Access databases, and tables in external databases accessible via JDBC. You can also write to CSV files and XLS/XLSX spreadsheet files.

Reading External Data Sources

Data sources are read by defining them as external relation-valued variables.

External relation-valued variables, or external relvars, are defined via a Rel extension to Tutorial D. The VAR variable declaration supports the following syntax:

VAR <varname> EXTERNAL <kind> <connection_string> [ DUP_REMOVE | DUP_COUNT | AUTOKEY ];

<kind> is the kind of external relvar. Built-in are CSV, XLS, JDBC and ACCDB. (Additional kinds, and additional JDBC drivers, can be put in the Extensions directory of a Rel database but are only recognised by the standalone Rel DBMS.)

<connection_string> is a CHAR specific to the kind of external relvar that describes the path or connection to the data source.

DUP_REMOVE silently removes duplicate tuples.

DUP_COUNT adds an INTEGER attribute named _DUP_COUNT that indicates a count of duplicate tuples.

AUTOKEY adds an INTEGER attribute named _AUTOKEY that is serially numbered. In the absence of DUP_REMOVE, DUP_COUNT, or AUTOKEY, the default is AUTOKEY.

Examples:

VAR myvar EXTERNAL CSV “/home/dave/test.csv”;

VAR myvar EXTERNAL CSV “/home/dave/test.csv,HOHEADING” DUP_REMOVE;

VAR myvar EXTERNAL XLS “/home/dave/test.xls” DUP_REMOVE;

VAR myvar EXTERNAL XLS “/home/dave/test.xlsx” DUP_COUNT;

VAR myvar EXTERNAL XLS “/home/dave/test.xlsx,1” DUP_COUNT;

VAR myvar EXTERNAL XLS “/home/dave/test.xls,2,NOHEADING” DUP_REMOVE;

VAR myvar EXTERNAL JDBC “jdbc:postgresql://localhost/database,sqluser,sqluserpw,MyTable” AUTOKEY;

VAR myvar EXTERNAL ACCDB “c:\\users\\me\\mydb.accdb,mytable”;

VAR myvar EXTERNAL ACCDB “c:\\users\\me\\mydb.accdb,mytable” DUP_REMOVE;

NOTE: In the XLS connection string, the number after the optional ‘,’ indicates the zero-based tab number.

NOTE: In the XLS & CSV connection string, NOHEADING indicates that the spreadsheet does not have a heading row.

The Rel command-line interface can help you specify the appropriate file name. Use the “Get file path” toolbar item to find and insert the file name. The button is shown, in a highlight box with a red circle, here:

Built-in JDBC support is provided for MySQL, MariaDB, PostgreSQL, Oracle Database and Microsoft SQL Server. E.g.:

VAR myvar EXTERNAL JDBC “jdbc:mysql://localhost/database,sqluser,sqluserpw,MyTable”;

VAR myvar EXTERNAL JDBC “jdbc:mariadb://localhost/database,sqluser,sqluserpw,MyTable”;

VAR myvar EXTERNAL JDBC “jdbc:postgresql://localhost/database,sqluser,sqluserpw,MyTable”;

VAR myvar EXTERNAL JDBC “jdbc:oracle:thin:@localhost:1521:database,sqluser,sqluserpw,MyTable”;

VAR myvar EXTERNAL JDBC “jdbc:sqlserver://localhost:1433;databaseName=database,sqluser,sqluserpw,MyTable”;

Writing to CSV and Spreadsheet Files

The Rel user interface supports exporting relations from variables (relvars), views, and queries to CSV (comma separated value) and XLS or XLSX spreadsheet files.

To export from variables and views, right-click on the variable or view name to bring up a menu, as shown here:

Select the Export menu item. The following dialog box will appear:

Once you’ve chosen the appropriate file format, press the Export button to choose a filename and export the file.

You can also export CSV and XLS/XLSX files from the Rev Visual Query Language. Select the “Rev” mode (note that the Rev extensions must be installed) and load or define a query like the following and click on the ‘?’ of any query node, as shown with a red circle here:

That will pop up a dialog box that shows you the Tutorial D code for the query, along with four option buttons, as shown here:

You can then click the “Export query results to file” button…

…to choose a file format and export the results of evaluating the query.