SQLet 1.0:
execute SQL on text files via the Linux command line (update coming soon!)

SQLet is a free, open-source script that allows you to directly execute SQL on multiple text files, right from the Linux command line.
In one single command, you can read in text files (with or without header line), and perform arbitrary select statements, including joins over several files.






Quick introduction and example:

Given two ';'-delimited text files, file1.txt and file2.txt, without headers, the following command on the Linux command line performs a join on their second columns, and outputs the result:

>sqlet.py -d';' -A file1.txt -B file2.txt 'select * from A,B where a2=b2;' | sqlite3

(Don't forget the semicolon in the select statement. You need Python and sqlite3 on your system.)

Download now -- contact info@sqlet.com for feedback and bug reports

 

Rationale:

We like text files. They are human-readable and portable databases. With a bit of command line jiu-jitsu they often yield fine IT systems - without additional database layers.

The usual command line tools (eg, awk) are indeed very powerful. Sometimes, though, SQL queries would come in handy. Light-weight SQL tools like SQLite can help, at the slight expense, however, of some additional commands, explicit column naming, and just a bit more typing.

SQLet tries to do away with this admittedly small, yet flow-interrupting overhead. It parses its arguments into SQL commands that can be piped directly to sqlite3. In effect, you can thus execute SQL queries on multiple text files right from the Linux command line.



Tutorial:

Consider a space-separated text file with a header line:

>cat file1_h.txt
NAME KEY VALUE
foo 11 3.14
bar 12 2.71
foobar 12 .007

** The following command performs a simple select statement on this file:

>sqlet.py -A file1_h.txt 'select * from A where a2="12";' | sqlite3
bar 12 2.71
foobar 12 .007

This reads in the text file into an SQL table called 'A', automatically naming the columns a1-a3, and executes the given SQL statement.

** The header line's column names can be used with the '-h1' flag:

>sqlet.py -h1 -A file1_h.txt 'select * from A where KEY="12";' | sqlite3
bar 12 2.71
foobar 12 .007

** By default, all columns are of type TEXT. The following command interprets column 2 as INTEGER column (note the '>' instead of '=' operator):

>sqlet.py -h1 -i2 -A file1_h.txt 'select * from A where KEY>11;' | sqlite3
bar 12 2.71
foobar 12 .007

The following command interprets columns 2 and 3 as REAL columns:

>sqlet.py -h1 -r2,3 -A file1_h.txt 'select * from A where KEY>11;' | sqlite3
bar 12 2.71
foobar 12 .007

** SQLet can operate on several files at once. Consider a second file without header, and a different delimiter:

>cat file2.txt
categoryX;10
categoryY;11
categoryZ;12

The following command loads file1 (with header, default delimiter=' '), loads file2 (no header, delimiter=';'), performs a join, and outputs the results using the new delimiter '|':

>sqlet.py -h1 -A file1_h.txt -h0 -d';' -B file2.txt -d'|' 'select * from A,B where KEY=b2;' | sqlite3
foo|11|3.14|categoryY|11
bar|12|2.71|categoryZ|12
foobar|12|.007|categoryZ|12

 

www.sqlet.com - info@sqlet.com - M. Auer 2013
SQLet is open source and distributed under the terms of the GNU General Public License