WiNCh = WsvNC is not CSV SourceForge.net Logo WsvNC - a JDBC driver for plain text data files

WsvNC Jdbc is a simple read-only jdbc driver (type 4) meant to get data from plain text data files. In this context each data file is considered as a table and this driver offers a SQL interface over this kind of tables.

WsvNC (wathever separated values) Jdbc has been developed starting from a previous CSV driver, the CsvJdbc, available as an open project on Source Forge. It has been realized to support another software used to import data from whatever source data file, located on a computer (local or remote host) that makes it available through whatever kind of protocol, over whatever kind of operating system that supports java (with Microsoft operating systems a JDBC-ODBC driver was already available but we felt the need to get data from a file located on a Linux PC). WNC Jdbc extends the original CsvJdbc including several functionalities added by other contributor developers (and modified by me) and, mostly, by implementing the possibility to count records, to query the types of columns (a type system has been realized) and to get data in a fixed length format.

WsvNC Jdbc has been developed as part of a bigger project that is not open, but the company that paid for this development (look at the credits) has recognized the benefits coming from getting the original open source project and supported its publishing.

 

Content

  1. WiNCh - Wsv (wathever separated values) is Not (only) Csv (comma separated values)
  2. WsvNC Driver properties
  3. How to use the driver
  4. An example
  5. Credits
  6. Release notes

^

 

1. WsvNC - Values Separated by Whatever

Winch is a JDBC driver that allows to have a SQL interface over plain text data files. At present it just allows to query a single table. It doesn't implement any command other than SELECT and the WHERE clause is not supported. But it is able to connect to a data source specified by an URL, so you can get data from a file exposed, for example, by the ftp protocol on a remote PC. It is also able to count the records of a table (file) and it recognizes the type of each column data (getObject is implemented) through a schema descriptor. WNC can get data encoded using several kinds of char sets.

Winch has been developed using Eclipse as IDE. The driver is supported with a test suite (WsvTestSuite), that can also be executed through Junit, with javadoc documentation, with some data file and their schema.ini file of properties.

^

 

2. WsvNC Driver properties

For a file of data to be accessed through Winch you have to specify a set of properties having to do with the connection to the file of data and another one for data characteristics. For this reason there are two sets of properties to be defined and used to create a connection to the source data file.

Properties for connection to the file of data

These are properties that specify the location and format of the data file.

Data properties

These properties have to be set to describe the data format.
NAME Java type TYPE
VARCHAR String.class 12
BOOLEAN Boolean.class 16
TINYINT Byte.class -6
CHAR Character.class 1
DOUBLE Double.class 8
FLOAT Float.class 6
INTEGER Integer.class 4
BIGINT Long.class -5
SMALLINT Short.class 5
TIMESTAMP Timestamp.class 93
DATE Date.class 91
TIME Time.class 92
VARBINARY byte[].class -3
NULL null 0
JAVA_OBJECT Object.class 2000
P_BOOLEAN boolean.class 100016
P_TINYINT byte.class 9994
P_CHAR char.class 100001
P_DOUBLE double.class 100008
P_FLOAT float.class 100006
P_INTEGER int.class 100004
P_BIGINT long.class 9995
P_SMALLINT short.class 100005


The types of columns can be specified either by their type integer identifiers or by their names. If column types are not specified for the columns, they are assumed to be of type VARCHAR.

^


3. How to use the driver

The WsvNC JDBC driver can be used just like any other JDBC driver. The wsvjdbc.jar file should be included in the application's classpath.

The driver class is axs.jdbc.driver.WsvDriver. The connection URL is jdbc:wsv:protocol://wsvdir, where csvdir is the directory in which the .wsv files are found (note that the extension of the data file can be whatever).
Using the driver means establishing a connection to the source data file and creating a statement that will be used to execute SQL queries. There are a couple of ways to get a connection to a plain text data file and to use it to query data by SQL commands. They are here described by examples.

1st mode: explicitly passing the properties
  1. Get a connection object passing the URL object pointing to the source (plain text) data file and the Property object holding the properties set according to the above described instructions

    Connection conn = DriverManager.getConnection(url, props);


  2. Create a Statement object to execute the query with

    Statement stmt = conn.createStatement();


  3. Execute a query

    ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM " + tableName);


2nd mode: using the JdbcSourceConfiguration

This solution is useful, for example, if you have to get connection and data properties from a property file also used for other reasons and so holding properties having nothing to do with the JDBC driver. This approach is based on getting the necessary properties from a file (similar to the schema.ini created by the Microsoft ODBC interface) that hosts both, those relating to the driver connection and those describing the data. There you have to set the driver connection properties and those relating to data charactetistics, like this:

jdbc.sourceUrl = jdbc:wsv:file:///opt//axs-wsm//import
jdbc.sourceNameOfInputTable = anagrafica

Col1.name=SOCIETA
Col1.type=INTEGER
Col1.begin=1
Col1.end=3

Col2.name=CSAMPERS
Col2.type=VARCHAR
Col2.width=255
Col2.begin=4
Col2.end=6

Col3.name=COGNOME
Col3.type=VARCHAR
Col3.width= 255
Col3.begin=7
Col3.end=16

In the example reported you can see that an entire section is dedicated to each column and that for each column you can specify a NAME. Column names can also be specified by the first row of the data file (in this case they must be separated by the same separator used for the data). Column don't need to be explicitly specified by name, to be identified. In fact, if they have fixed length, they can also be identified by their BEGIN and END column boundaries. In this case data are not explicitly separated by a character separator and the delimiters define the beginning and ending position of each column's data item. The type can be specified using the integer identifier or the name of the meant type according to what described in the above paragraph relating to properties.
An object JdbcSourceConfiguration can be used to read the needed properties from a file and is responsible to format them in a way that follows the prescriptions described in the above paragraph relating to properties. Here is an example of how to use it:

  1. JdbcSourceConfiguration is useful to read the needed properties from a file and to have the correctly set and verified properties. Note that the name of property file can be whatever.

    JdbcSourceConfiguration cf = new JdbcSourceConfiguration("schema.ini");
    Properties props = cf.getConnectionProperties();


  2. Load the WNC driver. Note that you can also load it implicitly bu setting the system property jdbc.drivers like this:

    System.setProperty("jdbc.drivers", "axs.jdbc.driver.WsvDriver");
    Class.forName(props.getProperty("jdbc.sourceDriver"));


  3. (steps 3, 4 and 5) like 1, 2 and 3 shown above

^


4. An Example

Here is a complete example of how the driver can be used. Further examples are available in the Test Suite.

 
import java.io.File;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.Properties;

public class GeneralTest {
	public static void main(String[] args) {											
		try {
			// (1) setting properties				
			Properties props = new java.util.Properties();
			props.put("separator", "\t"); 			  					  // separator is a tab
			props.put("suppressHeaders", "false"); 						  // first line doesn't contain data
			props.put("fileExtension", ".wsv");   						  // file extension is .wsv
			props.put("charset", "ISO-8859-1");   					  	  // file encoding is "ISO-8859-2"				
			props.setProperty("jdbc.sourceUrl", "jdbc:wsv:file://.tests");// default URL points locally																								
			System.out.println("properties:\n" + props);
			// (2) loading the driver
			System.setProperty("jdbc.drivers", "axs.jdbc.driver.WsvDriver");
			// (3) getting connection to data source (passing URL and properties)				
			Connection conn = DriverManager.getConnection("jdbc:wsv:file://./tests", props);
			// (4) create a Statement object to execute the query with
			Statement stmt  = conn.createStatement();								         
			// (5) querying table/file 'anagraficaWithHeadersAndTabs'
			ResultSet rs    = stmt.executeQuery("SELECT * FROM anagraficaWithHeadersAndTabs"); // execute a query			
			int i = 0;													  // read the data and put it to the console
			Object recCol;
			// (6) gathering records
			while (rs.next()) {
				i++;
				System.out.println("record number" + i + " has " + rs.getMetaData().getColumnCount() + " columns");
				for (int j = 1; j <= rs.getMetaData().getColumnCount(); j++) {
					recCol = rs.getObject(j);
					System.out.print(rs.getMetaData().getColumnLabel(j) + " = " + recCol + "\n");
				}					
			}
			// (7) closing
			rs.close();
			stmt.close();
			conn.close();
		}
		catch (RuntimeException e) {
			System.out.println(e.getMessage() + "\n  --  " + e.getStackTrace());			
		} catch (SQLException e) {
			System.out.println(e.getMessage() + "\n  --  " + e.getStackTrace());			
		}								
	}
}

^

5. Credits

Daniele Pes has developed the WsvNC Driver as part of a bigger project of AXS Solutions. Several people have helped him, their names are reported inside the code as contributors.

^

 

6. Release Notes

WsvNC JDBC - a JDBC driver for plain text data files
Copyright (C) 2004 AXS TMC

This library is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public
License as published by the Free Software Foundation; either
version 2.1 of the License, or (at your option) any later version.

This library is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public
License along with this library; if not, write to the Free Software
Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

^