Skip to content

ODBC status

Ivan Koptelov edited this page May 16, 2019 · 8 revisions

ODBC status

Planning

I decided to proceed with evolution way rather then revolution: cover current implementations with tests, change them until they will look good, push relevant parts into master.

The plan for tnt_stmt and odbc implementations is the following:

  • Squash and rebase the patchset on top of master.
  • odbc: setup testing via unixodbc and slice odbc_test.c to separate tests.
  • odbc: update/add RPM / Deb packages with odbcinst.ini and so on.
  • odbc: check working of the packages with, say, libreoffice base or other database observation tool.
  • odbc: add coverage reporting, elaborate API support level.
  • tnt_stmt: decide whether we want to expand tnt_object or use bindings and receive ok from Kostja on the API proposal.
  • tnt_stmt: remove tnt_open() and tnt_reopen().
  • tnt_stmt: rewrite the implementation to match proposed API, cover it with tests, write sphinx documentation (at this point we sure it both works and can be used for odbc implementation).
  • tnt_stmt: prepare the implementation to land into master (code style and so on) and push.
  • odbc: the same for odbc.
  • tnt_stmt: further work on appropriate Tarantool SQL types support (if it will not be done before).
  • odbc: the same for odbc.
  • odbc: verify AppVeyor pipeline build working packages (maybe setup some Windows testing).

tnt_stmt.[ch]

Here is proposal for MySQL-like tnt_stmt.[ch] API.

First, why it is needed at all:

  • Provide MySQL-like API, because developers aware of it.
  • Incapsulate msgpack encoding / decoding and provide an API working in terms of Tarantool SQL types.
  • Incapsulate verification of user-provided storage type against a SQL type in a result.
  • Support random access to result row cells.
  • Provide a result metadata (names and types).
  • Factors out all that logic from odbc specific errors / logging.

The following points were taken into consideration during the API desinging:

  • No unmotivated differences with MySQL C API.
  • Reflect only mysql_stmt_*() API, w/o string-oriented old API parts.
  • No tnt_col* functions within the module, only bindings.
  • Keep it small and simple.

Types

enum sql_types {
    SQL_TYPE_INTEGER, /* INTEGER, INT */
    SQL_TYPE_STRING,  /* TEXT, CHAR, VARCHAR */
};

struct tnt_stmt_bind {
    enum sql_types buffer_type;
    void *buffer;
    uint32_t buffer_length;
    uint32_t *length;
    bool *is_null;
    bool is_unsigned;
    bool *error;
};

struct tnt_stmt; /* Black box. */

See also the MYSQL_BIND description.

Functions

When a return value is just an error indicator (0 or -1) it is not shown in the list below.

// Prepare a SQL request
tnt_stmt_init(stream) -> stmt
tnt_stmt_prepare(stmt, q, len)

// Bind a storage for input parameters and results.
tnt_stmt_bind_param(stmt, bind, num)
tnt_stmt_bind_result(stmt, bind, num)

// Submit a SQL request
tnt_stmt_execute(stmt)

// Parse a SQL request metainfo
tnt_stmt_num_rows(stmt) -> uint32 or int64?
tnt_stmt_field_count(stmt) -> uint32 or int64?
tnt_stmt_affected_rows(stmt) -> uint32 or int64?
tnt_stmt_autoincrement_ids(stmt) -> ??
tnt_stmt_field_names(stmt, *len) -> char **
tnt_stmt_field_types(stmt, *len) -> enum sql_types*

// Parse a next row
tnt_stmt_fetch(stmt)

// Handle an error
tnt_stmt_errno(stmt) -> enum ???
tnt_stmt_error(stmt, *len) -> char *

// Free resources
tnt_stmt_free(stmt)

Details

Notable differences from MySQL API:

  • SQL types, of course.
  • tnt_stmt_autoincrement_ids() instead of mysql_insert_id() to reflect naming in tarantool binary protocol.
  • tnt_stmt_field_names() and tnt_stmt_field_types() instead of mysql_stmt_result_metadata(), because the latter returns MYSQL_RES, which is part of old text-based MySQL API.

Notable differences from tinyapi:

  • C type names reflects tarantool convention (say, no _t suffix).
  • Work with Tarantool SQL types instead of Msgpack ones.
  • struct tnt_stmt_bind field names is the same as in MySQL API, because I don't see a reason to change them.
  • No tnt_col* functions, because they do the same the result bindings do.
  • No tnt_query() shortcut, because it does not allow to use bindings, only tnt_col* functions (and so no input params binding at all).
  • tnt_stmt_errno() instead of tnt_stmt_code() to match MySQL API.
  • tnt_stmt_bind_param() and tnt_stmt_bind_result() instead of tnt_bind_query_param() and tnt_bind_result() to match MySQL API.
  • Hide tnt_fulfill() that looks as the internal function.

Further work

The API has some parts that are not fully defined now, because I think it is easier to elaborate some questions right during coding. Here is the list of such parts.

  • Support more Tarantool SQL types.
  • Handle -2^32..2^64-1 INTEGER in param / result bindings correctly.
  • Define return value type for tnt_stmt_autoincrement_ids(); think about multipart keys (but I don't sure they can be here). Maybe just int64_t *.
  • Define error codes and errors.
  • Define return value for functions like tnt_stmt_affected_rows(), which return amount of some values. A result should fit to uint32_t (because it is max size of msgpack array / map / string / bin), but we need to report an error, so maybe it should be int64_t.

Further ideas

I factored out some ideas from the main API proposal, because they change the API significantly comparing to MySQL C API and so I'm tentative about them. However that way the API would look more native for the connector and maybe would be more convenient for users.

tnt_stmt_bind_param() replacement

tnt_stmt_bind_param() performs quite similar actions as tnt_object_add_*() functions and convenient printf-like tnt_object_format(), tnt_object_vformat() functions. Creating bind structures for parameters involves more actions for a user, so maybe it is worth to integrate tnt_stmt with tnt_object instead of provide tnt_stmt_bind_param().

We also should consider that tnt_object works in terms of msgpack types, but tnt_stmt uses Tarantool SQL types. Latter ones are often unions of msgpack types: say, INTEGER is union of all signed and unsigned msgpack integral types. It does not matter much for parameters binding, because a user always knows an exact type, but looks as the abstraction leak. However it matters much for tnt_stmt_bind_result(), see the next subsection.

tnt_stmt_bind_result() replacement

The same thing about tnt_stmt_bind_result(): maybe it worth to implement msgpack decoding on the tnt_object level and integrate it with tnt_stmt.

I see the possible problem here with Tarantool SQL types, which are unions of msgpack types. We don't know whether a INTEGER value will be in the int64_t range or in the uint64_t range. The similar problem with the tarantool's 'number' type, which able to store both integral and floating point types (it is not a part of Tarantool SQL types, but likely will be in the future).

We also can provide convenient scanf-like functions. However there are unclear things here:

  • How to report a string (and maybe int) truncation? Just as a function error, not per-variable bool *error fields?
  • How to report that a value is null?
  • Which storage should the function expect for the INTEGER SQL type?

I see the two ways to handle the last problem:

  • Obligate a user to try with int64_t storage and do the second attempt in case of truncation error (but how a user will determine which value is truncated?).
  • Provide {bool, union {int64_t, uint64_t}} type and obligate a user to use it.

ODBC API support

Expected difficulties

I'm expecting the following difficulties with ODBC API.

The C Data Types part of ODBC standard explicitly states that SQLINTEGER is defined as long int and SQLUINTEGER is defined as unsigned long int. How to expose our INTEGER (-2^63..2^64-1) SQL type?

SQLBindParameter() and SQLBindCol() allows to bind values in an arbitrary order (as far as I see from the API), this can lead to holes in an array of struct tnt_stmt_bind, so maybe we should actually bind values with tnt_stmt_bind_*() functions right before performing a request (and check for holes here).

API

Here we'll track ODBC API support. Now it is just list of functions. When we'll have some coverage it will be updated with supporting information. Based on ODBC Function Summary.

Mind the designations:

  • If a function is supported ('+') then it is implemented and callable.
  • If a function is not supported ('-') then it is not implemented or is implemented with a stub which always returns error codes.
  • If a function is covered ('+') then it is covered with tests which ensures that function behavior satisfies API specification.
  • If a function is semi-covered ('+/-') then it has some tests, but is not fully covered.
  • If a function is not covered ('-'), then it does not have tests at all.
Connecting to a data source
Function Supported? Covered? Notes
SQLAllocHandle + +/-
SQLConnect + +/-
SQLDriverConnect + +/-
SQLBrowseConnect - -
Obtaining information about a driver and data source
Function Supported? Covered? Notes
SQLDataSources - -
SQLDrivers - -
SQLGetInfo + -
SQLGetFunctions - -
SQLGetTypeInfo + +/-
Setting and retrieving driver attributes
Function Supported? Covered? Notes
SQLSetConnectAttr + -
SQLGetConnectAttr + -
SQLSetEnvAttr - -
SQLGetEnvAttr - -
SQLSetStmtAttr - -
SQLGetStmtAttr + -
Setting and retrieving descriptor fields
Function Supported? Covered? Notes
SQLGetDescField - -
SQLGetDescRec - -
SQLSetDescField - -
SQLSetDescRec - -
SQLCopyDesc - -
Preparing SQL requests
Function Supported? Covered? Notes
SQLPrepare + +/-
SQLBindParameter + +/-
SQLGetCursorName - -
SQLSetCursorName - -
SQLSetScrollOptions - -
Submitting requests
Function Supported? Covered? Notes
SQLExecute + +/-
SQLExecDirect + +/-
SQLNativeSql + -
SQLDescribeParam - -
SQLNumParams + -
SQLParamData - -
SQLPutData - -
Retrieving results and information about results
Function Supported? Covered? Notes
SQLRowCount + +/-
SQLNumResultCols + +/-
SQLDescribeCol + +/-
SQLColAttribute + +/-
SQLBindCol + +/-
SQLFetch + +/-
SQLFetchScroll - -
SQLGetData + +/-
SQLSetPos - -
SQLBulkOperations - -
SQLMoreResults - -
SQLGetDiagField + +/-
SQLGetDiagRec + +/-
Obtaining information about the data source's system tables (catalog functions)
Function Supported? Covered? Notes
SQLColumnPrivileges - -
SQLColumns + +/-
SQLForeignKeys - -
SQLPrimaryKeys - -
SQLProcedureColumns - -
SQLProcedures - -
SQLSpecialColumns + +/-
SQLStatistics + +/-
SQLTablePrivileges - -
SQLTables + -
Terminating a statement
Function Supported? Covered? Notes
SQLFreeStmt + -
SQLCloseCursor + -
SQLCancel + -
SQLCancelHandle - -
SQLEndTran + -
Terminating a connection
Function Supported? Covered? Notes
SQLDisconnect + +/-
SQLFreeHandle + +/-