Apache.Arrow.Adbc.Drivers.Databricks 0.22.0
Install the package with your preferred tool:
- .NET CLI: dotnet add package Apache.Arrow.Adbc.Drivers.Databricks --version 0.22.0
- Package Manager: NuGet\Install-Package Apache.Arrow.Adbc.Drivers.Databricks -Version 0.22.0
- PackageReference: <PackageReference Include="Apache.Arrow.Adbc.Drivers.Databricks" Version="0.22.0" />
- Central package management: <PackageVersion Include="Apache.Arrow.Adbc.Drivers.Databricks" Version="0.22.0" /> in Directory.Packages.props, with <PackageReference Include="Apache.Arrow.Adbc.Drivers.Databricks" /> in the project file
- Paket: paket add Apache.Arrow.Adbc.Drivers.Databricks --version 0.22.0
- F# Interactive: #r "nuget: Apache.Arrow.Adbc.Drivers.Databricks, 0.22.0"
- File-based apps: #:package Apache.Arrow.Adbc.Drivers.Databricks@0.22.0
- Cake addin: #addin nuget:?package=Apache.Arrow.Adbc.Drivers.Databricks&version=0.22.0
- Cake tool: #tool nuget:?package=Apache.Arrow.Adbc.Drivers.Databricks&version=0.22.0
Databricks Driver
The Databricks ADBC driver is built on top of the Spark ADBC driver and inherits all of its properties, plus additional Databricks-specific functionality.
Database and Connection Properties
Note: The Databricks driver inherits all properties from the Spark driver. The properties below are Databricks-specific additions.
Configuration Methods
The Databricks driver supports multiple ways to configure properties:
1. Direct Property Configuration
Pass properties directly when creating the driver connection (traditional method).
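For illustration, a minimal end-to-end sketch is shown below. It assumes the standard Apache.Arrow.Adbc C# API shape (AdbcDriver.Open, AdbcDatabase.Connect) and a DatabricksDriver entry point in this package; the host and path keys follow the Spark driver's conventions (see the Spark driver README for the exact names), and all values are placeholders.

```csharp
using System;
using System.Collections.Generic;
using Apache.Arrow;
using Apache.Arrow.Adbc;
using Apache.Arrow.Adbc.Drivers.Databricks;
using Apache.Arrow.Ipc;

// All values are placeholders; host/path keys follow the Spark driver's conventions.
var parameters = new Dictionary<string, string>
{
    ["adbc.spark.host"] = "my-workspace.cloud.databricks.com",  // assumed Spark-driver key
    ["adbc.spark.path"] = "/sql/1.0/warehouses/<warehouse-id>", // assumed Spark-driver key
    ["adbc.spark.auth_type"] = "oauth",
    ["adbc.databricks.oauth.grant_type"] = "access_token",
    ["adbc.spark.oauth.access_token"] = "<personal-access-token>",
};

AdbcDriver driver = new DatabricksDriver();                     // assumed entry point
using AdbcDatabase database = driver.Open(parameters);
using AdbcConnection connection = database.Connect(new Dictionary<string, string>());
using AdbcStatement statement = connection.CreateStatement();
statement.SqlQuery = "SELECT 1 AS one";

// Results come back as a stream of Arrow record batches.
QueryResult result = statement.ExecuteQuery();
using IArrowArrayStream stream = result.Stream;
while (await stream.ReadNextRecordBatchAsync() is RecordBatch batch)
{
    using (batch)
    {
        Console.WriteLine($"Read a batch of {batch.Length} rows");
    }
}
```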
2. Environment Variable Configuration
Configure properties using a JSON file loaded via environment variables:
- Create a JSON configuration file with standard ADBC parameters:
```json
{
  "adbc.databricks.driver_config_take_precedence": "true",
  "adbc.databricks.enable_pk_fk": "false",
  "adbc.connection.catalog": "my_catalog",
  "adbc.connection.db_schema": "my_schema"
}
```
Note: All values in the JSON configuration file must be strings (including numbers, booleans, and file paths). For example, use "true" instead of true, and "4443" instead of 4443.
Example: Using mitmproxy to Inspect Thrift Traffic
To inspect Thrift traffic using mitmproxy, you can configure the Databricks driver to use a local proxy with TLS interception. Below is an example JSON configuration:
```json
{
  "adbc.databricks.driver_config_take_precedence": "true",
  "adbc.proxy_options.use_proxy": "true",
  "adbc.proxy_options.proxy_host": "localhost",
  "adbc.proxy_options.proxy_port": "4443",
  "adbc.http_options.tls.enabled": "true",
  "adbc.http_options.tls.allow_self_signed": "true",
  "adbc.http_options.tls.disable_server_certificate_validation": "true",
  "adbc.http_options.tls.allow_hostname_mismatch": "true",
  "adbc.http_options.tls.trusted_certificate_path": "C:\\your-path-to\\mitmproxy-ca-cert.pem"
}
```
Set the system environment variable DATABRICKS_CONFIG_FILE to point to your JSON file. On Windows:
- Open System Properties → Advanced → Environment Variables
- Add a new system variable: Name = DATABRICKS_CONFIG_FILE, Value = C:\path\to\your\config.json
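For processes you launch yourself, the variable can also be set programmatically before the first connection is opened; a minimal sketch (the path is a placeholder):

```csharp
using System;

// Point the driver at the JSON configuration file before opening any connection.
// The path below is a placeholder.
Environment.SetEnvironmentVariable("DATABRICKS_CONFIG_FILE", @"C:\path\to\your\config.json");
```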
Property merging behavior:
- By default, constructor/code properties override environment config properties.
- With "adbc.databricks.driver_config_take_precedence": "true", environment config properties override constructor/code properties.
3. Hybrid Configuration
You can combine both methods: the driver automatically merges the environment config with constructor properties based on the precedence setting.
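The following toy sketch illustrates the documented precedence rule; it is not the driver's actual implementation:

```csharp
using System.Collections.Generic;

// Toy illustration of the documented merge rule (not the driver's actual code).
// By default code-supplied properties win; when the config file sets
// adbc.databricks.driver_config_take_precedence to "true", file properties win.
static Dictionary<string, string> Merge(
    IReadOnlyDictionary<string, string> fromCode,
    IReadOnlyDictionary<string, string> fromFile)
{
    bool filePrecedence =
        fromFile.TryGetValue("adbc.databricks.driver_config_take_precedence", out var flag)
        && flag == "true";

    var merged = new Dictionary<string, string>();
    // Copy the losing source first, then let the winning source overwrite.
    foreach (var kv in filePrecedence ? fromCode : fromFile) merged[kv.Key] = kv.Value;
    foreach (var kv in filePrecedence ? fromFile : fromCode) merged[kv.Key] = kv.Value;
    return merged;
}

var effective = Merge(
    fromCode: new Dictionary<string, string> { ["adbc.databricks.enable_pk_fk"] = "true" },
    fromFile: new Dictionary<string, string> { ["adbc.databricks.enable_pk_fk"] = "false" });
// effective["adbc.databricks.enable_pk_fk"] == "true" (code wins by default)
```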
Use Cases:
- PowerBI Integration: Set system-wide defaults via environment config while allowing connection-specific overrides
Authentication Properties
| Property | Description | Default |
|---|---|---|
| adbc.databricks.oauth.grant_type | The OAuth grant type. Supported values: access_token (personal access token), client_credentials (OAuth client credentials flow) | access_token |
| adbc.databricks.oauth.client_id | The OAuth client ID (when using the client_credentials grant type) | |
| adbc.databricks.oauth.client_secret | The OAuth client secret (when using the client_credentials grant type) | |
| adbc.databricks.oauth.scope | The OAuth scope (when using the client_credentials grant type) | sql |
| adbc.databricks.token_renew_limit | Minutes before token expiration to start renewing the token. Set to 0 to disable automatic renewal | 0 |
| adbc.databricks.identity_federation_client_id | The client ID of the service principal when using workload identity federation | |
CloudFetch Properties
CloudFetch is Databricks' high-performance result retrieval system that downloads result data directly from cloud storage.
| Property | Description | Default |
|---|---|---|
| adbc.databricks.cloudfetch.enabled | Whether to use CloudFetch for retrieving results | true |
| adbc.databricks.cloudfetch.lz4.enabled | Whether the client can decompress LZ4-compressed results | true |
| adbc.databricks.cloudfetch.max_bytes_per_file | Maximum bytes per file for CloudFetch. Supports unit suffixes (B, KB, MB, GB). Examples: 20MB, 1024KB, 20971520 | 20MB |
| adbc.databricks.cloudfetch.parallel_downloads | Maximum number of parallel downloads | 3 |
| adbc.databricks.cloudfetch.prefetch_count | Number of files to prefetch | 2 |
| adbc.databricks.cloudfetch.memory_buffer_size_mb | Maximum memory buffer size in MB for prefetched files | 200 |
| adbc.databricks.cloudfetch.prefetch_enabled | Whether CloudFetch prefetch functionality is enabled | true |
| adbc.databricks.cloudfetch.max_retries | Maximum number of retry attempts for downloads | 3 |
| adbc.databricks.cloudfetch.retry_delay_ms | Delay in milliseconds between retry attempts | 500 |
| adbc.databricks.cloudfetch.timeout_minutes | Timeout in minutes for HTTP operations | 5 |
| adbc.databricks.cloudfetch.url_expiration_buffer_seconds | Buffer time in seconds before URL expiration to trigger refresh | 60 |
| adbc.databricks.cloudfetch.max_url_refresh_attempts | Maximum number of URL refresh attempts | 3 |
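CloudFetch settings are passed through the same property dictionary as any other option. A short sketch with illustrative values (not tuning recommendations):

```csharp
using System.Collections.Generic;

// Illustrative CloudFetch tuning; all values must be strings.
var cloudFetchProperties = new Dictionary<string, string>
{
    ["adbc.databricks.cloudfetch.enabled"] = "true",
    ["adbc.databricks.cloudfetch.max_bytes_per_file"] = "50MB",
    ["adbc.databricks.cloudfetch.parallel_downloads"] = "6",
    ["adbc.databricks.cloudfetch.memory_buffer_size_mb"] = "400",
};
```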
Databricks-Specific Properties
| Property | Description | Default |
|---|---|---|
| adbc.connection.catalog | Optional default catalog for the session | |
| adbc.connection.db_schema | Optional default schema for the session | |
| adbc.databricks.enable_direct_results | Whether to enable the use of direct results when executing queries | true |
| adbc.databricks.apply_ssp_with_queries | Whether to apply server-side properties (SSP) with queries. If false, SSP is applied when opening the session | false |
| adbc.databricks.ssp_* | Server-side property prefix. Properties with this prefix are passed to the server by executing "SET key=value" queries | |
| adbc.databricks.enable_multiple_catalog_support | Whether to use multiple catalogs | true |
| adbc.databricks.enable_pk_fk | Whether to enable primary key/foreign key metadata calls | true |
| adbc.databricks.use_desc_table_extended | Whether to use DESC TABLE EXTENDED to get extended column metadata when supported by DBR | true |
| adbc.databricks.enable_run_async_thrift | Whether to enable the RunAsync flag in Thrift operations | true |
| adbc.databricks.driver_config_take_precedence | Whether driver configuration overrides passed-in properties during configuration merging | false |
| adbc.apache.statement.batch_size | Maximum number of rows to retrieve in a single batch request | 2000000 |
| adbc.apache.connection.polltime_ms | Time in milliseconds between polls for query execution status. The Databricks default is 100 ms (the Apache default is 500 ms) | 100 |
Tracing Properties
| Property | Description | Default |
|---|---|---|
| adbc.databricks.trace_propagation.enabled | Whether to propagate trace parent headers in HTTP requests | true |
| adbc.databricks.trace_propagation.header_name | The name of the HTTP header to use for trace parent propagation | traceparent |
| adbc.databricks.trace_propagation.state_enabled | Whether to include the trace state header in HTTP requests | false |
Authentication Methods
The Databricks ADBC driver supports the following authentication methods:
1. Token-based Authentication
Using a Databricks personal access token:
- Set adbc.spark.auth_type to oauth
- Set adbc.databricks.oauth.grant_type to access_token (this is the default if not specified)
- Set adbc.spark.oauth.access_token to your Databricks personal access token
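Put together, a personal-access-token configuration might look like this sketch (the token is a placeholder):

```csharp
using System.Collections.Generic;

// Personal access token authentication; the token value is a placeholder.
var patProperties = new Dictionary<string, string>
{
    ["adbc.spark.auth_type"] = "oauth",
    ["adbc.databricks.oauth.grant_type"] = "access_token", // default; shown for clarity
    ["adbc.spark.oauth.access_token"] = "<personal-access-token>",
};
```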
2. OAuth Client Credentials Flow
For machine-to-machine (m2m) authentication:
- Set adbc.spark.auth_type to oauth
- Set adbc.databricks.oauth.grant_type to client_credentials
- Set adbc.databricks.oauth.client_id to your OAuth client ID
- Set adbc.databricks.oauth.client_secret to your OAuth client secret
- Set adbc.databricks.oauth.scope to your auth scope (defaults to sql)
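A corresponding sketch (client ID and secret are placeholders):

```csharp
using System.Collections.Generic;

// OAuth client credentials (machine-to-machine); ID and secret are placeholders.
var m2mProperties = new Dictionary<string, string>
{
    ["adbc.spark.auth_type"] = "oauth",
    ["adbc.databricks.oauth.grant_type"] = "client_credentials",
    ["adbc.databricks.oauth.client_id"] = "<oauth-client-id>",
    ["adbc.databricks.oauth.client_secret"] = "<oauth-client-secret>",
    ["adbc.databricks.oauth.scope"] = "sql", // default
};
```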
The driver will automatically handle token acquisition, renewal, and authentication with the Databricks service.
Note: Basic (username and password) authentication is not supported at this time.
Server-Side Properties
Server-side properties allow you to configure Databricks session settings. Any property with the adbc.databricks.ssp_ prefix will be passed to the server by executing SET key=value queries.
For example, setting adbc.databricks.ssp_use_cached_result to true will result in executing SET use_cached_result=true on the server when the session is opened.
The property name after the ssp_ prefix becomes the server-side setting name.
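For instance, the use_cached_result example above can be expressed as follows:

```csharp
using System.Collections.Generic;

// Each adbc.databricks.ssp_* entry is applied on the server as a SET statement;
// this one results in: SET use_cached_result=true
var sessionProperties = new Dictionary<string, string>
{
    ["adbc.databricks.ssp_use_cached_result"] = "true",
};
```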
Data Types
The following table depicts how the Databricks ADBC driver converts a Databricks type to an Arrow type and a .NET type:
| Databricks Type | Arrow Type | C# Type |
|---|---|---|
| ARRAY* | String | string |
| BIGINT | Int64 | long |
| BINARY | Binary | byte[] |
| BOOLEAN | Boolean | bool |
| CHAR | String | string |
| DATE | Date32 | DateTime |
| DECIMAL | Decimal128 | SqlDecimal |
| DOUBLE | Double | double |
| FLOAT | Float | float |
| INT | Int32 | int |
| INTERVAL_DAY_TIME+ | String | string |
| INTERVAL_YEAR_MONTH+ | String | string |
| MAP* | String | string |
| NULL | Null | null |
| SMALLINT | Int16 | short |
| STRING | String | string |
| STRUCT* | String | string |
| TIMESTAMP | Timestamp | DateTimeOffset |
| TINYINT | Int8 | sbyte |
| UNION | String | string |
| USER_DEFINED | String | string |
| VARCHAR | String | string |
Tracing
Tracing Exporters
To observe trace messages, a tracing exporter must be activated. Use either the environment variable OTEL_TRACES_EXPORTER or the parameter adbc.traces.exporter to select one of the supported exporters. The parameter takes precedence over the environment variable and must be set before the connection is initialized.
The following exporters are supported:
| Exporter | Description |
|---|---|
| adbcfile | Exports traces to rotating files in a folder. |
File Exporter (adbcfile)
Rotating trace files are written to a folder. The file names are created with the following pattern:
apache.arrow.adbc.drivers.databricks-<YYYY-MM-DD-HH-mm-ss-fff>-<process-id>.log.
The folder used depends on the platform.
| Platform | Folder |
|---|---|
| Windows | %LOCALAPPDATA%/Apache.Arrow.Adbc/Traces |
| macOS | $HOME/Library/Application Support/Apache.Arrow.Adbc/Traces |
| Linux | $HOME/.local/share/Apache.Arrow.Adbc/Traces |
By default, up to 999 files of maximum size 1024 KB are written to the trace folder.
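A sketch showing both ways to select the file exporter (the parameter wins if both are set):

```csharp
using System;
using System.Collections.Generic;

// Option 1: connection parameter; must be set before the connection is initialized.
var tracingProperties = new Dictionary<string, string>
{
    ["adbc.traces.exporter"] = "adbcfile",
};

// Option 2: environment variable, read when no parameter is supplied.
Environment.SetEnvironmentVariable("OTEL_TRACES_EXPORTER", "adbcfile");
```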
Compatible and computed target frameworks
| Product | Frameworks |
|---|---|
| .NET | net8.0 is compatible; net5.0 through net10.0 (including platform-specific variants such as -windows, -android, -ios) are computed. |
| .NET Core | netcoreapp2.0 through netcoreapp3.1 are computed. |
| .NET Standard | netstandard2.0 is compatible; netstandard2.1 is computed. |
| .NET Framework | net472 is compatible; net461 through net481 are computed. |
| Mono / Xamarin / Tizen | monoandroid, monomac, monotouch, tizen40, tizen60, xamarinios, xamarinmac, xamarintvos, and xamarinwatchos are computed. |
Dependencies
.NET Framework 4.7.2
- Apache.Arrow.Adbc.Drivers.Apache (>= 0.22.0)
- K4os.Compression.LZ4 (>= 1.3.8)
- K4os.Compression.LZ4.Streams (>= 1.3.8)
- Microsoft.IO.RecyclableMemoryStream (>= 3.0.1)
.NET Standard 2.0
- Apache.Arrow.Adbc.Drivers.Apache (>= 0.22.0)
- K4os.Compression.LZ4 (>= 1.3.8)
- K4os.Compression.LZ4.Streams (>= 1.3.8)
- Microsoft.IO.RecyclableMemoryStream (>= 3.0.1)
net8.0
- Apache.Arrow.Adbc.Drivers.Apache (>= 0.22.0)
- K4os.Compression.LZ4 (>= 1.3.8)
- K4os.Compression.LZ4.Streams (>= 1.3.8)
- Microsoft.IO.RecyclableMemoryStream (>= 3.0.1)