Cognite Databricks Integration Documentation

All customers: begin with the catalog-based quickstart — install, generate UDTFs from your CDF data model, store credentials in Databricks Secret Manager, and register Unity Catalog functions and views.

Prerequisites: Catalog-based prerequisites


Overview

cognite-databricks provides two approaches for registering and using User-Defined Table Functions (UDTFs):

  1. Catalog-based registration: the default path for most customers. Permanent UDTFs and Views in Unity Catalog, credentials via Secret Manager.
  2. Session-scoped registration: temporary registration in a single Spark session for development and testing.

Choosing the right approach

Use session-scoped when

  • Developing and testing UDTFs before committing to Unity Catalog
  • Prototyping without Unity Catalog persistence
  • Temporary analysis; learning UDTF patterns

Characteristics: functions live only for the session; no permanent catalog objects are created; credential handling is often simpler for ad-hoc use.
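
As a rough illustration of what session-scoped registration means, the sketch below uses PySpark's generic UDTF API (`pyspark.sql.functions.udtf` and `spark.udtf.register`, available from PySpark 3.5) rather than any cognite-databricks helper. The `DemoAssets` class, its `prefix` argument, and the function name `demo_assets` are hypothetical stand-ins for a generated UDTF, and no CDF call is made.

```python
from typing import Iterator, Tuple


class DemoAssets:
    """Hypothetical stand-in for a generated UDTF; it makes no CDF calls."""

    def eval(self, prefix: str) -> Iterator[Tuple[int, str]]:
        # A UDTF handler yields one tuple per output row.
        for i in range(3):
            yield (i, f"{prefix}-{i}")


def register_for_session(spark, name: str = "demo_assets"):
    """Register DemoAssets as a temporary function in this Spark session only."""
    # Imported lazily so the handler class stays usable without PySpark installed.
    from pyspark.sql.functions import udtf

    wrapped = udtf(DemoAssets, returnType="id: int, name: string")
    # Nothing is written to Unity Catalog; the function disappears with the session.
    return spark.udtf.register(name, wrapped)
```

In a notebook, calling `register_for_session(spark)` and then `spark.sql("SELECT * FROM demo_assets('pump')")` should make the function usable until the session ends, assuming the cluster runtime supports Python UDTFs.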

Use catalog-based when

  • Production deployments with governance
  • Data discovery and searchable Views in the Databricks UI
  • Fine-grained access control (GRANT/REVOKE)
  • Team collaboration on shared SQL assets

Characteristics: UDTFs and Views in Unity Catalog; secrets in Secret Manager; Views hide SECRET() details from analysts.
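
To make the idea of a View hiding SECRET() calls more concrete, here is a minimal sketch that only builds SQL strings; it is not the statement cognite-databricks actually generates. The helper names, the `client_id` and `client_secret` parameter names, and the catalog, schema, scope, and group names are all assumptions for illustration. It also assumes the UDTF accepts named arguments and that `GRANT ... ON TABLE` is used for the View.

```python
from typing import Dict


def build_view_sql(view: str, udtf: str, scope: str, secret_args: Dict[str, str]) -> str:
    """Return DDL for a view that calls a UDTF with credentials read via SECRET()."""
    # secret_args maps a (hypothetical) UDTF parameter name to a key in the secret scope.
    args = ", ".join(
        f"{param} => SECRET('{scope}', '{key}')" for param, key in secret_args.items()
    )
    return f"CREATE OR REPLACE VIEW {view} AS SELECT * FROM {udtf}({args})"


def grant_select(view: str, principal: str) -> str:
    """Return a GRANT statement that lets a principal query the view."""
    return f"GRANT SELECT ON TABLE {view} TO `{principal}`"


def revoke_select(view: str, principal: str) -> str:
    """Return the matching REVOKE statement."""
    return f"REVOKE SELECT ON TABLE {view} FROM `{principal}`"


if __name__ == "__main__":
    # All object names below are placeholders, not values from the package.
    print(build_view_sql(
        "main.cdf.pump_view",
        "main.cdf.pump_udtf",
        "cdf_scope",
        {"client_id": "client_id", "client_secret": "client_secret"},
    ))
    print(grant_select("main.cdf.pump_view", "analysts"))
```

Analysts would then run `SELECT * FROM main.cdf.pump_view` without seeing or supplying the secret scope and keys; each statement could be executed with `spark.sql(...)` by a user with sufficient privileges.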


Documentation structure

Catalog-based (start here)

Session-scoped

Examples

  • Catalog-based quickstart: quickstart.ipynb
  • Session-scoped: examples/session_scoped/
  • Other catalog examples: examples/catalog_based/

Package architecture

cognite-databricks extends pygen-spark with Databricks-specific features:

  • Code generation: cognite-pygen-spark templates for Data Model and time series UDTFs
  • Databricks-specific: Unity Catalog SQL registration and Secret Manager integration

Import paths for generic components:

from cognite.pygen_spark import TypeConverter, CDFConnectionConfig, to_udtf_function_name
# Or: from cognite.databricks import ...

  • README: Package overview and installation
  • pygen-spark: Generic Spark UDTF code generation