
Use Copy Into

When enabled, generated notebooks use the COPY INTO command to read landing files instead of the default spark.read.format().load() approach. COPY INTO provides idempotent file ingestion: files that have already been loaded into the target table are tracked and skipped on subsequent runs, so notebooks can be re-executed without producing duplicate records. Use this setting when each landing file must be loaded exactly once, or when a run may occasionally pick up the same set of files again.
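As a rough illustration of the difference, the sketch below builds a COPY INTO statement of the kind a notebook could run in place of a plain read. It is not the code the generated notebooks actually contain; the helper build_copy_into, the table name bronze.orders, and the path /mnt/landing/orders/ are hypothetical names chosen for this example.

```python
# Hypothetical helper: builds a Databricks COPY INTO statement as a string.
# Table name, path, and function name are illustrative assumptions only.

# File formats accepted by the FILEFORMAT clause of COPY INTO.
SUPPORTED_FORMATS = {"CSV", "JSON", "AVRO", "ORC", "PARQUET", "TEXT", "BINARYFILE"}


def build_copy_into(target_table: str, source_path: str, file_format: str = "PARQUET") -> str:
    """Return a COPY INTO statement that loads files from source_path into target_table."""
    fmt = file_format.upper()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported file format: {file_format}")
    return (
        f"COPY INTO {target_table}\n"
        f"FROM '{source_path}'\n"
        f"FILEFORMAT = {fmt}"
    )


statement = build_copy_into("bronze.orders", "/mnt/landing/orders/")
print(statement)

# Default approach (re-reads every file in the path on each run):
#   df = spark.read.format("parquet").load("/mnt/landing/orders/")
#
# With the setting enabled (already-loaded files are skipped on re-runs):
#   spark.sql(statement)
```

The two commented calls at the end require a Databricks notebook, where the spark session is predefined. Running the COPY INTO statement a second time against unchanged landing files should load no new rows, whereas appending the result of the plain read a second time would duplicate them.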

Notes:

  • This setting is part of the Databricks settings category.
  • The default value for this setting is N (disabled).