Importing/Exporting Data  «Prev  Next»

Lesson 4 The BULK INSERT statement
Objective Use the BULK INSERT statement.

Using BULK INSERT in SQL Server 2025

SQL Server includes BULK INSERT to load large volumes of data from a file into a table efficiently. In modern environments, this typically means importing CSV or delimited files that reside on a local disk, a network share, or cloud storage that SQL Server can access. The lesson objective is to understand the core BULK INSERT workflow, the options that control parsing and batching, and the operational/security constraints that determine whether the import succeeds.

In SQL Server 2025, BULK INSERT remains a primary T-SQL approach for file-to-table ingestion when you want: (1) a repeatable script, (2) predictable performance characteristics, and (3) explicit control of delimiters, row handling, and error tolerance. For more advanced pipelines, you may also use OPENROWSET(BULK...), SSIS, Fabric/Data Factory, or external ETL tools—but BULK INSERT is still the foundational option worth mastering.


Prerequisites and operational requirements

Before you run BULK INSERT, validate these practical requirements. Most import failures are not “syntax problems”; they are file-access, permissions, or format mismatches.

  1. A target table that matches the incoming data
    • Column order and datatypes must align with the file (or you must use a format file to map fields).
    • Plan for NULL handling, date/time formats, and decimal separators.
  2. SQL Server must be able to access the file
    • On-premises / VM: the file must be on a local path that the SQL Server service account can read, or on a network share (UNC path) with appropriate permissions.
    • Cloud storage: if your file is in object storage (for example, Azure Blob Storage), SQL Server must be configured to authenticate and reach that storage endpoint using a credential + external data source pattern (when applicable), and network/security controls must allow the connection.
  3. Security and least privilege
    • Grant only the minimal permissions required to read the file location.
    • Avoid embedding long-lived secrets in scripts. Prefer scoped credentials, managed identities, or short-lived tokens.
    • Use private networking controls where feasible (private endpoints, restricted firewall rules, and audited access).
  4. Decide how you will handle failures
    • Do you want the load to stop on the first error, or tolerate a limited number of bad rows?
    • Do you need a reject file or error file for remediation?
    • Will you load into a staging table first, then validate/transform into the final table?



Step-by-step: a practical BULK INSERT workflow

  1. Create a target table (or a staging table)

    For production workflows, a staging table is often preferable. It lets you load quickly, then validate and transform data before merging into the final schema.

    CREATE TABLE dbo.YourTargetTable
    (
        Column1 int            NOT NULL,
        Column2 nvarchar(100)  NOT NULL,
        Column3 datetime2(0)   NULL
    );
    
    • datetime2 is generally preferred over legacy datetime for precision and clarity.
  2. Confirm file accessibility from the SQL Server service context

    The SQL Server service account must be able to read the file. This is especially important with UNC paths and hardened environments.

    • Local path example: C:\imports\yourfile.csv
    • UNC path example: \\fileserver\imports\yourfile.csv
  3. Execute BULK INSERT with explicit CSV options

    This example loads a CSV file, skips a header row, and declares delimiters explicitly to avoid ambiguity.

    BULK INSERT dbo.YourTargetTable
    FROM 'C:\imports\yourfile.csv'
    WITH
    (
        FORMAT = 'CSV',
        FIRSTROW = 2,
        FIELDTERMINATOR = ',',
        ROWTERMINATOR = '0x0a',
        TABLOCK
    );
    
    • FIRSTROW = 2 skips a header row (common with CSV exports).
    • ROWTERMINATOR = '0x0a' (LF) is often safer than '\n' across tooling variations.
    • TABLOCK can improve performance for bulk loads (subject to your recovery model and concurrency needs).
  4. Validate results and apply constraints after the load

    A common pattern is: load → validate → transform → merge. If you load directly into the final table, you still need a validation step to catch data issues early.

    SELECT COUNT(*) AS rows_loaded
    FROM dbo.YourTargetTable;
    
    SELECT TOP (50) *
    FROM dbo.YourTargetTable
    ORDER BY Column1;
    

Additional options that matter in real imports

The most important BULK INSERT options are the ones that control parsing, batching, and error behavior. These options are where you tune for speed and operational resilience.

  • FIRSTROW: Skip header rows or metadata rows at the top of the file.
  • FIELDTERMINATOR: Define the delimiter between fields (comma, tab, pipe, etc.).
  • ROWTERMINATOR: Define line endings explicitly. In mixed environments, hex terminators can reduce surprises.
  • MAXERRORS: Set a tolerance threshold before canceling the load.
  • ERRORFILE: Capture rejected rows to a file for review and remediation.
  • KEEPNULLS: Preserve NULLs from the file instead of applying default constraints implicitly.
  • TABLOCK: Acquire a bulk update lock for better throughput (use with awareness of concurrency impacts).
  • BATCHSIZE: Commit data in batches (each batch in its own transaction), improving recoverability and reducing log pressure.
  • ROWS_PER_BATCH: Provide an estimate to help SQL Server optimize internal resources for the load.

In performance tuning terms, BATCHSIZE is typically the most operationally significant knob because it affects transaction log behavior, lock durations, and rollback scope when something fails mid-load.


Notes about batching and ordering

When using BULK INSERT, batching options influence how data is grouped and committed:

  1. ROWS_PER_BATCH: Provides SQL Server an estimate for how many rows are in each batch. This can help with memory grants and internal planning. Depending on file size and settings, SQL Server may still treat the load as a single operation, but this estimate can improve efficiency.
  2. BATCHSIZE: Commits the load in discrete transactions. If a load fails, only the current batch is rolled back, which is often preferable for large files and operational stability.

If you specify both ROWS_PER_BATCH and BATCHSIZE, the commit behavior is governed by BATCHSIZE. Use ROWS_PER_BATCH primarily as a planning hint; use BATCHSIZE when you need transaction control.

If you use an ORDER option (in workflows that support it), remember that sorting and comparisons are evaluated using the database or column collation. If the file is not actually sorted according to that collation, or if the destination table/index does not support the requested ordering, SQL Server may ignore the order hint. [1]


Specifying a format file

When you need more control than “the file columns match the table columns,” use a format file. A format file explicitly maps fields in the incoming file to columns in the destination table and can also describe field terminators, fixed-width fields, and datatype expectations.

A format file is useful when:

  1. The file contains a different number of fields than the destination table.
  2. The file’s field order differs from the table’s column order.
  3. Delimiters vary, or the file mixes fixed-width and delimited segments.

In practice, format files are commonly created and maintained using the bcp utility. Once you have a stable format file, it becomes a durable “contract” between producers of the file and your database load process.


Bulk insert statement in SQL Server - Format file
| # | Column Name | Data Type | Precision | Scale | Format / Default | Nullable | Description |
| - | ----------- | --------- | --------- | ----- | ---------------- | -------- | ----------- |
| 1 | au_id       | SQLCHAR   | 0         | 11    | ""               | 2        | au_id       |
| 2 | au_lname    | SQLCHAR   | 0         | 40    | "\t"             | 3        | au_lname    |
| 3 | au_fname    | SQLCHAR   | 0         | 20    | ""               | 4        | au_fname    |
| 4 | phone       | SQLCHAR   | 0         | 12    | ""               | 5        | phone       |
| 5 | address     | SQLCHAR   | 0         | 40    | ""               | 6        | address     |
| 6 | city        | SQLCHAR   | 0         | 20    | ""               | 7        | city        |
| 7 | state       | SQLCHAR   | 0         | 2     | ""               | 8        | state       |
| 8 | zip         | SQLCHAR   | 0         | 5     | ""               | 9        | zip         |
| 9 | contract    | SQLBIT    | 0         | 1     | ""               |          | contract    |
Example of a bulk-load format definition that maps file fields to SQL Server table columns.

  1. 7.0: The version of the bcp program used to generate the file (format versions vary by tool/version).
  2. 9: The number of fields described in the format definition.
  3. 1–9 (leftmost numeric column): The order of each field as it appears in the source file.
  4. SQLCHAR / SQLBIT: The host data type for each field (examples include SQLCHAR, SQLNCHAR, SQLINT, SQLBIT, SQLDATETIME, SQLDECIMAL, and others).
  5. 0: The length of the column prefix (commonly 0, 1, 2, or 4 depending on the format style).
  6. 11, 40, …: The maximum length of the field in the source file.
  7. "" or "\t": The field terminator (empty for fixed-length segments; \t for tab-delimited fields).
  8. 1–9 (server column order): The ordinal position of the destination column in the SQL Server table.
  9. au_id, au_lname, …: A descriptive label, typically the column name, used for readability and maintenance.

To skip a field from the data file, a common technique is to set the server column order to 0, use a prefix length of 0, and define terminators so the loader can still advance through the input correctly.

The next lesson will cover using the bcp utility to export and import data, and how format files fit into a repeatable bulk-load workflow.

Loading Data - Quiz

Click the Quiz link below to verify that you can identify the best loading approach for a given scenario and interpret the key BULK INSERT options.

Loading Data - Quiz


[1]Collation sequence: Determines the order in which Unicode character data is compared and sorted.

SEMrush Software 4 SEMrush Banner 4