06/07/2023 10:40 | Category: databases

Tags: redshiftetldba

redshift COPY commands can have identity columns

Recently I learned that COPY commandsd can have a flag called EXPLICIT_IDS when working with Redshift. This allows identity based insert operations to occur, which is helpful for ingest jobs.

Additionally, the flag can be passed for tables that do not have an identity column, which will still operate just fine. This is great for if we have lots of tables that we're doing COPY operations on, but don't want to specifically do some backend logic to handle which ones need identity based checking explicitly.

The "laziness" factor of this seems to be beneficial for bulk operations.

References

  • COPY documentation
  • EXPLICIT_IDS documentation
  • COPY command column mapping documentation - The important bits are below.
    • If an IDENTITY column is included in the column list, then EXPLICIT_IDS must also be specified; if
    an IDENTITY column is omitted, then EXPLICIT_IDS can't be specified. If no column list is specified, the command behaves as if a complete, in-order column list was specified, with IDENTITY columns omitted if EXPLICIT_IDS was also not specified.
    • If a column is defined with GENERATED BY DEFAULT AS IDENTITY, then it can be copied.
    Values are generated or updated with values that you supply. The EXPLICIT_IDS option isn't required. COPY doesn't update the identity high watermark. For more information, see GENERATED BY DEFAULT AS IDENTITY.