Database Tutorial

Course Tutorial Site

Site Admin

MERGE

without comments

Sometimes you need to insert new data and update existing data to relational tables during a bulk import of a physical file. You can accomplish this type of insert or update activity using a MERGE statement in Oracle.

Data import files often present interesting challenges because they frequently contain rows of data that belong in multiple tables (denormalized data sets). Import processes have to deal with this reality and discover the rules in order to break them up into normalized data sets. Only normalized data sets fit into the merge processes, because they work with single tables.

You need to understand why these files contain data from multiple tables. Although data modelers normalize information into single subject tables to avoid insertion, deletion, and update anomalies, analysts seldom use information in isolation because the data is useful to them only when it’s been assembled into information. Analysts typically apply business rules against data from a set of related tables. This means that they get these denormalized records sets from a query. As qualified earlier in this appendix, queries let you assemble data into meaningful and actionable information, which makes the information as a whole greater than the sum of its data parts.

As developers of database-centric applications, we seldom have control over the origin of these files. Although it’s not critical that we know where the files come from and why they’re important, it can be helpful. Typically called flat, loader, or batch import files, these files are import sources that feed corrections and additions into our data repositories.

Import sources come from many places, such as from business staff who sanitize data (analyzing data against business rules and fixing it) or from other business partners. Business partners can be organizations within the corporation or company, or other companies with whom your company does business. The business staff sanitizing or exchanging data within the company are intra-company import sources.

Intra-company imports can come from other IT organizations or from finance departments without professional IT staff. Those coming from other IT organizations are considered business- to-business (B2B) exchanges, and they package their outgoing files as export files. These export files support order management systems or financial systems.

Professionally packaged export files are typically formatted in an agreed upon XML (eXtensible Markup Language) or EDI (Electronic Data Interchange) formats. Those coming from your company’s internal finance, accounting, or operations departments are considered consumer-to-business (C2B) exchanges. That label works because the import files are typically comma-separated value (CSV) files from Microsoft Excel, which is what you would expect from
a consumer (internal information consumer). These XML files are typically managed by procedural programming interfaces.

This section demonstrates importing and merging data based on CSV and XML files on an ad hoc basis. This section shows you how to use the MERGE statement with Oracle external tables as source files for the import source. You can flip back to the “External Tables” section if you need to recall how to set up external tables.

The MERGE statement in Oracle works similarly to the INSERT ALL statement presented earlier in this appendix. It merges data from a query into a target table based on criteria evaluated in the ON subclause. The query in the subclause can be described as the source result set. The result set is like a virtual table, and virtual tables can be composed of data from multiple tables or a subset of data from one table. The results from a query can be thought of as the source table. The MERGE and INSERT ALL statements are different in two ways: the WHEN clause is limited to two logical conditions, MATCHED and NOT MATCHED, and the MERGE statement works with only one target table.

The MERGE statement has the following prototype:

MERGE INTO table_name
USING (select_statement) query_alias
ON ( condition_match [ {AND | OR } condition_match [...]] ) WHEN MATCHED THEN
update_statement
WHEN NOT MATCHED THEN
insert_statement;

Merging data from a source table to a target table requires that you know the columns that define the natural key of the target table, and in some cases the natural keys of all tables collapsed into the import source file. It also requires an outer join based on the natural key held between the source and target tables, to ensure that it returns the relative complement of the target table. The relative complement would be all the rows found in the source table that aren’t found in the target table. The set of rows in the relative complement exists only when you’re adding new rows from an import source file.

External data sets seldom have a copy of surrogate keys, because those columns aren’t useful to analysts working with the data. The absence of surrogate keys means that you need to use the natural keys to determine how to get parts of the import source into their respective normalized tables. An outer join based on the natural key in the source query always returns a null value as the surrogate key. This is helpful for two reasons: the lack of a surrogate key identifies new rows, and surrogate keys can be auto-generated by available sequences. You attempt a surrogate key match in the ON clause of a MERGE statement. New rows that fail to match are inserted, while those that match are updated. New rows are assigned new surrogate key values.

Although not all import source files contain new rows, most do. The following sections contain the steps necessary for importing or modifying data through bulk uploads. Bulk imports are frequently accomplished through the use of externally organized tables. As introduced earlier in this appendix, an externally organized table is a table that points to a flat file (often a CSV file) deployed on the operating system. External tables in Oracle require that a DBA set up virtual directories and grants.

Written by michaelmclaughlin

August 14th, 2018 at 12:53 am

Posted in