DW Handbook

This page shows the name of the construct and its description.

Attribute

Value

Name

Union

Description

None

Function

None

Aim

Allow a data engineer (user) to combine two datasets vertically using a column-wise merge that appends the contents of each column in the second relation to the column at the same index in the base (first) relation, so that the required analysis and data preparation can be performed on a single dataset instead of being repeated on multiple datasets.

Context

This operation is used when a user needs to combine multiple input datasets vertically so that the combined dataset can serve as input to the intended analysis.

Rationale

Combining the datasets required for subsequent operations or the end-goal analysis reduces the repetition of duplicate operations and may increase the accuracy of the analysis the dataset is being prepared for.

Mechanism

Merge datasets vertically to reduce duplicate processing of datasets. This can be done using the facilities found in GUI-based tools or the functions provided by programming languages.
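As a rough illustration of the programming-language route, the sketch below appends the rows of a second dataset beneath a base dataset, matching columns by index. The function name `union_by_position`, the sample data, and the choice to pad missing trailing columns with `None` are assumptions made for this example and are not prescribed by the handbook.

```python
from typing import Any, List, Tuple

Row = Tuple[Any, ...]


def union_by_position(base: List[Row], other: List[Row], n: int) -> List[Row]:
    """Append the rows of `other` below the rows of `base`, matching columns by index.

    `n` is the column count of the base relation. Rows of `other` may have
    fewer columns than `n`; padding the missing trailing columns with None
    is an assumption of this sketch.
    """
    combined = [tuple(row) for row in base]
    for row in other:
        m = len(row)
        if m > n:
            # The second relation may not be wider than the base relation.
            raise ValueError(f"row has {m} columns, base has only {n}")
        combined.append(tuple(row) + (None,) * (n - m))
    return combined


# Hypothetical sample data: two yearly extracts with the same column order.
sales_2023 = [("north", 120), ("south", 95)]
sales_2024 = [("north", 130), ("east",)]

combined = union_by_position(sales_2023, sales_2024, n=2)
print(combined)
# [('north', 120), ('south', 95), ('north', 130), ('east', None)]
```

Subsequent preparation steps can then run once on `combined` rather than once per input dataset.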

Formalism

A(R₁,R₂) = {t | t ∈ R₁ ∨ t ∈ R₂}, where R₁ is a relation with n columns, R₂ is a relation with m columns, m ≤ n, and the columns of R₂ are matched to the columns of R₁ by index

Relational Algebra (RA)

Similar to the RA Union operation (∪)
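In classical relational algebra, union is defined over sets of tuples, so a row that appears in both relations occurs only once in the result. This page does not state whether the construct keeps or drops such duplicates, so the sketch below simply contrasts the two behaviors. The relation contents and variable names are hypothetical.

```python
# Two relations with the same schema, modeled as sets of tuples.
r1 = {("north", 120), ("south", 95)}
r2 = {("north", 120), ("east", 40)}

# Set-based union, as in relational algebra: the shared row appears once.
set_union = r1 | r2

# Plain appending keeps the shared row twice (bag semantics).
appended = sorted(r1) + sorted(r2)

print(len(set_union))  # 3
print(len(appended))   # 4
```

Which behavior is appropriate depends on whether repeated rows carry meaning in the analysis the combined dataset is prepared for.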

Type

Atomic

Class

N-Ary

Transformation_category

M:1

Inputs

Inputs	Number of input datasets
Input datasets to merge	M

Outputs

Outputs	Number of output datasets
Combined dataset	1
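The M:1 shape described by the inputs and outputs above (M input datasets, one output dataset) might be sketched as follows. The helper name `union_all` and the quarterly sample data are assumptions for illustration, and the sketch presumes that all inputs already share the same column order.

```python
from itertools import chain
from typing import Any, Iterable, List, Tuple

Row = Tuple[Any, ...]


def union_all(datasets: Iterable[List[Row]]) -> List[Row]:
    """Stack any number of datasets vertically into a single list of rows."""
    return list(chain.from_iterable(datasets))


# Hypothetical inputs: M = 3 datasets with identical column order.
q1 = [("jan", 10), ("feb", 12)]
q2 = [("apr", 9)]
q3 = [("jul", 14), ("aug", 11)]

merged = union_all([q1, q2, q3])
print(len(merged))  # 5
```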

Used in stage(s)

Integration
