This page shows the name of the construct, its description
Attribute | Value | ||||
---|---|---|---|---|---|
Name | Merge Columns | ||||
Description | None | ||||
Function | None | ||||
Aim | Allow the data engineer (user) to combine string-valued columns into a new column using a glue character | ||||
Context | This operation is used when two attributes (or columns) in the original/unrefined dataset need to be merged into a single attribute (e.g. parts of a name) to aid the analysis for which the dataset serves as input. | ||||
Rationale | A merged column may be needed to fulfil analysis requirements or to ensure accurate results from the analysis for which the dataset is being prepared. | ||||
Mechanism | Create a new column whose values combine two existing columns, with a glue string between them, using enactors E12 and E15. | ||||
Formalism | µ((a1,...,an), i, j, glue) = α(R, x), φ(R, indexOf(x), ai ⊕ glue ⊕ aj) = {(a1,...,an, ai ⊕ glue ⊕ aj) | (a1,...,an) ∈ R}; where R is a relation, i and j are the indices of the columns to be merged, glue is a character used to connect the values in the two columns, x is the name of the column to be created, and x ⊕ y concatenates x and y. (Raman, V and Hellerstein, J 2001) | ||||
Relational Algebra (RA) | Similar to RA operation Attribute Extension (ε)/Generalized Projection | ||||
Type | Composite | ||||
Class | Unary | ||||
Transformation_category | 1:1 | ||||
Inputs | | ||||
Outputs | | ||||
Used in stage(s) | Structuring1 |
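
The Mechanism and Formalism rows can be illustrated with a minimal Python sketch. It assumes a relation is represented as a list of tuples and that column indices are 0-based (the formalism writes a1,...,an); the function name `merge_columns` and the sample data are hypothetical and are not taken from enactors E12 or E15.

```python
from typing import List, Tuple


def merge_columns(relation: List[Tuple], i: int, j: int, glue: str) -> List[Tuple]:
    """Append a_i + glue + a_j as a new last column of every tuple.

    Assumption: columns i and j hold strings, as stated in the Aim row.
    Indices are 0-based here, unlike the 1-based a1..an of the formalism.
    """
    merged = []
    for row in relation:
        # Keep all original attributes and add the concatenated value at the end.
        merged.append(tuple(row) + (row[i] + glue + row[j],))
    return merged


# Hypothetical sample relation: (first name, last name, birth year).
people = [("Ada", "Lovelace", 1815), ("Alan", "Turing", 1912)]
result = merge_columns(people, 0, 1, " ")
print(result)
```

Each input tuple yields exactly one output tuple, which matches the 1:1 transformation category listed above. The original columns are kept, as in the set-builder form of the formalism.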