This page shows the name of the construct, its description, and its other attributes.
| Attribute | Value | ||||
|---|---|---|---|---|---|
| Name | Merge Columns | ||||
| Description | None | ||||
| Function | None | ||||
| Aim | Allow the data engineer (user) to combine columns of string attributes into a new column using a glue character. | ||||
| Context | This operation is used when two attributes (or columns) present in the original/unrefined dataset need to be merged into a single attribute (e.g. parts of a name) to aid the analysis for which the dataset is to serve as input. | ||||
| Rationale | A merged version of the columns is provided to fulfil analysis requirements or to ensure accurate results from the analysis the dataset is being prepared for. | ||||
| Mechanism | Create a new column with the values from two existing columns and a string as a glue between them, using enactors E12 and E15. | ||||
| Formalism | µ((a1,...,an),i,j,glue) = α(R, x), φ(R, indexOf(x), ai ⊕ glue ⊕ aj) = {(a1,...,an, ai ⊕ glue ⊕ aj) | (a1,...,an) ∈ R}; where R is a relation, i and j are the indices of the columns to be merged, glue is the character used to connect the values in the two columns, x is the name of the column to be created, and x ⊕ y denotes the concatenation of x and y. (Raman, V. and Hellerstein, J., 2001) | ||||
| Relational Algebra (RA) | Similar to RA operation Attribute Extension (ε)/Generalized Projection | ||||
| Type | Composite | ||||
| Class | Unary | ||||
| Transformation_category | 1:1 | ||||
| Inputs | | ||||
| Outputs | | ||||
| Used in stage(s) | Structuring1 |
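
The mechanism and formalism above can be illustrated with a minimal Python sketch. It is not the construct's actual implementation: the function name `merge_columns`, its parameters, and the sample data are hypothetical, the relation R is assumed to be a list of string tuples with a separate header, and indices are assumed to be 0-based. The enactors E12 and E15 are not modelled here.

```python
from typing import List, Sequence, Tuple

Row = Tuple[str, ...]


def merge_columns(
    header: Sequence[str],
    rows: Sequence[Row],
    i: int,
    j: int,
    glue: str,
    new_name: str,
) -> Tuple[Tuple[str, ...], List[Row]]:
    """Append a column whose value is row[i] + glue + row[j] (hypothetical helper).

    Mirrors the formalism: every tuple (a1,...,an) in R becomes
    (a1,...,an, ai + glue + aj). The input rows are left unchanged.
    """
    # Add the new column name x to the schema.
    new_header = tuple(header) + (new_name,)
    # Extend each tuple with the glued value; one output row per input row (1:1).
    new_rows = [tuple(row) + (row[i] + glue + row[j],) for row in rows]
    return new_header, new_rows


# Illustrative data only: merging the parts of a name.
header = ("first", "last", "city")
rows = [
    ("Ada", "Lovelace", "London"),
    ("Alan", "Turing", "Wilmslow"),
]

merged_header, merged_rows = merge_columns(header, rows, 0, 1, " ", "full_name")
print(merged_header)
for row in merged_rows:
    print(row)
```

The sketch keeps the two source columns in place and only appends the merged one, which matches the set expression in the formalism; dropping the source columns afterwards would be a separate step.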