Description | Splits a string column into an array of substrings based on a delimiter. | Concatenates multiple arrays or strings into a single array or string. | Zips multiple arrays element-wise into a single array of structs. | Flattens an array into multiple rows, with one row per element in the array. |
Input Type | String | Arrays/Strings | Arrays | Array |
Output Type | Array of Strings | Array or String | Array of Structs | Multiple Rows (with original columns) |
Key Use Cases | Splitting strings based on delimiters (e.g., splitting comma-separated values). | Merging multiple arrays into one, or multiple strings into one. | Aligning data from multiple arrays element-wise, treating each set of elements as a row (struct). | Flattening arrays for row-by-row processing (e.g., after zipping or concatenating arrays). |
Example | split(col("string_col"), ",") → ["a", "b", "c"] | concat(col("array1"), col("array2")) → ["a", "b", "x", "y"] | arrays_zip(col("array1"), col("array2")) → [{'a', 1}, {'b', 2}] | explode(col("array_col")) → Converts an array into separate rows. |
Handling Different Lengths | Not applicable. | Length does not matter: the arrays are appended end to end, so inputs of different lengths are combined as-is. | If input arrays have different lengths, the shorter ones are padded with null. | Not applicable. Converts each element into a separate row, regardless of length. |
Handling null values | Returns null when the input string is null. Consecutive delimiters produce empty strings in the result. | Keeps null elements that are inside the input arrays, but returns null if any input array (or string) is itself null. | Inserts null values into the struct where input arrays have null for a corresponding index. | Creates a row with a null value for each null element. A null or empty array produces no rows (explode_outer keeps such rows). |
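
The length and null rules in the table can be modeled in plain Python. The sketch below is not Spark code: the `model_*` helpers are hypothetical names introduced here for illustration, they work on ordinary Python lists and dicts, and `model_split` treats the delimiter as a literal string, whereas Spark's `split` takes a regular-expression pattern. It assumes the semantics described in the table rows above.

```python
from itertools import zip_longest


def model_split(s, delimiter):
    """Model of split: a null input gives null; empty fields are kept."""
    if s is None:
        return None
    return s.split(delimiter)


def model_concat(*arrays):
    """Model of concat on arrays: any null input makes the result null;
    null elements inside the arrays are kept."""
    if any(a is None for a in arrays):
        return None
    out = []
    for a in arrays:
        out.extend(a)
    return out


def model_arrays_zip(*arrays):
    """Model of arrays_zip: shorter arrays are padded with null (None).
    Each struct is represented here as a tuple."""
    if any(a is None for a in arrays):
        return None
    return [tuple(row) for row in zip_longest(*arrays, fillvalue=None)]


def model_explode(rows, array_key):
    """Model of explode: one output row per element, other columns kept.
    Rows whose array is null or empty produce no output rows."""
    out = []
    for row in rows:
        arr = row[array_key]
        if not arr:  # None or empty list
            continue
        for element in arr:
            new_row = {k: v for k, v in row.items() if k != array_key}
            new_row["col"] = element  # "col" mirrors explode's default output name
            out.append(new_row)
    return out


if __name__ == "__main__":
    print(model_split("a,,c", ","))
    print(model_concat(["a", None], ["x"]))
    print(model_arrays_zip(["a", "b", "c"], [1, 2]))
    print(model_explode([{"id": 1, "arr": ["a", None]}, {"id": 2, "arr": []}], "arr"))
```

Running the module prints one result per function, which makes it easy to compare the padding done by the zip model against the plain appending done by the concat model.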