withColumnRenamed()
withColumnRenamed() renames an existing column and returns a new DataFrame; the original DataFrame is not modified.
DataFrame.withColumnRenamed(existingName, newName)
existingName
: The current name of the column you want to rename.
newName
: The new name for the column.
drop()
drop() removes one or more columns from a DataFrame and returns a new DataFrame; the original DataFrame is not modified.
Syntax
DataFrame.drop(*cols)
*cols
: One or more column names (as strings) that you want to drop from the DataFrame. Pass them as individual arguments; to drop the names held in a list, unpack the list with * (for example, df.drop(*columns_to_drop)).
# Drop single column 'age'
df_dropped = df.drop("age")
# Drop multiple columns 'id' and 'age'
df_dropped_multiple = df.drop("id", "age")
# Dropping Columns Using a List of Column Names
# Define columns to drop
columns_to_drop = ["id", "age"]
# Drop columns using list
df_dropped_list = df.drop(*columns_to_drop)
df_dropped_list.show()
show()
By default, show() prints the first 20 rows and truncates string values longer than 20 characters.
df.show(n=10, truncate=True, vertical=True)