pyspark.sql.functions.make_valid_utf8
- pyspark.sql.functions.make_valid_utf8(str)
Returns a new string in which all invalid UTF-8 byte sequences, if any, are replaced by the Unicode replacement character (U+FFFD).
New in version 4.0.0.
- Parameters
- str
Column or column name
A column of strings, each representing a UTF-8 byte sequence.
- Returns
Column
the valid UTF-8 version of the given input string.
Examples
>>> import pyspark.sql.functions as sf
>>> spark.range(1).select(sf.make_valid_utf8(sf.lit("SparkSQL"))).show()
+-------------------------+
|make_valid_utf8(SparkSQL)|
+-------------------------+
|                 SparkSQL|
+-------------------------+
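The example above passes an already valid string, so the output is unchanged. To illustrate the replacement behaviour itself, here is a minimal pure-Python sketch that runs without Spark. It is an analogy only: the helper name `make_valid_utf8_py` is hypothetical, and it relies on Python's built-in `errors="replace"` decoding rather than Spark's implementation, so the number of replacement characters emitted for longer malformed sequences may not match Spark in every case.

```python
# Pure-Python analogy (not Spark code): decode bytes as UTF-8 and
# substitute U+FFFD for any byte sequence that cannot be decoded.
REPLACEMENT = "\ufffd"  # Unicode replacement character


def make_valid_utf8_py(raw: bytes) -> str:
    """Hypothetical helper that mirrors the documented behaviour."""
    return raw.decode("utf-8", errors="replace")


# A valid input comes back unchanged.
valid = make_valid_utf8_py("SparkSQL".encode("utf-8"))

# The byte 0xFF never occurs in well-formed UTF-8, so it is replaced.
repaired = make_valid_utf8_py(b"Spark\xffSQL")

print(valid)
print(repaired == "Spark" + REPLACEMENT + "SQL")
```

In Spark, the same idea would apply to a string column whose underlying bytes are not well-formed UTF-8, for example data obtained from a binary source.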