pyspark.sql.tvf.TableValuedFunction.variant_explode_outer#

TableValuedFunction.variant_explode_outer(input)[source]#

Separates a variant object/array into multiple rows containing its fields/elements.

Its result schema is struct<pos int, key string, value variant>. pos is the position of the field/element in its parent object/array, and value is the field/element value. key is the field name when exploding a variant object, or is NULL when exploding a variant array. Unlike variant_explode, if the given variant is not a variant array/object, including SQL NULL, variant null, and any other variant values, then NULL is produced.

New in version 4.0.0.

Parameters
inputColumn

input column of values to explode.

Returns
DataFrame

Examples

>>> import pyspark.sql.functions as sf
>>> spark.tvf.variant_explode_outer(sf.parse_json(sf.lit('["hello", "world"]'))).show()
+---+----+-------+
|pos| key|  value|
+---+----+-------+
|  0|NULL|"hello"|
|  1|NULL|"world"|
+---+----+-------+
>>> spark.tvf.variant_explode_outer(sf.parse_json(sf.lit('[]'))).show()
+----+----+-----+
| pos| key|value|
+----+----+-----+
|NULL|NULL| NULL|
+----+----+-----+
>>> spark.tvf.variant_explode_outer(sf.parse_json(sf.lit('{"a": true, "b": 3.14}'))).show()
+---+---+-----+
|pos|key|value|
+---+---+-----+
|  0|  a| true|
|  1|  b| 3.14|
+---+---+-----+
>>> spark.tvf.variant_explode_outer(sf.parse_json(sf.lit('{}'))).show()
+----+----+-----+
| pos| key|value|
+----+----+-----+
|NULL|NULL| NULL|
+----+----+-----+