Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

Is there a functional difference between the following two expressions? The result looks the same to me but curious if there's an unknown unknown. What does the $ symbol indicate/how is it read?

df1.orderBy($"reasonCode".asc).show(10, false)
    
df1.orderBy(asc("reasonCode")).show(10, false)

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
3.9k views
Welcome To Ask or Share your Answers For Others

1 Answer

Those two statements are equivalent and will lead to the identical result.

The $ notation is special for Scala Spark and is referring to an implicit StringToColumn method which interprets the subsequent string "reasonCode" as a Column

implicit class StringToColumn(val sc: StringContext) {
  def $(args: Any*): ColumnName = {
    new ColumnName(sc.s(args: _*))
  }
}

In Scala Spark you have many ways to select a column. I have written down a full list of syntax varieties in another answer on select specific columns from spark dataframe.

Using different notations do not have any impact on the performance as they all get translated to the same set of RDD instructions through Spark's Catalyst optimizer.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
...