-
Notifications
You must be signed in to change notification settings - Fork 28k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-43815][SQL] Wrap NPE with AnalysisException in CSV, XML, and JSON options #46626
base: master
Are you sure you want to change the base?
Conversation
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/csv/CSVOptions.scala
Outdated
Show resolved
Hide resolved
70235b9
to
1920fa8
Compare
@@ -107,7 +108,13 @@ class JSONOptions( | |||
val writeNullIfWithDefaultValue = SQLConf.get.jsonWriteNullIfWithDefaultValue | |||
|
|||
// A language tag in IETF BCP 47 format | |||
val locale: Locale = parameters.get(LOCALE).map(Locale.forLanguageTag).getOrElse(Locale.US) | |||
val locale: Locale = parameters.get(LOCALE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's reduce the duplicated code in CSV/JSON/XML options
val locale: Locale = parameters.get(LOCALE).map(Locale.forLanguageTag).getOrElse(Locale.US) | ||
val locale: Locale = parameters.get(LOCALE) | ||
.map { | ||
case null => |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern is that, in PySpark we ignore when it's None
vs here we throw an exception. I really think we should fix the whole thing with defining the behaviour instead of fixing one case alone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Pyspark also seems to skip None
check with option()
or options()
in python.
Would it make sense to just add null checks in scala and option()
/options()
in pyspark?
However, I'm not sure if any options accept null
or None
as a legitimate value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that's the key. This is actually a bigger problem than this specific case. I think we should define the behaviour, document somewhere, and apply them for all...
What changes were proposed in this pull request?
When user sets
locale
to benull
, a NPE is raised. Instead, replace the NPE with a more understandable user facing error message.Why are the changes needed?
Improves usability of csv options.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Added tests pass.
Was this patch authored or co-authored using generative AI tooling?
No