Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-43815][SQL] Wrap NPE with AnalysisException in CSV, XML, and JSON options #46626

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

michaelzhan-db
Copy link
Contributor

What changes were proposed in this pull request?

When user sets locale to be null, a NPE is raised. Instead, replace the NPE with a more understandable user facing error message.

Why are the changes needed?

Improves usability of csv options.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added tests pass.

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label May 16, 2024
@HyukjinKwon HyukjinKwon changed the title [SPARK-43815] Wrap NPE with AnalysisException in csv options [SPARK-43815][SQL] Wrap NPE with AnalysisException in csv options May 17, 2024
@HyukjinKwon HyukjinKwon changed the title [SPARK-43815][SQL] Wrap NPE with AnalysisException in csv options [SPARK-43815][SQL] Wrap NPE with AnalysisException in CSV options May 17, 2024
@michaelzhan-db michaelzhan-db changed the title [SPARK-43815][SQL] Wrap NPE with AnalysisException in CSV options [SPARK-43815][SQL] Wrap NPE with AnalysisException in CSV, XML, and JSON options May 17, 2024
@@ -107,7 +108,13 @@ class JSONOptions(
val writeNullIfWithDefaultValue = SQLConf.get.jsonWriteNullIfWithDefaultValue

// A language tag in IETF BCP 47 format
val locale: Locale = parameters.get(LOCALE).map(Locale.forLanguageTag).getOrElse(Locale.US)
val locale: Locale = parameters.get(LOCALE)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's reduce the duplicated code in CSV/JSON/XML options

val locale: Locale = parameters.get(LOCALE).map(Locale.forLanguageTag).getOrElse(Locale.US)
val locale: Locale = parameters.get(LOCALE)
.map {
case null =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern is that, in PySpark we ignore when it's None vs here we throw an exception. I really think we should fix the whole thing with defining the behaviour instead of fixing one case alone.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Pyspark also seems to skip None check with option() or options() in python.

Would it make sense to just add null checks in scala and option()/options() in pyspark?
However, I'm not sure if any options accept null or None as a legitimate value.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's the key. This is actually a bigger problem than this specific case. I think we should define the behaviour, document somewhere, and apply them for all...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants