| Q | A
| ---------------- | -----
| Bug report? | no
| Feature request? | yes
| BC Break report? | no
| RFC? | yes/no
I've come across a problem regarding the handling of null/nil values. This is related to XML de-/serialization functionality.
With the current implementation, an XML node is generally considered to be `null` when it is not present or when the `xsi:nil="true"` attribute is present.
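To make that rule concrete, here is a minimal sketch of such a check using SimpleXML. It is not the serializer's actual code; the helper name `isNullNode()` and the sample document are made up for illustration.

```php
<?php
// XML Schema instance namespace, conventionally bound to the "xsi" prefix.
const XSI_NAMESPACE = 'http://www.w3.org/2001/XMLSchema-instance';

/**
 * Hypothetical helper mirroring the rule described above:
 * a node is null when it is absent or carries xsi:nil="true".
 */
function isNullNode(?SimpleXMLElement $node): bool
{
    if ($node === null) {
        return true; // element is not present at all
    }

    // Only attributes in the xsi namespace are inspected here.
    $xsiAttributes = $node->attributes(XSI_NAMESPACE);

    return isset($xsiAttributes['nil']) && (string) $xsiAttributes['nil'] === 'true';
}

// Made-up sample document, only for demonstrating the three cases.
$entry = simplexml_load_string(
    '<entry xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
    . '<date>2018-01-01</date><time xsi:nil="true"/><note/></entry>'
);

$explicitNil = isNullNode($entry->time);                                     // xsi:nil="true"
$emptyOnly   = isNullNode($entry->note);                                     // empty, no xsi:nil
$absent      = isNullNode(isset($entry->missing) ? $entry->missing : null);  // not present

var_dump($explicitNil, $emptyOnly, $absent);
```

Under this rule an empty element without the attribute (`<note/>` in the sample) is not null, which is the case this issue is about.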
Let me first describe the situation I encountered and the problem(s) it raises, and then some possible solutions.
Steps required to reproduce the problem
Take an xml document similar to the following one:
<?xml version="1.0" encoding="utf-8"?>
<!-- empty element, no xsi:nil="true" attribute -->
I'll omit the serializer configuration and PHP objects, as this is only an example. The important thing is that both the `date` and `time` properties of a calendar entry are to be deserialized as `DateTime`. The date/time formats are pretty obvious here (`H:i:s` for the time). The `time` property is nullable, whereas the `date` property is not.
The result and what it means
Before investigating, I would have expected this XML to produce a valid object structure where the time of the last calendar entry is simply not set. Instead, this document (or anything like it) produces an exception along the lines of `Invalid datetime "", expected the format "H:i:s", [...]`.
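The underlying reason can be reproduced with plain PHP, independent of the serializer: an empty element yields an empty string, and an empty string cannot be parsed with the `H:i:s` format. The snippet below is only a sketch of that step. How the serializer's date handler turns the failure into the exception above is an assumption on my part.

```php
<?php
// An empty element such as <time/> yields the empty string as its text value.
$emptyValue = '';

// createFromFormat() returns false when the input does not match the format.
$parsedEmpty = DateTime::createFromFormat('H:i:s', $emptyValue);

// A well-formed value parses as expected.
$parsedValid = DateTime::createFromFormat('H:i:s', '13:37:00');

var_dump($parsedEmpty);
var_dump($parsedValid instanceof DateTime);
```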
I found out that the element was missing the `xsi:nil="true"` attribute needed for it to be considered null. Again, it's generally assumed that a node is only null when it is not present or when said attribute is set. That's because XML has pretty much no real `null` as we have it in PHP.
Well, fine, some would say, just add that attribute. That would be an easy solution, indeed. Sadly, I have no direct control over the XML structure and its serialization, i.e. I can't set the attribute. OK, fine, I'm not picky, so a custom handler will do! And indeed it would (and eventually did) do the job as needed.
But this got me thinking: why would the serializer look for the `xsi:nil` attribute when there is no `xsi` prefix bound to any namespace? An XML document is not namespace-well-formed if `xsi:*` is used without the prefix being declared. This is verifiable via the w3schools XML validator using the following XML:
<?xml version="1.0" encoding="utf-8"?>
Namespace prefix xsi for nil on time is not defined
Again, this is hypothetical.
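The same check can be done locally with PHP's DOM extension instead of an online validator. This is a sketch under the assumption that libxml reports the undeclared prefix through its error list; the sample document is made up and only mirrors the situation described above.

```php
<?php
// Collect libxml errors in memory instead of emitting PHP warnings.
libxml_use_internal_errors(true);
libxml_clear_errors();

// Hypothetical sample: xsi:nil is used, but no xmlns:xsi declaration exists.
$xml = '<?xml version="1.0" encoding="utf-8"?><entry><time xsi:nil="true"/></entry>';

$document = new DOMDocument();
$document->loadXML($xml);

// Extract the human-readable messages from the collected errors.
$messages = array_map(
    static function (LibXMLError $error): string {
        return trim($error->message);
    },
    libxml_get_errors()
);

print_r($messages);
```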
It's obviously required to declare that namespace prefix somewhere in the document. But I would argue that having no namespaces defined at all is not an uncommon use case. So the (kind of) requirement to add that namespace and use the attribute, just so an empty node is considered null, is rather unintuitive. At the very least, the `xsi:nil` check could be skipped if there is no `xsi` prefix anyway.
What's your opinion on this?
I've seen the way it is done with the serialization. There, the namespace is only added if any null-element has been visited.
I could think of something similar for the null handling: decide, depending on the presence of the namespace prefix in the document, whether `xsi:nil` is checked or some other logic should be used. There are several ways to allow the developer to influence the behavior.
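A rough sketch of that idea, again with SimpleXML. The function names and the fallback rule ("no child elements and no text means null") are my own assumptions for illustration, not an existing serializer API and not a finished design.

```php
<?php
const XSI_NS = 'http://www.w3.org/2001/XMLSchema-instance';

/** Hypothetical: does the document declare the xsi namespace anywhere? */
function documentDeclaresXsi(SimpleXMLElement $root): bool
{
    // getDocNamespaces(true) also collects declarations on child elements.
    return in_array(XSI_NS, $root->getDocNamespaces(true), true);
}

/** Hypothetical null check that switches strategy based on the document. */
function isNullElement(SimpleXMLElement $node, bool $xsiDeclared): bool
{
    if ($xsiDeclared) {
        // Spec-conforming path: only xsi:nil="true" marks a null element.
        $xsi = $node->attributes(XSI_NS);

        return isset($xsi['nil']) && (string) $xsi['nil'] === 'true';
    }

    // Assumed fallback: without xsi, an element with no children and no text is null.
    return $node->count() === 0 && trim((string) $node) === '';
}

// Made-up documents: one without any namespace, one declaring xsi.
$plain   = simplexml_load_string('<entry><date>2018-01-01</date><time/></entry>');
$withXsi = simplexml_load_string(
    '<entry xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"><time/></entry>'
);

$plainTimeIsNull = isNullElement($plain->time, documentDeclaresXsi($plain));
$plainDateIsNull = isNullElement($plain->date, documentDeclaresXsi($plain));
$xsiTimeIsNull   = isNullElement($withXsi->time, documentDeclaresXsi($withXsi));

var_dump($plainTimeIsNull, $plainDateIsNull, $xsiTimeIsNull);
```

Documents that declare `xsi` keep the current behavior, so only namespace-free documents would see a change. Whether an empty element that still carries attributes should count as null is one of the details that would need discussion.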
To be completely clear: I'm not proposing to change the behavior and replace it with something that would violate the spec. I'm simply saying there are use cases where the current handling can cause issues (as it did for me), and in my opinion such use cases could benefit from this functionality.
This is especially the case for any of the datetime types, as seen in the example above. An exception will certainly get thrown on empty elements of that type, that's in the nature of the
Certainly, the XML owner could fix their serializer or whatever they use to produce the documents, but in a real-world scenario I find this unlikely, especially when such a change could have other implications, for example on other interfacing software that is already accustomed to the shortcomings of such a system.
What would be affected
I've gone through these references prior to deciding to open a new issue here.
I'm willing to open a PR for this if the feature is considered beneficial, although some implementation details would have to be discussed first.