Previously there was no guarantee that the tests that check for out of memory
handling behavior are actually correct - e.g. that they correctly simulate out
of memory conditions.
Now every simulated out of memory condition has to be "guarded" using
CHECK_ALLOC_FAIL. It makes sure that every piece of code that is supposed to
cause out-of-memory does so, and that no other code runs out of memory
unnoticed.
When parsing XPath variables, we need to perform a heap allocation; if it
fails, an xpath_exception instead of bad_alloc used to be thrown.
Now we throw the exception of a correct type so that xpath_exception means
'parsing error'.
This is mostly done using regex replaces of original Quickbook markup, plus a
bit of manual fixup for multiple references to the single point from different
lines that AsciiDoc does not seem to handle.
Previously we omitted extra whitespace for single PCDATA/CDATA children, but in
mixed content there was extra indentation before/after text nodes.
One of the problems with that is that the text that you saved is not exactly
the same as the parsing result using default flags (parse_trim_pcdata helps).
Another problem is that parse-format cycles do not have a fixed point for mixed
content - the result expands indefinitely. Some XML libraries, like Python
minidom, have the same issue, but this is definitely a problem.
Pretty-printing mixed content is hard. It seems that the only other sensible
choice is to switch mixed content nodes to raw formatting. In a way the code in
this change is a weaker version of that - it removes indentation around text
nodes but still keeps it around element siblings/children.
Thus we can switch to mixed-raw formatting at some point later, which will be
a superset of the current behavior.
To do this we have to either switch at the first text node (.NET XmlDocument
does that), or scan the children of each element for a possible text node and
switch before we output the first child.
The former behavior seems non-intuitive (and a bit broken); unfortunately, the
latter behavior can cost up to 20% of the output time for trees *without* mixed
content.
Fixes#13.
data/truncation.xml was corrupted at some point and was not actually valid.
Fix the file and make the test fail if we can't parse truncation.xml at all.