The early return in case of zero open internal entities and matching
end/nextPtr pointers cause the parser to miss XML_ERROR_NO_ELEMENTS
error.
The reason is that the internalEntityProcessor does not set the
m_reenter flag in such a case, which results in skipping the
prologProcessor or contentProcessor depending on wheter is_param is set
or not. However, this last skipped call to mentioned processors can
detect the non-existence of elements when some are expected.
callStoreEntityValue and storeAttributeValue call triggerReenter just
before continuing with their main loop. This call does not have any
use for the these functions as the continuity of their loop is already
achieved by the continue key word.
Only side effect these triggerReenter calls bring is that they cause a
return to the the callProcessor, only to reenter to the same point again,
wasting some time.
This commit removes these unnecessary calls.
At some points we check m_reenter flag for return. However this flag can
never be true at these points. Therefore body of the check is never
executed. This commit excludes the body from test coverage, removes the
nextPtr update (since we faced an error, no need to update it) and
lastly makes them return XML_ERROR_UNEXPECTED_STATE as a safety net.
After the commit "Fix entity debug order", the interaction with the open
entity lists has been changed.
Before the commit, during processing of an entity, if an inner entity was
found, it was pushed to the head of the list. This made the entity we were
processing the second in the list. So, in order the remove this entity,
we either remove the head or the second element, depending on if an inner
entity is found during processing.
After the commit, since we delay the removal of entities until their
inner entity references are resolved, the entity we want to remove will
always be on the head of the list. Thus the removal of the check.
Detection of recursive entity references are currently failing because
we process and close entities before their inner references are
processed. Since the detection works by checking wheter the referenced
entity is already open, this early close leads to wrong results. This
commit delays closing entities until their inner entities are processed
and closed. This is achieved by postponing the unsetting of the open
flag and using a new hasMore flag to check if the entity has more
elements to process.
The fix for entity processing changes the order of opening and closing
of entities. The reason is the new iterative approach that closes
and removes entities as soon as they are fully processed, unlike
the recursive approach that, as a side effect of the recursion, waits
the inner entities before closing and removal.
This commit delays the removal and the call to entityTrackingOnClose
until the current entities inner entities are processed, which in turn
allows us to have the correct debug order again.
This commit introduces a new nextPtr parameter to storeEntityValue.
After finishing its execution, storeEntityValue function sets this
parameter in way that it points to the next token to process.
This is useful when we want to leave and reenter storeEntityValue during
its execution since nextPtr will point where we left.
This commit is base to the following commit.
During processing attributes with entity references,
appendAttributeValue can reach high recursion depths that can lead to
a crash.
This commit switches the processing to an iterative approach similar to
the fix for internal entity processing. A new m_openAttributeEntities
list is introduced to keep track of entity references that need
processing. When a new entity reference is detected, instead of calling
appendAttributeValue recursively, the entity will be added to open
entities list and the execution will return to storeAttributeValue,
where newly added entity will be handled. After the entity processing
is done, appendAttributeValue will be called by using the next token.
This commits extends appendAttributeValue by introducing a new parameter
that will be set to the next token to process.
Having such a parameter allows us to reenter the function after an exit
and continue from the last token pointed by the pointer.
The new test asserts that internalEntityProcessor does not loop forever while
processing entities where external entity content references back to internal
entities from the parent document (see &e3; and &e4; below).
We ensure that progress is made after moving the parser from recursive
invocation to a state based processing within function callProcessor.
A version of this test case (originally external-to-Expat, "make run-xmltest")
failed earlier, so we wanted to have a variant of this test (that proved
itself relevant) included within the core test suite ("make check").
Co-authored-by: Jann Horn <jannh@google.com>
Add a new reenter flag. This flag acts like XML_SUSPENDED,
except that instead of returning out of the library, we
only return back to the main parse function, then re-enter
the processor function.
.. to give OSS-Fuzz a chance at a successful build while their
images are based on Ubuntu 20.04 with too-old Protobuf
PS: Display this commit with "-w" to see it best.