POSIX strndup() does not read memory beyond NUL byte of the source
string. Preserve this behavior in libexpat implementation to prevent
access violations and keep portability.
m_eventPtr is a key provider to these functions:
- XML_GetCurrentByteCount
- XML_GetCurrentByteIndex
- XML_GetCurrentColumnNumber
- XML_GetCurrentLineNumber
- XML_GetInputContext
The fix for recursive entity processing introduced a reenter flag that
returns the execution from the current function and switches to entity
processing.
The same fix also updates the m_eventPtr during this switch. However
this update changes the behaviour in certain cases as the older version
does not update the m_eventPtr while recursing into entity processing.
This commit removes the pointer update and restores the old behaviour.
The symptom was:
> [variadic:typing] lib/xmlparse.c:8242: Warning:
> Incorrect type for argument 7. The argument will be cast from unsigned int to int.
When compiling with Emscripten 3.1.6, the symptom was:
> [..]
> /usr/bin/emcc @CMakeFiles/expat.dir/includes_C.rsp -fno-strict-aliasing -fvisibility=hidden -std=c99 -MD -MT CMakeFiles/expat.dir/lib/xmlparse.c.o -MF CMakeFiles/expat.dir/lib/xmlparse.c.o.d -o CMakeFiles/expat.dir/lib/xmlparse.c.o -c /libexpat/expat/lib/xmlparse.c
> /libexpat/expat/lib/xmlparse.c:8132:11: warning: format specifies type 'int' but the argument has type 'ptrdiff_t' (aka 'long') [-Wformat]
> bytesMore, (account == XML_ACCOUNT_DIRECT) ? "DIR" : "EXP",
> ^~~~~~~~~
> 1 warning generated.
> [..]
> /usr/bin/emcc -DXML_TESTING @CMakeFiles/runtests.dir/includes_C.rsp -fno-strict-aliasing -fvisibility=hidden -std=c99 -MD -MT CMakeFiles/runtests.dir/tests/acc_tests.c.o -MF CMakeFiles/runtests.dir/tests/acc_tests.c.o.d -o CMakeFiles/runtests.dir/tests/acc_tests.c.o -c /libexpat/expat/tests/acc_tests.c
> /libexpat/expat/tests/acc_tests.c:279:11: warning: format specifies type 'unsigned int' but the argument has type 'unsigned long' [-Wformat]
> u + 1, countCases, expectedCountBytesDirect, actualCountBytesDirect);
> ^~~~~
> /libexpat/expat/tests/acc_tests.c:279:18: warning: format specifies type 'unsigned int' but the argument has type 'size_t' (aka 'unsigned long') [-Wformat]
> u + 1, countCases, expectedCountBytesDirect, actualCountBytesDirect);
> ^~~~~~~~~~
> /libexpat/expat/tests/acc_tests.c:288:11: warning: format specifies type 'unsigned int' but the argument has type 'unsigned long' [-Wformat]
> u + 1, countCases, expectedCountBytesIndirect,
> ^~~~~
> /libexpat/expat/tests/acc_tests.c:288:18: warning: format specifies type 'unsigned int' but the argument has type 'size_t' (aka 'unsigned long') [-Wformat]
> u + 1, countCases, expectedCountBytesIndirect,
> ^~~~~~~~~~
> 4 warnings generated.
> [..]
> /usr/bin/emcc -DXML_TESTING @CMakeFiles/runtests.dir/includes_C.rsp -fno-strict-aliasing -fvisibility=hidden -std=c99 -MD -MT CMakeFiles/runtests.dir/lib/xmlparse.c.o -MF CMakeFiles/runtests.dir/lib/xmlparse.c.o.d -o CMakeFiles/runtests.dir/lib/xmlparse.c.o -c /libexpat/expat/lib/xmlparse.c
> /libexpat/expat/lib/xmlparse.c:8132:11: warning: format specifies type 'int' but the argument has type 'ptrdiff_t' (aka 'long') [-Wformat]
> bytesMore, (account == XML_ACCOUNT_DIRECT) ? "DIR" : "EXP",
> ^~~~~~~~~
> 1 warning generated.
> [..]
> /usr/bin/em++ -DXML_TESTING @CMakeFiles/runtests_cxx.dir/includes_CXX.rsp -fno-strict-aliasing -fvisibility=hidden -std=c++11 -MD -MT CMakeFiles/runtests_cxx.dir/tests/acc_tests_cxx.cpp.o -MF CMakeFiles/runtests_cxx.dir/tests/acc_tests_cxx.cpp.o.d -o CMakeFiles/runtests_cxx.dir/tests/acc_tests_cxx.cpp.o -c /libexpat/expat/tests/acc_tests_cxx.cpp
> In file included from /libexpat/expat/tests/acc_tests_cxx.cpp:32:
> /libexpat/expat/tests/acc_tests.c:279:11: warning: format specifies type 'unsigned int' but the argument has type 'unsigned long' [-Wformat]
> u + 1, countCases, expectedCountBytesDirect, actualCountBytesDirect);
> ^~~~~
> /libexpat/expat/tests/acc_tests.c:279:18: warning: format specifies type 'unsigned int' but the argument has type 'size_t' (aka 'unsigned long') [-Wformat]
> u + 1, countCases, expectedCountBytesDirect, actualCountBytesDirect);
> ^~~~~~~~~~
> /libexpat/expat/tests/acc_tests.c:288:11: warning: format specifies type 'unsigned int' but the argument has type 'unsigned long' [-Wformat]
> u + 1, countCases, expectedCountBytesIndirect,
> ^~~~~
> /libexpat/expat/tests/acc_tests.c:288:18: warning: format specifies type 'unsigned int' but the argument has type 'size_t' (aka 'unsigned long') [-Wformat]
> u + 1, countCases, expectedCountBytesIndirect,
> ^~~~~~~~~~
> 4 warnings generated.
> [..]
> /usr/bin/emcc -DXML_TESTING @CMakeFiles/runtests_cxx.dir/includes_C.rsp -fno-strict-aliasing -fvisibility=hidden -std=c99 -MD -MT CMakeFiles/runtests_cxx.dir/lib/xmlparse.c.o -MF CMakeFiles/runtests_cxx.dir/lib/xmlparse.c.o.d -o CMakeFiles/runtests_cxx.dir/lib/xmlparse.c.o -c /libexpat/expat/lib/xmlparse.c
> /libexpat/expat/lib/xmlparse.c:8132:11: warning: format specifies type 'int' but the argument has type 'ptrdiff_t' (aka 'long') [-Wformat]
> bytesMore, (account == XML_ACCOUNT_DIRECT) ? "DIR" : "EXP",
> ^~~~~~~~~
> 1 warning generated.
The early return in case of zero open internal entities and matching
end/nextPtr pointers cause the parser to miss XML_ERROR_NO_ELEMENTS
error.
The reason is that the internalEntityProcessor does not set the
m_reenter flag in such a case, which results in skipping the
prologProcessor or contentProcessor depending on wheter is_param is set
or not. However, this last skipped call to mentioned processors can
detect the non-existence of elements when some are expected.
callStoreEntityValue and storeAttributeValue call triggerReenter just
before continuing with their main loop. This call does not have any
use for the these functions as the continuity of their loop is already
achieved by the continue key word.
Only side effect these triggerReenter calls bring is that they cause a
return to the the callProcessor, only to reenter to the same point again,
wasting some time.
This commit removes these unnecessary calls.
At some points we check m_reenter flag for return. However this flag can
never be true at these points. Therefore body of the check is never
executed. This commit excludes the body from test coverage, removes the
nextPtr update (since we faced an error, no need to update it) and
lastly makes them return XML_ERROR_UNEXPECTED_STATE as a safety net.
After the commit "Fix entity debug order", the interaction with the open
entity lists has been changed.
Before the commit, during processing of an entity, if an inner entity was
found, it was pushed to the head of the list. This made the entity we were
processing the second in the list. So, in order the remove this entity,
we either remove the head or the second element, depending on if an inner
entity is found during processing.
After the commit, since we delay the removal of entities until their
inner entity references are resolved, the entity we want to remove will
always be on the head of the list. Thus the removal of the check.
Detection of recursive entity references are currently failing because
we process and close entities before their inner references are
processed. Since the detection works by checking wheter the referenced
entity is already open, this early close leads to wrong results. This
commit delays closing entities until their inner entities are processed
and closed. This is achieved by postponing the unsetting of the open
flag and using a new hasMore flag to check if the entity has more
elements to process.
The fix for entity processing changes the order of opening and closing
of entities. The reason is the new iterative approach that closes
and removes entities as soon as they are fully processed, unlike
the recursive approach that, as a side effect of the recursion, waits
the inner entities before closing and removal.
This commit delays the removal and the call to entityTrackingOnClose
until the current entities inner entities are processed, which in turn
allows us to have the correct debug order again.