This can happen when mtllib is absent, points to a nonexistent file, or
material names are incorrect. In all of these cases it helps the reader
differentiate between materials that are actually specified in the file,
even if their definition happens to match the default, from materials
that were never loaded in the first place.
Some .obj files have extra whitespace after usemtl and other statements;
this may interfere with parsing, for example by adding a space to the
material name, which can make it impossible to find in the .mtl file.
For consistency, we replace all uses of is_end_of_name with skip_name,
which handles this by leaving trailing whitespace alone. It will be
skipped in the parsing flow when we skip the newline, which is similar
to the behavior of trailing whitespace after numeric data (which
parse_float/int leave alone).
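A minimal sketch of the idea, not the actual skip_name: the helper name
skip_name_sketch and its signature are assumptions made for illustration.
It returns the end of the name and leaves any trailing spaces or tabs
for the caller's newline skipping.

```c
#include <stddef.h>

/* Hypothetical helper: returns a pointer just past the last
   non-whitespace character of the name starting at p. Trailing spaces
   and tabs before the line end are not counted as part of the name. */
static const char* skip_name_sketch(const char* p)
{
    const char* end = p;

    while (*p != '\0' && *p != '\n' && *p != '\r')
    {
        if (*p != ' ' && *p != '\t')
            end = p + 1; /* remember the last non-whitespace character */
        p++;
    }

    return end;
}
```

Inner spaces are kept, so a name like "my material" survives intact while
the whitespace after it is dropped.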
When an array reached a size that would require >4GB allocation, the
argument to realloc would overflow during multiplication, resulting in a
very small allocation and a subsequent out of bounds access.
This change fixes that and also adjusts capacity calculation so that it
doesn't overflow 32-bit range until it's basically impossible not to.
This is not perfect. In particular, on 32-bit systems a risk of size_t
overflow remains; however, because we grow in 1.5x increments, an
attempt to grow a 2GB allocation to the next increment would
realistically fail before that. We can also technically overflow
capacity even after the adjustment, but that requires 3+ billion
elements, which effectively means an .obj file on the scale of hundreds
of gigabytes, at which point a streaming parser would probably be more
practical.
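As an illustration of the overflow being guarded against, here is a
sketch of a checked 1.5x growth computation. The function name
grow_capacity_sketch and the choice to return 0 on overflow are
assumptions, not the code from this change.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: computes a new capacity of at least `needed`
   elements, growing by 1.5x. Returns 0 if either the capacity or the
   byte size (capacity * elem_size) cannot be represented in size_t.
   Without such a check, e.g. 0x40000000 elements of 4 bytes wrap to a
   0-byte request in 32-bit arithmetic. */
static size_t grow_capacity_sketch(size_t capacity, size_t needed, size_t elem_size)
{
    size_t grown;

    if (capacity > SIZE_MAX - capacity / 2)
        return 0; /* capacity + capacity / 2 would overflow */

    grown = capacity + capacity / 2;
    if (grown < needed)
        grown = needed;

    if (elem_size != 0 && grown > SIZE_MAX / elem_size)
        return 0; /* grown * elem_size would overflow */

    return grown;
}
```

A caller would treat a 0 result as an allocation failure instead of
passing a wrapped size to realloc.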
In some cases we need to know the length of `indices`.
Currently we need to calculate the sum of the `face_vertices`
elements to get this value.
It would be convenient if we could get the value directly.
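For reference, the summation that callers currently have to write looks
roughly like this. The function name count_indices_sketch is
hypothetical, and the arrays are passed in directly rather than through
the mesh structure.

```c
/* Hypothetical helper: the length of `indices` equals the sum of the
   per-face vertex counts stored in `face_vertices`. */
static unsigned int count_indices_sketch(const unsigned int* face_vertices,
                                         unsigned int face_count)
{
    unsigned int total = 0;
    unsigned int i;

    for (i = 0; i < face_count; i++)
        total += face_vertices[i];

    return total;
}
```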
On Windows, paths with mixed slashes weren't processed correctly: `\`
would be picked as the separator for determining the base if it appeared
anywhere in the path.
This also removes the #ifdef _WIN32 from the logic; this helps on
platforms like Emscripten, where the resulting binary wasn't compiled
with _WIN32 defined but still needs backslash processing. Paths with
backslashes on Linux should be exceedingly rare, plus we *already*
replace backslashes with forward slashes on Linux anyway...
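One way to handle mixed slashes is to take whichever separator occurs
last in the path. This is a sketch under that assumption; the name
base_length_sketch is hypothetical and the real logic may differ.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the length of the base (directory) part
   of `path`, including the trailing separator, or 0 if the path has no
   separator. Both '/' and '\\' are accepted on every platform, and the
   one that appears last wins. */
static size_t base_length_sketch(const char* path)
{
    const char* fwd = strrchr(path, '/');
    const char* back = strrchr(path, '\\');
    const char* last = fwd;

    if (back && (!last || back > last))
        last = back;

    return last ? (size_t)(last - path) + 1 : 0;
}
```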
Problem: when reading a bare OBJ with no MTL provided, I still want to
keep the material setup across the scene for further editing.
Fix: instead of using a fallback material (idx 0), create a material
for each undefined material while keeping its name, so that the
structure of the model is preserved.
- Allows file paths with spaces for textures
- Allows names with spaces for internal names in obj (group, material)
- Fixes incorrect matching (matching on only the first part of a name like "my material" merged all materials with the same start)
Problem: parsing a 22 GB file fails with an out-of-bounds write.
Fix: using size_t (64-bit) for the allocation and index arithmetic fixes it.
Usage: the user can define FAST_OBJ_UINT_TYPE as size_t on x64.
When moving the unprocessed line to the beginning of the buffer, in rare
edge cases where the unprocessed chunk is larger than the processed
chunks (which means the lines are very long), the source and target
ranges will overlap. Such an overlapping copy is undefined per the C
standard and triggers ubsan errors.
Fix this by using memmove.
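A small sketch of the buffer shuffle, assuming a flat char buffer; the
function name and parameters are hypothetical. memmove is specified to
behave correctly when the ranges overlap.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: moves the unprocessed tail [tail_offset, size)
   to the front of the buffer and returns the number of bytes kept.
   The ranges overlap whenever the tail is longer than tail_offset,
   which is why memmove is used here. */
static size_t move_tail_to_front_sketch(char* buffer, size_t tail_offset, size_t size)
{
    size_t bytes = size - tail_offset;

    memmove(buffer, buffer + tail_offset, bytes);

    return bytes;
}
```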
When string_equal's first argument is a prefix of the second argument
but the second argument is longer, the loop goes through all characters
of the first string, compares terminating NUL with a different character
in the right hand side string, discovers that it's different and leaves
the loop - with 'a' having already been incremented.
After this the condition proceeds to read from *a which causes a buffer
overrun.
Fix this by changing the function to something that's obviously correct,
even if somewhat less efficient.
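One obviously correct formulation compares lengths first and only then
the bytes. This is a sketch: the signature (a NUL-terminated string
compared against a [s, e) range) is an assumption based on the
description above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: returns 1 if the NUL-terminated string `a` is
   exactly equal to the character range [s, e), and 0 otherwise.
   strlen walks `a` once more than strictly needed, but no pointer is
   ever dereferenced past either string. */
static int string_equal_sketch(const char* a, const char* s, const char* e)
{
    size_t an = strlen(a);
    size_t sn = (size_t)(e - s);

    return an == sn && memcmp(a, s, an) == 0;
}
```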
Instead of each group storing a separate face/index array, we now store
one large face/index array and each group stores offsets inside it.
This makes it easier to parse .obj files when group information is
unimportant since one can just skip it - group information is often
inessential as it doesn't affect rendering behavior.
This makes parsing large files slightly faster (rungholt.obj parses in
~500ms instead of ~530ms after this change).
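To illustrate the layout, here is a sketch of a group that stores
offsets into shared arrays. The struct and field names are hypothetical
and only mirror the description above.

```c
/* Hypothetical layout: a group no longer owns face data, it only
   records where its faces start in the shared arrays. */
typedef struct
{
    const char*  name;
    unsigned int face_count;   /* number of faces in this group */
    unsigned int face_offset;  /* first face in the shared face_vertices array */
    unsigned int index_offset; /* first index in the shared indices array */
} GroupSketch;

/* Number of indices belonging to one group, computed from the shared
   per-face vertex counts. */
static unsigned int group_index_count_sketch(const GroupSketch* group,
                                             const unsigned int* face_vertices)
{
    unsigned int total = 0;
    unsigned int i;

    for (i = 0; i < group->face_count; i++)
        total += face_vertices[group->face_offset + i];

    return total;
}
```

Code that doesn't care about groups can iterate the shared arrays from
start to end and never touch these offsets.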
For use-cases that require parsing the .obj file and outputting another
file, resolving texture paths is inconvenient since the result depends
on the path that's passed to fast_obj_read. While this can be resolved
by recomputing the relative path in user code, it seems cleaner to keep
the map names as is when parsing .mtl.
Of course, if the .obj file is needed for rendering, the path
concatenation is still convenient. This change makes
fastObjTexture::name contain the original data, while
fastObjTexture::path contains the resolved path that can be used to
actually load the texture if necessary.
Negative indices refer to offsets in vertices (before multiplying by
stride), but the array size of position/etc. is multiplied by stride,
so the size has to be divided by the stride before the offset is applied.
Integer division isn't ideal for performance, however division by 3 is
lowered into integer multiplication on gcc/clang/msvc so this shouldn't
be a big concern.
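A sketch of the index resolution, assuming the attribute array holds
`stride` floats per element and includes a dummy element at index 0 so
that positive 1-based indices map directly. Both the helper name and
the dummy-element convention are assumptions made for illustration.

```c
/* Hypothetical helper: resolves an .obj index against an attribute
   array of `float_count` floats with `stride` floats per element.
   Negative indices count back from the end, in elements rather than
   floats, hence the division by stride. */
static int resolve_index_sketch(int v, unsigned int float_count, unsigned int stride)
{
    if (v < 0)
        return (int)(float_count / stride) + v;

    return v;
}
```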
In some .obj files, there's a stray 'g' followed by a newline at the
very end of the file. What happens right now is that *p++ skips past the
"terminating newline", and then proceeds to process out of bounds memory
which leads to a crash.
I'm not sure if 'g' can actually be empty per spec, so this change just
fixes the crash without resetting the group to "default" or anything
like that; 'v'/'f' shouldn't be empty but this would fix crashing when
parsing malicious/malformed .obj files as well.