Merging dev into main #74

Merged
merged 35 commits into from
Dec 26, 2024
Changes from 4 commits
Commits
e91fbe3
Version bump before sync to main.
jbikker Dec 18, 2024
115f2a3
Update README.md
jbikker Dec 18, 2024
2aaff3e
Merge branch 'dev' of https://github.com/jbikker/tinybvh into dev
jbikker Dec 18, 2024
e89bf4c
Minimalist wavefront code.
jbikker Dec 18, 2024
2a49133
Error in windows window.
jbikker Dec 18, 2024
dd8dc99
Disabling BuildNEON for now.
jbikker Dec 19, 2024
299bc27
Functional basic gpu wavefront path tracer.
jbikker Dec 19, 2024
f8098a2
Next Event Estimation for wavefront pt.
jbikker Dec 19, 2024
e554bbf
New screenshot.
jbikker Dec 19, 2024
e711d22
Update README.md
jbikker Dec 19, 2024
2affe64
Fix for second bounce.
jbikker Dec 20, 2024
c8c6093
NEE now accounts for BRDF.
jbikker Dec 20, 2024
f204b7f
Specular support in wavefront pt.
jbikker Dec 20, 2024
2372756
native_recip and fast_normalize
benanil Dec 20, 2024
9bca1e9
Merge pull request #71 from benanil/dev
jbikker Dec 20, 2024
71e50d4
Blue noise to play with.
jbikker Dec 20, 2024
c86ac2a
Blue noise sampling for NEE & bounce.
jbikker Dec 20, 2024
a848a16
Update README.md
jbikker Dec 21, 2024
392d450
Optimized cwbvh for ~10% faster traversal.
jbikker Dec 21, 2024
8feb5c2
Merge branch 'dev' of https://github.com/jbikker/tinybvh into dev
jbikker Dec 21, 2024
d73d549
Merge branch 'dev' of https://github.com/jbikker/tinybvh into dev
jbikker Dec 21, 2024
4e51514
Save/load for cwbvh, proper path state in wavefront.
jbikker Dec 22, 2024
8846bbe
Merged.
jbikker Dec 22, 2024
d04ae10
MIS for wavefront path tracer.
jbikker Dec 22, 2024
91d1a1a
Better documented code.
jbikker Dec 22, 2024
ed3451c
Minor comments.
jbikker Dec 22, 2024
cc624f9
Fixed MIS.
jbikker Dec 23, 2024
473fbc4
Update README.md
jbikker Dec 23, 2024
b1db88a
Update README.md
jbikker Dec 23, 2024
e22f885
Update README.md
jbikker Dec 23, 2024
36caa36
Update README.md
jbikker Dec 23, 2024
38e50d5
More MIS fixes.
jbikker Dec 23, 2024
a659545
Merge branch 'dev' of https://github.com/jbikker/tinybvh into dev
jbikker Dec 23, 2024
87b9189
tinby_bvh_gpu now compiles on Linux; CodeMaid applied.
jbikker Dec 26, 2024
437ce4f
Update README.md
jbikker Dec 26, 2024
26 changes: 13 additions & 13 deletions README.md
@@ -34,19 +34,19 @@ Several special-purpose builders are also available:

 A constructed BVH can be used to quickly intersect a ray with the geometry, using ````BVH::Intersect```` or ````BVH::IsOccluded````, for shadow rays. The double-precision BVH is traversed using ````BVH::IntersectEx````.

-The constructed BVH will have a layout suitable for construction ('````WALD_32BYTE````'). Several other layouts for the same data are available, which all serve one or more specific purposes. You can convert between layouts using ````BVH::Convert````. The available layouts are:
-* ````BVH::WALD_32BYTE```` : A compact format that stores the AABB for a node, along with child pointers and leaf information in a cross-platform-friendly way. The 32-byte size allows for cache-line alignment.
-* ````BVH::ALT_SOA```` : This format stores bounding box information in a SIMD-friendly format, making the BVH faster to traverse.
-* ````BVH::WALD_DOUBLE```` : Double-precision version of ````BVH::WALD_32BYTE````.
-* ````BVH::VERBOSE```` : A format designed for modifying BVHs, e.g. for post-build optimizations using ````BVH::Optimize()````.
-* ````BVH::AILA_LAINE```` : This format uses 64 bytes per node and stores the AABBs of the two child nodes. This is the format presented in the [2009 Aila & Laine paper](https://research.nvidia.com/sites/default/files/pubs/2009-08_Understanding-the-Efficiency/aila2009hpg_paper.pdf) and recommended for basic GPU ray tracing.
-* ````BVH::BASIC_BVH4```` : In this format, each node stores four child pointers, reducing the depth of the tree. This improves performance for divergent rays. Based on the [2008 paper](https://graphics.stanford.edu/~boulos/papers/multi_rt08.pdf) by Ingo Wald et al.
-* ````BVH::BVH4_GPU```` : The ````BASIC_BVH4```` format can be converted to the more compact ````BVH4_GPU```` layout, which will be faster for GPU ray tracing.
-* ````BVH::BVH4_AFRA```` : The ````BASIC_BVH4```` format can also be converted to a SIMD-friendly ````BVH4_AFRA```` layout, currently the fastest option for single-ray traversal on CPU.
-* ````BVH::BASIC_BVH8```` : This format stores eight child pointers, further reducing the depth of the tree. The only purpose is the construction of ````BVH::CWBVH````.
-* ````BVH::CWBVH```` : An advanced 80-byte representation of the 8-wide BVH, for state-of-the-art GPU rendering, based on the [2017 paper](https://research.nvidia.com/publication/2017-07_efficient-incoherent-ray-traversal-gpus-through-compressed-wide-bvhs) by Ylitie et al. and [code by AlanWBFT](https://github.com/AlanIWBFT/CWBVH).
-
-A BVH in the ````BVH::WALD_32BYTE```` format may be _refitted_ in case the triangles moved using ````BVH::Refit````. Refitting is substantially faster than rebuilding and works well if the animation is subtle. Refitting does not work if polygon counts change.
+Apart from the default BVH layout (simply named ````BVH````), several other layouts are available, which all serve one or more specific purposes. You can create a BVH in the desired layout by instantiating the appropriate class, or by converting from ````BVH```` using the ````::ConvertFrom```` methods. The available layouts are:
+* ````BVH```` : A compact format that stores the AABB for a node, along with child pointers and leaf information in a cross-platform-friendly way. The 32-byte size allows for cache-line alignment.
+* ````BVH_SoA```` : This format stores bounding box information in a SIMD-friendly format, making the BVH faster to traverse.
+* ````BVH_Double```` : Double-precision version of ````BVH````.
+* ````BVH_Verbose```` : A format designed for modifying BVHs, e.g. for post-build optimizations using ````BVH_Verbose::Optimize()````.
+* ````BVH_GPU```` : This format uses 64 bytes per node and stores the AABBs of the two child nodes. This is the format presented in the [2009 Aila & Laine paper](https://research.nvidia.com/sites/default/files/pubs/2009-08_Understanding-the-Efficiency/aila2009hpg_paper.pdf). It can be traversed with a simple GPU kernel.
+* ````BVH4```` : In this format, each node stores four child pointers, reducing the depth of the tree. This improves performance for divergent rays. Based on the [2008 paper](https://graphics.stanford.edu/~boulos/papers/multi_rt08.pdf) by Ingo Wald et al.
+* ````BVH4_GPU```` : A more compact version of the ````BVH4```` format, which will be faster for GPU ray tracing.
+* ````BVH4_CPU```` : A SIMD-friendly version of the ````BVH4```` format, currently the fastest option for single-ray traversal on CPU.
+* ````BVH8```` : This format stores eight child pointers, further reducing the depth of the tree.
+* ````BVH8_CWBVH```` : An advanced 80-byte representation of the 8-wide BVH, for state-of-the-art GPU rendering, based on the [2017 paper](https://research.nvidia.com/publication/2017-07_efficient-incoherent-ray-traversal-gpus-through-compressed-wide-bvhs) by Ylitie et al. and [code by AlanWBFT](https://github.com/AlanIWBFT/CWBVH).
+
+A BVH in the ````BVH```` format may be _refitted_, in case the triangles moved, using ````BVH::Refit````. Refitting is substantially faster than rebuilding and works well if the animation is subtle. Refitting does not work if polygon counts change.

 # How To Use
 The library ````tiny_bvh.h```` is designed to be easy to use. Please have a look at tiny_bvh_minimal.cpp for an example. A Visual Studio 'solution' (.sln/.vcxproj) is included, as well as a CMake file. That being said: The examples consists of only a single source file, which can be compiled with clang or g++, e.g.:
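Note on the refitting paragraph in the README hunk above: the text says a BVH can be refitted when triangles move, as long as the polygon count stays the same. The sketch below illustrates the general idea only. It uses a toy node layout with hypothetical names (`Aabb`, `Tri`, `Node`, `Refit`) and is not tinybvh's `BVH::Refit` implementation. It assumes every child node is stored at a higher index than its parent, so a reverse sweep over the node array visits children before parents.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy types for illustration; not tinybvh's node layout.
struct Aabb { float min[3], max[3]; };
struct Tri { float v[3][3]; }; // three vertices, xyz each
struct Node
{
	Aabb bounds;
	uint32_t left = 0, right = 0;        // child indices (interior nodes only)
	uint32_t firstTri = 0, triCount = 0; // triCount > 0 marks a leaf
};

static Aabb EmptyBox() { return { { 1e30f, 1e30f, 1e30f }, { -1e30f, -1e30f, -1e30f } }; }

static void Grow( Aabb& b, const float p[3] )
{
	for (int a = 0; a < 3; a++)
	{
		b.min[a] = std::min( b.min[a], p[a] );
		b.max[a] = std::max( b.max[a], p[a] );
	}
}

// Recompute all bounds bottom-up; the tree topology is left untouched.
void Refit( std::vector<Node>& nodes, const std::vector<Tri>& tris )
{
	for (size_t i = nodes.size(); i-- > 0; )
	{
		Node& n = nodes[i];
		Aabb box = EmptyBox();
		if (n.triCount > 0)
		{
			// leaf: bound the (possibly moved) triangle vertices
			for (uint32_t t = 0; t < n.triCount; t++)
				for (int v = 0; v < 3; v++) Grow( box, tris[n.firstTri + t].v[v] );
		}
		else
		{
			// interior: children were already refitted by the reverse sweep
			assert( n.left > i && n.right > i );
			const Aabb& l = nodes[n.left].bounds;
			const Aabb& r = nodes[n.right].bounds;
			Grow( box, l.min ); Grow( box, l.max );
			Grow( box, r.min ); Grow( box, r.max );
		}
		n.bounds = box;
	}
}
```

Because only the boxes change and the tree structure is reused, this kind of update is cheap, but the tree quality degrades when triangles move far from where they were at build time. That is consistent with the README's remark that refitting works well when the animation is subtle.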
27 changes: 27 additions & 0 deletions tiny_bvh.h
@@ -894,6 +894,8 @@ class BVH8_CWBVH : public BVHBase
 	BVH8_CWBVH( BVHContext ctx = {} ) { context = ctx; }
 	BVH8_CWBVH( BVH8& original ) { /* DEPRECATED */ ConvertFrom( bvh8 ); }
 	~BVH8_CWBVH();
+	void Save( const char* fileName );
+	bool Load( const char* fileName );
 	void Build( const bvhvec4* vertices, const uint32_t primCount );
 	void Build( const bvhvec4slice& vertices );
 	void ConvertFrom( BVH8& original ); // NOTE: Not const; this may change some nodes in the original.
@@ -937,6 +939,7 @@ class BLASInstance
 #ifdef _MSC_VER
 #include <intrin.h> // for __lzcnt
 #endif
+#include <fstream> // fstream

 // We need quite a bit of type reinterpretation, so we'll
 // turn off the gcc warning here until the end of the file.
@@ -2967,6 +2970,30 @@ BVH8_CWBVH::~BVH8_CWBVH()
 	AlignedFree( bvh8Tris );
 }

+void BVH8_CWBVH::Save( const char* fileName )
+{
+	std::fstream s{ fileName, s.binary | s.out };
+	s.write( (char*)this, sizeof( BVH8_CWBVH ) );
+	s.write( (char*)bvh8Data, usedBlocks * 16 );
+	s.write( (char*)bvh8Tris, bvh8.idxCount * 4 * 16 );
+}
+
+bool BVH8_CWBVH::Load( const char* fileName )
+{
+	std::fstream s{ fileName, s.binary | s.in };
+	if (!s) return false;
+	BVHContext tmp = context;
+	s.read( (char*)this, sizeof( BVH8_CWBVH ) );
+	context = tmp; // can't load context; function pointers will differ.
+	bvh8Data = (bvhvec4*)AlignedAlloc( usedBlocks * 16 );
+	bvh8Tris = (bvhvec4*)AlignedAlloc( bvh8.idxCount * 4 * 16 );
+	allocatedBlocks = usedBlocks;
+	s.read( (char*)bvh8Data, usedBlocks * 16 );
+	s.read( (char*)bvh8Tris, bvh8.idxCount * 4 * 16 );
+	bvh8 = BVH8();
+	return true;
+}
+
 void BVH8_CWBVH::Build( const bvhvec4* vertices, const uint32_t primCount )
 {
 	Build( bvhvec4slice( vertices, primCount * 3, sizeof( bvhvec4 ) ) );
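Note on the `Save`/`Load` pair added in this hunk: the object itself is written to disk, followed by the node and triangle arrays, and `Load` restores `context` and reallocates the two arrays after reading the object back. For comparison only, the following sketch shows an alternative that stores plain data behind a small header, so that a reader can reject a truncated or foreign file. All names here (`BlobHeader`, `SaveBlobs`, `LoadBlobs`, the magic value and the version field) are hypothetical and are not part of tinybvh.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <vector>

// Hypothetical file header; magic value and fields are made up for this sketch.
struct BlobHeader
{
	uint32_t magic = 0x48564243;
	uint32_t version = 1;
	uint32_t nodeBytes = 0;
	uint32_t triBytes = 0;
};

// Writes header + two byte arrays; returns false if any write failed.
bool SaveBlobs( const char* fileName, const std::vector<char>& nodes, const std::vector<char>& tris )
{
	assert( fileName != nullptr );
	std::ofstream s( fileName, std::ios::binary );
	if (!s) return false;
	BlobHeader h;
	h.nodeBytes = (uint32_t)nodes.size();
	h.triBytes = (uint32_t)tris.size();
	s.write( (const char*)&h, sizeof( h ) );
	s.write( nodes.data(), (std::streamsize)nodes.size() );
	s.write( tris.data(), (std::streamsize)tris.size() );
	return s.good();
}

// Reads the file back; outputs are only modified when the whole file is valid.
bool LoadBlobs( const char* fileName, std::vector<char>& nodes, std::vector<char>& tris )
{
	assert( fileName != nullptr );
	std::ifstream s( fileName, std::ios::binary );
	if (!s) return false;
	BlobHeader h, expected;
	s.read( (char*)&h, sizeof( h ) );
	if (!s || h.magic != expected.magic || h.version != expected.version) return false;
	std::vector<char> n( h.nodeBytes ), t( h.triBytes );
	s.read( n.data(), (std::streamsize)n.size() );
	s.read( t.data(), (std::streamsize)t.size() );
	if (!s) return false; // truncated file
	nodes.swap( n );
	tris.swap( t );
	return true;
}
```

The header costs sixteen bytes per file. In exchange, the loader never reads array sizes from an unrelated file, and no pointers or other process-specific state end up on disk.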
16 changes: 15 additions & 1 deletion tiny_bvh_gpu.cpp
@@ -96,7 +96,21 @@ void Init()
 	AddMesh( "./testdata/lucy.bin", 1.1f, bvhvec3( -2, 4.1f, -3 ), 0x2ffff88 );
 	AddQuad( bvhvec3( 0, 30, -1 ), 9, 5, 0x1ffffff ); // hard-coded light source
 	// build bvh (here: 'compressed wide bvh', for efficient GPU rendering)
-	bvh.Build( tris, triCount );
+	if (!bvh.Load( "cwbvh.bin" ))
+	{
+		// optimizing a BVH: from BVH to BVH_Verbose, optimize, then back to BVH.
+		BVH bvh2;
+		bvh2.Build( tris, triCount );
+		BVH_Verbose verbose;
+		verbose.ConvertFrom( bvh2 );
+		verbose.Optimize( 1000000 ); // this will take a while: Next time, use cache.
+		bvh2.ConvertFrom( verbose );
+		// building a cwbvh without the convenient constructor: From BVH, via BVH8.
+		BVH8 bvh8;
+		bvh8.ConvertFrom( bvh2 );
+		bvh.ConvertFrom( bvh8 );
+		bvh.Save( "cwbvh.bin" );
+	}
 	// create OpenCL buffers for BVH data
 	cwbvhNodes = new Buffer( bvh.usedBlocks * sizeof( bvhvec4 ), bvh.bvh8Data );
 	cwbvhTris = new Buffer( bvh.idxCount * 3 * sizeof( bvhvec4 ), bvh.bvh8Tris );
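Note on the change to `Init()` above: the expensive build (including `Optimize`) now runs only when `cwbvh.bin` cannot be loaded, and the result is saved for the next run. The control flow is a plain load-or-build cache. A library-independent sketch of that pattern follows; `LoadOrBuild` and its byte-string payload are hypothetical stand-ins for illustration, not tinybvh functions.

```cpp
#include <cassert>
#include <fstream>
#include <functional>
#include <iterator>
#include <string>

// Returns the cached payload if the file can be opened; otherwise runs the
// (expensive) builder once and writes its result for the next run.
std::string LoadOrBuild( const char* cacheFile, const std::function<std::string()>& build )
{
	assert( cacheFile != nullptr );
	{
		std::ifstream in( cacheFile, std::ios::binary );
		if (in) return std::string( std::istreambuf_iterator<char>( in ), std::istreambuf_iterator<char>() );
	}
	std::string data = build(); // cache miss: do the slow work
	std::ofstream out( cacheFile, std::ios::binary );
	out.write( data.data(), (std::streamsize)data.size() );
	return data;
}
```

A cache keyed only by file name cannot tell whether the input has changed. In this sketch, and presumably also in the `Init()` code above, a stale file would be loaded as-is, so the cache file should be deleted whenever the scene geometry changes.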