Merging dev into main #74

Merged
merged 35 commits into from
Dec 26, 2024
Changes from 2 commits
e91fbe3
Version bump before sync to main.
jbikker Dec 18, 2024
115f2a3
Update README.md
jbikker Dec 18, 2024
2aaff3e
Merge branch 'dev' of https://github.com/jbikker/tinybvh into dev
jbikker Dec 18, 2024
e89bf4c
Minimalist wavefront code.
jbikker Dec 18, 2024
2a49133
Error in windows window.
jbikker Dec 18, 2024
dd8dc99
Disabling BuildNEON for now.
jbikker Dec 19, 2024
299bc27
Functional basic gpu wavefront path tracer.
jbikker Dec 19, 2024
f8098a2
Next Event Estimation for wavefront pt.
jbikker Dec 19, 2024
e554bbf
New screenshot.
jbikker Dec 19, 2024
e711d22
Update README.md
jbikker Dec 19, 2024
2affe64
Fix for second bounce.
jbikker Dec 20, 2024
c8c6093
NEE now accounts for BRDF.
jbikker Dec 20, 2024
f204b7f
Specular support in wavefront pt.
jbikker Dec 20, 2024
2372756
native_recip and fast_normalize
benanil Dec 20, 2024
9bca1e9
Merge pull request #71 from benanil/dev
jbikker Dec 20, 2024
71e50d4
Blue noise to play with.
jbikker Dec 20, 2024
c86ac2a
Blue noise sampling for NEE & bounce.
jbikker Dec 20, 2024
a848a16
Update README.md
jbikker Dec 21, 2024
392d450
Optimized cwbvh for ~10% faster traversal.
jbikker Dec 21, 2024
8feb5c2
Merge branch 'dev' of https://github.com/jbikker/tinybvh into dev
jbikker Dec 21, 2024
d73d549
Merge branch 'dev' of https://github.com/jbikker/tinybvh into dev
jbikker Dec 21, 2024
4e51514
Save/load for cwbvh, proper path state in wavefront.
jbikker Dec 22, 2024
8846bbe
Merged.
jbikker Dec 22, 2024
d04ae10
MIS for wavefront path tracer.
jbikker Dec 22, 2024
91d1a1a
Better documented code.
jbikker Dec 22, 2024
ed3451c
Minor comments.
jbikker Dec 22, 2024
cc624f9
Fixed MIS.
jbikker Dec 23, 2024
473fbc4
Update README.md
jbikker Dec 23, 2024
b1db88a
Update README.md
jbikker Dec 23, 2024
e22f885
Update README.md
jbikker Dec 23, 2024
36caa36
Update README.md
jbikker Dec 23, 2024
38e50d5
More MIS fixes.
jbikker Dec 23, 2024
a659545
Merge branch 'dev' of https://github.com/jbikker/tinybvh into dev
jbikker Dec 23, 2024
87b9189
tinby_bvh_gpu now compiles on Linux; CodeMaid applied.
jbikker Dec 26, 2024
437ce4f
Update README.md
jbikker Dec 26, 2024
4 changes: 3 additions & 1 deletion README.md
@@ -48,6 +48,8 @@ Apart from the default BVH layout (simply named ````BVH````), several other layo

A BVH in the ````BVH```` format may be _refitted_, in case the triangles moved, using ````BVH::Refit````. Refitting is substantially faster than rebuilding and works well if the animation is subtle. Refitting does not work if polygon counts change.

+New in version 1.1.3: 'Self-contained' formats may be serialized and de-serialized via ````::Save```` and ````::Load````. Currently this is supported for ````BVH8_CWBVH````, which stores vertex data in a custom format and thus does not rely on the input vertices for traversal.
+
# How To Use
The library ````tiny_bvh.h```` is designed to be easy to use. Please have a look at tiny_bvh_minimal.cpp for an example. A Visual Studio 'solution' (.sln/.vcxproj) is included, as well as a CMake file. That being said: The examples consists of only a single source file, which can be compiled with clang or g++, e.g.:

@@ -71,7 +73,7 @@ The **performance measurement tool** can be compiled with:

````g++ -std=c++20 -mavx -Ofast tiny_bvh_speedtest.cpp -o tiny_bvh_speedtest````

-# Version 1.1.2
+# Version 1.1.3

Version 1.1.0 introduced a <ins>change to the API</ins>. The single BVH class with multiple layouts has been replaced with a BVH class per layout. You can simply instantiate the desired layout; conversion (and data ownership) is then handled properly by the library. Examples:

100 changes: 50 additions & 50 deletions tiny_bvh.h
@@ -116,7 +116,7 @@ THE SOFTWARE.
// library version
#define TINY_BVH_VERSION_MAJOR 1
#define TINY_BVH_VERSION_MINOR 1
-#define TINY_BVH_VERSION_SUB 2
+#define TINY_BVH_VERSION_SUB 3

// ============================================================================
//
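Code that depends on the `Save`/`Load` support that the README lists as new in 1.1.3 could gate on these macros at compile time. A minimal sketch: the three `TINY_BVH_VERSION_*` macros are the ones in this hunk (redefined here only so the snippet compiles without the header), while `APP_TINYBVH_AT_LEAST` and `kHasCwbvhSaveLoad` are hypothetical names that are not part of tinybvh.

```cpp
#include <cassert>

// Stand-ins for the macros in tiny_bvh.h, using the values from this diff.
// If the real header is included first, these fallbacks are skipped.
#ifndef TINY_BVH_VERSION_MAJOR
#define TINY_BVH_VERSION_MAJOR 1
#define TINY_BVH_VERSION_MINOR 1
#define TINY_BVH_VERSION_SUB 3
#endif

// Hypothetical helper (not part of tinybvh): true if the library version is
// at least maj.min.sub, compared component by component.
#define APP_TINYBVH_AT_LEAST( maj, min, sub ) \
    ( TINY_BVH_VERSION_MAJOR > (maj) || \
    ( TINY_BVH_VERSION_MAJOR == (maj) && ( TINY_BVH_VERSION_MINOR > (min) || \
    ( TINY_BVH_VERSION_MINOR == (min) && TINY_BVH_VERSION_SUB >= (sub) ) ) ) )

// Hypothetical feature flag: serialization is described as new in 1.1.3.
#if APP_TINYBVH_AT_LEAST( 1, 1, 3 )
constexpr bool kHasCwbvhSaveLoad = true;
#else
constexpr bool kHasCwbvhSaveLoad = false;
#endif
```

Application code could then wrap its serialization path in `if constexpr (kHasCwbvhSaveLoad)` or an equivalent `#if`, so that it still builds against 1.1.2.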
@@ -907,7 +907,7 @@ class BVH8_CWBVH : public BVHBase
uint32_t allocatedBlocks = 0; // node data is stored in blocks of 16 byte.
uint32_t usedBlocks = 0; // actually used blocks.
BVH8 bvh8; // BVH8_CWBVH is created from BVH8 and uses its data.
-bool ownBVH8 = true; // False when ConvertFrom receives an external bvh8.
+bool ownBVH8 = true; // false when ConvertFrom receives an external bvh8.
};

// BLASInstance: A TLAS is built over BLAS instances, where a single BLAS can be
@@ -2176,18 +2176,18 @@ void BVH_Verbose::MergeLeafs()
// BVH_GPU implementation
// ----------------------------------------------------------------------------

-BVH_GPU::~BVH_GPU()
+BVH_GPU::~BVH_GPU()
{
if (!ownBVH) bvh = BVH(); // clear out pointers we don't own.
-AlignedFree( bvhNode );
+AlignedFree( bvhNode );
}

-void BVH_GPU::Build( const bvhvec4* vertices, const uint32_t primCount )
-{
-Build( bvhvec4slice( vertices, primCount * 3, sizeof( bvhvec4 ) ) );
+void BVH_GPU::Build( const bvhvec4* vertices, const uint32_t primCount )
+{
+Build( bvhvec4slice( vertices, primCount * 3, sizeof( bvhvec4 ) ) );
}
-void BVH_GPU::Build( const bvhvec4slice& vertices )
-{
+void BVH_GPU::Build( const bvhvec4slice& vertices )
+{
bvh.BuildDefault( vertices );
ConvertFrom( bvh );
}
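The pointer overload of `Build` wraps the vertex array in a `bvhvec4slice` with `primCount * 3` entries and a stride of `sizeof( bvhvec4 )`, i.e. three tightly packed vertices per triangle. As a rough sketch of what such a strided view does (this is not the tinybvh implementation; `Vec4`, `Vec4Slice` and `Vertex` are hypothetical stand-ins), a larger stride lets the same kind of view step over positions that are interleaved with other per-vertex data:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

struct Vec4 { float x, y, z, w; };

// Hypothetical strided view: 'count' Vec4 values, 'stride' bytes apart.
struct Vec4Slice
{
    const unsigned char* data;
    uint32_t count, stride;
    Vec4Slice( const void* p, uint32_t n, uint32_t s )
        : data( static_cast<const unsigned char*>( p ) ), count( n ), stride( s ) {}
    Vec4 operator[]( uint32_t i ) const
    {
        assert( i < count );
        Vec4 v;
        // Copy the i-th element out of the byte stream.
        std::memcpy( &v, data + static_cast<std::size_t>( i ) * stride, sizeof( Vec4 ) );
        return v;
    }
};

// Interleaved layout for illustration: a position followed by a normal.
struct Vertex { Vec4 pos, normal; };
```

With tightly packed data the stride equals `sizeof( Vec4 )`; for the interleaved `Vertex` array it would be `sizeof( Vertex )`, and the view then returns only the positions.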
@@ -2196,7 +2196,7 @@ void BVH_GPU::ConvertFrom( const BVH& original )
{
// get a copy of the original bvh
if (&original != &bvh) ownBVH = false; // bvh isn't ours; don't delete in destructor.
-bvh = original;
+bvh = original;
// allocate space
const uint32_t spaceNeeded = original.usedNodes;
if (allocatedNodes < spaceNeeded)
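The `ownBVH` flag and the `bvh = BVH()` reset in the destructors form an ownership pattern that recurs in every layout class in this file. A minimal sketch of the idea, assuming the wrapped object frees its buffers in its own destructor and is copied shallowly (`Inner` and `Wrapper` are hypothetical stand-ins, not tinybvh types):

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for BVH: owns a buffer, frees it on destruction,
// and is copied shallowly (a copy shares the pointer with its source).
struct Inner
{
    int* data = nullptr;
    static int freeCount; // number of buffers released so far
    Inner() = default;
    Inner( const Inner& ) = default;
    Inner& operator=( const Inner& ) = default;
    ~Inner() { if (data) { std::free( data ); freeCount++; } }
};
int Inner::freeCount = 0;

// Hypothetical stand-in for a layout class such as BVH_GPU.
struct Wrapper
{
    Inner inner;
    bool ownInner = true; // false once ConvertFrom receives an external Inner
    void ConvertFrom( const Inner& original )
    {
        if (&original != &inner) ownInner = false; // not ours: don't free it later
        inner = original; // shallow copy: the pointer is shared with 'original'
    }
    ~Wrapper()
    {
        // Drop borrowed pointers so that ~Inner, which runs next, frees nothing.
        if (!ownInner) inner = Inner();
    }
};
```

In this sketch the borrowed buffer is released exactly once, by the external object. The flip side is that the wrapper's pointer dangles if the external object is destroyed first.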
@@ -2285,18 +2285,18 @@ int32_t BVH_GPU::Intersect( Ray& ray ) const
// BVH_SoA implementation
// ----------------------------------------------------------------------------

-BVH_SoA::~BVH_SoA()
+BVH_SoA::~BVH_SoA()
{
if (!ownBVH) bvh = BVH(); // clear out pointers we don't own.
-AlignedFree( bvhNode );
+AlignedFree( bvhNode );
}

-void BVH_SoA::Build( const bvhvec4* vertices, const uint32_t primCount )
-{
-Build( bvhvec4slice( vertices, primCount * 3, sizeof( bvhvec4 ) ) );
+void BVH_SoA::Build( const bvhvec4* vertices, const uint32_t primCount )
+{
+Build( bvhvec4slice( vertices, primCount * 3, sizeof( bvhvec4 ) ) );
}
-void BVH_SoA::Build( const bvhvec4slice& vertices )
-{
+void BVH_SoA::Build( const bvhvec4slice& vertices )
+{
bvh.context = context; // properly propagate context to fix issue #66.
bvh.BuildDefault( vertices );
ConvertFrom( bvh );
@@ -2306,7 +2306,7 @@ void BVH_SoA::ConvertFrom( const BVH& original )
{
// get a copy of the original bvh
if (&original != &bvh) ownBVH = false; // bvh isn't ours; don't delete in destructor.
-bvh = original;
+bvh = original;
// allocate space
const uint32_t spaceNeeded = bvh.usedNodes;
if (allocatedNodes < spaceNeeded)
@@ -2355,15 +2355,15 @@ void BVH_SoA::ConvertFrom( const BVH& original )
// BVH4 implementation
// ----------------------------------------------------------------------------

-BVH4::~BVH4()
+BVH4::~BVH4()
{
if (!ownBVH) bvh = BVH(); // clear out pointers we don't own.
-AlignedFree( bvh4Node );
+AlignedFree( bvh4Node );
}

void BVH4::Build( const bvhvec4* vertices, const uint32_t primCount )
{
-Build( bvhvec4slice( vertices, primCount * 3, sizeof( bvhvec4 ) ) );
+Build( bvhvec4slice( vertices, primCount * 3, sizeof( bvhvec4 ) ) );
}
void BVH4::Build( const bvhvec4slice& vertices )
{
@@ -2376,7 +2376,7 @@ void BVH4::ConvertFrom( const BVH& original )
{
// get a copy of the original bvh
if (&original != &bvh) ownBVH = false; // bvh isn't ours; don't delete in destructor.
-bvh = original;
+bvh = original;
// allocate space
const uint32_t spaceNeeded = original.usedNodes;
if (allocatedNodes < spaceNeeded)
@@ -2458,19 +2458,19 @@ int32_t BVH4::Intersect( Ray& ray ) const
// BVH4_CPU implementation
// ----------------------------------------------------------------------------

-BVH4_CPU::~BVH4_CPU()
+BVH4_CPU::~BVH4_CPU()
{
if (!ownBVH4) bvh4 = BVH4(); // clear out pointers we don't own.
-AlignedFree( bvh4Node );
+AlignedFree( bvh4Node );
AlignedFree( bvh4Tris );
}

-void BVH4_CPU::Build( const bvhvec4* vertices, const uint32_t primCount )
+void BVH4_CPU::Build( const bvhvec4* vertices, const uint32_t primCount )
{
-Build( bvhvec4slice( vertices, primCount * 3, sizeof( bvhvec4 ) ) );
+Build( bvhvec4slice( vertices, primCount * 3, sizeof( bvhvec4 ) ) );
}
-void BVH4_CPU::Build( const bvhvec4slice& vertices )
-{
+void BVH4_CPU::Build( const bvhvec4slice& vertices )
+{
bvh4.context = context; // properly propagate context to fix issue #66.
bvh4.Build( vertices );
ConvertFrom( bvh4 );
@@ -2480,7 +2480,7 @@ void BVH4_CPU::ConvertFrom( const BVH4& original )
{
// get a copy of the original bvh4
if (&original != &bvh4) ownBVH4 = false; // bvh isn't ours; don't delete in destructor.
-bvh4 = original;
+bvh4 = original;
// Convert a 4-wide BVH to a format suitable for CPU traversal.
// See Faster Incoherent Ray Traversal Using 8-Wide AVX InstructionsLayout,
// Atilla T. Áfra, 2013.
@@ -2564,18 +2564,18 @@ void BVH4_CPU::ConvertFrom( const BVH4& original )
// BVH4_GPU implementation
// ----------------------------------------------------------------------------

-BVH4_GPU::~BVH4_GPU()
+BVH4_GPU::~BVH4_GPU()
{
if (!ownBVH4) bvh4 = BVH4(); // clear out pointers we don't own.
-AlignedFree( bvh4Data );
+AlignedFree( bvh4Data );
}

-void BVH4_GPU::Build( const bvhvec4* vertices, const uint32_t primCount )
-{
-Build( bvhvec4slice( vertices, primCount * 3, sizeof( bvhvec4 ) ) );
+void BVH4_GPU::Build( const bvhvec4* vertices, const uint32_t primCount )
+{
+Build( bvhvec4slice( vertices, primCount * 3, sizeof( bvhvec4 ) ) );
}
-void BVH4_GPU::Build( const bvhvec4slice& vertices )
-{
+void BVH4_GPU::Build( const bvhvec4slice& vertices )
+{
bvh4.context = context; // properly propagate context to fix issue #66.
bvh4.Build( vertices );
ConvertFrom( bvh4 );
@@ -2816,18 +2816,18 @@ int32_t BVH4_GPU::Intersect( Ray& ray ) const
// BVH8 implementation
// ----------------------------------------------------------------------------

-BVH8::~BVH8()
+BVH8::~BVH8()
{
if (!ownBVH) bvh = BVH(); // clear out pointers we don't own.
AlignedFree( bvh8Node );
}

-void BVH8::Build( const bvhvec4* vertices, const uint32_t primCount )
-{
-Build( bvhvec4slice( vertices, primCount * 3, sizeof( bvhvec4 ) ) );
+void BVH8::Build( const bvhvec4* vertices, const uint32_t primCount )
+{
+Build( bvhvec4slice( vertices, primCount * 3, sizeof( bvhvec4 ) ) );
}
-void BVH8::Build( const bvhvec4slice& vertices )
-{
+void BVH8::Build( const bvhvec4slice& vertices )
+{
bvh.context = context; // properly propagate context to fix issue #66.
bvh.BuildDefault( vertices );
ConvertFrom( bvh );
@@ -2837,7 +2837,7 @@ void BVH8::ConvertFrom( const BVH& original )
{
// get a copy of the original
if (&original != &bvh) ownBVH = false; // bvh isn't ours; don't delete in destructor.
-bvh = original;
+bvh = original;
// allocate space
// Note: The safe upper bound here is usedNodes when converting an existing
// BVH2, but we need triCount * 2 to be safe in later conversions, e.g. to
@@ -2963,7 +2963,7 @@ int32_t BVH8::Intersect( Ray& ray ) const
// BVH8_CWBVH implementation
// ----------------------------------------------------------------------------

-BVH8_CWBVH::~BVH8_CWBVH()
+BVH8_CWBVH::~BVH8_CWBVH()
{
if (!ownBVH8) bvh8 = BVH8(); // clear out pointers we don't own.
AlignedFree( bvh8Data );
@@ -2994,12 +2994,12 @@ bool BVH8_CWBVH::Load( const char* fileName )
return true;
}
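The body of `BVH8_CWBVH::Load` is outside the shown hunks, so the actual file layout is not visible here. To illustrate the general idea of serializing a 'self-contained' format whose node and triangle data live in 16-byte blocks, here is a hedged sketch with a made-up layout (a block count followed by the raw blocks, once for nodes and once for triangles). `Block16`, `SelfContainedData` and the free functions `Save` and `Load` are hypothetical and do not reflect tinybvh's file format or signatures.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <vector>

struct Block16 { float x, y, z, w; };
static_assert( sizeof( Block16 ) == 16, "blocks are expected to be 16 bytes" );

// Hypothetical container: everything traversal needs, with no external vertices.
struct SelfContainedData { std::vector<Block16> nodes, tris; };

static bool WriteBlocks( std::FILE* f, const std::vector<Block16>& v )
{
    const uint32_t n = static_cast<uint32_t>( v.size() );
    if (std::fwrite( &n, sizeof( n ), 1, f ) != 1) return false;
    return n == 0 || std::fwrite( v.data(), sizeof( Block16 ), n, f ) == n;
}

static bool ReadBlocks( std::FILE* f, std::vector<Block16>& v )
{
    uint32_t n = 0;
    if (std::fread( &n, sizeof( n ), 1, f ) != 1) return false;
    v.resize( n ); // a real loader should sanity-check n before allocating
    return n == 0 || std::fread( v.data(), sizeof( Block16 ), n, f ) == n;
}

bool Save( const SelfContainedData& d, const char* fileName )
{
    std::FILE* f = std::fopen( fileName, "wb" );
    if (!f) return false;
    const bool ok = WriteBlocks( f, d.nodes ) && WriteBlocks( f, d.tris );
    return std::fclose( f ) == 0 && ok;
}

bool Load( SelfContainedData& d, const char* fileName )
{
    std::FILE* f = std::fopen( fileName, "rb" );
    if (!f) return false;
    const bool ok = ReadBlocks( f, d.nodes ) && ReadBlocks( f, d.tris );
    std::fclose( f );
    return ok;
}
```

Because the triangle data travels with the nodes, a loaded structure in this sketch needs nothing else to be usable, which matches the README's description of a format that does not rely on the input vertices for traversal.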

-void BVH8_CWBVH::Build( const bvhvec4* vertices, const uint32_t primCount )
-{
-Build( bvhvec4slice( vertices, primCount * 3, sizeof( bvhvec4 ) ) );
+void BVH8_CWBVH::Build( const bvhvec4* vertices, const uint32_t primCount )
+{
+Build( bvhvec4slice( vertices, primCount * 3, sizeof( bvhvec4 ) ) );
}
-void BVH8_CWBVH::Build( const bvhvec4slice& vertices )
-{
+void BVH8_CWBVH::Build( const bvhvec4slice& vertices )
+{
bvh8.context = context; // properly propagate context to fix issue #66.
bvh8.Build( vertices );
ConvertFrom( bvh8 );
@@ -3009,7 +3009,7 @@ void BVH8_CWBVH::ConvertFrom( BVH8& original )
{
// get a copy of the original bvh8
if (&original != &bvh8) ownBVH8 = false; // bvh isn't ours; don't delete in destructor.
-bvh8 = original;
+bvh8 = original;
// Convert a BVH8 to the format specified in: "Efficient Incoherent Ray
// Traversal on GPUs Through Compressed Wide BVHs", Ylitie et al. 2017.
// Adapted from code by "AlanWBFT".
2 changes: 1 addition & 1 deletion tiny_bvh_gpu.cpp
@@ -26,7 +26,7 @@ static int triCount = 0, frameIdx = 0, spp = 0;
static Kernel* init, * clear, * generate, * extend, * shade;
static Kernel* updateCounters1, * updateCounters2, * traceShadows, * finalize;
static Buffer* pixels, * accumulator, * raysIn, * raysOut, * connections, * triData;
-static Buffer* cwbvhNodes = 0, * cwbvhTris = 0, *noise = 0;
+static Buffer* cwbvhNodes = 0, * cwbvhTris = 0, * noise = 0;
static size_t computeUnits;
static uint32_t* blueNoise = new uint32_t[128 * 128 * 8];

29 changes: 16 additions & 13 deletions wavefront.cl
@@ -68,10 +68,12 @@ float3 CosWeightedDiffReflection( const float3 N, const float r0, const float r1
}

// PathState: path throughput, current extension ray, pixel index
+#define PATH_LAST_SPECULAR 1
+#define PATH_VIA_DIFFUSE 2
struct PathState
{
-float4 T; // xyz = rgb, postponed pdf in w
-float4 O; // pixel index and path depth in O.w
+float4 T; // xyz = rgb, postponed MIS pdf in w
+float4 O; // O.w: 24-bit pixel index, 4-bit path depth, 4-bit path flags
float4 D; // t in D.w
float4 hit;
};
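The new `O.w` layout packs three fields into one 32-bit word that is stored through `as_float`: the pixel index in the upper 24 bits, the path depth in the next 4 bits and the path flags in the lowest 4 bits. A host-side C++ sketch of that packing, based on the shifts used in this diff (the helper names are hypothetical, and `std::memcpy` stands in for OpenCL's `as_float` / `as_uint`):

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

constexpr uint32_t PATH_LAST_SPECULAR = 1;
constexpr uint32_t PATH_VIA_DIFFUSE = 2;

// Hypothetical helpers mirroring the bit layout described in the struct comment.
constexpr uint32_t PackPathState( uint32_t pixelIdx, uint32_t depth, uint32_t flags )
{
    return (pixelIdx << 8) | ((depth & 15u) << 4) | (flags & 15u);
}
constexpr uint32_t PixelIdx( uint32_t s ) { return s >> 8; }
constexpr uint32_t Depth( uint32_t s ) { return (s >> 4) & 15u; }
constexpr uint32_t Flags( uint32_t s ) { return s & 15u; }

// Bit-for-bit reinterpretation, standing in for as_float / as_uint.
inline float AsFloat( uint32_t u ) { float f; std::memcpy( &f, &u, sizeof( f ) ); return f; }
inline uint32_t AsUint( float f ) { uint32_t u; std::memcpy( &u, &f, sizeof( u ) ); return u; }
```

With 24 bits for the pixel index, this layout addresses up to 16,777,216 pixels, and the 4-bit depth field can hold values up to 15.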
@@ -125,8 +127,8 @@ void kernel Generate( global struct PathState* raysOut, uint frameSeed )
const float u = ((float)x + RandomFloat( &seed )) / (float)get_global_size( 0 );
const float v = ((float)y + RandomFloat( &seed )) / (float)get_global_size( 1 );
const float4 P = rd.p0 + u * (rd.p1 - rd.p0) + v * (rd.p2 - rd.p0);
-raysOut[id].T = (float4)(1, 1, 1, -1 /* pdf, or -1 for specular vertex */);
-raysOut[id].O = (float4)(rd.eye.xyz, as_float( id << 4 /* low bits: depth */ ));
+raysOut[id].T = (float4)(1, 1, 1, 1 );
+raysOut[id].O = (float4)(rd.eye.xyz, as_float( (id << 8) + PATH_LAST_SPECULAR ));
raysOut[id].D = (float4)(fast_normalize( P.xyz - rd.eye.xyz ), 1e30f);
raysOut[id].hit = (float4)(1e30f, 0, 0, as_float( 0 ));
}
@@ -169,13 +171,14 @@ void kernel Shade( global float4* accumulator,
const int pathId = atomic_dec( &shadeTasks ) - 1;
if (pathId < 0) break;
// fetch path data
-float4 T4 = raysIn[pathId].T; // xyz = rgb, postponed pdf in w
-float4 O4 = raysIn[pathId].O; // pixel index in O.w
-float4 D4 = raysIn[pathId].D; // t in D.w
+float4 T4 = raysIn[pathId].T; // xyz = rgb, postponed pdf in w
+float4 O4 = raysIn[pathId].O; // pixel index in O.w
+float4 D4 = raysIn[pathId].D; // t in D.w
float4 hit = raysIn[pathId].hit; // dist, u, v, prim
// prepare for shading
-uint depth = as_uint( O4.w ) & 15;
-uint pixelIdx = as_uint( O4.w ) >> 4;
+uint pathState = as_uint( O4.w );
+uint pixelIdx = pathState >> 8;
+uint depth = (pathState >> 4) & 15;
uint seed = WangHash( as_uint( O4.w ) + rd.frameIdx * 17117 );
float3 T = T4.xyz;
float t = hit.x;
@@ -194,7 +197,7 @@ void kernel Shade( global float4* accumulator,
float3 lightColor = (float3)(20);
if (mat == 1 /* light source */)
{
-if (T4.w == -1) accumulator[pixelIdx] += (float4)(T * lightColor, 1);
+if (pathState & PATH_LAST_SPECULAR) accumulator[pixelIdx] += (float4)(T * lightColor, 1);
continue;
}
float3 vert0 = v0.xyz, vert1 = verts[vertIdx + 1].xyz, vert2 = verts[vertIdx + 2].xyz;
@@ -209,7 +212,7 @@ void kernel Shade( global float4* accumulator,
uint newRayIdx = atomic_inc( &extendTasks );
float3 R = Reflect( D, N );
raysOut[newRayIdx].T = (float4)(T * diff, -1 /* mark vertex as specular */);
-raysOut[newRayIdx].O = (float4)(I + R * EPSILON, as_float( (pixelIdx << 4) + depth + 1 ));
+raysOut[newRayIdx].O = (float4)(I + R * EPSILON, as_float( (pixelIdx << 8) + ((depth + 1) << 4) + PATH_LAST_SPECULAR ));
raysOut[newRayIdx].D = (float4)(R, 1e30f);
continue;
}
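Taken together, the changed lines in `Generate` and `Shade` define how the path flags evolve: camera rays and specular bounces carry `PATH_LAST_SPECULAR`, diffuse bounces carry `PATH_VIA_DIFFUSE`, a light hit is accumulated only when `PATH_LAST_SPECULAR` is set, and a diffuse bounce is spawned only while `PATH_VIA_DIFFUSE` is clear and `depth < 3`. The emission rule presumably exists because direct light at diffuse vertices is already sampled by next event estimation; that reading is an inference from the commit messages, not something the diff states. A small C++ model of these transitions, with hypothetical names and an unpacked `PathInfo` struct in place of the packed word:

```cpp
#include <cassert>
#include <cstdint>

constexpr uint32_t PATH_LAST_SPECULAR = 1, PATH_VIA_DIFFUSE = 2;

// Hypothetical unpacked form of the O.w word.
struct PathInfo { uint32_t pixelIdx, depth, flags; };

// Generate: depth 0, flagged as 'last vertex specular'.
PathInfo CameraPath( uint32_t pixelIdx ) { return { pixelIdx, 0, PATH_LAST_SPECULAR }; }

// Shade, specular branch: depth + 1, flags replaced by PATH_LAST_SPECULAR.
PathInfo SpecularBounce( PathInfo p ) { return { p.pixelIdx, p.depth + 1, PATH_LAST_SPECULAR }; }

// Shade, diffuse branch: depth + 1, flags replaced by PATH_VIA_DIFFUSE.
PathInfo DiffuseBounce( PathInfo p ) { return { p.pixelIdx, p.depth + 1, PATH_VIA_DIFFUSE }; }

// Shade, light-source hit: emission is only added for these paths.
bool CountsEmission( PathInfo p ) { return (p.flags & PATH_LAST_SPECULAR) != 0; }

// Shade, condition guarding the diffuse bounce.
bool MayBounceDiffusely( PathInfo p ) { return p.depth < 3 && (p.flags & PATH_VIA_DIFFUSE) == 0; }
```

Note that, as written in the diff, each bounce replaces the flags rather than accumulating them, so `PATH_VIA_DIFFUSE` does not persist past a subsequent specular bounce.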
@@ -242,14 +245,14 @@ void kernel Shade( global float4* accumulator,
shadowOut[newShadowIdx].D = (float4)(L, dist - 2 * EPSILON);
}
// indirect illumination: diffuse bounce
-if (depth < 3)
+if (depth < 3 && (pathState & PATH_VIA_DIFFUSE) == 0 )
{
uint newRayIdx = atomic_inc( &extendTasks );
float3 R = CosWeightedDiffReflection( N, r2, r3 );
float PDF = dot( N, R ) * INVPI;
T *= dot( N, R ) * BRDF * native_recip( PDF );
raysOut[newRayIdx].T = (float4)(T, 1);
-raysOut[newRayIdx].O = (float4)(I + R * EPSILON, as_float( (pixelIdx << 4) + depth + 1 ));
+raysOut[newRayIdx].O = (float4)(I + R * EPSILON, as_float( (pixelIdx << 8) + ((depth + 1) << 4) + PATH_VIA_DIFFUSE ));
raysOut[newRayIdx].D = (float4)(R, 1e30f);
}
}
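The throughput update in the diffuse bounce multiplies by `dot( N, R ) * BRDF` and divides by a cosine-weighted pdf of `dot( N, R ) * INVPI`. The definition of `BRDF` is outside the shown hunks; assuming it is a Lambertian term of the form albedo / pi, the cosine and the pi cancel and the factor reduces to the albedo. A scalar C++ sketch of that arithmetic (the function name is hypothetical):

```cpp
#include <cassert>
#include <cmath>

constexpr float INVPI = 0.318309886f;

// Throughput factor for a cosine-weighted diffuse bounce.
// Assumption: Lambertian BRDF = albedo / pi (not confirmed by the diff).
float DiffuseBounceFactor( float cosTheta, float albedo )
{
    const float brdf = albedo * INVPI;  // assumed Lambertian BRDF
    const float pdf = cosTheta * INVPI; // cosine-weighted hemisphere pdf, as in the kernel
    return cosTheta * brdf / pdf;       // cosTheta and INVPI cancel: the result is albedo
}
```

For a `cosTheta` of exactly zero the expression is 0/0, so code that keeps the explicit division may want a guard for grazing samples.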