add _alloc_signal/_free_signal api #5264
Merged
Conversation
vpachkov pushed a commit to vpachkov/tinygrad that referenced this pull request on Jul 8, 2024
* add _alloc_signal/_free_signal api
* oops, revert this
* linter
ignaciosica added a commit to ignaciosica/tinygrad that referenced this pull request on Jul 9, 2024
commit 51882c8efcd033fb13b3c50f1a80d4be4672a99a Author: p4sscode <p4ssenger.developer@gmail.com> Date: Tue Jul 9 14:53:41 2024 -0300
Squashed commit of the following:
commit 1678199 Author: nimlgen <138685161+nimlgen@users.noreply.github.com> Date: Tue Jul 9 20:44:44 2024 +0300 add update_copy to hcq spec (tinygrad#5348) * add update_copy to hcq spec * fix amd
commit 9504db1 Author: chenyu <chenyu@fastmail.com> Date: Tue Jul 9 12:28:52 2024 -0400 remove the realize in _rebuild_tensor_v2 (tinygrad#5347) no longer needed
commit 1f5de80 Author: qazal <77887910+Qazalin@users.noreply.github.com> Date: Tue Jul 9 17:20:17 2024 +0300 multi reduce Tensor.var passing verify_lazyop (tinygrad#5346) * what about this * reset late gate
commit 3d45219 Author: kormann <49917710+DKormann@users.noreply.github.com> Date: Tue Jul 9 15:38:39 2024 +0200 [bug fix] nested commutative pattern _match [run_process_replay] [no_assert] (tinygrad#5340) * deep pat test * lint * min diff * min lines * nothing * is res extra * cleanup2 * add res back * reduce lines * type anno --------- Co-authored-by: qazal <qazal.software@gmail.com>
commit e815c57 Author: nimlgen <138685161+nimlgen@users.noreply.github.com> Date: Tue Jul 9 15:56:06 2024 +0300 use hcq_profile in nv/amd program (tinygrad#5344)
commit bee96a1 Author: qazal <77887910+Qazalin@users.noreply.github.com> Date: Tue Jul 9 15:24:56 2024 +0300 fuzz uop schedules (tinygrad#5345) * basic blocks + cleanups * fixups * elif is better for future me * fuzz_schedule_max_paths * fix linter
commit d5a68ae Author: Ian Paul <46036106+ianpaul10@users.noreply.github.com> Date: Tue Jul 9 10:48:42 2024 +0000 Simple abstractions3.py fix (tinygrad#5343) * abstractions3.py fix * Add abstractions3.py to CI tests
commit a2a9bfd Author: nimlgen <138685161+nimlgen@users.noreply.github.com> Date: Tue Jul 9 10:39:39 2024 +0300 nv correct error messages with ptx (tinygrad#5341) * nv correct error messages with ptx * return compile error
commit c13da83 Author: George Hotz <72895+geohot@users.noreply.github.com> Date: Mon Jul 8 21:23:19 2024 -0700 tests from lowerer branch (tinygrad#5339) * tests from lowerer branch * Update test_image_dtype.py * Update test_image_dtype.py * Update test_image_dtype.py
commit 4ceab5d Author: chenyu <chenyu@fastmail.com> Date: Mon Jul 8 22:25:03 2024 -0400 fix PTX match rule for gated LOAD (tinygrad#5338) * test padto sum with bool tensor and bool acc dtype make sure bool tensor acc with gate is handled correctly * broken in PTX * fix ptx
commit a80f2df Author: chenyu <chenyu@fastmail.com> Date: Mon Jul 8 21:33:05 2024 -0400 fix some PTX tests (tinygrad#5337) fix broken PTX tests in test_linearizer and test_uops. there are tests that were skipped and broken because it runs only with CUDA=1 and we run PTX with NV=1 now
commit 9150a6b Author: wozeparrot <wozeparrot@gmail.com> Date: Tue Jul 9 00:45:40 2024 +0000 tensor metadata (tinygrad#5271)
commit 7f642aa Author: chenyu <chenyu@fastmail.com> Date: Mon Jul 8 19:19:20 2024 -0400 minor PTX matcher cleanup [run_process_replay] (tinygrad#5336) * minor PTX matcher cleanup [run_process_replay] uop.cast syntatic sugar and some newline/space cleanup * comment
commit 0f09402 Author: chenyu <chenyu@fastmail.com> Date: Mon Jul 8 18:15:04 2024 -0400 fix Tensor.all and Tensor.any for PTX (tinygrad#5335) supported boolean acc and boolean phi. and rewrite boolean max to uint8 max
commit 053c706 Author: Roelof van Dijk <3604013+roelofvandijk@users.noreply.github.com> Date: Mon Jul 8 20:47:34 2024 +0200 refactor: expr_view on View (tinygrad#5315)
commit 2349d83 Author: kormann <49917710+DKormann@users.noreply.github.com> Date: Mon Jul 8 20:46:15 2024 +0200 Fix scope order in graph toposort [run_process_replay] (tinygrad#5330) * fix * test * nothing
commit 631bc97 Author: chenyu <chenyu@fastmail.com> Date: Mon Jul 8 14:00:28 2024 -0400 raise line count limit to 8500 (tinygrad#5331)
commit bb77469 Author: Timmy <96938750+0xtimmy@users.noreply.github.com> Date: Mon Jul 8 10:28:55 2024 -0700 multireduce scheduler tests (tinygrad#5141) Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
commit bb2222e Author: nimlgen <138685161+nimlgen@users.noreply.github.com> Date: Mon Jul 8 19:01:27 2024 +0300 nv default for ampere & ada (tinygrad#5329)
commit 51d6f37 Author: nimlgen <138685161+nimlgen@users.noreply.github.com> Date: Mon Jul 8 18:25:05 2024 +0300 nv get classes based on device (tinygrad#5325) * nv get classes * support in mockgpu * choose sm based on gpu * fix * fix * fix arch
commit 7d049fc Author: chenyu <chenyu@fastmail.com> Date: Mon Jul 8 10:51:56 2024 -0400 move getting 0 and min value of a dtype to dtype.py (tinygrad#5328) cleanup getting base case for reduce ops [run_process_replay]
commit b0c5c58 Author: nimlgen <138685161+nimlgen@users.noreply.github.com> Date: Mon Jul 8 17:24:33 2024 +0300 nv rm_control to rmctrl type (tinygrad#5327) * nv rm_control to rmctrl type * fix
commit 73bddc4 Author: Elias Wahl <82230675+Eliulm@users.noreply.github.com> Date: Mon Jul 8 15:07:44 2024 +0200 Fix fake dataloader (tinygrad#5326)
commit 43ec8d7ed8dfd420af72875730232a5b2faf3d6b Author: p4sscode <p4ssenger.developer@gmail.com> Date: Tue Jul 9 14:44:59 2024 -0300 remove comments
commit d1874a53b51859f5d74a7919773c99a535e89c44 Author: p4sscode <p4ssenger.developer@gmail.com> Date: Mon Jul 8 14:13:29 2024 -0300 remove unnecessary code
commit ef5497f141c2f5b48d89707cc77b1b6b02587105 Author: p4sscode <p4ssenger.developer@gmail.com> Date: Mon Jul 8 13:27:50 2024 -0300 remove simple_relu and env
commit 61be81b3fa1d57ea612a97c72e53430f369bf59f Author: p4sscode <p4ssenger.developer@gmail.com> Date: Mon Jul 8 13:26:37 2024 -0300 working with accs
commit 8803069c0459a276d373d4c0c0feeb0b3b057761 Author: p4sscode <p4ssenger.developer@gmail.com> Date: Mon Jul 8 11:02:05 2024 -0300 make amx an OptOps
commit 49119c2b62d89eaa0a2825714b44b549f7145edb Merge: 97faa1c3 6856f91 Author: p4sscode <p4ssenger.developer@gmail.com> Date: Sun Jul 7 15:55:03 2024 -0300 merge master
commit 6856f91 Author: chenyu <chenyu@fastmail.com> Date: Sun Jul 7 14:36:00 2024 -0400 Tensor.any and Tensor.all (tinygrad#5320) does not work in ptx yet due to how boolean tensor is handled
commit 2029cb7 Author: chenyu <chenyu@fastmail.com> Date: Sun Jul 7 13:04:22 2024 -0400 support passing None to Tensor.clip (tinygrad#5319) passing None for no upper bound or no lower bound
commit 296a1a3 Author: chenyu <chenyu@fastmail.com> Date: Sun Jul 7 12:10:39 2024 -0400 update Tensor.round doc and example (tinygrad#5318) document rounding half to even and update examples to show
commit c1e330f Author: chenyu <chenyu@fastmail.com> Date: Sun Jul 7 11:52:58 2024 -0400 Tensor.int and Tensor.bool (tinygrad#5317)
commit 778d1cd Author: nimlgen <138685161+nimlgen@users.noreply.github.com> Date: Sun Jul 7 17:34:49 2024 +0300 nv allocate local memory dynamically (tinygrad#5277) * nv allocate local memory dynamically * fix * linter * linter 2 * linter * fixes
commit ae10e93 Author: qazal <77887910+Qazalin@users.noreply.github.com> Date: Sun Jul 7 10:49:08 2024 +0300 UOps.VECTORIZE cleanups [run_process_replay] (tinygrad#5314) * still render_cast * one extra line ok * these are all just vectorize * save space * behavior change can go in a different diff
commit 77b2ce9 Author: greg-niemeyer <152219575+greg-niemeyer@users.noreply.github.com> Date: Sat Jul 6 23:59:57 2024 -0700 Add UOps.VECTORIZE [run_process_replay] (tinygrad#5289) * Add UOps.VECTORIZE to core * Update vectorized cast tests * Addresses code review comments - Removes VECTORIZE from LLVMRenderer - Add line breaks to unduly long lines - Add noop CAST rule back - Update asserts and add render_vectorize in CSytleLanguage renderer * Add missing const folding rule for VECTORIZE Also adds corresponding test * Fixes test_const_vectorize_fold and add assert - Use sane types with VECTORIZE in test_const_vectorize_fold - Add assert that sanity checks the types for VECTORIZE * Rename test_cast_vectorized_fold Renames test_cast_vectorized_fold to test_noop_vectorize_fold because the test targets a very specific rule and there are other tests for VECTORIZE. * Revert unrelated changes --------- Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com> Co-authored-by: qazal <qazal.software@gmail.com>
commit 2a7282c Author: qazal <77887910+Qazalin@users.noreply.github.com> Date: Sun Jul 7 09:12:49 2024 +0300 test: delete the extra cast in cstyle load [run_process_replay] [no_assert] (tinygrad#5310) * test: delete the extra cast in cstyle load [run_process_replay] [no_assert] * assert buf_uop * ImageDType * ptx is actually a 64bit address
commit cededd8 Author: chenyu <chenyu@fastmail.com> Date: Sat Jul 6 21:55:59 2024 -0400 minor multi cleanup (tinygrad#5311) add type, move around and some newlines
commit 8a99514 Author: qazal <77887910+Qazalin@users.noreply.github.com> Date: Sun Jul 7 00:06:30 2024 +0300 generalize the uops toposort spec to ptx (tinygrad#5309) * generalize spec to ptx * redundant assert * extra print
commit ca0ef17 Author: chenyu <chenyu@fastmail.com> Date: Sat Jul 6 12:47:27 2024 -0400 use precise::sin in metal (tinygrad#5307)
commit 5c2ca7b Author: qazal <77887910+Qazalin@users.noreply.github.com> Date: Sat Jul 6 19:28:47 2024 +0300 remove UOps.SPECIAL rendering from llvm (tinygrad#5306)
commit 356e5d2 Author: chenyu <chenyu@fastmail.com> Date: Sat Jul 6 11:54:12 2024 -0400 touchup multi dtype in elementwise (tinygrad#5305) only need to check real once, also added type annotation
commit 7ddda9f Author: qazal <77887910+Qazalin@users.noreply.github.com> Date: Sat Jul 6 14:13:58 2024 +0300 hotfix: cache seen graphs in fusion (tinygrad#5302)
commit 11dfb19 Author: qazal <77887910+Qazalin@users.noreply.github.com> Date: Sat Jul 6 12:39:31 2024 +0300 track seen graphs in recursive group (tinygrad#5301) * track seen * maybe never add realized * ahh it needs to track sts * delete extra check * cache typings * minor cleanup
commit d813617 Author: qazal <77887910+Qazalin@users.noreply.github.com> Date: Sat Jul 6 12:04:03 2024 +0300 prescheduling refactor (tinygrad#5300) * p1 * refactor tuple
commit c1e166c Author: qazal <77887910+Qazalin@users.noreply.github.com> Date: Sat Jul 6 11:36:40 2024 +0300 fix dtype mismatch for bool ops in multi (tinygrad#5299)
commit fc03fc0 Author: chenyu <chenyu@fastmail.com> Date: Fri Jul 5 14:52:09 2024 -0400 enable sin on METAL in test_dtype_alu (tinygrad#5298)
commit b369e75 Author: qazal <77887910+Qazalin@users.noreply.github.com> Date: Fri Jul 5 21:14:38 2024 +0300 refactor schedule creation (tinygrad#5297)
commit 5292d37 Author: qazal <77887910+Qazalin@users.noreply.github.com> Date: Fri Jul 5 19:43:50 2024 +0300 LoadOps.VIEW in the scheduler spec (tinygrad#5296) * refactor to allow_buffer_view * tests * fix multi
commit 1ab7a4c Author: hikettei <88639579+hikettei@users.noreply.github.com> Date: Sat Jul 6 01:16:44 2024 +0900 Handling Multiple UnaryOps.BITCAST in Function for Proper Kernel Fusion [run_process_replay] (tinygrad#5172) * [Patch] added an option not to ignore view replacing when doing bitcast * added the testcase * [Add] reproduced bitcast cannot be fused into a single kernel in the unittest --------- Co-authored-by: qazal <77887910+Qazalin@users.noreply.github.com>
commit 43c3f73 Author: chenyu <chenyu@fastmail.com> Date: Fri Jul 5 11:01:20 2024 -0400 handcode_bert_opt.py (tinygrad#5295) similar to handcode_resnet50_opt.py, one file to check bert kernels without dataset.
commit d7835a7 Author: nimlgen <138685161+nimlgen@users.noreply.github.com> Date: Fri Jul 5 16:53:40 2024 +0300 hotfix: fix metal with vars (tinygrad#5294) * hotfix: fix metal with vars * one more place
commit 8a548b0 Author: nimlgen <138685161+nimlgen@users.noreply.github.com> Date: Fri Jul 5 16:13:05 2024 +0300 metal support offset (tinygrad#5293)
commit 1cefbb3 Author: qazal <77887910+Qazalin@users.noreply.github.com> Date: Fri Jul 5 13:00:01 2024 +0300 uop graph tests + type_verify cleanup (tinygrad#5292) * test_cast_alu_fold * test_double_cast_fold + these should assert
commit 341c4a2 Author: qazal <77887910+Qazalin@users.noreply.github.com> Date: Fri Jul 5 11:29:35 2024 +0300 hotfix: use dtype.scalar() for rendering cast [run_process_replay] [no_assert] (tinygrad#5290)
commit 87d27c4 Author: chenyu <chenyu@fastmail.com> Date: Thu Jul 4 14:25:24 2024 -0400 minor _broadcast cleanup (tinygrad#5286) `any(x==0 for x in y)` is `0 in y`. also `get_args(ConstType)` instead of hard coded `float, int, bool`
commit 8c03816 Author: SnakeOnex <romakcz@gmail.com> Date: Thu Jul 4 17:15:07 2024 +0200 fix README example (tinygrad#5284) * fixed README example * README test * changed py -> python markdown code flags in REAME
commit 2778b60 Author: nimlgen <138685161+nimlgen@users.noreply.github.com> Date: Thu Jul 4 18:06:04 2024 +0300 new memory scheduler (tinygrad#5278) * new memory schedule algo * works * fix * fix * linter * tiny fixes * do not optimize copy buffers * mpre comments * tiny cleanups
commit 84b3e3b Author: nimlgen <138685161+nimlgen@users.noreply.github.com> Date: Thu Jul 4 13:29:21 2024 +0300 hcq exec no embedded signal (tinygrad#5142)
commit 0c3a35e Author: Tobias Fischer <tobiasfischer17@gmail.com> Date: Wed Jul 3 22:47:10 2024 -0400 Stable Diffusion v2 Inference (tinygrad#5283) * model implementation * clip fix, more qol options
commit e5ba385 Author: chenyu <chenyu@fastmail.com> Date: Wed Jul 3 19:42:56 2024 -0400 remove first contiguous in multi from_sharded (tinygrad#5121) second contiguous guarantees lbs are contiguous going into MultiLazyBuffer, don't need the first contiguous
commit f1ff65e Author: chenyu <chenyu@fastmail.com> Date: Wed Jul 3 17:52:50 2024 -0400 remove "no-nans-fp-math"="true" for LLVM (tinygrad#5282) fixed isnan for llvm (still have issue with < nan)
commit 97faa1c307ee9322a88de83cf6c2d2ae8918db56 Author: p4sscode <p4ssenger.developer@gmail.com> Date: Wed Jul 3 16:01:58 2024 -0300 minor changes
commit 3929a9d Author: chenyu <chenyu@fastmail.com> Date: Wed Jul 3 14:59:05 2024 -0400 fix UOp.cmp_tuple for ALU (tinygrad#5280) * fix UOp.cmp_tuple for ALU for ALU, use self.arg instead of self.op to compare * skip that?
commit a9d6a6c Author: qazal <77887910+Qazalin@users.noreply.github.com> Date: Wed Jul 3 20:15:42 2024 +0300 verify_lazyop with multi reduce (tinygrad#5276) * outsource the assert to the implicit movement op check * tests
commit 16e3b8b Author: George Hotz <72895+geohot@users.noreply.github.com> Date: Wed Jul 3 09:40:00 2024 -0700 uops work from lowerer [run_process_replay] (tinygrad#5279)
commit 622b7bd Author: chenyu <chenyu@fastmail.com> Date: Wed Jul 3 12:28:53 2024 -0400 simpler TinyJit inside TinyJit detection (tinygrad#5219) * simpler TinyJit inside TinyJit detection suggested in tinygrad@73395b9#commitcomment-143660402 * cannot repro... * clear the way out * finally clear
commit 04ef0fd Author: gip <gip@users.noreply.github.com> Date: Wed Jul 3 09:07:09 2024 -0700 fix: message when applegpu tools missiong (tinygrad#5236)
commit d3e244d Author: reddyn12 <72528507+reddyn12@users.noreply.github.com> Date: Wed Jul 3 12:06:01 2024 -0400 prev speed improvements (tinygrad#5252) Co-authored-by: reddyn <nikidsniper@gmail.com>
commit 8c90b4d4b0f4d0ba97d3e9ee12d3253523e69bca Author: p4sscode <p4ssenger.developer@gmail.com> Date: Wed Jul 3 11:25:30 2024 -0300 vec is done before store
commit 21d41f0 Author: nimlgen <138685161+nimlgen@users.noreply.github.com> Date: Wed Jul 3 11:34:10 2024 +0300 nv follows HCQCompatAllocRes protocol (tinygrad#5275) * nv follows HCQCompatAllocRes protocol * fix amd
commit d3e4e21 Author: Vyacheslav Pachkov <slava.pach@gmail.com> Date: Wed Jul 3 10:25:44 2024 +0300 add return type for HCQCompatAllocator _alloc (tinygrad#5267) Co-authored-by: nimlgen <138685161+nimlgen@users.noreply.github.com>
commit 191463a Author: chenyu <chenyu@fastmail.com> Date: Tue Jul 2 23:29:54 2024 -0400 add timing to SDXL (tinygrad#5273)
commit b2c3a28 Author: chenyu <chenyu@fastmail.com> Date: Tue Jul 2 21:39:01 2024 -0400 nn.RMSNorm (tinygrad#5272) the norm itself has no significant value to add to Tensor method, but we would want Tensor.normalize
commit 9a2a82a Author: chenyu <chenyu@fastmail.com> Date: Tue Jul 2 21:37:52 2024 -0400 test stable diffusion unet in ci (tinygrad#5268) unet is parameterized now so can test a smaller one is ci
commit ce52b10 Author: chenyu <chenyu@fastmail.com> Date: Tue Jul 2 20:01:11 2024 -0400 add a flag DISABLE_LOOP_COLLAPSE (tinygrad#5270) workaround if user encountered UNMUL error
commit e53b164 Author: George Hotz <72895+geohot@users.noreply.github.com> Date: Tue Jul 2 15:03:54 2024 -0700 small changes from lowerer (tinygrad#5266)
commit 7be776f Author: nimlgen <138685161+nimlgen@users.noreply.github.com> Date: Tue Jul 2 23:35:39 2024 +0300 add _alloc_signal/_free_signal to hcq (tinygrad#5264) * add _alloc_signal/_free_signal api * oops, revert this * linter
commit 9a25ee0 Author: Tobias Fischer <tobiasfischer17@gmail.com> Date: Tue Jul 2 12:40:27 2024 -0400 pixed unet call params (tinygrad#5262)
commit b4cb32bac18b7ca2426e243c40fc90fe90b014ad Author: p4sscode <p4ssenger.developer@gmail.com> Date: Tue Jul 2 10:41:20 2024 -0300 squash merge amx_support for third revision
commit 59bc837 Author: qazal <77887910+Qazalin@users.noreply.github.com> Date: Tue Jul 2 15:13:10 2024 +0300 refactor gated load rendering [run_process_replay] (tinygrad#5259) * refactor gated load rendering [run_process_replay] * hotfix: extra line * remove llvm diff
commit e050603 Author: nimlgen <138685161+nimlgen@users.noreply.github.com> Date: Tue Jul 2 13:57:46 2024 +0300 nv close fds after mapping (tinygrad#5246)
commit d3cfb6c Author: qazal <77887910+Qazalin@users.noreply.github.com> Date: Tue Jul 2 13:48:47 2024 +0300 refactor UOps.LOAD barrier [run_process_replay] (tinygrad#5258)
commit a1044e6 Author: qazal <77887910+Qazalin@users.noreply.github.com> Date: Tue Jul 2 09:21:09 2024 +0300 iterate over scoped uops once [run_process_replay] (tinygrad#5255)
commit dfbee4f Author: wozeparrot <wozeparrot@gmail.com> Date: Tue Jul 2 02:33:58 2024 +0000 feat: add blobfile to testing (tinygrad#5254)
commit 8c9c1cf Author: Tobias Fischer <tobiasfischer17@gmail.com> Date: Mon Jul 1 22:33:01 2024 -0400 Pulled CLIP and UNet into Seperate Files (tinygrad#5253) * pulled clip and unet into seperate files * reference cleanup, lru cache fix * better pool indexing
commit 5808c37 Author: chenyu <chenyu@fastmail.com> Date: Mon Jul 1 15:00:47 2024 -0400 hotfix disable flaky llama3 beam benchmark on green (tinygrad#5249)
commit b9122ec Author: chenyu <chenyu@fastmail.com> Date: Mon Jul 1 14:43:47 2024 -0400 revert stable diffusion validation with threefry (tinygrad#5248) * Revert "use threefry in stable diffusion benchmark (tinygrad#4988)" This reverts commit 44dfa37. * sdxl and validation fix * relax threshold
No description provided.
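Since the PR carries no description, here is a minimal, hypothetical sketch of the pooled alloc/free pattern that an `_alloc_signal`/`_free_signal` pair suggests. Only the two method names come from the PR title; the class, fields, and helper methods below are invented for illustration and are not tinygrad's actual HCQ implementation.

```python
# Illustrative only: a toy signal pool showing an alloc/free pattern.
# Everything except the _alloc_signal/_free_signal names is an assumption.
import array

class ToySignalPool:
    def __init__(self, capacity: int = 16):
        # one 64-bit counter per signal slot, standing in for device-visible memory
        self.values = array.array("Q", [0] * capacity)
        self.free_slots = list(range(capacity))  # slot indices available for reuse

    def _alloc_signal(self, value: int = 0) -> int:
        # hand out a slot, resetting it to the requested starting value
        if not self.free_slots:
            raise RuntimeError("signal pool exhausted")
        idx = self.free_slots.pop()
        self.values[idx] = value
        return idx

    def _free_signal(self, idx: int) -> None:
        # return a slot to the pool so later allocations can reuse it
        self.free_slots.append(idx)

    def signal(self, idx: int, value: int) -> None:
        # what a completed command would do: bump the counter
        self.values[idx] = value

    def wait(self, idx: int, value: int) -> bool:
        # what the host would poll: has the counter reached the target yet?
        return self.values[idx] >= value

if __name__ == "__main__":
    pool = ToySignalPool()
    sig = pool._alloc_signal()
    pool.signal(sig, 1)      # pretend a queued copy/compute just finished
    assert pool.wait(sig, 1)
    pool._free_signal(sig)   # slot goes back to the pool for reuse
```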