Insights: pytorch/FBGEMM
Overview
- 0 Active issues
- 0 Merged pull requests
- 19 Open pull requests
- 0 Closed issues
- 0 New issues
1 Release published by 1 person
- v1.1.0: FBGEMM_GPU v1.1.0 Release Notes (published Jan 29, 2025)
19 Pull requests opened by 8 people
- Fix handling of dynamic FP8 grouped gemm on Nvidia (#3616, opened Jan 26, 2025)
- Performance Optimization: Optimized TileShape Configuration for f8 (#3617, opened Jan 27, 2025)
- Re-land D67407935 (Optimized backward pass for ROCm devices, pt 2) (#3619, opened Jan 27, 2025)
- Partial revert of D66986498 (#3620, opened Jan 27, 2025)
- finish #1808 cherry-pick, adjust interface (#3627, opened Jan 28, 2025)
- Update bf16i4 gemm with new cutlass version (#3630, opened Jan 29, 2025)
- avoid using warning tensor in cpu tbe op (#3631, opened Jan 29, 2025)
- k_norm in rope for fp8 kv cache (#3633, opened Jan 29, 2025)
- Partial revert of D66986498 (Optimized backward pass for ROCm devices, pt 1), 2nd attempt (#3637, opened Jan 29, 2025)
- Adding Missing includes and explicitly declaring Tensor in aten namespace. (#3638, opened Jan 30, 2025)
- Fix zero_start_index_M argument for triton rowwise quantize (#3639, opened Jan 30, 2025)
- Re-organize SLL ops, pt 4 (#3644, opened Jan 30, 2025)
- Adding couple more APIs to KVTensorWrapper to bring parity with torch::Tensor (#3645, opened Jan 31, 2025)
- Re-organize SLL ops, pt 5 (#3646, opened Jan 31, 2025)
- Re-organize SLL ops, pt 6 (#3647, opened Jan 31, 2025)
- Add preprocess stage to quantize bench operators (#3648, opened Jan 31, 2025)
- Re-organize SLL ops, pt 7 (#3650, opened Jan 31, 2025)
- Add tracing option to quantize bench (#3651, opened Jan 31, 2025)
- Improve FP8 grouped GEMM perf via tileshape and cooperative (#3653, opened Feb 2, 2025)
26 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
- Use int64_t for buffer indices to avoid overflow (#1896, commented on Jan 27, 2025; 0 new comments)
- Fixing type comparing errors from Lint (#1911, commented on Jan 27, 2025; 0 new comments)
- Debug test_jagged_dense_bmm CI errors (#1928, commented on Jan 27, 2025; 0 new comments)
- Print CMake commands (#1949, commented on Jan 27, 2025; 0 new comments)
- Use Nova workflow to host all published wheel files at PyTorch site (#1958, commented on Jan 27, 2025; 0 new comments)
- Revert diff that fails test (#1995, commented on Jan 27, 2025; 0 new comments)
- Fix hypothesis version (#1996, commented on Jan 27, 2025; 0 new comments)
- Revert skipping test_pack_segments (#2037, commented on Jan 27, 2025; 0 new comments)
- Add device parameter to DenseTableBatchedEmbeddingBagsCodegen (#2054, commented on Jan 27, 2025; 0 new comments)
- Unify TBE API for key FBGEMM operations (Frontend) (#2081, commented on Jan 27, 2025; 0 new comments)
- Skip InputCombineTest for ROCm on OSS CI (#2229, commented on Jan 27, 2025; 0 new comments)
- Add header to fix PRIu64 error (#2234, commented on Jan 27, 2025; 0 new comments)
- ensemble_rowwise_adagrad (#2871, commented on Jan 27, 2025; 0 new comments)
- Remove SSD from OSS (#3043, commented on Jan 27, 2025; 0 new comments)
- build cuda-only (#3061, commented on Jan 27, 2025; 0 new comments)
- Test build time for sm 8.0 only (#3068, commented on Jan 27, 2025; 0 new comments)
- Fix FBGEMM CI strict aliasing and unused var errors (#3083, commented on Jan 27, 2025; 0 new comments)
- Add NEON and SVE implementations for Float16 conversions (#3424, commented on Jan 30, 2025; 0 new comments)
- Remove numpy requirement (#3470, commented on Jan 27, 2025; 0 new comments)
- sparse_ema_options (fbgemm) (#3473, commented on Jan 27, 2025; 0 new comments)
- Support FP8 grouped GEMM with rowwise scaling (#3560, commented on Feb 2, 2025; 0 new comments)
- Refactor FP8 grouped GEMM with dynamic and static versions (#3561, commented on Feb 2, 2025; 0 new comments)
- Unifying TBE API using List (Backend) (#3563, commented on Jan 27, 2025; 0 new comments)
- AdagradW (#3605, commented on Feb 2, 2025; 0 new comments)
- Port oss f16_fast_gemv into fbcode (#3610, commented on Feb 1, 2025; 0 new comments)
- Updating split_table_batched_embeddings_ops_training.py (#3613, commented on Jan 31, 2025; 0 new comments)