🥯 Everything


Pinned

  1. SORRY-Bench/sorry-bench

    SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal Behaviors

    Jupyter Notebook · 33 stars

  2. LLM-Tuning-Safety/LLMs-Finetuning-Safety

    We jailbreak GPT-3.5 Turbo’s safety guardrails by fine-tuning it on only 10 adversarially designed examples, at a cost of less than $0.20 via OpenAI’s APIs.

    Python · 239 stars · 28 forks

  3. backdoor-toolbox

    A compact toolbox for backdoor attacks and defenses.

    Python · 144 stars · 18 forks

  4. Unispac/Subnet-Replacement-Attack

    Official implementation of "Towards Practical Deployment-Stage Backdoor Attack on Deep Neural Networks" (CVPR 2022 Oral).

    Jupyter Notebook · 26 stars · 7 forks

  5. Unispac/Fight-Poison-With-Poison

    Code repository for the paper "Towards A Proactive ML Approach for Detecting Backdoor Poison Samples" (USENIX Security 2023).

    Python · 22 stars · 2 forks

  6. ain-soph/trojanzoo

    TrojanZoo provides a universal PyTorch platform for conducting security research (especially backdoor attacks/defenses) on image classification in deep learning.

    Python · 274 stars · 63 forks