@dll-wu @SJTU-LIT
- China
Stars
Implementation of the paper: "AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?"
ScreenAgent: A Computer Control Agent Driven by Visual Language Large Model (IJCAI-24)
Official repo for the paper "DigiRL: Training In-The-Wild Device-Control Agents with Autonomous Reinforcement Learning".
Code for the paper 🌳 Tree Search for Language Model Agents
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
A project-structure-aware autonomous software engineer aiming for autonomous program improvement. Resolved 30.67% of tasks (pass@1) in SWE-bench lite and 38.40% of tasks (pass@1) in SWE-bench verified wi…
VisualWebArena is a benchmark for multimodal agents.
[ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large multimodal models (LMMs) such as GPT-4V(ision).
An Analytical Evaluation Board of Multi-turn LLM Agents
This is the official implementation of the paper: "Contrastive Learning of Sentence Embeddings from Scratch"
Official GitHub repo for C-Eval, a Chinese evaluation suite for foundation models [NeurIPS 2023]
ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator
Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration (CVPR 2019 Oral)
A Complete PyTorch 1.0 Implementation of Gated Graph Sequence Neural Networks (GGNN)
Object detection, 3D detection, and pose estimation using center point detection.
Implementation for our paper "Conditional Image-Text Embedding Networks"
Implementation of "Grounding of Textual Phrases in Images by Reconstruction" in TensorFlow
Implementation of "Knowledge Aided Consistency for Weakly Supervised Phrase Grounding" in TensorFlow