Commit
Merge remote-tracking branch 'origin/master'
AntGro committed Jan 8, 2025
2 parents 9116742 + e01c657 commit 033a768
Showing 6 changed files with 96 additions and 4 deletions.
2 changes: 1 addition & 1 deletion HEBO/hebo/__init__.py
@@ -14,4 +14,4 @@
from . import optimizers
from . import sklearn_tuner

__version__ = "0.3.5"
__version__ = "0.3.6"
2 changes: 1 addition & 1 deletion HEBO/setup.py
@@ -18,7 +18,7 @@

setuptools.setup(
name = 'HEBO',
version = '0.3.5', # also needs to be changed in hebo/__init__.py
version = '0.3.6', # also needs to be changed in hebo/__init__.py
packages = setuptools.find_packages(),
description = 'Heteroscedastic evolutionary bayesian optimisation',
long_description = long_description,
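The comment in the hunk above notes that the version string must be updated in two places (`setup.py` and `hebo/__init__.py`). A common pattern to avoid that duplication is to parse `__version__` out of the package's `__init__.py` at build time. The sketch below is hypothetical and is not how HEBO's `setup.py` is actually written; `read_version` is an invented helper:

```python
# Hypothetical sketch: single-source the version by reading it from the
# package __init__.py, so bumping it in one place is enough.
# (Not how HEBO's setup.py currently works.)
import re

def read_version(init_path="hebo/__init__.py"):
    """Extract the __version__ string from a package __init__ file."""
    with open(init_path) as f:
        match = re.search(r'__version__\s*=\s*["\']([^"\']+)["\']', f.read())
    if match is None:
        raise RuntimeError(f"No __version__ found in {init_path}")
    return match.group(1)
```

With this, `setuptools.setup(version=read_version(), ...)` would stay in sync with the package automatically.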
8 changes: 8 additions & 0 deletions README.md
@@ -20,6 +20,7 @@ Huawei, Noah's Ark Lab.
- [SparsePO: Controlling Preference Alignment of LLMs via Sparse Token Masks](./SparsePO)
- Generative Model Research
- [EM-LLM: Human-like Episodic Memory for Infinite Context LLMs](./EM-LLM)
- [Mixture of Attentions For Speculative Decoding](https://github.com/huawei-noah/HEBO/tree/mixture-of-attentions/)

Further instructions are provided in the README files associated with each project.

@@ -340,3 +341,10 @@ Code associated with our EM-LLM paper: [[arXiv]](https://arxiv.org/abs/2407.0945
#### Abstract

Large language models (LLMs) have shown remarkable capabilities, but still struggle with processing extensive contexts, limiting their ability to maintain coherence and accuracy over long sequences. In contrast, the human brain excels at organising and retrieving episodic experiences across vast temporal scales, spanning a lifetime. In this work, we introduce EM-LLM, a novel approach that integrates key aspects of human episodic memory and event cognition into LLMs with no fine-tuning, enabling them to handle practically infinite context lengths while maintaining computational efficiency. EM-LLM organises sequences of tokens into coherent episodic events using a combination of Bayesian surprise and graph-theoretic boundary refinement in an online fashion. When needed, these events are retrieved through a two-stage memory process, combining similarity-based and temporally contiguous retrieval for efficient and human-like access to relevant information. Experiments on the LongBench and $\infty$-Bench benchmarks demonstrate EM-LLM's superior performance, consistently outperforming the state-of-the-art retrieval model InfLLM across various baseline LLMs. In addition, EM-LLM outperforms its popular counterpart, RAG, in a wide range of tasks, while requiring similar resources. Notably, EM-LLM's performance even surpasses full-context models in most tasks, while successfully performing retrieval across 5 million tokens -- a scale computationally infeasible for such models. Finally, our analysis reveals strong correlations between EM-LLM's event segmentation and human-perceived events, suggesting a bridge between this artificial system and its biological counterpart, thereby offering a novel computational framework for exploring human memory mechanisms.
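The abstract above describes segmenting token streams into episodic events using Bayesian surprise. The following is a toy illustration of that idea only, not the paper's implementation: the function name `segment_by_surprise`, the running mean-plus-`gamma`-standard-deviations threshold, and the input format are all simplifying assumptions.

```python
# Toy sketch of surprise-based event segmentation (illustrative only).
# A token's surprise is -log p(token | context); a new event boundary is
# placed where surprise exceeds a running threshold of mean + gamma * std.
import math

def segment_by_surprise(token_logprobs, gamma=1.0):
    """Return indices where a new event starts, given per-token log-probs."""
    surprises = [-lp for lp in token_logprobs]
    boundaries = [0]  # the stream always starts a first event
    for i in range(1, len(surprises)):
        window = surprises[:i]
        mean = sum(window) / len(window)
        var = sum((s - mean) ** 2 for s in window) / len(window)
        if surprises[i] > mean + gamma * math.sqrt(var):
            boundaries.append(i)
    return boundaries
```

A sharp drop in token probability (a surprise spike) thus opens a new event, loosely mirroring how the paper detects event boundaries online before refining them with graph-theoretic criteria.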

## [Mixture of Attentions for Speculative Decoding](https://github.com/huawei-noah/HEBO/tree/mixture-of-attentions/)

#### Abstract

The growth in the number of parameters of Large Language Models (LLMs) has led to a significant surge in computational requirements, making them challenging and costly to deploy. Speculative decoding (SD) leverages smaller models to efficiently propose future tokens, which are then verified by the LLM in parallel. Small models that utilise activations from the LLM currently achieve the fastest decoding speeds. However, we identify several limitations of SD models including the lack of on-policyness during training and partial observability. To address these shortcomings, we propose a more grounded architecture for small models by introducing a Mixture of Attentions for SD. Our novel architecture can be applied in two scenarios: a conventional single device deployment and a novel client-server deployment where the small model is hosted on a consumer device and the LLM on a server. In a single-device scenario, we demonstrate state-of-the-art speedups improving EAGLE-2 by 9.5% and its acceptance length by 25%. In a client-server setting, our experiments demonstrate: 1) state-of-the-art latencies with minimal calls to the server for different network conditions, and 2) in the event of a complete disconnection, our approach can maintain higher accuracy compared to other SD methods and demonstrates advantages over API calls to LLMs, which would otherwise be unable to continue the generation process.
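The abstract describes the speculative decoding loop: a small draft model proposes several future tokens, which the target LLM then verifies. The sketch below is a greedy toy version of that loop, not the Mixture of Attentions architecture; `speculative_step` and both model callables are hypothetical, and production SD verifies all proposals in a single parallel forward pass with rejection sampling rather than greedy matching.

```python
# Greedy toy sketch of one speculative-decoding step (illustrative only).
def speculative_step(draft_propose, target_argmax, context, k=4):
    """Draft model proposes k tokens; keep the longest prefix the target
    model agrees with, plus one token chosen by the target itself."""
    proposal = draft_propose(context, k)
    accepted = []
    for tok in proposal:
        # Target's greedy prediction given the context plus accepted prefix.
        t = target_argmax(context + accepted)
        if t == tok:
            accepted.append(tok)
        else:
            accepted.append(t)  # first disagreement: take the target's token
            break
    else:
        # All k proposals accepted: the target contributes a bonus token.
        accepted.append(target_argmax(context + accepted))
    return accepted
```

The speedup comes from the target model validating several draft tokens per forward pass instead of generating one token at a time; the acceptance length mentioned in the abstract is the average size of the accepted prefix.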

22 changes: 22 additions & 0 deletions SIMMER/LICENSE
@@ -0,0 +1,22 @@
MIT License

Copyright (c) 2022. Huawei Technologies Co., Ltd.

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

4 changes: 2 additions & 2 deletions SIMMER/README.md
@@ -1,4 +1,4 @@
# Saut\'e and Simmer {RL}: Safe Reinforcement Learning Using Safety State Augmentation
# Sauté and Simmer RL: Safe Reinforcement Learning Using Safety State Augmentation

### Sauté RL: Almost Surely Safe RL Using State Augmentation

@@ -39,7 +39,7 @@ conda env create -f sauterl.yml
conda activate sauterl
```

Our implementation is based on the OpenAI safety starter agents. To install the OpenAI libraries, run the following commands:
Our implementation is based on the OpenAI safety starter agents (distributed under the MIT license). To install the OpenAI libraries, run the following commands:

```console
mkdir safe-rl
62 changes: 62 additions & 0 deletions SIMMER/THIRD PARTY OPEN SOURCE SOFTWARE NOTICE.txt
@@ -0,0 +1,62 @@
THIRD PARTY OPEN SOURCE SOFTWARE NOTICE

Please note we provide an open source software notice for the third party open source software along with this software
and/or this software component contributed by Huawei (in the following just “this SOFTWARE”).
The open source software licenses are granted by the respective right holders.

Warranty Disclaimer
THE OPEN SOURCE SOFTWARE IN THIS SOFTWARE IS DISTRIBUTED IN THE HOPE THAT IT WILL BE USEFUL,
BUT WITHOUT ANY WARRANTY, WITHOUT EVEN THE IMPLIED WARRANTY OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
SEE THE APPLICABLE LICENSES FOR MORE DETAILS.

------------------------------------------------------------------------------------------------------------------------

Copyright Notice and License Texts

Software: Safety starter agents (https://github.com/openai/safety-starter-agents)

MIT License

Copyright (c) 2019 OpenAI

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

Software: Safety gym (https://github.com/openai/safety-gym)

MIT License

Copyright (c) 2019 OpenAI

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
