First draft at adding persistent memory via sqlite3 #124

EricFedrowisch · 2023-04-04T00:46:11Z

Implemented persistent memory with a sqlite3 database instead of list of strings.
Benefits include:
-session memories can now be persistent
-contents searchable between multiple sessions
-multiple agents could in theory interoperate/cooperate with one centralized memory repository since each would get a new session id

DamascusGit

LGTM

ryanpeach · 2023-04-04T04:34:18Z

Is SQLLite really the best database for long term text storage? I'd think elastisearch...

Also we are getting close to needing a docker compose file.

ryanpeach · 2023-04-04T04:35:13Z

Also see #122

waynehamadi · 2023-04-04T05:32:42Z

@ryanpeach elasticsearch is very heavy. they don't have a long term free version on the cloud like pinecone does, so putting the api key of the cloud offering wouldn't be very useful for most people.
Putting it as a docker file I am not sure...
But a big benefit of elasticsearch is that it does good keyword search and it allows people to not have to embed their texts.

But I would put opensearch though, Elasticsearch is under an ambiguous license, so people forking autogpt and using it for comercial use with the elasticsearch integration might run into issues.

EricFedrowisch · 2023-04-04T08:42:31Z

Sqlite3 is part of python standard library. That's why I chose it as a sensible default. It sidesteps both license and installation issues. If they have python 3, they have sqlite3.
https://docs.python.org/3/library/persistence.html
It was either that or pickle.

Eunyxjk · 2023-04-04T08:56:44Z

Thank you, big shot

Eunyxjk · 2023-04-04T08:58:01Z

!!

EricFedrowisch · 2023-04-04T12:16:41Z

To be clear, I'm pro-vector inputs. However, vectored inputs can be added in addition to this. This is more like a "persistent log".

ryanpeach · 2023-04-04T15:00:08Z

I also think that vector memory is one thing and long term text retrieval is another, like @EricFedrowisch says. However, I'd also be fine saving long term text to a text file and searching it with an nlp library or regex. I don't think SQL is a good choice for long term text storage because it doesn't add value in querying the text, and it doesn't add value in storing the text over raw text on disk storage.

I'd totally be in favor of opensearch in docker though.

EricFedrowisch · 2023-04-04T15:26:59Z

@ryanpeach Wow! Then you should totally write some code, Ryan. Like just do it. No one is stopping you.

Merging up to latest on Torantulino/Auto-GPT Master

ryanpeach · 2023-04-04T15:35:45Z

@EricFedrowisch Dude i have several PR's. I'm contributing to the discussion on this one. That kind of unprofessional comment is very toxic.

Choosing the appropriate, limited, set of databases for this repo will be important for its future. It should probably be discussed. I'd rather that discussion be had before writing code to implement.

EricFedrowisch · 2023-04-04T15:50:35Z

@ryanpeach Weird. I don't see any pull requests here by you. I see a lot of issues about code styling, which feels pointless when half the point of GPT-4 is to have the AI write the code. You can call me toxic if you like, but I don't see how you materially improved the conversation in any way. If you have better ideas, then write them and have people use them.
Several people are already using mine: #125 (comment)
And they seem to like it.

Joe0 · 2023-04-04T16:05:16Z

I hadn't seen this prior to implementing something similar. I basically built the ability to save "snapshots", which include the history and memory; which allows you to stop/start in the same state, and potentially course-correct manually. I had to do some refactoring to make it clean enough that we can swap out data stores (you should be able to change the snapshot file to point to a different data store).

Joe0#9

EricFedrowisch · 2023-04-04T16:13:22Z

@Joe0 Framing the data as snapshots seems very useful. The persistent memory imp I roughed out uses "sessions", with basically each run of the program creating a new session. There is a command to get_session(id), so you can retrieve whatever session you want in plain text. If you can decouple the snapshot from data stores, that would be huge.

Joe0 · 2023-04-04T16:25:32Z

@EricFedrowisch ya, I'm not sure what the best approach is. We can store the state transitions to reconstruct the internal state at some point in time (which is initially how I approached it), but it was getting annoying, so I switched to just taking snapshots of the state after each transition.

Maybe the right idea is to just build some tooling around re-creation snapshots based on the state transitions. Though, it could be useful for people collaborating to be able to resume from snapshots.

EricFedrowisch · 2023-04-04T16:36:26Z

@Joe0 I think you are onto something. Still reading and trying to understand your pr.

Out of curiousity, what was your motivation for focusing on state transitions?

Joe0 · 2023-04-04T17:17:02Z

@EricFedrowisch if you store the transitions themselves, you get some nice properties like full replays, and can potentially do some level of caching. Snapshots would also give you full replayability, but you can't really do caching with them, unless the full states are equal (which seems much less likely). The benefit of snapshots is that you don't need to replay or reconstruct the current state, and just have it.

I had a few motivations behind the implementation, one of which was the ability to save states and create test cases. Another was the ability to manually course-correct the system by exiting and modifying the snapshot manually. It also creates a nice debugging tool, where you can effectively see what happened, all I need to add is the outputs of the commands themselves.

The core of the PR is really just refactoring the way we manage message history and memory so that we can "hook" all the operations and perform side effects (saving). Then the rest is just saving/loading. I'm working on a cleaner PR that's branched off master of this repo.

EricFedrowisch · 2023-04-04T17:31:28Z

@ryanpeach Several people have confirmed that I was being toxic. Please accept my apology. Your desire to fully explore the questions this project raises are reasonable. Again, I'm sorry. I'll try and do better in the future.

dschonholtz · 2023-04-04T23:08:20Z

I like your class.
I just put together the code mentioned above on #122 for using pinecone.
It might make sense to put these both on the same interface and/or we could consider something like dumping text to sql and just putting the embeddings in pinecone.
Currently, I am sending up the raw text as meta data and it would be better to store that locally so we don't store more crap in pinecone than we need to and so we can query stuff locally.

It might be worth doing something like:

Store raw memory data locally in sql mapped to a vector id
Query for relevant memories in pinecone which returns vector id so we can query sql from it.
Make a start up option to push local sql to pinecone embeddings.

Just some thoughts. Also, wanted to make sure we could make the merge conflict situation less painful and make sure we work together on whatever solution we come up with if we do take both PRs

EricFedrowisch · 2023-04-05T03:14:37Z

@dschonholtz Glad to see #122 . I'd see this pr as analogous to your SimpleMemory class and/or a local cache. I'm not familiar with pinecone, currently using llama_index for my experiments. So, gonna leave decisions revolving around that to others with more experience.
This repo is a viral success, pull requests and issues are coming in way faster than one person can likely deal with. So many changes are coming in, so fast that it seems likely that merges are going to be painful for a while. @Torantulino needs some time to have a chance to communicate his vision.

jdonkers · 2023-04-05T17:42:44Z

I tested this code and believe there's a bug. The memory addresses are being input to the prompt, rather than the memory itself.

Example:
{'role': 'system', 'content': 'Permanent memory: <memory.MemoryDB object at 0x00000174FEDD3F40>'}

ryanpeach · 2023-04-05T17:49:39Z

If we do choose to use a SQL database, I'd at least petition that we use an adaptor. I have plans to use autogpt on k8s with multiple instances talking to one another in the future. I believe the database should be Horizontally Autoscalable. In the SQL world, the best database for that is Postgres. If we implemented the option to use sqlalchemy, or something similar, then the database could be customizable by the user.

I think another thing is making the memory modular. We will apparently obviously disagree on database implementations. Maybe the only solution to that is to create a database folder, a class interface, and then just a bunch of implementations, with a CLI flag or config file for the user choice.

A discussion should be had with this PR about such a design as well:
#122

EricFedrowisch · 2023-04-05T17:50:33Z

I'll check it out, @jdonkers and do another commit. My guess is I need to do a repr so that it can return a string for the current session text.

nponeccop

Rebase against the current master

…-GPT into pr/124 Moved code to new package to integrate later perhaps.

Resolved.

…g#98 added tailwind css class which fix the bug Significant-Gravitas#98

First draft at adding persistent memory via sqlite3

First draft at adding persistent memory via sqlite3

6adef8e

EricFedrowisch mentioned this pull request Apr 4, 2023

long-term memory? #125

Closed

This comment was marked as duplicate.

Sign in to view

DamascusGit previously approved these changes Apr 4, 2023

View reviewed changes

EricFedrowisch mentioned this pull request Apr 4, 2023

Logs, Watchdog, Monitored Continuous Mode [auto-PAUSE] #151

Closed

Merge pull request #1 from Torantulino/master

fa0ec78

Merging up to latest on Torantulino/Auto-GPT Master

Joe0 mentioned this pull request Apr 4, 2023

Add the ability to capture and load snapshots of memory and message history #192

Closed

EricFedrowisch mentioned this pull request Apr 5, 2023

I am a Cow - loop #216

Closed

nponeccop previously requested changes Apr 9, 2023

View reviewed changes

This was referenced Apr 10, 2023

PRs batch 2 #673

Closed

PR batch 3 #709

Closed

richbeales self-assigned this Apr 14, 2023

Merge branch 'master' of https://github.com/Significant-Gravitas/Auto…

f86ca43

…-GPT into pr/124 Moved code to new package to integrate later perhaps.

BillSchumacher dismissed DamascusGit’s stale review via f86ca43 April 15, 2023 21:39

Blacked.

4a19124

BillSchumacher approved these changes Apr 15, 2023

View reviewed changes

BillSchumacher merged commit 1586966 into Significant-Gravitas:master Apr 15, 2023

tgonzales pushed a commit to tgonzales/Auto-GPT that referenced this pull request Apr 19, 2023

Merge pull request Significant-Gravitas#124 from rahul-ghimire-au6/bu…

e57c4dd

…g#98 added tailwind css class which fix the bug Significant-Gravitas#98

sindlinger pushed a commit to Orgsindlinger/Auto-GPT-WebUI that referenced this pull request Sep 25, 2024

Merge pull request Significant-Gravitas#124 from EricFedrowisch/master

598dd94

First draft at adding persistent memory via sqlite3

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

First draft at adding persistent memory via sqlite3 #124

First draft at adding persistent memory via sqlite3 #124

EricFedrowisch commented Apr 4, 2023

This comment was marked as duplicate.

DamascusGit left a comment

ryanpeach commented Apr 4, 2023

ryanpeach commented Apr 4, 2023

waynehamadi commented Apr 4, 2023

EricFedrowisch commented Apr 4, 2023 •

edited

Loading

Eunyxjk commented Apr 4, 2023

Eunyxjk commented Apr 4, 2023

EricFedrowisch commented Apr 4, 2023

ryanpeach commented Apr 4, 2023 •

edited

Loading

EricFedrowisch commented Apr 4, 2023

ryanpeach commented Apr 4, 2023 •

edited

Loading

EricFedrowisch commented Apr 4, 2023

Joe0 commented Apr 4, 2023

EricFedrowisch commented Apr 4, 2023

Joe0 commented Apr 4, 2023

EricFedrowisch commented Apr 4, 2023

Joe0 commented Apr 4, 2023

EricFedrowisch commented Apr 4, 2023

dschonholtz commented Apr 4, 2023

EricFedrowisch commented Apr 5, 2023

jdonkers commented Apr 5, 2023 •

edited

Loading

ryanpeach commented Apr 5, 2023 •

edited

Loading

EricFedrowisch commented Apr 5, 2023

nponeccop left a comment

First draft at adding persistent memory via sqlite3 #124

First draft at adding persistent memory via sqlite3 #124

Conversation

EricFedrowisch commented Apr 4, 2023

This comment was marked as duplicate.

DamascusGit left a comment

Choose a reason for hiding this comment

ryanpeach commented Apr 4, 2023

ryanpeach commented Apr 4, 2023

waynehamadi commented Apr 4, 2023

EricFedrowisch commented Apr 4, 2023 • edited Loading

Eunyxjk commented Apr 4, 2023

Eunyxjk commented Apr 4, 2023

EricFedrowisch commented Apr 4, 2023

ryanpeach commented Apr 4, 2023 • edited Loading

EricFedrowisch commented Apr 4, 2023

ryanpeach commented Apr 4, 2023 • edited Loading

EricFedrowisch commented Apr 4, 2023

Joe0 commented Apr 4, 2023

EricFedrowisch commented Apr 4, 2023

Joe0 commented Apr 4, 2023

EricFedrowisch commented Apr 4, 2023

Joe0 commented Apr 4, 2023

EricFedrowisch commented Apr 4, 2023

dschonholtz commented Apr 4, 2023

EricFedrowisch commented Apr 5, 2023

jdonkers commented Apr 5, 2023 • edited Loading

ryanpeach commented Apr 5, 2023 • edited Loading

EricFedrowisch commented Apr 5, 2023

nponeccop left a comment

Choose a reason for hiding this comment

EricFedrowisch commented Apr 4, 2023 •

edited

Loading

ryanpeach commented Apr 4, 2023 •

edited

Loading

ryanpeach commented Apr 4, 2023 •

edited

Loading

jdonkers commented Apr 5, 2023 •

edited

Loading

ryanpeach commented Apr 5, 2023 •

edited

Loading