Feature: centralized backup management #68
Comments
Really interested to see this come in as a feature request -- this is something that I'm thinking about in the background (and something that backrest intends to be able to support architecturally). Can you elaborate a bit on your use case? Do you care primarily about being able to view backup status and results in a centralized place? Or do you want to be able to manage backup configurations across a fleet of machines / perform bulk operations?
Hi @garethgeorge! I am also interested in this feature. I propose the same design as Bareos. We will have:
Our server sends a request to the agent (authenticated by mTLS) with the restic config and some pre/post-scripts. Then our client backs up directly to the repository backend, or we can use rest-server and back up to a central server. Alternatively, maybe we could use an agentless setup: send restic commands over an SSH connection. But there are caveats, e.g. a long-lived SSH session may be interrupted by the SSHD configuration.
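To make the mTLS part of this proposal concrete, here is a minimal Go sketch of an agent that only accepts requests from a central server presenting a client certificate signed by a shared CA. The endpoint path, port, and file names are hypothetical illustrations, not part of Backrest's actual API.

```go
// Hypothetical sketch: an agent that requires the central server to present a
// client certificate signed by a shared CA before it will accept backup requests.
package main

import (
	"crypto/tls"
	"crypto/x509"
	"log"
	"net/http"
	"os"
)

func main() {
	// CA that signed the central server's client certificate.
	caPEM, err := os.ReadFile("ca.pem")
	if err != nil {
		log.Fatal(err)
	}
	caPool := x509.NewCertPool()
	caPool.AppendCertsFromPEM(caPEM)

	mux := http.NewServeMux()
	// Hypothetical endpoint: receives a plan (restic config + pre/post scripts)
	// and triggers a local backup run.
	mux.HandleFunc("/v1/run-backup", func(w http.ResponseWriter, r *http.Request) {
		// ... decode the plan, run the pre-script, invoke restic, run the post-script ...
		w.WriteHeader(http.StatusAccepted)
	})

	srv := &http.Server{
		Addr:    ":9443",
		Handler: mux,
		TLSConfig: &tls.Config{
			ClientAuth: tls.RequireAndVerifyClientCert, // enforce mTLS
			ClientCAs:  caPool,
		},
	}
	// agent.pem / agent.key identify the agent itself.
	log.Fatal(srv.ListenAndServeTLS("agent.pem", "agent.key"))
}
```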
Interesting, when I'd considered this feature in the past I'd imagined something along the lines of
I'll read through the Bareos docs. Do you see strong advantages one way or the other w.r.t. the central server being responsible for pushing commands to each of the workers? I have some concern that the central server becomes very highly privileged if it's SSHing in and running backup operations. From an implementation perspective, though, it might be very simple to just open up running restic commands over SSH (as you mention) and to support scheduling operations in parallel, such that backups can run on multiple machines at the same time (the constraint would likely be one backup per repository at a time).
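As a rough illustration of the "one backup per repository at a time" constraint mentioned above, here is a small Go sketch of a scheduler that lets backups for different repositories run in parallel while serializing operations on the same repository. The type and function names are made up for the example.

```go
// Illustrative sketch only: schedule backups for many machines in parallel
// while allowing at most one running operation per repository.
package scheduler

import "sync"

type RepoLocks struct {
	mu    sync.Mutex
	locks map[string]*sync.Mutex // keyed by repository ID
}

func NewRepoLocks() *RepoLocks {
	return &RepoLocks{locks: make(map[string]*sync.Mutex)}
}

func (r *RepoLocks) lockFor(repoID string) *sync.Mutex {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.locks[repoID]; !ok {
		r.locks[repoID] = &sync.Mutex{}
	}
	return r.locks[repoID]
}

// RunBackup blocks until the repository is free, so concurrent backups to
// different repos proceed in parallel but no repo is written to twice at once.
func (r *RepoLocks) RunBackup(repoID string, backup func() error) error {
	m := r.lockFor(repoID)
	m.Lock()
	defer m.Unlock()
	return backup()
}
```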
That's what I desire.
For my usage scenario, I hope that the Backrest instance I run on the Raspberry Pi is the main control device, the ones running on other devices/VPSes are clients, and all control is done through Backrest on the Pi. In terms of UI, my backup mode is to back up once a day (retaining up to 30 items) and once a week (never deleted), so the structure I expect is like this:
For this kind of multi-machine management, I think it will need some kind of custom folder tree so that users can organize their plans.
This would be a killer feature. At the moment I'm using Synology Active Backup for Business to back up and deduplicate across 12 Windows machines, but there are a couple of issues:
Being able to use Backrest to centrally manage the backup configuration (scheduling, directories), with a status page to know if something is failing or has missed its schedule for x interval, would be fantastic. Even better if the local web interface could be used to restore an older version of a file (either to a new location, as is currently possible, or over the original, as #118 suggests making possible). Ideally, if centrally managed, the configuration couldn't be changed locally - though this could be locked down by user account?
Hey all, updating this thread as it's a frequently requested feature and it's also a capability I want for my own systems -- I'm likely looking at prototyping this in the near term. I'm investigating a few avenues for implementation:
To provide some design details -- this will likely look something like:
I'm slightly leaning towards the daemon / monitor process model because it's more in line with the self-hosted ethos. There are also some interesting possibilities to examine in the future here, e.g. centralizing some operations: run backups on daemon processes (with read-only credentials to repos) but run prune operations only on the trusted monitor process. I'm still thinking through what this might look like / how it'd be configured. Perhaps a concept of a meta-plan is needed to logically group plans across multiple nodes.
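To sketch what a "meta-plan" might look like, here is a hypothetical Go data model that groups per-node plans under one shared repository and records which node is trusted to run maintenance. None of these types exist in Backrest; they only illustrate the daemon/monitor split described above.

```go
// Hypothetical data-model sketch of a "meta-plan" grouping plans across nodes.
package config

type NodeRole string

const (
	RoleDaemon  NodeRole = "daemon"  // runs backups with append-only / read-only repo credentials
	RoleMonitor NodeRole = "monitor" // trusted; allowed to run forget/prune
)

type Node struct {
	ID   string
	Role NodeRole
}

type NodePlan struct {
	NodeID   string // which node runs this plan
	PlanID   string // the local plan on that node
	Schedule string // e.g. a cron expression
}

type MetaPlan struct {
	ID            string
	RepoID        string     // shared repository the grouped plans back up to
	Members       []NodePlan // one entry per participating node
	MaintenanceOn string     // node ID of the monitor that runs forget/prune
}
```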
Hello, @garethgeorge
P2P is not necessary for this use-case. I suggest:
+1
I definitely like all the ideas presented so far, but I just thought I'd think aloud about some of my thoughts. For syncing the nodes' config, just using an API endpoint polled at a set interval is probably fine, but I'd personally also like to be able to manually run actions on those agents. Using this same model for that, with some sort of action queue, would probably work fine too. It got me thinking, however: wouldn't something like WebSockets be more ideal for such a use case? Using WebSockets would have the benefit of not having to constantly poll one or more endpoints every x seconds, especially if we'd need separate endpoints for the config, manual actions, etc. It would also avoid a delay between triggering an action and it actually being performed on the node (assuming it can perform that action at that time).
Hey all, thanks for all the interest in this issue -- just updating to say that steady progress is being made toward supporting centralized backup management. Much of the refactoring (and migrations) in the 1.0.0 release is focused on readying the Backrest data model to support operations (possibly created by other installations) in repos and correct tracking of those operations. On the networking front: I'm still investigating here. Backrest uses gRPC under the hood, which is natively HTTP/2, and because connectivity / syncing operations will happen on the backend, we're not restricted to web technologies e.g. WebSockets. I agree that polling is not the way we want to go; TCP keep-alive is much cheaper than repeatedly re-establishing connections (especially if they are HTTPS -- and they should be!). I'm hoping to find a good OSS option that I can shim gRPC requests onto (such that they can be initiated by the hub and sent to the clients -- which really looks like some sort of inversion layer where clients will actually be establishing and "keeping alive" TCP channels to the Backrest hub). I think https://libp2p.io/ may have some capabilities here (though I do not want to pull in any of the mesh networking / connectivity to the IPFS swarm from that project), but I want to find simpler alternatives -- which could ultimately look like building it myself! Another problem space I'm still giving thought to is the relationship between the hub and clients. In particular, the model I imagine will be common is many client devices backing up to a single repo. In this case, I feel that the hub should be able to centrally coordinate maintenance operations, e.g. "forget" and "prune" execution. I'm considering here whether:
The latter approach has the disadvantage that the hub will need access to a repo config for each repo used by a client, BUT it also has the significant advantage that clients may be read-only (e.g. you can centralize trust in the hub with many low-trust clients; this protects against ransomware. In my case I'd likely run a low-cost IPv6 VPS dedicated to this purpose). Not yet sure what will be best here, but I am leaning towards the latter option.
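As a rough picture of the "inversion layer" idea, here is a simplified Go sketch (hand-rolled JSON-over-TLS rather than Backrest's eventual gRPC sync API) in which the client dials out to the hub and keeps the connection open, so the hub can push commands without any inbound connectivity to the client. The hostname, port, and message fields are hypothetical.

```go
// Sketch only, assuming a hand-rolled protocol: the client dials *out* to the hub
// and holds the connection open so the hub can push commands to it.
package main

import (
	"crypto/tls"
	"encoding/json"
	"log"
	"time"
)

type Command struct {
	Op     string `json:"op"`     // e.g. "backup", "report-status" (hypothetical)
	PlanID string `json:"planId"` // hypothetical field
}

func main() {
	for {
		conn, err := tls.Dial("tcp", "hub.example.com:9443", &tls.Config{})
		if err != nil {
			log.Printf("hub unreachable, retrying: %v", err)
			time.Sleep(30 * time.Second)
			continue
		}
		dec := json.NewDecoder(conn)
		for {
			var cmd Command
			if err := dec.Decode(&cmd); err != nil {
				log.Printf("connection lost: %v", err)
				break // reconnect in the outer loop
			}
			// ... dispatch cmd locally and stream results back over conn ...
		}
		conn.Close()
	}
}
```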
Very happy you are going down this path!
I don't want to have to depend on persistent TCP connections or websockets for this functionality. While it is less work than if you polled frequently, it limits the usefulness of backrest for bandwidth-limited clients, IoT, edge, etc. A lot of the things that would be useful to centrally manage with this project really only need to check their config before they run a backup task, once a day at most. For SIM connections that bill by connection time and bandwidth, the minimal benefit (instant config update) would not outweigh the high cost incurred.
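For contrast, here is a minimal Go sketch of the pull model described above: the client fetches its config once, right before its scheduled backup, instead of holding a persistent connection. The endpoint path and response fields are hypothetical.

```go
// Minimal sketch of a pull-before-run model (hypothetical endpoint and fields).
package main

import (
	"encoding/json"
	"log"
	"net/http"
	"time"
)

type PlanConfig struct {
	PlanID   string   `json:"planId"`
	Paths    []string `json:"paths"`
	Schedule string   `json:"schedule"`
}

func fetchConfig(hub, nodeID string) (*PlanConfig, error) {
	client := &http.Client{Timeout: 30 * time.Second}
	resp, err := client.Get(hub + "/v1/config/" + nodeID) // hypothetical endpoint
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	var cfg PlanConfig
	if err := json.NewDecoder(resp.Body).Decode(&cfg); err != nil {
		return nil, err
	}
	return &cfg, nil
}

func main() {
	cfg, err := fetchConfig("https://hub.example.com", "edge-node-01")
	if err != nil {
		log.Fatalf("could not fetch config, skipping this run: %v", err)
	}
	log.Printf("running plan %s over paths %v", cfg.PlanID, cfg.Paths)
	// ... invoke the local backup with cfg, then report the result the same way ...
}
```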
@brandonkal I disagree. If data usage is an issue, you're not going to be running backups over that connection anyway.
Hi, just thought I'd check in. Any updates on progress? Very interested in this functionality.
Edited with an update: starting work on the multi-host model now. I've decided to skip the oplog decoupling from bbolt for now (will work on that in a future revision). The in-progress PR is #385. No promises on ETA, but I expect to have some prototypes in the next few weeks. Likely a month or more out until it's tested / ready for release and documentation.
+1 for this feature. It looks like you're already well into development, but the way I would do it is as has already been suggested: the backrest binary is installed on all the servers, accepts API calls from the central one, and returns status updates to it, so all backups are executed locally on each server. Related to the design here, I would like to be able to define a plan and then execute it on multiple servers.
I agree with @mattdale77; I'd also prefer this method. That said, the way it's currently being implemented will still be useful. Looking forward to trying it out!
Posting a progress update now that multihost management is making good progress and largely passing tests in #562. It's been a long road to get here; it has taken some significant rethinking of the feature as well as significant refactoring of Backrest's operation storage model (and pushing half a year of preparatory refactoring and stability improvements).
What's done so far:
To do in the near term:
Rollout Plan

1.7.0: My expectation is to release the data model changes that support multihost management in 1.7.0; under the hood, the configuration and sync API for multihost management will be available in an alpha state. There will not be any UI support in this revision, but it will be possible to see operations sync'd from other hosts in the repo view if a peer is added correctly. I'll post some instructions for doing this here in the release. Test coverage is good, but I'll primarily be using this stage of things to prove out the migration logic and to start getting users' configs updated to include repo GUIDs. Within my setup I'll be building some experience running sync with a stable release version, putting out patches, improving error messages, and thinking through what status info needs to be available as I start UI work.

1.8.0: Aiming to include initial UI support for settings related to multihost management in this revision. This is likely to include
1.9.0 or beyond: Will aim to provide support for some sort of easy peering flow to easily connect instances together. This is also a possible target for a stable revision of multihost management, but it is very possible that this milestone will slip to later versions.
Great work!
For me, the big problem with Restic is management: each server it's installed on needs a different management location.
It's difficult to manage a big set of servers, so I would like to know if there is a plan for Backrest to manage all the Restic backups from a single WebUI.
It could be like what Cockpit does with Linux servers, but it would be even better if there were a dashboard where we can follow the latest backup results from all servers.