Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: SQLite+Compressed save format (Phase 2 of v2 save format) #5777

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

randombk
Copy link
Contributor

@randombk randombk commented Dec 1, 2024

Checklist

Required

Purpose of change

This PR is Phase 2 of #5771, and supersedes #5772.

This patch implements a new V2 save layout based on SQLite3 and gzip.

  • A save can either be V1 (current layout) or V2.
  • Existing V1 saves will continue to use V1.
  • V1 saves can be converted to V2 using a new option in the "World" tab of the main menu.
  • Newly created saves use V2.

Describe the solution

The new save format consolidates and compresses certain high-volume files into a set of sqlite3 database files with gzip compression. Specifically,

map.sqlite3 replaces:

  • maps/ directory (map tile info)
  • o.* files (overmap info)

<player_id>.sqlite3 replaces:

  • <player_id>.seen.* files (player overmap visibility data)
  • <player_id>.mm1/ directory (player map tile memory)

All other files will remain unchanged. This should cover all the worst offenders while leaving the more manual-edit-friendly files untouched.

Each database consists of a single table:

CREATE TABLE IF NOT EXISTS files (
    path           TEXT PRIMARY KEY NOT NULL,
    parent         TEXT NOT NULL,  -- Currently unused, but adding here for future compat considerations
    compression    TEXT DEFAULT NULL,  -- NULL for uncompressed. 'zlib' if `data` is compressed.
    data           BLOB NOT NULL
);

Benefits:

  • Greatly reduced size-on-disk.
  • Greatly reduced number of files in a save. This should benefit backups, cross-device copies, and other systems requiring file watchers.
  • Improved consistency. Crashes in the middle of a save no longer cause save corruption.

Costs:

  • Slightly degraded save/load performance.
  • For large worlds, replaced a large number of very small files with a small number of fairly large files.

There is an interesting trade-off between file size and save performance. Even in the old format, save/load was not a particularly big issue. Only changed chunks were modified, and autosave triggered often enough that most save events only covered a small number of chunks.

The new format does introduce an additional delay to the save time, though events remained sub-second. All of this delay can be attributed to the compression process - turning off compression resulted in no measurable changes in save performance. This is to be expected - we're introducing an extra CPU-intensive step into a single-threaded save process.

Benchmarks & Comparisons

The below are some rough figures I captured during my testing. All of these were done on a Ryzen 9 7900X system running Arch Linux with the save directory located on a btrfs partition stored on a NVME SSD.

Some figures using a small test save (after ~3hrs gameplay):

mode size # files Save Time
Baseline V1 (current) 70.5 MB 4,174 119ms
V2, gzip level 1 (this PR) 10.0 MB 15 170ms
V2, gzip level 6 7.2 MB 15 320ms

Using a medium-sized test save:
Figures were taken after saving at one end of a big town, taking a helicopter across the town to generate new tiles, then saving again.

mode size # files Save Time
Baseline V1 (current) 128 MB 11,714 105ms
V2, gzip level 1 (this PR) 21.8 MB 15 162ms
V2, gzip level 6 17.2 MB 15 302ms

(Not sure why the bigger map was a faster save; could be multiple things, but the point is that there's a small increase)

Gzip compression level 1 ("fast") seems to be optimal. I suspect a more modern compression scheme like zstd would be even better, but we'll need to look into the packaging story.

Describe alternatives you've considered

Testing

Additional context

@github-actions github-actions bot added src changes related to source code. tests changes related to tests labels Dec 1, 2024
Copy link
Contributor

autofix-ci bot commented Dec 1, 2024

Autofix has formatted code style violation in this PR.

I edit commits locally (e.g: git, github desktop) and want to keep autofix
  1. Run git pull. this will merge the automated commit into your local copy of the PR branch.
  2. Continue working.
I do not want the automated commit
  1. Format your code locally, then commit it.
  2. Run git push --force to force push your branch. This will overwrite the automated commit on remote with your local one.
  3. Continue working.

If you don't do this, your following commits will be based on the old commit, and cause MERGE CONFLICT.

@randombk randombk force-pushed the sqlite-saves-phase2 branch from d2b918c to f6aa348 Compare December 1, 2024 09:23
@scarf005
Copy link
Member

scarf005 commented Dec 1, 2024

turning off compression resulted in no measurable changes in save performance. This is to be expected - we're introducing an extra CPU-intensive step into a single-threaded save process.

while out of scope for this PR, I'm curious whether it's possible to

  • ...add in settings to enable/disable save compression? would there be issue in save/loading from compressed/uncompressed state? this way players could choose tradeoff between save speed and file compression.
  • ...parallelized without being too hard to implement?
  • ...delegate this task to another thread(s)?
sequenceDiagram
	Note left of BN: log "saving..."
	BN-)saveThread: send save request
    activate saveThread
    alt on failure
      saveThread--)BN: log "failed saving"  
    else compress save
      saveThread--)BN: log "save success"
    end
    deactivate saveThread
Loading

src/world.cpp Outdated Show resolved Hide resolved
@randombk
Copy link
Contributor Author

randombk commented Dec 3, 2024

I'm going to tweak this to make this opt-in during world creation (+ the convert-existing-world functionality).
I'll need to understand the perf implications some more, but I want to avoid adding in the multithreading complexity (particularly cross-platform) to save a couple hundred milliseconds.

Same goes for tweaking compression algo. The database has the metadata to change compression or toggle it on/off even at the map tile grain, but it's more the config UI and adding in so much complexity into something that most players shouldn't have to worry about.

@scarf005
Copy link
Member

scarf005 commented Dec 3, 2024

makes sense, to most players keeping their saves not being corrupt would be much more important than saving 100ms.

@scarf005 scarf005 changed the title impl: SQLite+Compressed save format (Phase 2 of v2 save format) feat: SQLite+Compressed save format (Phase 2 of v2 save format) Dec 3, 2024
@randombk randombk force-pushed the sqlite-saves-phase2 branch from f6aa348 to b67cb16 Compare December 10, 2024 07:15
@github-actions github-actions bot added JSON related to game datas in JSON format. and removed tests changes related to tests labels Dec 10, 2024
@randombk randombk force-pushed the sqlite-saves-phase2 branch from 95874ef to e657e38 Compare December 10, 2024 07:17
@randombk
Copy link
Contributor Author

The V2 format is now opt-in during world creation.
image

There is an option to convert V1 saves to V2 in the main menu.

image

I tested creating/loading both V1 and V2 worlds. I converted one of my V1 worlds to V2 and have been playing on it for the past ~week without issues.

I still need to figure out the dependency/build situation for MSVC and Android - this is the last remaining item before the PR is ready. For folks on Linux, you can try out these changes already.

@randombk randombk force-pushed the sqlite-saves-phase2 branch from e657e38 to 7342db9 Compare December 10, 2024 07:21
@randombk
Copy link
Contributor Author

Additional TODOs: add in the required libraries into the CI environment, and update the relevant docs & wikis to document the V2 system and the new build dependencies.

@randombk randombk force-pushed the sqlite-saves-phase2 branch 3 times, most recently from a8e32d0 to 72e7c4b Compare December 10, 2024 07:54
@github-actions github-actions bot added the docs PRs releated to docs page label Dec 10, 2024
@randombk randombk force-pushed the sqlite-saves-phase2 branch 2 times, most recently from 41e0e6e to 54e115b Compare December 14, 2024 22:15
@github-actions github-actions bot added the scripts related to game management scripts label Dec 14, 2024
@randombk randombk marked this pull request as ready for review December 14, 2024 22:47
@randombk
Copy link
Contributor Author

Any idea how to trigger the MSVC and Android checks? I'm not sure why the CI skipped those

Copy link
Member

@Coolthulhu Coolthulhu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, except for the compiler warnings.

&compressedSize,
reinterpret_cast<const Bytef *>( input.data() ),
input.size(),
Z_BEST_SPEED
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed here, but it might be a good idea to extract it to an option later.

src/compress.cpp Outdated
void zlib_decompress( const void *compressed_data, int compressed_size, std::string &output )
{
// We need to guess at the decompressed size - we expect things to compress fairly well.
uLongf decompressedSize = compressed_size * 8;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An explicit cast here would be useful to silence the warning. Some compilers can be annoying about things like this.

src/world.cpp Outdated
sqlite3 *db = nullptr;
int ret;

if( SQLITE_OK != ( ret = sqlite3_initialize() ) ) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid assignments in if, as per warning. They are harder to read, so it's best to only use them when they save a lot of space.

src/world.cpp Outdated
}

bool world::read_overmap_player_visibility( const point_abs_om &p, file_read_fn reader )
{
return read_from_player_file( overmap_player_filename( p ), reader, true );
if( info->world_save_format == save_format::V2_COMPRESSED_SQLITE3 ) {
auto playerdb = get_player_db();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Full typename preferred where known and not too bulky (ie. not an iterator or similar).

src/world.cpp Outdated
}

void replaceBackslashes( std::string &input )
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Warning on missing declaration. Looks like you can make the function static.

  The `clang-tidy` CI step is breaking because it's trying to
  interpret all files in src/ and tests/ as CPP files. Thus when
  src/CMakeLists.txt is changed, it tries to parse that file
  as C++ and fails.

Signed-off-by: David Li <jiawei.davidli@gmail.com>
  Implement new SQLite-based save format. New worlds can be configured
  to use V2. Existing worlds can be converted from the main menu.

  Existing V1 worlds should load exactly the same as before until
  converted.

Signed-off-by: David Li <jiawei.davidli@gmail.com>

# Conflicts:
#	.github/workflows/manual-release.yml
#	.github/workflows/release.yml
@randombk randombk force-pushed the sqlite-saves-phase2 branch 2 times, most recently from 511387c to 15ef66f Compare December 31, 2024 04:38
Signed-off-by: David Li <jiawei.davidli@gmail.com>
@randombk randombk force-pushed the sqlite-saves-phase2 branch from 15ef66f to c500571 Compare December 31, 2024 05:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs PRs releated to docs page JSON related to game datas in JSON format. scripts related to game management scripts src changes related to source code.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants