Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(v2/regression): add v2 test cases #2247

Merged
merged 2 commits into from
Jan 7, 2025

Conversation

derekbit
Copy link
Member

@derekbit derekbit commented Jan 6, 2025

Which issue(s) this PR fixes:

Issue

longhorn/longhorn#10136
longhorn/longhorn#10139

What this PR does / why we need it:

Special notes for your reviewer:

Additional documentation or context

Summary by CodeRabbit

  • Tests
    • Added new test cases for backup functionality
    • Verified backup system behavior during replica unavailability
    • Tested backup block deletion process
    • Expanded test coverage for volume backup operations

@derekbit derekbit requested a review from chriscchien January 6, 2025 08:43
@derekbit derekbit self-assigned this Jan 6, 2025
@derekbit derekbit requested a review from a team as a code owner January 6, 2025 08:43
Copy link

coderabbitai bot commented Jan 6, 2025

Walkthrough

This pull request introduces two new test functions in the manager/integration/tests/test_basic.py file to enhance the testing of Longhorn's backup functionality. The first test, test_backup_status_for_unavailable_replicas, verifies the backup system's behavior when a replica responsible for a backup becomes unavailable. The second test, test_backup_block_deletion, checks the correct deletion of non-referenced backup blocks while maintaining backup process integrity.

Changes

File Change Summary
manager/integration/tests/test_basic.py Added two new test functions:
- test_backup_status_for_unavailable_replicas: Tests backup status when a replica becomes unavailable
- test_backup_block_deletion: Validates backup block deletion process

Assessment against linked issues

Objective Addressed Explanation
Replica address in backup status [#10136]

Possibly related PRs

Suggested reviewers

  • c3y1huang
  • ChanYiLin
  • yangchiu

Poem

🐰 In the realm of backups, where data takes flight,
Replicas dance, some vanish from sight
Our tests stand guard, with vigilance keen
Ensuring no block is lost, no status unseen
A rabbit's code, robust and bright! 🔍


Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Longhorn 10136
Longhorn 10139

Signed-off-by: Derek Su <derek.su@suse.com>
@derekbit derekbit changed the title feat(v2/regression): add test_backup_status_for_unavailable_replicas feat(v2/regression): add v2 test cases Jan 7, 2025
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (4)
manager/integration/tests/test_basic.py (4)

Line range hint 534-614: Add more assertions to verify backup error state

The test case has good coverage of the backup failure scenario, but could be enhanced with additional assertions to verify:

  1. The specific error message content in the backup status
  2. The backup state transitions (e.g., from InProgress to Error)
  3. The volume state during and after the replica failure
 def backup_status_for_unavailable_replicas_test(client, volume_name,  # NOQA
                                                 size, backing_image=""):  # NOQA
     volume = create_and_check_volume(client, volume_name,
                                      num_of_replicas=2,
                                      size=str(size),
                                      backing_image=backing_image)
+    # Verify initial volume state
+    assert volume.ready
+    assert volume.robustness == "healthy"
+
     lht_hostId = get_self_host_id()
     volume = volume.attach(hostId=lht_hostId)
     volume = common.wait_for_volume_healthy(client, volume_name)

     # write data to the volume
     volume_endpoint = get_volume_endpoint(volume)

     # Take snapshots without overlapping
     data_size = size/Mi
     write_volume_dev_random_mb_data(
         volume_endpoint, 0, data_size)

     # create a snapshot and backup
     snap = create_snapshot(client, volume_name)
     volume.snapshotBackup(name=snap.name)
     bv, b = find_backup(client, volume_name, snap.name)
     backup_id = b.id

     # find the replica for this backup
     replica_name = find_replica_for_backup(client, volume_name, backup_id)

     # disable scheduling on that node
     volume = client.by_id_volume(volume_name)
     for r in volume.replicas:
         if r.name == replica_name:
             node = client.by_id_node(r.hostId)
             node = set_node_scheduling(client, node, allowScheduling=False)
             common.wait_for_node_update(client, node.id,
                                         "allowScheduling", False)
     assert node

     # remove the replica with the backup
     volume.replicaRemove(name=replica_name)
     volume = common.wait_for_volume_degraded(client, volume_name)

+    # Verify backup transitions to Error state with expected message
+    def verify_backup_error(b):
+        return (b.id == backup_id and
+                b.state == "Error" and
+                "Failed to get backup status" in b.error)
+
     # now the backup status should in an Error state and with an error message
-    def backup_failure_predicate(b):
-        return b.id == backup_id and "Error" in b.state and b.error != ""
     volume = common.wait_for_backup_state(client, volume_name,
-                                          backup_failure_predicate)
+                                          verify_backup_error)

     # re enable scheduling on the previously disabled node
     node = client.by_id_node(node.id)
     node = set_node_scheduling(client, node, allowScheduling=True)
     common.wait_for_node_update(client, node.id,
                                 "allowScheduling", True)

Line range hint 615-717: Improve test documentation and error handling

The test case for backup block deletion has good coverage but could benefit from:

  1. More detailed docstring explaining the test scenarios
  2. Better error handling for backup operations
  3. Additional assertions for backup block counts
 def test_backup_block_deletion(set_random_backupstore, client, core_api, volume_name):  # NOQA
     """
     Test backup block deletion
 
     Context:
 
     We want to make sure that we only delete non referenced backup blocks,
     we also don't want to delete blocks while there other backups in progress.
     The reason for this is that we don't yet know which blocks are required by
     the in progress backup, so blocks deletion could lead to a faulty backup.
+
+    Test Scenarios:
+    1. Normal backup block deletion when no backups are in progress
+    2. Block deletion prevention when a backup is in progress
+    3. Error handling for failed backup operations
+    4. Verification of block counts before and after deletion
 
     Setup:
 
     1. Setup minio as S3 backupstore
 
     Steps:
     ...
     """
     backup_store_type = set_random_backupstore
     if backup_store_type not in ["nfs", "s3"]:
         pytest.skip("Skip test case because the backup store type is not supported") # NOQA

     backupstore_cleanup(client)

     volume = create_and_check_volume(client, volume_name)
     host_id = get_self_host_id()
     volume = volume.attach(hostId=host_id)
     volume = common.wait_for_volume_healthy(client, volume_name)

+    try:
         data0 = {'pos': 0,
                  'len': 2 * BACKUP_BLOCK_SIZE,
                  'content': common.generate_random_data(2 * BACKUP_BLOCK_SIZE)}
         _, backup0, _, data0 = create_backup(
             client, volume_name, data0)
+    except Exception as e:
+        pytest.fail(f"Failed to create initial backup: {str(e)}")

     data1 = {'pos': 0,
              'len': BACKUP_BLOCK_SIZE,
              'content': common.generate_random_data(BACKUP_BLOCK_SIZE)}
     _, backup1, _, data1 = create_backup(client, volume_name, data1)

     data2 = {'pos': 0,
              'len': BACKUP_BLOCK_SIZE,
              'content': common.generate_random_data(BACKUP_BLOCK_SIZE)}
     backup_volume, backup2, _, data2 = create_backup(client, volume_name, data2)

     backup_blocks_count = backupstore_count_backup_block_files(client,
                                                                core_api,
                                                                volume_name)
+    # Verify initial block count
     assert backup_blocks_count == 4

     bvs = client.list_backupVolume()
+    # Verify backup volume data stored size
+    backup_volume_found = False
     for bv in bvs:
         if bv['name'] == backup_volume.name:
+            backup_volume_found = True
             assert bv['dataStored'] == \
                 str(backup_blocks_count * BACKUP_BLOCK_SIZE)
+    assert backup_volume_found, "Backup volume not found"

Line range hint 1480-1533: Add cleanup and improve error handling in restore test

The restore test case could be enhanced with:

  1. Better cleanup in case of test failures
  2. More robust error handling for backup operations
  3. Additional assertions for data verification
 @pytest.mark.v2_volume_test  # NOQA
 @pytest.mark.coretest   # NOQA
-def test_restore_inc(set_random_backupstore, client, core_api, volume_name, pod):  # NOQA
+def test_restore_inc(set_random_backupstore, client, core_api, volume_name, pod, request):  # NOQA
     """
     Test restore from disaster recovery volume (incremental restore)
 
     Run test against all the backupstores
+
+    Test Scenarios:
+    1. Basic restore functionality
+    2. Incremental restore with multiple backups
+    3. Error handling during restore
+    4. Data verification after restore
 
     1. Create a volume and attach to the current node
     2. Generate `data0`, write to the volume, make a backup `backup0`
     ...
     """
+    def cleanup():
+        # Cleanup volumes and backups in case of test failure
+        volumes_to_delete = [
+            volume_name,
+            "sb-0-" + volume_name,
+            "sb-1-" + volume_name,
+            "sb-2-" + volume_name
+        ]
+        for vol in volumes_to_delete:
+            try:
+                v = client.by_id_volume(vol)
+                if v:
+                    client.delete(v)
+                    wait_for_volume_delete(client, vol)
+            except Exception:
+                pass
+
+    request.addfinalizer(cleanup)

     restore_inc_test(client, core_api, volume_name, pod)

Line range hint 1-100: Add missing docstrings and improve test setup documentation

The test file could benefit from:

  1. A module-level docstring explaining the test suite's purpose
  2. Better documentation of test prerequisites and assumptions
  3. More detailed explanation of test utilities and helper functions

Add a module-level docstring at the beginning of the file:

+"""
+Integration tests for Longhorn's basic functionality.
+
+This module contains core test cases that verify basic Longhorn operations:
+- Volume lifecycle (create, attach, detach, delete)
+- Backup and restore operations
+- Snapshot management
+- Error handling and recovery
+- Filesystem operations
+
+Prerequisites:
+- A running Kubernetes cluster
+- Longhorn installed and running
+- Storage classes configured
+- Backup stores (S3, NFS) configured if testing backup functionality
+
+Test utilities and common operations are imported from the common module.
+"""
+
 import pytest
 
 import os
 import subprocess
 ...
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e35dd45 and 6bf286f.

📒 Files selected for processing (1)
  • manager/integration/tests/test_basic.py (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (2)
  • GitHub Check: make check
  • GitHub Check: Summary
🔇 Additional comments (1)
manager/integration/tests/test_basic.py (1)

Line range hint 1-2247: LGTM! The test coverage looks comprehensive

The test file provides thorough coverage of Longhorn's core functionality. The new test functions for backup status and block deletion are well structured and cover important edge cases. The suggested improvements around error handling, documentation, and cleanup would be good to have but are not critical for the current functionality.

@@ -531,6 +531,7 @@ def snapshot_test(client, volume_name, backing_image): # NOQA
cleanup_volume(client, volume)


@pytest.mark.v2_volume_test # NOQA
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chriscchien Should we review longhorn/longhorn#9760 (comment) again to include all test cases that support v2?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The supported cases will be updated over time. I suggest we can update it later, but to prevent any confusion, the ground truth should be the label @pytest.mark.v2_volume_test in manager/integration/tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yangchiu yangchiu self-requested a review January 7, 2025 08:19
@yangchiu yangchiu merged commit 5745036 into longhorn:master Jan 7, 2025
4 of 6 checks passed
@yangchiu
Copy link
Member

yangchiu commented Jan 7, 2025

@mergify backport v1.8.x

Copy link

mergify bot commented Jan 7, 2025

backport v1.8.x

✅ Backports have been created

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants