Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix subnet protection for non-full network states (eg. limited, etc.) #419

Merged
merged 4 commits into from
Nov 15, 2024

Conversation

lmagyar
Copy link
Contributor

@lmagyar lmagyar commented Oct 22, 2024

Proposed Changes

  • Subnet protection was down when the connection wasn't FULL, ie. even a LIMITED (eg. WAN outage) was able to turn off the protection.
  • With this fix, protection is reapplied (unprotect then protect immediately) on interface down, up, dhcp4-change, chcp6-change events.
  • Better do it more often then less. These events happen quite rarely, so it shouldn't be a performance issue.
  • Additionally it waits for the supervisor to be up for 5 minutes (instead of 1,5 minutes). It's usefull when the supervisor is restarted.
  • Plus some minor refactoring, bugfixes.

Related Issues

Summary by CodeRabbit

  • New Features
    • Introduced new scripts for managing environment variables and network connectivity (with-contenv-merge, protect-subnets).
  • Improvements
    • Enhanced error handling and logging in subnet route management scripts.
    • Improved input validation in the subnet-routes script.
    • Increased robustness in handling local subnet routes with new constants and conditions.
  • Bug Fixes
    • Adjusted logic to prevent potential errors in route management and execution.
    • Enhanced conditions for removing local subnets from IP rules.

@lmagyar lmagyar force-pushed the pr-fix-subnet-protection branch from d5f95cc to 30478a8 Compare November 15, 2024 15:22
Copy link

coderabbitai bot commented Nov 15, 2024

Walkthrough

The changes introduced in this pull request include the addition of two new scripts: with-contenv-merge and protect-subnets. The with-contenv-merge script manages environment variables in containerized environments, while the protect-subnets script handles specific actions related to network connectivity changes. Additionally, modifications were made to existing scripts, enhancing their functionality and error handling, particularly in managing local subnet routes. The PROTECTION_RULE_PRIORITY constant was introduced to standardize priority values across scripts.

Changes

File Path Change Summary
tailscale/rootfs/command/with-contenv-merge New script added to manage environment variables using execlineb.
tailscale/rootfs/etc/NetworkManager/dispatcher.d/connectivity-change Deleted script that handled connectivity changes in NetworkManager.
tailscale/rootfs/etc/NetworkManager/dispatcher.d/protect-subnets New script added to handle actions from Network Manager dispatcher and manage subnet routes.
tailscale/rootfs/usr/bin/protect-subnet-routes Modified to enhance local subnet route handling, introduced PROTECTION_RULE_PRIORITY, and improved logging.
tailscale/rootfs/usr/bin/subnet-routes Updated input validation and error handling logic for argument checks.
tailscale/rootfs/usr/bin/unprotect-subnet-routes Enhanced conditions for removing local subnets from IP rules, introduced PROTECTION_RULE_PRIORITY.

Suggested labels

bugfix

🐇 In the meadow, scripts do play,
New paths for routes, come what may.
With variables merged, and logging bright,
Connectivity changes, all done right!
Hops of joy for the code we weave,
In the world of scripts, we believe! 🌼


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 30478a8 and 6cd1442.

📒 Files selected for processing (1)
  • tailscale/rootfs/etc/NetworkManager/dispatcher.d/protect-subnets (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tailscale/rootfs/etc/NetworkManager/dispatcher.d/protect-subnets

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

@lmagyar
Copy link
Contributor Author

lmagyar commented Nov 15, 2024

It's ready for review.
It is installed on ~100 real devices and didn't caused issues.

@lmagyar lmagyar marked this pull request as ready for review November 15, 2024 16:59
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Outside diff range and nitpick comments (10)
tailscale/rootfs/command/with-contenv-merge (2)

13-14: Add error handling and documentation for the default execution path.

The default execution path lacks documentation and error handling for failed commands.

Consider adding comments and error handling:

+# Default execution path when S6_KEEP_ENV is non-zero
+foreground {
 $@
+}
+importas -u ? ?
+if { s6-test ${?} -ne 0 }
+  exit ${?}

1-14: Document the script's role in subnet protection system.

This script is part of the subnet protection system and handles environment merging, but its relationship with network state management and the broader system architecture isn't documented.

Consider:

  1. Adding a header comment explaining the script's purpose and its role in subnet protection
  2. Documenting the security implications of environment variable handling
  3. Describing the interaction with network state changes and the supervisor
tailscale/rootfs/etc/NetworkManager/dispatcher.d/protect-subnets (2)

5-14: Consider enhancing error handling before halt

The network event handling looks solid, but the halt mechanism could be enhanced.

Consider adding more detailed error logging before halting:

 if ! protect-subnet-routes; then
   # Better stop add-on than risking losing all network connections
+  bashio::log.error "Failed to protect subnet routes. Halting add-on to prevent network loss."
   echo -n 1 > /run/s6-linux-init-container-results/exitcode
   exec /run/s6/basedir/bin/halt
 fi

15-21: Standardize logging format across cases

The debug logging is appropriate, but there's an inconsistency in how interface information is logged between connectivity-change and other events.

Consider standardizing the log format:

 connectivity-change)
-    bashio::log.debug "Unhandled Network Manager action ${NM_DISPATCHER_ACTION} ${CONNECTIVITY_STATE-}"
+    bashio::log.debug "Unhandled Network Manager action ${DEVICE_IP_IFACE-} ${NM_DISPATCHER_ACTION} (state: ${CONNECTIVITY_STATE-})"
tailscale/rootfs/usr/bin/unprotect-subnet-routes (3)

7-7: LGTM! Consider adding a comment about the priority value.

Good practice moving the hardcoded value to a constant. Consider adding a comment explaining why 5000 was chosen as the priority level for these rules.

+# Priority 5000 is chosen to ensure our rules take precedence over default routing
+# while still allowing space for higher priority rules if needed
 readonly PROTECTION_RULE_PRIORITY=5000

12-14: Consider improving readability of the complex condition.

The logic is correct but could be more readable by breaking it down or adding comments.

+# Only remove routes if:
+# 1. We're not using userspace networking
+# 2. Either accept_routes is not set or is explicitly true
 if bashio::config.false "userspace_networking" && \
   (! bashio::config.has_value "accept_routes" || bashio::config.true "accept_routes")
 then

Line range hint 1-24: Consider adding logging for network state transitions.

Given this script's critical role in subnet protection during network state changes, consider adding more detailed logging about the current network state when the script runs. This would help diagnose issues where routes might not be properly removed during state transitions.

+# Log current network state
+bashio::log.debug "Current network state: $(bashio::network.state)"
+bashio::log.debug "Userspace networking: $(bashio::config.has_value 'userspace_networking')"
+bashio::log.debug "Accept routes: $(bashio::config.has_value 'accept_routes')"
 if bashio::config.false "userspace_networking" && \
   (! bashio::config.has_value "accept_routes" || bashio::config.true "accept_routes")
 then
tailscale/rootfs/usr/bin/subnet-routes (1)

18-21: Consider enhancing the error message.

While the error handling is correct, the error message could be more informative by explaining the purpose of each option.

 if ! [[ "${1-}" =~ ^(local|advertised)$ ]]; then
-  echo "Usage: subnet-routes local|advertised" 1>&2
+  echo "Usage: subnet-routes <mode>" 1>&2
+  echo "Modes:" 1>&2
+  echo "  local      - Print routes for local subnets" 1>&2
+  echo "  advertised - Print routes from advertise_routes configuration" 1>&2
   exit 1
 fi
tailscale/rootfs/usr/bin/protect-subnet-routes (2)

Line range hint 20-32: Good enhancement of supervisor readiness check

The increased wait time of 5 minutes (60 iterations * 5s) is a good improvement for handling supervisor restart scenarios.

Consider renaming wait_counter to something more descriptive like supervisor_wait_attempts for better code readability:

-declare wait_counter=0
+declare supervisor_wait_attempts=0

-while ! bashio::api.supervisor GET "/addons/self/options/config" false &> /dev/null; do
-  if (( wait_counter++ == 60 )); then
+while ! bashio::api.supervisor GET "/addons/self/options/config" false &> /dev/null; do
+  if (( supervisor_wait_attempts++ == 60 )); then

45-49: Enhance IPv6 address detection robustness

The current IPv6 detection using simple string matching could be improved to be more precise.

Consider using a more robust IPv6 detection:

-if [[ "${route}" =~ .*:.* ]]; then
+if [[ "${route}" =~ ^([0-9a-fA-F]{0,4}:){1,7}[0-9a-fA-F]{0,4}/[0-9]{1,3}$ ]]; then
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between de0aae9 and 30478a8.

📒 Files selected for processing (6)
  • tailscale/rootfs/command/with-contenv-merge (1 hunks)
  • tailscale/rootfs/etc/NetworkManager/dispatcher.d/connectivity-change (0 hunks)
  • tailscale/rootfs/etc/NetworkManager/dispatcher.d/protect-subnets (1 hunks)
  • tailscale/rootfs/usr/bin/protect-subnet-routes (2 hunks)
  • tailscale/rootfs/usr/bin/subnet-routes (1 hunks)
  • tailscale/rootfs/usr/bin/unprotect-subnet-routes (1 hunks)
💤 Files with no reviewable changes (1)
  • tailscale/rootfs/etc/NetworkManager/dispatcher.d/connectivity-change
🔇 Additional comments (11)
tailscale/rootfs/command/with-contenv-merge (2)

1-2: LGTM! Appropriate shebang configuration.

The use of execlineb with -S0 flag is correct for preserving argument handling in this context.


3-7: Consider adding input validation for S6_KEEP_ENV.

While the current implementation works for numeric values, it might behave unexpectedly with non-numeric values in S6_KEEP_ENV.

Let's verify the behavior with different input values:

tailscale/rootfs/etc/NetworkManager/dispatcher.d/protect-subnets (2)

1-4: Verify the existence of the custom shebang script

The shebang and comments look good. The preservation of dispatcher variables is crucial for this script's functionality.

#!/bin/bash
# Verify the existence of the with-contenv-merge script
fd "with-contenv-merge" --type f --exec file {} \;

1-21: Verify script integration and permissions

The script's logic is sound, but let's verify its integration with the system.

✅ Verification successful

Script integration and permissions are properly configured

The verification shows that:

  • The NetworkManager dispatcher script has correct executable permissions (755)
  • Both dependent scripts (protect-subnet-routes and unprotect-subnet-routes) exist and contain proper implementation
  • The dispatcher script is correctly placed in the NetworkManager dispatcher directory
  • The integration between scripts is sound:
    • The dispatcher script calls both utility scripts
    • The utility scripts handle subnet route protection/unprotection with proper error handling
    • Both scripts use consistent configuration checks and priorities
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if the script is properly installed and executable
fd "protect-subnets" --type f --exec stat -c "%a %n" {} \;

# Verify the existence of dependent scripts
fd "protect-subnet-routes|unprotect-subnet-routes" --type f

# Check NetworkManager dispatcher configuration
fd "NetworkManager" --type d | grep -i "dispatcher"

Length of output: 377


Script:

#!/bin/bash
# Check if the script is executable using ls instead of stat
ls -l tailscale/rootfs/etc/NetworkManager/dispatcher.d/protect-subnets

# Check the content of dependent scripts
echo "=== protect-subnet-routes ==="
cat tailscale/rootfs/usr/bin/protect-subnet-routes
echo -e "\n=== unprotect-subnet-routes ==="
cat tailscale/rootfs/usr/bin/unprotect-subnet-routes

# Verify NetworkManager dispatcher directory structure
ls -la tailscale/rootfs/etc/NetworkManager/dispatcher.d/

Length of output: 4257

tailscale/rootfs/usr/bin/unprotect-subnet-routes (1)

Line range hint 15-24: Fix the sed pattern and verify route removal behavior.

The sed pattern uses \d+ which is not supported in basic regex. Consider using [0-9]+ instead.

Apply this fix:

-    | sed -nr 's/^\d+:\s+from all to ([^\s]+) lookup main$/\1/p')
+    | sed -nr 's/^[0-9]+:\s+from all to ([^\s]+) lookup main$/\1/p')
✅ Verification successful

The test output shows that the suggested pattern [0-9]+ works correctly as it successfully extracted "192.168.1.0/24" from the test input. While we couldn't verify the actual IP rules due to the ip command not being available in the sandbox environment, this is sufficient to verify the sed pattern functionality.

Use [0-9]+ instead of \d+ in the sed pattern

The current pattern using \d+ in the sed expression is non-standard and may fail on some systems. The suggested replacement using [0-9]+ is verified to work correctly for extracting subnet routes from IP rules.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Verify the sed pattern works as expected and check for any existing IP rules

# Test the sed pattern with sample input
echo "5000: from all to 192.168.1.0/24 lookup main" | \
  sed -nr 's/^[0-9]+:\s+from all to ([^\s]+) lookup main$/\1/p'

# Check current IP rules for similar patterns
ip -4 rule list
ip -6 rule list

Length of output: 309

tailscale/rootfs/usr/bin/subnet-routes (2)

18-18: LGTM! Good defensive programming practice.

The change from $1 to ${1-} improves robustness by properly handling unset parameters, making the script more resilient to incorrect usage.


Line range hint 23-24: Consider cache invalidation for network state changes.

Given that this PR aims to fix subnet protection during network state changes (interface down/up, DHCP changes), the current caching mechanism might prevent routes from being updated when needed. Consider:

  1. Adding cache invalidation when network state changes are detected
  2. Reducing cache duration or adding a force refresh parameter

This would ensure subnet protection is reapplied correctly after network events.

Let's verify if the cache is properly handled in network state change scenarios:

tailscale/rootfs/usr/bin/protect-subnet-routes (4)

14-16: Well-structured configuration validation

Good practice to check both userspace_networking and accept_routes configurations before proceeding. The condition is clear and maintainable.


36-43: Excellent error handling and documentation

The warning messages and comments about potential supervisor issues are well documented and helpful for troubleshooting. Good practice to explain the rationale behind keeping the warning message.


50-59: Well-implemented route addition with proper error handling

The implementation properly handles both successful additions and the case where routes already exist. Good practice to log the action and handle "File exists" separately from other errors.


7-7: Verify the IP rule priority value selection

The chosen priority value of 5000 needs verification to ensure it doesn't conflict with other system rules and properly overrides Tailscale's routing.

✅ Verification successful

Priority value 5000 is appropriate for the use case

The priority value of 5000 is correctly chosen for this context. The script adds IP rules with higher priority than Tailscale's routing to protect local subnet routes from being overridden when the same subnet is routed within the tailnet. The script:

  • Only runs when userspace networking is disabled and route acceptance is enabled
  • Adds rules for both IPv4 and IPv6 routes
  • Includes proper error handling for existing rules
  • Uses this priority to ensure local subnet routing takes precedence over Tailscale's routing

The implementation is sound and the priority value serves its intended purpose of protecting local subnet routes.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check existing IP rule priorities to ensure no conflicts
echo "Current IPv4 rules:"
ip -4 rule show

echo -e "\nCurrent IPv6 rules:"
ip -6 rule show

Length of output: 279


Script:

#!/bin/bash
# Let's check the script to understand the context of this priority value
cat tailscale/rootfs/usr/bin/protect-subnet-routes

Length of output: 2679

tailscale/rootfs/command/with-contenv-merge Show resolved Hide resolved
@frenck frenck added the bugfix Inconsistencies or issues which will cause a problem for users or implementors. label Nov 15, 2024
@frenck frenck merged commit 6810dd3 into hassio-addons:main Nov 15, 2024
11 of 12 checks passed
@lmagyar lmagyar deleted the pr-fix-subnet-protection branch November 15, 2024 19:13
@github-actions github-actions bot locked and limited conversation to collaborators Nov 17, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bugfix Inconsistencies or issues which will cause a problem for users or implementors.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants