DBX-Patch Architecture
Overview
DBX-Patch uses a class-based architecture with a unified interface for all patches. This document describes the design patterns, base classes, and implementation details.
Core Design Patterns
1. Singleton Pattern
All patch classes use the singleton pattern to ensure:
- Only one instance exists per patch type
- Consistent global state across the application
- Thread-safe initialization in Databricks notebook environments
Implementation:
from dbx_patch.base_patch import SingletonMeta
class SingletonMeta(ABCMeta):
"""Thread-safe singleton metaclass."""
_instances: dict[type, Any] = {}
_lock: threading.Lock = threading.Lock()
def __call__(cls, *args, **kwargs):
if cls not in cls._instances:
with cls._lock:
# Double-check locking pattern
if cls not in cls._instances:
instance = super().__call__(*args, **kwargs)
cls._instances[cls] = instance
return cls._instances[cls]
Usage:
# These always return the same instance
patch1 = SysPathInitPatch()
patch2 = SysPathInitPatch()
assert patch1 is patch2 # True
2. Abstract Base Classes (ABC)
All patches inherit from one of two abstract base classes:
BasePatch: For patches that modify runtime behaviorBaseVerification: For verification-only checks
This enforces a consistent interface across all patches.
Base Classes
BasePatch
Located in src/dbx_patch/base_patch.py, this is the foundation for all runtime patches.
Interface:
from abc import abstractmethod
from dbx_patch.base_patch import BasePatch
from dbx_patch.models import PatchResult
class BasePatch(metaclass=SingletonMeta):
"""Abstract base class for all Databricks runtime patches."""
@abstractmethod
def patch(self) -> PatchResult:
"""Apply the patch to the Databricks runtime."""
...
@abstractmethod
def remove(self) -> bool:
"""Remove the patch and restore original behavior."""
...
@abstractmethod
def is_applied(self) -> bool:
"""Check if the patch is currently applied."""
...
# Optional methods (can be overridden)
def refresh_paths(self) -> int:
"""Refresh cached editable install paths."""
...
def get_editable_paths(self) -> set[str]:
"""Get current cached editable paths."""
...
Shared State:
All BasePatch instances maintain:
_is_applied: bool- Tracks whether patch is active_original_target: Any- Reference to original function/method for restoration_cached_editable_paths: set[str]- Cached editable install paths_verbose: bool- Logging verbosity flag_logger: Any- Lazy-initialized logger instance
Helper Methods:
def _get_logger(self) -> Any:
"""Get lazily-initialized logger instance."""
...
def _detect_editable_paths(self) -> set[str]:
"""Detect editable install paths from pth_processor."""
...
BaseVerification
For patches that only verify compatibility without modifying behavior.
Interface:
from dbx_patch.base_patch import BaseVerification
class BaseVerification(metaclass=SingletonMeta):
"""Abstract base class for verification-only patches."""
@abstractmethod
def verify(self) -> PatchResult:
"""Verify compatibility with Databricks runtime."""
...
@abstractmethod
def is_verified(self) -> bool:
"""Check if verification has been performed."""
...
Patch Implementations
Active Patches (BasePatch)
1. SysPathInitPatch
Location: src/dbx_patch/patches/sys_path_init_patch.py
Purpose: Patches sys_path_init.patch_sys_path_with_developer_paths() to automatically process .pth files.
Key Methods:
from dbx_patch.patches.sys_path_init_patch import SysPathInitPatch
patch = SysPathInitPatch()
result = patch.patch() # Apply patch
is_active = patch.is_applied() # Check status
success = patch.remove() # Remove patch
2. WsfsImportHookPatch
Location: src/dbx_patch/patches/wsfs_import_hook_patch.py
Purpose: Patches workspace import machinery to allow imports from editable install paths. Supports both legacy (DBR < 18.0) and modern (DBR >= 18.0) runtimes.
Key Methods:
from dbx_patch.patches.wsfs_import_hook_patch import WsfsImportHookPatch
patch = WsfsImportHookPatch()
result = patch.patch() # Apply patch
paths = patch.get_editable_paths() # Get allowed paths
count = patch.refresh_paths() # Refresh after new installs
success = patch.remove() # Remove patch
3. PythonPathHookPatch
Location: src/dbx_patch/patches/python_path_hook_patch.py
Purpose: Patches PythonPathHook._handle_sys_path_maybe_updated() to preserve editable install paths during sys.path updates.
Key Methods:
from dbx_patch.patches.python_path_hook_patch import PythonPathHookPatch
patch = PythonPathHookPatch()
result = patch.patch() # Apply patch
paths = patch.get_editable_paths() # Get preserved paths
count = patch.refresh_paths() # Refresh cache
success = patch.remove() # Remove patch
4. AutoreloadHookPatch
Location: src/dbx_patch/patches/autoreload_hook_patch.py
Purpose: Registers editable path check in autoreload discoverability hook's allowlist.
Key Methods:
from dbx_patch.patches.autoreload_hook_patch import AutoreloadHookPatch
patch = AutoreloadHookPatch()
result = patch.patch() # Apply patch
is_active = patch.is_applied() # Check status
success = patch.remove() # Remove patch
Verification Patches (BaseVerification)
1. PostImportHookVerification
Location: src/dbx_patch/patches/post_import_hook_verify.py
Purpose: Verifies that PostImportHook.ImportHookFinder doesn't interfere with editable imports.
Key Methods:
from dbx_patch.patches.post_import_hook_verify import PostImportHookVerification
verify = PostImportHookVerification()
result = verify.verify() # Perform verification
is_done = verify.is_verified() # Check if verified
2. WsfsPathFinderVerification
Location: src/dbx_patch/patches/wsfs_path_finder_patch.py
Purpose: Verifies workspace path finder compatibility. Supports both legacy and modern runtimes.
Key Methods:
from dbx_patch.patches.wsfs_path_finder_patch import WsfsPathFinderVerification
verify = WsfsPathFinderVerification()
result = verify.verify() # Perform verification
is_done = verify.is_verified() # Check if verified
Return Types
PatchResult
All patch() and verify() methods return a PatchResult dataclass:
from dataclasses import dataclass
@dataclass
class PatchResult:
"""Result of a patch operation."""
success: bool
already_patched: bool = False
function_found: bool = True
hook_found: bool = True
editable_paths_count: int = 0
editable_paths: set[str] = field(default_factory=set)
error: str | None = None
Usage:
result = SysPathInitPatch().patch()
if result.success:
print(f"✅ Patch applied successfully")
print(f"Found {result.editable_paths_count} editable paths")
elif result.already_patched:
print("⚠️ Patch already applied")
else:
print(f"❌ Patch failed: {result.error}")
High-Level API
patch_dbx()
The main entry point that applies all patches in the correct order:
from dbx_patch import patch_dbx
result = patch_dbx(force_refresh=False, verbose=True)
Implementation:
def patch_dbx(force_refresh: bool = False, verbose: bool = True) -> PatchAllResult:
"""Apply all patches to enable editable installs.
Args:
force_refresh: Force refresh of editable paths cache
verbose: Enable verbose logging
Returns:
PatchAllResult with operation details
"""
from dbx_patch.patches.sys_path_init_patch import SysPathInitPatch
from dbx_patch.patches.wsfs_import_hook_patch import WsfsImportHookPatch
from dbx_patch.patches.python_path_hook_patch import PythonPathHookPatch
from dbx_patch.patches.autoreload_hook_patch import AutoreloadHookPatch
# Step 1: Process .pth files
process_all_pth_files(force=force_refresh, verbose=verbose)
# Step 2: Apply patches
sys_path_result = SysPathInitPatch().patch()
wsfs_result = WsfsImportHookPatch().patch()
path_hook_result = PythonPathHookPatch().patch()
autoreload_result = AutoreloadHookPatch().patch()
# Step 3: Verify compatibility
wsfs_finder_result = WsfsPathFinderVerification().verify()
post_import_result = PostImportHookVerification().verify()
return PatchAllResult(...)
remove_all_patches()
Removes all applied patches:
from dbx_patch import remove_all_patches
result = remove_all_patches()
Implementation:
def remove_all_patches() -> RemovePatchesResult:
"""Remove all applied patches and restore original behavior."""
from dbx_patch.patches.sys_path_init_patch import SysPathInitPatch
from dbx_patch.patches.wsfs_import_hook_patch import WsfsImportHookPatch
from dbx_patch.patches.python_path_hook_patch import PythonPathHookPatch
from dbx_patch.patches.autoreload_hook_patch import AutoreloadHookPatch
sys_path_result = SysPathInitPatch().remove()
wsfs_result = WsfsImportHookPatch().remove()
path_hook_result = PythonPathHookPatch().remove()
autoreload_result = AutoreloadHookPatch().remove()
return RemovePatchesResult(...)
Testing
Unit Tests
Each patch class can be tested independently:
def test_sys_path_init_patch():
from dbx_patch.patches.sys_path_init_patch import SysPathInitPatch
patch = SysPathInitPatch()
# Test patch application
result = patch.patch()
assert result.success or not result.function_found # OK if not in Databricks
# Test status check
is_applied = patch.is_applied()
assert isinstance(is_applied, bool)
# Test removal
if result.success:
assert patch.remove() == True
assert patch.is_applied() == False
Integration Tests
Test the full patch workflow:
def test_full_patch_workflow():
from dbx_patch import patch_dbx, remove_all_patches
# Apply all patches
result = patch_dbx()
# Verify results
assert result.sys_path_init_patched or not result.in_databricks
# Remove patches
remove_result = remove_all_patches()
assert isinstance(remove_result, RemovePatchesResult)
Migration from Old API
Old Function-Based API
# ❌ Old API (no longer supported)
from dbx_patch.patches.sys_path_init_patch import patch_sys_path_init, unpatch_sys_path_init, is_patched
result = patch_sys_path_init(verbose=True)
status = is_patched()
success = unpatch_sys_path_init()
New Class-Based API
# ✅ New API
from dbx_patch.patches.sys_path_init_patch import SysPathInitPatch
patch = SysPathInitPatch()
result = patch.patch()
status = patch.is_applied()
success = patch.remove()
Best Practices
1. Use High-Level API
Prefer patch_dbx() over individual patches:
# ✅ Recommended
from dbx_patch import patch_dbx
patch_dbx()
# ⚠️ Only use if you need specific patches
from dbx_patch.patches.wsfs_import_hook_patch import WsfsImportHookPatch
WsfsImportHookPatch().patch()
2. Check Status Before Patching
patch = SysPathInitPatch()
if not patch.is_applied():
result = patch.patch()
else:
print("Already patched")
3. Refresh After Installing New Packages
from dbx_patch.patches.wsfs_import_hook_patch import WsfsImportHookPatch
# Install new editable package
%pip install -e /path/to/package
# Refresh cached paths
patch = WsfsImportHookPatch()
patch.refresh_paths()
4. Error Handling
result = SysPathInitPatch().patch()
if not result.success:
if not result.function_found:
print("Not in Databricks environment")
else:
print(f"Error: {result.error}")
Thread Safety
All patch classes use thread-safe singleton initialization:
from concurrent.futures import ThreadPoolExecutor
def patch_in_thread():
return SysPathInitPatch()
# All threads get the same instance
with ThreadPoolExecutor(max_workers=10) as executor:
instances = list(executor.map(lambda x: patch_in_thread(), range(10)))
assert all(inst is instances[0] for inst in instances)
Debugging
Enable debug logging:
import os
os.environ['DBX_PATCH_DEBUG'] = '1'
from dbx_patch import patch_dbx
patch_dbx(verbose=True)
Inspect patch state:
from dbx_patch.patches.wsfs_import_hook_patch import WsfsImportHookPatch
patch = WsfsImportHookPatch()
print(f"Applied: {patch.is_applied()}")
print(f"Editable paths: {patch.get_editable_paths()}")