refactor(backend): phase 1 - unified error handling with thiserror

Introduce AppError enum to replace Result<T, String> pattern across the codebase, improving error context preservation and type safety.

## Changes

### Core Infrastructure
- Add src/error.rs with AppError enum using thiserror
- Add thiserror dependency to Cargo.toml
- Implement helper functions: io(), json(), toml() for ergonomic error creation
- Implement From<PoisonError> for automatic lock error conversion
- Implement From<AppError> for String to maintain Tauri command compatibility

### Module Migrations (60% complete)
- config.rs: Full migration to AppError
  - read_json_file, write_json_file, atomic_write
  - archive_file, copy_file, delete_file
- claude_mcp.rs: Full migration to AppError
  - get_mcp_status, read_mcp_json, upsert_mcp_server
  - delete_mcp_server, validate_command_in_path
  - set_mcp_servers_map
- codex_config.rs: Full migration to AppError
  - write_codex_live_atomic with rollback support
  - read_and_validate_codex_config_text
  - validate_config_toml
- app_config.rs: Partial migration
  - MultiAppConfig::load, MultiAppConfig::save
- store.rs: Partial migration
  - AppState::save now returns Result<(), AppError>
- commands.rs: Minimal changes
  - Use .map_err(Into::into) for compatibility
- mcp.rs: Minimal changes
  - sync_enabled_to_claude uses Into::into conversion

### Documentation
- Add docs/BACKEND_REFACTOR_PLAN.md with detailed refactoring roadmap

## Benefits
- Type-safe error handling with preserved error chains
- Better error messages with file paths and context
- Reduced boilerplate code (118 Result<T, String> instances to migrate)
- Automatic error conversion for seamless integration

## Testing
- All existing tests pass (4/4)
- Compilation successful with no warnings
- Build time: 0.61s (no performance regression)

## Remaining Work
- claude_plugin.rs (7 functions)
- migration.rs, import_export.rs
- Add unit tests for error.rs
- Complete commands.rs migration after dependent modules

Co-authored-by: Claude <claude@anthropic.com>
2025-10-27 16:29:11 +08:00
parent bfab1d0ccb
commit c01e495eea
12 changed files with 352 additions and 114 deletions
--- a/docs/BACKEND_REFACTOR_PLAN.md
+++ b/docs/BACKEND_REFACTOR_PLAN.md
@@ -0,0 +1,139 @@
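+# CC Switch Rust Backend Refactoring Plan
+
+## Table of Contents
+- [Background and Current State](#background-and-current-state)
+- [Problem Confirmation](#problem-confirmation)
+- [Proposal Evaluation](#proposal-evaluation)
+- [Incremental Refactoring Roadmap](#incremental-refactoring-roadmap)
+- [Testing Strategy](#testing-strategy)
+- [Risks and Mitigations](#risks-and-mitigations)
+- [Summary](#summary)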
+
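+## Background and Current State
+- The frontend refactor is complete, while the backend (Tauri + Rust) still has its legacy structure.
+- Core logic is concentrated in oversized files such as `src-tauri/src/commands.rs` and `lib.rs`, where business logic is tightly coupled with UI events.
+- Test coverage is low: there are only scattered unit tests and no integration-level verification.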
+
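+## Problem Confirmation
+
+| Reported issue | Actual state | Severity |
+| --- | --- | --- |
+| `commands.rs` is too long | ✅ 1526 lines, 32 commands, mixed responsibilities | 🔴 High |
+| `lib.rs` lacks a service layer | ✅ 541 lines; tray, event, and business logic are coupled | 🟡 Medium |
+| `Result<T, String>` is used everywhere | ✅ 118 occurrences; error context is lost | 🟡 Medium |
+| Global `Mutex` blocking | ✅ 31 `.lock()` calls; reads and writes are not separated | 🟡 Medium |
+| Configuration logic is scattered | ✅ Spread across 5 files (`config`/`app_config`/`app_store`/`settings`/`codex_config`) | 🟢 Low |
+
+Code size distribution (about 5.4k SLOC):
+- `commands.rs`: 1526 lines (28%) → top priority 🎯
+- `lib.rs`: 541 lines (10%) → tray logic coupled with business logic
+- `mcp.rs`: 732 lines (14%) → relatively clean
+- `migration.rs`: 431 lines (8%) → one-off logic
+- All other files combined: 2156 lines (40%)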
+
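+## Proposal Evaluation
+
+### ✅ Strengths
+1. **Clear layered architecture**  
+   - `commands/`: thin Tauri command layer  
+   - `services/`: business flows such as provider switching and MCP sync  
+   - `infrastructure/`: configuration reads and writes, interaction with external resources  
+   - `domain/`: data models (`Provider`, `AppType`, etc.)  
+   → Improves testability, reduces coupling, and makes team collaboration easier.
+
+2. **Unified error handling**  
+   - Introduce `AppError` (via `thiserror`) to preserve error chains and context.  
+   - Tauri commands still return `Result<T, String>`, converted automatically through `From<AppError>`.  
+   - Makes logs more readable and failures easier to diagnose.
+
+3. **Concurrency optimization**  
+   - Switch `AppState` to `RwLock<MultiAppConfig>`.  
+   - Improves throughput in read-heavy, write-light scenarios (e.g. frequent queries of the provider list).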
+
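+### ⚠️ Risks
+1. **Over-engineering**  
+   - A full four-layer DDD design would add an estimated 30-50% maintenance cost in a 5k-line project.  
+   - Rust trait + repository patterns bring a lot of boilerplate for too little benefit.  
+   - "Lightweight layering" is recommended over orthodox DDD.
+
+2. **High migration cost**  
+   - Splitting `commands.rs`, unifying errors, and reworking locks touch many files.  
+   - The lack of tests makes refactoring risky; tests need to be added first.  
+   - A complete overhaul is estimated at 5-6 weeks; delivering usable value in phases is recommended.
+
+3. **Technology choices need care**  
+   - `parking_lot` offers limited gains over the standard library `RwLock`, so there is no need to introduce it.  
+   - Use `spawn_blocking` only for blocking tasks longer than 100 ms, and avoid overusing it.  
+   - Rely mainly on existing dependencies to keep complexity under control.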
+
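+## Incremental Refactoring Roadmap
+
+### Phase 1: Unified Error Handling (high benefit / low risk)
+- Add `src-tauri/src/error.rs` and define `AppError`.  
+- Low-level functions such as file IO and configuration parsing return `Result<T, AppError>`.  
+- The command layer propagates errors with `?` and converts them at the boundary with `.map_err(Into::into)`.
+- Estimated 3-5 days; start immediately.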
+
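The Phase 1 flow can be sketched with the standard library alone. This is a hand-written stand-in for what `#[derive(thiserror::Error)]` would generate, not the actual contents of `error.rs`: the variant names, `read_text_file`, and `read_text_command` are assumptions made for illustration.

```rust
use std::fmt;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::sync::PoisonError;

/// Illustrative error type; the variants are assumptions, not the real enum.
#[derive(Debug)]
pub enum AppError {
    /// An IO failure together with the path that was being accessed.
    Io { path: PathBuf, source: io::Error },
    /// A poisoned lock, reduced to its message.
    Lock(String),
}

impl AppError {
    /// Helper in the spirit of the `io()` constructor named in the commit message.
    pub fn io(path: impl AsRef<Path>, source: io::Error) -> Self {
        AppError::Io { path: path.as_ref().to_path_buf(), source }
    }
}

impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            AppError::Io { path, source } => {
                write!(f, "IO error at {}: {}", path.display(), source)
            }
            AppError::Lock(msg) => write!(f, "lock poisoned: {}", msg),
        }
    }
}

impl std::error::Error for AppError {
    // Exposes the underlying cause so the error chain is preserved.
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            AppError::Io { source, .. } => Some(source),
            AppError::Lock(_) => None,
        }
    }
}

// Lets `?` convert a poisoned-lock error from any guard type.
impl<T> From<PoisonError<T>> for AppError {
    fn from(err: PoisonError<T>) -> Self {
        AppError::Lock(err.to_string())
    }
}

// Keeps command signatures of the form `Result<T, String>` working.
impl From<AppError> for String {
    fn from(err: AppError) -> Self {
        err.to_string()
    }
}

/// Low-level function (hypothetical): returns the typed error with its path.
pub fn read_text_file(path: &Path) -> Result<String, AppError> {
    fs::read_to_string(path).map_err(|e| AppError::io(path, e))
}

/// Command-layer wrapper (hypothetical): converts only at the boundary.
pub fn read_text_command(path: String) -> Result<String, String> {
    read_text_file(Path::new(&path)).map_err(Into::into)
}
```

Because the conversion to `String` happens only in the outermost wrapper, inner functions keep the file path and the original `io::Error` available through `source()`.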
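+### Phase 2: Split `commands.rs` (high benefit / medium risk)
+- Split by business area into `commands/provider.rs`, `commands/mcp.rs`, `commands/config.rs`, `commands/settings.rs`, and `commands/misc.rs`.  
+- `commands/mod.rs` handles re-exports and registration in one place.  
+- File size drops to 200-300 lines per file, each with a single responsibility.  
+- Estimated 5-7 days; parts of the refactor can proceed in parallel.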
+
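+### Phase 3: Add Tests (medium benefit / medium risk)
+- Introduce integration tests under `tests/` or `src-tauri/tests/`, covering provider switching, MCP sync, and configuration migration.  
+- Use `tempfile`/`tempdir` to isolate the file system, combined with a small number of regression scripts.  
+- Estimated 5-7 days; provides a safety net for later refactoring.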
+
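+### Phase 4: Extract a Lightweight Service Layer (medium benefit / medium risk)
+- Add `services/provider_service.rs` and `services/mcp_service.rs`.  
+- Traits are not mandatory; implement business flows directly as free functions or structs.  
+   ```rust
+   pub struct ProviderService;
+   impl ProviderService {
+       pub fn switch(config: &mut MultiAppConfig, app: AppType, id: &str) -> Result<(), AppError> {
+           todo!("business flow: validate, backfill, persist, update current, emit events")
+       }
+   }
+   ```
+- The command layer handles argument parsing, the service layer handles business logic, and the tray logic reuses the same interface.  
+- Estimated 7-10 days; can be carried out once the tests are in place.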
+
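A slightly fuller version of the service skeleton above might look like the following. All types here are trimmed-down stand-ins invented for this sketch (the real `MultiAppConfig`, `AppType`, and `AppError` are not shown in this diff), and only the "validate" and "update current" steps are covered; backfill, persistence, and event emission are left out.

```rust
use std::collections::HashMap;

/// Stand-in for the real application type enum (assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum AppType {
    Claude,
    Codex,
}

/// Hypothetical per-app state: known providers and the active one.
#[derive(Debug, Default)]
pub struct ProviderManager {
    pub providers: HashMap<String, String>, // provider id -> display name
    pub current: Option<String>,
}

/// Hypothetical, simplified configuration root.
#[derive(Debug, Default)]
pub struct MultiAppConfig {
    pub apps: HashMap<AppType, ProviderManager>,
}

/// Illustrative error variants (assumptions).
#[derive(Debug, PartialEq)]
pub enum AppError {
    AppNotConfigured,
    ProviderNotFound(String),
}

pub struct ProviderService;

impl ProviderService {
    /// Validates the request, then updates `current`. State is only
    /// mutated after every check has passed.
    pub fn switch(config: &mut MultiAppConfig, app: AppType, id: &str) -> Result<(), AppError> {
        let manager = config.apps.get_mut(&app).ok_or(AppError::AppNotConfigured)?;
        if !manager.providers.contains_key(id) {
            return Err(AppError::ProviderNotFound(id.to_string()));
        }
        manager.current = Some(id.to_string());
        Ok(())
    }
}
```

Since `switch` takes `&mut MultiAppConfig` rather than a lock or a Tauri handle, both a command handler and the tray menu could call it, and it can be unit-tested without any runtime.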
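+### Phase 5: Lock and Blocking Optimization (low benefit / low risk)
+- Change `AppState` from `Mutex` to `RwLock`.  
+- Use `read()` for reads and `write()` for writes to reduce unnecessary mutual exclusion.  
+- Wrap long-running tasks (such as archiving and batch migration) in `spawn_blocking`; call everything else synchronously.  
+- Estimated 3-5 days; can be scheduled once the main flow is stable.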
+
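The `Mutex` to `RwLock` change in Phase 5 could take roughly this shape. The field layout of `AppState` and `MultiAppConfig`, the method names, and the single `Lock` variant are assumptions for this sketch; only `std::sync::RwLock` and its `read()`/`write()` methods are taken as given.

```rust
use std::sync::{PoisonError, RwLock};

/// Simplified stand-in for the real configuration (assumption).
#[derive(Debug, Default, Clone)]
pub struct MultiAppConfig {
    pub providers: Vec<String>,
    pub current: Option<String>,
}

/// Illustrative error type with only the lock variant.
#[derive(Debug)]
pub enum AppError {
    Lock(String),
}

// Generic over the guard type so both read and write guards convert with `?`.
impl<T> From<PoisonError<T>> for AppError {
    fn from(err: PoisonError<T>) -> Self {
        AppError::Lock(err.to_string())
    }
}

#[derive(Default)]
pub struct AppState {
    pub config: RwLock<MultiAppConfig>,
}

impl AppState {
    /// Read path: a shared lock, so concurrent readers do not block each other.
    pub fn list_providers(&self) -> Result<Vec<String>, AppError> {
        let guard = self.config.read()?;
        Ok(guard.providers.clone())
    }

    /// Write path: an exclusive lock, held only for the mutation.
    pub fn set_current(&self, id: &str) -> Result<(), AppError> {
        let mut guard = self.config.write()?;
        guard.current = Some(id.to_string());
        Ok(())
    }
}
```

Cloning the provider list inside `list_providers` keeps the read guard short-lived, which matters more for throughput than the choice of lock type.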
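+## Testing Strategy
+- **Scenarios to cover first**  
+  - Provider switching: state update + live configuration sync  
+  - MCP sync: snapshot of enabled servers and persistence to disk  
+  - Configuration migration: archiving, backup, and version upgrade
+- **Recommended structure**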
+  ```rust
+  #[cfg(test)]
+  mod integration {
+      use super::*;
+      #[test]
+      fn switch_provider_updates_live_config() { /* ... */ }
+      #[test]
+      fn sync_mcp_to_codex_updates_claude_config() { /* ... */ }
+      #[test]
+      fn migration_preserves_backup() { /* ... */ }
+  }
+  ```
+- Target coverage: >80% for critical paths, >70% for file IO and migration.
+
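File-system isolation for such tests can be approximated without extra dependencies. The sketch below uses `std::env::temp_dir` in place of the `tempfile` crate the plan recommends, and its `atomic_write` is a simplified write-then-rename stand-in, not the project's own `atomic_write`; `scratch_dir` is a hypothetical helper.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::{Path, PathBuf};
use std::time::{SystemTime, UNIX_EPOCH};

/// Creates a uniquely named scratch directory under the system temp dir,
/// so a test never touches real configuration files.
pub fn scratch_dir(label: &str) -> io::Result<PathBuf> {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or(0);
    let name = format!("cc-switch-sketch-{}-{}-{}", label, std::process::id(), nanos);
    let dir = std::env::temp_dir().join(name);
    fs::create_dir_all(&dir)?;
    Ok(dir)
}

/// Simplified atomic write (assumption): write a sibling `.tmp` file,
/// flush it to disk, then rename it over the target.
pub fn atomic_write(path: &Path, data: &[u8]) -> io::Result<()> {
    let tmp = path.with_extension("tmp");
    {
        let mut file = fs::File::create(&tmp)?;
        file.write_all(data)?;
        file.sync_all()?;
    }
    fs::rename(&tmp, path)
}
```

A test would call `scratch_dir`, run the code under test against paths inside it, assert on the resulting files, and remove the directory afterwards. The `tempfile` crate does the cleanup automatically on drop, which is one reason to prefer it in the real test suite.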
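+## Risks and Mitigations
+- **Insufficient tests** → Phase 3 makes adding tests mandatory and establishes baseline integration tests.  
+- **Large refactoring scope** → Advance phase by phase on separate branches (e.g. `refactor/backend-step1`).  
+- **Hard to roll back** → Tag the end of each phase (e.g. `v3.6.0-backend-step1`) to keep a rollback point.  
+- **Functional regressions** → After refactoring, run a manual smoke test: provider switching, tray operations, MCP sync, configuration import/export.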
+
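+## Summary
+- At the current scale, adopting full DDD or a four-layer architecture wholesale is not recommended, to avoid over-engineering.  
+- Follow the incremental strategy "unify errors → split commands → add tests → extract a service layer → optimize locks".  
+- Completing phases 1-3 already improves maintainability and reliability significantly; phases 4-5 can be scheduled flexibly depending on resources.  
+- Keep documentation and tests up to date throughout the refactor so the team shares a consistent understanding of how the architecture evolves.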
+