feat: finalize search UX and release hardening

This commit is contained in:
stoorps 2026-03-21 16:53:33 +00:00
parent c63b2917da
commit 34f9543a78
Signed by: stoorps
SSH key fingerprint: SHA256:AZlPfu9hTu042EGtZElmDQoy+KvMOeShLDan/fYLoNI
44 changed files with 4983 additions and 94 deletions

View file

@ -0,0 +1,498 @@
# Repository Audit: aim (AppImage Manager)
**Timestamp:** 2026-03-21T02:02:28Z (UTC)
**Commit:** f260790d915e350879838d027db8d602311d4437
**Branch:** main
**Repo Path:** /home/stoorps/repos/aim
**Audit Scope:** Full repository — product completeness, provider support, registry operations, docs-vs-code mismatches, missing tests, v1.0 readiness
---
## Executive Summary
The `aim` AppImage manager has a **solid architectural foundation** with clean domain separation, good test coverage for GitHub flows, and working install/update/remove operations. However, **critical gaps exist** that block production v1.0 readiness:
- **Download reliability is insufficient** — entire files loaded into memory, no resume/retry, no timeouts, risking OOM on 500MB+ AppImages
- **Registry safety is weak** — no atomic writes, no backups, no corruption recovery, vulnerable to concurrent access
- **Documentation is stale** — README omits SourceForge, zsync, custom-json providers already in code
- **Missing product features** — no search, no version pinning, no dry-run, no info command, no rollback
- **Checksum verification exists but is unused** — metadata contains SHA-512 checksums that are parsed but never validated during downloads
- **CustomJsonAdapter is a non-functional stub** — declared but all methods return errors
**v1.0 Readiness Verdict:** **Not ready.** Core reliability gaps (download safety, registry atomicity, checksum validation) must be closed before production use. Feature gaps (search, info, dry-run) are acceptable for v1.0 but limit user expectations.
---
## Surface Map
### Entry Points
- **CLI:** `aim <query>`, `aim`, `aim update`, `aim list`, `aim remove <query>`
- **Scope Overrides:** `--user`, `--system`
- **Environment Variables:** `AIM_REGISTRY_PATH`, `AIM_GITHUB_TOKEN`, `GITHUB_TOKEN`, `AIM_GITHUB_API_BASE`, `HOME`
### Public Interfaces
- **Supported Query Forms:**
- GitHub shorthand: `owner/repo`
- GitHub URLs: repository, release, asset
- GitLab URLs: repository, release-like, ambiguous candidates
- SourceForge URLs: project, file download paths
- Direct HTTPS/HTTP URLs
- Local file imports: `file://...`
### Implemented Providers
- **Repository-backed:** GitHub, GitLab, SourceForge
- **Exact-resolution:** DirectUrl, File
- **Metadata-only:** Zsync (update channels), Electron Builder (checksums)
- **Stub:** CustomJson (all operations return errors)
### State & Data Layers
- **Registry:** TOML file at `~/.local/share/aim/registry.toml`, stores installed apps with source, version, update strategy
- **Install Paths:**
- User scope: `~/.local/lib/aim/appimages`, `~/.local/share/applications`, `~/.local/share/icons`
- System scope: `/opt/aim/appimages`, `/usr/share/applications`, `/usr/share/icons`
- **Staging:** `.local/share/aim/staging` for download-before-commit
### Cross-Cutting Concerns
- **Platform Detection:** Distro family (Debian, RedHat, Arch, Immutable, Nix, Alpine, Other), desktop session presence
- **Error Handling:** Distinct unsupported-query vs no-installable-artifact vs transport-failure errors
- **Progress Reporting:** Live spinners, byte progress, staged events (resolveQuery, discoverRelease, downloadArtifact, etc.)
---
## Findings (Prioritised Backlog)
### GAP-001: Download Reliability — Memory Exhaustion Risk
**Type:** Reliability / Performance
**Severity:** **Critical**
**Impact:** **Production blocker.** Large AppImages (500MB+) are loaded entirely into memory before writing to disk. Risky on low-memory systems, can trigger OOM kills.
**Evidence:**
- [crates/aim-core/src/app/add.rs](crates/aim-core/src/app/add.rs#L506-L545): `download_artifact_bytes_with_reporter` allocates a `Vec<u8>`, reads 16KB chunks in a loop, then `extend_from_slice`. Full payload lives in heap before `stage_and_commit_payload` writes it.
- No streaming-to-disk, no memory pressure handling.
**Repro/Trace:**
1. Attempt `aim <large-appimage-url>` where payload is 1GB
2. Watch process RSS grow to 1GB+ before any file write
**Suggested Fix:**
- Stream directly to staged file path using `io::copy` or chunked writes with temp file + atomic rename
- Alternatively, use memory-mapped I/O for very large files
- Add download timeout via `reqwest::ClientBuilder::timeout`
**Acceptance Criteria:**
- [ ] Download writes directly to disk without intermediate full-buffer accumulation
- [ ] Memory usage stays constant regardless of AppImage size
- [ ] Configurable timeout (default 5 minutes)
---
### GAP-002: Download Reliability — No Resume, Retry, or Timeout
**Type:** Reliability
**Severity:** **High**
**Impact:** Network interruptions, slow connections, or server timeouts cause complete download restart. Poor UX for large files or unstable networks.
**Evidence:**
- [crates/aim-core/src/app/add.rs](crates/aim-core/src/app/add.rs#L506-L545): Single `reqwest::blocking::get(url)` call, no retry loop, no partial-download resume via Range headers
- No timeout configured on `reqwest::blocking::Client` in `ReqwestGitHubTransport::new()` — hangs indefinitely on stalled connections
**Suggested Fix:**
- Add retry logic (3 attempts with exponential backoff)
- Support HTTP Range requests for resume if server supports it (check `Accept-Ranges` header)
- Configure request timeout (30s connect, 5m total)
- Provide `--no-resume` flag to skip range logic if needed
**Acceptance Criteria:**
- [ ] Failed downloads retry up to 3 times
- [ ] Partial downloads resume from last byte if server supports Range
- [ ] Timeout after 5 minutes of inactivity or 30s connect timeout
---
### GAP-003: Checksum Verification Not Used
**Type:** Security / Reliability
**Severity:** **High**
**Impact:** **Silent corruption or supply-chain attacks.** Metadata includes SHA-512 checksums (electron-builder) but downloads are never verified. Corrupted or tampered payloads install successfully.
**Evidence:**
- [crates/aim-core/src/metadata/electron_builder.rs](crates/aim-core/src/metadata/electron_builder.rs#L8): Parses `sha512:` field from `latest-linux.yml`
- [crates/aim-core/src/domain/update.rs](crates/aim-core/src/domain/update.rs#L17): `MetadataHints` struct includes `checksum: Option<String>`
- No verification step in [crates/aim-core/src/app/add.rs](crates/aim-core/src/app/add.rs#L381) `download_artifact_bytes_with_reporter` or [crates/aim-core/src/integration/install.rs](crates/aim-core/src/integration/install.rs#L87-L115) `stage_and_commit_payload`
**Repro/Trace:**
1. Install app with known electron-builder metadata: `aim owner/repo`
2. Inject corrupt bytes into download stream (or MITM proxy)
3. Installation succeeds without checksum mismatch error
**Suggested Fix:**
- After download completes, compute SHA-512 of payload if `checksum` hint exists
- Fail installation with clear error if mismatch
- Add `--skip-checksum` flag for advanced users bypassing validation
**Acceptance Criteria:**
- [ ] SHA-512 checksums from metadata are verified post-download
- [ ] Mismatch triggers clear error with expected vs actual hash
- [ ] Installs without metadata checksums proceed with warning
---
### GAP-004: Registry Corruption — No Atomic Writes or Backups
**Type:** Reliability
**Severity:** **High**
**Impact:** Registry corruption if process killed mid-save. No recovery mechanism; complete data loss of installed apps list.
**Evidence:**
- [crates/aim-core/src/registry/store.rs](crates/aim-core/src/registry/store.rs#L26-L31): `RegistryStore::save` directly writes to `self.path` via `fs::write`. Not atomic (no temp file + rename).
- [crates/aim-core/src/registry/store.rs](crates/aim-core/src/registry/store.rs#L14-L21): `load` returns `toml::de::Error` on corrupt file, no fallback to backup
**Repro/Trace:**
1. Start `aim sharkdp/bat`
2. Kill process (`kill -9`) during registry save stage
3. Attempt `aim list` — registry file is truncated or corrupt, operation fails
**Suggested Fix:**
- Atomic save: write to `registry.toml.new`, then `fs::rename` to `registry.toml`
- Backup previous registry to `registry.toml.bak` before overwrite
- On load failure, attempt restore from `.bak`
- Add corruption detection (signature header) for faster failure
**Acceptance Criteria:**
- [ ] Registry saves are atomic (temp file + rename)
- [ ] Previous registry backed up before save
- [ ] Corrupt registry auto-restores from backup with warning
- [ ] Test: kill process mid-save, verify registry intact
---
### GAP-005: Concurrent Access — No Registry Locking
**Type:** Reliability
**Severity:** **Medium**
**Impact:** Running multiple `aim` commands simultaneously can corrupt registry (race on read-modify-write). Data loss or duplicate entries possible.
**Evidence:**
- [crates/aim-cli/src/lib.rs](crates/aim-cli/src/lib.rs#L34-L38): `dispatch_with_reporter` loads registry, mutates in-memory copy, saves back. No file lock.
- No `flock` or lock file mechanism in [crates/aim-core/src/registry/store.rs](crates/aim-core/src/registry/store.rs)
- Grep for `Mutex` found zero results in `crates/` — no concurrency control
**Repro/Trace:**
1. Terminal A: `aim sharkdp/bat` (slow network download)
2. Terminal B: `aim update` (starts while A is downloading)
3. Both save registry simultaneously, one clobbers the other
**Suggested Fix:**
- Advisory file lock on `registry.toml.lock` using `fs2` crate
- Acquire lock in `load()`, release in `save()`
- Fail fast with clear error if lock unavailable after 5s
**Acceptance Criteria:**
- [ ] Only one `aim` process can modify registry at a time
- [ ] Second process waits or fails cleanly with "registry locked" message
- [ ] Lock automatically released on crash (advisory lock semantics)
---
### GAP-006: Documentation — Missing Providers in README
**Type:** Documentation / DX
**Severity:** **Medium**
**Impact:** Users unaware of supported sources. SourceForge, zsync, custom-json implemented but undocumented.
**Evidence:**
- [README.md](README.md#L23-L30): Documents GitHub, GitLab, direct URLs, file imports
- [crates/aim-core/src/adapters/mod.rs](crates/aim-core/src/adapters/mod.rs#L12-L19): `all_adapter_kinds()` includes `sourceforge`, `zsync`, `custom-json`
- [crates/aim-core/src/source/input.rs](crates/aim-core/src/source/input.rs#L53-L56): `classify_sourceforge_http` actively classifies SourceForge URLs
- grep for "SourceForge" in README.md: **no matches**
**Suggested Fix:**
- Update README "Query Forms" section:
- Add SourceForge project URLs and file download URLs
- Note zsync as discovered metadata only (not install source)
- Document or remove custom-json (currently nonfunctional)
- Add examples: `aim https://sourceforge.net/projects/app/files/v1.0/App.AppImage/download`
**Acceptance Criteria:**
- [ ] README documents all functional source types
- [ ] SourceForge examples included
- [ ] Zsync role clarified (metadata vs source)
- [ ] Custom-json either documented as experimental or removed from public list
---
### GAP-007: CustomJsonAdapter Is Non-Functional Stub
**Type:** Missing Feature / Code Quality
**Severity:** **Medium**
**Impact:** Adapter listed in `all_adapter_kinds()` but completely unusable. Misleading.
**Evidence:**
- [crates/aim-core/src/adapters/custom_json.rs](crates/aim-core/src/adapters/custom_json.rs#L1-L24):
- `normalize()` returns `Err(AdapterError::UnsupportedQuery)`
- `resolve()` returns `Err(AdapterError::UnsupportedSource)`
- Zero tests for this adapter in `crates/aim-core/tests/`
**Suggested Fix:**
- **Option A (remove):** Delete adapter, remove from `all_adapter_kinds()`, document as future work
- **Option B (implement):** Define JSON schema for custom update metadata, implement resolution logic, add tests
- **Option C (mark experimental):** Add comment and feature flag, exclude from default builds
**Acceptance Criteria:**
- [ ] Either adapter works with documented schema, or removed from public API
- [ ] No adapter listed in `all_adapter_kinds()` that returns errors for all operations
---
### GAP-008: Missing CLI Commands — Search, Info, Show
**Type:** Missing Feature
**Severity:** **Medium**
**Impact:** Users must know exact owner/repo or URL beforehand. No discovery workflow.
**Evidence:**
- [crates/aim-cli/src/cli/args.rs](crates/aim-cli/src/cli/args.rs#L1-L29): Only defines `Remove`, `List`, `Update` subcommands
- No `aim search`, `aim info <query>`, or `aim show <stable-id>` commands
- GitHub releases API supports search, but no CLI exposure
**Suggested Fix:**
- `aim search <keyword>` — search GitHub for AppImages (requires GitHub API search endpoint or external index)
- `aim info <query>` or `aim show <stable-id>` — display app details pre-install (version, description, assets) or post-install (installed paths, source, update status)
**Acceptance Criteria:**
- [ ] `aim search` returns list of candidate apps with descriptions
- [ ] `aim info` shows app metadata without installing
- [ ] `aim show bat` displays installed app details (version, files, source)
---
### GAP-009: Missing CLI Features — Dry-Run, Force, Verbose
**Type:** Missing Feature
**Severity:** **Medium**
**Impact:** Limited user control over operations. No preview mode, no way to debug issues, no forced reinstall.
**Evidence:**
- [crates/aim-cli/src/cli/args.rs](crates/aim-cli/src/cli/args.rs): No `--dry-run`, `--force`, `--verbose`, `--no-desktop` flags
- All operations execute immediately without preview except update plan (`aim` bare shows pending updates but requires `aim update` to execute)
**Suggested Fix:**
- Add `--dry-run` / `-n`: Show what would be done without mutating state
- Add `--force` / `-f`: Reinstall even if already installed at same version
- Add `--verbose` / `-v`: Show detailed logs (API requests, file operations)
- Add `--no-desktop`: Skip desktop integration, install payload only
**Acceptance Criteria:**
- [ ] `aim --dry-run <query>` prints plan without installing
- [ ] `aim --force <query>` reinstalls existing app
- [ ] `aim --verbose list` shows debug output
- [ ] `aim --no-desktop <query>` installs without .desktop file
---
### GAP-010: No Version Pinning or Update Blocking
**Type:** Missing Feature
**Severity:** **Medium**
**Impact:** Users cannot pin to specific versions or exclude from updates. All apps track latest by default (unless installed from specific release URL).
**Evidence:**
- [crates/aim-core/src/domain/app.rs](crates/aim-core/src/domain/app.rs): `AppRecord` has `installed_version`, `update_strategy`, but no `pinned` or `update_policy` field
- `aim update` blindly updates all apps without user filtering
**Suggested Fix:**
- Add `aim pin <stable-id>` and `aim unpin <stable-id>` commands
- Store `pinned: bool` or `update_policy: Auto | Manual | Pinned` in `AppRecord`
- `aim update` skips pinned apps, logs "X apps pinned, skipping"
- Add `--ignore <stable-id>` flag to `aim update` for one-time exclusion
**Acceptance Criteria:**
- [ ] `aim pin bat` prevents `aim update` from updating bat
- [ ] `aim list` shows pinned status
- [ ] `aim unpin bat` re-enables updates
---
### GAP-011: No Rollback or Update Failure Recovery
**Type:** Reliability
**Severity:** **Medium**
**Impact:** Failed updates leave old binary deleted but registry updated, or vice versa. No way to revert to previous version.
**Evidence:**
- [crates/aim-core/src/app/update.rs](crates/aim-core/src/app/update.rs#L43-L75): `execute_updates_with_reporter` catches errors and records `UpdateExecutionStatus::Failed`, but old payload is already overwritten by [crates/aim-core/src/integration/install.rs](crates/aim-core/src/integration/install.rs#L102-L103) `fs::rename` steps
- No backup of old `.AppImage` file before update
- grep for `rollback` or `backup` found zero matches
**Repro/Trace:**
1. `aim sharkdp/bat` (install v1.0)
2. Tamper with network to force update download failure
3. `aim update` — old bat.AppImage replaced with corrupt download, registry shows v1.1 but binary is broken
**Suggested Fix:**
- Before update, rename `app.AppImage` to `app.AppImage.backup`
- On success, delete backup
- On failure, restore from backup and log rollback
- Add `aim rollback <stable-id>` command to restore previous version
**Acceptance Criteria:**
- [ ] Failed updates restore previous payload automatically
- [ ] Registry remains consistent with installed payload
- [ ] Manual rollback command available for user-initiated revert
---
### GAP-012: No Proxy Support or Network Configuration
**Type:** Missing Feature
**Severity:** **Low**
**Impact:** Corporate or privacy-conscious users behind proxies cannot use `aim`.
**Evidence:**
- [crates/aim-core/src/source/github.rs](crates/aim-core/src/source/github.rs#L158-L175): `ReqwestGitHubTransport::new()` uses default `reqwest::blocking::Client` builder, no proxy configuration
- grep for `proxy`, `http_proxy`, `https_proxy`: zero matches
**Suggested Fix:**
- Respect `HTTP_PROXY`, `HTTPS_PROXY`, `NO_PROXY` environment variables
- Use `reqwest::ClientBuilder::proxy()` to configure
- Add `--proxy <url>` CLI flag for override
**Acceptance Criteria:**
- [ ] `HTTP_PROXY=http://proxy:8080 aim <query>` uses proxy
- [ ] Corporate proxy users can install apps
---
### GAP-013: No Offline or Cache-First Mode
**Type:** Missing Feature
**Severity:** **Low**
**Impact:** Cannot use `aim` in air-gapped or intermittent-connectivity environments. Every operation requires network.
**Evidence:**
- All provider adapters make live HTTP requests, no local cache
**Suggested Fix:**
- Cache GitHub release metadata in `~/.cache/aim/metadata/` for 1 hour
- Add `--offline` flag to use cached data only
- Add `--refresh` to force cache invalidation
**Acceptance Criteria:**
- [ ] `aim --offline list` works without network if metadata cached
- [ ] Metadata cached for 1 hour, revalidated on next request
---
### GAP-014: No Help Text or Usage Examples
**Type:** Documentation / DX
**Severity:** **Low**
**Impact:** Users must read README or guess commands. Poor discoverability.
**Evidence:**
- [crates/aim-cli/src/cli/args.rs](crates/aim-cli/src/cli/args.rs#L1-L29): Clap definitions exist but no detailed `about` text or examples
- `aim --help` output is minimal (auto-generated clap summary)
**Suggested Fix:**
- Add `#[command(long_about = "...")]` with detailed usage examples
- Add `#[arg(help = "...")]` for each flag
- Include examples in help: `aim sharkdp/bat`, `aim list`, `aim remove bat`
**Acceptance Criteria:**
- [ ] `aim --help` shows examples and detailed descriptions
- [ ] Each subcommand has helpful usage text
---
### GAP-015: No CI-Friendly Output or Machine-Parseable Formats
**Type:** Missing Feature
**Severity:** **Low**
**Impact:** Cannot integrate `aim` into automation or scripts. Output is human-readable only.
**Evidence:**
- [crates/aim-cli/src/ui/render.rs](crates/aim-cli/src/ui/render.rs): Output is styled text tables, no JSON or CSV option
**Suggested Fix:**
- Add `--format json|yaml|csv` flag
- Add `--quiet` / `-q` for errors-only output
**Acceptance Criteria:**
- [ ] `aim list --format json` returns machine-parseable output
- [ ] `aim --quiet <query>` suppresses progress spinners
---
### GAP-016: Test Coverage — Missing Tests
**Type:** Testing / Quality
**Severity:** **Medium**
**Impact:** Gaps in reliability verification.
**Evidence:**
- **CustomJsonAdapter:** Zero tests in [crates/aim-core/tests/](crates/aim-core/tests/)
- **Concurrency:** No tests simulating concurrent registry access
- **Corruption recovery:** No tests for registry load failures + backup restore
- **Large files:** No tests for memory usage on 500MB+ downloads
- **Network failures:** Limited retry/timeout tests
- grep for `custom_json` in `tests/`: zero matches
**Suggested Fix:**
- Add `custom_json_adapter_contract_test` if adapter is retained
- Add `concurrent_registry_access_test` using threads
- Add `registry_corruption_recovery_test` with intentionally corrupt TOML
- Add `large_file_download_memory_test` mocking 1GB payload
- Add `network_failure_retry_test` with flaky mock server
**Acceptance Criteria:**
- [ ] All adapters in `all_adapter_kinds()` have contract tests
- [ ] Concurrency edge cases covered
- [ ] Corruption recovery path validated
---
## Quick Wins
Fixes that deliver high user value with low implementation cost:
1. **GAP-006 (Documentation)** — Update README with SourceForge, zsync usage (~30 minutes)
2. **GAP-014 (Help Text)** — Add detailed clap help strings (~1 hour)
3. **GAP-009 (Dry-Run Flag)** — Add `--dry-run` flag that skips final save (~2 hours)
4. **GAP-007 (CustomJson Cleanup)** — Remove non-functional adapter or add experimental marker (~30 minutes)
5. **GAP-004 (Registry Backups)** — Copy registry to `.bak` before save (~1 hour)
---
## Open Questions
1. **Checksums:** Should aim fail-closed (reject unverified downloads) or warn-only? Some sources lack checksums.
2. **Update Policy:** Default to auto-update all apps, or require explicit `aim update <stable-id>`?
3. **Version Pinning:** Should pinned apps show in `aim list` with special marker, or hidden?
4. **Search Scope:** Limit search to apps with AppImage releases, or show all repos?
5. **CustomJson:** Is this intended for user-defined update feeds? Should it support local JSON files or HTTP endpoints?
6. **Offline Mode:** Cache release metadata only, or cache payloads too?
7. **Platform Support:** Current code is Linux-only (by design). Document macOS/Windows as non-goals?
8. **Concurrency:** Should `aim` support lock wait timeout (fail fast) or infinite retry?
---
## Appendix
### Notable Search Patterns
- `TODO|FIXME|HACK`: **0 matches** in Rust source (clean codebase)
- `unimplemented!`: 0 matches
- `panic!`: Minimal use, only in test fixtures
- `.unwrap()`: Mostly in tests, production code uses `Result`
### Files Reviewed
- All `.rs` files in `crates/aim-core/src/` (app, adapters, domain, integration, platform, registry, source, update, metadata)
- All `.rs` files in `crates/aim-cli/src/` (cli, ui, lib, main)
- All test files in `crates/aim-core/tests/` and `crates/aim-cli/tests/`
- [README.md](README.md), [Cargo.toml](Cargo.toml)
- Planning docs in [.plans/007-source-provider-expansion/](.plans/007-source-provider-expansion/)
### Test Coverage Hotspots
- **Strong:** GitHub flows (discovery, parsing, install, update), GitLab basic flows, SourceForge URL classification
- **Moderate:** DirectUrl, file imports, metadata parsing (zsync, electron-builder), platform detection, remove flows
- **Weak:** CustomJson (zero), concurrency, corruption recovery, network failures, large file handling
---
## v1.0 Readiness Verdict
**Status:** **Pre-Alpha → Alpha** (not v1.0-ready)
**Blockers for v1.0:**
1. **GAP-001 (Memory exhaustion)** — Critical reliability issue
2. **GAP-003 (Checksum verification)** — Critical security issue
3. **GAP-004 (Registry atomicity)** — Critical data safety issue
4. **GAP-005 (Concurrent access)** — High-severity correctness issue
**Nice-to-Have for v1.0:**
- Search, info commands
- Dry-run mode
- Proxy support
- Better help text
**Recommendation:**
Close **GAP-001, GAP-003, GAP-004, GAP-005** before any production announcement. After those fixes, aim reaches **beta** status (core safety established). User-facing features (search, pin, rollback) can ship in v1.1+.
**Effort Estimate (Blockers Only):** ~3-5 engineering days for someone familiar with Rust + reqwest + file I/O patterns.
**Current Strengths:**
- Clean architecture (adapters, domain, app separation)
- Provider expansion path is well-designed
- Good test coverage for happy paths
- Terminal UX is polished (progress bars, styled output)
- Platform detection handles immutable distros, Nix, etc.
**Overall Quality:** Well-built for internal/hobby use. Needs production hardening for public v1.0.