Description
Currently, mostlyai-engine does not leverage GPU acceleration on macOS. While CUDA-based acceleration is available for Linux, macOS users are limited to CPU-bound operations. To improve performance for macOS users, we should add support for PyTorch's Metal Performance Shaders (MPS) backend, which enables GPU acceleration on Apple Silicon (e.g. M1/M2/M3 or future Macs).
Proposed Solution
- Enable PyTorch MPS backend detection
  - Check whether PyTorch is installed with MPS support (torch.backends.mps.is_available()).
  - If MPS is available, ensure models and tensors are correctly moved to the MPS device.
- Update Installation & Dependencies
  - Ensure torch>=1.13 is installed, as MPS support is available from this version onwards.
  - Add documentation for macOS users on installing PyTorch with MPS support.
- Modify Training & Inference Pipelines
  - Adapt existing PyTorch calls to dynamically select the best available backend (mps, cuda, or cpu).
  - Ensure compatibility with QLoRA and bitsandbytes (fall back to CPU if MPS does not support certain operations).
- Performance Benchmarking & Validation
  - Compare training/inference speeds using MPS vs. CPU.
  - Identify any limitations or unsupported operations within MPS that may require fallbacks.
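The dynamic backend selection described above could look roughly like the sketch below. The select_device helper name is illustrative, not an existing mostlyai-engine API; the selection logic is kept torch-free so the preference order (cuda, then mps, then cpu) is easy to test, with the actual torch wiring shown in comments (requires torch>=1.13 for the MPS checks):

```python
def select_device(cuda_available: bool, mps_available: bool) -> str:
    """Pick the best available backend: prefer CUDA, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


# In the engine, the availability flags would come from torch itself:
#
#   import torch
#   device = torch.device(select_device(
#       torch.cuda.is_available(),
#       torch.backends.mps.is_available(),
#   ))
#   model.to(device)
#
# For individual ops that MPS does not yet implement, PyTorch also offers
# the PYTORCH_ENABLE_MPS_FALLBACK=1 environment variable, which silently
# falls back to CPU per-op rather than raising.
```

Keeping the priority logic in a plain function means the cuda > mps > cpu ordering can be unit-tested on any machine, regardless of which accelerators CI actually has.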
Questions
- Should we introduce an [mps] extra for macOS users to explicitly enable MPS-related dependencies?
  - Answer: We would want to keep a simple set of extras (e.g. [gpu] for both Linux + CUDA and Darwin + MPS).
- How well does bitsandbytes integrate with Darwin?
  - Answer: We'll ensure the required version has the necessary wheels.
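A single [gpu] extra could be expressed with PEP 508 environment markers in pyproject.toml. The fragment below is only a sketch: the version pins and the decision to gate bitsandbytes to Linux are assumptions, pending verification of macOS wheels as noted above.

```toml
[project.optional-dependencies]
# Illustrative sketch: one "gpu" extra serving both platforms.
gpu = [
    "torch>=1.13",  # CUDA wheels on Linux, MPS-capable wheels on macOS
    "bitsandbytes; sys_platform == 'linux'",  # gate until macOS wheels are confirmed
]
```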
Acceptance Criteria
- macOS users can utilize MPS acceleration via PyTorch without modifying code manually.
- Performance improvements over CPU-only execution are verified.
- No breaking changes for Linux users.