Commit Graph

147 Commits

Author SHA1 Message Date
Philipp Emanuel Weidmann f3b9826ca4 Add CI workflow 2025-11-19 09:45:54 +05:30
Richard Young, PhD 13bb7b24d6 Fix KeyError when HuggingFace user profile fields are missing (#20)
Handle optional fullname and email fields in user profile gracefully
using .get() method with fallback values to prevent KeyError when
uploading models to HuggingFace.

This fixes an issue where users without a public email or fullname
set in their HuggingFace profile would encounter an error during
the upload process.

Co-authored-by: ricyoung <riyoung@gmail.com>
2025-11-19 05:32:50 +05:30
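The fix in 13bb7b24d6 can be illustrated with a minimal sketch. It assumes the user profile arrives as a plain dictionary; the function name `describe_user`, the `name` fallback key, and the placeholder strings are hypothetical and are not taken from the repository.

```python
def describe_user(user: dict) -> str:
    """Build a display string without assuming optional profile fields exist."""
    # user["fullname"] would raise KeyError when the field is missing;
    # .get() returns the fallback instead.
    fullname = user.get("fullname", user.get("name", "unknown user"))
    email = user.get("email", "no public email")
    return f"{fullname} ({email})"


# A profile with neither a full name nor a public email no longer fails.
print(describe_user({"name": "alice"}))
```

The design choice is to degrade to a placeholder rather than abort the upload, since neither field is required to publish a model.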
Nikolai Kolodziej c8b6663b93 Fix multi-GPU support and memory management (#17)
* Ensure projector is on the same device as the matrix for multi-GPU support

* Optimize memory management for loaded model weights

* Refactor memory management by removing unnecessary gc.collect() calls

* Optimize memory usage (#1)

* Improve memory management by explicitly deleting model layers and optimizing projector usage

* Optimize memory management by explicitly deleting the model and forcing garbage collection

* Add back deleted `empty_cache` call

* Fix broken file

* Remove unnecessary deletions

* Remove unnecessary empty_cache() calls

* Remove unused import of gc

* Duplicate `gc.collect` call in `empty_cache()`

* Move additional `gc.collect` call in front of `torch.x.empty_cache`
2025-11-19 05:09:12 +05:30
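The last bullets of c8b6663b93 describe calling `gc.collect` before the backend's `empty_cache`. A rough sketch of that ordering follows; the function body, the backend checks, and the optional-import guard are assumptions for illustration, not the repository's code.

```python
import gc

try:
    import torch  # optional here so the sketch also runs without PyTorch
except ImportError:
    torch = None


def empty_cache() -> None:
    """Release unreferenced tensors, then return cached device memory."""
    # Collect first: tensors kept alive only by reference cycles must be
    # freed before the caching allocator can hand their blocks back.
    gc.collect()
    if torch is None:
        return
    mps = getattr(torch.backends, "mps", None)
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    elif mps is not None and mps.is_available():
        torch.mps.empty_cache()


empty_cache()
```

Running the collector afterwards instead would leave the memory of just-collected tensors in the allocator's cache until the next call.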
Ooze 61fdf72b42 Add support for Granite MoE Hybrid in model.py by including down projections for shared MLP and MoE experts (#14) 2025-11-18 08:32:58 +05:30
red40maxxer 7bad84b4f1 perf: clear residuals after computing direction (#15)
Co-authored-by: mad-cat-lon <113548315+mad-cat-lon@users.noreply.github.com>
2025-11-17 22:18:22 +05:30
Matt Barnson 09730bad70 MPS support (#5)
* MPS support

* oops, added issue tracker.

* Delete .beads/issues.jsonl
2025-11-17 18:42:01 +05:30
Philipp Emanuel Weidmann b3545e4b1e Fix retrieving package version (tag: v1.0.1) 2025-11-16 17:35:13 +05:30
Philipp Emanuel Weidmann 3f346b6150 Change package name 2025-11-16 17:01:50 +05:30
Philipp Emanuel Weidmann 1a59d226c1 Fix spacing after images in README 2025-11-16 16:06:08 +05:30
Philipp Emanuel Weidmann 12ecf50033 Add README 2025-11-16 15:19:27 +05:30
Philipp Emanuel Weidmann ea699dce46 Improve appearance of selection menus 2025-11-16 11:32:58 +05:30
Philipp Emanuel Weidmann 8a1aceff11 Switch to multi-objective optimization 2025-11-14 18:04:23 +05:30
Philipp Emanuel Weidmann 0bae27f359 Fix some of the problems with Falcon-E-3B 2025-11-13 11:39:08 +05:30
Philipp Emanuel Weidmann e24080db64 Add metadata to pyproject.toml 2025-11-02 10:06:15 +05:30
Philipp Emanuel Weidmann fae39ffb89 Move default configuration to Python 2025-11-02 09:29:55 +05:30
Philipp Emanuel Weidmann 850c21b534 Make multivariate TPE work properly 2025-11-01 16:57:12 +05:30
Philipp Emanuel Weidmann a24e6eba96 Improve optimization 2025-10-31 16:04:28 +05:30
Philipp Emanuel Weidmann a9655c8d31 Perform calculations involving residual vectors in float32
Credit to Jim Lai for pointing out potential numerical problems in https://huggingface.co/blog/grimjim/projected-abliteration
2025-10-31 13:47:24 +05:30
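The numerical concern behind a9655c8d31 can be shown without any tensor library. The sketch below emulates float16 and float32 rounding with the standard `struct` module and accumulates the same small value many times; the helper names and the test values are hypothetical, and the repository itself presumably operates on tensors rather than Python floats.

```python
import struct


def to_half(x: float) -> float:
    """Round a Python float to the nearest float16 value."""
    return struct.unpack("e", struct.pack("e", x))[0]


def to_single(x: float) -> float:
    """Round a Python float to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(values, rounder) -> float:
    """Sum values, rounding to the target precision after every addition."""
    total = 0.0
    for v in values:
        total = rounder(total + rounder(v))
    return total


values = [0.001] * 10_000  # exact sum is 10
half_sum = accumulate(values, to_half)
single_sum = accumulate(values, to_single)
print(half_sum, single_sum)
```

In half precision the running total stops growing once each addend falls below half the spacing between representable values, so the result lands far from 10, while the float32 total stays close to it.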
Philipp Emanuel Weidmann 1496e0a04c Dynamically choose between global and per-layer refusal directions 2025-10-31 13:04:45 +05:30
Philipp Emanuel Weidmann c638d3d012 Adjust score parameters 2025-10-25 13:15:31 +05:30
Philipp Emanuel Weidmann 47e855d5d8 Guard against missing model card data 2025-10-25 13:12:18 +05:30
Philipp Emanuel Weidmann e2419de016 Add "abliterated" to model tags 2025-10-25 09:59:44 +05:30
Philipp Emanuel Weidmann ad8b04d371 Bump version to 1.0.0 2025-10-25 09:52:43 +05:30
Philipp Emanuel Weidmann 37c5ea06d1 Print elapsed and remaining time 2025-10-25 09:47:54 +05:30
Philipp Emanuel Weidmann cf57a0cfbe Add functionality to evaluate any model relative to the main model 2025-10-24 13:38:03 +05:30
Philipp Emanuel Weidmann e6aba71186 Improve refusal detection 2025-10-24 11:27:28 +05:30
Philipp Emanuel Weidmann f8f3f9a012 Fix chat responses being cut off 2025-10-22 12:30:28 +05:30
Philipp Emanuel Weidmann 6359aa44bb Separate abliteration parameters for different layer components 2025-10-22 12:05:28 +05:30
Philipp Emanuel Weidmann ed65d6902b Support gpt-oss MoE 2025-10-15 17:51:39 +05:30
Philipp Emanuel Weidmann 7ed0cb1ffb Support Phi-3.5-MoE 2025-10-14 11:23:53 +05:30
Philipp Emanuel Weidmann 8b827ee386 Support multimodal models 2025-10-14 10:32:34 +05:30
Philipp Emanuel Weidmann dd7abd3296 Add hf_transfer to dependencies
Required for repositories that don't use Xet
2025-10-14 07:56:43 +05:30
Philipp Emanuel Weidmann 3d5e645c13 Handle Ctrl+C gracefully 2025-10-12 12:53:40 +05:30
Philipp Emanuel Weidmann 74b55977f0 Pretty-print configuration errors 2025-10-12 10:39:59 +05:30
Philipp Emanuel Weidmann b4a0c0d3f2 Add program version to generated README intro 2025-10-11 17:31:11 +05:30
Philipp Emanuel Weidmann 7caf9fcdc5 Separate training and evaluation prompts 2025-10-09 12:51:31 +05:30
Philipp Emanuel Weidmann 2ff8dcba6b Add model card when uploading to Hugging Face 2025-09-30 08:43:21 +05:30
Philipp Emanuel Weidmann 5b01ad4344 Add save and upload functionality 2025-09-27 11:15:41 +05:30
Philipp Emanuel Weidmann 7573a2eebd Support passing model name without "--model" argument prefix 2025-09-25 15:02:22 +05:30
Philipp Emanuel Weidmann fd0fa52552 Add chat functionality 2025-09-24 18:09:23 +05:30
Philipp Emanuel Weidmann f00d35dc46 Improve early abort score calculation 2025-09-23 19:02:00 +05:30
Philipp Emanuel Weidmann 3f242369e0 Add educated guesses for parameter values to get the optimizer started 2025-09-23 16:00:20 +05:30
Philipp Emanuel Weidmann c447805fc2 Improve default dtype configuration 2025-09-23 13:31:41 +05:30
Philipp Emanuel Weidmann b6c715ab6f Abort trial early if KL divergence is too high 2025-09-23 13:20:31 +05:30
Philipp Emanuel Weidmann 9485edc221 Support Qwen3 MoE 2025-09-22 15:22:48 +05:30
Philipp Emanuel Weidmann 1b37160490 Fix model loading issues 2025-09-21 16:04:41 +05:30
Philipp Emanuel Weidmann af19fbd254 Initial commit 2025-09-21 11:10:30 +05:30