Nikolai Kolodziej
c8b6663b93
Fix multi-GPU support and memory management (#17)
...
* Ensure projector is on the same device as the matrix for multi-GPU support
* Optimize memory management for loaded model weights
* Refactor memory management by removing unnecessary gc.collect() calls
* Optimize memory usage (#1)
* Improve memory management by explicitly deleting model layers and optimizing projector usage
* Optimize memory management by explicitly deleting the model and forcing garbage collection
* Add back deleted `empty_cache` call
* Fix broken file
* Remove unnecessary deletions
* Remove unnecessary empty_cache() calls
* Remove unused import of gc
* Duplicate `gc.collect` call in `empty_cache()`
* Move additional `gc.collect` call in front of `torch.x.empty_cache`
2025-11-19 05:09:12 +05:30
Ooze
61fdf72b42
Add support for Granite MoE Hybrid in model.py by including down projections for shared MLP and MoE experts (#14)
2025-11-18 08:32:58 +05:30
red40maxxer
7bad84b4f1
perf: clear residuals after computing direction (#15)
...
Co-authored-by: mad-cat-lon <113548315+mad-cat-lon@users.noreply.github.com>
2025-11-17 22:18:22 +05:30
Matt Barnson
09730bad70
MPS support (#5)
...
* MPS support
* oops, added issue tracker.
* Delete .beads/issues.jsonl
2025-11-17 18:42:01 +05:30
Philipp Emanuel Weidmann
b3545e4b1e
Fix retrieving package version
v1.0.1
2025-11-16 17:35:13 +05:30
Philipp Emanuel Weidmann
3f346b6150
Change package name
2025-11-16 17:01:50 +05:30
Philipp Emanuel Weidmann
1a59d226c1
Fix spacing after images in README
2025-11-16 16:06:08 +05:30
Philipp Emanuel Weidmann
12ecf50033
Add README
2025-11-16 15:19:27 +05:30
Philipp Emanuel Weidmann
ea699dce46
Improve appearance of selection menus
2025-11-16 11:32:58 +05:30
Philipp Emanuel Weidmann
8a1aceff11
Switch to multi-objective optimization
2025-11-14 18:04:23 +05:30
Philipp Emanuel Weidmann
0bae27f359
Fix some of the problems with Falcon-E-3B
2025-11-13 11:39:08 +05:30
Philipp Emanuel Weidmann
e24080db64
Add metadata to pyproject.toml
2025-11-02 10:06:15 +05:30
Philipp Emanuel Weidmann
fae39ffb89
Move default configuration to Python
2025-11-02 09:29:55 +05:30
Philipp Emanuel Weidmann
850c21b534
Make multivariate TPE work properly
2025-11-01 16:57:12 +05:30
Philipp Emanuel Weidmann
a24e6eba96
Improve optimization
2025-10-31 16:04:28 +05:30
Philipp Emanuel Weidmann
a9655c8d31
Perform calculations involving residual vectors in float32
...
Credit to Jim Lai for pointing out potential numerical problems in https://huggingface.co/blog/grimjim/projected-abliteration
2025-10-31 13:47:24 +05:30
Philipp Emanuel Weidmann
1496e0a04c
Dynamically choose between global and per-layer refusal directions
2025-10-31 13:04:45 +05:30
Philipp Emanuel Weidmann
c638d3d012
Adjust score parameters
2025-10-25 13:15:31 +05:30
Philipp Emanuel Weidmann
47e855d5d8
Guard against missing model card data
2025-10-25 13:12:18 +05:30
Philipp Emanuel Weidmann
e2419de016
Add "abliterated" to model tags
2025-10-25 09:59:44 +05:30
Philipp Emanuel Weidmann
ad8b04d371
Bump version to 1.0.0
2025-10-25 09:52:43 +05:30
Philipp Emanuel Weidmann
37c5ea06d1
Print elapsed and remaining time
2025-10-25 09:47:54 +05:30
Philipp Emanuel Weidmann
cf57a0cfbe
Add functionality to evaluate any model relative to the main model
2025-10-24 13:38:03 +05:30
Philipp Emanuel Weidmann
e6aba71186
Improve refusal detection
2025-10-24 11:27:28 +05:30
Philipp Emanuel Weidmann
f8f3f9a012
Fix chat responses being cut off
2025-10-22 12:30:28 +05:30
Philipp Emanuel Weidmann
6359aa44bb
Separate abliteration parameters for different layer components
2025-10-22 12:05:28 +05:30
Philipp Emanuel Weidmann
ed65d6902b
Support gpt-oss MoE
2025-10-15 17:51:39 +05:30
Philipp Emanuel Weidmann
7ed0cb1ffb
Support Phi-3.5-MoE
2025-10-14 11:23:53 +05:30
Philipp Emanuel Weidmann
8b827ee386
Support multimodal models
2025-10-14 10:32:34 +05:30
Philipp Emanuel Weidmann
dd7abd3296
Add hf_transfer to dependencies
...
Required for repositories that don't use Xet
2025-10-14 07:56:43 +05:30
Philipp Emanuel Weidmann
3d5e645c13
Handle Ctrl+C gracefully
2025-10-12 12:53:40 +05:30
Philipp Emanuel Weidmann
74b55977f0
Pretty-print configuration errors
2025-10-12 10:39:59 +05:30
Philipp Emanuel Weidmann
b4a0c0d3f2
Add program version to generated README intro
2025-10-11 17:31:11 +05:30
Philipp Emanuel Weidmann
7caf9fcdc5
Separate training and evaluation prompts
2025-10-09 12:51:31 +05:30
Philipp Emanuel Weidmann
2ff8dcba6b
Add model card when uploading to Hugging Face
2025-09-30 08:43:21 +05:30
Philipp Emanuel Weidmann
5b01ad4344
Add save and upload functionality
2025-09-27 11:15:41 +05:30
Philipp Emanuel Weidmann
7573a2eebd
Support passing model name without "--model" argument prefix
2025-09-25 15:02:22 +05:30
Philipp Emanuel Weidmann
fd0fa52552
Add chat functionality
2025-09-24 18:09:23 +05:30
Philipp Emanuel Weidmann
f00d35dc46
Improve early abort score calculation
2025-09-23 19:02:00 +05:30
Philipp Emanuel Weidmann
3f242369e0
Add educated guesses for parameter values to get the optimizer started
2025-09-23 16:00:20 +05:30
Philipp Emanuel Weidmann
c447805fc2
Improve default dtype configuration
2025-09-23 13:31:41 +05:30
Philipp Emanuel Weidmann
b6c715ab6f
Abort trial early if KL divergence is too high
2025-09-23 13:20:31 +05:30
Philipp Emanuel Weidmann
9485edc221
Support Qwen3 MoE
2025-09-22 15:22:48 +05:30
Philipp Emanuel Weidmann
1b37160490
Fix model loading issues
2025-09-21 16:04:41 +05:30
Philipp Emanuel Weidmann
af19fbd254
Initial commit
2025-09-21 11:10:30 +05:30