Richard Young, PhD
13bb7b24d6
Fix KeyError when HuggingFace user profile fields are missing (#20)
Handle optional fullname and email fields in user profile gracefully
using .get() method with fallback values to prevent KeyError when
uploading models to HuggingFace.
This fixes an issue where users without a public email or fullname
set in their HuggingFace profile would encounter an error during
the upload process.
Co-authored-by: ricyoung <riyoung@gmail.com>
2025-11-19 05:32:50 +05:30
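The fallback pattern described in commit 13bb7b24d6 above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `uploader_label`, the `name` key, and the fallback strings are assumptions made for the example; only the optional `fullname` and `email` fields come from the commit message.

```python
def uploader_label(user: dict) -> str:
    """Build a display label from a profile dict that may lack optional fields."""
    # "name" stands in for an always-present account name (assumed for this sketch).
    username = user.get("name", "unknown")
    # user["fullname"] or user["email"] would raise KeyError when the field is
    # absent; .get() returns the supplied fallback instead.
    fullname = user.get("fullname", username)
    email = user.get("email", "no public email")
    return f"{fullname} ({email})"


print(uploader_label({"name": "alice"}))
```

Falling back to the account name keeps the label meaningful even when the profile exposes nothing else.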
Nikolai Kolodziej
c8b6663b93
Fix multi-GPU support and memory management (#17)
* Ensure projector is on the same device as the matrix for multi-GPU support
* Optimize memory management for loaded model weights
* Refactor memory management by removing unnecessary gc.collect() calls
* Optimize memory usage (#1)
* Improve memory management by explicitly deleting model layers and optimizing projector usage
* Optimize memory management by explicitly deleting the model and forcing garbage collection
* Add back deleted `empty_cache` call
* Fix broken file
* Remove unnecessary deletions
* Remove unnecessary empty_cache() calls
* Remove unused import of gc
* Duplicate `gc.collect` call in `empty_cache()`
* Move additional `gc.collect` call in front of `torch.x.empty_cache`
2025-11-19 05:09:12 +05:30
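The ordering mentioned in the last bullet of commit c8b6663b93 above (running `gc.collect` before the backend's `empty_cache`) can be sketched roughly as below. This is an assumption-laden illustration rather than the project's implementation: the function body, the backend checks, and the guarded import are choices made for this example. The idea is that Python must first drop unreachable tensors before the allocator can hand their cached memory back.

```python
import gc

try:
    import torch
except ImportError:  # lets the sketch run where PyTorch is not installed
    torch = None


def empty_cache() -> None:
    """Release unreferenced tensors, then ask the active backend to free its cache."""
    # Collect first: tensors kept alive only by reference cycles still hold
    # device memory until the garbage collector destroys them.
    gc.collect()

    if torch is None:
        return

    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    elif torch.backends.mps.is_available():
        torch.mps.empty_cache()


empty_cache()
```

Calling the backend's cache release before collection would leave memory owned by not-yet-collected tensors untouched, which is why the order matters.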
Ooze
61fdf72b42
Add support for Granite MoE Hybrid in model.py by including down projections for shared MLP and MoE experts (#14)
2025-11-18 08:32:58 +05:30
red40maxxer
7bad84b4f1
perf: clear residuals after computing direction (#15)
Co-authored-by: mad-cat-lon <113548315+mad-cat-lon@users.noreply.github.com>
2025-11-17 22:18:22 +05:30
Matt Barnson
09730bad70
MPS support (#5)
* MPS support
* oops, added issue tracker.
* Delete .beads/issues.jsonl
2025-11-17 18:42:01 +05:30
Philipp Emanuel Weidmann
b3545e4b1e
Fix retrieving package version
v1.0.1
2025-11-16 17:35:13 +05:30
Philipp Emanuel Weidmann
3f346b6150
Change package name
2025-11-16 17:01:50 +05:30
Philipp Emanuel Weidmann
1a59d226c1
Fix spacing after images in README
2025-11-16 16:06:08 +05:30
Philipp Emanuel Weidmann
12ecf50033
Add README
2025-11-16 15:19:27 +05:30
Philipp Emanuel Weidmann
ea699dce46
Improve appearance of selection menus
2025-11-16 11:32:58 +05:30
Philipp Emanuel Weidmann
8a1aceff11
Switch to multi-objective optimization
2025-11-14 18:04:23 +05:30
Philipp Emanuel Weidmann
0bae27f359
Fix some of the problems with Falcon-E-3B
2025-11-13 11:39:08 +05:30
Philipp Emanuel Weidmann
e24080db64
Add metadata to pyproject.toml
2025-11-02 10:06:15 +05:30
Philipp Emanuel Weidmann
fae39ffb89
Move default configuration to Python
2025-11-02 09:29:55 +05:30
Philipp Emanuel Weidmann
850c21b534
Make multivariate TPE work properly
2025-11-01 16:57:12 +05:30
Philipp Emanuel Weidmann
a24e6eba96
Improve optimization
2025-10-31 16:04:28 +05:30
Philipp Emanuel Weidmann
a9655c8d31
Perform calculations involving residual vectors in float32
Credit to Jim Lai for pointing out potential numerical problems in https://huggingface.co/blog/grimjim/projected-abliteration
2025-10-31 13:47:24 +05:30
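The float32 upcast described in commit a9655c8d31 above can be illustrated with a small sketch. Several things here are assumptions not stated in the log: NumPy is used in place of whatever tensor library the project uses, the function name `refusal_direction` is hypothetical, and the direction is taken to be a normalized difference of mean residuals. The point shown is only that half-precision inputs are converted to float32 before any means, differences, or norms are computed.

```python
import numpy as np


def refusal_direction(good_residuals, bad_residuals) -> np.ndarray:
    """Unit-length difference of mean residuals, computed in float32."""
    # Upcast first so that the averaging and the norm do not accumulate
    # rounding error in a half-precision dtype.
    good = np.asarray(good_residuals, dtype=np.float32)
    bad = np.asarray(bad_residuals, dtype=np.float32)

    direction = bad.mean(axis=0) - good.mean(axis=0)
    norm = np.linalg.norm(direction)
    if norm == 0:
        raise ValueError("mean residuals are identical; direction is undefined")
    return direction / norm


# Inputs stored in float16, as model activations often are.
good = np.zeros((2, 2), dtype=np.float16)
bad = np.array([[3, 4], [3, 4]], dtype=np.float16)
print(refusal_direction(good, bad).dtype)
```

The result stays in float32; casting back to the model's dtype, if needed, would happen only after the calculation is finished.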
Philipp Emanuel Weidmann
1496e0a04c
Dynamically choose between global and per-layer refusal directions
2025-10-31 13:04:45 +05:30
Philipp Emanuel Weidmann
c638d3d012
Adjust score parameters
2025-10-25 13:15:31 +05:30
Philipp Emanuel Weidmann
47e855d5d8
Guard against missing model card data
2025-10-25 13:12:18 +05:30
Philipp Emanuel Weidmann
e2419de016
Add "abliterated" to model tags
2025-10-25 09:59:44 +05:30
Philipp Emanuel Weidmann
ad8b04d371
Bump version to 1.0.0
2025-10-25 09:52:43 +05:30
Philipp Emanuel Weidmann
37c5ea06d1
Print elapsed and remaining time
2025-10-25 09:47:54 +05:30
Philipp Emanuel Weidmann
cf57a0cfbe
Add functionality to evaluate any model relative to the main model
2025-10-24 13:38:03 +05:30
Philipp Emanuel Weidmann
e6aba71186
Improve refusal detection
2025-10-24 11:27:28 +05:30
Philipp Emanuel Weidmann
f8f3f9a012
Fix chat responses being cut off
2025-10-22 12:30:28 +05:30
Philipp Emanuel Weidmann
6359aa44bb
Separate abliteration parameters for different layer components
2025-10-22 12:05:28 +05:30
Philipp Emanuel Weidmann
ed65d6902b
Support gpt-oss MoE
2025-10-15 17:51:39 +05:30
Philipp Emanuel Weidmann
7ed0cb1ffb
Support Phi-3.5-MoE
2025-10-14 11:23:53 +05:30
Philipp Emanuel Weidmann
8b827ee386
Support multimodal models
2025-10-14 10:32:34 +05:30
Philipp Emanuel Weidmann
dd7abd3296
Add hf_transfer to dependencies
Required for repositories that don't use Xet
2025-10-14 07:56:43 +05:30
Philipp Emanuel Weidmann
3d5e645c13
Handle Ctrl+C gracefully
2025-10-12 12:53:40 +05:30
Philipp Emanuel Weidmann
74b55977f0
Pretty-print configuration errors
2025-10-12 10:39:59 +05:30
Philipp Emanuel Weidmann
b4a0c0d3f2
Add program version to generated README intro
2025-10-11 17:31:11 +05:30
Philipp Emanuel Weidmann
7caf9fcdc5
Separate training and evaluation prompts
2025-10-09 12:51:31 +05:30
Philipp Emanuel Weidmann
2ff8dcba6b
Add model card when uploading to Hugging Face
2025-09-30 08:43:21 +05:30
Philipp Emanuel Weidmann
5b01ad4344
Add save and upload functionality
2025-09-27 11:15:41 +05:30
Philipp Emanuel Weidmann
7573a2eebd
Support passing model name without "--model" argument prefix
2025-09-25 15:02:22 +05:30
Philipp Emanuel Weidmann
fd0fa52552
Add chat functionality
2025-09-24 18:09:23 +05:30
Philipp Emanuel Weidmann
f00d35dc46
Improve early abort score calculation
2025-09-23 19:02:00 +05:30
Philipp Emanuel Weidmann
3f242369e0
Add educated guesses for parameter values to get the optimizer started
2025-09-23 16:00:20 +05:30
Philipp Emanuel Weidmann
c447805fc2
Improve default dtype configuration
2025-09-23 13:31:41 +05:30
Philipp Emanuel Weidmann
b6c715ab6f
Abort trial early if KL divergence is too high
2025-09-23 13:20:31 +05:30
Philipp Emanuel Weidmann
9485edc221
Support Qwen3 MoE
2025-09-22 15:22:48 +05:30
Philipp Emanuel Weidmann
1b37160490
Fix model loading issues
2025-09-21 16:04:41 +05:30
Philipp Emanuel Weidmann
af19fbd254
Initial commit
2025-09-21 11:10:30 +05:30