PEFT documentation
PEANuT: Parameter-Efficient Adaptation with Weight-aware Neural Tweakers
PEANuT is a parameter-efficient fine-tuning technique that introduces weight-aware neural tweakers to generate adapter updates from the frozen pretrained weights themselves. Instead of learning a purely linear low-rank update as in LoRA, PEANuT conditions the adapter transformation on the base weight, which makes the update rule more expressive while keeping the number of trainable parameters small.
PEANuT uses an input projection A, an output projection B, and optional intermediate residual encoder/decoder
pairs with non-linear activations. This makes it possible to model more complex update patterns than weight-agnostic
linear adapters while still remaining within the PEFT setting.
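As a rough sketch of this idea (not the library implementation; the exact module shapes and the residual wiring are assumptions based on the description above), the weight-aware update for a single linear layer with `depth=1` might look like:

```python
import numpy as np

rng = np.random.default_rng(0)
in_dim, out_dim, r = 16, 8, 4

# Frozen base weight, stored (out_dim, in_dim) as in torch.nn.Linear
W = rng.normal(size=(out_dim, in_dim))

# Hypothetical PEANuT modules: A acts on the weight's output
# dimension (out_dim -> r), B maps back (r -> out_dim)
A = 0.01 * rng.normal(size=(r, out_dim))
enc = 0.01 * rng.normal(size=(r, r))   # intermediate encoder
dec = 0.01 * rng.normal(size=(r, r))   # intermediate decoder
B = np.zeros((out_dim, r))             # zero init => adapter starts as a no-op

relu = lambda t: np.maximum(t, 0.0)

def delta_w():
    h = A @ W                             # condition on the frozen weight
    h = h + relu(dec @ relu(enc @ h))     # residual encoder/decoder pair
    return B @ h                          # (out_dim, in_dim) update

scaling = 1.0
x = rng.normal(size=(2, in_dim))
y = x @ (W + scaling * delta_w()).T       # adapted forward pass
```

Because `B` starts at zero, the adapted output initially equals the frozen layer's output, matching the default `init_weights=True` behaviour described in the configuration below.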
PEANuT currently has the following tradeoffs:
Pros:
- Higher theoretical expressiveness than linear low-rank updates.
- Better performance than LoRA on a range of tasks under similar parameter budgets.
- Works well in very low-parameter regimes, for example around 0.2M trainable parameters.
Cons:
- Higher memory usage than LoRA, because ΔW is explicitly constructed before being applied.
- Slower training and inference than LoRA; deeper intermediate layers increase the overhead further.
- The non-linearity can require more careful hyperparameter tuning, especially learning rate and related optimization settings.
If these tradeoffs do not fit your use case, consider other PEFT methods such as LoRA.
The abstract from the paper is:
Fine-tuning large pre-trained foundation models often yields excellent downstream performance but is prohibitively expensive when updating all parameters. Parameter-efficient fine-tuning (PEFT) methods such as LoRA alleviate this by introducing lightweight update modules, yet they commonly rely on weight-agnostic linear approximations, limiting their expressiveness. In this work, we propose PEANuT, a novel PEFT framework that introduces weight-aware neural tweakers, compact neural modules that generate task-adaptive updates conditioned on frozen pre-trained weights. PEANuT provides a flexible yet efficient way to capture complex update patterns without full model tuning. We theoretically show that PEANuT achieves equivalent or greater expressivity than existing linear PEFT methods with comparable or fewer parameters. Extensive experiments across four benchmarks with over twenty datasets demonstrate that PEANuT consistently outperforms strong baselines in both NLP and vision tasks, while maintaining low computational overhead.
PeanutConfig
class peft.PeanutConfig
< source >( task_type: Optional[Union[str, TaskType]] = None peft_type: Optional[Union[str, PeftType]] = None auto_mapping: Optional[dict] = None peft_version: Optional[str] = None base_model_name_or_path: Optional[str] = None revision: Optional[str] = None inference_mode: bool = False r: int = 32 depth: int = 0 act_fn: str = 'relu' scaling: float = 1.0 target_modules: Optional[Union[list[str], str]] = None exclude_modules: Optional[Union[list[str], str]] = None modules_to_save: Optional[list[str]] = None layers_to_transform: Optional[Union[list[int], int]] = None layers_pattern: Optional[Union[list[str], str]] = None init_weights: bool = True )
Parameters
- r (`int`) — PEANuT rank. This is the hidden dimension used by the adapters. Similar to the LoRA rank, a larger `r` increases adapter capacity and trainable parameters.
- depth (`int`) — Number of hidden adapter layers per encoder/decoder side in PEANuT. The input projection `A` and output projection `B` are always present in addition to these hidden layers, so `depth` must be a non-negative integer. `depth=0`: `A`, `B`. `depth=1`: `A`, one encoder, one decoder, `B`. `depth=2`: `A`, two encoders, two decoders, `B`, and so on.
- act_fn (`str`) — Non-linear activation applied in the PEANuT network. This corresponds to `non_linear` in the vanilla PyTorch implementation. Defaults to `"relu"`. Any activation key available in `transformers.activations.ACT2FN` is supported and may perform better on different tasks.
- scaling (`float`) — A scalar multiplier applied to the PEANuT output before it is added to the frozen base layer output. The final adapter contribution is `scaling * (x @ delta_w)`.
- target_modules (`Union[List[str], str]`, *optional*) — The names of the modules to apply PEANuT to. Can be a list of module name strings (e.g. `['q_proj', 'v_proj']`) or a regex pattern.
- modules_to_save (`List[str]`, *optional*) — List of modules apart from PEANuT layers to be set as trainable and saved in the final checkpoint.
- exclude_modules (`Union[List[str], str]`, *optional*) — The names of the modules to not apply the adapter to. When passing a string, a regex match is performed. When passing a list of strings, either an exact match is performed or it is checked whether the module name ends with any of the passed strings.
- layers_to_transform (`Union[List[int], int]`, *optional*) — The layer indexes to transform. If this argument is specified, PEFT will transform only the layer indexes in this list. If a single integer is passed, PEFT will transform only the layer at this index.
- layers_pattern (`Optional[Union[List[str], str]]`, *optional*) — The layer pattern name, used only if `layers_to_transform` is not `None` and the layer pattern is not in the common layers pattern.
- init_weights (`bool`) — Whether to initialize PEANuT adapter weights using the default initialization scheme:
  - If `True`: all weights except `B` are initialized with Kaiming uniform, and `B` is initialized to zero. Initializing `B` to zero makes the adapter start as an exact no-op.
  - If `False`: all weights (including `B`) are initialized with Kaiming uniform.
This is the configuration class to store the configuration of a PeanutModel.
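For example, a configuration targeting the attention projections of a typical transformer might look like the following (the target module names are illustrative and depend on the base model's architecture):

```python
from peft import PeanutConfig

config = PeanutConfig(
    r=32,           # adapter rank
    depth=1,        # one encoder/decoder pair between A and B
    act_fn="relu",  # any key from transformers.activations.ACT2FN
    scaling=1.0,
    target_modules=["q_proj", "v_proj"],  # model-dependent, illustrative
)
```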
Notes:
PEANuT uses a weight-aware pathway, where the delta weight is conditioned on the base weight. The A adapter
is applied over the base weight’s output dimension, so A has shape (out_dim -> r) rather than the usual
(in_dim -> r) used by LoRA-like methods.
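Given these shapes (`A`: out_dim → r, `B`: r → out_dim, plus `depth` encoder/decoder pairs of size r × r), the per-layer trainable parameter count can be estimated as follows. This is a simplified sketch that ignores any bias terms, which is an assumption:

```python
def peanut_param_count(out_dim: int, r: int, depth: int) -> int:
    """Rough per-layer trainable parameter estimate (bias-free assumption)."""
    # A (out_dim -> r) and B (r -> out_dim), plus `depth` encoder/decoder
    # pairs, each an r -> r map.
    return 2 * out_dim * r + 2 * depth * r * r

# e.g. a 768-dim projection with r=8 and depth=1
print(peanut_param_count(768, 8, 1))  # 12416
```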
PeanutModel
class peft.PeanutModel
< source >( model peft_config: Union[PeftConfig, dict[str, PeftConfig]] adapter_name: str low_cpu_mem_usage: bool = False state_dict: Optional[dict[str, torch.Tensor]] = None ) → torch.nn.Module
Parameters
- model (`torch.nn.Module`) — The model to be adapted.
- config (`PeanutConfig`) — The configuration of the PEANuT model.
- adapter_name (`str`) — The name of the adapter, defaults to `"default"`.
Returns
torch.nn.Module
The PEANuT PEFT model.
Creates a PEANuT model from a pretrained transformers model.
The method is described in detail in https://arxiv.org/abs/2410.01870.
Attributes:
- model (`PreTrainedModel`) — The model to be adapted.
- peft_config (`PeanutConfig`) — The configuration of the PEANuT model.