DS-LLM-TEMPLATE-FINETUNING/unsloth_compiled_cache/__pycache__/UnslothIterativeSFTTrainer.cpython-311.pyc
254 lines
44 KiB
2025-08-13 23:50:20 +00:00
§
4$h£ãó dZddlmZddlZddlmZddlmZddlmZm Z m
Z
m Z m Z m
Z
mZmZddlmZmZmZmZmZmZmZmZmZmZmZmZmZm
Z
mZmZmZm Z m!Z!m"Z"m#Z#m$Z$m Z m%Z%m&Z&m'Z'm(Z(m)Z)mZm*Z*m
Z
mZm Z m#Z#m'Z'm)Z)mZddl)Z)ddlTddl+m,Z,m-Z-dd l.m/Z/ddlZddl0Z1dd
l2m3Z3ddlmZdd l4mZmZ5d d
d d
d
dœZ6ej7d d e6¬¦«d¦«Z8e,Gdde¦«¦«Z9 Gdde#¦«Z:Gdde:¦«Z;dS)z8
2025.8.4
2025.8.5
4.55.1
0.21.0
__UNSLOTH_VERSIONING__
é)ÚTensorN)Ú
functional)ÚAnyÚListÚOptionalÚTupleÚUnionÚDictÚSetÚCallable)%ÚAutoModelForCausalLMÚ
AutoTokenizerÚBaseImageProcessorr Ú DataCollatorÚDataCollatorForLanguageModelingÚDataCollatorForSeq2SeqÚ
DataLoaderÚDatasetÚEvalLoopOutputÚFeatureExtractionMixinÚIterativeSFTConfigÚIterativeSFTTrainerrÚ
PPODecoratorsÚPathÚ PeftModelÚPreTrainedModelÚPreTrainedTokenizerBaseÚProcessorMixinÚTrainerÚTrainingArgumentsr Úgenerate_model_cardÚget_comet_experiment_urlÚis_peft_availableÚis_wandb_availableÚosÚtorchÚwarningsrrrrr#r%r&)Ú*)Ú dataclassÚfield)ÚVersion)Ú nullcontext)rrTF)Úepilogue_fusionÚ max_autotuneÚ
shape_paddingz
trace.enabledztriton.cudagraphs)ÚdynamicÚ fullgraphÚoptionscó’tj| d|jd¦«dd¬¦«}tj| d¦«dd¬¦«}g}t ||¦«D]\}}| tj¦«}tj|d| d¦«¬¦«  d¦«}tj
|d¬¦«}||z
} |  | ¦«Œ’ tj |¦«}| |jd|jdf¦«}|S)Néÿÿÿÿér)ÚchunksÚdim)r7Úindex)r7é)
r&ÚchunkÚreshapeÚshapeÚzipÚtoÚfloat32ÚgatherÚ unsqueezeÚsqueezeÚ logsumexpÚappendÚconcat)
Úlogitsr8Úchunked_logitsÚ
chunked_indexÚall_per_token_logpsÚ chunk_logitsÚ chunk_indexÚselected_logitsÚlogsumexp_valuesÚper_token_logpss
úf/workspace/Fine-tuning/DS-LLM-TEMPLATE-FINETUNING/unsloth_compiled_cache/UnslothIterativeSFTTrainer.pyÚchunked_selective_log_softmaxrP"s5õ”[ §¢°°F´LÀÔ4DÑ!EÔ!EÐPQÐYZÐ[€NÝ”[ §¢¨rÑ!2Ô!2¸QÀaÐH€MØÐå%(¨¸Ñ%GÔ%Gð #—¥u¤}Ñ Ýœ, |¸2À{×G\ÒG\Ð]_ÑG`ÔG`Ða×iÐjlÑmˆÝ œ?¨<¸Ø)Ð,<Ñ<ˆØ×" Ýœ,Ð':ÑØ-×5°v´|ÀA´ÈÌ ÐUVÌÐ6XÑØ ÐócóšeZdZUdZedddi¬¦«Zeeed<edddi¬¦«Z ee
ed < d+ˆfd*„ Z ˆxZ S),ÚUnslothIterativeSFTConfigaÅ
Configuration class for the [`IterativeSFTTrainer`].
This class includes only the parameters that are specific to Iterative SFT training. For a full list of training
arguments, please refer to the [`~transformers.TrainingArguments`] documentation. Note that default values in this
class may differ from those in [`~transformers.TrainingArguments`].
Using [`~transformers.HfArgumentParser`] we can turn this class into
[argparse](https://docs.python.org/3/library/argparse#module-argparse) arguments that can be specified on the
command line.
Parameters:
> Parameters that control the model
model_init_kwargs (`dict[str, Any]` or `None`, *optional*, defaults to `None`):
Keyword arguments for [`~transformers.AutoModelForCausalLM.from_pretrained`], used when the `model`
argument of the [`IterativeSFTTrainer`] is provided as a string.
> Parameters that control the data preprocessing
max_length (`int` or `None`, *optional*, defaults to `None`):
Maximum length of the tokenized sequence. Sequences longer than `max_length` are truncated.
truncation_mode (`str`, *optional*, defaults to `"keep_end"`):
The truncation mode to use, either `"keep_end"` or `"keep_start"`.
optimize_device_cache (`bool`, *optional*, defaults to `False`):
Whether to optimize accelerator cache for slightly more memory-efficient training.
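A minimal sketch of how `max_length` and `truncation_mode` might interact, written in plain Python on token-id lists rather than tensors. The helper name `truncate_ids` is hypothetical and is not part of TRL or Unsloth. It assumes `"keep_start"` keeps the first `max_length` tokens and `"keep_end"` keeps the last `max_length` tokens.

```python
from typing import List, Optional


def truncate_ids(
    ids: List[int],
    max_length: Optional[int],
    truncation_mode: str = "keep_end",
) -> List[int]:
    """Hypothetical helper: truncate one token-id sequence to max_length."""
    # Reject unknown modes up front, even when no truncation is needed.
    if truncation_mode not in ("keep_start", "keep_end"):
        raise ValueError(f"Unknown truncation mode: {truncation_mode}")
    # None means "no limit"; short sequences are returned unchanged.
    if max_length is None or len(ids) <= max_length:
        return list(ids)
    if truncation_mode == "keep_start":
        # Keep the first max_length tokens.
        return list(ids[:max_length])
    # keep_end: keep the last max_length tokens.
    return list(ids[len(ids) - max_length:])
```

For example, with `max_length=3` the sequence `[1, 2, 3, 4, 5]` becomes `[1, 2, 3]` under `"keep_start"` and `[3, 4, 5]` under `"keep_end"`.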
helpzvLLM SamplingParams)ÚdefaultÚmetadataÚvllm_sampling_paramsr4z8Chunk size to reduce memory usage. -1 is most efficient.Úunsloth_num_chunksFÚnor5éréúç-Cëâ6
?ç{®Gáz„?çÍÌÌÌÌÌì?ç+‡ÙÎ÷ï?ç:Œ0âŽyE>çð?çlinearçš™™™™™¹?ÚpassiveÚwarningTÚstepsr9éôéO
ÚO1ÚautoÚçÚ
adamw_8bitÚlengthÚ
every_saveÚlastéÚkeep_endc‡ óö|dkrtd|d¦«|dkrtd|d¦«||#dkr
|$dkrd}d }#t¦«jdŽid
|d |d |d
|d|d|d|d|d| “d|
d| d| d|
d|d|d|d|d|d|d|d|d|d |d!|d"|d#|d$|d%|d&|d'|d(|d)| “d*|!“d+|"“d,|#“d-|$“d.|%“d/|&“d0|'“d1|(“d2|)“d3|*“d4|+“d5|,“d6|-“d7|.“d8|/“d9|0“d:|1“d;|2“d<|3“d=|4“d>|5“d?|6“d@|7“dA|8“dB|9“dC|:“dD|;“dE|<“dF|=“dG|>“dH|?“dI|@“dJ|A“dK|B“dL|C“dM|D“dN|E“dO|F“dP|G“dQ|H“dR|I“dS|J“dT|K“dU|L“dV|M“dW|N“dX|O“dY|P“dZ|Q“d[|R“d\|S“d]|T“d^|U“d_|V“d`|W“da|X“db|Y“dc|Z“dd|[“de|\“df|]“dg|^“dh|_“di|`“dj|a“dk|b“dl|c“dm|d“dn|e“do|f“dp|g“dq|h“dr|i“ds|j“dt|k“du|l“dv|m“dw|n“dx|o“dy|p“dz|q“d{|r“d||s“d}|t“d~|u“d|v“d€|w“d|x“d|y“dƒ|z“d„|{“d…||“d†|}“d‡|~“dˆ|d‰|€“dŠ|d‹|‚“dŒ|ƒ“d|„“|‡¤Ž|…|_|†|_dS)NgH¯¼šò×z>z Unsloth: Your learning rate of `zi` is too small and less than 1e-7! Consider increasing it, otherwise gradient updates will be close to 0!r9za` is way too larger > 1! Consider decreasing it to 1e-1, otherwise gradient updates will explode!rgrhÚunsloth_training_checkpointsrYÚ
output_dirÚoverwrite_output_dirÚdo_trainÚdo_evalÚ
do_predictÚ
eval_strategyÚprediction_loss_onlyÚper_device_train_batch_sizeÚper_device_eval_batch_sizeÚper_gpu_train_batch_sizeÚper_gpu_eval_batch_sizeÚgradient_accumulation_stepsÚeval_accumulation_stepsÚ
eval_delayÚtorch_empty_cache_stepsÚ
learning_rateÚ weight_decayÚ
adam_beta1Ú
adam_beta2Ú adam_epsilonÚ
max_grad_normÚnum_train_epochsÚ max_stepsÚlr_scheduler_typeÚ warmup_ratioÚ warmup_stepsÚ log_levelÚlog_level_replicaÚlog_on_each_nodeÚ logging_dirÚlogging_strategyÚlogging_first_stepÚ
logging_stepsÚlogging_nan_inf_filterÚ
save_strategyÚ
save_stepsÚsave_total_limitÚsave_safetensorsÚsave_on_each_nodeÚsave_only_modelÚ'restore_callback_states_from_checkpointÚno_cudaÚuse_cpuÚuse_mps_deviceÚseedÚ data_seedÚ
jit_mode_evalÚuse_ipexÚbf16Úfp16Úfp16_opt_levelÚhalf_precision_backendÚbf16_full_evalÚfp16_full_evalÚtf32Ú
local_rankÚ ddp_backendÚ
tpu_num_coresÚtpu_metrics_debugÚdebugÚdataloader_drop_lastÚ
eval_stepsÚdataloader_num_workersÚdataloader_prefetch_factorÚ
past_indexÚrun_nameÚ disable_tqdmÚremove_unused_columnsÚ label_namesÚload_best_model_at_endÚmetric_for_best_modelÚgreater_is_betterÚignore_data_skipÚfsdpÚfsdp_min_num_paramsÚ fsdp_configÚ"fsdp_transformer_layer_cls_to_wrapÚaccelerator_configÚ deepspeedÚlabel_smoothing_factorÚoptimÚ
optim_argsÚ adafactorÚgroup_by_lengthÚlength_column_nameÚ report_toÚddp_find_unused_parametersÚddp_bucket_cap_mbÚddp_broadcast_buffersÚdataloader_pin_memoryÚdataloader_persistent_workersÚskip_memory_metricsÚuse_legacy_prediction_loopÚ push_to_hubÚresume_from_checkpointÚ hub_model_idÚ hub_strategyÚ hub_tokenÚhub_private_repoÚhub_always_pushÚ hub_revisionÚgradient_checkpointingÚgradient_checkpointing_kwargsÚinclude_inputs_for_metricsÚeval_do_concat_batchesÚ fp16_backendÚpush_to_hub_model_idÚpush_to_hub_organizationÚpush_to_hub_tokenÚ
mp_parametersÚauto_find_batch_sizeÚfull_determinismÚ torchdynamoÚ ray_scopeÚ ddp_timeoutÚ
torch_compileÚtorch_compile_backendÚtorch_compile_modeÚinclude_tokens_per_secondÚinclude_num_input_tokens_seenÚneftune_noise_alphaÚoptim_target_modulesÚbatch_eval_metricsÚ
eval_on_startÚuse_liger_kernelÚliger_kernel_configÚeval_use_gather_objectÚaverage_tokens_across_devicesÚmodel_init_kwargsÚ
max_lengthÚtruncation_modeÚoptimize_device_cache©)ÚFloatingPointErrorÚ
OverflowErrorÚsuperÚ__init__rWrX)‰Úselfrvrwrxryrzr{r|r}r~rr€rrr„r…r†r‡r‰rrrrrrr“r”r•r–r—r™rrr r­r¿rWrXÚkwargsÚ __class__s‰ €rOz"UnslothIterativeSFTConfig.__init__Zs9 ø€ðT ˜ Ð Õ'9ð;VÐ]jð;Vð;Vð;Vñ(Wô(Wð"WØ ˜ Ð ¥Mð3FÐUbð3Fð3Fð3Fñ%Gô%GðGØ Ð  -°7Ò":Ð":¸zÈSÒ?PÐ?PØ7ˆ ˆŒÔðD DðD DðD DØ#˜ðD Dà#7Ð#7ðD Dð D Dðgð D Dð
$˜ð D Dð *˜
D Dð$8Ð#7ðD Dð+FÐ*EðD Dð*DÐ)CðD Dð(@Ð'?ðD Dð'>Ð&=ðD Dð+FÐ*EðD Dð'>Ð&=ðD Dð$˜ðD Dð'>Ð&=ðD Dð *˜Mð!D Dð"(˜<ð#D Dð$$˜ð%D Dð&$˜ð'D Dð((˜<ð)D Dð**˜Mð+D Dð,/ð-D Dð."˜ ð/D Dð0!2Ð 1ð1D Dð2(˜<ð3D Dð4(˜<ð5D Dð6"˜ ð7D Dð8!2Ð 1ð9D Dð:/ð;D Dð<&˜+ð=D Dð>/ð?D Dð@"4Ð!3ðAD DðB*˜MðCD DðD&<Ð%;ðED DðF*˜MðGD DðH$˜ðID DðJ/ðKD DðL/ðMD DðN!2Ð 1ðOD DðP.˜oðQD DðR7^Ð6]ðSD DðTgðUD DðVgðWD DðX,˜^ðYD DðZ4ð[D Dð\"˜ ð]D Dð^*˜Mð_D Dð` xðaD Dðb4ðcD Dðd4ðeD Dðf,˜^ðgD Dðh&<Ð%;ðiD Dðj,˜^ðkD Dðl,˜^ðmD Dðn4ðoD Dðp$˜ðqD Dðr&˜+ðsD Dðt*˜MðuD Dðv!2Ð 1ðwD DðxEðyD Dðz$8Ð#7ð{D Dð|$˜ð}D Dð~&<Ð%;ðD Dð@*DÐ)CðAD DðB$˜ðCD DðD xðED DðF(˜<ðGD DðH%:Ð$9ðID DðJ&˜+ðKD DðL&<Ð%;ðMD DðN%:Ð$9ðOD DðP!2Ð 1ðQD DðR/ðSD DðT4ðUD DðV#6Ð"5ðWD DðX&˜+ðYD DðZ2TÐ1Sð[D Dð\"4Ð!3ð]D Dð^"˜ ð_D Dð`&<Ð%;ðaD DðbEðcD Dðd$˜ðeD Dðf"˜ ðgD Dðh.˜oðiD Dðj"4Ð!3ðkD Dðl"˜ ðmD Dðn*DÐ)CðoD Dðp!2Ð 1ðqD Dðr%:Ð$9ðsD Dðt%:Ð$9ðuD Dðv-JÐ,IðwD Dðx#6Ð"5ðyD Dðz*DÐ)Cð{D Dð|&˜+ð}D Dð~&<Ð%;ðD Dð@(˜<ðAD DðB(˜<ðCD DðD"˜ ðED DðF/ðGD DðH.˜oðID DðJ(˜<ðKD DðL&<Ð%;ðMD DðN-JÐ,IðOD DðP*DÐ)CðQD DðR&<Ð%;ðSD DðT(˜<ðUD DðV$8Ð#7ðWD DðX(@Ð'?ðYD DðZ!2Ð 1ð[D Dð\*˜Mð]D Dð^$8Ð#7ð_D Dð`/ðaD Dðb&˜+ðcD Dðd"˜ ðeD Dðf&˜+ðgD Dðh*˜MðiD Dðj%:Ð$9ðkD Dðl"4Ð!3ðmD Dðn)BÐ(AðoD Dðp-JÐ,IðqD Dðr#6Ð"5ðsD Dðt$8Ð#7ðuD Dðv"4Ð!3ðwD Dðx*˜MðyD Dðz/ð{D Dð|#6Ð"5ð}D Dð~&<Ð%;ðD Dð@-JÐ,IðAD DðB!2Ð 1ðCD DðD$˜ðED DðF.˜oðGD DðH%:Ð$9¸FðID DðD DðD DðJ%9ˆÔ!Ø"4ˆÔÐÐrQ)†NNFFFrYFr5r5NNrZrZrr[r\r]r^r_r`rarbr4rcrdrrerfTNrgFr9FrgrhNTFFFFFFririFFFFrjrkFFNr4NNFrlFNrNr4NNTNFNNFrlrNNNNrmrnNFFroNNNNTFTFFNNrpNNFNFNFTrkNNNrlTFNrqrrFNNFFNNFFFNFTNNrsFNr4)
Ú__name__Ú
__module__Ú __qualname__Ú__doc__r*rWrrÚ__annotations__rXÚintrþÚ
__classcell__©rs@rOrSrS3s\ø€ððð:+0¨%ØØÐ+ñ+ô+И( 3œ-ððñð*/¨ØØÐ*ñ*ô*И #œððñð ØØØØØ$Ø&'Ø%&Ø#'Ø"&Ø&'Ø"#ØØ"%ØØØØØØØØØØØØØØØ!&ØØØØØØ27ØØØØØØØØØØØ!'ØØØØØØØØØ!"Ø%)ØØØØ $ØØ!&Ø $Ø Ø ØØØØ-1ØØ!$ØØØØØØ%)Ø Ø $Ø $Ø(-Ø"Ø%*ØØ!%ØØØØØØ!&Ø(,Ø%*Ø!%ØØ#Ø#'Ø ØØ ØØØØØ $Ø!Ø$)Ø(-ØØ Ø"Ø!&Ø(,Ø ØØ$Ø %ØðOVVVVVVVVVV5rQrScó”eZdZdZddgZ d deeefdeee e
fdee d eee e
ee ffd
eeeeeefd eejjejjjfd eeejejgejfd
eeege
ffˆfd
Zdede defdZdejdejdejfdZedeej deej deej deedeef
d¦«Z!e"j#¦« d!deeej deeej deeej deeedeeef
d¦«Z$dZ%ˆfdZ& d"deedeedeeeedffdZ'ˆxZ(S)#Ú_UnslothIterativeSFTTrainerrlÚtrlz
iterative-sftN©NNÚmodelÚargsÚ
data_collatorÚ eval_datasetÚprocessing_classÚ
optimizersÚpreprocess_logits_for_metricsÚcompute_metricsc
óêt|t¦«r|n |jj} |€.|  d¦«d}
t |
d¦«}nit|t ¦«rTt|t
¦«s?| ¦«} |j| d<|   d¦«t di| ¤Ž}|tj | ¦«}|j )t|t¦«stjd¦«t|t¦«r| ||¦«}t!¦«rt|t"¦«rd|_nd|_||_t)|jd d¦«|_|€;|jrt-|d
d ¬ ¦«|_n#t1|jd¬
¦«|_n||_|j|_|j|_|j|_t9¦« |||j|||||¬¦«t=|jd¦«r|j  |j!¦«| "|j#j$¦«|j% &|j|j'|j(¦«\|_|_'|_(|jdkrdnd|j_)t=|d¦«stUd¦«|jtV_dS)/r4z
-IterativeSFTr×zŠYou passed model_init_kwargs to the `IterativeSFTConfig`, but your model is already instantiated. The `model_init_kwargs` will be ignored.TFÚis_encoder_decoderéœÿÿÿé)Úlabel_pad_token_idÚpad_to_multiple_of)Úmlm)rrrrrrrrÚadd_model_tagsrsÚleftÚrightÚ acceleratorzXYour `Trainer` does not have an `accelerator` object. Consider upgrading `transformers`.rú),Ú
isinstanceÚstrÚconfigÚ
_name_or_pathÚsplitrr Úto_dictr×ÚpoprÚfrom_pretrainedrör'ÚwarnÚ_create_model_from_pathr#rÚ
is_peft_modelrÚgetattrrrrrÚhasattrrrÚ
_tag_namesÚcreate_optimizer_and_schedulerrr!ÚprepareÚ optimizerÚ lr_schedulerÚtruncation_sideÚAttributeErrorr)
rÿrrrrrrrrÚmodel_idÚ
model_nameÚ dict_argsrs
€rOz$_UnslothIterativeSFTTrainer.__init__xsø€õ"' u­R55¸¼ Ô8RˆØ ˆ<Øš¨Ñ,¨RÔ0ˆÐ&BÐ&BÐ&BÑCˆDˆ
˜Õ
ÀDÕJ\Ñ9]Ô9]🠚 œˆIØ%)¤^ˆI MŠMÐ 2¨ Ð2ˆ Ð <¸XÑ ð Ô -µjÀÍÑ6LÔ6LÐ ŒMð
ô
ð
õ e ׸Ñ=ˆEõ Ñ Ô ð '¥:¨eµYÑ#?Ô#?ð 'Ø!%ˆ Ð à!&ˆ à 0ˆÔÝ")¨%¬,Ð8LÈeÑ"TÔ"TˆÔà Ð ØÔ
gÝ%;Ø$¸ÐRSð&ñ&ô&Ô&EÀTÔEZÐ`eÐ%fÑ%fÔ%fÔ"à!.ˆ àœ/ˆŒØÔØ%)Ô%?ˆÔ
Œ×ÒØØØÔ!Ø*Gð ñ
ô
ð
õ 4”:Ð  ŒJ× % d¤oÑ  ×+¨D¬IÔ,?Ñ9=Ô8H×8PÒ8PØ ŒJ˜œ¨Ô(9ñ9
ô9
Ñ5ˆŒ
D”N DÔ$5ð;?Ô:NÐR\Ò:\Ð:\°°ÐbiˆÔÔt˜]Ñ Ý Øôð
ð/3Ô.H
Ô+rQÚ
model_pathÚreturncó8|jpi}tj|fi|¤ŽS)z0Creates a model from a path or model identifier.)r
r))rÿr9rs rOr+z3_UnslothIterativeSFTTrainer._create_model_from_pathÖs*à Ô8°bÐÝ3°JÐTÐBSÐTrQÚ input_idsÚattention_maskÚlabelscó´| d|D¦«}jr dt|||¦«D¦«¦« jj¦«}| dd¦«d|d|djjk<nJ‰ dt||¦«D¦«¦« jj¦«}j o‰j
dkr!ˆfd|  ¦«D¦«}nC‰j
d kr!ˆfd
|  ¦«D¦«}ntd j
¦«|S) Ncó6g|]}tj|¦«ŒS)r&Ú ones_like)Ú.0Úidss rOú
<listcomp>zD_UnslothIterativeSFTTrainer.prepare_model_inputs.<locals>.<listcomp>Ýs"ÐH°seœo¨cÑHrQcó"g|] \}}}|||dœŒ
S)©r<r=r>)rBrCÚattÚlabs rOrDzD_UnslothIterativeSFTTrainer.prepare_model_inputs.<locals>.<listcomp>ás8ðððá˜S #ð#&¸ÈÐððrQÚdecoder_input_idsrr>cóg|]
\}}||dœŒ S))r<r=)rBrCrGs rOrDzD_UnslothIterativeSFTTrainer.prepare_model_inputs.<locals>.<listcomp>ís$Ðj¹x¸sÀC˜s°cÐjrQÚ
keep_startcó6i|]\}}||djŒS©©rBÚvrÿs €rOú
<dictcomp>zD_UnslothIterativeSFTTrainer.prepare_model_inputs.<locals>.<dictcomp>ós,ø€ÐU¹$¸!¸Q˜a Ð#4 T¤_Ð#4Ô!5ÐUrQrscó8i|]\}}||j dŒSrMrNrOs €rOrRzD_UnslothIterativeSFTTrainer.prepare_model_inputs.<locals>.<dictcomp>õs/ø€ÐV¹4¸1¸a˜a  D¤OÐ#3Ð#5Ð#5Ô!6ÐVrQzUnknown truncation mode: )
rrr=r>rÚdevicer(rÚ pad_token_idr÷ÚitemsÚ
ValueError)rÿr<r=r>Ú
input_datas` rOÚprepare_model_inputsz0_UnslothIterativeSFTTrainer.prepare_model_inputsÛsø€Ø Ð H¸iÐHˆ Ô ×ðå),¨Y¸ÈÑ)OÔ)Oðñôñô÷
ŠbÔ
ð
NŠNÐÑ 5à_cˆJ  ¨HÔ!5¸Ô9NÔ9[Ò![Ñ ×jÍ3ÈyÐZhÑKiÔKiÐôçŠbÔ
ð
Œ?Ð Ô# UÀ*×BRÒBRÑBTÔBTÐU
ØÔÒVÀ:×CSÒCSÑCUÔCUÐV
å Ð!S¸TÔ=QÐ!SÐ!SÑÐrQÚtextsÚ texts_labelsc
óV|€6|€štddg||g¦«D]ƒ\}}t|t¦«s!t|dt |¦«¦«t|dt
j¦«s(td|dt |d¦«¦«Œ„n„tgd¢|||g¦«D]ƒ\}}t|t¦«s!t|dt |¦«¦«t|dt
j¦«s(td|dt |d¦«¦«Œ„nêt|t¦«std t |¦«¦«t|dt¦«s%td
t |d¦«¦«|tt|t¦«std t |¦«¦«t|dt¦«s%td t |d¦«¦«|||||fS)

Check if the input data is valid for training.
Args:
input_ids (list[`torch.LongTensor`]):
List of tensors containing the input_ids
attention_mask (list[`torch.LongTensor`]):
List of tensors containing the attention_mask
labels (list[`torch.FloatTensor`]):
List of tensors containing the labels
texts (list[`str`]):
List of strings containing the text input.
texts_labels (list[`str`]):
List of strings containing the text labels.
Returns:
`tuple`: The input data.
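The text branch of this check can be sketched in plain Python. This is an illustration under assumptions, not the trainer's own code: `check_text_inputs` is a hypothetical name, and the sketch validates every element of each list, which may be stricter than the compiled method.

```python
from typing import List, Optional


def check_text_inputs(texts: List[str], texts_labels: Optional[List[str]] = None) -> None:
    """Hypothetical stand-in for the text branch of the safety check."""
    if not isinstance(texts, list):
        raise ValueError(f"'text' must be a list of strings - got {type(texts)}")
    # Assumption: check all elements, not only the first one.
    for item in texts:
        if not isinstance(item, str):
            raise ValueError(f"Elements in 'text' must be strings - got {type(item)}")
    if texts_labels is not None:
        if not isinstance(texts_labels, list):
            raise ValueError(
                f"'text_labels' must be a list of strings - got {type(texts_labels)}"
            )
        for item in texts_labels:
            if not isinstance(item, str):
                raise ValueError(
                    f"Elements in 'text_labels' must be strings - got {type(item)}"
                )
```

Valid input returns `None`; any malformed list raises `ValueError` before tokenization would start.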
Nr<r>z! must be a list of tensors - got rz Elements in z must be tensors - got rFz''text' must be a list of strings - got z)Elements in 'text' must be strings - got z.'text_labels' must be a list of strings - got z0Elements in 'text_labels' must be strings - got )r=r"ÚlistrWÚtyper&rr#)r<r=r>rZr[ÚnameÚ tensor_lists rOÚ_step_safety_checkerz0_UnslothIterativeSFTTrainer._step_safety_checkerûsð4 ‰=ØÐ%Ý),¨k¸8Ð-DÀyÐRXÐFYÑ)ZÔ)ZðmðmÑ%D˜% kµ4ÑhÝ(¨DÐ)fÐ)fÕSWÐXcÑSdÔSdÐ)fÐ)fÑ% k°!¤nµe´mÝ(Ð)k¸Ð)kÐ)kÕUYÐZeÐfgÔZhÑUiÔUiÐ)kÐ)kÑmñmõ *-Ø=À È>Ð[aÐ?bñ*ô*ðmðmÑ%D˜& kµ4ÑhÝ(¨DÐ)fÐ)fÕSWÐXcÑSdÔSdÐ)fÐ)fÑ% k°!¤nµe´mÝ(Ð)k¸Ð)kÐ)kÕUYÐZeÐfgÔZhÑUiÔUiÐ)kÐ)kÑmð mõ˜e¥TÑ
ZÝ Ð!XÍ4ÐPUÉ;Ì;Ð!XÐ!Xјe Aœh­Ñ
_Ý Ð!]ÍTÐRWÐXYÔRZÉ^Ì^Ð!]Ð!]ÑÐ! ÑlÝ$Ð%jÕVZÐ[gÑVhÔVhÐ%jÐ%jÑ! ,¨q¤/µ3ÑqÝ$Ð%oÕX\Ð]iÐjkÔ]lÑXmÔXmÐ%oÐ%oј.¨&°%¸ÐErQcó†j ¦«jjdkrGt jd¦« jj¦«_ jj_
||td¦«||tj
dt¦«||jrtd¦«|
|ddnd}|
|ddnd}|
|ddnd}|
|ddnd}|
|ddnd} |||||¦«\}}}}}|/‰ |jddd¬ ¦«}|d
|d }}|%‰ |jddd¬ ¦«d
}||} |||¦«}t)| ¦«¦«}i}| |¦«ˆfd } t/j|¦«}
|
 d
¦«t5|
jjd| ¬¦«} t9| ¦«D]—\} Šj j¦«5ˆfd|D¦«} j|¦«}
jj dkr|
 !¦«}
|
 "¦«}j #|
¦«jj$rH‰jj%<‰j &j '¦«jj%¦«j( )¦«j( *¦«j+j+ )¦«jxjdz
c_xj |z
c_  ,¦«ddd¦«n #1swxYwYŒ™dS)
Run an optimisation step given a list of input_ids, attention_mask, and labels or a list of text and
text_labels.
Args:
input_ids (list[`torch.LongTensor`]):
List of tensors containing the input_ids (if not provided, text will be used)
attention_mask (list[`torch.LongTensor`], *optional*):
List of tensors containing the attention_mask
labels (list[`torch.FloatTensor`], *optional*):
List of tensors containing the labels (if set to None, will default to input_ids)
texts (list[`str`], *optional*):
List of strings containing the text input (if not provided, input_ids will directly be used)
texts_labels (list[`str`], *optional*):
List of strings containing the text labels (if set to None, will default to text)
Returns:
`dict[str, Any]`: A summary of the training statistics
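The argument rules described above can be sketched as a small resolver. It is a simplified illustration in plain Python: `resolve_step_inputs` and its `is_encoder_decoder` parameter are hypothetical names, and the real method also tokenizes, truncates, and runs the optimisation step, none of which is shown here.

```python
import warnings
from typing import Any, List, Optional, Tuple


def resolve_step_inputs(
    input_ids: Optional[List[Any]] = None,
    labels: Optional[List[Any]] = None,
    texts: Optional[List[str]] = None,
    texts_labels: Optional[List[str]] = None,
    is_encoder_decoder: bool = False,
) -> Tuple[str, List[Any], List[Any]]:
    """Hypothetical resolver: decide which inputs and targets a step would use."""
    if input_ids is None and texts is None:
        raise ValueError("Step should include `input_ids` or `texts` as keyword arguments.")
    if input_ids is not None and texts is not None:
        # Texts take priority; input_ids are ignored in that case.
        warnings.warn(
            "Both `input_ids` and `texts` are provided; `input_ids` will be ignored.",
            UserWarning,
        )
    if labels is None and texts_labels is None and is_encoder_decoder:
        raise ValueError(
            "No 'labels' or 'text_labels' are provided. When using an "
            "encoder-decoder architecture, 'labels' or 'text_labels' must be passed."
        )
    if texts is not None:
        # Simplification: text labels default to the input text.
        targets = texts_labels if texts_labels is not None else texts
        return "texts", texts, targets
    # Labels default to the input ids when not given.
    targets = labels if labels is not None else input_ids
    return "input_ids", input_ids, targets
```

Calling `resolve_step_inputs(input_ids=[[1, 2]])` yields the ids as both inputs and targets, which mirrors the documented default of labels falling back to `input_ids`.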
rrmNz@Step should include `input_ids` or `texts` as keyword arguments.ztBoth `input_ids` and `texts` argument are provided. `input_ids` will be ignored. Please provide only one of the two.z€No 'labels' or 'text_labels' are provided. When using an encoder-decoder architecture, 'labels' or 'text_labels' must be passed.TÚpt)Ú
truncationÚpaddingÚreturn_tensorsr<r=cóÄt¦«}|dD]FŠdvr@tjˆfd|D¦«¦« jj¦«|<ŒG|S)NrrFcó g|]
}|Œ S)rBÚkeys €rOrDzF_UnslothIterativeSFTTrainer.step.<locals>.collator.<locals>.<listcomp>†sø€Ð3IÐ3IÐ3I¸q°A°c´FÐ3IÐ3IÐ3IrQ)Údictr&Ústackr>rrT)ÚdataÚ return_dictrjrÿs @€rOÚcollatorz2_UnslothIterativeSFTTrainer.step.<locals>.collatorsqøø€Ý™&œ&ˆ˜A”wð
að
aØÐCÝ',¤{Ð3IÐ3IÐ3IÐ3IÀDÐ3IÑ3IÔ3IÑ'JÔ'J×'MÒ'MÈdÌjÔN_Ñ'`Ô'`K Ñ$øØÐ rQr&)Ú
batch_sizeÚshuffleÚ
collate_fncó"i|] }||Œ S)rBrPÚbatchs €rOrRz4_UnslothIterativeSFTTrainer.step.<locals>.<dictcomp>•sø€Ð  5¨¤8ÐHrQr9)-rÚtrainÚstateÚ global_stepr&Útensorr>rrTÚtr_lossÚ_globalstep_last_loggedrWr'r*Ú UserWarningrrarrYr]ÚkeysÚupdaterÚ from_dictÚ
set_formatrr}Ú enumerater!Ú
accumulateÚ compute_lossÚn_gpuÚmeanÚdetachÚbackwardÚsync_gradientsrŠÚclip_grad_norm_Ú
parametersr2ÚstepÚ zero_gradr3Ú_maybe_log_save_evaluate)rÿr<r=r>rZr[Ú model_inputsÚmodel_inputs_namesÚ
batch_dictroÚ
batch_dataÚstep_dataloaderÚlossÚ tr_loss_steprts` @rOz _UnslothIterativeSFTTrainer.step1sqøø€ð8
Œ
×ÒÑÔÐà Œ:Ô !   œ<¨Ñ,×´ Ô0@ÑAˆDŒLØ+/¬:Ô+AˆDÔ Ð   ÝÐ
Ð
" uÐ'8Ý ŒMðñ
ô
ð
ð ˆ>˜2°tÔ7NÐðSñôð
ð
%.Ð$9I˜a˜a˜a”LL¸tˆ Ø.<Ð.H˜¨¨¨Ô*ÈdˆØ0˜˜˜°dˆØ-aaa”°4ˆØ*6Ð*B| A A A”Ȉ àAE×AZÒAZØ ~ v¨u°lñB
ôB
Ñ>ˆ > 6¨5°,ð Ð Ø× $¤/¸dÈDÐaeðôˆ)5°[Ô(AÀ<ÐP`ÔCa~ˆ Ð × $¤/¸dÈDÐaeðôàôˆFð ˆ>؈Fà׸NÈFÑSˆ å! ,×"3Ò"3Ñ"5Ô"5Ñàˆ
Ø×Ò˜ ð ð ð ð õÔ& 2ˆ
Ø×Ò˜ Ø”yÔØð 
ñ
ô
ˆõ"  0‰HˆAˆuØÔ!×,¨T¬ZÑ

HÐ5GÐH Ø×¬°\ÑBà”9”? Ÿ9š9™;œ;#Ÿ{š{™}œ} àÔ ×)¨$ÑÔ°t´yÔ7NÐ7ZØÔœ
ל Ôôðð
××ÔÔ