DS-LLM-TEMPLATE-FINETUNING/unsloth_compiled_cache/__pycache__/UnslothPRMTrainer.cpython-311.pyc
250 lines
38 KiB

2025-08-13 23:50:20 +00:00
§
5$hÞ›ãódZddlmZddlZddlmZddlmZddlmZm Z m
Z
m Z m Z m
Z
mZmZddlmZmZmZmZmZmZmZm
Z
mZmZmZmZmZmZmZmZmZm Z m Z m!Z!m"Z"m#Z#m$Z$m%Z%m&Z&m'Z'm(Z(mZm)Z)m*Z*m+Z+mZm,Z,m
Z
mZmZmZm'Z'm)Z)mZddl)Z)ddlTddl-m.Z.m/Z/dd l0m1Z1ddlZddl2Z3dd
l4m5Z5ddlmZdd l6m7Z7m8Z9d d
d d
d
dœZ:ej;d d e:¬¦«d¦«Z<e.Gdde¦«¦«Z= Gdde¦«Z>Gdde>¦«Z?dS)z8
2025.8.4
2025.8.5
4.55.1
0.21.0
__UNSLOTH_VERSIONING__
é)ÚTensorN)Ú
functional)ÚAnyÚListÚOptionalÚTupleÚUnionÚDictÚSetÚCallable)(ÚBaseImageProcessorr Ú DataCollatorÚ"DataCollatorForTokenClassificationÚDatasetÚEvalPredictionÚFeatureExtractionMixinrÚ PRMConfigÚ
PRMTrainerÚ PartialStateÚPathÚ PeftModelÚPreTrainedModelÚPreTrainedTokenizerBaseÚProcessorMixinÚTrainerÚTrainerCallbackr ÚchainÚcompute_accuracyÚdisable_dropout_in_modelÚfeaturesÚgenerate_model_cardÚinspectÚis_peft_availableÚis_wandb_availableÚnnÚosÚprepare_model_for_kbit_trainingÚtextwrapÚtorchÚwarningsrrrrr#r&r))Ú*)Ú dataclassÚfield)ÚVersion)Ú nullcontext)ÚDataCollatorForSeq2SeqÚDataCollatorForLanguageModelingTF)Úepilogue_fusionÚ max_autotuneÚ
shape_paddingz
trace.enabledztriton.cudagraphs)ÚdynamicÚ fullgraphÚoptionscó’tj| d|jd¦«dd¬¦«}tj| d¦«dd¬¦«}g}t ||¦«D]\}}| tj¦«}tj|d| d¦«¬¦«  d¦«}tj
|d¬¦«}||z
} |  | ¦«Œ’ tj |¦«}| |jd|jdf¦«}|S)Néÿÿÿÿér)ÚchunksÚdim)r<Úindex)r<é)
r)ÚchunkÚreshapeÚshapeÚzipÚtoÚfloat32ÚgatherÚ unsqueezeÚsqueezeÚ logsumexpÚappendÚconcat)
Úlogitsr=Úchunked_logitsÚ
chunked_indexÚall_per_token_logpsÚ chunk_logitsÚ chunk_indexÚselected_logitsÚlogsumexp_valuesÚper_token_logpss
ú]/workspace/Fine-tuning/DS-LLM-TEMPLATE-FINETUNING/unsloth_compiled_cache/UnslothPRMTrainer.pyÚchunked_selective_log_softmaxrU"s5õ”[ §¢°°F´LÀÔ4DÑ!EÔ!EÐPQÐYZÐ[€NÝ”[ §¢¨rÑ!2Ô!2¸QÀaÐH€MØÐå%(¨¸Ñ%GÔ%Gð #—¥u¤}Ñ Ýœ, |¸2À{×G\ÒG\Ð]_ÑG`ÔG`Ða×iÐjlÑmˆÝ œ?¨<¸Ø)Ð,<Ñ<ˆØ×" Ýœ,Ð':ÑØ-×5°v´|ÀA´ÈÌ ÐUVÌÐ6XÑØ Ðócó eZdZUdZedddi¬¦«Zeeed<edddi¬¦«Z ee
ed < d,ˆfd+„ Z ˆxZ S)-ÚUnslothPRMConfiga:
Configuration class for the [`PRMTrainer`].
This class includes only the parameters that are specific to PRM training. For a full list of training arguments,
please refer to the [`~transformers.TrainingArguments`] documentation. Note that default values in this class may
differ from those in [`~transformers.TrainingArguments`].
Using [`~transformers.HfArgumentParser`] we can turn this class into
[argparse](https://docs.python.org/3/library/argparse#module-argparse) arguments that can be specified on the
command line.
Parameters:
max_length (`int` or `None`, *optional*, defaults to `1024`):
Maximum length of the sequences (prompt + completion) used for truncation.
max_prompt_length (`int` or `None`, *optional*, defaults to `512`):
Maximum length of the prompt used for truncation.
max_completion_length (`int` or `None`, *optional*, defaults to `None`):
Maximum length of the completion used for truncation. The completion is the concatenation of the steps.
disable_dropout (`bool`, *optional*, defaults to `True`):
Whether to disable dropout in the model.
step_separator (`str`, *optional*, defaults to `"\n"`):
Separator used to separate each step of the reasoning process.
train_on_last_step_only (`bool`, *optional*, defaults to `False`):
Whether to train only on the last step.
dataset_num_proc (`int`, *optional*, defaults to `None`):
Number of processes to use for processing the dataset.
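The three length limits above are applied in a fixed order. The following is a minimal sketch on plain lists of token ids. It assumes the behavior of the upstream TRL `PRMTrainer.tokenize_row`: the prompt is truncated from the left, the completion from the right, and the concatenation is then capped at `max_length`. The helper name `truncate_example` is hypothetical and not part of the library.

```python
from typing import List, Optional, Tuple

IGNORE_INDEX = -100  # label value that the loss ignores


def truncate_example(
    prompt_ids: List[int],
    completion_ids: List[int],
    completion_labels: List[int],
    max_length: Optional[int] = 1024,
    max_prompt_length: Optional[int] = 512,
    max_completion_length: Optional[int] = None,
) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: apply the three limits in the assumed order."""
    # Keep the END of the prompt (assumption: left truncation).
    if max_prompt_length is not None:
        prompt_ids = prompt_ids[-max_prompt_length:]
    # Keep the START of the completion (assumption: right truncation).
    if max_completion_length is not None:
        completion_ids = completion_ids[:max_completion_length]
        completion_labels = completion_labels[:max_completion_length]
    # Prompt tokens never contribute to the loss.
    input_ids = prompt_ids + completion_ids
    labels = [IGNORE_INDEX] * len(prompt_ids) + completion_labels
    # Finally cap the whole sequence.
    if max_length is not None:
        input_ids = input_ids[:max_length]
        labels = labels[:max_length]
    return input_ids, labels


ids, labels = truncate_example(
    [1, 2, 3, 4], [5, 6, 7], [IGNORE_INDEX, IGNORE_INDEX, 1],
    max_length=5, max_prompt_length=2,
)
print(ids, labels)
```

Note that a small `max_length` can cut off the final labelled token, in which case the example contributes nothing to the loss.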
helpzvLLM SamplingParams)ÚdefaultÚmetadataÚvllm_sampling_paramsr9z8Chunk size to reduce memory usage. -1 is most efficient.Úunsloth_num_chunksFÚnor:éréúç-Cëâ6
?ç{®Gáz„?çÍÌÌÌÌÌì?ç+‡ÙÎ÷ï?ç:Œ0âŽyE>çð?çlinearçš™™™™™¹?ÚpassiveÚwarningTÚstepsr>éôéO
ÚO1ÚautoÚçÚ
adamw_8bitÚlengthÚ
every_saveÚlastééé óN|dkrtd|d¦«|dkrtd|d¦«||#dkr
|$dkrd}d }#|‡€!d
d lm}t |‹¦«d zd ¦«}‡t ¦«jd”id
|d|d|d|d|d|d|d|d| “d|
d| d| d|
d|d|d|d|d|d|d |d!|d"|d#|d$|d%|d&|d'|d(|d)|d*|d+|d,| “d-|!“d.|"“d/|#“d0|$“d1|%“d2|&“d3|'“d4|(“d5|)“d6|*“d7|+“d8|,“d9|-“d:|.“d;|/“d<|0“d=|1“d>|2“d?|3“d@|4“dA|5“dB|6“dC|7“dD|8“dE|9“dF|:“dG|;“dH|<“dI|=“dJ|>“dK|?“dL|@“dM|A“dN|B“dO|C“dP|D“dQ|E“dR|F“dS|G“dT|H“dU|I“dV|J“dW|K“dX|L“dY|M“dZ|N“d[|O“d\|P“d]|Q“d^|R“d_|S“d`|T“da|U“db|V“dc|W“dd|X“de|Y“df|Z“dg|[“dh|\“di|]“dj|^“dk|_“dl|`“dm|a“dn|b“do|c“dp|d“dq|e“dr|f“ds|g“dt|h“du|i“dv|j“dw|k“dx|l“dy|m“dz|n“d{|o“d||p“d}|q“d~|r“d|s“d€|t“d|u“d|v“dƒ|w“d„|x“d…|y“d†|z“d‡|{“dˆ||“d‰|}“dŠ|~“d|dŒ|€“d|dŽ|‚“d|ƒ“d|„“d‘|…“d’|†“d“|‡“|ФŽ|ˆ|_|‰|_dS)•NgH¯¼šò×z>z Unsloth: Your learning rate of `zi` is too small and less than 1e-7! Consider increasing it, otherwise gradient updates will be close to 0!r>za` is way too larger > 1! Consider decreasing it to 1e-1, otherwise gradient updates will explode!rlrmÚunsloth_training_checkpointsr^r)Ú cpu_countr_Ú
output_dirÚoverwrite_output_dirÚdo_trainÚdo_evalÚ
do_predictÚ
eval_strategyÚprediction_loss_onlyÚper_device_train_batch_sizeÚper_device_eval_batch_sizeÚper_gpu_train_batch_sizeÚper_gpu_eval_batch_sizeÚgradient_accumulation_stepsÚeval_accumulation_stepsÚ
eval_delayÚtorch_empty_cache_stepsÚ
learning_rateÚ weight_decayÚ
adam_beta1Ú
adam_beta2Ú adam_epsilonÚ
max_grad_normÚnum_train_epochsÚ max_stepsÚlr_scheduler_typeÚ warmup_ratioÚ warmup_stepsÚ log_levelÚlog_level_replicaÚlog_on_each_nodeÚ logging_dirÚlogging_strategyÚlogging_first_stepÚ
logging_stepsÚlogging_nan_inf_filterÚ
save_strategyÚ
save_stepsÚsave_total_limitÚsave_safetensorsÚsave_on_each_nodeÚsave_only_modelÚ'restore_callback_states_from_checkpointÚno_cudaÚuse_cpuÚuse_mps_deviceÚseedÚ data_seedÚ
jit_mode_evalÚuse_ipexÚbf16Úfp16Úfp16_opt_levelÚhalf_precision_backendÚbf16_full_evalÚfp16_full_evalÚtf32Ú
local_rankÚ ddp_backendÚ
tpu_num_coresÚtpu_metrics_debugÚdebugÚdataloader_drop_lastÚ
eval_stepsÚdataloader_num_workersÚdataloader_prefetch_factorÚ
past_indexÚrun_nameÚ disable_tqdmÚremove_unused_columnsÚ label_namesÚload_best_model_at_endÚmetric_for_best_modelÚgreater_is_betterÚignore_data_skipÚfsdpÚfsdp_min_num_paramsÚ fsdp_configÚ"fsdp_transformer_layer_cls_to_wrapÚaccelerator_configÚ deepspeedÚlabel_smoothing_factorÚoptimÚ
optim_argsÚ adafactorÚgroup_by_lengthÚlength_column_nameÚ report_toÚddp_find_unused_parametersÚddp_bucket_cap_mbÚddp_broadcast_buffersÚdataloader_pin_memoryÚdataloader_persistent_workersÚskip_memory_metricsÚuse_legacy_prediction_loopÚ push_to_hubÚresume_from_checkpointÚ hub_model_idÚ hub_strategyÚ hub_tokenÚhub_private_repoÚhub_always_pushÚ hub_revisionÚgradient_checkpointingÚgradient_checkpointing_kwargsÚinclude_inputs_for_metricsÚeval_do_concat_batchesÚ fp16_backendÚpush_to_hub_model_idÚpush_to_hub_organizationÚpush_to_hub_tokenÚ
mp_parametersÚauto_find_batch_sizeÚfull_determinismÚ torchdynamoÚ ray_scopeÚ ddp_timeoutÚ
torch_compileÚtorch_compile_backendÚtorch_compile_modeÚinclude_tokens_per_secondÚinclude_num_input_tokens_seenÚneftune_noise_alphaÚoptim_target_modulesÚbatch_eval_metricsÚ
eval_on_startÚuse_liger_kernelÚliger_kernel_configÚeval_use_gather_objectÚaverage_tokens_across_devicesÚ
max_lengthÚmax_prompt_lengthÚmax_completion_lengthÚdisable_dropoutÚstep_separatorÚtrain_on_last_step_onlyÚdataset_num_proc©) ÚFloatingPointErrorÚ
OverflowErrorÚmultiprocessingr|ÚminÚsuperÚ__init__r\r])Úselfr}r~rr€rrr„r…r†r‡r‰rrrrrrr“r”r•r–r—r™rrr r­r¿rÿrrrrr\r]Úkwargsr|Ú __class__s €rTr
zUnslothPRMConfig.__init__Zs ø€ð\ ˜ Ð Õ'9ð;VÐ]jð;Vð;Vð;Vñ(Wô(Wð"WØ ˜ Ð ¥Mð3FÐUbð3Fð3Fð3Fñ%Gô%GðGØ Ð  -°7Ò":Ð":¸zÈSÒ?PÐ?PØ7ˆ ˆ Ð " 9 9¡;¤;¨q¡=°!Ñ àŒÔðGGG#˜ðG :à#7Ð#7ðG Ggð G
$˜ð G *˜
G$8Ð#7ðG+FÐ*EðG*DÐ)CðG(@Ð'?ðG'>Ð&=ðG+FÐ*EðG'>Ð&=ðG$˜ðG'>Ð&=ðG*˜Mð!G :ð"(˜<ð#G :ð$$˜ð%G :ð&$˜ð'G :ð((˜<ð)G :ð**˜Mð+G :ð,/ð-G :ð."˜ ð/G :ð0!2Ð 1ð1G :ð2(˜<ð3G :ð4(˜<ð5G :ð6"˜ ð7G :ð8!2Ð 1ð9G :ð:/ð;G :ð<&˜+ð=G :ð>/ð?G :ð@"4Ð!3ðAG :ðB*˜MðCG :ðD&<Ð%;ðEG :ðF*˜MðGG :ðH$˜ðIG :ðJ/ðKG :ðL/ðMG :ðN!2Ð 1ðOG :ðP.˜oðQG :ðR7^Ð6]ðSG :ðTgðUG :ðVgðWG :ðX,˜^ðYG :ðZ4ð[G :ð\"˜ ð]G :ð^*˜Mð_G :ð` xðaG :ðb4ðcG :ðd4ðeG :ðf,˜^ðgG :ðh&<Ð%;ðiG :ðj,˜^ðkG :ðl,˜^ðmG :ðn4ðoG :ðp$˜ðqG :ðr&˜+ðsG :ðt*˜MðuG :ðv!2Ð 1ðwG :ðxEðyG :ðz$8Ð#7ð{G :ð|$˜ð}G :ð~&<Ð%;ðG :ð@*DÐ)CðAG :ðB$˜ðCG :ðD xðEG :ðF(˜<ðGG :ðH%:Ð$9ðIG :ðJ&˜+ðKG :ðL&<Ð%;ðMG :ðN%:Ð$9ðOG :ðP!2Ð 1ðQG :ðR/ðSG :ðT4ðUG :ðV#6Ð"5ðWG :ðX&˜+ðYG :ðZ2TÐ1Sð[G :ð\"4Ð!3ð]G :ð^"˜ ð_G :ð`&<Ð%;ðaG :ðbEðcG :ðd$˜ðeG :ðf"˜ ðgG :ðh.˜oðiG :ðj"4Ð!3ðkG :ðl"˜ ðmG :ðn*DÐ)CðoG :ðp!2Ð 1ðqG :ðr%:Ð$9ðsG :ðt%:Ð$9ðuG :ðv-JÐ,IðwG :ðx#6Ð"5ðyG :ðz*DÐ)Cð{G :ð|&˜+ð}G :ð~&<Ð%;ðG :ð@(˜<ðAG :ðB(˜<ðCG :ðD"˜ ðEG :ðF/ðGG :ðH.˜oðIG :ðJ(˜<ðKG :ðL&<Ð%;ðMG :ðN-JÐ,IðOG :ðP*DÐ)CðQG :ðR&<Ð%;ðSG :ðT(˜<ðUG :ðV$8Ð#7ðWG :ðX(@Ð'?ðYG :ðZ!2Ð 1ð[G :ð\*˜Mð]G :ð^$8Ð#7ð_G :ð`/ðaG :ðb&˜+ðcG :ðd"˜ ðeG :ðf&˜+ðgG :ðh*˜MðiG :ðj%:Ð$9ðkG :ðl"4Ð!3ðmG :ðn)BÐ(AðoG :ðp-JÐ,IðqG :ðr#6Ð"5ðsG :ðt$8Ð#7ðuG :ðv"4Ð!3ðwG :ðx*˜MðyG :ðz/ð{G :ð|#6Ð"5ð}G :ð~&<Ð%;ðG :ð@-JÐ,IðAG :ðB$˜ðCG :ðD!2Ð 1ðEG :ðF%:Ð$9ðGG :ðH.˜oðIG :ðJ,˜^ðKG :ðL'>Ð&=ðMG :ðN/°&ðOGGG :ðP%9ˆÔ!Ø"4ˆÔÐÐrV)‰NNFFFr^Fr:r:NNr_r_rr`rarbrcrdrerfrgr9rhrirrjrkTNrlFr>FrlrmNTFFFFFFrnrnFFFFrorpFFNr9NNFrqFNrNr9NNTNFNNFrqrNNNNrrrsNFFrtNNNNTFTFFNNruNNFNFNFTrpNNNrqTFNrvrwFNNFFNNFFFNFTrxryNTrqFNNr9)
Ú__name__Ú
__module__Ú __qualname__Ú__doc__r-r\rrÚ__annotations__r]Úintr
Ú
__classcell__©r
s@rTrXrX3sgø€ððð:+0¨%ØØÐ+ñ+ô+И( 3œ-ððñð*/¨ØØÐ*ñ*ô*И #œððñð ØØØØØ$Ø&'Ø%&Ø#'Ø"&Ø&'Ø"#ØØ"%ØØØØØØØØØØØØØØØ!&ØØØØØØ27ØØØØØØØØØØØ!'ØØØØØØØØØ!"Ø%)ØØØØ $ØØ!&Ø $Ø Ø ØØØØ-1ØØ!$ØØØØØØ%)Ø Ø $Ø $Ø(-Ø"Ø%*ØØ!%ØØØØØØ!&Ø(,Ø%*Ø!%ØØ#Ø#'Ø ØØ ØØØØØ $Ø!Ø$)Ø(-ØØ Ø"Ø!&Ø(,ØØØ $Øðà"'ØØðW``````````5rVrXcó\eZdZdZddgZ ddeeeej fdee
dee d ee d
eee e
ee ffd eeeeeefd eegefd
eeege
fdeeedeejjejjjfdeeejejgejfdee
fˆfd
Zed¦«Z ˆfdZ! ddeedeedeeeedffdZ"ˆxZ#S)Ú_UnslothPRMTrainerrqÚtrlÚprmN©NNÚmodelÚargsÚ
data_collatorÚ
train_datasetÚ eval_datasetÚprocessing_classÚ
model_initÚcompute_metricsÚ callbacksÚ
optimizersÚpreprocess_logits_for_metricsÚ peft_configc
ót¦«s| td¦«t¦«r¯| ­t|t¦«s˜t |dd¦«st |dd¦«rtdt t
jt¦«j ¦«v}
d|j
i}|
s|j tj
d¦«n|
r|j
|j |d<t|fi|¤Ž}|}|jrt|¦«|t }|€'|td¦«t#||j¬ ¦«}d
|jvrzt)¦« ¦«5||j|j|j|j|jd œ}i|¥d di¥}| |j||j|jd
t;jt;jt;j d¦«¦«t;jt;j d¦«¦«dœ¦«¬¦«}i|¥d di¥}|‡| |j||j|jdt;jt;jt;j d¦«¦«t;jt;j d¦«¦«dœ¦«¬¦«}ddd¦«n #1swxYwYtC¦« "||||||||| |
| ¬¦ « tG|j$d¦«r!|j$ %|j&¦«dSdS)NzvPEFT is not installed and you passed a `peft_config` in the trainer's kwargs, please install it to use the PEFT modelsÚis_loaded_in_8bitFÚ is_quantizedrãÚuse_gradient_checkpointingzÂYou passed `gradient_checkpointing_kwargs` in the trainer's kwargs, but your peft version does not support it. please update to the latest version of peft to use `gradient_checkpointing_kwargs`.z^A processing_class must be specified when using the default DataCollatorForTokenClassification)Ú input_ids)Ú tokenizerrrÿrÚis_evalzTokenizing train datasetÚint64)Úlabelsr+)Ú fn_kwargsÚnum_procÚremove_columnsÚdescr TzTokenizing eval dataset) rrrrrr r!r"r#r$r%Úadd_model_tags)'r#Ú
ValueErrorÚ
isinstancerÚgetattrÚlistr"Ú signaturer'Ú
parametersrâr*ÚwarnrrrrÚ column_namesrÚmain_process_firstrrÿrÚmapÚ tokenize_rowrr ÚFeaturesÚSequenceÚValuer r
Úhasattrrr4Ú
_tag_names)r rrrrrr r!r"r#r$r%r&Ú_supports_gc_kwargsÚprepare_model_kwargsr0Útrain_fn_kwargsÚeval_fn_kwargsr
s €rTr
z_UnslothPRMTrainer.__init__ø€õ(  {Ð'>ÝðIñôð
õÑ
Ô
ð  [Ð%<ݘe¥YÑ
ݘ5Ð"5°uÑ[ÅÈÐP^Ð`eÑAfÔAfð[Ø*IÍTÝÔ)Õ*IÑNôNð+Ð-IÈ$ÔJeÐ+fÐs°4Ô3UÐ3aÝ œ
ðrñôðððs°Ô1SÐ1_ØPTÔPrÐ,Ð-LÑ;¸ZÐEYÐZð Ô ð $  Ð .ˆ Ð ØÐ Øôðõ?Ð?OÐ\`Ô\kÐlˆ ˜mÔ ×&
ð&
à!1Ø&*Ô&9Ø"&¤/Ø)-Ô)?Ø-1Ô-GØ/3Ô/Kð
ð ð#B YÐ"A° ¸5Ð"AÐ"AØ -× 1Ò 1ØÔ2Ø#0Ô#9Ø.å&.Ô&7½¼ÀwÑ8OÔ8OÑ&PÔ&PÝ)1Ô):½8¼>È'Ñ;RÔ;RÑ)SÔ)Sððñôð
!2ñ !ô !
ð"@ IÐ!?¨y¸$Ð!?Ð!?ØÐ+Ø#/×#3Ò#3ØÔ)Ø"0Ø!%Ô!6Ø'3Ô'<Ø6Ý!)Ô!2å*2Ô*;½H¼NÈ7Ñ<SÔ<SÑ*TÔ*TÝ-5Ô->½x¼~ÈgÑ?VÔ?VÑ-WÔ-Wððñ"ô"ð
$4ñ $ô $Lð5&
ð&
ð&
ñ&
ô&
ð&
ð&
ð&
ð&
ð&
ð&
øøøð&
ð&
ð&
ð&
õP Œ×ÒØØØØ!Ø*Gð ñ
ô
ð
õ 4”:Ð  ŒJ× % d¤oÑ  7sÅEJÊJ#Ê&J#có¬
|dd¬¦«d}ˆfd|dD¦«} |r<|s:dgt|d¦«d z
zt|dd
¦«gz}
nd |dD¦«}
 |d¬¦«Š
ˆ
fd | D¦«} d
t| |
¦«D¦«}
t t | ަ«} t t |
ަ«}
j jg|z}| || d}|| d|} |
d|}
|| z} dgt|¦«z|
z}
|| d|} |
d|}
| |
dœS)a/
Tokenize a row of the dataset.
Args:
features (`dict[str, str]`):
Row of the dataset, should contain the keys `"prompt"`, `"completions"`, and `"labels"`.
tokenizer (`PreTrainedTokenizerBase`):
Tokenizer used to process the data.
step_separator (`str`):
Separator between steps in the completion.
max_length (`int` or `None`):
Maximum length of the sequences (prompt + completion). If `None`, the sequences are not truncated.
max_prompt_length (`int` or `None`):
Maximum length of the prompt. If `None`, the prompt is not truncated.
max_completion_length (`int` or `None`):
Maximum length of the completion sequences. If `None`, the completion sequences are not truncated.
train_on_last_step_only (`bool`):
Whether to train only on the last step. If `True`, the labels are `-100` for all tokens except the last
token of the completion.
is_eval (`bool`):
Whether the function is used to tokenize samples from a training or an evaluation dataset. Used only if
`train_on_last_step_only` is set to `True`.
Returns:
`dict[str, list[int]]`:
Tokenized sequences with the keys `"input_ids"` and `"labels"`.
Example:
```python
>>> from transformers import AutoTokenizer
>>> tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-0.5B")
>>> features = {
... "prompt": "Which number is larger, 9.8 or 9.11?",
... "completions": ["11 is greater than 8.", "Hence, 9.11 > 9.8."],
... "labels": [True, False],
... }
>>> PRMTrainer.tokenize_row(
... features, tokenizer, "\n", max_completion_length=None, train_on_last_step_only=False, is_eval=False
... )
{'input_ids': [23085, 1372, 374, 8131, 11, 220, 24, 13, 23, 476, 220, 24, 13, 16, 16, 30, 16, 16, 374, 7046, 1091, 220, 23, 13, 198, 39, 763, 11, 220, 24, 13, 16, 16, 861, 220, 24, 13, 23, 13, 198],
'labels': [-100, -100, -100, -100, -100, -100, -100, -100, 1, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, -100, 0]}
```
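The labelling rule described in the docstring can be sketched without a real tokenizer. This is an illustration under stated assumptions, not the trainer's implementation: `toy_encode` (one id per character) and `build_step_labels` are hypothetical names, and BOS handling and truncation are left out. Each step is followed by the separator, only the last token of each step (the separator) carries the step's label, and every other position is `-100`.

```python
from typing import Callable, Dict, List

IGNORE_INDEX = -100  # label value that the loss ignores


def toy_encode(text: str) -> List[int]:
    """Hypothetical stand-in for a tokenizer: one id per character."""
    return [ord(ch) for ch in text]


def build_step_labels(
    prompt: str,
    completions: List[str],
    step_labels: List[bool],
    encode: Callable[[str], List[int]] = toy_encode,
    step_separator: str = "\n",
    train_on_last_step_only: bool = False,
    is_eval: bool = False,
) -> Dict[str, List[int]]:
    prompt_ids = encode(prompt)
    separator_ids = encode(step_separator)

    # During training with train_on_last_step_only, only the final step is labelled.
    if train_on_last_step_only and not is_eval:
        targets = [IGNORE_INDEX] * (len(step_labels) - 1) + [int(step_labels[-1])]
    else:
        targets = [int(label) for label in step_labels]

    input_ids = list(prompt_ids)
    labels = [IGNORE_INDEX] * len(prompt_ids)  # prompt is never a target
    for completion, target in zip(completions, targets):
        step_ids = encode(completion) + separator_ids
        input_ids += step_ids
        # Only the last token of the step (the separator) carries the label.
        labels += [IGNORE_INDEX] * (len(step_ids) - 1) + [target]
    return {"input_ids": input_ids, "labels": labels}


row = build_step_labels("ab", ["c", "de"], [True, False])
print(row["labels"])
```

With this construction `input_ids` and `labels` always have the same length, which is what a token-classification collator expects.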
ÚpromptF©Úadd_special_tokensr+có6g|]}|d¬¦«dŒS)FrKr+r)Ú.0Ú
completionr,s €rTú
<listcomp>z3_UnslothPRMTrainer.tokenize_row.<locals>.<listcomp>1s:ø€ð
ð
ð
ØMWˆIˆIj°UÐ ;¸ 
ð
ð
rVÚ completionséœÿÿÿr/r>r9có,g|]}t|¦«ŒSr)r)rNÚlabels rTrPz3_UnslothPRMTrainer.tokenize_row.<locals>.<listcomp>7sÐA U•c˜%j”jÐArVcóg|]}|zŒSrr)rNrOÚ
separator_idss €rTrPz3_UnslothPRMTrainer.tokenize_row.<locals>.<listcomp>;sø€ÐX¸*˜:¨
ÑXrVcóHg|]\}}dgt|¦«dz
z|gzŒ S)rRr>)Úlen)rNrOrTs rTrPz3_UnslothPRMTrainer.tokenize_row.<locals>.<listcomp>>s6ÐqÑ?P¸zÈ54&C 
™OœO¨aÑ0°E°7ÑqrVN)r+r/)rXrÚencoderBr8rÚ bos_token_id)r r,rrÿrr-Ú
prompt_idsÚcompletions_idsr/Úcompletion_idsr+rVs ` @rTr?z_UnslothPRMTrainer.tokenize_rowøøø€ðpY˜x¨Ô1ÀeÐLÈ[Ô
ð
ð
ð
ð
Ø[cÐdqÔ[rð
ñ
ô
ˆð  B¨7ð BØVs 8¨HÔ#5Ñ6¸Ñ;½sÀ8ÈHÔCUÐVXÔCYÑ?ZÔ?ZÐ>[Ñ[ˆFˆA¨h°xÔ.@ÐAˆ"×ÈEÐ
ØÐXˆðrÐqÕTWÐXgÐioÑTpÔTpÐqˆõe _ÐÝ•e˜Và Ô 1°JÑ>ˆ Ð #Ð%6Ð$6Ð$7Ð$7Ô8ˆ Ð +Ð,BÐ-BÐ,BÔCˆÐ3ˆFà Ñ/ˆ Ø#˜j™/œ/Ñ)¨FÑ2ˆà Ð ! + : +Ô.ˆ˜K˜Z˜(ˆ&°&Ð9rVcó|jjt|jj¦«j}n%|jj d¦«d}| |¬¦«t¦« ||¦«dS)/r9)Ú
model_name) rrr}ÚnameÚsplitÚcreate_model_cardr Ú_save_checkpoint)r rÚtrialr`r
s €rTrdz#_UnslothPRMTrainer._save_checkpointXsyø€Ø Œ9Ô ˜dœiÔ8ˆJˆJàœÔ5°cÑ:¸2Ô>ˆJØ ×Ò¨*ÐÑ
Œ× Ò  ¨Ñ.rVr`Ú dataset_nameÚtagsc ó | ¦«sdSt|jjd¦«r@tj |jjj¦«s|jjj}nd}|t¦«}n(t|t¦«r|h}nt|¦«}t|jjd¦«r|  d¦«|  |j
¦«tjd¦«}t!|||j||t%¦«rt&jt&jjndd|d¬¦ « }| tj |jjd ¦«¦«dS)

Creates a draft of a model card using the information available to the `Trainer`.
Args:
model_name (`str` or `None`, *optional*, defaults to `None`):