Commit 3e6ccac

Merge branch 'main' into chat_template

Signed-off-by: minmingzhu <[email protected]>
minmingzhu authored Apr 10, 2024
2 parents: 3cb18dd + 9182907
Showing 5 changed files with 9 additions and 2 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/night_build_memo.txt
@@ -1 +1 @@
-finetune: gpt2, bigscience/bloom-560m, facebook/opt-125m, mosaicml/mpt-7b-chat, huggyllama/llama-7b
+finetune: gpt2, bigscience/bloom-560m, facebook/opt-125m, mosaicml/mpt-7b, huggyllama/llama-7b
1 change: 1 addition & 0 deletions docs/finetune_parameters.md
@@ -7,6 +7,7 @@ The following are the parameters supported in the finetuning workflow.
|Configuration Name| Default|Meaning|
|-|-|-|
|base_model| EleutherAI/gpt-j-6b|Path to pretrained model or model identifier from huggingface.co/models|
+|tokenizer_name|None|Path to pretrained tokenizer from huggingface.co/models. If not provided, the tokenizer will be loaded from the `base_model`.|
|gpt_base_model|True|This parameter is for [Transformers#22482](https://github.com/huggingface/transformers/issues/22482). It needs to be set to True when the pretrained model is related to gpt, otherwise it is False.|
|output_dir|/tmp/llm-ray/output|The output directory to store the finetuned model|
|checkpoint_dir|/tmp/llm-ray/checkpoint|The directory to store checkpoint|
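As an aside (not part of the commit), here is a minimal sketch of how the new option surfaces in a finetuning config, written as the Python dict that `train_func` receives; only `base_model` and `tokenizer_name` come from this change, and the surrounding structure is assumed from the diff context.

```python
# Minimal sketch of the "General" section of a finetuning config, expressed as
# the Python dict that train_func(config) consumes. Keys other than base_model
# and tokenizer_name are omitted for brevity.
config = {
    "General": {
        "base_model": "mosaicml/mpt-7b",
        # New optional key: when omitted or None, the tokenizer is loaded from
        # base_model instead (the documented default above).
        "tokenizer_name": "EleutherAI/gpt-neox-20b",
    },
}
```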
6 changes: 5 additions & 1 deletion llm_on_ray/finetune/finetune.py
@@ -155,6 +155,10 @@ def train_func(config: Dict[str, Any]):

gradient_accumulation_steps = config["Training"].get("gradient_accumulation_steps", 1)
base_model = config["General"]["base_model"]
if config["General"].get("tokenizer_name") is not None:
tokenizer_name = config["General"].get("tokenizer_name")
else:
tokenizer_name = base_model
dataset_file = config["Dataset"]["train_file"]

seed = config["Training"].get("seed")
@@ -171,7 +175,7 @@ def train_func(config: Dict[str, Any]):

tokenizer = common.tokenizer.Tokenizer.registory.get("HuggingFaceTokenizer")()(
config={
"name": base_model,
"name": tokenizer_name,
"config": config["General"]["config"],
}
)
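The four added lines implement a plain fallback: prefer `General.tokenizer_name` when it is set, otherwise reuse `base_model`. Below is a minimal standalone sketch of that behavior; the helper name `resolve_tokenizer_name` is hypothetical and not part of the repository.

```python
from typing import Any, Dict

def resolve_tokenizer_name(config: Dict[str, Any]) -> str:
    """Mirror the fallback added in finetune.py: use General.tokenizer_name
    when it is set, otherwise fall back to General.base_model."""
    general = config["General"]
    tokenizer_name = general.get("tokenizer_name")
    return tokenizer_name if tokenizer_name is not None else general["base_model"]

# With tokenizer_name set, it wins over base_model ...
cfg = {"General": {"base_model": "mosaicml/mpt-7b",
                   "tokenizer_name": "EleutherAI/gpt-neox-20b"}}
assert resolve_tokenizer_name(cfg) == "EleutherAI/gpt-neox-20b"

# ... and without it, the tokenizer is loaded from the base model as before.
assert resolve_tokenizer_name({"General": {"base_model": "gpt2"}}) == "gpt2"
```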
1 change: 1 addition & 0 deletions llm_on_ray/finetune/finetune_config.py
@@ -51,6 +51,7 @@ class DeltatunerConfig(BaseModel):

class General(BaseModel):
base_model: str
+    tokenizer_name: Optional[str] = None
gpt_base_model: bool
output_dir: str
checkpoint_dir: Optional[str]
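Because the field is declared as `Optional[str] = None`, existing configs that never mention `tokenizer_name` keep validating. A trimmed sketch (only the two fields relevant to this commit, not the project's full `General` model), assuming pydantic is installed:

```python
from typing import Optional
from pydantic import BaseModel

class General(BaseModel):
    # Trimmed to the two fields relevant to this commit.
    base_model: str
    tokenizer_name: Optional[str] = None

# Omitting tokenizer_name still validates and yields None ...
print(General(base_model="gpt2").tokenizer_name)  # -> None
# ... and setting it overrides which tokenizer the finetuning run loads.
print(General(base_model="mosaicml/mpt-7b",
              tokenizer_name="EleutherAI/gpt-neox-20b").tokenizer_name)
```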
1 change: 1 addition & 0 deletions llm_on_ray/finetune/models/mpt-7b.yaml
@@ -1,5 +1,6 @@
General:
base_model: mosaicml/mpt-7b
+  tokenizer_name: EleutherAI/gpt-neox-20b
gpt_base_model: false
output_dir: /tmp/llm-ray/output
checkpoint_dir: /tmp/llm-ray/checkpoint
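MPT-7B was trained with the EleutherAI/gpt-neox-20b tokenizer (per MosaicML's model card), so the YAML now names it explicitly rather than relying on `base_model`. A hedged sketch of what the `HuggingFaceTokenizer` wrapper presumably boils down to; the direct `AutoTokenizer` call is an assumption for illustration, not code from this repository:

```python
from transformers import AutoTokenizer

# Load the tokenizer from the explicit tokenizer_name rather than from
# base_model (mosaicml/mpt-7b), matching the updated mpt-7b.yaml above.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
print(tokenizer("Hello from the finetuning workflow")["input_ids"][:5])
```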
