Handle char level tokenization #11

brianhie · 2024-02-18T16:24:19Z

Add a character-level tokenizer and modify the generation code to handle batched inference with it.

Zymrael · 2024-02-18T17:36:08Z

src/generation.py

@@ -76,6 +86,7 @@ def generate(
            mem_after_tok = torch.cuda.memory_allocated(device=x.device) / 1e9
            print_rank_0(f"Memory after tokenization: {mem_after_tok} GB")
            print_rank_0("Starting generation...")
+            torch.cuda.memory._record_memory_history(enabled=True)


This shouldn't be on by default
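
For reference, the PR resolved this by deleting the call (commit 171fb71, "delete record_memory"). If the recording were ever wanted again, one option is to gate it behind an opt-in flag. The sketch below is only an illustration under assumptions: `maybe_record_memory_history` and its `enabled` parameter are hypothetical names that do not appear in this repository, and `torch.cuda.memory._record_memory_history` is a private PyTorch API whose signature may differ between versions.

```python
def maybe_record_memory_history(enabled: bool = False) -> bool:
    """Start CUDA memory-history recording only when explicitly requested.

    Hypothetical helper: returns True if recording was started, False otherwise.
    """
    if not enabled:
        # Default path: no recording and no profiling overhead.
        return False

    # Imported lazily so the default path does not depend on torch.
    import torch

    if not torch.cuda.is_available():
        return False

    # Private API, mirrored from the diff above; may change between releases.
    torch.cuda.memory._record_memory_history(enabled=True)
    return True
```

With this shape, a caller would have to pass `enabled=True` (for example from a debug option) before any memory history is collected.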

Zymrael · 2024-02-18T17:36:40Z

src/generation.py

-                generation[:, : i + 1],
-                skip_special_tokens=skip_special_tokens,
-            )
+            if isinstance(self.tokenizer, CharLevelTokenizer):


Maybe reformat this slightly so the arguments passed depend on the tokenizer class.

Reformatted
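
A rough sketch of what "different args depending on the Tokenizer class" could look like follows. It is not the code merged in this PR. The two tokenizer classes are simplified stand-ins written for illustration, and `detokenize_batch`, `detokenize_kwargs`, `decode`, and `SubwordTokenizer` are assumed names; only `CharLevelTokenizer` and `skip_special_tokens` appear in the diff.

```python
from typing import Any, Dict, List


class CharLevelTokenizer:
    """Stand-in: maps each token id to one character; takes no extra options."""

    def detokenize_batch(self, batch: List[List[int]]) -> List[str]:
        return ["".join(chr(t) for t in row) for row in batch]


class SubwordTokenizer:
    """Stand-in for a tokenizer whose decoder accepts skip_special_tokens."""

    vocab = {0: "<eos>", 1: "hello", 2: " world"}

    def detokenize_batch(
        self, batch: List[List[int]], skip_special_tokens: bool = False
    ) -> List[str]:
        out = []
        for row in batch:
            # Token id 0 is the only special token in this toy vocabulary.
            pieces = [
                self.vocab[t] for t in row if not (skip_special_tokens and t == 0)
            ]
            out.append("".join(pieces))
        return out


def detokenize_kwargs(tokenizer: Any, skip_special_tokens: bool) -> Dict[str, Any]:
    """Choose decoder keyword arguments based on the tokenizer class."""
    if isinstance(tokenizer, CharLevelTokenizer):
        return {}
    return {"skip_special_tokens": skip_special_tokens}


def decode(
    tokenizer: Any, batch: List[List[int]], skip_special_tokens: bool = True
) -> List[str]:
    """Single call site; only the keyword arguments vary per tokenizer."""
    kwargs = detokenize_kwargs(tokenizer, skip_special_tokens)
    return tokenizer.detokenize_batch(batch, **kwargs)
```

Keeping the class check in one small helper leaves a single decode call in the generation loop, so supporting another tokenizer would only touch `detokenize_kwargs`.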

Zymrael · 2024-02-18T17:37:17Z

Also run black and isort!

brianhie · 2024-02-18T21:39:38Z

> Also run black and isort!

Done 😄

brianhie added 2 commits February 18, 2024 16:20

add char level tokenizer, modify generation code to handle batched inference with char tok

4a1d35f

remove some comments

f6a49aa

Zymrael reviewed Feb 18, 2024

brianhie added 2 commits February 18, 2024 21:19

respond to review, delete record_memory, cleanup args

171fb71

black and isort

047503a

Zymrael approved these changes Feb 18, 2024

Zymrael merged commit 37caf4d into togethercomputer:main Feb 18, 2024
3 checks passed
