
My personal notes about interesting findings, tricks, solutions. Mostly for research engineering problems


huangxt39/notes


My personal notes

transformerlens_bwd_hook.py: experiments on the question: when a forward hook modifies a module's activation (say A -> A'), does the backward hook capture the gradient from before or after the forward modification (i.e., the grad for A or for A')? The answer: the backward hook captures the grad for the latest/modified activation A'.
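The distinction can be sketched in plain PyTorch (not TransformerLens itself) with tensor hooks, which always receive the gradient for the exact tensor they are registered on. Here `A_mod` stands in for the forward-hook modification; the names and the `* 2` edit are illustrative only:

```python
import torch

# Toy forward pass: A is the original activation, A_mod = 2 * A stands in
# for a forward-hook modification A -> A'.
grads = {}

x = torch.ones(3, requires_grad=True)
A = x * 3                       # original activation A
A.register_hook(lambda g: grads.__setitem__("A", g.clone()))

A_mod = 2 * A                   # "modified" activation A'
A_mod.register_hook(lambda g: grads.__setitem__("A_prime", g.clone()))

A_mod.sum().backward()

print(grads["A_prime"])         # tensor([1., 1., 1.])  grad for A'
print(grads["A"])               # tensor([2., 2., 2.])  grad for A (chain rule)
```

The grad for A' (all ones) differs from the grad for A (all twos, via the chain rule through the modification), so the two cases are distinguishable; the finding above is that TransformerLens's backward hook reports the former.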

gpt2_tokenizer_nonASCII_char.py: deals with non-ASCII characters when using the GPT-2 tokenizer's convert_ids_to_tokens(). Sometimes we need to keep the tokenized structure (so we cannot use tokenizer.decode()) in order to associate each token with a value, but non-ASCII characters become unrecognizable if one simply converts ids into tokens. This file shows a workaround.
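The mangling comes from GPT-2's byte-level BPE: every raw byte is mapped to a printable unicode character, so a multi-byte UTF-8 character such as ’ appears as âĢĻ in the token strings. One workaround (a stdlib-only sketch that reimplements the tokenizer's byte map, not necessarily the exact code in the file; with transformers loaded, calling tokenizer.convert_tokens_to_string([t]) per token achieves the same) is to invert that map:

```python
# Invert GPT-2's byte-to-unicode map so each token becomes readable while
# the one-string-per-token structure is preserved.
def bytes_to_unicode():
    # Mirrors the mapping used by the GPT-2 byte-level BPE tokenizer:
    # printable bytes map to themselves, the rest to chars from 256 up.
    bs = (list(range(ord("!"), ord("~") + 1))
          + list(range(ord("¡"), ord("¬") + 1))
          + list(range(ord("®"), ord("ÿ") + 1)))
    cs = bs[:]
    n = 0
    for b in range(256):
        if b not in bs:
            bs.append(b)
            cs.append(256 + n)
            n += 1
    return dict(zip(bs, map(chr, cs)))

BYTE_DECODER = {c: b for b, c in bytes_to_unicode().items()}

def readable_token(token: str) -> str:
    """Map a byte-level BPE token back to the text it represents."""
    return bytes(BYTE_DECODER[c] for c in token).decode("utf-8", errors="replace")

# Tokens as convert_ids_to_tokens() would return them for "I’m here"
# (illustrative; the exact split depends on the tokenizer):
tokens = ["I", "âĢĻ", "m", "Ġhere"]
print([readable_token(t) for t in tokens])   # ['I', '’', 'm', ' here']
```

Note the leading Ġ, which encodes a space (byte 0x20), decodes cleanly as well, so token boundaries stay visible in the readable strings.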

cosine_similarity: It turns out that PyTorch's cosine_similarity() is not very efficient for computing pairwise similarity. Simply writing it another way can make it much faster! In the project where I found this trick, it became about 10 times faster.
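The standard rewrite (a sketch, not necessarily the exact code in the file): L2-normalize the rows once, then a single matmul yields the entire pairwise similarity matrix, instead of broadcasting cosine_similarity over every pair:

```python
import torch
import torch.nn.functional as F

def pairwise_cosine(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    # Normalize each row to unit length, then one matmul gives the full
    # (n, m) similarity matrix in a single fused operation.
    return F.normalize(x, dim=-1) @ F.normalize(y, dim=-1).T

x = torch.randn(512, 64)
y = torch.randn(256, 64)

fast = pairwise_cosine(x, y)
# Reference: broadcasting cosine_similarity over all pairs, which works
# on a (512, 256, 64)-shaped broadcast and is far slower at scale.
slow = F.cosine_similarity(x.unsqueeze(1), y.unsqueeze(0), dim=-1)
assert torch.allclose(fast, slow, atol=1e-5)
print(fast.shape)   # torch.Size([512, 256])
```

The matmul route also composes well with half precision and GPU batching, since it is a single dense kernel rather than an elementwise reduction over a broadcast.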
