diff --git a/RELEASE.md b/RELEASE.md
index 3f33ebfb..0c449b4c 100644
--- a/RELEASE.md
+++ b/RELEASE.md
@@ -86,5 +86,5 @@ Simple checklist on how to make releases for `safetensors`.
 If you want to make modifications to the CI/CD of the release GH actions, you need
 to :
 - **Comment the part that uploads the artifacts** to `crates.io`, `PyPi` or `npm`.
-- Change the trigger mecanism so it can trigger every time you push to your branch.
+- Change the trigger mechanism so it triggers every time you push to your branch.
 - Keep pushing your changes until the artifacts are properly created.
diff --git a/docs/source/speed.mdx b/docs/source/speed.mdx
index 14a3af68..c748a891 100644
--- a/docs/source/speed.mdx
+++ b/docs/source/speed.mdx
@@ -84,9 +84,9 @@ Loaded pytorch 0:00:00.353889
 on GPU, safetensors is faster than pytorch by: 2.1 X
 ```
 
-The speedup works because this library is able to skip unecessary CPU allocations. It is unfortunately not replicable in pure pytorch as far as we know. The library works by memory mapping the file, creating the tensor empty with pytorch and calling `cudaMemcpy` directly to move the tensor directly on the GPU.
+The speedup works because this library is able to skip unnecessary CPU allocations. It is unfortunately not replicable in pure pytorch as far as we know. The library works by memory mapping the file, creating an empty tensor with pytorch, and calling `cudaMemcpy` directly to move the tensor onto the GPU.
 The currently shown speedup was gotten on:
 * OS: Ubuntu 18.04.6 LTS.
 * GPU: Tesla T4
 * Driver Version: 460.32.03
-* CUDA Version: 11.2
\ No newline at end of file
+* CUDA Version: 11.2
diff --git a/docs/source/torch_shared_tensors.mdx b/docs/source/torch_shared_tensors.mdx
index a4901c04..2b5208fa 100644
--- a/docs/source/torch_shared_tensors.mdx
+++ b/docs/source/torch_shared_tensors.mdx
@@ -56,7 +56,7 @@ Multiple reasons for that:
   So if someone saves shared tensors in torch, there is no way to
   load them in a similar fashion so we could not keep the same `Dict[str, Tensor]`
   API.
-- *It makes lazy loading very quircky.*
+- *It makes lazy loading very quirky.*
   Lazy loading
   is the ability to load only some tensors, or part of tensors for a given file.
   This is trivial to do without sharing tensors but with tensor sharing
@@ -80,11 +80,11 @@ Multiple reasons for that:
   a = torch.zeros((100, 100))
   b = a[:1, :]
   torch.save({"b": b}, "model.bin")
-  # File is 41k instead of the epected 400 bytes
+  # File is 41k instead of the expected 400 bytes
   # In practice it could happen that you save several 10GB instead of 1GB.
   ```
 
-Now with all those reasons being mentionned, nothing is set in stone in there.
+Now, with all those reasons mentioned, nothing here is set in stone.
 Shared tensors do not cause unsafety, or denial of service potential, so this
 decision could be revisited if current workarounds are not satisfactory.
 
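To illustrate the loading scheme described in the `docs/source/speed.mdx` paragraph above (memory-map the file, create an empty tensor, copy straight to the GPU), here is a minimal Python sketch. It is not the library's actual implementation, which does this in Rust and calls `cudaMemcpy` directly; the `load_tensor_on_gpu` helper and its `path`, `offset`, and `shape` parameters are hypothetical stand-ins for information normally read from the file header.

```python
import mmap

import numpy as np
import torch


def load_tensor_on_gpu(path, offset, shape, dtype=np.float32):
    """Hypothetical helper: offset, shape and dtype would come from the file header."""
    with open(path, "rb") as f:
        # Memory-map the whole file; nothing is copied into RAM yet.
        buf = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    count = int(np.prod(shape))
    # Zero-copy view over the mapped bytes (read-only, so torch emits a warning).
    cpu_view = torch.from_numpy(
        np.frombuffer(buf, dtype=dtype, count=count, offset=offset).reshape(shape)
    )
    # Allocate the destination directly on the GPU and copy into it, skipping
    # any intermediate CPU-side tensor allocation.
    gpu_tensor = torch.empty(shape, dtype=cpu_view.dtype, device="cuda")
    gpu_tensor.copy_(cpu_view)
    return gpu_tensor
```

The host-to-device copy reads from the memory-mapped pages directly, which is what removes the extra CPU allocation the paragraph mentions.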