adds file hashing interface
shikokuchuo committed Jan 24, 2024
1 parent 5cc6c63 commit 9d08d8b
Showing 10 changed files with 186 additions and 84 deletions.
1 change: 1 addition & 0 deletions NAMESPACE
@@ -1,4 +1,5 @@
# Generated by roxygen2: do not edit by hand

export(sha3)
export(sha3sum)
useDynLib(secretbase, .registration = TRUE)
2 changes: 2 additions & 0 deletions NEWS.md
@@ -1,5 +1,7 @@
# secretbase 0.1.0.9000 (development)

* Adds file hashing interface.

# secretbase 0.1.0

* Initial CRAN release.
46 changes: 29 additions & 17 deletions R/base.R
@@ -42,31 +42,27 @@

#' Cryptographic Hashing Using the SHA-3 Algorithm
#'
#' Returns a SHA-3 hash of the supplied R object.
#' Returns a SHA-3 hash of the supplied R object or file.
#'
#' @param x an object.
#' @param x an object. Character strings and raw vectors (with no attributes)
#' are hashed 'as is'. All other objects are hashed in-place using R
#' serialization but without allocation of the serialized object
#' (memory-efficient). For portability, serialization v3 XDR is always used
#' with headers skipped (as these contain R version and encoding
#' information).
#' @param bits [default 256L] output size of the returned hash. If one of 224,
#' 256, 384 or 512, uses the relevant SHA-3 cryptographic hash function. For
#' other values, uses the SHAKE256 extendable-output function (XOF). The
#' supplied value must be between 8 and 2^24, and is coerced to integer.
#' all other values, uses the SHAKE256 extendable-output function (XOF).
#' Must be between 8 and 2^24 and coercible to integer.
#' @param convert [default TRUE] if TRUE, the hash is converted to its hex
#' representation as a character string, if FALSE, output directly as a raw
#' vector, or if NA, a vector of (32-bit) integer values.
#'
#' @return A character string, raw or integer vector depending on 'convert'.
#'
#' @details A character string or raw vector (with no attributes) is hashed 'as
#' is'.
#'
#' All other objects are hashed in-place, in a 'streaming' fashion, by R
#' serialization but without allocation of the serialized object. To ensure
#' portability, R serialization version 3, big-endian representation is
#' always used, skipping the headers (as these contain the R version number
#' and native encoding information).
#'
#' The result of hashing is always a byte sequence, which is converted to a
#' character string hex representation if 'convert' is TRUE, or returned as
#' a raw vector if 'convert' is FALSE.
#' @details The result of hashing is always a byte sequence, which is converted
#' to a character string hex representation if 'convert' is TRUE, or
#' returned as a raw vector if 'convert' is FALSE.
#'
#' To hash to integer values, set convert to NA. For a single integer value
#' set 'bits' to 32. These values may be supplied as random seeds for R's
@@ -93,4 +89,20 @@
#'
#' @export
#'
sha3 <- function(x, bits = 256L, convert = TRUE) .Call(secretbase_sha3, x, bits, convert)
sha3 <- function(x, bits = 256L, convert = TRUE)
.Call(secretbase_sha3, x, bits, convert)

#' @param file character file name / path. The file is read in a streaming
#' fashion and does not need to fit in memory.
#'
#' @examples
#' # SHA3-256 hash a file:
#' file <- tempfile(); cat("secret base", file = file)
#' sha3sum(file)
#' unlink(file)
#'
#' @rdname sha3
#' @export
#'
sha3sum <- function(file, bits = 256L, convert = TRUE)
.Call(secretbase_sha3_file, file, bits, convert)
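
Review note: the documented `bits` and `convert` arguments can be exercised with a short sketch like the one below. It assumes the development version of `secretbase` from this commit is installed and uses only the argument values described in the roxygen text above; the variable names are illustrative.

```r
library(secretbase)

x <- "secret base"

# Default: SHA3-256, returned as a hex character string.
hex <- sha3(x)

# convert = FALSE returns the digest as a raw vector instead.
raw_digest <- sha3(x, convert = FALSE)

# bits = 32 is not one of 224/256/384/512, so per the docs SHAKE256 is used;
# with convert = NA the result is a single integer (e.g. for a random seed).
seed <- sha3(x, bits = 32, convert = NA)

nchar(hex)          # 64 hex digits for a 256-bit digest
length(raw_digest)  # 32 bytes
length(seed)        # 1
```

The hex string and the raw vector encode the same bytes, so `paste(as.character(raw_digest), collapse = "")` should reproduce `hex`.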
23 changes: 18 additions & 5 deletions README.Rmd
@@ -45,7 +45,7 @@ install.packages("secretbase", repos = "https://shikokuchuo.r-universe.dev")

### Quick Start

`secretbase` offers one main function: `sha3()`
`secretbase` offers the functions: `sha3()` for objects and `sha3sum()` for files.

To use:

@@ -62,25 +62,38 @@ sha3("secret base", convert = FALSE)
sha3("秘密の基地の中", bits = 224)
sha3("", bits = 512)
```

Hash arbitrary R objects:

- done in-place, in a 'streaming' fashion, by R serialization but without allocation of the serialized object
- ensures portability by always using R serialization version 3, big endian representation, skipping the headers (which contain R version and native encoding information)
- done in-place using R serialization but without allocation of the serialized object (memory-efficient)
- ensures portability by always using serialization v3 XDR, skipping the headers (which contain R version and encoding information)

```{r streaming}
sha3(data.frame(a = 1, b = 2), bits = 160)
sha3(NULL)
```

To hash to integer:
Hash files:

- files are read in a streaming fashion and do not need to fit in memory.

```{r files}
file <- tempfile(); cat("secret base", file = file)
sha3sum(file)
```
```{r unlink, echo=FALSE}
unlink(file)
```

Hash to integer:

- specify 'convert' as `NA`
- specify 'bits' as `32` for a single integer value

```{r readinteger}
```{r integer}
sha3("秘密の基地の中", convert = NA)
sha3("秘密の基地の中", bits = 32, convert = NA)
25 changes: 18 additions & 7 deletions README.md
@@ -40,7 +40,8 @@ install.packages("secretbase", repos = "https://shikokuchuo.r-universe.dev")

### Quick Start

`secretbase` offers one main function: `sha3()`
`secretbase` offers the functions: `sha3()` for objects and `sha3sum()`
for files.

To use:

@@ -68,11 +69,10 @@ sha3("", bits = 512)

Hash arbitrary R objects:

- done in-place, in a ‘streaming’ fashion, by R serialization but
without allocation of the serialized object
- ensures portability by always using R serialization version 3, big
endian representation, skipping the headers (which contain R version
and native encoding information)
- done in-place using R serialization but without allocation of the
serialized object (memory-efficient)
- ensures portability by always using serialization v3 XDR, skipping the
headers (which contain R version and encoding information)

``` r
sha3(data.frame(a = 1, b = 2), bits = 160)
@@ -82,7 +82,18 @@ sha3(NULL)
#> [1] "b3e37e4c5def1bfb2841b79ef8503b83d1fed46836b5b913d7c16de92966dcee"
```

To hash to integer:
Hash files:

- files are read in a streaming fashion and do not need to fit in
memory.

``` r
file <- tempfile(); cat("secret base", file = file)
sha3sum(file)
#> [1] "a721d57570e7ce366adee2fccbe9770723c6e3622549c31c7cab9dbb4a795520"
```

Hash to integer:

- specify ‘convert’ as `NA`
- specify ‘bits’ as `32` for a single integer value
39 changes: 23 additions & 16 deletions man/sha3.Rd

Generated file; diff not rendered.

1 change: 1 addition & 0 deletions src/init.c
@@ -20,6 +20,7 @@

static const R_CallMethodDef callMethods[] = {
{"secretbase_sha3", (DL_FUNC) &secretbase_sha3, 3},
{"secretbase_sha3_file", (DL_FUNC) &secretbase_sha3_file, 3},
{NULL, NULL, 0}
};
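
Review note: since character strings are documented as hashed 'as is' and `sha3sum()` streams the file's bytes, hashing a file and hashing a string with the same content would be expected to agree. The sketch below is one way to check that, assuming the development version from this commit is installed and that `cat()` writes the string with no trailing newline; `path`, `file_hash` and `string_hash` are illustrative names.

```r
library(secretbase)

# Write the exact bytes of the string to a temporary file (no newline added).
path <- tempfile()
cat("secret base", file = path)

file_hash   <- sha3sum(path)        # streams the file from disk
string_hash <- sha3("secret base")  # hashes the string 'as is'

# Under the assumptions above, both digests should be the same.
identical(file_hash, string_hash)

unlink(path)
```

A check of this kind may be a useful regression test for the new `secretbase_sha3_file` entry point registered in `src/init.c`.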
