Commit

Merge pull request #27 from MartinuzziFrancesco/fm/return
Change cells' return to `out, state`
MartinuzziFrancesco authored Dec 15, 2024
2 parents 1fa7bdf + f9596d9 commit 8282e46
Showing 15 changed files with 378 additions and 68 deletions.
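
In practice the breaking change looks like this. A minimal sketch of the new convention, assuming RecurrentLayers ≥ 0.2.0; the cell choice and sizes are illustrative:

```julia
using RecurrentLayers

cell = FastRNNCell(2 => 5)
inp = rand(Float32, 2)

# v0.1.x returned only the updated state; v0.2.0 returns the `(out, state)`
# tuple expected by `Flux.scan` and the Flux 0.16 cell interface.
out, state = cell(inp)
out === state  # true: for these cells the output is the updated state
```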
4 changes: 2 additions & 2 deletions Project.toml
@@ -1,13 +1,13 @@
name = "RecurrentLayers"
uuid = "78449bcf-6750-4b78-9e82-63d4a1ccdf8c"
authors = ["Francesco Martinuzzi"]
version = "0.1.5"
version = "0.2.0"

[deps]
Flux = "587475ba-b771-5e3f-ad9e-33799f191a9c"

[compat]
Flux = "0.14, 0.15"
Flux = "0.16"
julia = "1.10"

[extras]
2 changes: 1 addition & 1 deletion docs/pages.jl
@@ -2,7 +2,7 @@ pages=[
"Home" => "index.md",
"API Documentation" => [
"Cells" => "api/cells.md",
"Cell Wrappers" => "api/wrappers.md",
"Layers" => "api/layers.md",
],
"Roadmap" => "roadmap.md"
]
File renamed without changes.
1 change: 1 addition & 0 deletions src/RecurrentLayers.jl
@@ -3,6 +3,7 @@ module RecurrentLayers
using Flux
import Flux: _size_check, _match_eltype, chunk, create_bias, zeros_like
import Flux: glorot_uniform
#TODO add interlinks to initialstates in docstrings https://juliadocs.org/DocumenterInterLinks.jl/stable/
import Flux: initialstates, scan

abstract type AbstractRecurrentCell end
60 changes: 54 additions & 6 deletions src/fastrnn_cell.jl
@@ -37,7 +37,19 @@ h_t &= \alpha \tilde{h}_t + \beta h_{t-1}
# Forward
-    fastrnncell(inp, [state])
+    fastrnncell(inp, state)
+    fastrnncell(inp)
+
+## Arguments
+- `inp`: The input to the fastrnncell. It should be a vector of size `input_size`
+  or a matrix of size `input_size x batch_size`.
+- `state`: The hidden state of the FastRNNCell. It should be a vector of size
+  `hidden_size` or a matrix of size `hidden_size x batch_size`.
+  If not provided, it is assumed to be a vector of zeros.
+
+## Returns
+- A tuple `(output, state)`, where both elements are the updated state `new_state`,
+  a tensor of size `hidden_size` or `hidden_size x batch_size`.
"""
function FastRNNCell((input_size, hidden_size)::Pair, activation=tanh_fast;
init_kernel = glorot_uniform,
@@ -65,7 +77,7 @@ function (fastrnn::FastRNNCell)(inp::AbstractVecOrMat, state)
candidate_state = fastrnn.activation.(Wi * inp .+ Wh * state .+ b)
new_state = alpha .* candidate_state .+ beta .* state

-return new_state
+return new_state, new_state
end

Base.show(io::IO, fastrnn::FastRNNCell) =
@@ -102,7 +114,18 @@ h_t &= \alpha \tilde{h}_t + \beta h_{t-1}
# Forward
-    fastrnn(inp, [state])
+    fastrnn(inp, state)
+    fastrnn(inp)
+
+## Arguments
+- `inp`: The input to the fastrnn. It should be a matrix of size `input_size x len`
+  or an array of size `input_size x len x batch_size`.
+- `state`: The hidden state of the FastRNN. If given, it is a vector of size
+  `hidden_size` or a matrix of size `hidden_size x batch_size`.
+  If not provided, it is assumed to be a vector of zeros.
+
+## Returns
+- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`.
"""
function FastRNN((input_size, hidden_size)::Pair, activation = tanh_fast;
kwargs...)
@@ -155,7 +178,20 @@ h_t &= \big((\zeta (1 - z_t) + \nu) \odot \tilde{h}_t\big) + z_t \odot h_{t-1}
# Forward
-    fastgrnncell(inp, [state])
+    fastgrnncell(inp, state)
+    fastgrnncell(inp)
+
+## Arguments
+- `inp`: The input to the fastgrnncell. It should be a vector of size `input_size`
+  or a matrix of size `input_size x batch_size`.
+- `state`: The hidden state of the FastGRNNCell. It should be a vector of size
+  `hidden_size` or a matrix of size `hidden_size x batch_size`.
+  If not provided, it is assumed to be a vector of zeros.
+
+## Returns
+- A tuple `(output, state)`, where both elements are the updated state `new_state`,
+  a tensor of size `hidden_size` or `hidden_size x batch_size`.
"""
function FastGRNNCell((input_size, hidden_size)::Pair, activation=tanh_fast;
init_kernel = glorot_uniform,
@@ -187,7 +223,7 @@ function (fastgrnn::FastGRNNCell)(inp::AbstractVecOrMat, state)
candidate_state = tanh_fast.(partial_gate .+ bh)
new_state = (zeta .* (ones(size(gate)) .- gate) .+ nu) .* candidate_state .+ gate .* state

-return new_state
+return new_state, new_state
end

Base.show(io::IO, fastgrnn::FastGRNNCell) =
@@ -225,7 +261,19 @@ h_t &= \big((\zeta (1 - z_t) + \nu) \odot \tilde{h}_t\big) + z_t \odot h_{t-1}
# Forward
-    fastgrnn(inp, [state])
+    fastgrnn(inp, state)
+    fastgrnn(inp)
+
+## Arguments
+- `inp`: The input to the fastgrnn. It should be a matrix of size `input_size x len`
+  or an array of size `input_size x len x batch_size`.
+- `state`: The hidden state of the FastGRNN. If given, it is a vector of size
+  `hidden_size` or a matrix of size `hidden_size x batch_size`.
+  If not provided, it is assumed to be a vector of zeros.
+
+## Returns
+- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`.
"""
function FastGRNN((input_size, hidden_size)::Pair, activation = tanh_fast;
kwargs...)
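
Taken together, a cell handles one time step while its wrapper runs a whole sequence. A hedged sketch of the FastGRNN pair under the new convention; sizes and the `Float32` inputs are illustrative assumptions, not taken from the diff:

```julia
using RecurrentLayers

cell  = FastGRNNCell(4 => 8)    # one time step
layer = FastGRNN(4 => 8)        # whole sequence

x_t = rand(Float32, 4, 3)       # input_size x batch_size
st  = zeros(Float32, 8, 3)      # hidden_size x batch_size
out, st = cell(x_t, st)         # now an `(out, state)` tuple

xs = rand(Float32, 4, 10, 3)    # input_size x len x batch_size
states = layer(xs)              # hidden_size x len x batch_size
```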
31 changes: 28 additions & 3 deletions src/indrnn_cell.jl
@@ -33,8 +33,19 @@ See [`IndRNN`](@ref) for a layer that processes entire sequences.
# Forward
-    rnncell(inp, [state])
+    indrnncell(inp, state)
+    indrnncell(inp)
+
+## Arguments
+- `inp`: The input to the indrnncell. It should be a vector of size `input_size`
+  or a matrix of size `input_size x batch_size`.
+- `state`: The hidden state of the IndRNNCell. It should be a vector of size
+  `hidden_size` or a matrix of size `hidden_size x batch_size`.
+  If not provided, it is assumed to be a vector of zeros.
+
+## Returns
+- A tuple `(output, state)`, where both elements are the updated state `new_state`,
+  a tensor of size `hidden_size` or `hidden_size x batch_size`.
"""
function IndRNNCell((input_size, hidden_size)::Pair, σ=relu;
init_kernel = glorot_uniform,
@@ -50,7 +61,7 @@ function (indrnn::IndRNNCell)(inp::AbstractVecOrMat, state::AbstractVecOrMat)
_size_check(indrnn, inp, 1 => size(indrnn.Wi, 2))
σ = NNlib.fast_act(indrnn.σ, inp)
state = σ.(indrnn.Wi*inp .+ indrnn.Wh .* state .+ indrnn.b)
-return state
+return state, state
end

function Base.show(io::IO, indrnn::IndRNNCell)
@@ -84,6 +95,20 @@ See [`IndRNNCell`](@ref) for a layer that processes a single sequence.
```math
\mathbf{h}_{t} = \sigma(\mathbf{W} \mathbf{x}_t + \mathbf{u} \odot \mathbf{h}_{t-1} + \mathbf{b})
```
+# Forward
+    indrnn(inp, state)
+    indrnn(inp)
+
+## Arguments
+- `inp`: The input to the indrnn. It should be a matrix of size `input_size x len`
+  or an array of size `input_size x len x batch_size`.
+- `state`: The hidden state of the IndRNN. If given, it is a vector of size
+  `hidden_size` or a matrix of size `hidden_size x batch_size`.
+  If not provided, it is assumed to be a vector of zeros.
+
+## Returns
+- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`.
"""
function IndRNN((input_size, hidden_size)::Pair, σ = tanh; kwargs...)
cell = IndRNNCell(input_size => hidden_size, σ; kwargs...)
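
The same pattern applies to IndRNN. A sketch with illustrative sizes, assuming the default activations shown in the constructors above:

```julia
using RecurrentLayers

cell = IndRNNCell(4 => 8)       # default activation relu
x = rand(Float32, 4, 3)         # input_size x batch_size
out, state = cell(x)            # zero initial state assumed

indrnn = IndRNN(4 => 8)         # default activation tanh
xs = rand(Float32, 4, 10, 3)    # input_size x len x batch_size
states = indrnn(xs)             # hidden_size x len x batch_size
```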
31 changes: 29 additions & 2 deletions src/lightru_cell.jl
@@ -34,7 +34,19 @@ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t.
# Forward
-    rnncell(inp, [state])
+    lightrucell(inp, state)
+    lightrucell(inp)
+
+## Arguments
+- `inp`: The input to the lightrucell. It should be a vector of size `input_size`
+  or a matrix of size `input_size x batch_size`.
+- `state`: The hidden state of the LightRUCell. It should be a vector of size
+  `hidden_size` or a matrix of size `hidden_size x batch_size`.
+  If not provided, it is assumed to be a vector of zeros.
+
+## Returns
+- A tuple `(output, state)`, where both elements are the updated state `new_state`,
+  a tensor of size `hidden_size` or `hidden_size x batch_size`.
"""
function LightRUCell((input_size, hidden_size)::Pair;
init_kernel = glorot_uniform,
@@ -58,7 +70,7 @@ function (lightru::LightRUCell)(inp::AbstractVecOrMat, state)
candidate_state = @. tanh_fast(gxs[1])
forget_gate = sigmoid_fast(gxs[2] .+ Wh * state .+ b)
new_state = @. (1 - forget_gate) * state + forget_gate * candidate_state
-return new_state
+return new_state, new_state
end

Base.show(io::IO, lightru::LightRUCell) =
@@ -93,6 +105,21 @@ f_t &= \delta(W_f x_t + U_f h_{t-1} + b_f), \\
h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t.
\end{aligned}
```
+# Forward
+    lightru(inp, state)
+    lightru(inp)
+
+## Arguments
+- `inp`: The input to the lightru. It should be a matrix of size `input_size x len`
+  or an array of size `input_size x len x batch_size`.
+- `state`: The hidden state of the LightRU. If given, it is a vector of size
+  `hidden_size` or a matrix of size `hidden_size x batch_size`.
+  If not provided, it is assumed to be a vector of zeros.
+
+## Returns
+- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`.
"""
function LightRU((input_size, hidden_size)::Pair; kwargs...)
cell = LightRUCell(input_size => hidden_size; kwargs...)
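
A sketch of the unbatched single-step case, which the docstring above also covers; sizes are illustrative:

```julia
using RecurrentLayers

cell = LightRUCell(3 => 6)
x = rand(Float32, 3)            # unbatched input of size input_size

out, state = cell(x)            # zero initial state assumed
out2, state2 = cell(x, state)   # carry the state explicitly
out2 === state2                 # output and updated state coincide
```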
31 changes: 29 additions & 2 deletions src/ligru_cell.jl
@@ -36,7 +36,19 @@ h_t &= z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t
# Forward
-    rnncell(inp, [state])
+    ligrucell(inp, state)
+    ligrucell(inp)
+
+## Arguments
+- `inp`: The input to the ligrucell. It should be a vector of size `input_size`
+  or a matrix of size `input_size x batch_size`.
+- `state`: The hidden state of the LiGRUCell. It should be a vector of size
+  `hidden_size` or a matrix of size `hidden_size x batch_size`.
+  If not provided, it is assumed to be a vector of zeros.
+
+## Returns
+- A tuple `(output, state)`, where both elements are the updated state `new_state`,
+  a tensor of size `hidden_size` or `hidden_size x batch_size`.
"""
function LiGRUCell((input_size, hidden_size)::Pair;
init_kernel = glorot_uniform,
@@ -60,7 +72,7 @@ function (ligru::LiGRUCell)(inp::AbstractVecOrMat, state)
forget_gate = @. sigmoid_fast(gxs[1] + ghs[1])
candidate_hidden = @. tanh_fast(gxs[2] + ghs[2])
new_state = forget_gate .* state .+ (1 .- forget_gate) .* candidate_hidden
-return new_state
+return new_state, new_state
end


@@ -93,6 +105,21 @@ z_t &= \sigma(W_z x_t + U_z h_{t-1}), \\
h_t &= z_t \odot h_{t-1} + (1 - z_t) \odot \tilde{h}_t
\end{aligned}
```
+# Forward
+    ligru(inp, state)
+    ligru(inp)
+
+## Arguments
+- `inp`: The input to the ligru. It should be a matrix of size `input_size x len`
+  or an array of size `input_size x len x batch_size`.
+- `state`: The hidden state of the LiGRU. If given, it is a vector of size
+  `hidden_size` or a matrix of size `hidden_size x batch_size`.
+  If not provided, it is assumed to be a vector of zeros.
+
+## Returns
+- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`.
"""
function LiGRU((input_size, hidden_size)::Pair; kwargs...)
cell = LiGRUCell(input_size => hidden_size; kwargs...)
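
A sketch of seeding the sequence wrapper with an explicit initial state, per the `ligru(inp, state)` form above; sizes are illustrative:

```julia
using RecurrentLayers

ligru = LiGRU(4 => 8)
xs = rand(Float32, 4, 10, 3)    # input_size x len x batch_size
h0 = zeros(Float32, 8, 3)       # hidden_size x batch_size

states = ligru(xs, h0)          # hidden_size x len x batch_size
```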
31 changes: 29 additions & 2 deletions src/mgu_cell.jl
@@ -34,7 +34,19 @@ h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t
# Forward
-    rnncell(inp, [state])
+    mgucell(inp, state)
+    mgucell(inp)
+
+## Arguments
+- `inp`: The input to the mgucell. It should be a vector of size `input_size`
+  or a matrix of size `input_size x batch_size`.
+- `state`: The hidden state of the MGUCell. It should be a vector of size
+  `hidden_size` or a matrix of size `hidden_size x batch_size`.
+  If not provided, it is assumed to be a vector of zeros.
+
+## Returns
+- A tuple `(output, state)`, where both elements are the updated state `new_state`,
+  a tensor of size `hidden_size` or `hidden_size x batch_size`.
"""
function MGUCell((input_size, hidden_size)::Pair;
init_kernel = glorot_uniform,
@@ -58,7 +70,7 @@ function (mgu::MGUCell)(inp::AbstractVecOrMat, state)
forget_gate = sigmoid_fast.(gxs[1] .+ ghs[1]*state)
candidate_state = tanh_fast.(gxs[2] .+ ghs[2]*(forget_gate.*state))
new_state = forget_gate .* state .+ (1 .- forget_gate) .* candidate_state
-return new_state
+return new_state, new_state
end

Base.show(io::IO, mgu::MGUCell) =
@@ -92,6 +104,21 @@ f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f), \\
h_t &= (1 - f_t) \odot h_{t-1} + f_t \odot \tilde{h}_t
\end{aligned}
```
+# Forward
+    mgu(inp, state)
+    mgu(inp)
+
+## Arguments
+- `inp`: The input to the mgu. It should be a matrix of size `input_size x len`
+  or an array of size `input_size x len x batch_size`.
+- `state`: The hidden state of the MGU. If given, it is a vector of size
+  `hidden_size` or a matrix of size `hidden_size x batch_size`.
+  If not provided, it is assumed to be a vector of zeros.
+
+## Returns
+- New hidden states `new_states` as an array of size `hidden_size x len x batch_size`.
"""
function MGU((input_size, hidden_size)::Pair; kwargs...)
cell = MGUCell(input_size => hidden_size; kwargs...)
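
The `(out, state)` convention is what makes a generic scan possible: the first element is collected per step, the second is threaded to the next step. A hand-rolled sketch mirroring what `Flux.scan` does; `manual_scan` is a hypothetical helper, not part of the package:

```julia
using RecurrentLayers

# Collect each step's output and thread the state through the sequence.
function manual_scan(cell, xs, state)
    outs = similar(xs, size(state, 1), size(xs, 2))
    for t in axes(xs, 2)
        out, state = cell(xs[:, t], state)
        outs[:, t] = out
    end
    return outs, state
end

cell = MGUCell(3 => 6)
xs = rand(Float32, 3, 7)        # input_size x len, one unbatched sequence
outs, final_state = manual_scan(cell, xs, zeros(Float32, 6))
size(outs)                      # (6, 7): hidden_size x len
```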

2 comments on commit 8282e46

@MartinuzziFrancesco (Owner, Author)

@JuliaRegistrator

Registration pull request created: JuliaRegistries/General/121439

Tip: Release Notes

Did you know you can add release notes too? Just add markdown-formatted text underneath the comment after the text
"Release notes:" and it will be added to the registry PR, and if TagBot is installed it will also be added to the
release that TagBot creates. For example:

```
@JuliaRegistrator register

Release notes:

## Breaking changes

- blah
```

To add them here just re-invoke and the PR will be updated.

Tagging

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the GitHub interface, or via:

```
git tag -a v0.2.0 -m "<description of version>" 8282e464032bc2cdb5c979d10a2eb4993720c4ff
git push origin v0.2.0
```
