Backport net.box conn leak and sync test #521

Merged
merged 5 commits into from
Mar 11, 2025

Serpentian
Contributor

No description provided.

This commit introduces a test that checks that vshard works properly
with synchronous user spaces. It also checks that vshard properly finds
masters when Raft failover is used.

With elections and auto master enabled, vshard works as expected and
no additional actions are required.

However, to use synchronous replication in vshard without elections
enabled, the user is supposed to manually call `box.ctl.promote`
on the expected master:
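
As a rough illustration of the "elections and auto master" setup mentioned above, the configuration could look like the sketch below. The option names (`election_mode`, `replication_synchro_quorum`, `master = 'auto'`) are existing Tarantool and vshard options, but the UUIDs, URIs, instance names and quorum value are made up for the example and are not taken from this PR's test.

```lua
-- Options that would be passed to box.cfg{} on every instance.
-- Values are illustrative assumptions, not the ones used by the test.
local box_cfg = {
    election_mode = 'candidate',      -- the instance takes part in Raft elections
    replication_synchro_quorum = 2,   -- acks required to confirm a sync transaction
}

-- Fragment of a vshard configuration. With master = 'auto' no replica
-- carries an explicit `master = true` flag; the master is discovered.
-- UUIDs, URIs and names below are hypothetical.
local vshard_cfg = {
    bucket_count = 3000,
    sharding = {
        ['aaaaaaaa-0000-4000-a000-000000000001'] = {
            master = 'auto',
            replicas = {
                ['bbbbbbbb-0000-4000-a000-000000000001'] = {
                    uri = 'storage:storage@127.0.0.1:3301',
                    name = 'storage_1_a',
                },
                ['bbbbbbbb-0000-4000-a000-000000000002'] = {
                    uri = 'storage:storage@127.0.0.1:3302',
                    name = 'storage_1_b',
                },
                ['bbbbbbbb-0000-4000-a000-000000000003'] = {
                    uri = 'storage:storage@127.0.0.1:3303',
                    name = 'storage_1_c',
                },
            },
        },
    },
}
```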

```lua
box.cfg{read_only = false}
box.ctl.promote()
```

Note that `on_master_enable` cannot be used to call `box.ctl.promote`
when the node that was the master fails. vshard uses the `box.info.ro`
flag as the marker of a storage state switch: replica -> master and
vice versa. All `on_master_enable/disable` callbacks are executed only
after `box.info.ro` changes, so it is impossible to call
`box.ctl.promote` in them: after the box.cfg reconfiguration the
instance does not enter the rw state, since it remains read-only for
the `synchro` reason.
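
A minimal sketch of how the situation above could be detected from user code, assuming a Tarantool version where `box.info.ro_reason` is available. The helper name `needs_manual_promote` is hypothetical; it takes the info table as an argument so it can be tried outside Tarantool as well.

```lua
-- Hypothetical helper, not part of vshard. `info` is expected to look
-- like box.info: `ro` is a boolean, `ro_reason` is a string or nil.
local function needs_manual_promote(info)
    -- The instance was configured writable but stays read-only because it
    -- does not own the synchronous transaction queue.
    return info.ro == true and info.ro_reason == 'synchro'
end

-- Inside Tarantool this could be used roughly as follows:
--   box.cfg{read_only = false}
--   if needs_manual_promote(box.info) then
--       box.ctl.promote()
--   end
```

The check is done right after reconfiguration rather than in `on_master_enable`, because that callback is not reached while `box.info.ro` stays true.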

Closes tarantool#413

NO_DOC=<on_master_enable/disable are not documented>

This commit adds a simple load generator for vshard. Currently
it generates only various echo requests. Support for adding/removing
customers and accounts may be introduced in the future, but
it is not required for now.

NO_DOC=example
NO_TEST=example

There was a typo in the `netbox_is_conn_dead` function: it marked the
connection as dead when the fiber was in the `suspended` status.
The connection is down only when its fiber is in the `dead` status;
in other cases `reconnect_after` works.
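
The rule described above can be sketched as a small predicate. This is an illustration only, not the actual `netbox_is_conn_dead` code: `is_conn_dead` is a hypothetical name, and its argument is the string that `fiber:status()` would report for the connection's fiber.

```lua
-- Hypothetical stand-in for the check, not vshard's implementation.
-- `fiber_status` is one of the fiber states discussed in the commit
-- message, e.g. 'running', 'suspended' or 'dead'.
local function is_conn_dead(fiber_status)
    -- A suspended fiber is still alive, so reconnect_after can bring the
    -- connection back. Only a dead fiber means the connection is gone.
    return fiber_status == 'dead'
end
```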

Closes tarantool#518

NO_DOC=bugfix

The net.box connection could not be garbage collected until the
vconnect.future async call was garbage collected, since the
connection is referenced as a part of its async request.
The leak is reproducible only when name_as_key identification
is used, since in other cases vconnect is not set.

Let's remove the future object from the net.box connection on
disconnect, so that it can be garbage collected after detach.
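
The general effect can be reproduced in plain Lua with a weak table. The sketch below is a simplified model under stated assumptions, not the net.box internals: `vconnect`, `future` and `conn` are ordinary tables standing in for the real objects, and the model only shows that an object stays alive while a live future references it.

```lua
-- Weak-valued table: observes an object without keeping it alive.
local probe = setmetatable({}, {__mode = 'v'})

-- Hypothetical model: the future of the async call holds a reference
-- to the connection it was issued on.
local function start_vconnect(p)
    local conn = {state = 'active'}
    p.conn = conn
    return {future = {conn = conn}}
end

local function is_alive(p)
    collectgarbage('collect')
    collectgarbage('collect')
    return p.conn ~= nil
end

local vconnect = start_vconnect(probe)

-- Nothing but the future references the connection now.
local alive_with_future = is_alive(probe)

-- Drop the future, as the fix does on disconnect.
vconnect.future = nil
local alive_without_future = is_alive(probe)

print(alive_with_future, alive_without_future)
```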

Part of tarantool#517

NO_DOC=bugfix

When a replica is removed on reconfig while it has a vconnect
in progress, it cannot be garbage collected due to the vconnect.future
async call, which references the connection.

Let's detach the connections of the removed replicas on reconfig;
on detach the future object is dropped.
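
A minimal sketch of the reconfiguration step described above, using plain tables. The function names and the `conn` / `vconnect` fields are hypothetical and only model the idea: replicas that are absent from the new configuration get their connection detached, and detaching also drops the pending future.

```lua
-- Hypothetical model of detaching a replica's connection.
local function detach_conn(replica)
    if replica.vconnect ~= nil then
        -- Dropping the future removes its reference to the connection.
        replica.vconnect.future = nil
    end
    replica.conn = nil
end

-- Detaches every replica present in the old configuration but absent
-- from the new one. Returns the sorted list of detached replica ids.
local function detach_removed(old_replicas, new_replicas)
    local detached = {}
    for id, replica in pairs(old_replicas) do
        if new_replicas[id] == nil then
            detach_conn(replica)
            table.insert(detached, id)
        end
    end
    table.sort(detached)
    return detached
end

-- Usage with made-up replica ids.
local old = {
    r1 = {conn = {}, vconnect = {future = {}}},
    r2 = {conn = {}, vconnect = {future = {}}},
}
local new = {r1 = {}}
local removed = detach_removed(old, new)
```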

Closes tarantool#517

NO_DOC=bugfix

@Serpentian requested a review from Gerold103 on March 11, 2025, 00:28
@sergepetrenko requested review from sergepetrenko and removed request for Gerold103 on March 11, 2025, 13:34
@sergepetrenko merged commit 8f89f97 into tarantool:master on Mar 11, 2025
8 checks passed
3 participants