Give scheduler a default volume, making it resilient to restarts #1423
default Signed-off-by: joshvanl <me@joshvanl.dev>
Will merge after tests pass.
Please don't merge.
3868f5f to 871f6ab
@yaron2 should be good now.
LGTM - Resolves the issue with the scheduler volumes in standalone mode for me 👍
Interestingly enough, the changes have fixed startup under Podman, but with Docker (and Docker Desktop) the scheduler is still complaining about operations on the data dir.
time="2024-07-17T16:57:49.536458383Z" level=info msg="Starting Dapr Scheduler Service -- version 1.14.0-rc.2 -- commit 3d30a6f3ae2191125ed9d3dcf0761077772e5de2" instance=6a5fdf505b55 scope=dapr.scheduler type=log ver=1.14.0-rc.2
time="2024-07-17T16:57:49.536503675Z" level=info msg="Log level set to: info" instance=6a5fdf505b55 scope=dapr.scheduler type=log ver=1.14.0-rc.2
time="2024-07-17T16:57:49.536643341Z" level=warning msg="mTLS is disabled. Skipping certificate request and tls validation" instance=6a5fdf505b55 scope=dapr.runtime.security type=log ver=1.14.0-rc.2
time="2024-07-17T16:57:49.536865591Z" level=info msg="Healthz server is listening on [::]:8080" instance=6a5fdf505b55 scope=dapr.scheduler type=log ver=1.14.0-rc.2
time="2024-07-17T16:57:49.536898091Z" level=warning msg="etcd client http ports not set. This is not recommended for production." instance=6a5fdf505b55 scope=dapr.scheduler.server type=log ver=1.14.0-rc.2
time="2024-07-17T16:57:49.536926883Z" level=info msg="Dapr Scheduler is starting..." instance=6a5fdf505b55 scope=dapr.scheduler.server type=log ver=1.14.0-rc.2
time="2024-07-17T16:57:49.5369168Z" level=info msg="metrics server started on :9090/" instance=6a5fdf505b55 scope=dapr.scheduler type=log ver=1.14.0-rc.2
time="2024-07-17T16:57:49.536979341Z" level=info msg="Dapr Scheduler listening on: :50006" instance=6a5fdf505b55 scope=dapr.scheduler.server type=log ver=1.14.0-rc.2
time="2024-07-17T16:57:49.537011216Z" level=info msg="Starting etcd" instance=6a5fdf505b55 scope=dapr.scheduler.server type=log ver=1.14.0-rc.2
{"level":"warn","ts":"2024-07-17T16:57:49.537048Z","caller":"embed/config.go:679","msg":"Running http and grpc server on single port. This is not recommended for production."}
{"level":"info","ts":"2024-07-17T16:57:49.537073Z","caller":"embed/etcd.go:127","msg":"configuring peer listeners","listen-peer-urls":["http://localhost:2380"]}
{"level":"info","ts":"2024-07-17T16:57:49.53722Z","caller":"embed/etcd.go:135","msg":"configuring client listeners","listen-client-urls":["http://localhost:2379"]}
time="2024-07-17T16:57:49.537322091Z" level=info msg="Running gRPC server on port 50006" instance=6a5fdf505b55 scope=dapr.scheduler.server type=log ver=1.14.0-rc.2
{"level":"info","ts":"2024-07-17T16:57:49.537343Z","caller":"embed/etcd.go:308","msg":"starting an etcd server","etcd-version":"3.5.14","git-sha":"Not provided (use ./build instead of go build)","go-version":"go1.22.4","go-os":"linux","go-arch":"arm64","max-cpu-set":10,"max-cpu-available":10,"member-initialized":false,"name":"dapr-scheduler-server-0","data-dir":"./data-default-dapr-scheduler-server-0","wal-dir":"","wal-dir-dedicated":"","member-dir":"data-default-dapr-scheduler-server-0/member","force-new-cluster":false,"heartbeat-interval":"100ms","election-timeout":"1s","initial-election-tick-advance":true,"snapshot-count":100000,"max-wals":5,"max-snapshots":5,"snapshot-catchup-entries":5000,"initial-advertise-peer-urls":["http://localhost:2380"],"listen-peer-urls":["http://localhost:2380"],"advertise-client-urls":["http://localhost:2379"],"listen-client-urls":["http://localhost:2379"],"listen-metrics-urls":[],"cors":["*"],"host-whitelist":["*"],"initial-cluster":"dapr-scheduler-server-0=http://localhost:2380","initial-cluster-state":"new","initial-cluster-token":"etcd-cluster","quota-backend-bytes":2147483648,"max-request-bytes":1572864,"max-concurrent-streams":4294967295,"pre-vote":true,"initial-corrupt-check":false,"corrupt-check-time-interval":"0s","compact-check-time-enabled":false,"compact-check-time-interval":"1m0s","auto-compaction-mode":"periodic","auto-compaction-retention":"24h0m0s","auto-compaction-interval":"24h0m0s","discovery-url":"","discovery-proxy":"","downgrade-check-interval":"5s"}
{"level":"warn","ts":"2024-07-17T16:57:49.537376Z","caller":"fileutil/fileutil.go:53","msg":"check file permission","error":"directory \"./data-default-dapr-scheduler-server-0\" exist, but the permission is \"drwxr-xr-x\". The recommended permission is \"-rwx------\" to prevent possible unprivileged access to the data"}
{"level":"info","ts":"2024-07-17T16:57:49.537402Z","caller":"embed/etcd.go:375","msg":"closing etcd server","name":"dapr-scheduler-server-0","data-dir":"./data-default-dapr-scheduler-server-0","advertise-peer-urls":["http://localhost:2380"],"advertise-client-urls":["http://localhost:2379"]}
{"level":"info","ts":"2024-07-17T16:57:49.537423Z","caller":"embed/etcd.go:377","msg":"closed etcd server","name":"dapr-scheduler-server-0","data-dir":"./data-default-dapr-scheduler-server-0","advertise-peer-urls":["http://localhost:2380"],"advertise-client-urls":["http://localhost:2379"]}
time="2024-07-17T16:57:49.537524966Z" level=info msg="Scheduler GRPC server stopped" instance=6a5fdf505b55 scope=dapr.scheduler.server type=log ver=1.14.0-rc.2
time="2024-07-17T16:57:49.537547383Z" level=info msg="Healthz server is shutting down" instance=6a5fdf505b55 scope=dapr.scheduler type=log ver=1.14.0-rc.2
time="2024-07-17T16:57:49.537604383Z" level=fatal msg="error running scheduler: cannot access data directory: open /data-default-dapr-scheduler-server-0/.touch: permission denied" instance=6a5fdf505b55 scope=dapr.scheduler type=log ver=1.14.0-rc.2
Actions run:
https://github.com/dapr/rust-sdk/actions/runs/9977498812/job/27572563670#step:16:22
Did you delete the previous volume?
Can merge once the E2E KinD test passes.
I ensured an uninstall of dapr including … The issue is also appearing on Actions runs. The only solution has been the first commit from this PR - #1422, where a default folder was created with standard permissions (0755), followed by a change to 0777.
…#4256)
* Updates API ref for jobs. Adds doc on Kubernetes Scheduler persistent volume Signed-off-by: joshvanl <me@joshvanl.dev>
* Apply suggestions from code review Co-authored-by: Hannah Hunter <94493363+hhunter-ms@users.noreply.github.com> Signed-off-by: Josh van Leeuwen <me@joshvanl.dev>
* Adds scheduler persistent volume docs for selfhosted Signed-off-by: joshvanl <me@joshvanl.dev>
* Updates scheduler volume docs based on dapr/cli#1423 Signed-off-by: joshvanl <me@joshvanl.dev>
* Apply suggestions from code review Co-authored-by: Hannah Hunter <94493363+hhunter-ms@users.noreply.github.com> Signed-off-by: Josh van Leeuwen <me@joshvanl.dev>
* Adds reference for file system location of scheduler local volume Signed-off-by: joshvanl <me@joshvanl.dev>
* Update daprdocs/content/en/reference/api/jobs_api.md Co-authored-by: Cassie Coyle <cassie.i.coyle@gmail.com> Signed-off-by: Josh van Leeuwen <me@joshvanl.dev>
* Update daprdocs/content/en/operations/hosting/self-hosted/self-hosted-persisting-scheduler.md Signed-off-by: Mark Fussell <markfussell@gmail.com>
* Apply suggestions from code review Co-authored-by: Mark Fussell <markfussell@gmail.com> Signed-off-by: Josh van Leeuwen <me@joshvanl.dev>
* Adds default volume name for scheduler dapr init Signed-off-by: joshvanl <me@joshvanl.dev>
* Adds directions for getting scheduler volume from Docker Desktop Signed-off-by: joshvanl <me@joshvanl.dev>
* Update daprdocs/content/en/operations/hosting/self-hosted/self-hosted-persisting-scheduler.md Signed-off-by: Mark Fussell <markfussell@gmail.com>
---------
Signed-off-by: joshvanl <me@joshvanl.dev>
Signed-off-by: Josh van Leeuwen <me@joshvanl.dev>
Signed-off-by: Mark Fussell <markfussell@gmail.com>
Co-authored-by: Hannah Hunter <94493363+hhunter-ms@users.noreply.github.com>
Co-authored-by: Cassie Coyle <cassie.i.coyle@gmail.com>
Co-authored-by: Mark Fussell <markfussell@gmail.com>
default
Description
Please explain the changes you've made
Issue reference
We strive to have every PR opened against an issue where the problem or feature has been discussed prior to implementation.
Please reference the issue this PR will close: #[issue number]
Checklist
Please make sure you've completed the relevant tasks for this PR, out of the following list: