-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Features][CustomSched][BPFLib] #29
base: master
Are you sure you want to change the base?
Conversation
abrehman94
commented
Oct 17, 2024
e215d44
to
939b86b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not going to look at correctness, etc. on this, will trust you. Just going for code quality, issues like that.
src/Ilúvatar/Makefile
Outdated
@@ -55,4 +58,4 @@ format: | |||
format-check: | |||
@cargo fmt --all -- --check | |||
clippy: | |||
@RUSTFLAGS=$(RUST_FLAGS) $(RUST_C) clippy --workspace --examples --benches --no-deps -- -Dclippy::suspicious -Dclippy::correctness -Dclippy::perf -Aclippy::single_match -Aclippy::new_without_default -Aclippy::too_many_arguments -Aclippy::type-complexity -Dclippy::from_over_into -Aclippy::redundant-field-names -Dwarnings | |||
@RUSTFLAGS=$(RUST_FLAGS) $(RUST_C) clippy --fix --allow-no-vcs --workspace --examples --benches --no-deps -- -Dclippy::suspicious -Dclippy::correctness -Dclippy::perf -Aclippy::single_match -Aclippy::new_without_default -Aclippy::too_many_arguments -Aclippy::type-complexity -Dclippy::from_over_into -Aclippy::redundant-field-names -Dwarnings |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This command is used to check if there are clippy violations, especially in CI, not to correct them. Adding --fix
will have the issues "fixed" in the CI build, but those changes aren't actually committed to this branch. The issues would persist and silently creep in code quality problems. Undo this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
@@ -37,6 +37,9 @@ tiny: | |||
debug: | |||
@echo "Building debug" | |||
@RUSTFLAGS=$(RUST_FLAGS) $(RUST_C) build $(DEBUG_FLAGS) $(CARGO_ARGS) | |||
fix: | |||
@echo "Fixing lint errors" | |||
@RUSTFLAGS=$(RUST_FLAGS) $(RUST_C) fix $(DEBUG_FLAGS) $(CARGO_ARGS) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Combine this with the clippy fix command you changed below? Would be nice to have in the makefile to simplify fixing them locally
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -5,5 +5,9 @@ passthrough = ["ARCH=amd64", "GO_VERSION=1.22.0", "CNI_VERSION=v1.1.1", "GOPATH= | |||
default-target = "x86_64-unknown-linux-gnu" # use this target if none is explicitly provided | |||
pre-build = [ # additional commands to run prior to building the package | |||
"dpkg --add-architecture $CROSS_DEB_ARCH", | |||
"apt-get update && apt-get --assume-yes install protobuf-compiler iproute2 wget curl runc bridge-utils iptables net-tools sysstat" | |||
"apt-get update && apt-get --assume-yes install protobuf-compiler iproute2 wget curl runc bridge-utils iptables net-tools sysstat libelf-dev lsb-release wget software-properties-common gnupg", | |||
"wget https://apt.llvm.org/llvm.sh && chmod +x llvm.sh && ./llvm.sh all && ln -s /usr/bin/ld.lld-18 /usr/bin/ld.lld && ln -s /usr/bin/clang-18 /usr/bin/clang && ln -s /usr/bin/clang++-18 /usr/bin/clang++ && rm llvm.sh" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add these to the setup docs as new dependencies? I assume they're required for the bpf stuff
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are required at the build time. From what I understand, cross builds in a container environment so these dependencies need to be installed there. It does not need to be on the host. Setup docs are for the host right?
let rb_map = &mut skel.maps.queued_stats; | ||
let mut builder = libbpf_rs::RingBufferBuilder::new(); | ||
builder.add(rb_map, callback_stats).unwrap(); | ||
let queued_stats = builder.build().unwrap(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't want upwrap
and panic!
because they can cause hard-to-debug issues and the error is only sent to stderr and easily lost
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fs_policy_tsksz is launched as a separate process and the stderr is being logged to a separate log file. In case of a panic log file can be interpreted offline.
} | ||
} | ||
for (k, v) in &self.characteristics { | ||
println!("{}: {:?}", k, v); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Idk how this will run, but can it use the same logging framework we have? For consistency
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's running as a separate process. I am logging it's stdout and stderr to separate files. We can use same tracing framework but it would just change the logging format in those files.
Not sure if you meant logging to same file or just logging using this framework for consistent format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there am existing crate to make cgroup interaction easier on us and avoid duplication? I trust it's doing what you want, don't know what's out there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Closest I found was this https://crates.io/crates/cgroups-fs. It does not support cgroup v2.
procs: vec![], | ||
cpustats: [ | ||
("nr_periods".to_string(), 0), | ||
("nr_throttled".to_string(), 0), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These strings should be constants
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
let cgroup_id = "89e979a2b0e9fd3".to_string(); | ||
let res = read_cgroup(cgroup_id); | ||
println!("{:?}", res); | ||
assert!(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Get these tests working, and remove print statements in them please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
.arg("-c") | ||
.arg(cmd) | ||
.output() | ||
.expect("failed to execute process"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are wrappers for executing commands, iluvatar_library::utils::execute_cmd
and others
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was experimental code. I have removed it.
@@ -111,6 +121,8 @@ impl ContainerPool { | |||
Some(c) => { | |||
debug!(tid=%tid, container_id=%c.container_id(), name=%self.pool_name, pool_type=?PoolType::Idle, "Removing random container from pool"); | |||
self.add_container(c.clone(), &self.running_pool, tid, PoolType::Running); | |||
// pool is the best place to maintain a tid to cgroup id mapping |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the best place? Presumably the cgroup on a container doesn't change and could just be stored in the container struct. This would just make maintaining the pool and cgroups more convoluted
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was no mapping for tid to container. It creates this map in the container pool.
I don't think during a transaction tid a cgroup-id can change. Correct me if I am wrong.
Cgroup-id is part of the container.
@@ -333,6 +333,8 @@ impl CpuQueueingInvoker { | |||
EventualItem::Now(n) => n?, | |||
}; | |||
self.running.fetch_add(1, std::sync::atomic::Ordering::Relaxed); | |||
// since the acquire has been called - tid-cgroup id should be there | |||
self.cmap.start_invoke(®.fqdn, tid); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put these in invoke_on_container_2
, one combined place that already has the cmap in it, much less code threading you have to do
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is a very nice suggestion! moved it to invoke_on_container_2.
pub tx_pids: IpcSender<PidsPacket>, | ||
} | ||
|
||
fn launch_scheduler(worker_config: &WorkerConfig) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All this should be in a separate file. It is much too specific to be in the startup code file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Moved to a separate file.
939b86b
to
03aa34f
Compare
How much did you change with that push? You re-wrote the history so I can't really tell. And please don't do that again... Is there more that needs done to this? |
I was trying to keep the history clean. Specific changes made by the force push can be viewed using compare button. Last commit does not fix everything. I have removed unnecessary code and fixed warnings only. Right now I am on vacation, I will get to it in a week. |
a60e641
to
2cdae55
Compare
Change BPFlib to a feature. |
2cdae55
to
de5cc34
Compare
* Custom Scheduling * Capturing cgroup stats on per invoke level * BPFLibrary to create a pinned BPFMap of function characteristics * used to share data with scx scheduler
de5cc34
to
e1170df
Compare
@aFuerst Please have a look. I think I have addressed all the review comments. Sorry about force pushing the branch. It's just too much code, if I would have done separate commits, it would pollute the git history. |