
Load generation App Updates #1509

Merged (13 commits) on Dec 20, 2023
Conversation

@PaurushGarg (Member) commented Nov 28, 2023

Description:

Improves OTLP metric and trace load generation:

  1. Removes the configurable observation interval, flush interval, and datapoint count from metrics, as they are unused and not required for the load generation needed in performance testing.
  2. Corrects the command-line parameters of metric generation to match how it is invoked from the load generator.
  3. Updates the test framework to use the latest load-generator app.
  4. Correctly sets up the metric and trace providers to produce the intended load (metrics/traces).

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@PaurushGarg PaurushGarg requested a review from a team as a code owner November 28, 2023 07:53
@Danielzolty (Contributor) commented:

Here are performance results for 100, 1,000, and 5,000 TPS metric tests:

100:
{"testcase":"otlp_metric_mock","instanceType":"m5.2xlarge","receivers":["otlp"],"processors":["batch"],"exporters":["logging","awsemf"],"dataType":"otlp","dataMode":"metric","dataRate":100,"avgCpu":0.350000239262089,"avgMem":109.9917312,"maxCpu":0.5000363635444206,"maxMem":111.120384,"commitId":"dummy_commit","collectionPeriod":10,"testingAmi":"soaking_linux"}

1,000:
{"testcase":"otlp_metric_mock","instanceType":"m5.2xlarge","receivers":["otlp"],"processors":["batch"],"exporters":["logging","awsemf"],"dataType":"otlp","dataMode":"metric","dataRate":1000,"avgCpu":2.7616599662192955,"avgMem":141.03784106666666,"maxCpu":8.99966334149369,"maxMem":154.710016,"commitId":"dummy_commit","collectionPeriod":10,"testingAmi":"soaking_linux"}

5,000:
{"testcase":"otlp_metric_mock","instanceType":"m5.2xlarge","receivers":["otlp"],"processors":["batch"],"exporters":["logging","awsemf"],"dataType":"otlp","dataMode":"metric","dataRate":5000,"avgCpu":14.974639151005709,"avgMem":224.80576511999996,"maxCpu":21.39946276862715,"maxMem":254.705664,"commitId":"dummy_commit","collectionPeriod":10,"testingAmi":"soaking_linux"}

@vasireddy99 (Contributor) left a comment:

Did a first pass on this PR.

@@ -49,7 +41,7 @@ The following is a list of additional optional command line arguments applicable

### StatsD Metrics Load Test Sample Command,
```
./gradlew :load-generator:run --args="metric --metricCount=100 --datapointCount=10 --observationInterval=1000 -u=localhost:8125 -d=statsd"
```
Comment (Contributor): Should the observation interval (`observationInterval=1000`) remain configurable?

Reply (Member Author): For a sample app, a configurable observationInterval is fine. In the load-generator context, though, given how it is used through the test framework, it would be unused functionality and could cause problems as well.

log.error("Metric provider was not found!");

private void createCounters(Meter meter) {
if (meter != null) {
Comment (Member): Nit: this could be replaced by an early return.
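The early-return shape the reviewer suggests would look roughly like this; the `Meter` stub below stands in for the real OpenTelemetry type, and the recorded messages stand in for logging, so this is a sketch of the control flow only, not code from the PR:

```java
import java.util.ArrayList;
import java.util.List;

public class EarlyReturnExample {
    // Stand-in for io.opentelemetry.api.metrics.Meter; only the null check matters here.
    interface Meter {}

    static final List<String> messages = new ArrayList<>();

    static void createCounters(Meter meter) {
        if (meter == null) {
            messages.add("Metric provider was not found!");
            return; // early return: the rest of the method no longer nests inside an if-block
        }
        messages.add("counters created"); // counter registration would go here
    }
}
```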

Reply (Member Author): Updated.

}
);
}
} else {
Comment (Member): Same here.

Reply (Member Author): Updated.

log.info("Generating metrics at a rate of(ms) : {}" , this.param.getObservationInterval());
scheduler.scheduleAtFixedRate(emitter, 0, this.param.getObservationInterval(), TimeUnit.MILLISECONDS);
log.info("Generating metrics at a rate of(ms) : {}" , FLUSH_INTERVAL);
scheduler.scheduleAtFixedRate(emitter, 0, FLUSH_INTERVAL, TimeUnit.MILLISECONDS);
Comment (Member): This simplifies things a bit. Should we add a warning or raise an exception in case the emitter function cannot complete in under 1 s? Otherwise the ingestion rate will not be correct, right?

Reply (Member Author): I am unsure how to implement this. scheduleAtFixedRate only throws these three exceptions:

RejectedExecutionException – if the task cannot be scheduled for execution
NullPointerException – if command or unit is null
IllegalArgumentException – if period is less than or equal to zero
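One possible approach, not relying on scheduler exceptions at all: scheduleAtFixedRate never throws for a slow task, it silently delays subsequent runs, so the overrun can be detected by timing each run inside a wrapper around the emitter. A minimal sketch (the counter and names are illustrative, not from this PR; a real version would log a warning instead):

```java
public class OverrunGuard {
    static final long FLUSH_INTERVAL = 1000; // ms, matching the PR's constant
    static int overruns = 0;

    // Wraps the emitter so that any run exceeding the flush interval is counted;
    // scheduleAtFixedRate itself gives no signal when a task overruns its period.
    static Runnable guard(Runnable emitter) {
        return () -> {
            long start = System.nanoTime();
            emitter.run();
            long elapsedMs = (System.nanoTime() - start) / 1_000_000;
            if (elapsedMs > FLUSH_INTERVAL) {
                overruns++; // a real implementation would log.warn here
            }
        };
    }
}
```

The guarded runnable would then be passed to scheduleAtFixedRate in place of the raw emitter.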

@@ -24,7 +24,7 @@ public abstract class TraceEmitter implements Emitter {

@Override
public void start(Runnable emitter) {
scheduler.scheduleAtFixedRate(emitter, TimeUnit.SECONDS.toMillis(5000),
scheduler.scheduleAtFixedRate(emitter, 0,
TimeUnit.SECONDS.toNanos(1) / this.param.getRate(), TimeUnit.NANOSECONDS);
Comment (Member): I think you need to fix this as well, no? This needs to be scheduled every second, and the emitter should emit `rate` spans?

Reply (Member Author): Raphael, in my opinion this is already fixed. It works a little differently than metrics: `nextdatapoint` is called `rate` times, producing that many traces in a second, which are then batched and exported at a 1 s frequency.
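As an arithmetic check on the schedule in the diff above: `TimeUnit.SECONDS.toNanos(1) / rate` yields the per-span period, so the emitter fires `rate` times per second (a sketch of the computation only, not code from this PR):

```java
import java.util.concurrent.TimeUnit;

public class TracePeriod {
    // Per-span scheduling period, computed the same way as in the diff:
    // one second's worth of nanoseconds divided by the spans-per-second rate.
    static long periodNanos(int rate) {
        return TimeUnit.SECONDS.toNanos(1) / rate;
    }
}
```

For the tested rates this gives 10 ms per span at 100 TPS, 1 ms at 1,000 TPS, and 0.2 ms at 5,000 TPS.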

@bryan-aguilar (Contributor) commented:

> Improves OTLP Metric Load Generation

How? By doing what? What changes are being made in this PR?

@@ -21,6 +21,9 @@
public interface Emitter {

int NUM_THREADS = 5;

final long FLUSH_INTERVAL = 1000;
Comment (Contributor): Why is this no longer configurable?

Reply (Member Author): For load generation it does not need to be configurable: we have TPS rates (100, 1,000, 5,000) batched at 1 s, so a configurable flush interval just adds unused complexity.
#1509 (comment)

scrape_interval: 15s
scrape_interval: 1s
Comment (Contributor): Doesn't changing the scrape interval affect performance? Why is this change necessary?

Reply (Member Author): Yes, it should affect performance by a factor of 15. This change is to match the metric load as stated in the performance report.
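For context, the setting under discussion is a Prometheus-style scrape interval: at 1 s each series is sampled 15 times more often than at 15 s, which is where the factor of 15 comes from. A minimal illustration (the job name is hypothetical, not from this repo):

```yaml
scrape_configs:
  - job_name: load-gen        # illustrative name only
    scrape_interval: 1s       # was 15s: 15x more samples per series per minute
```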

Reply (Member Author): Reverted. Will pull this into a separate PR as discussed.

@PaurushGarg PaurushGarg changed the title Load generation update metrics Load generation App Updates Dec 15, 2023
@PaurushGarg (Member Author) commented:

> Improves OTLP Metric Load Generation
>
> How? By doing what? What changes are being made in this PR?

Updated the PR body.

@bryan-aguilar bryan-aguilar merged commit a371a93 into terraform Dec 20, 2023
10 checks passed