WIP: Benchmarking #320

Open
wants to merge 38 commits into base: develop
Changes from all 38 commits:
7c7aaaf
chore: Added flake8 settings
SRFU-NN Oct 8, 2024
c95f923
chore: Allow to set seed from ModelSystem initialisor
SRFU-NN Oct 8, 2024
6681ab0
chore: NoiseModel now uses the fully fledged RNG getter
SRFU-NN Oct 8, 2024
ae9997c
doc: Started on benchmarking notebook
SRFU-NN Oct 8, 2024
1c5bb7c
feat: Moved benchmarking function to separate file
SRFU-NN Oct 9, 2024
97a059f
Doc: Benchmarking
SRFU-NN Oct 10, 2024
9d17d4a
feat: Seed for LHS
SRFU-NN Oct 10, 2024
364e011
doc: Seeding LHS in benchmarking
SRFU-NN Oct 10, 2024
1ce5fd8
chore: Change noise type to property function
SRFU-NN Oct 22, 2024
c69c831
chore: Changed named lambda functions to def functions
SRFU-NN Oct 22, 2024
a2d8390
fix: Changed noise model to noise type being a property
SRFU-NN Oct 22, 2024
292be7f
feat: ModelSystem now has a copy method
SRFU-NN Oct 22, 2024
d4e5aa7
doc: Finding limits in benchmarking
SRFU-NN Oct 22, 2024
4281050
feat: Better naming for benchmarking notebook
SRFU-NN Nov 21, 2024
1ca20d1
Merge remote-tracking branch 'origin/develop' into benchmarking
SRFU-NN Feb 6, 2025
4da5eb3
doc: Notes on benchmarking
SRFU-NN Feb 7, 2025
9da8dfa
feat: Started benchamrking script
SRFU-NN Feb 10, 2025
0a876e1
feat: Golden ratio sampling suggestor
SRFU-NN Feb 11, 2025
f5729ef
Merge branch 'golden_ratio_sampling' into benchmarking
SRFU-NN Feb 11, 2025
353f0f7
feat: Benchamrking on a single instance
SRFU-NN Feb 12, 2025
ede0784
feat: Benchmarking with validation
SRFU-NN Feb 12, 2025
a563c2b
chore: Restored defaults
SRFU-NN Feb 12, 2025
c7cfc6d
doc: Reference on Generalized Golden Ratio Sequence
SRFU-NN Feb 12, 2025
c4d7581
feat: ConstantSuggestor
SRFU-NN Feb 12, 2025
a3a0b9a
fix: Integer.sample([1]) gave a point above the high limit
SRFU-NN Feb 12, 2025
d55bf73
chore: Robusting hart3 against non-float inputs
SRFU-NN Feb 12, 2025
e996e4c
feat: Benchmarking framework done
SRFU-NN Feb 12, 2025
0cd2a9d
doc: More benchmanrking options
SRFU-NN Feb 13, 2025
752a0cf
chore: Removed temporary script
SRFU-NN Feb 13, 2025
5719cac
Merge remote-tracking branch 'origin/develop' into benchmarking
SRFU-NN Feb 13, 2025
44dd6bb
refactor: Noise distributions is now a dict to allow for easy addition
SRFU-NN Feb 13, 2025
cca038f
chore: Benchmark uses new way to set noise distribution
SRFU-NN Feb 13, 2025
24bbba5
chore: Benchmark uses new way to set noise distribution
SRFU-NN Feb 13, 2025
748ee63
doc: Updated XPyriMentor notebook
SRFU-NN Feb 13, 2025
3c8306d
doc: Saving Suggestor definition in benchamrikng notebook
SRFU-NN Feb 13, 2025
48225e7
Merge branch 'benchmarking' of https://github.com/novonordisk-researc…
SRFU-NN Feb 13, 2025
1535804
doc: Writing why NoiseModel.noise_types is a dict with lambda functio…
SRFU-NN Feb 13, 2025
c1ad136
chore: NoiseModel.normal convenience function
SRFU-NN Feb 14, 2025
3 changes: 3 additions & 0 deletions .flake8
@@ -0,0 +1,3 @@
[flake8]
exclude = .git,__pycache__,.env,venv,env,ENV,env.bak,venv.bak,build,dist
max-line-length = 88
155 changes: 155 additions & 0 deletions ProcessOptimizer/model_systems/benchmark.py
@@ -0,0 +1,155 @@
from __future__ import annotations
import functools
from dataclasses import dataclass, field
from typing import Iterable

import numpy as np

from . import get_model_system, ModelSystem
from ProcessOptimizer import Optimizer
from ProcessOptimizer.utils import expected_minimum
from XpyriMentor import XpyriMentor


@dataclass
class BenchmarkInstance:
model_system_name: str
xpyrimentor_definition: dict
experimental_budget: int
expected_random_runtime: float
seed: int
noise_level: float
validate: bool = False
# Results:
number_of_evaluations: int | None = None
success: bool | None = None
# Internal variables:
success_level: float = field(init=False, repr=False)
model: ModelSystem = field(init=False, repr=False)
xpyrimentor: XpyriMentor = field(init=False, repr=False)

def __init__(
self,
model_system_name: str,
xpyrimentor_definition: dict,
experimental_budget: int,
expected_random_runtime: float,
seed: int,
noise_level: float = 1.0,
**kwargs
):
"""
Initialize the benchmark instance.

Needs the following parameters:
* `model_system_name` [str]:
Name of the model system to use.
* `xpyrimentor_definition` [dict]:
Definition of the XpyriMentor object to use.
* `experimental_budget` [int]:
Maximum number of evaluations to run before stopping.
* `expected_random_runtime` [float]:
How "hard" the system is to optimize. This is the expected number of random
parameter sets you need to evaluate to find a good one.
* `seed` [int]:
Random seed to use.
* `noise_level` [float, default 1.0]:
Factor applied to the model system's noise size.
* `validate` [bool, default False, passed via `**kwargs`]:
If True, the estimated optimum is evaluated and fed back to the optimizer
before the final success check.
"""
self.__dict__.update({
"model_system_name": model_system_name,
"xpyrimentor_definition":xpyrimentor_definition,
"experimental_budget":experimental_budget,
"expected_random_runtime":expected_random_runtime,
"seed":seed,
"noise_level":noise_level,
})
self.__dict__.update(kwargs)
self.success_level = find_limits(
model_system_name, expected_random_runtime, self.noise_level
)
self.model = get_model_system(model_system_name, seed=seed)
self.xpyrimentor = XpyriMentor(self.model.space, self.xpyrimentor_definition, seed=seed)

@property
def model_system(self) -> ModelSystem:
model_system = get_model_system(self.model_system_name, seed=self.seed)
model_system.noise_size = model_system.noise_size*self.noise_level
return model_system

def run(self) -> BenchmarkInstance:
"""
Run the benchmark instance, save the number of evaluations and whether the
success level was reached, and return the instance.
"""
success = False
while len(self.xpyrimentor.Xi) < self.experimental_budget:
x = self.xpyrimentor.ask()
y = self.model.get_score(x)
self.xpyrimentor.tell(x, [y])
# We could restrict testing to only if the point is considered good, but it
# doesn't seem to matter much for the runtime.
minimum_location, minimum_value = self.find_estimated_optimum()
if minimum_value < self.success_level and self.validate:
result = self.model.get_score(minimum_location)
self.xpyrimentor.tell(minimum_location, result)
minimum_location, minimum_value = self.find_estimated_optimum()
if minimum_value < self.success_level:
# Accept only if the pessimistic true value at the estimated optimum is also
# below the success level.
true_quality = find_pesimistic_value(self.model, minimum_location)
if true_quality < self.success_level:
success = True
break
self.number_of_evaluations = len(self.xpyrimentor.Xi)
self.success = success
return self

def find_estimated_optimum(self) -> tuple:
"""
Find the parameter set that is estimated to be the optimum, and the value that
is 2 standard deviations above the model's predicted value at that point.
"""
# This is a bit of a hack, but it works for now. The optimizer is the second
# suggestor in the suggestor list of the sequential strategizer. We need the
# optimizer since we need to know the expected minimum and the model uncertainty
# there.
optimizer: Optimizer = self.xpyrimentor.suggestor.suggestors[-1][1].optimizer
optimizer.Xi = self.xpyrimentor.Xi
optimizer.yi = self.xpyrimentor.yi
optimizer.update_next()
result = optimizer.get_result()
result_location, [result_value, result_std] = expected_minimum(result, return_std=True)
return (result_location, result_value + 2*result_std)


@functools.cache
def find_limits(
model_system_name: str,
expected_random_runtime: float,
noise_level: float,
):
seed = 42
random_scaling = 100
model_system = get_model_system(model_system_name, seed=seed)
model_system.noise_size = model_system.noise_size*noise_level
sampler = XpyriMentor(
space=model_system.space, suggestor={"suggestor_name": "GoldenRatio"}, seed=seed
)
estimated_points = [
(point, find_pesimistic_value(model_system, point))
for point in sampler.ask(expected_random_runtime*random_scaling)
]
# Sort the points by score, and find the point that corresponds to the expected
# random runtime
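# For example (illustrative numbers, not values from this file): if
# expected_random_runtime is 20 and random_scaling is 100, we rank 2000 sampled points
# and take the one at index 100, i.e. the 1/20 quantile, which is the score the best of
# 20 random evaluations would be expected to reach.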
estimated_points.sort(key=lambda x: x[1])
limit_point = estimated_points[int(random_scaling)]
return limit_point[1]

def find_pesimistic_value(model_system: ModelSystem, x: Iterable):
"""
Find the value that is 2 standard deviations above the true value at `x`.
"""
model_system = model_system.copy() # Copy to avoid changing the original
# Set the noise model so that we always return two standard deviations above the true
# value.
model_system.noise_model.noise_types["constant"] = lambda: 2
model_system.noise_model.noise_type = "constant"
return model_system.get_score(x)
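
For reviewers, a minimal usage sketch of the new benchmarking class. This is an
illustration under stated assumptions, not code from this PR: the import path is assumed
to mirror the file location, "hart3" is assumed to resolve via get_model_system, and the
suggestor definition is a placeholder, since find_estimated_optimum expects a sequential
strategizer whose last suggestor wraps an Optimizer.

from ProcessOptimizer.model_systems.benchmark import BenchmarkInstance  # assumed path

instance = BenchmarkInstance(
    model_system_name="hart3",  # assumed to be registered with get_model_system
    xpyrimentor_definition={"suggestor_name": "Default"},  # placeholder definition
    experimental_budget=50,
    expected_random_runtime=20.0,
    seed=1,
    noise_level=1.0,
    validate=True,
)
instance.run()
print(instance.number_of_evaluations, instance.success)
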
7 changes: 6 additions & 1 deletion ProcessOptimizer/model_systems/hart3.py
@@ -30,6 +30,10 @@ def hart3_score(x):
The score of the system at x.
"""
# Define the constants that are canonically used with this function.
if isinstance(x, np.ndarray):
x = x.astype(dtype=float)  # Ensure that x is an array of floats to support the math.
else:
x = np.asarray(x, dtype=float)
alpha = np.asarray([1.0, 1.2, 3.0, 3.2])
P = 10**-4 * np.asarray(
[[3689, 1170, 2673], [4699, 4387, 7470], [1091, 8732, 5547], [381, 5743, 8828]]
@@ -40,13 +44,14 @@
return -np.sum(alpha * np.exp(-np.sum(A * (np.array(x) - P) ** 2, axis=1)))


def create_hart3(noise=bool)-> ModelSystem:
def create_hart3(noise=bool, **kwargs) -> ModelSystem:
hart3 = ModelSystem(
hart3_score,
[(0.0, 1.0), (0.0, 1.0), (0.0, 1.0)],
noise_model="constant",
true_max=0.0,
true_min=-3.863,
**kwargs,
)
if noise:
return hart3
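
A quick sketch of what the added float cast enables, assuming hart3_score is importable
from ProcessOptimizer.model_systems.hart3; per the commit message, the intent is
robustness against non-float inputs.

import numpy as np

from ProcessOptimizer.model_systems.hart3 import hart3_score  # assumed import path

# Both calls now pass through the float cast before the exponential math.
print(hart3_score([0, 0, 0]))            # list of Python ints
print(hart3_score(np.array([0, 0, 0])))  # integer ndarray
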
7 changes: 6 additions & 1 deletion ProcessOptimizer/model_systems/hart6.py
@@ -31,6 +31,10 @@ def hart6_score(x):
The score of the system at x.
"""
# Define the constants that are canonically used with this function.
if isinstance(x, np.ndarray):
x = x.astype(dtype=float)  # Ensure that x is an array of floats to support the math.
else:
x = np.asarray(x, dtype=float)
alpha = np.asarray([1.0, 1.2, 3.0, 3.2])
P = 10**-4 * np.asarray(
[
@@ -51,12 +55,13 @@
return -np.sum(alpha * np.exp(-np.sum(A * (np.array(x) - P) ** 2, axis=1)))


def create_hart6(noise: bool = True) -> ModelSystem:
def create_hart6(noise: bool = True, **kwargs) -> ModelSystem:
noise_model = "constant" if noise else None
return ModelSystem(
hart6_score,
[(0.0, 1.0) for _ in range(6)],
noise_model=noise_model,
true_max=0.0,
true_min=-3.3224,
**kwargs,
)
85 changes: 65 additions & 20 deletions ProcessOptimizer/model_systems/model_system.py
@@ -1,4 +1,4 @@
from typing import Callable, List, Union
from typing import Callable, List, Optional, Union

import numpy as np
from ..space import Space, space_factory
@@ -11,42 +11,72 @@ class ModelSystem:
Model System for testing ProcessOptimizer. Instances of this class are used
in benchmarks and the example notebooks.

Parameters
Attributes
----------
* `score` [Callable]:
Function for calculating the noiseless score of the system at a given
point in the parameter space.

* `space` [List or Space]:
* `space` [Space]:
A list of dimension definitions or the parameter space as a Space object.

* `true_min` [float]:
The true minimum value of the score function within the parameter space.

* `noise_model` [str, dict, or NoiseModel]:
* `true_max` [float]:
The true maximum value of the score function within the parameter space.

* `noise_model` [NoiseModel]:
Noise model to apply to the score.
If str, it should be the name of the noise model type. In this case,
further arguments can be given (e.g. `noise_size`).
If dict, one key should be `model_type`.
If NoiseModel, this NoiseModel will be used.

Possible model type strings are:
"constant": The noise level is constant.
"proportional": Tne noise level is proportional to the score.
"zero": No noise is applied.
"""

def __init__(
self,
score: Callable[..., float],
space: Union[Space, List],
noise_model: Union[str, dict, NoiseModel, None],
true_min=None,
true_max=None,
true_min: Optional[float] = None,
true_max: Optional[float] = None,
seed: Union[int, np.random.RandomState, np.random.Generator, None] = 42,
):
"""
Initialize the model system.

Parameters
----------
* `score` [Callable]:
Function for calculating the noiseless score of the system at a given
point in the parameter space.

* `space` [List or Space]:
A list of dimension definitions or the parameter space as a Space object.

* `noise_model` [str, dict, or NoiseModel]:
Noise model to apply to the score.
If str, it should be the name of the noise model type. In this case,
further arguments can be given (e.g. `noise_size`).
If dict, one key should be `model_type`.
If NoiseModel, this NoiseModel will be used.

Possible model type strings are:
"constant": The noise level is constant.
"proportional": Tne noise level is proportional to the score.
"zero": No noise is applied.

* `true_min` [float]:
The true minimum value of the score function within the parameter space. If
not given, it will be estimated by evaluating the score function at a set of
points in the parameter space.

* `true_max` [float]:
The true maximum value of the score function within the parameter space. If
not given, it will be estimated by evaluating the score function at a set of
points in the parameter space.

* `seed` [int, RandomState, Generator, or None]:
Seed for the random number generator. If None, the ModelSystem will give
random results, otherwise the results will be deterministic. Default behavior
is deterministic.
"""
self.score = score
self.space = space_factory(space)
self.noise_model = parse_noise_model(noise_model)
self.noise_model = parse_noise_model(noise_model, seed=seed)
if true_min is None:
ndims = self.space.n_dims
points = self.space.lhs(
@@ -131,6 +161,21 @@ def set_noise_model(self, noise_model: Union[str, dict, NoiseModel, None]):
"""
self.noise_model = parse_noise_model(noise_model)

def copy(self):
"""
Returns a copy of the model system.

The RNG of the copy will be a spawn of the original's, that is, different in a
deterministic way.
"""
return self.__class__(
score=self.score,
space=self.space,
noise_model=self.noise_model.copy(),
true_min=self.true_min,
true_max=self.true_max,
)

@property
def noise_size(self):
return self.noise_model.noise_size
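
A short sketch of the new seeding and copy behaviour, assuming ModelSystem is importable
from ProcessOptimizer.model_systems and that the "constant" noise model name is
accepted, as in the hart3/hart6 factories. The score function below is illustrative,
not from this PR.

from ProcessOptimizer.model_systems import ModelSystem  # assumed import path


def parabola(x):
    # Illustrative score function with its minimum at x = 0.3.
    return (x[0] - 0.3) ** 2


# With a fixed seed the noise is deterministic; true_min/true_max are estimated from an
# LHS sample of the space when not given.
system = ModelSystem(parabola, [(0.0, 1.0)], noise_model="constant", seed=7)

# copy() spawns the RNG, so the copy's noise differs from the original's in a
# deterministic way.
clone = system.copy()
print(system.get_score([0.5]), clone.get_score([0.5]))
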