Cache block fix #5

Open
wants to merge 74 commits into
base: master
74 commits
7822d91
Parallel Qiskit Aer (GPU + MPI) by using cache blocking transpiler (#…
doichanj Feb 4, 2021
b475f19
Adding gates to the MPS simulator (#1088)
yaelbh Feb 4, 2021
f869a36
adding qutip copyright to mc controller (#1124)
DanPuzzuoli Feb 10, 2021
d9d4593
Fix numpy ABI incompatibility when building with numpy 1.20 (#1125)
vvilpas Feb 10, 2021
cf8edab
Add new save expectation value instructions (#1101)
chriseclectic Feb 10, 2021
600806c
Add ``SaveStatevector`` and ``SaveDensityMatrix`` instructions (#1116)
chriseclectic Feb 11, 2021
d58737a
Add `SaveProbabilities` and `SaveProbabilitiesDict` instructions (#1117)
chriseclectic Feb 11, 2021
09383a8
Fix cache blocking diagonal matrix
doichanj Feb 16, 2021
c2908f9
fix blocking diagonal matrix
doichanj Feb 17, 2021
998d532
Pass CMAKE_GENERATOR_PLATFORM through scikit_build in win32 builds (#…
vvilpas Feb 17, 2021
424ae67
correct block bits after cache block transpiler
doichanj Feb 18, 2021
63e33d3
Add tests for SaveExpectationValueVariance (#1140)
chriseclectic Feb 18, 2021
43a150c
Add `SaveAmplitudes` and `SaveAmplitudesSquared` instructions. (#1129)
chriseclectic Feb 18, 2021
b274872
No numpy install before CMake runs (#1142)
vvilpas Feb 18, 2021
2da63a2
Migrate windows CI to all be in github actions (#1137)
mtreinish Feb 18, 2021
940a8a6
Add _directive attr to SaveData and Snapshot instruction (#1139)
chriseclectic Feb 18, 2021
a5e9f5c
fix save_density_matrix
doichanj Feb 19, 2021
47dba5f
bit scaling of matrix state is not multiplied to num_qubits_ and chun…
doichanj Feb 19, 2021
a0da338
fix MPI compilation
doichanj Feb 19, 2021
185b28e
change MPI tests to multi chunk tests
doichanj Feb 19, 2021
6e34272
Merge remote-tracking branch 'upstream/master' into cache-block-fix
doichanj Feb 19, 2021
12e20d3
modify contribution document
doichanj Feb 19, 2021
70dc825
Implemented save_amplitudes
doichanj Feb 22, 2021
1415180
added MPI support for save_amplitudes
doichanj Feb 22, 2021
32be341
Start updating tests to use configurable simulator backend (#1150)
chriseclectic Feb 22, 2021
2e777bd
explicitly avoid omp use for experiments in serial execution (#1147)
hhorii Feb 22, 2021
0ece80f
Add SaveUnitary and SaveStabilizer instructions (#1136)
chriseclectic Feb 22, 2021
a2f29ee
Fix noise sampling for conditional gates (#1154)
chriseclectic Feb 24, 2021
5923392
Extended stabilizer simulator expval command (#1121)
gadial Feb 24, 2021
5b5658c
Pass some json utils function args by const reference (#1151)
vvilpas Feb 24, 2021
1a6d5df
change default max_memory_mb from half to full system memory (#1152)
hhorii Feb 25, 2021
9c30458
small fix for CI test
doichanj Feb 25, 2021
09fd95d
debug for statevector chunk state
doichanj Feb 25, 2021
6dd4708
Fix bug in StatevectorChunk::State::vec2density
doichanj Feb 25, 2021
9749e0f
delete debug message
doichanj Feb 25, 2021
666e99f
Merge remote-tracking branch 'upstream/master' into cache-block-fix
doichanj Feb 25, 2021
96e7cf0
merged with upstream/master
doichanj Feb 25, 2021
756af11
Fix multi-chunk diagonal matrix (#1155)
doichanj Feb 25, 2021
bf92091
resolve conflict
doichanj Feb 26, 2021
857f093
fix merge failure
doichanj Feb 26, 2021
4a41d44
fix again
doichanj Feb 26, 2021
1063501
Add arm64 release wheel jobs (#1162)
mtreinish Feb 26, 2021
892a6fd
Add default save instruction labels (#1161)
chriseclectic Feb 26, 2021
cc7b1af
Add pending deprecation warnings to snapshots (#1158)
chriseclectic Mar 2, 2021
dffb1d7
Fix out of bounds array access. (#1167)
vvilpas Mar 2, 2021
71c940b
Fix memory_leak due to shared_ptr circular references. (#1168)
vvilpas Mar 2, 2021
493b558
prepare for merge upstream
doichanj Mar 3, 2021
57591ee
merge upstream
doichanj Mar 3, 2021
56b44f4
remove debug message
doichanj Mar 3, 2021
bdb7718
remove comparing weak_ptr with nullptr
doichanj Mar 3, 2021
7d1f5b8
remove reset to weak_ptr
doichanj Mar 3, 2021
7267429
Fixed bug in sample_measure_using_probabilities (#1132)
merav-aharoni Mar 3, 2021
43d8307
Merge branch 'master' into cache-block-fix
vvilpas Mar 3, 2021
f8f3bb7
reflect review comments
doichanj Mar 4, 2021
12d18bc
Merge remote-tracking branch 'refs/remotes/origin/cache-block-fix' in…
doichanj Mar 4, 2021
ae36d65
fix expval_pauli for density matrix
doichanj Mar 4, 2021
90831eb
Fix density matrix expval_pauli (#1171)
chriseclectic Mar 4, 2021
5c2f507
Disable all warnings emitted from thrust headers (#1169)
vvilpas Mar 4, 2021
ba56322
merge upstream/master
doichanj Mar 5, 2021
11c5b8a
Remove previously deprecated methods (#1160)
chriseclectic Mar 5, 2021
5a72c02
Fix grammar, capitalization, text inconsistencies (#900)
RafeyIqbalRahman Mar 5, 2021
feb9fb2
Update README.md to mention Linux-only GPU support (#1095)
amirebrahimi Mar 5, 2021
3283d37
Fix density matrix chunk expval_pauli
doichanj Mar 8, 2021
07cc9b8
Fix statevector chunk expval_pauli
doichanj Mar 8, 2021
19e846c
Merge branch 'master' into cache-block-fix
doichanj Mar 8, 2021
7fe4f9e
Fix expval tests (#1173)
chriseclectic Mar 8, 2021
d69f7e9
Fix extended stabilizer method basis gates (#1175)
chriseclectic Mar 9, 2021
3d2575a
Update CODEOWNERS (#1174)
chriseclectic Mar 9, 2021
a41d5c7
Merge branch 'master' into cache-block-fix
chriseclectic Mar 9, 2021
46b24a6
Merge remote-tracking branch 'upstream/master' into cache-block-fix
doichanj Mar 10, 2021
af41169
Merge branch 'cache-block-fix' of github.com:doichanj/qiskit-aer into…
doichanj Mar 10, 2021
acd216d
Fixes of multi-chunk State implementation (#1149)
doichanj Mar 10, 2021
1994b35
Add Fusion variations (#1110)
hhorii Mar 10, 2021
58d44d1
Merge remote-tracking branch 'upstream/master' into cache-block-fix
doichanj Mar 11, 2021
reflect review comments
doichanj committed Mar 4, 2021

commit f8f3bb70f39ca7e97453fb8ba06b63204dbee956
30 changes: 30 additions & 0 deletions src/controllers/controller.hpp
@@ -51,6 +51,7 @@
#include "noise/noise_model.hpp"
#include "transpile/basic_opts.hpp"
#include "transpile/truncate_qubits.hpp"
#include "transpile/cacheblocking.hpp"

namespace AER {
namespace Base {
@@ -221,6 +222,14 @@ class Controller {

void save_exception_to_results(Result &result,const std::exception &e);


//setting cache blocking transpiler
Transpile::CacheBlocking transpile_cache_blocking(const Circuit& circ,
const Noise::NoiseModel& noise,
const json_t& config,
const size_t complex_size,bool is_matrix) const;


// Get system memory size
size_t get_system_memory_mb();
size_t get_gpu_memory_mb();
@@ -679,6 +688,27 @@ void Controller::save_exception_to_results(Result &result,const std::exception &
}
}

Transpile::CacheBlocking Controller::transpile_cache_blocking(const Circuit& circ,
const Noise::NoiseModel& noise,
const json_t& config,
const size_t complex_size,bool is_matrix) const
{
Transpile::CacheBlocking cache_block_pass;

cache_block_pass.set_config(config);
if(!cache_block_pass.enabled()){
//if blocking is not set by config, automatically set if required
if(multiple_chunk_required(circ,noise)){
int nplace = num_process_per_experiment_;
if(num_gpus_ > 0)
nplace *= num_gpus_;
cache_block_pass.set_blocking(circ.num_qubits, get_min_memory_mb() << 20, nplace, complex_size,is_matrix);
}
}

return cache_block_pass;
}
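
Review note: the body of `set_blocking` is not part of this diff, so the following is only a sketch of the arithmetic behind the arguments the new base-class function passes to it. The helper names `required_bytes`, `num_places`, and `mb_to_bytes` are hypothetical and do not exist in Aer; the sketch assumes a state vector holds 2^n amplitudes and a matrix state (density matrix or unitary) holds 2^(2n).

```cpp
#include <complex>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of Aer: bytes needed to hold the whole state.
// A state vector stores 2^n amplitudes; a matrix state stores 2^(2n).
// Assumes the resulting shift stays well below 64 bits.
inline std::uint64_t required_bytes(int num_qubits, std::size_t complex_size,
                                    bool is_matrix) {
  const int bits = is_matrix ? 2 * num_qubits : num_qubits;
  return static_cast<std::uint64_t>(complex_size) << bits;
}

// Hypothetical helper mirroring how nplace is computed in the diff:
// processes per experiment, multiplied by the GPU count when GPUs are present.
inline int num_places(int num_process_per_experiment, int num_gpus) {
  int nplace = num_process_per_experiment;
  if (num_gpus > 0)
    nplace *= num_gpus;
  return nplace;
}

// Megabytes to bytes, the same shift the diff applies to get_min_memory_mb().
inline std::uint64_t mb_to_bytes(std::uint64_t mb) { return mb << 20; }
```

Under these assumptions a 30-qubit double-precision state vector and a 15-qubit double-precision density matrix both need 16 GiB, which is why `is_matrix` has to be supplied by the caller rather than guessed inside the base class.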

//-------------------------------------------------------------------------
// Qobj execution
//-------------------------------------------------------------------------
51 changes: 8 additions & 43 deletions src/controllers/qasm_controller.hpp
@@ -215,11 +215,6 @@ class QasmController : public Base::Controller {
const Operations::OpSet &opset,
const json_t& config) const;


Transpile::CacheBlocking transpile_cache_blocking(const Circuit& circ,
const Noise::NoiseModel& noise,
const json_t& config) const;

//----------------------------------------------------------------
// Run circuit helpers
//----------------------------------------------------------------
@@ -930,42 +925,6 @@ Transpile::Fusion QasmController::transpile_fusion(Method method,
return fusion_pass;
}

Transpile::CacheBlocking QasmController::transpile_cache_blocking(const Circuit& circ,
const Noise::NoiseModel& noise,
const json_t& config) const
{
Transpile::CacheBlocking cache_block_pass;

cache_block_pass.set_config(config);
if(!cache_block_pass.enabled()){
//if blocking is not set by config, automatically set if required
if(Base::Controller::multiple_chunk_required(circ,noise)){
int nplace = Base::Controller::num_process_per_experiment_;
if(Base::Controller::num_gpus_ > 0)
nplace *= Base::Controller::num_gpus_;

size_t complex_size = (simulation_precision_ == Precision::single_precision) ? sizeof(std::complex<float>) : sizeof(std::complex<double>);

switch (simulation_method(circ, noise, false)) {
case Method::statevector:
case Method::statevector_thrust_cpu:
case Method::statevector_thrust_gpu:
cache_block_pass.set_blocking(circ.num_qubits, Base::Controller::get_min_memory_mb() << 20, nplace, complex_size,false);
break;
case Method::density_matrix:
case Method::density_matrix_thrust_cpu:
case Method::density_matrix_thrust_gpu:
cache_block_pass.set_blocking(circ.num_qubits, Base::Controller::get_min_memory_mb() << 20, nplace, complex_size,true);
break;
default:
throw std::runtime_error("QasmController: No enough memory to simulate this method on the sysytem");
}
}
}

return cache_block_pass;
}

void QasmController::set_parallelization_circuit(
const Circuit& circ,
const Noise::NoiseModel& noise_model) {
@@ -1140,7 +1099,10 @@ void QasmController::run_circuit_helper(const Circuit& circ,
auto fusion_pass = transpile_fusion(method, opt_circ.opset(), config);
fusion_pass.optimize_circuit(opt_circ, dummy_noise, state.opset(), result);

auto cache_block_pass = transpile_cache_blocking(opt_circ,noise,config);
bool is_matrix = false;
if(method == Method::density_matrix || method == Method::density_matrix_thrust_gpu || method == Method::density_matrix_thrust_cpu)
is_matrix = true;
auto cache_block_pass = transpile_cache_blocking(opt_circ,noise,config,(simulation_precision_ == Precision::single_precision) ? sizeof(std::complex<float>) : sizeof(std::complex<double>),is_matrix);
cache_block_pass.optimize_circuit(opt_circ, dummy_noise, state.opset(), result);

uint_t block_bits = 0;
@@ -1218,7 +1180,10 @@ void QasmController::run_circuit_with_sampled_noise(const Circuit& circ,
measure_pass.set_config(config);
Noise::NoiseModel dummy_noise;

auto cache_block_pass = transpile_cache_blocking(circ,noise,config);
bool is_matrix = false;
if(method == Method::density_matrix || method == Method::density_matrix_thrust_gpu || method == Method::density_matrix_thrust_cpu)
is_matrix = true;
auto cache_block_pass = transpile_cache_blocking(circ,noise,config,(simulation_precision_ == Precision::single_precision) ? sizeof(std::complex<float>) : sizeof(std::complex<double>),is_matrix);

// Sample noise using circuit method
while (shots-- > 0) {
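
Review note: both call sites in this file repeat the same `is_matrix` test and the same precision ternary. One possible follow-up, sketched here with stand-in enums, is to factor them into small helpers. The names `is_density_matrix_method` and `complex_size_for` are hypothetical, and the enum definitions below only mimic the enumerators visible in this diff (plus an assumed `double_precision`); the real ones live in Aer.

```cpp
#include <complex>
#include <cstddef>

// Stand-ins for the enums used in the diff (assumption: the real ones
// contain more enumerators than listed here).
enum class Method {
  statevector,
  statevector_thrust_gpu,
  statevector_thrust_cpu,
  density_matrix,
  density_matrix_thrust_gpu,
  density_matrix_thrust_cpu
};
enum class Precision { single_precision, double_precision };

// Hypothetical helper: true for the three density-matrix methods
// that the call sites compare against.
inline bool is_density_matrix_method(Method method) {
  return method == Method::density_matrix ||
         method == Method::density_matrix_thrust_gpu ||
         method == Method::density_matrix_thrust_cpu;
}

// Hypothetical helper: size in bytes of one complex amplitude
// for the selected simulation precision.
inline std::size_t complex_size_for(Precision precision) {
  return precision == Precision::single_precision
             ? sizeof(std::complex<float>)
             : sizeof(std::complex<double>);
}
```

With helpers like these, a call site could read `transpile_cache_blocking(circ, noise, config, complex_size_for(simulation_precision_), is_density_matrix_method(method))`, keeping the two decisions in one place.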
13 changes: 1 addition & 12 deletions src/controllers/statevector_controller.hpp
@@ -355,18 +355,7 @@ void StatevectorController::run_circuit_helper(
fusion_pass.optimize_circuit(opt_circ, dummy_noise, state.opset(), result);
}

Transpile::CacheBlocking cache_block_pass;
cache_block_pass.set_config(config);
if(!cache_block_pass.enabled()){
//if blocking is not set by config, automatically set if required
if(Base::Controller::multiple_chunk_required(opt_circ,noise)){
int nplace = Base::Controller::num_process_per_experiment_;
if(Base::Controller::num_gpus_ > 0)
nplace *= Base::Controller::num_gpus_;
size_t complex_size = (precision_ == Precision::single_precision) ? sizeof(std::complex<float>) : sizeof(std::complex<double>);
cache_block_pass.set_blocking(circ.num_qubits, Base::Controller::get_min_memory_mb() << 20, nplace, complex_size,false);
}
}
Transpile::CacheBlocking cache_block_pass = transpile_cache_blocking(opt_circ,dummy_noise,config,(precision_ == Precision::single_precision) ? sizeof(std::complex<float>) : sizeof(std::complex<double>),false);
cache_block_pass.optimize_circuit(opt_circ, dummy_noise, state.opset(), result);

uint_t block_bits = 0;
13 changes: 1 addition & 12 deletions src/controllers/unitary_controller.hpp
Original file line number Diff line number Diff line change
@@ -356,18 +356,7 @@ void UnitaryController::run_circuit_helper(
fusion_pass.optimize_circuit(opt_circ, dummy_noise, state.opset(), result);
}

Transpile::CacheBlocking cache_block_pass;
cache_block_pass.set_config(config);
if(!cache_block_pass.enabled()){
//if blocking is not set by config, automatically set if required
if(Base::Controller::multiple_chunk_required(opt_circ,noise)){
int nplace = Base::Controller::num_process_per_experiment_;
if(Base::Controller::num_gpus_ > 0)
nplace *= Base::Controller::num_gpus_;
size_t complex_size = (precision_ == Precision::single_precision) ? sizeof(std::complex<float>) : sizeof(std::complex<double>);
cache_block_pass.set_blocking(circ.num_qubits, Base::Controller::get_min_memory_mb() << 20, nplace, complex_size,true);
}
}
Transpile::CacheBlocking cache_block_pass = transpile_cache_blocking(opt_circ,dummy_noise,config,(precision_ == Precision::single_precision) ? sizeof(std::complex<float>) : sizeof(std::complex<double>),true);
cache_block_pass.optimize_circuit(opt_circ, dummy_noise, state.opset(), result);

uint_t block_bits = 0;
39 changes: 32 additions & 7 deletions src/simulators/density_matrix/densitymatrix_state_chunk.hpp
Original file line number Diff line number Diff line change
@@ -30,6 +30,32 @@
namespace AER {
namespace DensityMatrixChunk {

using OpType = Operations::OpType;

// OpSet of supported instructions
const Operations::OpSet StateOpSet(
// Op types
{OpType::gate, OpType::measure,
OpType::reset, OpType::snapshot,
OpType::barrier, OpType::bfunc,
OpType::roerror, OpType::matrix,
OpType::diagonal_matrix, OpType::kraus,
OpType::superop, OpType::save_expval,
OpType::save_expval_var, OpType::save_densmat,
OpType::save_probs, OpType::save_probs_ket,
OpType::save_amps_sq
},
// Gates
{"U", "CX", "u1", "u2", "u3", "u", "cx", "cy", "cz",
"swap", "id", "x", "y", "z", "h", "s", "sdg", "t",
"tdg", "ccx", "r", "rx", "ry", "rz", "rxx", "ryy", "rzz",
"rzx", "p", "cp", "cu1", "sx", "x90", "delay", "pauli"},
// Snapshots
{"density_matrix", "memory", "register", "probabilities",
"probabilities_with_variance", "expectation_value_pauli",
"expectation_value_pauli_with_variance"});


//=========================================================================
// DensityMatrix State subclass
//=========================================================================
@@ -39,7 +65,7 @@ class State : public Base::StateChunk<densmat_t> {
public:
using BaseState = Base::StateChunk<densmat_t>;

State() : BaseState(DensityMatrix::StateOpSet) {}
State() : BaseState(StateOpSet) {}
virtual ~State() {}

//-----------------------------------------------------------------------
@@ -446,9 +472,8 @@ auto State<densmat_t>::apply_to_matrix(bool copy)
//TO DO check memory availability
matrix.resize(1ull << (BaseState::num_qubits_),1ull << (BaseState::num_qubits_));

auto recv = BaseState::qregs_[0].copy_to_matrix();

#ifdef AER_MPI
auto recv = BaseState::qregs_[0].copy_to_matrix();
//gather states from other processes
for(iChunk=BaseState::num_local_chunks_;iChunk<BaseState::num_global_chunks_;iChunk++){
BaseState::recv_data(recv.data(),size,0,iChunk);
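
Review note: this hunk moves the `recv` buffer inside the `#ifdef AER_MPI` region, presumably because the buffer is only used when gathering chunks from other processes. The sketch below illustrates the effect with a stand-in type; `ChunkSketch`, `gather_sketch`, and the macro `SKETCH_MPI` are hypothetical names chosen for illustration and are not Aer identifiers.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a chunk register; counts how often it is copied.
struct ChunkSketch {
  std::vector<double> values;
  mutable std::size_t copies = 0;

  std::vector<double> copy_to_matrix() const {
    ++copies;
    return values;
  }
};

// Gathers the local chunk into `out`. The receive buffer is created only
// when the hypothetical SKETCH_MPI macro is defined, so a build without it
// never pays for the extra copy.
inline void gather_sketch(const ChunkSketch& local, std::vector<double>& out) {
  out = local.values;
#ifdef SKETCH_MPI
  auto recv = local.copy_to_matrix();  // buffer for data from other processes
  // A real MPI build would receive remote chunks into recv before appending.
  out.insert(out.end(), recv.begin(), recv.end());
#endif
}
```

Declaring the buffer outside the conditional region would make a full copy of the first chunk on every call, even in builds that never use it.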
@@ -849,10 +874,10 @@ void State<densmat_t>::apply_snapshot(const Operations::Op &op,
snapshot_pauli_expval(op, result, true);
} break;
/* TODO
case DensityMatrix::Snapshots::expval_matrix: {
case Snapshots::expval_matrix: {
snapshot_matrix_expval(op, data, false);
} break;
case DensityMatrix::Snapshots::expval_matrix_var: {
case Snapshots::expval_matrix_var: {
snapshot_matrix_expval(op, data, true);
} break;
*/
@@ -947,8 +972,8 @@ cmatrix_t State<densmat_t>::reduced_density_matrix(const reg_t& qubits, bool las
return reduced_state;
}

template <class statevec_t>
cmatrix_t State<statevec_t>::reduced_density_matrix_helper(const reg_t &qubits,
template <class densmat_t>
cmatrix_t State<densmat_t>::reduced_density_matrix_helper(const reg_t &qubits,
const reg_t &qubits_sorted)
{
// Get superoperator qubits
5 changes: 0 additions & 5 deletions src/simulators/statevector/qubitvector_thrust.hpp
Original file line number Diff line number Diff line change
@@ -958,11 +958,6 @@ bool QubitVectorThrust<data_t>::fetch_chunk(void) const
int tid,nid;
int idev;

// tid = omp_get_thread_num();
// nid = omp_get_num_threads();

// idev = tid * chunk_manager_.num_devices() / nid;

if(chunk_->device() < 0){
//on host
idev = 0;
37 changes: 36 additions & 1 deletion src/simulators/statevector/statevector_state_chunk.hpp
Original file line number Diff line number Diff line change
@@ -33,6 +33,41 @@
namespace AER {
namespace StatevectorChunk {

using OpType = Operations::OpType;

// OpSet of supported instructions
const Operations::OpSet StateOpSet(
// Op types
{OpType::gate, OpType::measure,
OpType::reset, OpType::initialize,
OpType::snapshot, OpType::barrier,
OpType::bfunc, OpType::roerror,
OpType::matrix, OpType::diagonal_matrix,
OpType::multiplexer, OpType::kraus,
OpType::sim_op, OpType::save_expval,
OpType::save_expval_var, OpType::save_densmat,
OpType::save_probs, OpType::save_probs_ket,
OpType::save_amps, OpType::save_amps_sq,
OpType::save_statevec
// OpType::save_statevec_ket // TODO
},
// Gates
{"u1", "u2", "u3", "u", "U", "CX", "cx", "cz",
"cy", "cp", "cu1", "cu2", "cu3", "swap", "id", "p",
"x", "y", "z", "h", "s", "sdg", "t", "tdg",
"r", "rx", "ry", "rz", "rxx", "ryy", "rzz", "rzx",
"ccx", "cswap", "mcx", "mcy", "mcz", "mcu1", "mcu2", "mcu3",
"mcswap", "mcphase", "mcr", "mcrx", "mcry", "mcrz", "sx", "csx",
"mcsx", "delay", "pauli", "mcx_gray"},
// Snapshots
{"statevector", "memory", "register", "probabilities",
"probabilities_with_variance", "expectation_value_pauli", "density_matrix",
"density_matrix_with_variance", "expectation_value_pauli_with_variance",
"expectation_value_matrix_single_shot", "expectation_value_matrix",
"expectation_value_matrix_with_variance",
"expectation_value_pauli_single_shot"});


//=========================================================================
// QubitVector State subclass
//=========================================================================
@@ -42,7 +77,7 @@ class State : public Base::StateChunk<statevec_t> {
public:
using BaseState = Base::StateChunk<statevec_t>;

State() : BaseState(Statevector::StateOpSet) {}
State() : BaseState(StateOpSet) {}

//-----------------------------------------------------------------------
// Base class overrides
22 changes: 19 additions & 3 deletions src/simulators/unitary/unitary_state_chunk.hpp
Original file line number Diff line number Diff line change
@@ -30,6 +30,23 @@
namespace AER {
namespace QubitUnitaryChunk {

// OpSet of supported instructions
const Operations::OpSet StateOpSet(
// Op types
{Operations::OpType::gate, Operations::OpType::barrier,
Operations::OpType::matrix, Operations::OpType::diagonal_matrix,
Operations::OpType::snapshot, Operations::OpType::save_unitary},
// Gates
{"u1", "u2", "u3", "u", "U", "CX", "cx", "cz",
"cy", "cp", "cu1", "cu2", "cu3", "swap", "id", "p",
"x", "y", "z", "h", "s", "sdg", "t", "tdg",
"r", "rx", "ry", "rz", "rxx", "ryy", "rzz", "rzx",
"ccx", "cswap", "mcx", "mcy", "mcz", "mcu1", "mcu2", "mcu3",
"mcswap", "mcphase", "mcr", "mcrx", "mcry", "mcrz", "sx", "csx",
"mcsx", "delay", "pauli"},
// Snapshots
{"unitary"});

//=========================================================================
// QubitUnitary State subclass
//=========================================================================
@@ -39,7 +56,7 @@ class State : public Base::StateChunk<unitary_matrix_t> {
public:
using BaseState = Base::StateChunk<unitary_matrix_t>;

State() : BaseState(QubitUnitary::StateOpSet) {}
State() : BaseState(StateOpSet) {}
virtual ~State() = default;

//-----------------------------------------------------------------------
@@ -345,9 +362,8 @@ auto State<unitary_matrix_t>::move_to_matrix()
//TO DO check memory availability
matrix.resize(1ull << (BaseState::num_qubits_),1ull << (BaseState::num_qubits_));

auto recv = BaseState::qregs_[0].copy_to_matrix();

#ifdef AER_MPI
auto recv = BaseState::qregs_[0].copy_to_matrix();
//gather states from other processes
for(iChunk=BaseState::num_local_chunks_;iChunk<BaseState::num_global_chunks_;iChunk++){
BaseState::recv_data(recv.data(),size,0,iChunk);
2 changes: 1 addition & 1 deletion src/transpile/cacheblocking.hpp
Original file line number Diff line number Diff line change
@@ -65,7 +65,7 @@ class CacheBlocking : public CircuitOptimization {
}

//setting blocking parameters automatically
void set_blocking(int bits, size_t min_memory, uint_t n_place, size_t complex_size = 16, bool is_matrix = false);
void set_blocking(int bits, size_t min_memory, uint_t n_place, const size_t complex_size, bool is_matrix = false);

protected:
mutable int block_bits_; //qubits less than this will be blocked
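
Review note: dropping the `complex_size = 16` default from `set_blocking` means a caller can no longer silently assume double precision. The sketch below contrasts the two signatures using hypothetical free functions (`chunk_bytes_defaulted` and `chunk_bytes_explicit`); neither exists in Aer, and the byte computation is only an illustration.

```cpp
#include <complex>
#include <cstddef>

// Hypothetical: with a default of 16, a caller that omits the argument
// gets a double-precision estimate even when simulating in single precision.
inline std::size_t chunk_bytes_defaulted(int bits, std::size_t complex_size = 16) {
  return complex_size << bits;
}

// Hypothetical: without the default, omitting the element size is a
// compile error, so every caller has to state its precision.
inline std::size_t chunk_bytes_explicit(int bits, const std::size_t complex_size) {
  return complex_size << bits;
}
```

For a 4-bit block, `chunk_bytes_defaulted(4)` yields 256 bytes, while a single-precision caller using `chunk_bytes_explicit(4, sizeof(std::complex<float>))` gets 128 bytes on common platforms, half the defaulted estimate.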