We present the Cooperative Imitation Learning (CIL) method, which learns the optimal dynamic treatment regime by exploiting information from both positive and negative trajectories.

IhteshamShah/Cooperative-Imitation-Learning-CIL-

Cooperative-Imitation-Learning-CIL

Dynamic Treatment Regimes (DTRs) are sets of sequential decision rules that can be adapted over time to treat patients with a specific pathology. A DTR consists of alternative treatment paths, and any of these treatments can be adapted depending on the patient’s characteristics. Reinforcement Learning (RL) and Imitation Learning (IL) approaches have been deployed to obtain the optimal treatment for a patient, but these approaches rely only on positive trajectories (i.e., treatments that concluded with a positive response from the patient). Negative trajectories (i.e., samples of non-responding treatments) are discarded, although they carry valuable information. We propose a Cooperative Imitation Learning (CIL) method that exploits information from both negative and positive trajectories to learn the optimal DTR. The proposed method reduces the chance of selecting a treatment that results in a negative outcome (a negative response from the patient) during the medical examination. To validate our approach, we consider a well-known DTR defined for the treatment of patients with alcohol addiction. Results show that our approach outperforms those that rely only on positive trajectories.

Installation

The code is written in MATLAB R2015b. If you don't have MATLAB installed, you can find it here. If you are using a newer version of MATLAB, you may need to modify the code accordingly.

Discussion

(i) System Model

The AMC module consists of three stages:
(i) Feature Extractor
(ii) Heuristic Optimizer
(iii) ELM Classifier
A random signal x(n) is generated on the transmission side; after modulation, it is passed through a pre-defined channel (AWGN or Rayleigh) at a pre-specified SNR value. The signal sensed on the receiving side is denoted r(n).
The transmitter encodes sequences of randomly generated bits into continuous signal patterns by selecting the appropriate symbol alphabets. During transmission through the considered channel at the pre-specified SNR value, the signal is corrupted by noise. At the receiver side, noise components are first removed from the received signal in the pre-processing stage; the signal is then fed to the AMC module for further processing.
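The transmit-and-corrupt step described above can be sketched as follows. This is only an illustration in Python (the repository code itself is MATLAB): it assumes QPSK as the modulation and models the AWGN channel only, and the function names `qpsk_modulate` and `awgn` are hypothetical, not taken from the repository.

```python
import math
import random


def qpsk_modulate(bits):
    """Map pairs of bits to unit-energy QPSK symbols (bit 0 -> +, bit 1 -> -)."""
    if len(bits) % 2:
        raise ValueError("QPSK needs an even number of bits")
    scale = 1 / math.sqrt(2)
    return [
        complex((1 - 2 * bits[i]) * scale, (1 - 2 * bits[i + 1]) * scale)
        for i in range(0, len(bits), 2)
    ]


def awgn(signal, snr_db, rng):
    """Add complex white Gaussian noise so the average SNR equals snr_db."""
    signal_power = sum(abs(s) ** 2 for s in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power / 2)  # std. dev. per real/imaginary part
    return [
        s + complex(rng.gauss(0, sigma), rng.gauss(0, sigma)) for s in signal
    ]


rng = random.Random(0)                            # fixed seed for repeatability
bits = [rng.randint(0, 1) for _ in range(1024)]   # random source bits
x = qpsk_modulate(bits)                           # transmitted signal x(n)
r = awgn(x, snr_db=0, rng=rng)                    # received signal r(n) at 0 dB
```

A Rayleigh channel would additionally multiply each symbol by a random complex fading gain before the noise is added; that part is left out of this sketch.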

[Image: Prediction]

The first stage in the AMC module is feature extraction, where a Gabor filter is used to extract features for classifying the considered digital modulation schemes.
The Gabor features extracted in the previous step are further optimized with the Cuckoo Search Algorithm (CSA).
In the final step, an Extreme Learning Machine (ELM) is used to classify the modulation schemes.

(ii) Flow Chart

[Image: Prediction]

The flow diagram in the figure above depicts the step-wise methodology of the algorithm. The three core modules of the proposed system, i.e., Gabor, CSA, and ELM, are shown in parallel to each other. The Gabor feature extraction module extracts Gabor features (c, σ, f, w) from a randomly generated signal passed through either of the two channels.
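As a rough illustration of Gabor feature extraction, the sketch below correlates a signal with a single Gabor atom (a Gaussian envelope times a complex sinusoid) and uses the magnitude of the result as one feature. It assumes that c is the centre of the envelope, σ its width, and f a normalized frequency; the role of w is not stated here, so it is left out. The names `gabor_atom` and `gabor_feature` are hypothetical, and the repository's MATLAB code may compute its features differently.

```python
import cmath
import math


def gabor_atom(length, c, sigma, f):
    """Gaussian envelope centred at c (width sigma) times a tone at frequency f."""
    return [
        math.exp(-((n - c) ** 2) / (2 * sigma ** 2))
        * cmath.exp(2j * math.pi * f * (n - c))
        for n in range(length)
    ]


def gabor_feature(signal, c, sigma, f):
    """Magnitude of the inner product between the signal and one Gabor atom."""
    atom = gabor_atom(len(signal), c, sigma, f)
    return abs(sum(s * a.conjugate() for s, a in zip(signal, atom)))


# Toy input: a pure complex tone at normalized frequency 0.1.
tone = [cmath.exp(2j * math.pi * 0.1 * n) for n in range(128)]

matched = gabor_feature(tone, c=64, sigma=10, f=0.1)     # atom tuned to the tone
mismatched = gabor_feature(tone, c=64, sigma=10, f=0.3)  # atom tuned elsewhere
```

The feature is large when the atom's frequency matches the signal and close to zero otherwise, which is what makes a bank of such atoms useful for telling modulation schemes apart.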
The extracted Gabor features are distinct, but to achieve better classification accuracy they are further optimized by CSA using the fitness function. The best solution, i.e., the one with maximum fitness, is then fed to the ELM classifier. The ELM classifier (already trained according to reference values) then makes a decision about the modulation class.
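The optimization step can be illustrated with a simplified cuckoo search that maximizes a fitness function. This is a minimal sketch, not the repository's implementation: the actual fitness function is not given here, so a toy one (negative squared distance from the origin) stands in for it, and all parameter values (`n_nests`, `pa`, `alpha`, `iters`) are assumptions.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step length using Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    return rng.gauss(0, sigma_u) / abs(rng.gauss(0, 1)) ** (1 / beta)


def clip(x, lo, hi):
    """Keep every coordinate inside the search bounds."""
    return [min(max(xi, lo), hi) for xi in x]


def cuckoo_search(fitness, dim, lo, hi, n_nests=15, pa=0.25,
                  alpha=0.01, iters=200, seed=0):
    rng = random.Random(seed)
    nests = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_nests)]
    scores = [fitness(x) for x in nests]
    best_i = max(range(n_nests), key=scores.__getitem__)
    best, best_score = nests[best_i][:], scores[best_i]
    history = [best_score]
    for _ in range(iters):
        for i in range(n_nests):
            # Levy flight relative to the current best nest.
            cand = clip([xi + alpha * levy_step(rng) * (xi - bi)
                         for xi, bi in zip(nests[i], best)], lo, hi)
            score = fitness(cand)
            if score > scores[i]:
                nests[i], scores[i] = cand, score
        for i in range(n_nests):
            # With probability pa, try to replace a nest using two random nests.
            if rng.random() < pa:
                a, b = rng.sample(nests, 2)
                cand = clip([xi + rng.random() * (ai - bi)
                             for xi, ai, bi in zip(nests[i], a, b)], lo, hi)
                score = fitness(cand)
                if score > scores[i]:
                    nests[i], scores[i] = cand, score
        best_i = max(range(n_nests), key=scores.__getitem__)
        if scores[best_i] > best_score:
            best, best_score = nests[best_i][:], scores[best_i]
        history.append(best_score)
    return best, best_score, history


def toy_fitness(x):
    """Hypothetical stand-in fitness: highest (zero) at the origin."""
    return -sum(xi * xi for xi in x)


best, best_score, history = cuckoo_search(toy_fitness, dim=2, lo=-5.0, hi=5.0)
```

In the pipeline described above, the candidate vectors would be Gabor feature parameters, and `best` would be the solution passed on to the ELM classifier.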

(iii) Results

[Image: Prediction]

The table shows the percentage classification accuracy (PCA) of our proposed CSA-ELM classifier for different variants of PSK, FSK and QAM, for sample sizes of 512 and 1024 at 0 dB SNR over the Rayleigh channel. We ran 1000 trials of the ELM, and the calculated results are shown in the respective tables. Almost all the modulation schemes are classified with an accuracy of ~99% at 512 samples, which rises to ~100% at 1024 samples.

Deployment

To deploy this project, install Python 3.2 or higher.

Acknowledgements
