.. _MRS:

Multivariate Rejection Sampling
===============================

Multivariate Rejection Sampling allows you to quickly subsample the results of a
previous Monte Carlo simulation to obtain results as if one or more variables
had a different probability distribution, without having to re-run the simulation.

We will show you how to use the :class:`rocketpy.MultivariateRejectionSampler`
class to save time. It is highly recommended that you read about Monte
Carlo simulations first.

Motivation
----------

As discussed in :ref:`sensitivity-practical`, there are several sources of
uncertainty that can affect the flight of a rocket, notably the weather and
measurement errors in design parameters. Still, it is desirable that the flight
accomplishes its goal, for instance reaching a certain apogee, as well as satisfying
safety restrictions, such as ensuring that the landing point is outside
of a given area.

Monte Carlo simulation is a technique that allows us to quantify uncertainty and
give objective answers to those types of questions in terms of probabilities and
statistics. It relies on running several simulations under different conditions
specified by probability distributions provided by the user. Hence, depending on the
inputs and the number of samples, running these Monte Carlo simulations might take a while.

Now, imagine that you ran and saved the Monte Carlo simulations. Later, you need a new
Monte Carlo simulation with different probability distributions that are somewhat close
to the original ones. The first, straightforward option is to just re-run the
Monte Carlo with the new arguments, but this might be time consuming. A second option
is to use a sub-sampler that leverages the existing simulation to produce a new sample
that conforms to the new probability distributions. The latter completely avoids
re-running the simulations and is, therefore, much faster.

The Multivariate Rejection Sampler, or just MRS, is an algorithm that sub-samples the
original results using weights proportional to the ratio of the new to the old
probability densities of the parameters whose distributions have changed. The final
result has a smaller sample size, but its distribution matches the newly specified
one, without having to re-run the simulation.
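
To make the idea concrete, here is a minimal, self-contained sketch of this kind of
weighted rejection subsampling, using the same mass distributions that appear later in
this page; it illustrates the general technique, not RocketPy's exact implementation:

.. code-block:: python

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(42)

    # Stand-in for saved Monte Carlo inputs, drawn from the "old" distribution
    old_samples = rng.normal(15.426, 0.5, size=10_000)

    # Importance weights: ratio of new to old probability densities
    weights = norm(15, 0.5).pdf(old_samples) / norm(15.426, 0.5).pdf(old_samples)

    # Keep each sample with probability proportional to its weight
    accept = rng.uniform(size=old_samples.size) < weights / weights.max()
    new_samples = old_samples[accept]

    print(new_samples.mean())  # close to the new mean of 15

The accepted subset is smaller than the original sample, but it is approximately
distributed according to the new density.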

The time efficiency of the MRS is especially interesting in two scenarios: quick testing
and tight schedules. Imagine you have an initial design and ran a huge Monte
Carlo simulation, but you are also interested in minor variations of the original
design. Instead of having to run an expensive Monte Carlo for each of these variations,
you can just re-sample the original accordingly. As for tight schedules, it is not
unheard of for last-minute changes to be made to simulations. The MRS might
then allow you to quickly obtain Monte Carlo results for the new configuration when a
full simulation would take more time than available.

Importing and using the MRS
---------------------------

We now show how to actually use the :class:`rocketpy.MultivariateRejectionSampler`
class. We begin by importing it along with other utilities

.. jupyter-execute::

    from rocketpy import MultivariateRejectionSampler, MonteCarlo
    import numpy as np
    from scipy.stats import norm

The reference Monte Carlo simulation used is the one from the
"monte_carlo_class_usage.ipynb" notebook with 1000 samples. A
MultivariateRejectionSampler object is initialized with two file paths: one
for the prefix of the original Monte Carlo simulation, and one for the output of the
sub-samples. The code below defines these strings and initializes the MRS object


.. jupyter-execute::

    monte_carlo_filepath = (
        "notebooks/monte_carlo_analysis/monte_carlo_analysis_outputs/monte_carlo_class_example"
    )
    mrs_filepath = "notebooks/monte_carlo_analysis/monte_carlo_analysis_outputs/mrs"
    mrs = MultivariateRejectionSampler(
        monte_carlo_filepath=monte_carlo_filepath,
        mrs_filepath=mrs_filepath,
    )

Running a Monte Carlo simulation requires you to specify the distribution of
all parameters that have uncertainty. The MRS, however, only needs the previous and new
distributions of the parameters whose distribution changed. All other random parameters
in the original Monte Carlo simulation retain their original distribution.

In the original simulation, the mass of the rocket had a normal distribution with mean
:math:`15.426` and standard deviation of :math:`0.5`. Assume that the mean of this
distribution changed to :math:`15` and the standard deviation remained the same. To
run the MRS, we create a dictionary whose keys are the names of the parameters and
whose values are 2-tuples: the first entry contains the pdf of the old distribution,
and the second entry contains the pdf of the new distribution. The code below shows
how to create these distributions and the dictionary

.. jupyter-execute::

    old_mass_pdf = norm(15.426, 0.5).pdf
    new_mass_pdf = norm(15, 0.5).pdf
    distribution_dict = {
        "mass": (old_mass_pdf, new_mass_pdf),
    }

Finally, we execute the `sample` method, as shown below

.. jupyter-execute::
    :hide-code:
    :hide-output:

    np.random.seed(seed=42)

.. jupyter-execute::

    mrs.sample(distribution_dict=distribution_dict)

And that is it! The MRS has saved a file that has the same structure as the results of
a Monte Carlo simulation, but now the mass has been sampled from the newly specified
distribution. To see that this is actually the case, let us import the results of the
MRS and check the mean and standard deviation of the mass. First, we import them in
the same way we import the results from a Monte Carlo simulation


.. jupyter-execute::

    mrs_results = MonteCarlo(mrs_filepath, None, None, None)
    mrs_results.import_results()

Notice that the sample size is now smaller than 1000 samples. Although the sample
size is now random, we can check the expected number of samples by printing the
`expected_sample_size` attribute

.. jupyter-execute::

    print(mrs.expected_sample_size)
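
For intuition on where this number comes from, here is a sketch based on the generic
rejection-sampling weights (an illustration, not necessarily RocketPy's internal
bookkeeping): each sample is accepted with probability ``w_i / max(w)``, so the
expected sample size is the sum of these acceptance probabilities:

.. code-block:: python

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)
    masses = rng.normal(15.426, 0.5, size=1000)  # stand-in for the saved inputs

    # Ratio of new to old probability densities for each saved sample
    weights = norm(15, 0.5).pdf(masses) / norm(15.426, 0.5).pdf(masses)

    # Expected number of accepted samples: sum of per-sample acceptance probabilities
    expected_size = weights.sum() / weights.max()
    print(expected_size)  # well below the original 1000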

Now we check the mean and standard deviation of the mass.

.. jupyter-execute::

    mrs_mass_list = []
    for single_input_dict in mrs_results.inputs_log:
        mrs_mass_list.append(single_input_dict["mass"])

    print(f"MRS mass mean after resample: {np.mean(mrs_mass_list)}")
    print(f"MRS mass std after resample: {np.std(mrs_mass_list)}")

They are very close to the specified values.

Comparing Monte Carlo Results
-----------------------------

Alright, now that we have the results for this new configuration, how do they compare
to the original ones? Our rocket has, on average, decreased its mass by about 400 grams
while maintaining all other aspects. How do you think, for example, the distribution
of the apogee has changed? Let us find out.

First, we import the original results

.. jupyter-execute::

    original_results = MonteCarlo(monte_carlo_filepath, None, None, None)

Prints
^^^^^^

We use the `compare_info` method from the `MonteCarlo` class, passing
the MRS results object as an argument, to print a summary of the comparison

.. jupyter-execute::

    original_results.compare_info(mrs_results)

This summary closely resembles the printed information from a single Monte Carlo
simulation, with the difference that it now has a new column, "Source", which alternates
the results between the original and the other simulation. To answer the question posed
earlier, compare the mean and median of the apogee between both cases. Is it what you
expected?

Histogram and boxplots
^^^^^^^^^^^^^^^^^^^^^^

Besides the printed comparison, we can also compare the distributions in
the form of histograms and boxplots, using the `compare_plots` method

.. jupyter-execute::

    original_results.compare_plots(mrs_results)

Note that the histograms display three colors. Two are from the sources, as depicted
in the legend; the third comes from the overlap of the two.

Ellipses
^^^^^^^^

Finally, we can compare the ellipses for the apogees and landing points using the
`compare_ellipses` method

.. jupyter-execute::

    original_results.compare_ellipses(mrs_results, ylim=(-4000, 3000))

Note that we can pass along parameters used in the usual `ellipses` method of the
`MonteCarlo` class, in this case the `ylim` argument to expand the y-axis limits.

Time Comparison
---------------

Is the MRS really much faster than just re-running a Monte Carlo simulation?
Let us take a look at some numbers. All tests ran on a Dell G15 5530, with a
13th Gen Intel® Core™ i5-13450HX CPU (16 threads), 16GB of RAM, running Ubuntu 22.04.
Each function ran 10 times, and no parallelization was used.

Running the original Monte Carlo simulation with 1000 samples took,
on average, about 644 seconds, that is, 10 minutes and 44 seconds. The MRS described
here took, on average, 0.15 seconds, with an expected sample size of 117. Re-running
the Monte Carlo simulations with 117 samples took, on average, 76.3 seconds. Hence,
the MRS was, on average, (76.3 / 0.15) ~ 500 times faster than re-running the Monte
Carlo simulations with the same sample size provided by the MRS.

Sampling from nested parameters
-------------------------------

To sample from parameters that are nested in inner levels, use a key name
formed by the key of the outer level concatenated with a "_" and
the key of the inner level. For instance, to change the distribution
of the "total_impulse" parameter, which is nested within the "motors"
parameter dictionary, we have to use the key name "motors_total_impulse".


.. jupyter-execute::

    old_total_impulse_pdf = norm(6500, 1000).pdf
    new_total_impulse_pdf = norm(5500, 1000).pdf
    distribution_dict = {
        "motors_total_impulse": (old_total_impulse_pdf, new_total_impulse_pdf),
    }
    mrs.sample(distribution_dict=distribution_dict)

Finally, if there are multiple named nested structures, we need to use a key
name formed by the outer level, "_", the structure name, "_", and the
inner parameter name. For instance, if we want to change the distribution of
the 'root_chord' of the 'Fins', we have to pass the key
'aerodynamic_surfaces_Fins_root_chord'

.. jupyter-execute::

    old_root_chord_pdf = norm(0.12, 0.001).pdf
    new_root_chord_pdf = norm(0.119, 0.002).pdf
    distribution_dict = {
        "aerodynamic_surfaces_Fins_root_chord": (
            old_root_chord_pdf, new_root_chord_pdf
        ),
    }
    mrs.sample(distribution_dict=distribution_dict)

A word of caution
-----------------

Although the MRS provides results much faster than running the simulations again, it
might reduce the sample size drastically. The more variables whose distributions
change, and the more these new distributions differ from the original
ones, the more pronounced this sample size reduction will be. If you need the Monte
Carlo simulations to have the same sample size as before, or if the expected sample
size from the MRS is too low for your current application, then it might be better to
re-run the simulations.
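
To get a feel for how fast this degradation can be, the sketch below (using the generic
acceptance weights, not RocketPy internals) estimates the expected sample size of a
hypothetical 1000-sample run as the new mass mean moves away from the original
:math:`15.426`:

.. code-block:: python

    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(1)
    samples = rng.normal(15.426, 0.5, size=1000)
    old_pdf = norm(15.426, 0.5).pdf

    for new_mean in (15.4, 15.0, 14.5, 14.0):
        # Acceptance weights for the shifted distribution
        weights = norm(new_mean, 0.5).pdf(samples) / old_pdf(samples)
        expected_size = weights.sum() / weights.max()
        print(f"new mean {new_mean}: expected sample size ~ {expected_size:.0f}")

The further the new distribution drifts from the original, the fewer samples survive
the sub-sampling.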

References
----------

[1] CEOTTO, Giovani H., SCHMITT, Rodrigo N., ALVES, Guilherme F., et al. RocketPy: six
degree-of-freedom rocket trajectory simulator. Journal of Aerospace Engineering, 2021,
vol. 34, no. 6, p. 04021093.

[2] CASELLA, George, ROBERT, Christian P., and WELLS, Martin T. Generalized accept-reject
sampling schemes. Lecture Notes-Monograph Series, 2004, p. 342-347.