SimBiology Tutorials: Estimating Parameters (Fixed Effects)
From the series: SimBiology Tutorials for QSP, PBPK, and PK/PD Modeling and Analysis
Estimating Parameters (Fixed Effects) video: This video demonstrates how to estimate parameters using non-linear regression (fixed effects). First an appropriate compartmental model is added to the SimBiology project. Simulations from this model are compared to the experimental data and parameter sliders and results from noncompartmental analysis are used to arrive at reasonable initial estimates. A non-linear regression is performed using the model and data. The results from the estimation are visualized and two different estimations are compared quantitatively using statistics such as the Akaike and Bayesian Information Criteria.
In this video, we will be fitting a model to the theophylline data set that we imported in the previous video on importing data and performing non-compartmental analysis. The data set contains 12 subjects. For each subject there is a single oral dose followed by concentration measurements, and the data set also has a covariate column.
The first thing we want to understand is what kind of model we should be fitting. Here you see the data plotted on a semi-log scale, with a logarithmic y-axis. You can see that there is an absorption phase, followed by an elimination phase that is linear on the log scale. This implies that a one-compartment model might fit this data well.
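To make the "linear in the log scale" observation concrete, here is a minimal Python sketch that estimates an elimination rate constant from the slope of log concentration versus time. The data points are made up for illustration and are not taken from the theophylline data set; the function name `terminal_slope` is also hypothetical.

```python
import math

def terminal_slope(times, concentrations):
    """Least-squares slope of ln(concentration) versus time."""
    logs = [math.log(c) for c in concentrations]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    sxx = sum((t - mean_t) ** 2 for t in times)
    return sxy / sxx

# Made-up terminal-phase points generated from C(t) = 10 * exp(-0.1 * t).
times = [8.0, 12.0, 16.0, 24.0]
conc = [10.0 * math.exp(-0.1 * t) for t in times]

# A straight line on the semi-log plot has slope -ke.
ke = -terminal_slope(times, conc)
print(ke)
```

If the late points of a real profile fall on a straight line like this, a single exponential elimination phase, and therefore a one-compartment model, is a reasonable first candidate.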
We can add a one-compartment model by going to the model builder and adding a model from the PK library. The model we want to add is a one-compartment model with first-order dosing and linear clearance. You can see this is a simple model, where the dose is absorbed into the central compartment and subsequently eliminated.
We can now estimate the parameters for this model. So that is the absorption rate and the elimination rate, as well as the volume of the central compartment. So those are the three parameters that we will be estimating from the data.
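For reference, a one-compartment model with first-order absorption and linear clearance has a well-known closed-form concentration curve. The sketch below is written in Python rather than in SimBiology, assumes complete bioavailability, and uses hypothetical names (`concentration`, `ka`, `cl`, `v`) and illustrative parameter values.

```python
import math

def concentration(t, dose, ka, cl, v):
    """Concentration after a single oral dose at t = 0.

    ka: absorption rate constant, cl: clearance, v: central volume.
    Assumes bioavailability of 1 (an assumption, not stated in the video).
    """
    ke = cl / v  # elimination rate constant implied by clearance and volume
    if math.isclose(ka, ke):
        # Limiting form when absorption and elimination rates coincide.
        return dose / v * ka * t * math.exp(-ka * t)
    return dose * ka / (v * (ka - ke)) * (math.exp(-ke * t) - math.exp(-ka * t))

# Illustrative values only; the dose is hypothetical.
dose, ka, cl, v = 4.0, 1.0, 0.04, 0.5
ke = cl / v
t_max = math.log(ka / ke) / (ka - ke)  # time of peak concentration
print(t_max, concentration(t_max, dose, ka, cl, v))
```

The three arguments `ka`, `cl`, and `v` correspond to the three quantities being estimated: absorption, elimination, and central volume.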
Before we start estimating these three parameters, it can be helpful to understand what their initial estimates might be. In order to do that, we can simulate the model and see whether we can bring the simulation results in line with the observations. Currently, these parameters are all one, except for the absorption rate. We need to set the absorption rate to one, or at least to a nonzero value, so that if we apply a dose to Dose_Central, the rate from Dose_Central to Drug_Central is not zero. If it is zero, the dose will never appear in the central compartment.
So in order to simulate the model, we need to go back to the model analyzer and create a program. And the program is called Simulate Model. We then select our model, the one compartment model. And we can select a dose from the data set that we have. So we can use the columns in the data set, in this case, theophyline.dose_total.
This dose needs to go to Dose_Central in our model, and the time units are hours. The final time points of the observations in the data set are around 25 hours, so we are going to simulate the model for 25 hours as well.
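Outside the app, the same kind of 25-hour simulation can be sketched with a small fixed-step integrator. This is a plain Python illustration under stated assumptions, not what SimBiology runs internally: the function name, the RK4 scheme, the step size, and the parameter values are all hypothetical choices. It also shows why a zero absorption rate leaves the central compartment empty.

```python
def simulate_one_compartment(dose, ka, cl, v, t_end=25.0, dt=0.01):
    """Integrate the dose and drug amounts with fixed-step RK4."""

    def rates(a_dose, a_drug):
        absorbed = ka * a_dose            # first-order absorption
        eliminated = (cl / v) * a_drug    # linear clearance
        return -absorbed, absorbed - eliminated

    a_dose, a_drug = dose, 0.0            # whole dose enters the dose species at t = 0
    times, conc = [0.0], [0.0]
    for step in range(1, int(round(t_end / dt)) + 1):
        k1 = rates(a_dose, a_drug)
        k2 = rates(a_dose + 0.5 * dt * k1[0], a_drug + 0.5 * dt * k1[1])
        k3 = rates(a_dose + 0.5 * dt * k2[0], a_drug + 0.5 * dt * k2[1])
        k4 = rates(a_dose + dt * k3[0], a_drug + dt * k3[1])
        a_dose += dt / 6.0 * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0])
        a_drug += dt / 6.0 * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1])
        times.append(step * dt)
        conc.append(a_drug / v)
    return times, conc

# All parameters set to one, as in the untuned model; the dose is hypothetical.
times, conc_absorbed = simulate_one_compartment(dose=4.0, ka=1.0, cl=1.0, v=1.0)
# With a zero absorption rate the drug never reaches the central compartment.
_, conc_blocked = simulate_one_compartment(dose=4.0, ka=0.0, cl=1.0, v=1.0)
print(max(conc_absorbed) > 0.0)
print(max(conc_blocked) == 0.0)
```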
So now we can simulate the model, and we can add the data from the theophylline data set to compare the simulation with the observations. So if we just reconfigure this here. These are the observations, and in blue at the bottom here, you see the simulation.
So now we can add sliders to the simulation and bring the simulation in line with the observations. So we can add a slider for the central compartment, for the absorption rate, and for the clearance. And now we can use the results from our non-compartmental analysis, which are here in data sheet two.
As initial estimates, for example, for the central compartment, we can use the value for these. In that case, it's on the order of 0.5. And for the clearance, we can have a look at this column here, at the calculations for the clearance in the non-compartmental analysis. And that's on the order of 0.04. I'm going to leave the KA here, because we don't have a non-compartmental analysis parameter to guide us on the absorption rate.
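As a rough illustration of where such non-compartmental values come from, the sketch below applies textbook formulas: area under the curve by the linear trapezoidal rule, extrapolation with the terminal rate constant, apparent clearance as dose over AUC, and apparent volume as clearance over the terminal rate constant. It is not a reproduction of SimBiology's NCA implementation, and the profile, dose, and `lambda_z` value are made up.

```python
import math

def auc_trapezoid(times, conc):
    """Area under the concentration-time curve by the linear trapezoidal rule."""
    return sum(0.5 * (c0 + c1) * (t1 - t0)
               for t0, t1, c0, c1 in zip(times, times[1:], conc, conc[1:]))

def nca_estimates(times, conc, dose, lambda_z):
    """Return (AUC to infinity, apparent clearance, apparent volume)."""
    auc_last = auc_trapezoid(times, conc)
    auc_inf = auc_last + conc[-1] / lambda_z  # extrapolate beyond the last sample
    cl = dose / auc_inf
    vz = cl / lambda_z
    return auc_inf, cl, vz

# Made-up mono-exponential profile, sampled hourly for 24 hours.
times = [float(t) for t in range(25)]
conc = [10.0 * math.exp(-0.1 * t) for t in times]
auc_inf, cl, vz = nca_estimates(times, conc, dose=4.0, lambda_z=0.1)
print(auc_inf, cl, vz)
```

Values obtained this way are only starting points for the sliders and the fit, since they do not use the compartmental model itself.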
But you can see that even with these values, the simulation looks pretty similar to the observations. So we can use these values as our initial estimates for the parameter estimation. So with that let's estimate the parameters.
Again, we can create a new program, and this program is called Fit Data. We can walk through the setup of this program, section by section. The first thing you need to do is define which data set you want to use. In this case, there is only the theophylline data set.
Then you need to define which model in your project you want to use. Here we use the one-compartment model that we added earlier.
Now, in order to perform this estimation, SimBiology will use an objective function. The objective function is a measure of the discrepancy between your simulation and your experimental data. The simplest example is the sum of squared errors between the simulation and the experimental data.
The idea is for the estimation to minimize the value of the objective function. And once we have that minimum value, that represents the best fit. In order to create the objective function, SimBiology needs some information. You need to map the experimental data to the components in your model, and that's what we need to do next.
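The simplest objective mentioned above, the sum of squared errors, can be written in a few lines. This is a generic Python sketch with made-up numbers, not SimBiology's internal objective; the name `sum_squared_errors` is hypothetical.

```python
def sum_squared_errors(observed, predicted):
    """Sum of squared differences between data and simulation."""
    return sum((obs - pred) ** 2 for obs, pred in zip(observed, predicted))

# Made-up observations and two candidate simulations at the same time points.
observed = [1.0, 2.0, 3.0]
candidate_a = [1.5, 2.0, 2.0]
candidate_b = [1.0, 2.5, 3.0]

print(sum_squared_errors(observed, candidate_a))  # 1.25
print(sum_squared_errors(observed, candidate_b))  # 0.25
```

The candidate with the smaller value is the better fit under this objective, which is what the estimation searches for by adjusting the parameters.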
You can see the column headers here: ID, time, concentration, and dose total. They coincide with the column headers here: ID, time, concentration, and dose total. Those column headers need to be matched up with our model, especially the last two, concentration and dose total.
And you can see here that the concentration column is mapped to the component Central.Drug_Central, and the dose total column is mapped to Dose_Central. This mapping between the columns in the data and the components in the model allows SimBiology to create the objective function for you.
And here you can see that for the dose, we have a bolus dose. This is an instantaneous administration of the drug amount to Dose_Central. We're calling this a bolus just to say that we administer the oral dose in one go.
If you want, you can also define a rate. That can be a column in your data set, or you can set it to a certain number. That rate defines, for example, the infusion rate.
You can also have multiple responses for your model. So if you have a concentration for multiple species in your model, you can add further responses to be incorporated in the objective function.
The next thing we need to do is define which parameters we're going to be estimating. The Fit Data program has automatically set that up for the one-compartment model, because it knows that the three parameters that need to be estimated are the central compartment volume, the clearance, and the absorption rate. Of course, you can also choose your own parameters, and you can drag and drop them from the model browser on the left into the estimated parameters table.
Now if you expand these, you can set initial estimates. So if you remember, the initial untransformed value will be 0.4 for the volume, 0.04 for the clearance, and one for the absorption rate. We're going to leave that as is. OK.
Now we will choose here to do a pooled fit, which results in a single set of estimates for all 12 subjects. If we unselect this box, we will get a set of estimates for each subject in our data set.
We can also choose an error model. In this case, we will try out multiple error models, and we will start with the constant error model. You can also choose what kind of solver you would like to use, whether a global solver or a local solver, and within the local solvers there are gradient-based and non-gradient-based solvers.
Here, because this is a small model, I would recommend using a gradient-based local solver, like lsqnonlin.
OK, the next thing we need to do, is we need to press run to start estimating the parameters. As the estimation progresses, you can see the log likelihood value, and you want that to be maximized.
Note that the objective function here is the negative of the log likelihood. So minimizing the objective function is the same as maximizing the log likelihood. You can also see the first-order optimality, which you want to be minimized. And then you see how each of the parameters progresses as it's being optimized.
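The link between the objective function, the log likelihood, and the error model can be sketched as follows. This Python example assumes independent Gaussian residuals, a constant error model with standard deviation `a`, and a proportional error model with standard deviation `b * |prediction|`; these definitions and all names and numbers are assumptions made for illustration and may not match SimBiology's exact formulation.

```python
import math

def gaussian_log_likelihood(observed, predicted, sd_of):
    """Log likelihood of the data given predictions and an error model."""
    total = 0.0
    for obs, pred in zip(observed, predicted):
        sd = sd_of(pred)
        total += (-0.5 * math.log(2.0 * math.pi * sd ** 2)
                  - (obs - pred) ** 2 / (2.0 * sd ** 2))
    return total

def constant_error(a):
    """Same standard deviation for every prediction."""
    return lambda pred: a

def proportional_error(b):
    """Standard deviation grows with the size of the prediction."""
    return lambda pred: b * abs(pred)

# Made-up observations with one close and one poor set of predictions.
observed = [2.1, 4.8, 7.2]
close_fit = [2.0, 5.0, 7.0]
poor_fit = [1.0, 3.0, 9.0]

ll_close = gaussian_log_likelihood(observed, close_fit, constant_error(0.5))
ll_poor = gaussian_log_likelihood(observed, poor_fit, constant_error(0.5))
ll_close_prop = gaussian_log_likelihood(observed, close_fit, proportional_error(0.1))
print(ll_close > ll_poor)
```

With a fixed constant error, the log likelihood is a constant minus a scaled sum of squared errors, so maximizing it and minimizing the sum of squared errors select the same parameters.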
If you now move back to the SimBiology Model Analyzer, you will be presented with a set of plots. The first plot shows you the fit between the calibrated model and the experimental data, where the calibrated model has been dosed with the corresponding dose of each subject as specified in the dose column of the data set. You will also see the predicted value versus the observed value of the concentration, where you want the data points to lie as close as possible to the line of unity.
The same you can do with the residuals versus time. Here, you want to make sure that the residuals are evenly distributed on either side.
And then lastly, we can have a look at the residuals in a Q-Q plot. This line here represents a normal distribution. If the dots lie close to that line, that implies that the residuals are normally distributed, which is an underlying assumption for this kind of optimization.
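The points behind a normal Q-Q plot can be computed with the Python standard library. This is a generic sketch, not the app's plotting code: the residuals are made up, the plotting positions `(i + 0.5) / n` are one common convention among several, and `qq_points` is a hypothetical name.

```python
from statistics import NormalDist, mean, stdev

def qq_points(residuals):
    """Pair theoretical normal quantiles with sorted standardized residuals."""
    n = len(residuals)
    mu, sigma = mean(residuals), stdev(residuals)
    ordered = sorted((r - mu) / sigma for r in residuals)
    standard_normal = NormalDist()
    theoretical = [standard_normal.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))

# Made-up residuals for illustration.
residuals = [-0.4, 0.1, 0.3, -0.2, 0.5, -0.1, 0.0, -0.3, 0.2]
points = qq_points(residuals)
for theoretical, sample in points:
    print(round(theoretical, 3), round(sample, 3))
```

If the residuals are close to normally distributed, the two columns are close to each other, which is the same as the dots lying near the reference line in the plot.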
Now, we've performed this estimation, and there are some things that we could change. And one of those things that we may want to change is the error model. Before we do that, I want to save the results from this program. And so I can call this constant and pooled. I'm now going to change this to the proportional error model and run the optimization again. And if we now look at the plots, you will see that there is an extra set of plots being generated for the proportional error model. But there are also quantitative measures that you can use to compare these two fits. So we'll save the results from this estimation as prop_pooled.
And now, what we can do, is we can grab a data sheet. So we create a new data sheet. And we drag the results from each of these estimations onto the data sheet. And now we can compare the two together. So we can compare the estimates, but we can also compare the log likelihood. Again, we want to maximize the likelihood, and we can compare measures like the Akaike information criterion and the Bayesian information criterion.
And especially the AIC and the BIC are good ways of seeing which one gives you the best fit. A lower AIC represents a better fit. So in this case the constant error model is probably most appropriate for this data set.
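For readers who want the definitions behind these two criteria, the standard formulas are AIC = 2k - 2 log L and BIC = k ln(n) - 2 log L, where k is the number of estimated parameters, n the number of observations, and log L the maximized log likelihood. The Python sketch below uses hypothetical log likelihood values, parameter counts, and observation counts, not the results from the video, and how SimBiology counts parameters is not confirmed here.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: lower is better."""
    return 2.0 * n_params - 2.0 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

# Hypothetical results for two fits with the same number of parameters.
fits = {
    "fit_a": {"log_likelihood": -150.0, "n_params": 4},
    "fit_b": {"log_likelihood": -155.0, "n_params": 4},
}
n_obs = 120  # hypothetical number of observations

for name, fit in fits.items():
    print(name,
          aic(fit["log_likelihood"], fit["n_params"]),
          bic(fit["log_likelihood"], fit["n_params"], n_obs))
```

When two fits use the same number of parameters and the same data, as in this sketch, both criteria rank them in the same order as the log likelihood; the penalty terms only matter when the parameter counts differ.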
With that, we have finished the parameter estimation of fixed effects. In our next video, we will look into performing a mixed-effects estimation on the same data set.