Initial FeO prediction
Start by importing MagmaPEC and MagmaPandas (or regular Pandas)
[1]:
import MagmaPEC as mpc
import MagmaPandas as mp
Import your melt or whole-rock data
[2]:
wholerock_file = "./data/wholerock.csv"
wholerock = mp.read_melt(wholerock_file, index_col=["name"])
wholerock.head()
[2]:
| SiO2 | TiO2 | Al2O3 | MnO | MgO | CaO | Na2O | K2O | P2O5 | Cr2O3 | FeO | total | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| name | ||||||||||||
| PI032 | 46.962818 | 3.447381 | 16.650124 | 0.174618 | 7.004558 | 9.400741 | 3.131253 | 1.290536 | 0.635346 | 0.009630 | 11.350806 | 100.057812 |
| PI041 | 46.292709 | 3.553973 | 17.191874 | 0.170671 | 6.221699 | 9.443416 | 2.701110 | 1.076079 | 0.610387 | 0.007262 | 10.986336 | 98.255517 |
| PI053 | 47.515736 | 3.084096 | 15.182294 | 0.169948 | 8.250577 | 9.640357 | 2.298223 | 1.013332 | 0.448063 | 0.009225 | 10.955438 | 98.567288 |
| PI054 | 47.524357 | 3.369594 | 16.456587 | 0.174671 | 6.569622 | 9.216309 | 2.653994 | 1.367337 | 0.648534 | 0.029540 | 10.953660 | 98.964204 |
| PI055 | 46.452168 | 3.287018 | 16.039806 | 0.169084 | 7.396060 | 9.469899 | 2.035204 | 1.072211 | 0.539792 | 0.011676 | 10.896147 | 97.369065 |
Create the FeOi prediction object and initialise it with your data. Make sure to remove the FeO column from the melt compositions used to predict FeOi.
[3]:
x = wholerock.drop(columns=["FeO"])
FeOi_predict = mpc.FeOi_prediction(x=x, FeO=wholerock["FeO"])
Select the columns in x that you do not want to use to predict initial FeO contents. Here we do not want to use minor elements and the totals column:
[4]:
do_not_use = ["MnO", "P2O5", "Cr2O3", "total"]
The calculate_model_fits method calculates best-fit multiple linear regressions for all possible combinations of elements in x. For each new regression, the element whose removal results in the lowest regression F-test p-value is removed from the dataset.
It returns a dataframe with fitted coefficients and misfit statistics (RMSE, cross-validated RMSE and R2). NaN values for coefficients indicate that this element has been removed from the calibration dataset and is not used in the regression.
[5]:
model_fits = FeOi_predict.calculate_model_fits(exclude=do_not_use)
model_fits
[5]:
| intercept | SiO2 | TiO2 | Al2O3 | MgO | CaO | Na2O | K2O | RMSE | CV-RMSE | deltaRMSE | r2 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 6 | 24.162353 | -0.184450 | 1.085716 | -0.337465 | NaN | -0.292231 | 0.344686 | -0.579578 | 0.151416 | 0.278102 | 0.126686 | 0.930966 |
| 5 | 21.878478 | -0.144730 | 1.179186 | -0.293183 | NaN | -0.296750 | NaN | -0.300763 | 0.165387 | 0.273553 | 0.108166 | 0.917640 |
| 4 | 22.442147 | -0.168107 | 1.114827 | -0.289476 | NaN | -0.260280 | NaN | NaN | 0.171032 | 0.232349 | 0.061317 | 0.911921 |
| 3 | 11.585125 | NaN | 1.463365 | -0.230722 | NaN | -0.169262 | NaN | NaN | 0.206434 | 0.237533 | 0.031100 | 0.871685 |
| 2 | 9.213242 | NaN | 1.640962 | -0.217524 | NaN | NaN | NaN | NaN | 0.253769 | 0.294748 | 0.040978 | 0.806092 |
| 1 | 5.357205 | NaN | 1.725906 | NaN | NaN | NaN | NaN | NaN | 0.319430 | 0.355670 | 0.036240 | 0.692766 |
RMSE and R2 are both calculated on the entire calibration dataset and indicate how good the model is at predicting FeO. For RMSE lower values are better, while for R2 values close to 1 are best. To check for overfitting, cross-validated RMSE’s (CV-RMSE) are also calculated, where large differences between RMSE and CV-RMSE (deltaRMSE) can indicate overfitting issues. As long as RMSE and R2 values are acceptable, the model with the smallest deltaRMSE should be selected. Here, models 1, 2 and 3 have similar RMSE, deltaRMSE and R2 values and any of these models would work well for predicting melt FeO contents. In models 4, 5 and 6, deltaRMSE has increased and overfitting might be an issue here.
Next we use the results for the previous step to select our model. Here we use model 3, where TiO2, Al2O3 and CaO are used as predictors. In the select_predictor method, you pass the index of your preferred model to the idx parameter. It shows a plot of the regression and calibration data. Be mindful of the FeO range, and make sure that you do not extrapolate (too much) beyond this range during PEC corrections.
[6]:
FeOi_predict.select_predictors(idx=3)
We can check if the right predictors are used with the predictors attribute
[7]:
FeOi_predict.predictors
[7]:
array(['TiO2', 'Al2O3', 'CaO'], dtype=object)
Coefficients of the linear regression were automatically calculated by the select_predictors method and we can access these with the coefficients, intercept_error and errors attributes.
Fitted coefficients:
[8]:
FeOi_predict.coefficients
[8]:
intercept 11.585125
TiO2 1.463365
Al2O3 -0.230722
CaO -0.169262
dtype: float64
and their errors as standard deviations:
[9]:
FeOi_predict.errors
[9]:
intercept 1.203255
TiO2 0.164396
Al2O3 0.048417
CaO 0.049363
dtype: float64
The model attribute stores the selected model as a callable function, which we can use to predict melt FeO contents. The function accepts a pandas DataFrame or MagmaFrame with melt major element compositions in oxide wt. % as input. Let’s try that out on the whole-rock data:
[11]:
model = FeOi_predict.model
model(wholerock)
[11]:
name
PI032 11.197161
PI041 11.220927
PI053 10.963646
PI054 11.159202
PI055 11.091599
PI056 10.944143
PI057 10.590489
PI058 10.757390
PI060 11.154730
PI063 11.988362
PI065 11.165823
PI066 11.055243
PI067 11.146931
PI068 11.240555
PI069 10.846550
PI070 10.381338
PI071 10.801455
PI074 10.470945
PI077 10.007513
PI078 9.718330
PI079 9.230219
PI081 10.813904
PI084 10.438283
PI094 10.511939
PI097 10.183464
PI111 10.866402
PI114 10.701429
dtype: float32
The random_sample_coefficients method randomly samples fitted coefficients within their errors and calculates the matching x-intercept value. This method is used internally in the Monte Carlo PEC correction model to propagate FeOi prediction errors.
[10]:
FeOi_predict.random_sample_coefficients(n=5)
[10]:
| TiO2 | Al2O3 | CaO | intercept | |
|---|---|---|---|---|
| 0 | 1.555664 | -0.186890 | -0.129272 | 10.195312 |
| 1 | 1.704102 | -0.263428 | -0.229004 | 11.937500 |
| 2 | 1.424805 | -0.151489 | -0.275146 | 11.398438 |
| 3 | 1.440430 | -0.228882 | -0.127197 | 11.226562 |
| 4 | 1.457031 | -0.237793 | -0.176147 | 11.789062 |