I am new to Python and I need to iterate over 3 main variables to find the best mean error for a set of artificial intelligence models.
The 3 models are: Gradient Boosting, Random Forest and XGBoost.
Each model is fitted to the data separately. At the end I need to ensemble them, but doing the iteration by hand is exhausting as there are 27 iterations to make.
The equation is as follows:
y_predict = x*gradientBoosterPredict + y*randomForest + z*XGBooster
Where
- x, y and z are between 0 and 1 (with 0.1 as step for each of them)
- x + y + z should be always equal to 1
I tried the following:
rmse = []
for (gbrCount in np.arange(0, 1.0, 0.1)):
    for(xgbCount in np.arange(0, 1.0, 0.1)):
        for(regCount in np.arange(0, 1.0, 0.1)):
            y_p = (xgbCount*xgb.predict(testset)+ gbrCount*gbr.predict(testset)+regCount*regressor.predict(testset))
            testset['SalePrice']=np.expm1(y_p)
            y_train_p = xgb.predict(dataset)
            y_train_p = np.expm1(y_train_p)
            rmse.append(np.sqrt(mean_squared_error(y, y_train_p)))
            rmse.append(xgbCount)
            rmse.append(gbrCount)
            rmse.append(regCount)
But I am getting the following error:
SyntaxError: unexpected EOF while parsing
for gbrCount in np.arange(0, 1.0, 0.1):
4 Answers
Please code like the following.
or
or
This is just a Python syntax error. Omit the parentheses in this line:
for gbrCount in np.arange(0, 1.0, 0.1):
and also in the other for lines. That will solve your stated problem. But also note, in the arange docs, that you should instead be using linspace if you want to use a noninteger step parameter.
As to making the sum equal 1: You already have
if int(gbrCount+xgbCount+regCount) == 1:
Doesn’t that work? If not, note that floating point numbers are not exact, so that what looks like it should be 1.0 might actually be 0.9999, so that int() gives 0. You should use linspace or else use np.arange(0, 10, 1) so that everything is integers (inside the loop, dividing each value by 10).
Your code will work fine with the following syntax for the for loops:
To make the sum always equal 1 inside the loop, see below:
To store each result as one row, rather than appending each value separately:
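The snippets that belonged to this answer are not reproduced here. As a stand-in, the following is a minimal sketch of the three points it lists: for headers without parentheses, weights restricted to combinations that sum to 1, and one row per result. It is not the answerer's code. The prediction arrays are hypothetical placeholders for the outputs of gbr.predict, regressor.predict and xgb.predict, and the helper names weight_grid and rmse_of are made up for illustration.

```python
import numpy as np

def weight_grid(steps=10):
    """Return every (w_gbr, w_rf, w_xgb) on a 1/steps grid whose sum is 1."""
    rows = []
    for i in range(steps + 1):          # no parentheses around the loop header
        for j in range(steps + 1):
            for k in range(steps + 1):
                if i + j + k == steps:  # exact integer test, no float rounding
                    rows.append((i / steps, j / steps, k / steps))
    return rows

def rmse_of(y_true, y_pred):
    """Root mean squared error computed with plain NumPy."""
    diff = np.asarray(y_true) - np.asarray(y_pred)
    return float(np.sqrt(np.mean(diff ** 2)))

# Hypothetical placeholder data standing in for the real model predictions.
y_true = np.array([1.0, 2.0, 3.0, 4.0])
pred_gbr = np.array([1.1, 2.1, 2.9, 4.2])
pred_rf = np.array([0.8, 1.9, 3.3, 3.9])
pred_xgb = np.array([1.0, 2.2, 3.1, 3.8])

results = []
for w_gbr, w_rf, w_xgb in weight_grid():
    blended = w_gbr * pred_gbr + w_rf * pred_rf + w_xgb * pred_xgb
    # One tuple per combination keeps the error next to its weights.
    results.append((rmse_of(y_true, blended), w_gbr, w_rf, w_xgb))

best = min(results)  # tuples compare by their first element, the RMSE
print(best)
```

Keeping each result as a single tuple means the best combination can be picked with min, instead of having to regroup a flat list four values at a time.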
Simplest approach I can think of: loop over two of the variables, and determine the necessary value of the third (if it isn’t in range, just continue; or better yet, specify the range for the second variable in terms of the first, in a way that ensures the third can be in range).
Example, with integers in 0..10 summing to 10:
(For the floating-point case, this may require some adjustment due to the imprecision of floating-point arithmetic.)
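One way the two-loop idea could look is sketched below, assuming integer steps from 0 to 10 that are divided by 10 only when the weights are actually needed. The function name weights_two_loops is hypothetical, and this is an illustration of the approach rather than the answerer's original example.

```python
import math

def weights_two_loops(steps=10):
    """Loop over two integer weights and derive the third from the total."""
    combos = []
    for i in range(steps + 1):
        # Limiting j to steps - i guarantees the third value is never negative.
        for j in range(steps + 1 - i):
            k = steps - i - j
            combos.append((i / steps, j / steps, k / steps))
    return combos

combos = weights_two_loops()
print(len(combos))  # 66 combinations for steps=10

# When checking float weights, compare with a tolerance instead of ==.
assert all(math.isclose(x + y + z, 1.0) for x, y, z in combos)
```

Because the sum constraint is enforced on integers, no combination is lost to rounding; the tolerance-based check with math.isclose is only needed if the weights are later compared as floats.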