I have started learning machine learning from online tutorials, and I have a question about one of the exercises. (I am using Ubuntu, Python 3.10, and PyCharm.)
The line
%matplotlib inline
does not work for me. Why?
When I comment that line out, I get this output:
[52.23767295 47.5274183 ]
/usr/local/lib/python3.10/dist-packages/sklearn/base.py:450: UserWarning: X does not have valid feature names, but LinearRegression was fitted with feature names
warnings.warn(
What is wrong? How can I correct this code?
import pandas as pd
import matplotlib.pyplot as plt
# % matplotlib inline
from sklearn.linear_model import LinearRegression
#
auto = pd.read_csv(r"....csv")
auto.head()
var = auto.shape
X = auto.iloc[:, 1:-1]
X = X.drop('horsepower', axis=1)
y = auto.loc[:, 'mpg']
X.head()
y.head()
lr = LinearRegression()
lr.fit(X, y)
lr.score(X, y)
my_car1 = [4, 160, 190, 12, 90, 1]
my_car2 = [4, 200, 260, 15, 83, 1]
cars = [my_car1, my_car2]
mpg_predict = lr.predict(cars)
print(mpg_predict)
2 Answers
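%matplotlib inline is an IPython magic command intended for Jupyter notebooks. It is not valid Python syntax, so it does not work when you run a plain script from an IDE such as PyCharm.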
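To illustrate what a plain script can do instead, here is a minimal sketch that ends with an explicit plt.show() call. The helper name make_scatter and the numbers passed to it are made up for illustration and do not come from the question's dataset.

```python
import matplotlib.pyplot as plt


def make_scatter(weights, mpgs):
    """Build a scatter plot and return the figure (nothing is displayed yet)."""
    fig, ax = plt.subplots()
    ax.scatter(weights, mpgs)
    ax.set_xlabel("weight")
    ax.set_ylabel("mpg")
    return fig


if __name__ == "__main__":
    # Made-up values, only to have something to draw.
    fig = make_scatter([2000, 2500, 3000, 3500], [35, 30, 24, 19])
    # In a script the figure is only displayed once show() is called;
    # no notebook magic is involved.
    plt.show()
```

In a notebook the magic line makes figures render inside the page; in a script, plt.show() opens them in a window instead.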
As for the UserWarning, you can safely ignore it here: it is a warning, not an error, and the predictions are still computed. Warnings like this are fairly common in machine learning code. If you want to get rid of it, see: SKLearn warning "valid feature names" in version 1.0
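For a quick fix, fit the model on the underlying NumPy array instead of the DataFrame:
lr.fit(X.values, y)
This removes the warning, because the model is then fitted without feature names.

The key insight is the following extract from the warning message:
X does not have valid feature names, but LinearRegression was fitted with feature names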
The model was trained on X, a DataFrame that contains both column titles (feature names) and values. At prediction time, however, it is given the variable cars, which is a list of lists and does not contain the headers. The training input and the prediction input must have the same format (for example, both DataFrames with the same columns).
By the way, here is a well-commented and detailed notebook that applies machine learning models in Python to predicting miles per gallon:
https://www.kaggle.com/code/prince381/predicting-the-miles-per-gallon-mpg-of-cars
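The two fixes mentioned in the answers can be sketched side by side as follows. This is only an illustration: the column names weight and acceleration and all the numbers are toy values assumed for the example, not the real columns or rows of the question's CSV file.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Toy training data with hypothetical column names (not the real dataset).
X = pd.DataFrame({
    "weight": [2000, 2500, 3000, 3500, 4000],
    "acceleration": [18, 16, 15, 13, 12],
})
y = pd.Series([35, 30, 25, 20, 15], name="mpg")

# New cars to predict, given as a plain list of lists (no headers).
cars = [[2200, 17], [3800, 12]]

# Option 1: fit on a DataFrame and predict on a DataFrame
# that carries the same column names in the same order.
lr = LinearRegression().fit(X, y)
cars_df = pd.DataFrame(cars, columns=X.columns)
pred_df = lr.predict(cars_df)

# Option 2: fit on the plain NumPy array, so no feature names are stored,
# and predict directly on the list of lists.
lr_np = LinearRegression().fit(X.to_numpy(), y)
pred_np = lr_np.predict(cars)

print(pred_df)
print(pred_np)
```

Both options give the same predictions. Option 1 keeps the column names attached, which also guards against passing the features in the wrong order by accident.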