I get the error ValueError: Input contains NaN when I try to predict the next value of a series using the ARIMA model from pmdarima.
However, the data I use does not contain any null values.
Code:
import pandas as pd
from pmdarima.arima import ARIMA
tmp_series = pd.Series([0.8867208063423082, 0.4969678051201152, -0.35079875681211814, 0.07156197743204402, 0.6888394890593726, 0.6136916470350972, 0.9020102952782968, 0.38539523911177426, -0.02211092685162178, 0.7051282791422511, -0.21841121961990842, 0.003262841037836234, 0.3970253153400027, 0.8187445259415379, -0.525847439014037, 0.3039480910711944, 0.0279240073596233, 0.8238419467739897, 0.8234157376839023, 0.5897892005398399, 0.8333118174945449])
model_211 = ARIMA(order=(2, 1, 1), out_of_sample_size=0, mle_regression=True, suppress_warnings=True)
model_211.fit(tmp_series[:-1])
print(model_211.predict())
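One way to back up the claim that the input has no missing values is to check the series explicitly before fitting. The sketch below is only an illustration: it uses a shortened stand-in for tmp_series, and report_non_finite is a hypothetical helper name, not part of pandas or pmdarima.

```python
import numpy as np
import pandas as pd


def report_non_finite(series: pd.Series) -> dict:
    """Count NaN and infinite entries in a numeric series (hypothetical helper)."""
    values = series.to_numpy(dtype=float)
    return {
        "nan": int(np.isnan(values).sum()),
        "inf": int(np.isinf(values).sum()),
        "length": int(values.size),
    }


# Shortened stand-in for tmp_series from the code above.
sample = pd.Series([0.8867208063423082, 0.4969678051201152, -0.35079875681211814])
print(report_non_finite(sample))
```

If both counts are zero for the full series, the NaN reported by the error has to come from something computed during prediction rather than from the input.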
Error message:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Input In [7], in <cell line: 7>()
5 display(model_211.params())
6 display(model_211.aic())
----> 7 display(model_211.predict())
File /usr/local/lib/python3.8/dist-packages/pmdarima/arima/arima.py:793, in ARIMA.predict(self, n_periods, X, return_conf_int, alpha, **kwargs)
790 arima = self.arima_res_
791 end = arima.nobs + n_periods - 1
--> 793 f, conf_int = _seasonal_prediction_with_confidence(
794 arima_res=arima,
795 start=arima.nobs,
796 end=end,
797 X=X,
798 alpha=alpha)
800 if return_conf_int:
801 # The confidence intervals may be a Pandas frame if it comes from
802 # SARIMAX & we want Numpy. We will to duck type it so we don't add
803 # new explicit requirements for the package
804 return f, check_array(conf_int, force_all_finite=False)
File /usr/local/lib/python3.8/dist-packages/pmdarima/arima/arima.py:205, in _seasonal_prediction_with_confidence(arima_res, start, end, X, alpha, **kwargs)
202 conf_int[:, 1] = f + q * np.sqrt(var)
204 y_pred = check_endog(f, dtype=None, copy=False, preserve_series=True)
--> 205 conf_int = check_array(conf_int, copy=False, dtype=None)
207 return y_pred, conf_int
File /usr/local/lib/python3.8/dist-packages/sklearn/utils/validation.py:899, in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator, input_name)
893 raise ValueError(
894 "Found array with dim %d. %s expected <= 2."
895 % (array.ndim, estimator_name)
896 )
898 if force_all_finite:
--> 899 _assert_all_finite(
900 array,
901 input_name=input_name,
902 estimator_name=estimator_name,
903 allow_nan=force_all_finite == "allow-nan",
904 )
906 if ensure_min_samples > 0:
907 n_samples = _num_samples(array)
File /usr/local/lib/python3.8/dist-packages/sklearn/utils/validation.py:146, in _assert_all_finite(X, allow_nan, msg_dtype, estimator_name, input_name)
124 if (
125 not allow_nan
126 and estimator_name
(...)
130 # Improve the error message on how to handle missing values in
131 # scikit-learn.
132 msg_err += (
133 f"\n{estimator_name} does not accept missing values"
134 " encoded as NaN natively. For supervised learning, you might want"
(...)
144 "#estimators-that-handle-nan-values"
145 )
--> 146 raise ValueError(msg_err)
148 # for object dtype data, we only check for NaNs (GH-13254)
149 elif X.dtype == np.dtype("object") and not allow_nan:
ValueError: Input contains NaN.
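Reading the traceback, the array that fails validation is conf_int inside _seasonal_prediction_with_confidence, not the input series. The sketch below illustrates how that could happen, assuming the forecast variance var is NaN (an assumption; the traceback does not show its value). Only the upper-bound formula appears in the traceback; the lower bound and the definition of q are assumed here, and interval_from_variance is a hypothetical name.

```python
from statistics import NormalDist

import numpy as np


def interval_from_variance(f, var, alpha=0.05):
    """Build a two-column confidence interval from forecasts and variances.

    Mirrors the visible line `conf_int[:, 1] = f + q * np.sqrt(var)`;
    the lower bound and the normal quantile q are assumptions.
    """
    q = NormalDist().inv_cdf(1 - alpha / 2)
    f = np.atleast_1d(np.asarray(f, dtype=float))
    var = np.atleast_1d(np.asarray(var, dtype=float))
    conf_int = np.empty((f.size, 2))
    conf_int[:, 0] = f - q * np.sqrt(var)
    conf_int[:, 1] = f + q * np.sqrt(var)
    return conf_int


# A finite forecast with a NaN variance still yields a NaN interval.
print(interval_from_variance([0.3], [np.nan]))
# A finite variance yields a finite interval.
print(interval_from_variance([0.3], [0.04]))
```

Under that assumption, the point forecast can be perfectly valid while check_array rejects the interval, which would explain an "Input contains NaN" error on clean data.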
So, I have two questions:
- Are there any parameters I should set in order to avoid this error?
- I found a similar problem: Failing to predict next value using ARIMA: Input contains NaN, infinity or a value too large for dtype('float64'). A comment on that post says it is caused by an unsolved issue. I'm not sure whether my error is caused by the same issue. If so, can anyone suggest another package for ARIMA models?
Environment Information:
- I run this code in a Docker container
- OS info:
Distributor ID: Ubuntu
Description: Ubuntu 20.04.4 LTS
Release: 20.04
Codename: focal
- Python env info:
Python 3.8.10
- pip package info (I list only the related packages; the complete pip package list is linked here):
Package       Version
------------  -------
numpy         1.22.4
pandas        1.4.3
pmdarima      2.0.1
scikit-learn  1.1.1
scipy         1.8.1
statsmodels   0.13.2
2 Answers
Downgrading the following packages will resolve this error:
What environment are you working in?
Your code works for me and prints:
20 0.316942
21 0.338248
22 0.378482…