Régression polynomiale multivariée avec numpy

Question

J'ai plusieurs exemples (y_i, (a_i, b_i, c_i)) où y est supposé varier comme polynôme dans a,b,c jusqu'à un certain degré. Par exemple, pour un ensemble de données donné et le degré 2, je pourrais produire le modèle

y = a^2 + 2ab - 3cb + c^2 +.5ac

Cela peut être fait en utilisant les moindres carrés et est une légère extension de la routine polyfit de numpy. Existe-t-il une implémentation standard quelque part dans l'écosystème Python?

David Hoffman · Answer

sklearn fournit un moyen simple de le faire.

Construire sur un exemple posté ici :

#X is the independent variable (bivariate in this case) X = array([[0.44, 0.68], [0.99, 0.23]]) #vector is the dependent data vector = [109.85, 155.72] #predict is an independent variable for which we'd like to predict the value predict= [0.49, 0.18] #generate a model of polynomial features poly = PolynomialFeatures(degree=2) #transform the x data for proper fitting (for single variable type it returns,[1,x,x**2]) X_ = poly.fit_transform(X) #transform the prediction to fit the model type predict_ = poly.fit_transform(predict) #here we can remove polynomial orders we don't want #for instance I'm removing the `x` component X_ = np.delete(X_,(1),axis=1) predict_ = np.delete(predict_,(1),axis=1) #generate the regression object clf = linear_model.LinearRegression() #preform the actual regression clf.fit(X_, vector) print("X_ = ",X_) print("predict_ = ",predict_) print("Prediction = ",clf.predict(predict_))

Et voici la sortie:

>>> X_ = [[ 0.44 0.68 0.1936 0.2992 0.4624] >>> [ 0.99 0.23 0.9801 0.2277 0.0529]] >>> predict_ = [[ 0.49 0.18 0.2401 0.0882 0.0324]] >>> Prediction = [ 126.84247142]

reptilicus · Answer

polyfit fonctionne, mais il existe de meilleurs minimiseurs carrés. Je recommanderais kmpfit, disponible à

http://www.astro.rug.nl/software/kapteyn-beta/kmpfittutorial.html

Il est plus robuste que polyfit, et il y a un exemple sur leur page qui montre comment effectuer un ajustement linéaire simple qui devrait fournir les bases pour effectuer un ajustement polynomial du 2ème ordre.

 def model(p, v, x, w): a,b,c,d,e,f,g,h,i,j,k = p #coefficients to the polynomials return a*v**2 + b*x**2 + c*w**2 + d*v*x + e*v*w + f*x*w + g*v + h*x + i*y + k def residuals(p, data): # Function needed by fit routine v, x, w, z = data # The values for v, x, w and the measured hypersurface z a,b,c,d,e,f,g,h,i,j,k = p #coefficients to the polynomials return (z-model(p,v,x,w)) # Returns an array of residuals. #This should (z-model(p,v,x,w))/err if # there are error bars on the measured z values #initial guess at parameters. Avoid using 0.0 as initial guess par0 = [1.0, 1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0] #create a fitting object. data should be in the form #that the functions above are looking for, i.e. a Nx4 #list of lists/tuples like (v,x,w,z) fitobj = kmpfit.Fitter(residuals=residuals, data=data) # call the fitter fitobj.fit(params0=par0)

Le succès de ces opérations dépend étroitement des valeurs de départ de l'ajustement. Choisissez donc avec soin, si possible. Avec autant de paramètres libres, il pourrait être difficile de trouver une solution.

amos · Answer

sklearn en a un bon exemple en utilisant son pipeline ici . Voici l'essentiel de leur exemple:

polynomial_features = PolynomialFeatures(degree=degrees[i], include_bias=False) linear_regression = LinearRegression() pipeline = Pipeline([("polynomial_features", polynomial_features), ("linear_regression", linear_regression)]) pipeline.fit(X[:, np.newaxis], y)

Vous n'avez pas besoin de transformer vos données vous-même - transmettez-les simplement dans le pipeline.