Score Tensorflow Precision / Recall / F1 et matrice Confusion

Question

J'aimerais savoir s'il existe un moyen d'implémenter la fonction de partition différente du paquet scikit learn comme celui-ci:

from sklearn.metrics import confusion_matrix confusion_matrix(y_true, y_pred)

dans un modèle tensorflow pour obtenir le score différent.

with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess: init = tf.initialize_all_variables() sess.run(init) for Epoch in xrange(1): avg_cost = 0. total_batch = len(train_arrays) / batch_size for batch in range(total_batch): train_step.run(feed_dict = {x: train_arrays, y: train_labels}) avg_cost += sess.run(cost, feed_dict={x: train_arrays, y: train_labels})/total_batch if Epoch % display_step == 0: print "Epoch:", '%04d' % (Epoch+1), "cost=", "{:.9f}".format(avg_cost) print "Optimization Finished!" correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1)) # Calculate accuracy accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float")) print "Accuracy:", batch, accuracy.eval({x: test_arrays, y: test_labels})

Devrai-je relancer la session pour obtenir la prédiction?

nicolasdavid · Accepted Answer

Peut-être que cet exemple va vous parler:

 pred = multilayer_perceptron(x, weights, biases) correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1)) accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float")) with tf.Session() as sess: init = tf.initialize_all_variables() sess.run(init) for Epoch in xrange(150): for i in xrange(total_batch): train_step.run(feed_dict = {x: train_arrays, y: train_labels}) avg_cost += sess.run(cost, feed_dict={x: train_arrays, y: train_labels})/total_batch if Epoch % display_step == 0: print "Epoch:", '%04d' % (Epoch+1), "cost=", "{:.9f}".format(avg_cost) #metrics y_p = tf.argmax(pred, 1) val_accuracy, y_pred = sess.run([accuracy, y_p], feed_dict={x:test_arrays, y:test_label}) print "validation accuracy:", val_accuracy y_true = np.argmax(test_label,1) print "Precision", sk.metrics.precision_score(y_true, y_pred) print "Recall", sk.metrics.recall_score(y_true, y_pred) print "f1_score", sk.metrics.f1_score(y_true, y_pred) print "confusion_matrix" print sk.metrics.confusion_matrix(y_true, y_pred) fpr, tpr, tresholds = sk.metrics.roc_curve(y_true, y_pred)

Salvador Dali · Answer

Vous n'avez pas vraiment besoin de sklearn pour calculer le score précision/rappel/f1. Vous pouvez facilement les exprimer de manière TF-ish en regardant les formules:

Maintenant, si vous avez vos valeurs actual et predicted comme vecteurs de 0/1, vous pouvez calculer TP, TN, FP, FN en utilisant tf.count_nonzero :

TP = tf.count_nonzero(predicted * actual) TN = tf.count_nonzero((predicted - 1) * (actual - 1)) FP = tf.count_nonzero(predicted * (actual - 1)) FN = tf.count_nonzero((predicted - 1) * actual)

Maintenant, vos métriques sont faciles à calculer:

precision = TP / (TP + FP) recall = TP / (TP + FN) f1 = 2 * precision * recall / (precision + recall)

ted · Answer

Caisse multi-label

Les réponses précédentes ne précisent pas comment traiter le cas multi-libellé. Voici donc une telle version mettant en œuvre trois types de score f1 multi-libellés dans tensorflow : micro , macro et pondéré (selon scikit-learn)

Mise à jour (06/06/18): J'ai écrit un message de blog sur la façon de calculer le streaming le score f1 multilabel au cas où cela aiderait quelqu'un (c'est un processus plus long, ne voulez pas surcharger cette réponse)

f1s = [0, 0, 0] y_true = tf.cast(y_true, tf.float64) y_pred = tf.cast(y_pred, tf.float64) for i, axis in enumerate([None, 0]): TP = tf.count_nonzero(y_pred * y_true, axis=axis) FP = tf.count_nonzero(y_pred * (y_true - 1), axis=axis) FN = tf.count_nonzero((y_pred - 1) * y_true, axis=axis) precision = TP / (TP + FP) recall = TP / (TP + FN) f1 = 2 * precision * recall / (precision + recall) f1s[i] = tf.reduce_mean(f1) weights = tf.reduce_sum(y_true, axis=0) weights /= tf.reduce_sum(weights) f1s[2] = tf.reduce_sum(f1 * weights) micro, macro, weighted = f1s

La justesse

def tf_f1_score(y_true, y_pred): """Computes 3 different f1 scores, micro macro weighted. micro: f1 score accross the classes, as 1 macro: mean of f1 scores per class weighted: weighted average of f1 scores per class, weighted from the support of each class Args: y_true (Tensor): labels, with shape (batch, num_classes) y_pred (Tensor): model's predictions, same shape as y_true Returns: Tuple(Tensor): (micro, macro, weighted) Tuple of the computed f1 scores """ f1s = [0, 0, 0] y_true = tf.cast(y_true, tf.float64) y_pred = tf.cast(y_pred, tf.float64) for i, axis in enumerate([None, 0]): TP = tf.count_nonzero(y_pred * y_true, axis=axis) FP = tf.count_nonzero(y_pred * (y_true - 1), axis=axis) FN = tf.count_nonzero((y_pred - 1) * y_true, axis=axis) precision = TP / (TP + FP) recall = TP / (TP + FN) f1 = 2 * precision * recall / (precision + recall) f1s[i] = tf.reduce_mean(f1) weights = tf.reduce_sum(y_true, axis=0) weights /= tf.reduce_sum(weights) f1s[2] = tf.reduce_sum(f1 * weights) micro, macro, weighted = f1s return micro, macro, weighted def compare(nb, dims): labels = (np.random.randn(nb, dims) > 0.5).astype(int) predictions = (np.random.randn(nb, dims) > 0.5).astype(int) stime = time() mic = f1_score(labels, predictions, average='micro') mac = f1_score(labels, predictions, average='macro') wei = f1_score(labels, predictions, average='weighted') print('sklearn in {:.4f}:
 micro: {:.8f}
 macro: {:.8f}
 weighted: {:.8f}'.format( time() - stime, mic, mac, wei )) gtime = time() tf.reset_default_graph() y_true = tf.Variable(labels) y_pred = tf.Variable(predictions) micro, macro, weighted = tf_f1_score(y_true, y_pred) with tf.Session() as sess: tf.global_variables_initializer().run(session=sess) stime = time() mic, mac, wei = sess.run([micro, macro, weighted]) print('tensorflow in {:.4f} ({:.4f} with graph time):
 micro: {:.8f}
 macro: {:.8f}
 weighted: {:.8f}'.format( time() - stime, time()-gtime, mic, mac, wei )) compare(10 ** 6, 10)

renvoie :

>> rows: 10^6 dimensions: 10 sklearn in 2.3939: micro: 0.30890287 macro: 0.30890275 weighted: 0.30890279 tensorflow in 0.2465 (3.3246 with graph time): micro: 0.30890287 macro: 0.30890275 weighted: 0.30890279

Someone · Answer

Depuis que je n'ai pas assez de réputation pour ajouter un commentaire à Salvador Dalis répondre c'est le chemin à parcourir:

tf.count_nonzero transforme vos valeurs en un tf.int64 sauf indication contraire. En utilisant:

argmax_prediction = tf.argmax(prediction, 1) argmax_y = tf.argmax(y, 1) TP = tf.count_nonzero(argmax_prediction * argmax_y, dtype=tf.float32) TN = tf.count_nonzero((argmax_prediction - 1) * (argmax_y - 1), dtype=tf.float32) FP = tf.count_nonzero(argmax_prediction * (argmax_y - 1), dtype=tf.float32) FN = tf.count_nonzero((argmax_prediction - 1) * argmax_y, dtype=tf.float32)

c'est une très bonne idée.

Nandeesh · Answer

Utilisez les API de métriques fournies dans tf.contrib.metrics, par exemple:

labels = ... predictions = ... accuracy, update_op_acc = tf.contrib.metrics.streaming_accuracy(labels, predictions) error, update_op_error = tf.contrib.metrics.streaming_mean_absolute_error(labels, predictions) sess.run(tf.local_variables_initializer()) for batch in range(num_batches): sess.run([update_op_acc, update_op_error]) accuracy, mean_absolute_error = sess.run([accuracy, mean_absolute_error])