web-dev-qa-db-fra.com

Score Tensorflow Precision / Recall / F1 et matrice Confusion

J'aimerais savoir s'il existe un moyen d'implémenter la fonction de partition différente du paquet scikit learn comme celui-ci:

from sklearn.metrics import confusion_matrix
confusion_matrix(y_true, y_pred)

dans un modèle tensorflow pour obtenir le score différent.

with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as sess:
init = tf.initialize_all_variables()
sess.run(init)
for Epoch in xrange(1):
        avg_cost = 0.
        total_batch = len(train_arrays) / batch_size
        for batch in range(total_batch):
                train_step.run(feed_dict = {x: train_arrays, y: train_labels})
                avg_cost += sess.run(cost, feed_dict={x: train_arrays, y: train_labels})/total_batch
        if Epoch % display_step == 0:
                print "Epoch:", '%04d' % (Epoch+1), "cost=", "{:.9f}".format(avg_cost)

print "Optimization Finished!"
correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
# Calculate accuracy
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
print "Accuracy:", batch, accuracy.eval({x: test_arrays, y: test_labels})

Devrai-je relancer la session pour obtenir la prédiction?

30
nicolasdavid

Peut-être que cet exemple va vous parler:

    pred = multilayer_perceptron(x, weights, biases)
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))

    with tf.Session() as sess:
    init = tf.initialize_all_variables()
    sess.run(init)
    for Epoch in xrange(150):
            for i in xrange(total_batch):
                    train_step.run(feed_dict = {x: train_arrays, y: train_labels})
                    avg_cost += sess.run(cost, feed_dict={x: train_arrays, y: train_labels})/total_batch         
            if Epoch % display_step == 0:
                    print "Epoch:", '%04d' % (Epoch+1), "cost=", "{:.9f}".format(avg_cost)

    #metrics
    y_p = tf.argmax(pred, 1)
    val_accuracy, y_pred = sess.run([accuracy, y_p], feed_dict={x:test_arrays, y:test_label})

    print "validation accuracy:", val_accuracy
    y_true = np.argmax(test_label,1)
    print "Precision", sk.metrics.precision_score(y_true, y_pred)
    print "Recall", sk.metrics.recall_score(y_true, y_pred)
    print "f1_score", sk.metrics.f1_score(y_true, y_pred)
    print "confusion_matrix"
    print sk.metrics.confusion_matrix(y_true, y_pred)
    fpr, tpr, tresholds = sk.metrics.roc_curve(y_true, y_pred)
32
nicolasdavid

Vous n'avez pas vraiment besoin de sklearn pour calculer le score précision/rappel/f1. Vous pouvez facilement les exprimer de manière TF-ish en regardant les formules:

enter image description here

Maintenant, si vous avez vos valeurs actual et predicted comme vecteurs de 0/1, vous pouvez calculer TP, TN, FP, FN en utilisant tf.count_nonzero :

TP = tf.count_nonzero(predicted * actual)
TN = tf.count_nonzero((predicted - 1) * (actual - 1))
FP = tf.count_nonzero(predicted * (actual - 1))
FN = tf.count_nonzero((predicted - 1) * actual)

Maintenant, vos métriques sont faciles à calculer:

precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)
46
Salvador Dali

Caisse multi-label

Les réponses précédentes ne précisent pas comment traiter le cas multi-libellé. Voici donc une telle version mettant en œuvre trois types de score f1 multi-libellés dans tensorflow : micro , macro et pondéré (selon scikit-learn)

Mise à jour (06/06/18): J'ai écrit un message de blog sur la façon de calculer le streaming le score f1 multilabel au cas où cela aiderait quelqu'un (c'est un processus plus long, ne voulez pas surcharger cette réponse)

f1s = [0, 0, 0]

y_true = tf.cast(y_true, tf.float64)
y_pred = tf.cast(y_pred, tf.float64)

for i, axis in enumerate([None, 0]):
    TP = tf.count_nonzero(y_pred * y_true, axis=axis)
    FP = tf.count_nonzero(y_pred * (y_true - 1), axis=axis)
    FN = tf.count_nonzero((y_pred - 1) * y_true, axis=axis)

    precision = TP / (TP + FP)
    recall = TP / (TP + FN)
    f1 = 2 * precision * recall / (precision + recall)

    f1s[i] = tf.reduce_mean(f1)

weights = tf.reduce_sum(y_true, axis=0)
weights /= tf.reduce_sum(weights)

f1s[2] = tf.reduce_sum(f1 * weights)

micro, macro, weighted = f1s

La justesse

def tf_f1_score(y_true, y_pred):
    """Computes 3 different f1 scores, micro macro
    weighted.
    micro: f1 score accross the classes, as 1
    macro: mean of f1 scores per class
    weighted: weighted average of f1 scores per class,
            weighted from the support of each class


    Args:
        y_true (Tensor): labels, with shape (batch, num_classes)
        y_pred (Tensor): model's predictions, same shape as y_true

    Returns:
        Tuple(Tensor): (micro, macro, weighted)
                    Tuple of the computed f1 scores
    """

    f1s = [0, 0, 0]

    y_true = tf.cast(y_true, tf.float64)
    y_pred = tf.cast(y_pred, tf.float64)

    for i, axis in enumerate([None, 0]):
        TP = tf.count_nonzero(y_pred * y_true, axis=axis)
        FP = tf.count_nonzero(y_pred * (y_true - 1), axis=axis)
        FN = tf.count_nonzero((y_pred - 1) * y_true, axis=axis)

        precision = TP / (TP + FP)
        recall = TP / (TP + FN)
        f1 = 2 * precision * recall / (precision + recall)

        f1s[i] = tf.reduce_mean(f1)

    weights = tf.reduce_sum(y_true, axis=0)
    weights /= tf.reduce_sum(weights)

    f1s[2] = tf.reduce_sum(f1 * weights)

    micro, macro, weighted = f1s
    return micro, macro, weighted


def compare(nb, dims):
    labels = (np.random.randn(nb, dims) > 0.5).astype(int)
    predictions = (np.random.randn(nb, dims) > 0.5).astype(int)

    stime = time()
    mic = f1_score(labels, predictions, average='micro')
    mac = f1_score(labels, predictions, average='macro')
    wei = f1_score(labels, predictions, average='weighted')

    print('sklearn in {:.4f}:\n    micro: {:.8f}\n    macro: {:.8f}\n    weighted: {:.8f}'.format(
        time() - stime, mic, mac, wei
    ))

    gtime = time()
    tf.reset_default_graph()
    y_true = tf.Variable(labels)
    y_pred = tf.Variable(predictions)
    micro, macro, weighted = tf_f1_score(y_true, y_pred)
    with tf.Session() as sess:
        tf.global_variables_initializer().run(session=sess)
        stime = time()
        mic, mac, wei = sess.run([micro, macro, weighted])
        print('tensorflow in {:.4f} ({:.4f} with graph time):\n    micro: {:.8f}\n    macro: {:.8f}\n    weighted: {:.8f}'.format(
            time() - stime, time()-gtime,  mic, mac, wei
        ))

compare(10 ** 6, 10)

renvoie :

>> rows: 10^6 dimensions: 10
sklearn in 2.3939:
    micro: 0.30890287
    macro: 0.30890275
    weighted: 0.30890279
tensorflow in 0.2465 (3.3246 with graph time):
    micro: 0.30890287
    macro: 0.30890275
    weighted: 0.30890279
6
ted

Depuis que je n'ai pas assez de réputation pour ajouter un commentaire à Salvador Dalis répondre c'est le chemin à parcourir:

tf.count_nonzero transforme vos valeurs en un tf.int64 sauf indication contraire. En utilisant:

argmax_prediction = tf.argmax(prediction, 1)
argmax_y = tf.argmax(y, 1)

TP = tf.count_nonzero(argmax_prediction * argmax_y, dtype=tf.float32)
TN = tf.count_nonzero((argmax_prediction - 1) * (argmax_y - 1), dtype=tf.float32)
FP = tf.count_nonzero(argmax_prediction * (argmax_y - 1), dtype=tf.float32)
FN = tf.count_nonzero((argmax_prediction - 1) * argmax_y, dtype=tf.float32)

c'est une très bonne idée.

3
Someone

Utilisez les API de métriques fournies dans tf.contrib.metrics, par exemple:

labels = ...
predictions = ...

accuracy, update_op_acc = tf.contrib.metrics.streaming_accuracy(labels, predictions)
error, update_op_error = tf.contrib.metrics.streaming_mean_absolute_error(labels, predictions)

sess.run(tf.local_variables_initializer())
for batch in range(num_batches):
  sess.run([update_op_acc, update_op_error])
accuracy, mean_absolute_error = sess.run([accuracy, mean_absolute_error])
3
Nandeesh