Generating a Markov transition matrix in Python

Question

Imagine I have a sequence over 4 possible Markov states (A, B, C, D):

X = [A, B, B, C, B, A, D, D, A, B, A, D, ....]

How can I generate a Markov transition matrix using Python? The matrix must be 4 by 4, showing the probability of moving from each state to the other 3 states. I have looked at many examples online, but in all of them the matrix is given rather than computed from data. I also looked into hmmlearn, but nowhere did I find how to make it output the transition matrix. Is there a library I can use for this purpose?

Here is R code for the exact thing I am trying to do in Python: https://stats.stackexchange.com/questions/26722/calculate-transition-matrix-markov-in-r

John Coleman · Accepted Answer

This might give you some ideas:

    transitions = ['A', 'B', 'B', 'C', 'B', 'A', 'D', 'D', 'A', 'B', 'A', 'D']

    def rank(c):
        # map 'A' -> 0, 'B' -> 1, 'C' -> 2, 'D' -> 3
        return ord(c) - ord('A')

    T = [rank(c) for c in transitions]

    # create matrix of zeros
    M = [[0]*4 for _ in range(4)]

    # count each observed transition i -> j
    for (i, j) in zip(T, T[1:]):
        M[i][j] += 1

    # now convert to probabilities:
    for row in M:
        n = sum(row)
        if n > 0:
            row[:] = [f/n for f in row]

    # print M:
    for row in M:
        print(row)

Output:

    [0.0, 0.5, 0.0, 0.5]
    [0.5, 0.25, 0.25, 0.0]
    [0.0, 1.0, 0.0, 0.0]
    [0.5, 0.0, 0.0, 0.5]
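The `rank` trick above only works for single-character labels in alphabetical order starting at 'A'. As a rough sketch of the same counting idea for arbitrary hashable labels, the states can be collected from the data itself. The function name `transition_matrix_from_labels` and the choice to order states with `sorted` are assumptions made for illustration, not part of the original answer:

```python
from collections import Counter

def transition_matrix_from_labels(sequence):
    """Hypothetical helper: estimate transition probabilities for arbitrary labels."""
    # Assumption: states are ordered by sorting the distinct labels.
    states = sorted(set(sequence))
    # Count every consecutive pair (source, destination) in the sequence.
    counts = Counter(zip(sequence, sequence[1:]))
    matrix = []
    for src in states:
        total = sum(counts[(src, dst)] for dst in states)
        if total:
            row = [counts[(src, dst)] / total for dst in states]
        else:
            # No outgoing transition was observed for this state.
            row = [0.0] * len(states)
        matrix.append(row)
    return states, matrix

states, M = transition_matrix_from_labels(
    ['A', 'B', 'B', 'C', 'B', 'A', 'D', 'D', 'A', 'B', 'A', 'D'])
for state, row in zip(states, M):
    print(state, row)
```

Returning the list of states alongside the matrix keeps track of which row and column belongs to which label.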

On Edit: Here is a function that implements the ideas above:

    # the following code takes a list such as
    # [1,1,2,6,8,5,5,7,8,8,1,1,4,5,5,0,0,0,1,1,4,4,5,1,3,3,4,5,4,1,1]
    # with states labeled as successive integers starting with 0
    # and returns a transition matrix, M,
    # where M[i][j] is the probability of transitioning from i to j

    def transition_matrix(transitions):
        n = 1 + max(transitions)  # number of states

        M = [[0]*n for _ in range(n)]

        for (i, j) in zip(transitions, transitions[1:]):
            M[i][j] += 1

        # now convert to probabilities:
        for row in M:
            s = sum(row)
            if s > 0:
                row[:] = [f/s for f in row]
        return M

    # test:
    t = [1,1,2,6,8,5,5,7,8,8,1,1,4,5,5,0,0,0,1,1,4,4,5,1,3,3,4,5,4,1,1]
    m = transition_matrix(t)
    for row in m:
        print(' '.join('{0:.2f}'.format(x) for x in row))

Output:

    0.67 0.33 0.00 0.00 0.00 0.00 0.00 0.00 0.00
    0.00 0.50 0.12 0.12 0.25 0.00 0.00 0.00 0.00
    0.00 0.00 0.00 0.00 0.00 0.00 1.00 0.00 0.00
    0.00 0.00 0.00 0.50 0.50 0.00 0.00 0.00 0.00
    0.00 0.20 0.00 0.00 0.20 0.60 0.00 0.00 0.00
    0.17 0.17 0.00 0.00 0.17 0.33 0.00 0.17 0.00
    0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00
    0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 1.00
    0.00 0.33 0.00 0.00 0.00 0.33 0.00 0.00 0.33
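Since M[i][j] is meant to be a probability, each row should sum to 1, except for rows left at all zeros because that state was never observed with a successor. A minimal sketch of such a sanity check follows. The helper name `check_rows` is hypothetical, and `transition_matrix` is repeated only so the snippet runs on its own:

```python
import math

def transition_matrix(transitions):
    # Same counting-and-normalizing approach as the function above.
    n = 1 + max(transitions)
    M = [[0] * n for _ in range(n)]
    for i, j in zip(transitions, transitions[1:]):
        M[i][j] += 1
    for row in M:
        s = sum(row)
        if s > 0:
            row[:] = [f / s for f in row]
    return M

def check_rows(M):
    """Hypothetical check: return indices of states with no observed outgoing transition."""
    empty = []
    for i, row in enumerate(M):
        total = sum(row)
        if total == 0:
            empty.append(i)
        elif not math.isclose(total, 1.0):
            raise ValueError('row {} sums to {}, expected 1'.format(i, total))
    return empty

t = [1,1,2,6,8,5,5,7,8,8,1,1,4,5,5,0,0,0,1,1,4,4,5,1,3,3,4,5,4,1,1]
print(check_rows(transition_matrix(t)))

# States 2 and 3 never have a successor in this short sequence:
print(check_rows(transition_matrix([0, 1, 3])))
```

An all-zero row is not a valid probability distribution, so how to treat such states (drop them, or make them absorbing, for example) is a modeling decision the function above leaves open.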