Deadlock in Postgres on a simple update query

Question

I am working with Postgres 9.1 and I get a deadlock exception when a simple update method is executed very frequently.

According to the logs, the deadlock occurs because two identical updates run at the same time.

    update public.vm_action_info set last_on_demand_task_id=$1, version=version+1

How can two identical simple updates deadlock each other?

The error I get in the log:

    2013-08-18 11:00:24 IDT HINT: See server log for query details.
    2013-08-18 11:00:24 IDT STATEMENT: update public.vm_action_info set last_on_demand_task_id=$1, version=version+1 where id=$2
    2013-08-18 11:00:25 IDT ERROR: deadlock detected
    2013-08-18 11:00:25 IDT DETAIL: Process 31533 waits for ShareLock on transaction 4228275; blocked by process 31530.
        Process 31530 waits for ExclusiveLock on Tuple (0,68) of relation 70337 of database 69205; blocked by process 31533.
        Process 31533: update public.vm_action_info set last_on_demand_task_id=$1, version=version+1 where id=$2
        Process 31530: update public.vm_action_info set last_on_demand_task_id=$1, version=version+1 where id=$2
    2013-08-18 11:00:25 IDT HINT: See server log for query details.
    2013-08-18 11:00:25 IDT STATEMENT: update public.vm_action_info set last_on_demand_task_id=$1, version=version+1 where id=$2
    2013-08-18 11:00:25 IDT ERROR: deadlock detected
    2013-08-18 11:00:25 IDT DETAIL: Process 31530 waits for ExclusiveLock on Tuple (0,68) of relation 70337 of database 69205; blocked by process 31876.
        Process 31876 waits for ShareLock on transaction 4228275; blocked by process 31530.
        Process 31530: update public.vm_action_info set last_on_demand_task_id=$1, version=version+1 where id=$2
        Process 31876: update public.vm_action_info set last_on_demand_task_id=$1, version=version+1 where id=$2

The schema is:

    CREATE TABLE vm_action_info (
        id integer NOT NULL,
        version integer NOT NULL DEFAULT 0,
        vm_info_id integer NOT NULL,
        last_exit_code integer,
        bundle_action_id integer NOT NULL,
        last_result_change_time numeric NOT NULL,
        last_completed_vm_task_id integer,
        last_on_demand_task_id bigint,
        CONSTRAINT vm_action_info_pkey PRIMARY KEY (id),
        CONSTRAINT vm_action_info_bundle_action_id_fk FOREIGN KEY (bundle_action_id)
            REFERENCES bundle_action (id) MATCH SIMPLE
            ON UPDATE NO ACTION ON DELETE CASCADE,
        CONSTRAINT vm_discovery_info_fk FOREIGN KEY (vm_info_id)
            REFERENCES vm_info (id) MATCH SIMPLE
            ON UPDATE NO ACTION ON DELETE CASCADE,
        CONSTRAINT vm_task_last_on_demand_task_fk FOREIGN KEY (last_on_demand_task_id)
            REFERENCES vm_task (id) MATCH SIMPLE
            ON UPDATE NO ACTION ON DELETE NO ACTION,
        CONSTRAINT vm_task_last_task_fk FOREIGN KEY (last_completed_vm_task_id)
            REFERENCES vm_task (id) MATCH SIMPLE
            ON UPDATE NO ACTION ON DELETE NO ACTION
    ) WITH (OIDS=FALSE);
    ALTER TABLE vm_action_info OWNER TO vadm;

    -- Index: vm_action_info_vm_info_id_index
    -- DROP INDEX vm_action_info_vm_info_id_index;
    CREATE INDEX vm_action_info_vm_info_id_index ON vm_action_info USING btree (vm_info_id);

    CREATE TABLE vm_task (
        id integer NOT NULL,
        version integer NOT NULL DEFAULT 0,
        vm_action_info_id integer NOT NULL,
        creation_time numeric NOT NULL DEFAULT 0,
        task_state text NOT NULL,
        triggered_by text NOT NULL,
        bundle_param_revision bigint NOT NULL DEFAULT 0,
        execution_time bigint,
        expiration_time bigint,
        username text,
        completion_time bigint,
        completion_status text,
        completion_error text,
        CONSTRAINT vm_task_pkey PRIMARY KEY (id),
        CONSTRAINT vm_action_info_fk FOREIGN KEY (vm_action_info_id)
            REFERENCES vm_action_info (id) MATCH SIMPLE
            ON UPDATE NO ACTION ON DELETE CASCADE
    ) WITH (OIDS=FALSE);
    ALTER TABLE vm_task OWNER TO vadm;

    -- Index: vm_task_creation_time_index
    -- DROP INDEX vm_task_creation_time_index;
    CREATE INDEX vm_task_creation_time_index ON vm_task USING btree (creation_time);

krokodilko · Answer

I suppose the source of the problem is a circular foreign key reference in your tables.

TABLE vm_action_info
==> FOREIGN KEY (last_completed_vm_task_id) REFERENCES vm_task (id)

TABLE vm_task
==> FOREIGN KEY (vm_action_info_id) REFERENCES vm_action_info (id)

The transaction consists of two steps:

1. add a new entry to the vm_task table
2. update the corresponding entry in vm_action_info to reference the new row in vm_task

When two transactions update the same record in the vm_action_info table at the same time, this ends in a deadlock.

Look at a simple test case:

    CREATE TABLE vm_task (
        id integer NOT NULL,
        version integer NOT NULL DEFAULT 0,
        vm_action_info_id integer NOT NULL,
        CONSTRAINT vm_task_pkey PRIMARY KEY (id)
    ) WITH (OIDS=FALSE);

    insert into vm_task values (0, 0, 0), (1, 1, 1), (2, 2, 2);

    CREATE TABLE vm_action_info (
        id integer NOT NULL,
        version integer NOT NULL DEFAULT 0,
        last_on_demand_task_id bigint,
        CONSTRAINT vm_action_info_pkey PRIMARY KEY (id)
    ) WITH (OIDS=FALSE);

    insert into vm_action_info values (0, 0, 0), (1, 1, 1), (2, 2, 2);

    alter table vm_task add CONSTRAINT vm_action_info_fk
        FOREIGN KEY (vm_action_info_id) REFERENCES vm_action_info (id) MATCH SIMPLE
        ON UPDATE NO ACTION ON DELETE CASCADE;

    alter table vm_action_info add CONSTRAINT vm_task_last_on_demand_task_fk
        FOREIGN KEY (last_on_demand_task_id) REFERENCES vm_task (id) MATCH SIMPLE
        ON UPDATE NO ACTION ON DELETE NO ACTION;

In session 1 we add a record to vm_task that references id = 2 in vm_action_info:

    session1=> begin;
    BEGIN
    session1=> insert into vm_task values (100, 0, 2);
    INSERT 0 1
    session1=>

At the same time, another transaction starts in session 2:

    session2=> begin;
    BEGIN
    session2=> insert into vm_task values (200, 0, 2);
    INSERT 0 1
    session2=>

Then the 1st transaction performs the update:

    session1=> update vm_action_info set last_on_demand_task_id=100, version=version+1 where id=2;

but this command hangs, waiting for a lock...

then the 2nd session performs its update...

    session2=> update vm_action_info set last_on_demand_task_id=200, version=version+1 where id=2;
    ERROR: deadlock detected
    DETAIL: Process 9384 waits for ExclusiveLock on tuple (0,5) of relation 33083 of database 16393; blocked by process 3808.
        Process 3808 waits for ShareLock on transaction 976; blocked by process 9384.
    HINT: See server log for query details.
    session2=>

Deadlock detected!!!

This happens because both INSERTs into vm_task place a shared lock on the row with id = 2 in the vm_action_info table, due to the foreign key reference. The first update then tries to place a write lock on this row and hangs, because the row is also locked by the other (second) transaction. Next, the second update tries to lock the same record in write mode, but it is locked in shared mode by the first transaction. Each transaction is now waiting for the other, and that is a deadlock.
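To watch this happen, one option is to query the pg_locks system view from a third session while session 1 is blocked on its UPDATE. The query below is only a sketch for the test case above: it assumes the test tables live in the current database, and the filter is deliberately broad. Row-level locks themselves are mostly not listed in pg_locks; what typically shows up is the waiting session's ungranted request (granted = f), often a ShareLock on the other session's transaction ID.

```sql
-- Run in a third session while session 1 is stuck on its UPDATE.
-- Lists table-level locks on vm_action_info plus all transaction-ID locks,
-- with ungranted (waiting) requests sorted first.
SELECT pid,
       locktype,
       relation::regclass AS relation,
       page,
       tuple,
       transactionid,
       mode,
       granted
FROM pg_locks
WHERE relation = 'vm_action_info'::regclass
   OR locktype = 'transactionid'
ORDER BY granted, pid;
```

The pid values can be matched against the process numbers that appear in the deadlock report.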

I think this can be avoided if you place a write lock on the record in vm_action_info first. The entire transaction would then consist of 5 steps:

    begin;
    select * from vm_action_info where id=2 for update;
    insert into vm_task values (100, 0, 2);
    update vm_action_info set last_on_demand_task_id=100, version=version+1 where id=2;
    commit;

Richard Huxton · Answer

It may just be that your system is exceptionally busy. You say you have only seen this with "excessive execution" of the query.

The situation appears to be this:

    pid=31530 wants to lock Tuple (0,68) on rel 70337 (vm_action_info I suspect) for update
        it is waiting behind pid=31533, pid=31876
    pid=31533 is waiting behind transaction 4228275
    pid=31876 is waiting behind transaction 4228275
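The guess that relation 70337 is vm_action_info can be checked directly. The queries below are a sketch that uses the OIDs and tuple address from the log in the question; they assume you are connected to the database the log refers to (OID 69205), and the ctid lookup may return nothing or a different row version, because an updated row moves to a new physical location.

```sql
-- Which database does OID 69205 refer to?
SELECT datname FROM pg_database WHERE oid = 69205;

-- Which relation does OID 70337 refer to? (run inside that database)
SELECT 70337::regclass AS relation_name;

-- Which row currently sits at tuple address (0,68)?
-- Only meaningful if the row has not been rewritten since the log entry.
SELECT ctid, id, version, last_on_demand_task_id
FROM vm_action_info
WHERE ctid = '(0,68)';
```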

So we have what looks like four transactions all updating this row at the same time. Transaction 4228275 has not committed or rolled back yet and is holding up the others. Two of them have been waiting for at least deadlock_timeout, otherwise we would not see the timeout. The timeout expires, the deadlock detector takes a look, sees a bunch of intertwined transactions and cancels one of them. It might not strictly be a deadlock, but I'm not sure the detector is smart enough to figure that out.

Try one of:

1. Reduce the rate of updates
2. Get a faster server
3. Increase deadlock_timeout

Probably #3 is the easiest :-) You could also set log_lock_waits so you can see if/when your system is under this kind of strain.
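As a rough sketch of option 3 and the logging suggestion, both settings can be tried out in a single session before changing them server-wide. The value 5s is an arbitrary example, not a recommendation, and both parameters can only be changed by a superuser. To make the change permanent you would set the same parameters in postgresql.conf and reload the server.

```sql
-- Show the current value (the default is 1s).
SHOW deadlock_timeout;

-- Wait longer before the deadlock check runs (example value only).
SET deadlock_timeout = '5s';

-- Log a message whenever a session waits longer than deadlock_timeout for a lock.
SET log_lock_waits = on;
```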