How to run a Celery worker with a scalable Django application on AWS Elastic Beanstalk

Question

How do I use Django with Elastic Beanstalk and also run Celery tasks on the leader node only?

smentek · Accepted Answer

This is how I set up Celery with Django on Elastic Beanstalk, with scaling working fine.

Keep in mind that the 'leader_only' option for container_commands takes effect only on an environment rebuild or an application deployment. If the service runs long enough, the leader node may be removed by Elastic Beanstalk. To deal with that, you may have to apply instance protection to your leader node. See: http://docs.aws.Amazon.com/autoscaling/latest/userguide/as-instance-termination.html#instance-protection-instance
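
One way to apply that protection is through the Auto Scaling API. The sketch below is only an illustration and not part of the original setup: it assumes boto3 is available on the machine that runs it, and the group name and instance ID in the usage comment are placeholders. The client is passed in as a parameter so the function can be tried without AWS credentials.

```python
def protect_leader(autoscaling_client, group_name, instance_id):
    """Ask Auto Scaling not to terminate one instance on scale-in.

    `autoscaling_client` is expected to behave like boto3.client("autoscaling").
    """
    return autoscaling_client.set_instance_protection(
        InstanceIds=[instance_id],
        AutoScalingGroupName=group_name,
        ProtectedFromScaleIn=True,
    )


# Usage (assumption: boto3 installed, credentials configured, names are placeholders):
#   import boto3
#   protect_leader(boto3.client("autoscaling"),
#                  "my-eb-auto-scaling-group", "i-0123456789abcdef0")
```

Scale-in protection only covers terminations decided by the Auto Scaling group, so it is a mitigation rather than a guarantee that the leader survives.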

Add a bash script for the Celery worker and beat configuration.

Add the file root_folder/.ebextensions/files/celery_configuration.txt:

    #!/usr/bin/env bash

    # Get Django environment variables
    celeryenv=`cat /opt/python/current/env | tr '\n' ',' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g' | sed 's/%/%%/g'`
    celeryenv=${celeryenv%?}

    # Create celery configuration script
    celeryconf="[program:celeryd-worker]
    ; Set full path to celery program if using virtualenv
    command=/opt/python/run/venv/bin/celery worker -A django_app --loglevel=INFO

    directory=/opt/python/current/app
    user=nobody
    numprocs=1
    stdout_logfile=/var/log/celery-worker.log
    stderr_logfile=/var/log/celery-worker.log
    autostart=true
    autorestart=true
    startsecs=10

    ; Need to wait for currently executing tasks to finish at shutdown.
    ; Increase this if you have very long running tasks.
    stopwaitsecs = 600

    ; When resorting to send SIGKILL to the program to terminate it
    ; send SIGKILL to its whole process group instead,
    ; taking care of its children as well.
    killasgroup=true

    ; if rabbitmq is supervised, set its priority higher
    ; so it starts first
    priority=998

    environment=$celeryenv

    [program:celeryd-beat]
    ; Set full path to celery program if using virtualenv
    command=/opt/python/run/venv/bin/celery beat -A django_app --loglevel=INFO --workdir=/tmp -S django --pidfile /tmp/celerybeat.pid

    directory=/opt/python/current/app
    user=nobody
    numprocs=1
    stdout_logfile=/var/log/celery-beat.log
    stderr_logfile=/var/log/celery-beat.log
    autostart=true
    autorestart=true
    startsecs=10

    ; Need to wait for currently executing tasks to finish at shutdown.
    ; Increase this if you have very long running tasks.
    stopwaitsecs = 600

    ; When resorting to send SIGKILL to the program to terminate it
    ; send SIGKILL to its whole process group instead,
    ; taking care of its children as well.
    killasgroup=true

    ; if rabbitmq is supervised, set its priority higher
    ; so it starts first
    priority=998

    environment=$celeryenv"

    # Create the celery supervisord conf script
    echo "$celeryconf" | tee /opt/python/etc/celery.conf

    # Add configuration script to supervisord conf (if not there already)
    if ! grep -Fxq "[include]" /opt/python/etc/supervisord.conf
      then
      echo "[include]" | tee -a /opt/python/etc/supervisord.conf
      echo "files: celery.conf" | tee -a /opt/python/etc/supervisord.conf
    fi

    # Reread the supervisord config
    supervisorctl -c /opt/python/etc/supervisord.conf reread

    # Update supervisord in cache without restarting all services
    supervisorctl -c /opt/python/etc/supervisord.conf update

    # Start/Restart celeryd through supervisord
    supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd-beat
    supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd-worker
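
The first pipeline in that script is the hardest part to read, so here is a rough Python rendering of what it does to the contents of /opt/python/current/env. It is only an explanatory sketch: the function name is made up, and the sample input in the usage note is an assumption about what the env file looks like.

```python
def supervisor_environment(env_text):
    """Mimic the tr/sed pipeline that builds the supervisord `environment=` value."""
    joined = env_text.replace("\n", ",")               # tr '\n' ','
    joined = joined.replace("export ", "")             # sed 's/export //g'
    joined = joined.replace("$PATH", "%(ENV_PATH)s")   # sed 's/$PATH/%(ENV_PATH)s/g'
    joined = joined.replace("$PYTHONPATH", "")         # sed 's/$PYTHONPATH//g'
    joined = joined.replace("$LD_LIBRARY_PATH", "")    # sed 's/$LD_LIBRARY_PATH//g'
    joined = joined.replace("%", "%%")                 # sed 's/%/%%/g'
    return joined[:-1]                                 # ${celeryenv%?} drops the last character


sample = 'export APP_ENV="prod"\nexport PATH="/opt/bin:$PATH"\n'
print(supervisor_environment(sample))
```

The result is a single comma-separated line of KEY="value" pairs, which is the shape supervisord expects for environment=, with the trailing comma removed by the final step.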

Make sure the script runs during deployment, but only on the leader node (leader_only: true). Add the file root_folder/.ebextensions/02-python.config:

    container_commands:
      04_celery_tasks:
        command: "cat .ebextensions/files/celery_configuration.txt > /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh && chmod 744 /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh"
        leader_only: true
      05_celery_tasks_run:
        command: "/opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh"
        leader_only: true

Beat can be configured without redeployment by using a separate Django application: https://pypi.python.org/pypi/Django_celery_beat .
Storing task results is also a good idea (the requirements file below uses django_celery_results for this): https://pypi.python.org/pypi/Django_celery_beat

requirements.txt file

    celery==4.0.0
    django_celery_beat==1.0.1
    django_celery_results==1.0.1
    pycurl==7.43.0 --global-option="--with-nss"

Configure Celery for the Amazon SQS broker (get the desired endpoint from the list: http://docs.aws.Amazon.com/general/latest/gr/rande.html ) in root_folder/django_app/settings.py:

    ...
    CELERY_RESULT_BACKEND = 'django-db'
    CELERY_BROKER_URL = 'sqs://%s:%s@' % (aws_access_key_id, aws_secret_access_key)
    # Due to error on lib region N Virginia is used temporarily. please set it on Ireland "eu-west-1" after fix.
    CELERY_BROKER_TRANSPORT_OPTIONS = {
        "region": "eu-west-1",
        'queue_name_prefix': 'django_app-%s-' % os.environ.get('APP_ENV', 'dev'),
        'visibility_timeout': 360,
        'polling_interval': 1
    }
    ...
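
One thing to watch with the broker URL: AWS secret keys can contain characters such as "/" or "+", which are not safe to drop into a URL unescaped. The following is a minimal sketch of one way to build the two values, using only the standard library. The helper names are hypothetical and not part of the answer's settings file.

```python
import os
from urllib.parse import quote


def sqs_broker_url(access_key_id, secret_access_key):
    """Build an sqs:// broker URL with both credentials percent-encoded."""
    return "sqs://%s:%s@" % (
        quote(access_key_id, safe=""),
        quote(secret_access_key, safe=""),
    )


def queue_name_prefix(app_name, environ=os.environ):
    """Prefix queues per environment, defaulting to 'dev' as in the settings above."""
    return "%s-%s-" % (app_name, environ.get("APP_ENV", "dev"))


print(sqs_broker_url("AKIAEXAMPLE", "ab/cd+ef"))
print(queue_name_prefix("django_app", {"APP_ENV": "prod"}))
```

With a per-environment prefix, instances of different environments that share one AWS account do not consume each other's queues.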

Celery configuration for the Django app django_app

Add the file root_folder/django_app/celery.py:

    from __future__ import absolute_import, unicode_literals
    import os
    from celery import Celery

    # set the default Django settings module for the 'celery' program.
    os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'django_app.settings')

    app = Celery('django_app')

    # Using a string here means the worker doesn't have to serialize
    # the configuration object to child processes.
    # - namespace='CELERY' means all celery-related configuration keys
    #   should have a `CELERY_` prefix.
    app.config_from_object('django.conf:settings', namespace='CELERY')

    # Load task modules from all registered Django app configs.
    app.autodiscover_tasks()

Edit the file root_folder/django_app/__init__.py:

    from __future__ import absolute_import, unicode_literals

    # This will make sure the app is always imported when
    # Django starts so that shared_task will use this app.
    from django_app.celery import app as celery_app

    __all__ = ['celery_app']

See also:

How do you run a worker with AWS Elastic Beanstalk? (solution without scaling)
Pip Requirements.txt --global-option causing installation errors with other packages. "option not recognized" (solution to problems caused by the outdated pip on Elastic Beanstalk, which cannot handle the global options needed to resolve the pycurl dependency properly)

Chris Berry · Answer

This is how I extended @smentek's answer to allow multiple worker instances and a single beat instance. The same caveat applies: you have to protect your leader. (I still don't have an automated solution for that.)

Please note that environment variable updates made to EB through the EB CLI or the web interface are not picked up by Celery beat or the workers until the app server has been restarted. This caught me off guard once.

A single celery_configuration.sh file generates two config files for supervisord. Note that celery-beat has autostart=false, otherwise you end up with several beats after an instance restart:

    # get Django environment variables
    celeryenv=`cat /opt/python/current/env | tr '\n' ',' | sed 's/export //g' | sed 's/$PATH/%(ENV_PATH)s/g' | sed 's/$PYTHONPATH//g' | sed 's/$LD_LIBRARY_PATH//g' | sed 's/%/%%/g'`
    celeryenv=${celeryenv%?}

    # create celery beat config script
    celerybeatconf="[program:celeryd-beat]
    ; Set full path to celery program if using virtualenv
    command=/opt/python/run/venv/bin/celery beat -A lexvoco --loglevel=INFO --workdir=/tmp -S django --pidfile /tmp/celerybeat.pid

    directory=/opt/python/current/app
    user=nobody
    numprocs=1
    stdout_logfile=/var/log/celery-beat.log
    stderr_logfile=/var/log/celery-beat.log
    autostart=false
    autorestart=true
    startsecs=10

    ; Need to wait for currently executing tasks to finish at shutdown.
    ; Increase this if you have very long running tasks.
    stopwaitsecs = 10

    ; When resorting to send SIGKILL to the program to terminate it
    ; send SIGKILL to its whole process group instead,
    ; taking care of its children as well.
    killasgroup=true

    ; if rabbitmq is supervised, set its priority higher
    ; so it starts first
    priority=998

    environment=$celeryenv"

    # create celery worker config script
    celeryworkerconf="[program:celeryd-worker]
    ; Set full path to celery program if using virtualenv
    command=/opt/python/run/venv/bin/celery worker -A lexvoco --loglevel=INFO

    directory=/opt/python/current/app
    user=nobody
    numprocs=1
    stdout_logfile=/var/log/celery-worker.log
    stderr_logfile=/var/log/celery-worker.log
    autostart=true
    autorestart=true
    startsecs=10

    ; Need to wait for currently executing tasks to finish at shutdown.
    ; Increase this if you have very long running tasks.
    stopwaitsecs = 600

    ; When resorting to send SIGKILL to the program to terminate it
    ; send SIGKILL to its whole process group instead,
    ; taking care of its children as well.
    killasgroup=true

    ; if rabbitmq is supervised, set its priority higher
    ; so it starts first
    priority=999

    environment=$celeryenv"

    # create files for the scripts
    echo "$celerybeatconf" | tee /opt/python/etc/celerybeat.conf
    echo "$celeryworkerconf" | tee /opt/python/etc/celeryworker.conf

    # add configuration script to supervisord conf (if not there already)
    if ! grep -Fxq "[include]" /opt/python/etc/supervisord.conf
      then
      echo "[include]" | tee -a /opt/python/etc/supervisord.conf
      echo "files: celerybeat.conf celeryworker.conf" | tee -a /opt/python/etc/supervisord.conf
    fi

    # reread the supervisord config
    /usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf reread

    # update supervisord in cache without restarting all services
    /usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf update
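
To make the key difference between the two generated sections easier to see, here is a small Python sketch that builds a trimmed-down version of both with configparser. It is an illustration only, not a replacement for the script: the function name is made up, and most of the options from the real config are left out on purpose.

```python
import configparser
import io


def build_supervisor_sections(app_name, celery_bin="/opt/python/run/venv/bin/celery"):
    """Return supervisord-style INI text with a beat and a worker program."""
    # interpolation=None keeps any '%' characters in values untouched.
    config = configparser.ConfigParser(interpolation=None)
    config["program:celeryd-beat"] = {
        "command": "%s beat -A %s --loglevel=INFO" % (celery_bin, app_name),
        # beat is never started automatically; only the leader starts it on deploy
        "autostart": "false",
        "autorestart": "true",
    }
    config["program:celeryd-worker"] = {
        "command": "%s worker -A %s --loglevel=INFO" % (celery_bin, app_name),
        # every instance runs a worker
        "autostart": "true",
        "autorestart": "true",
    }
    buffer = io.StringIO()
    config.write(buffer, space_around_delimiters=False)
    return buffer.getvalue()


print(build_supervisor_sections("lexvoco"))
```

Workers can safely run on every instance because they only consume from the queue, while more than one beat would schedule each periodic task several times.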

Then, in container_commands, we restart beat only on the leader:

    container_commands:
      # create the celery configuration file
      01_create_celery_beat_configuration_file:
        command: "cat .ebextensions/files/celery_configuration.sh > /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh && chmod 744 /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh && sed -i 's/\r$//' /opt/elasticbeanstalk/hooks/appdeploy/post/run_supervised_celeryd.sh"
      # restart celery beat if leader
      02_start_celery_beat:
        command: "/usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd-beat"
        leader_only: true
      # restart celery worker
      03_start_celery_worker:
        command: "/usr/local/bin/supervisorctl -c /opt/python/etc/supervisord.conf restart celeryd-worker"

jaume · Answer

If anyone is following smentek's answer and getting the error:

05_celery_tasks_run: /usr/bin/env bash does not exist.

be aware that, if you are using Windows, your problem may be that the "celery_configuration.txt" file has Windows EOL when it should have UNIX EOL. If you are using Notepad++, open the file and click "Edit > EOL Conversion > Unix (LF)". Save, redeploy, and the error is gone.
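
If you would rather not depend on an editor, the same conversion can be done with a few lines of Python before deploying. This is just a sketch using the standard library. The function name is made up, and it assumes the file is small enough to read into memory.

```python
from pathlib import Path


def convert_to_unix_eol(path):
    """Rewrite the file with LF line endings; return True if anything changed."""
    file_path = Path(path)
    original = file_path.read_bytes()
    converted = original.replace(b"\r\n", b"\n")
    if converted != original:
        file_path.write_bytes(converted)
    return converted != original


# Usage (path taken from smentek's answer):
#   convert_to_unix_eol(".ebextensions/files/celery_configuration.txt")
```

Working on bytes avoids any newline translation by Python itself, so the file content is otherwise left untouched.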

Also, a couple of warnings for amateurs like me:

Make sure you include "django_celery_beat" and "django_celery_results" in "INSTALLED_APPS" in your settings.py file.
To check for Celery errors, connect to your instance with "eb ssh", then run "tail -n 40 /var/log/celery-worker.log" and "tail -n 40 /var/log/celery-beat.log" (where "40" is the number of lines you want to read from the end of the file).

I hope this helps someone. It would have saved me a few hours!