No DataNode is started

Question

I am trying to set up Hadoop version 0.20.203.0 in a pseudo-distributed configuration using the following guide:

http://www.javacodegeeks.com/2012/01/hadoop-modes-explained-standalone.html

After running the start-all.sh script, I run "jps".

I get this output:

4825 NameNode
5391 TaskTracker
5242 JobTracker
5477 Jps
5140 SecondaryNameNode

When I try to add information to HDFS using:

bin/hadoop fs -put conf input

I get an error:

hadoop@m1a2:~/software/hadoop$ bin/hadoop fs -put conf input
12/04/10 18:15:31 WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hadoop/input/core-site.xml could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1417)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:596)
    at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1383)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1379)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1377)
    at org.apache.hadoop.ipc.Client.call(Client.java:1030)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
    at $Proxy1.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy1.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3104)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2975)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2255)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2446)
12/04/10 18:15:31 WARN hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
12/04/10 18:15:31 WARN hdfs.DFSClient: Could not get block locations. Source file "/user/hadoop/input/core-site.xml" - Aborting...
put: java.io.IOException: File /user/hadoop/input/core-site.xml could only be replicated to 0 nodes, instead of 1
12/04/10 18:15:31 ERROR hdfs.DFSClient: Exception closing file /user/hadoop/input/core-site.xml : org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hadoop/input/core-site.xml could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1417)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:596)
    at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1383)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1379)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1377)
org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hadoop/input/core-site.xml could only be replicated to 0 nodes, instead of 1
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1417)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.addBlock(NameNode.java:596)
    at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:523)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1383)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1379)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1377)
    at org.apache.hadoop.ipc.Client.call(Client.java:1030)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:224)
    at $Proxy1.addBlock(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
    at $Proxy1.addBlock(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:3104)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2975)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2000(DFSClient.java:2255)
    at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2446)

I am not entirely sure, but I think this may be related to the fact that the DataNode is not running.
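The jps listing can be checked mechanically for a missing daemon. The following is a minimal sketch, assuming the five daemons that start-all.sh normally launches on a pseudo-distributed 0.20.x node; the names EXPECTED_DAEMONS and missing_daemons are hypothetical helpers, not part of Hadoop.

```python
# Daemons expected on a pseudo-distributed Hadoop 0.20.x node
# (assumption: the usual five launched by start-all.sh).
EXPECTED_DAEMONS = {
    "NameNode", "SecondaryNameNode", "DataNode", "JobTracker", "TaskTracker",
}


def missing_daemons(jps_output, expected=EXPECTED_DAEMONS):
    """Return the expected daemon names that do not appear in jps output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        # jps prints "<pid> <main class>"; skip blank or pid-only lines.
        if len(parts) >= 2:
            running.add(parts[1])
    return sorted(set(expected) - running)


# The jps output quoted in the question.
sample = """4825 NameNode
5391 TaskTracker
5242 JobTracker
5477 Jps
5140 SecondaryNameNode"""

print(missing_daemons(sample))  # ['DataNode']
```

For the output in the question, only DataNode is reported as missing, which is consistent with the "could only be replicated to 0 nodes" error.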

Does anyone know what I did wrong, or how to fix this problem?

EDIT: This is the datanode.log file:

2012-04-11 12:27:28,977 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = m1a2/139.147.5.55
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.203.0
STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333; compiled by 'oom' on Wed May 4 07:57:50 PDT 2011
************************************************************/
2012-04-11 12:27:29,166 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2012-04-11 12:27:29,181 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2012-04-11 12:27:29,183 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2012-04-11 12:27:29,183 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2012-04-11 12:27:29,342 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2012-04-11 12:27:29,347 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2012-04-11 12:27:29,615 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /tmp/hadoop-hadoop/dfs/data: namenode namespaceID = 301052954; datanode namespaceID = 229562149
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:232)
    at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:147)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:354)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:268)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1480)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1419)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1437)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1563)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1573)
2012-04-11 12:27:29,617 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down DataNode at m1a2/139.147.5.55
************************************************************/
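The decisive line in this log is the "Incompatible namespaceIDs" error, which names the data directory and both IDs. As a rough sketch, the three values can be pulled out with a regular expression; the pattern is written against the message exactly as it appears in this log, and the helper name find_namespace_mismatch is hypothetical.

```python
import re

# Pattern for the DataNode error message as it appears in the log above.
NAMESPACE_ERROR = re.compile(
    r"Incompatible namespaceIDs in (?P<dir>\S+): "
    r"namenode namespaceID = (?P<namenode>\d+); "
    r"datanode namespaceID = (?P<datanode>\d+)"
)


def find_namespace_mismatch(log_text):
    """Return (data_dir, namenode_id, datanode_id) for the first mismatch, else None."""
    match = NAMESPACE_ERROR.search(log_text)
    if match is None:
        return None
    return (
        match.group("dir"),
        int(match.group("namenode")),
        int(match.group("datanode")),
    )


line = (
    "2012-04-11 12:27:29,615 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: "
    "java.io.IOException: Incompatible namespaceIDs in /tmp/hadoop-hadoop/dfs/data: "
    "namenode namespaceID = 301052954; datanode namespaceID = 229562149"
)
print(find_namespace_mismatch(line))
# ('/tmp/hadoop-hadoop/dfs/data', 301052954, 229562149)
```

The directory in the result is the one whose VERSION file disagrees with the NameNode.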

Chris Shain · Accepted Answer

The error you are getting in the DN log is described here: http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/#java-io-ioexception-incompatible-namespaceids

From that page:

At the moment, there seem to be two workarounds, as described below.

Workaround 1: Start from scratch

I can testify that the following steps solve this error, but the side effects won't make you happy (me neither). The crude workaround I have found is to:

1. Stop the cluster
2. Delete the data directory on the problematic DataNode: the directory is specified by dfs.data.dir in conf/hdfs-site.xml; if you followed this tutorial, the relevant directory is /app/hadoop/tmp/dfs/data
3. Reformat the NameNode (NOTE: all HDFS data is lost during this process!)
4. Restart the cluster

When deleting all the HDFS data and starting from scratch does not sound like a good idea (it might be acceptable during initial setup/testing), you might give the second approach a try.

Workaround 2: Updating the namespaceID of problematic DataNodes

Big thanks to Jared Stehler for the following suggestion. I have not tested it myself yet, but feel free to try it out and send me your feedback. This workaround is "minimally invasive" as you only have to edit one file on the problematic DataNodes:

1. Stop the DataNode
2. Edit the value of namespaceID in /current/VERSION to match the value of the current NameNode
3. Restart the DataNode

If you followed the instructions in my tutorials, the full paths of the relevant files are:

NameNode: /app/hadoop/tmp/dfs/name/current/VERSION

DataNode: /app/hadoop/tmp/dfs/data/current/VERSION

(background: dfs.data.dir is by default set to ${hadoop.tmp.dir}/dfs/data, and we set hadoop.tmp.dir in this tutorial to /app/hadoop/tmp).

If you wonder what the contents of VERSION look like, here is one of mine:

# contents of /current/VERSION
namespaceID=393514426
storageID=DS-1706792599-10.10.10.1-50010-1204306713481
cTime=1215607609074
storageType=DATA_NODE
layoutVersion=-13
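Workaround 2 boils down to copying one key=value entry from the NameNode's VERSION file into the DataNode's. Here is a minimal sketch of that edit on the file contents, assuming the simple key=value layout shown above; read_version and with_namespace_id are hypothetical helper names, and the NameNode text in the example is a made-up one-line stand-in that reuses the IDs from the question's log.

```python
def read_version(text):
    """Parse the key=value lines of a VERSION file into a dict."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and '#' comment lines.
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


def with_namespace_id(datanode_text, namenode_text):
    """Return the DataNode VERSION text carrying the NameNode's namespaceID."""
    target = read_version(namenode_text)["namespaceID"]
    out = []
    for line in datanode_text.splitlines():
        key = line.partition("=")[0].strip()
        # Rewrite only the namespaceID entry; keep every other line untouched.
        out.append("namespaceID=" + target if key == "namespaceID" else line)
    return "\n".join(out) + "\n"


# Hypothetical NameNode content; only namespaceID matters for this sketch.
namenode_text = "namespaceID=301052954\n"
datanode_text = (
    "namespaceID=229562149\n"
    "storageID=DS-1706792599-10.10.10.1-50010-1204306713481\n"
    "storageType=DATA_NODE\n"
)
print(with_namespace_id(datanode_text, namenode_text))
```

In practice the two strings would be read from the NameNode and DataNode VERSION paths listed above, and the result written back to the DataNode file only while the DataNode is stopped. Keeping a backup copy of the original file is a sensible precaution.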

apurva.nandan · Answer

Okay, I am posting this once more:

In case someone needs this, for newer versions of Hadoop (I am actually running 2.4.0):

In this case, stop the cluster with sbin/stop-all.sh
Then go to /etc/hadoop for the config files.

In the file hdfs-site.xml, look for the directory paths corresponding to dfs.namenode.name.dir and dfs.datanode.data.dir

Delete both directories recursively (rm -r).
Now format the namenode via bin/hadoop namenode -format
And finally run sbin/start-all.sh
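Before deleting anything, it helps to confirm which directories the configuration actually points to. The sketch below reads them from the text of an hdfs-site.xml; it assumes the Hadoop 2.x property names dfs.namenode.name.dir (NameNode metadata) and dfs.datanode.data.dir (DataNode blocks), the paths in the sample are placeholders, and read_properties and storage_dirs are hypothetical helper names.

```python
import xml.etree.ElementTree as ET


def read_properties(xml_text):
    """Map each <property> name to its value in a Hadoop *-site.xml document."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def storage_dirs(xml_text):
    """Return the NameNode and DataNode directories that are set explicitly."""
    props = read_properties(xml_text)
    keys = ("dfs.namenode.name.dir", "dfs.datanode.data.dir")
    return {key: props[key] for key in keys if key in props}


# Placeholder configuration; the paths are illustrative only.
sample = """<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>/path/to/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/path/to/data</value>
  </property>
</configuration>"""

print(storage_dirs(sample))
```

If a property is absent from the file, it is simply left out of the result, which signals that the default location derived from hadoop.tmp.dir is probably in effect. Values may also be comma-separated lists or file: URIs, which this sketch returns unchanged.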

I hope this helps.

user2580337 · Answer

I had the same problem on a pseudo-node setup using Hadoop 1.1.2. I ran bin/stop-all.sh to stop the cluster, then looked at the configuration of my Hadoop tmp directory in hdfs-site.xml:

<name>hadoop.tmp.dir</name> <value>/root/data/hdfstmp</value>

So I went into /root/data/hdfstmp and deleted all the files using this command (you could lose your data):

rm -rf *

then formatted the namenode again:

bin/hadoop namenode -format

and then started the cluster using:

bin/start-all.sh

The main reason is that bin/hadoop namenode -format did not remove the old data, so we have to delete it manually.

Somnath Kadam · Answer

Follow these steps:

1. bin/stop-all.sh
2. Remove the dfs/ and mapred/ folders under the hadoop.tmp.dir set in core-site.xml
3. bin/hadoop namenode -format
4. bin/start-all.sh
5. jps
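Step 2 is the destructive one, so a dry run that only lists what would be deleted can be a useful safety net. The following is a sketch under that assumption; clear_hadoop_tmp is a hypothetical helper, and the directory passed to it should be whatever hadoop.tmp.dir is set to on your machine.

```python
import shutil
from pathlib import Path


def clear_hadoop_tmp(tmp_dir, subdirs=("dfs", "mapred"), dry_run=True):
    """Delete the given subfolders of tmp_dir; with dry_run, only report them."""
    affected = []
    for name in subdirs:
        target = Path(tmp_dir) / name
        if target.is_dir():
            if not dry_run:
                # Irreversible: removes the folder and everything below it.
                shutil.rmtree(target)
            affected.append(str(target))
    return affected


# Example (not executed here): list the folders first, then delete them.
# print(clear_hadoop_tmp("/app/hadoop/tmp"))
# clear_hadoop_tmp("/app/hadoop/tmp", dry_run=False)
```

The default dry_run=True means a call made by mistake deletes nothing. As in the steps above, the cluster should be stopped first, and all HDFS data under those folders is lost once they are removed.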

Jickson T George · Answer

Try formatting your datanode and restarting it.

Dan Ciborowski - MSFT · Answer

I am using CDH4 as my version of Hadoop and was having trouble configuring it. Even after trying to reformat my namenode, I was still getting the error.

My VERSION file was located in

/var/lib/hadoop-hdfs/cache/{username}/dfs/data/current/VERSION

You can find the location of the HDFS cache directory by looking for the hadoop.tmp.dir property:

more /etc/hadoop/conf/hdfs-site.xml

I found that by doing

cd /var/lib/hadoop-hdfs/cache/
rm -rf *

and then reformatting the namenode, I was finally able to fix the issue. Thanks to the first answer for helping me figure out which folder I needed to bomb.

saurav · Answer

I tried approach 2 as suggested by Jared Stehler in Chris Shain's answer, and I can confirm that after making these changes I was able to resolve the issue described above.

I used the same version number for both the name and data VERSION files. That is, I copied the version number from the VERSION file inside /app/hadoop/tmp/dfs/name/current to the VERSION file inside /app/hadoop/tmp/dfs/data/current, and it worked like a charm.

Cheers!

goat · Answer

I ran into this issue when using an unmodified cloudera quickstart vm 4.4.0-1.

For reference, Cloudera Manager said my datanode was in good health, even though the error message in the DataStreamer stack trace said no datanodes were running.

Credit goes to workaround #2 from https://stackoverflow.com/a/10110369/249538, but I will detail my specific experience using the cloudera quickstart vm.

Specifically, I did the following:
in this order, stopped the services hue1, hive1, mapreduce1, hdfs1 via Cloudera Manager at http://localhost.localdomain:7180/cmf/services/status

found my VERSION files via:
sudo find / -name VERSION

I got:

/dfs/dn/current/BP-780931682-127.0.0.1-1381159027878/current/VERSION
/dfs/dn/current/VERSION
/dfs/nn/current/VERSION
/dfs/snn/current/VERSION

I checked the contents of those files. They all had a matching namespaceID, except for one file that was missing it entirely, so I added an entry to that one.
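Checking several VERSION files by eye is error-prone, so the comparison can be scripted. This is a rough sketch that takes the NameNode's file as the reference and reports every file whose namespaceID is missing or different; namespace_ids and files_to_fix are hypothetical helpers, and the file contents and ID values in the example are invented for illustration (which file lacks the entry is an arbitrary choice).

```python
import re


def namespace_ids(version_texts):
    """Map each path to its namespaceID string, or None when the entry is absent."""
    ids = {}
    for path, text in version_texts.items():
        match = re.search(r"^\s*namespaceID\s*=\s*(\S+)", text, re.MULTILINE)
        ids[path] = match.group(1) if match else None
    return ids


def files_to_fix(version_texts, reference):
    """Return paths whose namespaceID is missing or differs from the reference file's."""
    ids = namespace_ids(version_texts)
    expected = ids[reference]
    return sorted(path for path, value in ids.items() if value != expected)


# Invented contents for three of the paths found above.
samples = {
    "/dfs/nn/current/VERSION": "namespaceID=1234\ncTime=0\n",
    "/dfs/dn/current/VERSION": "namespaceID=1234\ncTime=0\n",
    "/dfs/snn/current/VERSION": "cTime=0\n",
}
print(files_to_fix(samples, "/dfs/nn/current/VERSION"))
# ['/dfs/snn/current/VERSION']
```

To use it on a real machine, build the dictionary by reading each path printed by the find command, and stop the HDFS services before editing any file it reports.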

Then I restarted the services in reverse order via Cloudera Manager. Now I can -put stuff onto hdfs.

mahmood · Answer

In my case, I had wrongly set the destinations for dfs.name.dir and dfs.data.dir. The correct format is

<property>
  <name>dfs.name.dir</name>
  <value>/path/to/name</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/path/to/data</value>
</property>

Aey Varistha · Answer

I had the same problem with a missing datanode, and the following steps worked for me:

1. Go to the folder where your HDFS files are located: cd hadoop/hadoopdata/hdfs
2. Look in the folder to see what you have in hdfs: ls
3. Delete the contents of the datanode folder, because they belong to an old version of the datanode: rm -rf /datanode/*
4. You will get the new version after running the previous command
5. Start a new datanode: hadoop-daemon.sh start datanode
6. Refresh the web services. You will see the lost node appear