
Spark/Yarn: File does not exist on HDFS

I have a Hadoop/Yarn cluster set up on AWS, with one master and 3 slaves. I have verified that 3 live nodes are running on ports 50070 and 8088. I tested a Spark job in client deploy mode and everything works fine.

When I try to submit a Spark job with ./spark-2.1.1-bin-hadoop2.7/bin/spark-submit --master yarn --deploy-mode cluster ip.py, I get the following error.

Diagnostics: File does not exist: hdfs://ec2-54-153-50-11.us-west-1.compute.amazonaws.com:9000/user/ubuntu/.sparkStaging/application_1495996836198_0003/__spark_libs__1200479165381142167.zip

java.io.FileNotFoundException: File does not exist:
hdfs://ec2-54-153-50-11.us-west-1.compute.amazonaws.com:9000/user/ubuntu/.sparkStaging/application_1495996836198_0003/__spark_libs__1200479165381142167.zip

17/05/28 18:58:32 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/05/28 18:58:33 INFO client.RMProxy: Connecting to ResourceManager at ec2-54-153-50-11.us-west-1.compute.amazonaws.com/172.31.5.235:8032
17/05/28 18:58:34 INFO yarn.Client: Requesting a new application from cluster with 3 NodeManagers
17/05/28 18:58:34 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
17/05/28 18:58:34 INFO yarn.Client: Will allocate AM container, with 1408 MB memory including 384 MB overhead
17/05/28 18:58:34 INFO yarn.Client: Setting up container launch context for our AM
17/05/28 18:58:34 INFO yarn.Client: Setting up the launch environment for our AM container
17/05/28 18:58:34 INFO yarn.Client: Preparing resources for our AM container
17/05/28 18:58:36 WARN yarn.Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
17/05/28 18:58:41 INFO yarn.Client: Uploading resource file:/tmp/spark-fbd6d435-9abe-4396-838e-60f19bc2dc14/__spark_libs__1200479165381142167.zip -> hdfs://ec2-54-153-50-11.us-west-1.compute.amazonaws.com:9000/user/ubuntu/.sparkStaging/application_1495996836198_0003/__spark_libs__1200479165381142167.zip
17/05/28 18:58:45 INFO yarn.Client: Uploading resource file:/home/ubuntu/ip.py -> hdfs://ec2-54-153-50-11.us-west-1.compute.amazonaws.com:9000/user/ubuntu/.sparkStaging/application_1495996836198_0003/ip.py
17/05/28 18:58:45 INFO yarn.Client: Uploading resource file:/home/ubuntu/spark-2.1.1-bin-hadoop2.7/python/lib/pyspark.zip -> hdfs://ec2-54-153-50-11.us-west-1.compute.amazonaws.com:9000/user/ubuntu/.sparkStaging/application_1495996836198_0003/pyspark.zip
17/05/28 18:58:45 INFO yarn.Client: Uploading resource file:/home/ubuntu/spark-2.1.1-bin-hadoop2.7/python/lib/py4j-0.10.4-src.zip -> hdfs://ec2-54-153-50-11.us-west-1.compute.amazonaws.com:9000/user/ubuntu/.sparkStaging/application_1495996836198_0003/py4j-0.10.4-src.zip
17/05/28 18:58:45 INFO yarn.Client: Uploading resource file:/tmp/spark-fbd6d435-9abe-4396-838e-60f19bc2dc14/__spark_conf__7895841687984145748.zip -> hdfs://ec2-54-153-50-11.us-west-1.compute.amazonaws.com:9000/user/ubuntu/.sparkStaging/application_1495996836198_0003/__spark_conf__.zip
17/05/28 18:58:46 INFO spark.SecurityManager: Changing view acls to: ubuntu
17/05/28 18:58:46 INFO spark.SecurityManager: Changing modify acls to: ubuntu
17/05/28 18:58:46 INFO spark.SecurityManager: Changing view acls groups to:
17/05/28 18:58:46 INFO spark.SecurityManager: Changing modify acls groups to:
17/05/28 18:58:46 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(ubuntu); groups with view permissions: Set(); users  with modify permissions: Set(ubuntu); groups with modify permissions: Set()
17/05/28 18:58:46 INFO yarn.Client: Submitting application application_1495996836198_0003 to ResourceManager
17/05/28 18:58:46 INFO impl.YarnClientImpl: Submitted application application_1495996836198_0003
17/05/28 18:58:47 INFO yarn.Client: Application report for application_1495996836198_0003 (state: ACCEPTED)
17/05/28 18:58:47 INFO yarn.Client:
     client token: N/A
     diagnostics: N/A
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1495997926073
     final status: UNDEFINED
     tracking URL: http://ec2-54-153-50-11.us-west-1.compute.amazonaws.com:8088/proxy/application_1495996836198_0003/
     user: ubuntu
17/05/28 18:58:48 INFO yarn.Client: Application report for application_1495996836198_0003 (state: ACCEPTED)
17/05/28 18:58:49 INFO yarn.Client: Application report for application_1495996836198_0003 (state: ACCEPTED)
17/05/28 18:58:50 INFO yarn.Client: Application report for application_1495996836198_0003 (state: ACCEPTED)
17/05/28 18:58:51 INFO yarn.Client: Application report for application_1495996836198_0003 (state: ACCEPTED)
17/05/28 18:58:52 INFO yarn.Client: Application report for application_1495996836198_0003 (state: ACCEPTED)
17/05/28 18:58:53 INFO yarn.Client: Application report for application_1495996836198_0003 (state: ACCEPTED)
17/05/28 18:58:54 INFO yarn.Client: Application report for application_1495996836198_0003 (state: ACCEPTED)
17/05/28 18:58:55 INFO yarn.Client: Application report for application_1495996836198_0003 (state: ACCEPTED)
17/05/28 18:58:56 INFO yarn.Client: Application report for application_1495996836198_0003 (state: ACCEPTED)
17/05/28 18:58:57 INFO yarn.Client: Application report for application_1495996836198_0003 (state: ACCEPTED)
17/05/28 18:58:58 INFO yarn.Client: Application report for application_1495996836198_0003 (state: ACCEPTED)
17/05/28 18:58:59 INFO yarn.Client: Application report for application_1495996836198_0003 (state: ACCEPTED)
17/05/28 18:59:00 INFO yarn.Client: Application report for application_1495996836198_0003 (state: ACCEPTED)
17/05/28 18:59:01 INFO yarn.Client: Application report for application_1495996836198_0003 (state: ACCEPTED)
17/05/28 18:59:02 INFO yarn.Client: Application report for application_1495996836198_0003 (state: FAILED)
17/05/28 18:59:02 INFO yarn.Client:
     client token: N/A
     diagnostics: Application application_1495996836198_0003 failed 2 times due to AM Container for appattempt_1495996836198_0003_000002 exited with  exitCode: -1000
For more detailed output, check application tracking page:http://ec2-54-153-50-11.us-west-1.compute.amazonaws.com:8088/cluster/app/application_1495996836198_0003Then, click on links to logs of each attempt.
Diagnostics: File does not exist: hdfs://ec2-54-153-50-11.us-west-1.compute.amazonaws.com:9000/user/ubuntu/.sparkStaging/application_1495996836198_0003/__spark_libs__1200479165381142167.zip
java.io.FileNotFoundException: File does not exist: hdfs://ec2-54-153-50-11.us-west-1.compute.amazonaws.com:9000/user/ubuntu/.sparkStaging/application_1495996836198_0003/__spark_libs__1200479165381142167.zip
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1309)
    at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1301)
    at org.apache.hadoop.yarn.util.FSDownload.copy(FSDownload.java:253)
    at org.apache.hadoop.yarn.util.FSDownload.access$000(FSDownload.java:63)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:361)
    at org.apache.hadoop.yarn.util.FSDownload$2.run(FSDownload.java:359)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:421)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:358)
    at org.apache.hadoop.yarn.util.FSDownload.call(FSDownload.java:62)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:473)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)

Failing this attempt. Failing the application.
     ApplicationMaster host: N/A
     ApplicationMaster RPC port: -1
     queue: default
     start time: 1495997926073
     final status: FAILED
     tracking URL: http://ec2-54-153-50-11.us-west-1.compute.amazonaws.com:8088/cluster/app/application_1495996836198_0003
     user: ubuntu
Exception in thread "main" org.apache.spark.SparkException: Application application_1495996836198_0003 finished with failed status
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1180)
    at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1226)
    at org.apache.spark.deploy.yarn.Client.main(Client.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:743)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
17/05/28 18:59:02 INFO util.ShutdownHookManager: Shutdown hook called
17/05/28 18:59:02 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-fbd6d435-9abe-4396-838e-60f19bc2dc14
ubuntu@ip-172-31-5-235:~$
6
user1187968

I was setting the master to local (.setMaster('local')) in my source file. After I removed that, everything worked fine.
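A minimal sketch of the fix, assuming a SparkConf-based setup (the app name "ip" is illustrative): a master hard-coded in the source takes precedence over the --master yarn flag passed to spark-submit, so removing the setMaster call lets the flag take effect.

```python
from pyspark import SparkConf, SparkContext

# A master hard-coded here overrides spark-submit's --master yarn:
# conf = SparkConf().setAppName("ip").setMaster("local")  # <- remove this
conf = SparkConf().setAppName("ip")  # let --master yarn from spark-submit apply
sc = SparkContext(conf=conf)
```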

8
user1187968

I also had this problem. I tried the fix of removing setMaster('local') from the source, but the error did not go away.

What finally solved my problem was actually quite simple: the SparkContext had to be initialized first (even before the non-Spark variables).

Solution

Following the post mentioned above, here is an example in Python. The same logic worked for me in Scala.

Hello, if I follow your suggestions, it works.

Our code was like this:

import numpy as np
from pyspark import SparkContext

foo = np.genfromtxt(xxxxx)  # non-Spark work done before the SparkContext exists
sc = SparkContext(...)
# compute

===> This fails ...

import numpy as np
from pyspark import SparkContext

sc = SparkContext(...)
foo = np.genfromtxt(xxxxx)  # non-Spark work done after the SparkContext exists
# compute

===> This works perfectly ...

Note: I also removed setMaster('local'), since it makes sense that it would interfere as well.
