Hadoop: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected

Question

My MapReduce jobs run fine when built in Eclipse with all possible Hadoop and Hive JAR files included in the Eclipse project as dependencies. (These are the JARs that ship with the local single-node Hadoop installation.)

However, when I try to run the same program built with the Maven project (see below), I get:

    Exception in thread "main" java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected

This exception occurs when the program is built with the following Maven project:

    <project xmlns="http://maven.apache.org/POM/4.0.0"
             xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
             xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
      <modelVersion>4.0.0</modelVersion>

      <groupId>com.bigdata.hadoop</groupId>
      <artifactId>FieldCounts</artifactId>
      <version>0.0.1-SNAPSHOT</version>
      <packaging>jar</packaging>

      <name>FieldCounts</name>
      <url>http://maven.apache.org</url>

      <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
      </properties>

      <dependencies>
        <dependency>
          <groupId>junit</groupId>
          <artifactId>junit</artifactId>
          <version>3.8.1</version>
          <scope>test</scope>
        </dependency>
        <dependency>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-hdfs</artifactId>
          <version>2.2.0</version>
        </dependency>
        <dependency>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-common</artifactId>
          <version>2.2.0</version>
        </dependency>
        <dependency>
          <groupId>org.apache.hadoop</groupId>
          <artifactId>hadoop-mapreduce-client-jobclient</artifactId>
          <version>2.2.0</version>
        </dependency>
        <dependency>
          <groupId>org.apache.hive.hcatalog</groupId>
          <artifactId>hcatalog-core</artifactId>
          <version>0.12.0</version>
        </dependency>
        <dependency>
          <groupId>com.google.guava</groupId>
          <artifactId>guava</artifactId>
          <version>16.0.1</version>
        </dependency>
      </dependencies>

      <build>
        <plugins>
          <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-compiler-plugin</artifactId>
            <version>2.3.2</version>
            <configuration>
              <source>${jdk.version}</source>
              <target>${jdk.version}</target>
            </configuration>
          </plugin>
          <plugin>
            <groupId>org.apache.maven.plugins</groupId>
            <artifactId>maven-assembly-plugin</artifactId>
            <executions>
              <execution>
                <goals>
                  <goal>attached</goal>
                </goals>
                <phase>package</phase>
                <configuration>
                  <descriptorRefs>
                    <descriptorRef>jar-with-dependencies</descriptorRef>
                  </descriptorRefs>
                  <archive>
                    <manifest>
                      <mainClass>com.bigdata.hadoop.FieldCounts</mainClass>
                    </manifest>
                  </archive>
                </configuration>
              </execution>
            </executions>
          </plugin>
        </plugins>
      </build>
    </project>

Please advise where and how to find compatible Hadoop jars.

[update_1] I am using Hadoop 2.2.0.2.0.6.0-101

As I found here: https://github.com/kevinweil/elephant-bird/issues/247

Hadoop 1.0.3: JobContext is a class

Hadoop 2.0.0: JobContext is an interface

In my pom.xml I have three jars with version 2.2.0:

    hadoop-hdfs 2.2.0
    hadoop-common 2.2.0
    hadoop-mapreduce-client-jobclient 2.2.0
    hcatalog-core 0.12.0

The only exception is hcatalog-core, whose version is 0.12.0. I could not find any more recent version of this jar, and I need it!

How can I find out which of these 4 jars produces java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.JobContext, but class was expected?

Please give me an idea of how to solve this. (The only solution I can see is to compile everything from source!)

[/update_1]

Full text of my MapReduce job:

    package com.bigdata.hadoop;

    import java.io.IOException;
    import java.util.*;

    import org.apache.hadoop.conf.*;
    import org.apache.hadoop.io.*;
    import org.apache.hadoop.mapreduce.*;
    import org.apache.hadoop.util.*;
    import org.apache.hcatalog.mapreduce.*;
    import org.apache.hcatalog.data.*;
    import org.apache.hcatalog.data.schema.*;
    import org.apache.log4j.Logger;

    public class FieldCounts extends Configured implements Tool {

        public static class Map extends Mapper<WritableComparable, HCatRecord, TableFieldValueKey, IntWritable> {

            static Logger logger = Logger.getLogger("com.foo.Bar");

            static boolean firstMapRun = true;
            static List<String> fieldNameList = new LinkedList<String>();

            /**
             * Return a list of field names not containing `id` field name
             * @param schema
             * @return
             */
            static List<String> getFieldNames(HCatSchema schema) {
                // Filter out `id` name just once
                if (firstMapRun) {
                    firstMapRun = false;
                    List<String> fieldNames = schema.getFieldNames();
                    for (String fieldName : fieldNames) {
                        if (!fieldName.equals("id")) {
                            fieldNameList.add(fieldName);
                        }
                    }
                } // if (firstMapRun)
                return fieldNameList;
            }

            @Override
            protected void map(WritableComparable key,
                               HCatRecord hcatRecord,
                               //org.apache.hadoop.mapreduce.Mapper
                               //<WritableComparable, HCatRecord, Text, IntWritable>.Context context)
                               Context context)
                    throws IOException, InterruptedException {

                HCatSchema schema = HCatBaseInputFormat.getTableSchema(context.getConfiguration());

                //String schemaTypeStr = schema.getSchemaAsTypeString();
                //logger.info("******** schemaTypeStr ********** : "+schemaTypeStr);

                //List<String> fieldNames = schema.getFieldNames();
                List<String> fieldNames = getFieldNames(schema);
                for (String fieldName : fieldNames) {
                    Object value = hcatRecord.get(fieldName, schema);
                    String fieldValue = null;
                    if (null == value) {
                        fieldValue = "<NULL>";
                    } else {
                        fieldValue = value.toString();
                    }
                    //String fieldNameValue = fieldName+"."+fieldValue;
                    //context.write(new Text(fieldNameValue), new IntWritable(1));
                    TableFieldValueKey fieldKey = new TableFieldValueKey();
                    fieldKey.fieldName = fieldName;
                    fieldKey.fieldValue = fieldValue;
                    context.write(fieldKey, new IntWritable(1));
                }
            }
        }

        public static class Reduce extends Reducer<TableFieldValueKey, IntWritable,
                                                   WritableComparable, HCatRecord> {

            protected void reduce(TableFieldValueKey key,
                                  java.lang.Iterable<IntWritable> values,
                                  Context context)
                                  //org.apache.hadoop.mapreduce.Reducer<Text, IntWritable,
                                  //WritableComparable, HCatRecord>.Context context)
                    throws IOException, InterruptedException {
                Iterator<IntWritable> iter = values.iterator();
                int sum = 0;
                // Sum up occurrences of the given key
                while (iter.hasNext()) {
                    IntWritable iw = iter.next();
                    sum = sum + iw.get();
                }

                HCatRecord record = new DefaultHCatRecord(3);
                record.set(0, key.fieldName);
                record.set(1, key.fieldValue);
                record.set(2, sum);

                context.write(null, record);
            }
        }

        public int run(String[] args) throws Exception {
            Configuration conf = getConf();
            args = new GenericOptionsParser(conf, args).getRemainingArgs();

            // To fix Hadoop "META-INFO" (http://stackoverflow.com/questions/17265002/hadoop-no-filesystem-for-scheme-file)
            conf.set("fs.hdfs.impl",
                    org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
            conf.set("fs.file.impl",
                    org.apache.hadoop.fs.LocalFileSystem.class.getName());

            // Get the input and output table names as arguments
            String inputTableName = args[0];
            String outputTableName = args[1];
            // Assume the default database
            String dbName = null;

            Job job = new Job(conf, "FieldCounts");

            HCatInputFormat.setInput(job,
                    InputJobInfo.create(dbName, inputTableName, null));
            job.setJarByClass(FieldCounts.class);
            job.setMapperClass(Map.class);
            job.setReducerClass(Reduce.class);

            // An HCatalog record as input
            job.setInputFormatClass(HCatInputFormat.class);

            // Mapper emits TableFieldValueKey as key and an integer as value
            job.setMapOutputKeyClass(TableFieldValueKey.class);
            job.setMapOutputValueClass(IntWritable.class);

            // Ignore the key for the reducer output; emitting an HCatalog record as
            // value
            job.setOutputKeyClass(WritableComparable.class);
            job.setOutputValueClass(DefaultHCatRecord.class);
            job.setOutputFormatClass(HCatOutputFormat.class);

            HCatOutputFormat.setOutput(job,
                    OutputJobInfo.create(dbName, outputTableName, null));
            HCatSchema s = HCatOutputFormat.getTableSchema(job);
            System.err.println("INFO: output schema explicitly set for writing:" + s);
            HCatOutputFormat.setSchema(job, s);
            return (job.waitForCompletion(true) ? 0 : 1);
        }

        public static void main(String[] args) throws Exception {
            String classpath = System.getProperty("java.class.path");
            //System.out.println("*** CLASSPATH: "+classpath);
            int exitCode = ToolRunner.run(new FieldCounts(), args);
            System.exit(exitCode);
        }
    }

And the class for the composite key:

    package com.bigdata.hadoop;

    import java.io.DataInput;
    import java.io.DataOutput;
    import java.io.IOException;

    import org.apache.hadoop.io.WritableComparable;

    import com.google.common.collect.ComparisonChain;

    public class TableFieldValueKey implements WritableComparable<TableFieldValueKey> {

        public String fieldName;
        public String fieldValue;

        // Must have a default constructor
        public TableFieldValueKey() {}

        public void readFields(DataInput in) throws IOException {
            fieldName = in.readUTF();
            fieldValue = in.readUTF();
        }

        public void write(DataOutput out) throws IOException {
            out.writeUTF(fieldName);
            out.writeUTF(fieldValue);
        }

        public int compareTo(TableFieldValueKey o) {
            return ComparisonChain.start().compare(fieldName, o.fieldName)
                    .compare(fieldValue, o.fieldValue).result();
        }
    }

SachinJ · Answer

Hadoop went through a huge code refactoring from Hadoop 1.0 to Hadoop 2.0. One side effect is that code compiled against Hadoop 1.0 is not compatible with Hadoop 2.0, and vice versa. The source code, however, is mostly compatible, so it usually only needs to be recompiled against the target Hadoop distribution.

The exception "Found interface X, but class was expected" is very common when you run code compiled for Hadoop 1.0 on Hadoop 2.0, or vice versa.
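To check which variant of a type actually ends up on the runtime classpath, a small diagnostic can report whether the named type is a class or an interface and where it was loaded from. This is only a sketch using standard Java reflection; the class name `ClassOrigin` and its methods are made up for illustration and are not part of Hadoop or HCatalog.

```java
import java.security.CodeSource;

public class ClassOrigin {

    /** Returns "interface" or "class" for the named type as the current classloader sees it. */
    public static String kindOf(String className) throws ClassNotFoundException {
        // initialize=false: the type is only inspected, its static initializers are not run
        Class<?> type = Class.forName(className, false, ClassOrigin.class.getClassLoader());
        return type.isInterface() ? "interface" : "class";
    }

    /** Returns the jar or directory the type was loaded from, if the JVM exposes it. */
    public static String originOf(String className) throws ClassNotFoundException {
        Class<?> type = Class.forName(className, false, ClassOrigin.class.getClassLoader());
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            // Typical for classes that belong to the JDK itself
            return "(JDK runtime or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // Default to the type named in the error message; pass other names as arguments
        String[] names = args.length > 0
                ? args
                : new String[] {"org.apache.hadoop.mapreduce.JobContext"};
        for (String name : names) {
            try {
                System.out.println(name + " is a " + kindOf(name)
                        + " loaded from " + originOf(name));
            } catch (ClassNotFoundException e) {
                System.out.println(name + " is not on the classpath");
            }
        }
    }
}
```

Run it with the same classpath as the failing job. If `JobContext` is reported as an interface, the runtime is Hadoop 2, and the error then points at some library on the classpath that was compiled against Hadoop 1. Note that with a `jar-with-dependencies` build the reported location is the assembled jar itself, not the original dependency.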

Find out the exact Hadoop version used in the cluster, then specify that version in pom.xml. Build your project against the same Hadoop version the cluster runs, and deploy it.

akshat thakar · Answer

You need to recompile "hcatalog-core" so that it supports Hadoop 2.0.0. Currently, "hcatalog-core" supports only Hadoop 1.0.

Chiron · Answer

Evidently, your Hadoop and Hive versions are incompatible. You need to upgrade (or downgrade) either Hadoop or Hive.

This is caused by the incompatibility between Hadoop 1 and Hadoop 2.

Abhiram · Answer

I ran into this problem too. I was trying to use HCatMultipleInputs with hive-hcatalog-core-0.13.0.jar. We are using Hadoop 2.5.1.

The following code change helped me fix the problem:

    // JobContext ctx = new JobContext(conf, jobContext.getJobID());
    JobContext ctx = new Job(conf);

nikoo28 · Answer

Look for entries like this

    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-core</artifactId>
      <version>1.2.1</version>
    </dependency>

in your pom.xml. They define the Hadoop version to use. Change or remove them as needed.
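To verify the result at runtime, one option is to scan the classpath for Hadoop jars and list their versions. The sketch below assumes the usual `hadoop-<module>-<version>.jar` naming convention; `HadoopJarScan` and its methods are hypothetical helper names, not an existing tool.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HadoopJarScan {

    // Assumed naming convention: hadoop-<module>-<version>.jar, e.g. hadoop-common-2.2.0.jar
    private static final Pattern HADOOP_JAR =
            Pattern.compile("(hadoop-[a-z-]+?)-(\\d+(?:\\.\\d+)+[^/\\\\]*)\\.jar$");

    /** Returns "<artifact> <version>" for every classpath entry that looks like a Hadoop jar. */
    public static List<String> hadoopJars(String classpath, String separator) {
        List<String> found = new ArrayList<String>();
        for (String entry : classpath.split(Pattern.quote(separator))) {
            Matcher m = HADOOP_JAR.matcher(entry);
            if (m.find()) {
                found.add(m.group(1) + " " + m.group(2));
            }
        }
        return found;
    }

    /** Returns the distinct major versions; more than one suggests mixed Hadoop 1 and 2 jars. */
    public static Set<String> majorVersions(List<String> jars) {
        Set<String> majors = new TreeSet<String>();
        for (String jar : jars) {
            String version = jar.substring(jar.indexOf(' ') + 1);
            majors.add(version.substring(0, version.indexOf('.')));
        }
        return majors;
    }

    public static void main(String[] args) {
        List<String> jars = hadoopJars(System.getProperty("java.class.path"), File.pathSeparator);
        for (String jar : jars) {
            System.out.println(jar);
        }
        System.out.println("Hadoop major versions found: " + majorVersions(jars));
    }
}
```

This only sees jars listed individually on the classpath (for example when running from Eclipse). Dependencies merged into a `jar-with-dependencies` archive will not show up; for those, `mvn dependency:tree` lists what Maven resolved, including transitive Hadoop artifacts.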