Hadoop Streaming Job Failure Error in Python

Date: 2023-09-12
This article covers how to resolve a Hadoop Streaming job failure error in Python; the recommended answer below should be a useful reference for anyone hitting the same problem.

Problem Description


                  From this guide, I have successfully run the sample exercise. But on running my mapreduce job, I am getting the following error
                  ERROR streaming.StreamJob: Job not Successful!
                  10/12/16 17:13:38 INFO streaming.StreamJob: killJob...
                  Streaming Job Failed!

                  Error from the log file

                  java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 2
                  at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:311)
                  at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:545)
                  at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:132)
                  at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
                  at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:36)
                  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
                  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
                  at org.apache.hadoop.mapred.Child.main(Child.java:170)
                  

mapper.py

import sys

i = 0  # running line number, used as an id for each input line

for line in sys.stdin:
    i += 1
    count = {}
    # count how often each word appears on this line
    for word in line.strip().split():
        count[word] = count.get(word, 0) + 1
    # emit: word<TAB>line_id:frequency (tab is the Streaming key/value separator)
    for word, weight in count.items():
        print '%s\t%s:%s' % (word, str(i), str(weight))
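A quick way to sanity-check the mapper on its own is to pipe one line through it outside Hadoop (a minimal sketch; python must be a Python 2 interpreter here, since the script uses print statements):

echo 'hello world hello' | python mapper.py

This should emit one record per distinct word, e.g. hello 1:2 and world 1:1, with a tab between the word and the line_id:frequency pair (in arbitrary dict order).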
                  

                  reducer.py

import sys

o_tweet = "2323"  # sentinel key that no real word will match on the first record
id_list = []      # (line_id, weight) pairs collected for the current word

for line in sys.stdin:
    # mapper output format: word<TAB>line_id:frequency
    tweet, tw = line.strip().split()
    tweet_id, w = tw.split(':')
    w = int(w)
    if tweet == o_tweet:
        # same word as the previous record: pair it with every line seen so far
        for i, wt in id_list:
            print '%s:%s\t%s' % (tweet_id, i, str(w + wt))
        id_list.append((tweet_id, w))
    else:
        # new word: start a fresh list
        id_list = [(tweet_id, w)]
        o_tweet = tweet
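The "subprocess failed with code 2" in the stack trace is simply the exit status of the Python child process, so this class of failure can usually be reproduced without Hadoop at all. A local pipe test simulates the Streaming map-sort-reduce flow (a minimal sketch with made-up inline input; substitute your own data):

printf 'hello world hello\nfoo bar foo\n' | python mapper.py | sort | python reducer.py

If either script crashes here, it will fail the same way inside the Streaming job, and the traceback is much easier to read locally.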
                  

                  [edit] command to run the job:

                  hadoop@ubuntu:/usr/local/hadoop$ bin/hadoop jar contrib/streaming/hadoop-0.20.0-streaming.jar -file /home/hadoop/mapper.py -mapper /home/hadoop/mapper.py -file /home/hadoop/reducer.py -reducer /home/hadoop/reducer.py -input my-input/* -output my-output
                  

                  Input is any random sequence of sentences.

Thanks,

Recommended Answer

Your -mapper and -reducer arguments should just be the script names:

                  hadoop@ubuntu:/usr/local/hadoop$ bin/hadoop jar contrib/streaming/hadoop-0.20.0-streaming.jar -file /home/hadoop/mapper.py -mapper mapper.py -file /home/hadoop/reducer.py -reducer reducer.py -input my-input/* -output my-output
                  

When you ship the scripts with -file, Hadoop copies them into the job's working directory on HDFS, and each attempt task runs with that directory as ".", so bare script names resolve correctly. (FYI: if you ever want to add another -file, such as a lookup table, you can open it in Python as if it were in the same directory as your script while the script runs inside the M/R job.)
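For example, a mapper that filters against a lookup table shipped with an extra -file /home/hadoop/stopwords.txt (a hypothetical file, not part of the original job) can open it by its bare name, because Hadoop drops every -file artifact into the task's working directory. A minimal sketch:

import sys

# stopwords.txt is a hypothetical lookup table shipped alongside the script
# via -file; Hadoop places it in the task's working directory, so a bare
# relative path finds it.
stopwords = set(line.strip() for line in open('stopwords.txt'))

for line in sys.stdin:
    for word in line.strip().split():
        if word not in stopwords:
            print '%s\t1' % word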

Also make sure you have run chmod a+x mapper.py and chmod a+x reducer.py so both scripts are executable.
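Since -mapper mapper.py executes the file directly rather than through an explicit interpreter, each script should also start with a shebang line (assuming a Python 2 interpreter is on the task nodes' PATH):

#!/usr/bin/env python

Without the execute bit and the shebang, the launched subprocess dies immediately and Streaming reports the same kind of "subprocess failed" error.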

This concludes the discussion of the Hadoop Streaming job failure error in Python; hopefully the recommended answer helps.
