Difficulty understanding mapreduce

Date: 2023-09-12


Problem description

I have read the page linked below, which is a getting-started guide for mapreduce with Python:

http://code.google.com/p/appengine-mapreduce/wiki/GettingStartedInPython

But I still cannot understand how it works. I am running the code below, and I do not understand what exactly is happening.

mapreduce.yaml

mapreduce:
- name: Testmapper
  mapper:
    input_reader: mapreduce.input_readers.DatastoreInputReader
    handler: main.process
    params:
    - name: entity_kind
      default: main.userDetail

mapreduce/main.py

some code

class userDetail(db.Model):
    name = db.StringProperty()

some code

def process(u):
    u.name = "mahesh"
    yield op.db.Put(u)

When I run this, the status page shows status = success, but I cannot understand what actually happened.

The main thing I want to do with mapreduce is to search or count records of an entity kind.

Can anyone please help me?

Thanks in advance.

Recommended answer

Your handler sets the StringProperty name of every userDetail entity to "mahesh" and saves each entity back to the datastore.
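To make that concrete, here is a minimal plain-Python sketch of what the mapper conceptually does: it calls the handler once per entity and carries out whatever operations the handler yields. This is not the appengine-mapreduce implementation. `Put`, `UserDetail`, and `run_mapper` are hypothetical stand-ins for `op.db.Put`, the `userDetail` model, and the framework's loop, written so the example runs without App Engine.

```python
# Simplified illustration only; NOT the real appengine-mapreduce code.

class Put(object):
    """Hypothetical stand-in for op.db.Put: a request to save an entity."""
    def __init__(self, entity):
        self.entity = entity


class UserDetail(object):
    """Hypothetical plain-Python stand-in for the userDetail model."""
    def __init__(self, name):
        self.name = name


def process(u):
    # Same logic as the handler in the question.
    u.name = "mahesh"
    yield Put(u)


def run_mapper(entities, handler):
    """Call the handler once per entity and collect the entities it asks to save."""
    saved = []
    for entity in entities:
        for operation in handler(entity):
            if isinstance(operation, Put):
                saved.append(operation.entity)
    return saved


users = [UserDetail("alice"), UserDetail("bob")]
saved = run_mapper(users, process)
print([u.name for u in saved])  # ['mahesh', 'mahesh']
```

Under these assumptions, "status = success" simply means the handler ran over every entity of the configured kind without raising an error.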
If you want to count your entities, use:

from mapreduce import operation as op

def process(entity):
    # Add 1 to the counter named "counter1" for every entity processed.
    yield op.counters.Increment("counter1")
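Since the question also asks about searching, one possible approach is to increment a second counter only for entities that match a condition. The sketch below simulates this in plain Python so it can run without App Engine. `Increment`, `UserDetail`, and `run_mapper` are hypothetical stand-ins, and the counter names and sample data are made up for illustration.

```python
# Simplified illustration only; NOT the real appengine-mapreduce code.

class Increment(object):
    """Hypothetical stand-in for op.counters.Increment."""
    def __init__(self, counter_name, delta=1):
        self.counter_name = counter_name
        self.delta = delta


class UserDetail(object):
    """Hypothetical plain-Python stand-in for the userDetail model."""
    def __init__(self, name):
        self.name = name


def process(entity):
    # Count every entity, and separately count the ones whose name matches.
    yield Increment("total")
    if entity.name == "mahesh":
        yield Increment("name_is_mahesh")


def run_mapper(entities, handler):
    """Call the handler once per entity and add up the counter increments."""
    counters = {}
    for entity in entities:
        for operation in handler(entity):
            current = counters.get(operation.counter_name, 0)
            counters[operation.counter_name] = current + operation.delta
    return counters


users = [UserDetail("mahesh"), UserDetail("bob"), UserDetail("mahesh")]
counters = run_mapper(users, process)
print(counters["total"], counters["name_is_mahesh"])  # 3 2
```

In a real handler the same `if` test would guard a `yield op.counters.Increment(...)` line, and the totals would be expected to show up with the job's counters on the status page.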

