I have read the guide below on getting started with MapReduce in Python:
http://code.google.com/p/appengine-mapreduce/wiki/GettingStartedInPython
But I still don't understand how it works. I am running the code below, and I can't tell what it is actually doing.
mapreduce.yaml
mapreduce:
- name: Testmapper
  mapper:
    input_reader: mapreduce.input_readers.DatastoreInputReader
    handler: main.process
    params:
    - name: entity_kind
      default: main.userDetail
mapreduce/main.py
some code
class userDetail(db.Model):
    name = db.StringProperty()
some code
def process(u):
    u.name = "mahesh"
    yield op.db.Put(u)
When I run this, the status page shows status = success, but I don't understand what happened.
What I mainly want to do with MapReduce is search or count the records of an entity kind.
Can anyone help me with this?
Thanks in advance.
Your mapper sets the name property (a StringProperty) of every userDetail entity to "mahesh". The handler process is called once for each entity, and yield op.db.Put(u) writes the modified entity back to the datastore.
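To make the per-entity behaviour concrete, here is a minimal pure-Python sketch of what the map phase does with the handler. It does not use the real library: UserDetail, Put and run_mapper are hypothetical stand-ins for db.Model, op.db.Put and the framework's mapper loop, and the sample names are made up.

```python
# Hypothetical stand-in for the datastore model; only `name` matters here.
class UserDetail:
    def __init__(self, name):
        self.name = name


# Hypothetical stand-in for op.db.Put: it only wraps the entity to be written.
class Put:
    def __init__(self, entity):
        self.entity = entity


def process(u):
    # Same logic as the handler in the question.
    u.name = "mahesh"
    yield Put(u)


def run_mapper(handler, entities):
    """Call the handler once per entity and collect the writes it yields."""
    written = []
    for entity in entities:
        for operation in handler(entity):
            if isinstance(operation, Put):
                written.append(operation.entity)
    return written


# Assumed sample data, standing in for the entities read by the input reader.
entities = [UserDetail("alice"), UserDetail("bob"), UserDetail("carol")]
written = run_mapper(process, entities)
names = [e.name for e in written]
print(names)
```

Every entity passes through process exactly once, so after the job finishes all of them carry the same name.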
If you want to count your entities, use:
from mapreduce import operation as op

def process(entity):
    yield op.counters.Increment("counter1")
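The same counter idea can cover the "search" case by incrementing a second counter only when an entity matches a condition. The sketch below simulates that in plain Python so it runs without App Engine: Increment and run_mapper are hypothetical stand-ins for op.counters.Increment and the framework, and the counter names and sample data are assumptions. In the real library the counter totals should appear on the job's status page.

```python
from collections import Counter


# Hypothetical stand-in for the datastore model.
class UserDetail:
    def __init__(self, name):
        self.name = name


# Hypothetical stand-in for op.counters.Increment.
class Increment:
    def __init__(self, counter_name, delta=1):
        self.counter_name = counter_name
        self.delta = delta


def process(entity):
    # Count every entity the mapper sees.
    yield Increment("total")
    # "Search": count only the entities that match a condition.
    if entity.name == "mahesh":
        yield Increment("name_is_mahesh")


def run_mapper(handler, entities):
    """Call the handler once per entity and sum the increments it yields."""
    counters = Counter()
    for entity in entities:
        for operation in handler(entity):
            counters[operation.counter_name] += operation.delta
    return counters


# Assumed sample data.
entities = [UserDetail(n) for n in ("mahesh", "alice", "mahesh", "bob")]
counters = run_mapper(process, entities)
print(dict(counters))
```

With the real library, the handler body would stay the same apart from yielding op.counters.Increment instead of the stand-in class.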