I would like to create a rank within each year (so in 2012, Manager B is ranked 1; in 2011, Manager B is again ranked 1). I struggled with the pandas rank function for a while and do not want to resort to a for loop.
s = pd.DataFrame([['2012','A',3],['2012','B',8],['2011','A',20],['2011','B',30]], columns=['Year','Manager','Return'])
Out[1]:
   Year Manager  Return
0  2012       A       3
1  2012       B       8
2  2011       A      20
3  2011       B      30
The issue I'm having is with the following additional code (I didn't think it would be relevant before):
s = pd.DataFrame([['2012', 'A', 3], ['2012', 'B', 8], ['2011', 'A', 20], ['2011', 'B', 30]], columns=['Year', 'Manager', 'Return'])
b = pd.DataFrame([['2012', 'A', 3], ['2012', 'B', 8], ['2011', 'A', 20], ['2011', 'B', 30]], columns=['Year', 'Manager', 'Return'])
s = s.append(b)
s['Rank'] = s.groupby(['Year'])['Return'].rank(ascending=False)
raise Exception('Reindexing only valid with uniquely valued Index '
Exception: Reindexing only valid with uniquely valued Index objects
Any ideas?
This is the real data structure I am using.
I have been having trouble re-indexing it.
It sounds like you want to group by the Year, then rank the Returns in descending order.
import pandas as pd
s = pd.DataFrame([['2012', 'A', 3], ['2012', 'B', 8], ['2011', 'A', 20], ['2011', 'B', 30]],
                 columns=['Year', 'Manager', 'Return'])
s['Rank'] = s.groupby(['Year'])['Return'].rank(ascending=False)
print(s)
yields
   Year Manager  Return  Rank
0  2012       A       3   2.0
1  2012       B       8   1.0
2  2011       A      20   2.0
3  2011       B      30   1.0
To address the OP's revised question: The error message
ValueError: cannot reindex from a duplicate axis
occurs when trying to groupby/rank on a DataFrame with duplicate values in the index. You can avoid the problem by constructing s to have unique index values after appending:
s = pd.DataFrame([['2012', 'A', 3], ['2012', 'B', 8], ['2011', 'A', 20], ['2011', 'B', 30]], columns=['Year', 'Manager', 'Return'])
b = pd.DataFrame([['2012', 'A', 3], ['2012', 'B', 8], ['2011', 'A', 20], ['2011', 'B', 30]], columns=['Year', 'Manager', 'Return'])
s = s.append(b, ignore_index=True)
yields
   Year Manager  Return
0  2012       A       3
1  2012       B       8
2  2011       A      20
3  2011       B      30
4  2012       A       3
5  2012       B       8
6  2011       A      20
7  2011       B      30
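Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so on a recent install the same idea can be sketched with pd.concat instead. The names s and b follow the question; stacked and combined, and the is_unique checks, are added here purely for illustration.

```python
import pandas as pd

rows = [['2012', 'A', 3], ['2012', 'B', 8], ['2011', 'A', 20], ['2011', 'B', 30]]
cols = ['Year', 'Manager', 'Return']
s = pd.DataFrame(rows, columns=cols)
b = pd.DataFrame(rows, columns=cols)

# Plain concatenation keeps each frame's own labels, so 0..3 appear twice.
stacked = pd.concat([s, b])
print(stacked.index.is_unique)

# ignore_index=True discards the old labels and numbers the rows 0..7.
combined = pd.concat([s, b], ignore_index=True)
print(combined.index.is_unique)

# With a unique index, the per-year rank can be assigned back as a column.
combined['Rank'] = combined.groupby('Year')['Return'].rank(ascending=False)
print(combined)
```

Checking index.is_unique before the groupby/rank step is a cheap way to confirm whether duplicate labels are present.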
If you've already appended new rows using
s = s.append(b)
then use reset_index to create a unique index:
s = s.reset_index(drop=True)
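Because the appended rows repeat the original ones, every Return value is tied within its year, and the default method='average' then produces fractional ranks. If whole-number ranks are wanted, one option is method='dense'. This is only a sketch: the choice of 'dense' is an assumption about the desired result, not something stated in the question, and pd.concat stands in for append so the code also runs on pandas 2.x.

```python
import pandas as pd

rows = [['2012', 'A', 3], ['2012', 'B', 8], ['2011', 'A', 20], ['2011', 'B', 30]]
cols = ['Year', 'Manager', 'Return']
s = pd.DataFrame(rows, columns=cols)
b = pd.DataFrame(rows, columns=cols)

# Rows stacked without ignore_index: the labels 0..3 are duplicated.
s = pd.concat([s, b])

# reset_index(drop=True) replaces them with a fresh 0..7 index
# and discards the old labels instead of keeping them as a column.
s = s.reset_index(drop=True)

# method='dense' gives tied values the same rank and leaves no gaps,
# so the result can safely be cast to int here (no NaN values present).
s['Rank'] = (s.groupby('Year')['Return']
               .rank(method='dense', ascending=False)
               .astype(int))
print(s)
```

With method='dense', the highest Return in each year gets rank 1 and the next distinct value gets rank 2, regardless of how many duplicate rows share each value.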