I have a data frame like the one below, and I'm trying to reshape it with pivot from Pandas so that I keep some values from the original rows while turning the duplicate rows into columns and renaming them. Sometimes an ID has as many as 5 duplicate rows.
I have been trying, but I can't get it to work.
import pandas as pd
df = pd.read_csv("C:dummy")
df = df.pivot(index=["ID"], columns=["Zone","PTC"], values=["Zone","PTC"])
# Rename columns and reset the index.
df.columns = [["PTC{}","Zone{}"],.format(c) for c in df.columns]
df.reset_index(inplace=True)
# Drop duplicates
df.drop(["PTC","Zone"], axis=1, inplace=True)
Input:
ID Agent OV Zone Value PTC
1 10 26 M1 10 100
2 26.5 8 M2 50 95
2 26.5 8 M1 6 5
3 4.5 6 M3 4 40
3 4.5 6 M4 6 60
4 1.2 0.8 M1 8 100
5 2 0.4 M1 6 10
5 2 0.4 M2 41 86
5 2 0.4 M4 2 4
Desired output:
ID Agent OV Zone1 Value1 PTC1 Zone2 Value2 PTC2 Zone3 Value3 PTC3
1 10 26 M_1 10 100 0 0 0 0 0 0
2 26.5 8 M_2 50 95 M_1 6 5 0 0 0
3 4.5 6 M_3 4 40 M_4 6 60 0 0 0
4 1.2 0.8 M_1 8 100 0 0 0 0 0 0
5 2 0.4 M_1 6 10 M_2 41 86 M_4 2 4
Use cumcount to number the rows within each group, build a MultiIndex with set_index, reshape with unstack, and finally flatten the column values:
g = df.groupby(["ID","Agent", "OV"]).cumcount().add(1)
df = df.set_index(["ID","Agent","OV", g]).unstack(fill_value=0).sort_index(axis=1, level=1)
df.columns = ["{}{}".format(a, b) for a, b in df.columns]
df = df.reset_index()
print (df)
ID Agent OV Zone1 Value1 PTC1 Zone2 Value2 PTC2 Zone3 Value3 PTC3
0 1 10.0 26.0 M1 10 100 0 0 0 0 0 0
1 2 26.5 8.0 M2 50 95 M1 6 5 0 0 0
2 3 4.5 6.0 M3 4 40 M4 6 60 0 0 0
3 4 1.2 0.8 M1 8 100 0 0 0 0 0 0
4 5 2.0 0.4 M1 6 10 M2 41 86 M4 2 4
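To make the role of the counter clearer, here is a minimal sketch that rebuilds the sample input by hand (instead of reading the CSV) and prints only the cumcount step. The variable name counter is just for illustration.

```python
import pandas as pd

# Sample input copied from the question, built in memory instead of read from CSV.
df = pd.DataFrame({
    "ID":    [1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Agent": [10, 26.5, 26.5, 4.5, 4.5, 1.2, 2, 2, 2],
    "OV":    [26, 8, 8, 6, 6, 0.8, 0.4, 0.4, 0.4],
    "Zone":  ["M1", "M2", "M1", "M3", "M4", "M1", "M1", "M2", "M4"],
    "Value": [10, 50, 6, 4, 6, 8, 6, 41, 2],
    "PTC":   [100, 95, 5, 40, 60, 100, 10, 86, 4],
})

# cumcount() numbers the rows inside each group starting at 0;
# add(1) shifts that to 1-based so the suffixes become Zone1, Zone2, ...
g = df.groupby(["ID", "Agent", "OV"]).cumcount().add(1)

counter = g.tolist()
print(counter)  # [1, 1, 2, 1, 2, 1, 1, 2, 3]
```

Each value in the counter becomes the last level of the MultiIndex, and unstack then turns that level into the numbered column suffixes.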
If you want to replace missing values with 0 only in the numeric columns (and leave the Zone columns as empty strings):
g = df.groupby(["ID","Agent"]).cumcount().add(1)
df = df.set_index(["ID","Agent","OV", g]).unstack().sort_index(axis=1, level=1)
idx = pd.IndexSlice
df.loc[:, idx[['Value','PTC']]] = df.loc[:, idx[['Value','PTC']]].fillna(0).astype(int)
df.columns = ["{}{}".format(a, b) for a, b in df.columns]
df = df.fillna('').reset_index()
print (df)
   ID  Agent    OV Zone1  Value1  PTC1 Zone2  Value2  PTC2 Zone3  Value3  PTC3
0   1   10.0  26.0    M1      10   100             0     0             0     0
1   2   26.5   8.0    M2      50    95    M1       6     5             0     0
2   3    4.5   6.0    M3       4    40    M4       6    60             0     0
3   4    1.2   0.8    M1       8   100             0     0             0     0
4   5    2.0   0.4    M1       6    10    M2      41    86    M4       2     4
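Since the question asks about pivot specifically, the same reshaping can probably also be written with DataFrame.pivot once the counter is stored as a column. This is only a sketch and not part of the answer above: it assumes pandas 1.1 or later (where pivot accepts a list for index), the helper column name n is made up for illustration, and missing cells are left as NaN rather than filled.

```python
import pandas as pd

# Sample input copied from the question.
df = pd.DataFrame({
    "ID":    [1, 2, 2, 3, 3, 4, 5, 5, 5],
    "Agent": [10, 26.5, 26.5, 4.5, 4.5, 1.2, 2, 2, 2],
    "OV":    [26, 8, 8, 6, 6, 0.8, 0.4, 0.4, 0.4],
    "Zone":  ["M1", "M2", "M1", "M3", "M4", "M1", "M1", "M2", "M4"],
    "Value": [10, 50, 6, 4, 6, 8, 6, 41, 2],
    "PTC":   [100, 95, 5, 40, 60, 100, 10, 86, 4],
})

keys = ["ID", "Agent", "OV"]

# Hypothetical helper column "n": 1-based position of each row within its group.
df["n"] = df.groupby(keys).cumcount().add(1)

# With values omitted, pivot spreads every remaining column (Zone, Value, PTC)
# across the values of "n", giving MultiIndex columns such as ("Zone", 1).
wide = df.pivot(index=keys, columns="n")

# Put the columns in Zone1, Value1, PTC1, Zone2, ... order, then flatten them.
order = [(name, n)
         for n in range(1, int(df["n"].max()) + 1)
         for name in ["Zone", "Value", "PTC"]]
wide = wide[order]
wide.columns = [f"{name}{n}" for name, n in wide.columns]
wide = wide.reset_index()

print(wide.columns.tolist())
```

Because the gaps stay as NaN here, Value and PTC come back as floats; apply fillna as in the answer above if zeros or empty strings are wanted.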