The purpose of my code is to import two Excel files, compare them, and write the differences to a new Excel file.
However, after concatenating all the data and calling the drop_duplicates function, the code runs in the console without errors, but when the result is written to the new Excel file the duplicates still remain.
Am I missing something? Is something nullifying the drop_duplicates function?
My code is as follows:
import datetime
import xlrd
import pandas as pd
#identify excel file paths
filepath = r"excel filepath"
filepath2 = r"excel filepath2"
#read relevant columns from the excel files
df1 = pd.read_excel(filepath, sheetname="Sheet1", parse_cols= "B, D, G, O")
df2 = pd.read_excel(filepath2, sheetname="Sheet1", parse_cols= "B, D, F, J")
#merge the columns from both excel files into one column each respectively
df4 = df1["Exchange Code"] + df1["Product Type"] + df1["Product Description"] + df1["Quantity"].apply(str)
df5 = df2["Exchange"] + df2["Product Type"] + df2["Product Description"] + df2["Quantity"].apply(str)
#concatenate both columns from each excel file, to make one big column containing all the data
df = pd.concat([df4, df5])
#remove all whitespace from each row of the column of data
df=df.str.strip()
df=["".join(x.split()) for x in df]
#convert the data to a dataframe from a series
df = pd.DataFrame({'Value': df})
#remove any duplicates
df.drop_duplicates(subset=None, keep="first", inplace=False)
#print to the console just as a visual aid
print(df)
#print the erroneous entries to an excel file
df.to_excel("Comparison19.xls")
You've got inplace=False, so you're not modifying df: drop_duplicates returns a new DataFrame, and your code discards it. You want either
df.drop_duplicates(subset=None, keep="first", inplace=True)
or
df = df.drop_duplicates(subset=None, keep="first", inplace=False)
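To see the difference, here is a minimal sketch using made-up sample data (the values in the "Value" column are hypothetical stand-ins for the concatenated strings in the question). It also shows keep=False, which may be closer to what you want if the goal is to list only the entries that have no match at all; that part is an assumption about the intent, not something the question states outright.

```python
import pandas as pd

# Hypothetical sample data standing in for the concatenated "Value" column.
df = pd.DataFrame({"Value": ["A1", "B2", "A1", "C3", "B2"]})

# With inplace=False the de-duplicated frame is returned and df is left untouched.
# The return value is thrown away here, exactly as in the question.
df.drop_duplicates(subset=None, keep="first", inplace=False)
rows_without_assignment = len(df)

# Assigning the returned frame keeps the de-duplicated rows.
deduped = df.drop_duplicates(subset=None, keep="first")
rows_after_assignment = len(deduped)

# keep=False drops every row that has a duplicate anywhere,
# leaving only the values that appear exactly once.
unmatched = df.drop_duplicates(keep=False)

print(rows_without_assignment, rows_after_assignment, unmatched["Value"].tolist())
# prints: 5 3 ['C3']
```

Reassigning the result is generally the easier form to read and to chain with other calls than inplace=True.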