我需要确定目录中哪个文件是二进制,哪个是文本.
I need identify which file is binary and which is a text in a directory.
我尝试使用 mimetypes 但在我的情况下这不是一个好主意,因为它无法识别所有文件 mime,而且我这里有陌生人...我只需要知道,二进制或文本.简单的 ?但是我找不到解决方案...
I tried use mimetypes but it isnt a good idea in my case because it cant identify all files mimes, and I have strangers ones here... I just need know, binary or text. Simple ? But I couldn´t find a solution...
谢谢
谢谢大家,我找到了适合我的问题的解决方案.我在 http://code.activestate.com/recipes/173220/ 和我只改变了一点以适合我.
Thanks everybody, I found a solution that suited my problem. I found this code at http://code.activestate.com/recipes/173220/ and I changed just a little piece to suit me.
它工作正常.
from __future__ import division
import string
def istext(filename):
s=open(filename).read(512)
text_characters = "".join(map(chr, range(32, 127)) + list("
"))
_null_trans = string.maketrans("", "")
if not s:
# Empty files are considered text
return True
if " " in s:
# Files with null bytes are likely binary
return False
# Get the non-text characters (maps a character to itself then
# use the 'remove' option to get rid of the text characters.)
t = s.translate(_null_trans, text_characters)
# If more than 30% non-text characters, then
# this is considered a binary file
if float(len(t))/float(len(s)) > 0.30:
return False
return True
这篇关于如何使用 Python 识别二进制文件和文本文件?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持html5模板网!
如何在python中的感兴趣区域周围绘制一个矩形How to draw a rectangle around a region of interest in python(如何在python中的感兴趣区域周围绘制一个矩形)
如何使用 OpenCV 检测和跟踪人员?How can I detect and track people using OpenCV?(如何使用 OpenCV 检测和跟踪人员?)
如何在图像的多个矩形边界框中应用阈值?How to apply threshold within multiple rectangular bounding boxes in an image?(如何在图像的多个矩形边界框中应用阈值?)
如何下载 Coco Dataset 的特定部分?How can I download a specific part of Coco Dataset?(如何下载 Coco Dataset 的特定部分?)
根据文本方向检测图像方向角度Detect image orientation angle based on text direction(根据文本方向检测图像方向角度)
使用 Opencv 检测图像中矩形的中心和角度Detect centre and angle of rectangles in an image using Opencv(使用 Opencv 检测图像中矩形的中心和角度)