Transferring Excel Sheet Contents to Word with Python

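During a summer internship at an inspection company, organizing the files for curtain-wall safety inspections involved a lot of mindless copying and pasting. So I tinkered with some basic operations of the xlrd package (reading xlsx files) and the docx package (python-docx, imported as docx, for writing docx files). This is a short record of what I learned.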

Regular expression summary

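# Summary
# ^ matches the start of the string.
# $ matches the end of the string.
# \b matches a word boundary.
# \d matches any digit.
# \D matches any non-digit character.
# x? matches an optional x character (in other words, it matches x zero or one time).
# x* matches x zero or more times.
# x+ matches x one or more times.
# x{n,m} matches x at least n times and at most m times.
# (a|b|c) matches either a, or b, or c.
# (x) generally denotes a remembered group. You can get its value with the groups() method of the object returned by re.search.
# A dot in a regular expression usually means "match any single character" (by default, any character except a newline).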
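
A few of the rules above can be tried out directly with Python's re module. This is only a small sketch, and the sample strings are made up for illustration:

```python
import re

# ^ and $ anchor the pattern to the start and end of the string
only_digits = bool(re.search(r"^\d+$", "2023"))       # True
not_only_digits = bool(re.search(r"^\d+$", "2023a"))  # False

# x? : the b is optional
optional_b = re.findall(r"ab?", "a ab abb")           # ['a', 'ab', 'ab']

# x{n,m} : two or three digits in a row
two_or_three = re.findall(r"\d{2,3}", "1 22 333 4444")  # ['22', '333', '444']

# (a|b|c) alternation combined with (x) remembered groups
match = re.search(r"(cat|dog) (\d+)", "dog 7")
groups = match.groups()                               # ('dog', '7')

print(only_digits, not_only_digits, optional_b, two_or_three, groups)
```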

Source

Excel cell content: 主楼：45 裙楼：30 (主楼 = main building, 裙楼 = podium)

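The Word table has two cells that record the heights of the main building and the podium respectively. If the format were consistent, slicing alone would do, but the original Excel sheets are messy: the colon after 主楼 is sometimes full-width and sometimes half-width, spaces appear in unpredictable numbers, either height may be missing, some values are floats, and so on. There are too many cases to anticipate.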

Solution:

Expression: \d+\.?\d* (one or more digits, an optional decimal point, then zero or more digits)



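``` python
import re

MuQiangGaoDu = "主楼：45 裙楼：30"  # the cell content shown above
LouGao = re.findall(r"\d+\.?\d*", MuQiangGaoDu)  # every number in the cell
print(LouGao)  # ['45', '30']
```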


``` python
# Split the cell at '裙' so that each height is extracted from its own part
slicenum = MuQiangGaoDu.find('裙')
ZhuLouGao = re.findall(r"\d+\.?\d*", MuQiangGaoDu[0:slicenum])  # ['45']
QunLouGao = re.findall(r"\d+\.?\d*", MuQiangGaoDu[slicenum:])  # ['30']
```
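
Because the cells are messy (full-width or half-width colons, stray spaces, missing values, floats), a slightly more defensive variant may be worth having. The following is only a sketch: the function name `parse_heights` and the choice to return `None` for a missing height are my assumptions, not part of the original script. It also sidesteps a pitfall of `str.find`, which returns -1 when '裙' is absent, so that a slice like `[0:slicenum]` would silently drop the last character.

```python
import re

NUMBER = r"\d+\.?\d*"  # same expression as above


def parse_heights(text):
    """Return (main_height, podium_height) as strings, or None when missing.

    Hypothetical helper: looks for a number right after 主楼 and after 裙楼,
    tolerating ':' or '：' and any amount of whitespace in between.
    """
    def height_after(label):
        m = re.search(label + r"\s*[:：]?\s*(" + NUMBER + ")", text)
        return m.group(1) if m else None

    return height_after("主楼"), height_after("裙楼")


print(parse_heights("主楼：45 裙楼：30"))      # ('45', '30')
print(parse_heights("主楼: 45.5  裙楼 ：30"))  # ('45.5', '30')
print(parse_heights("主楼：45"))               # ('45', None)
```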

The string prefixes "b", "u", "r" and "f"

Reference

The b"" prefix means that the string literal that follows is of type bytes.

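Use:

In network programming, servers and browsers only deal in bytes, so text has to be encoded before it is sent and decoded after it is received.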

PS: converting between bytes and str

str.encode('utf-8')    # str -> bytes
bytes.decode('utf-8')  # bytes -> str
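
A quick round trip shows both conversions. The sample text is arbitrary:

```python
text = "主楼"
data = text.encode("utf-8")      # str -> bytes
restored = data.decode("utf-8")  # bytes -> str

print(data)                  # b'\xe4\xb8\xbb\xe6\xa5\xbc'
print(len(text), len(data))  # 2 6  (each of these characters takes 3 bytes in UTF-8)
print(restored == text)      # True
```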

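The u"" prefix marks the string that follows as Unicode. It was typically put in front of Chinese strings to prevent garbled text caused by the encoding of the source file when the string is used later. In Python 3 every str is already Unicode, so the prefix is still accepted but has no effect.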

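The r"" prefix turns off backslash escape processing.

For example, r"\n\n\n" is a raw string holding the literal characters \n\n\n rather than three line breaks.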
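
The difference is easy to check by comparing lengths. This is also why the regular expressions earlier are written as raw strings:

```python
normal = "\n\n\n"  # three newline characters
raw = r"\n\n\n"    # six characters: a backslash and an n, three times

print(len(normal), len(raw))  # 3 6
print(raw == "\\n\\n\\n")     # True
print(r"\d+" == "\\d+")       # True: the raw form saves doubling each backslash
```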

The f"" prefix allows Python expressions inside curly braces within the string.

For example:

print(f'{name} done in {time.time() - t0:.2f} s')
Output:
processing done in 1.00 s

Converting Excel formats: xls to xlsx

import win32com.client as win32

fname = "C:\\Code\\Python\\" + filename + ".xls"
excel = win32.gencache.EnsureDispatch('Excel.Application')
wb = excel.Workbooks.Open(fname)
wb.SaveAs(fname + "x", FileFormat=51)  # 51 = xlOpenXMLWorkbook, i.e. .xlsx
wb.Close()
excel.Application.Quit()

filename is the file name as a string, without the extension.

Converting Word formats: doc to docx

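import win32com.client as wc
import docx

word = wc.Dispatch("Word.Application")
# 文件名 is a placeholder for the actual file name
doc = word.Documents.Open(r"C:\Code\Python\文件名.doc")
doc.SaveAs(r"C:\Code\Python\文件名.docx", 12)  # 12 = wdFormatXMLDocument, i.e. .docx
doc.Close()  # the parentheses are required, otherwise the method is never called
word.Quit()

# Read the converted file back and print each paragraph
path = r"C:\Code\Python\文件名.docx"
file = docx.Document(path)
for p in file.paragraphs:
    print(p.text)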

Reading a specific cell from an xlsx file

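import xlrd  # note: xlrd 2.0 and later read only .xls; .xlsx needs xlrd < 2.0

workbook = xlrd.open_workbook(filename + '.xlsx')
sheet = workbook.sheets()[0]  # the sheet at the given index, starting from 0
JianZhuMingCheng = sheet.cell(1, 2).value  # cell content (row 1, column 2, both 0-based)
if sheet.cell(5, 6).ctype == 2:  # the type of the cell content
    ...  # 0 empty, 1 string, 2 number, 3 date, 4 boolean, 5 error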
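
xlrd returns numeric cells as floats, so a height typed as 45 comes back as 45.0. A small helper along the following lines could turn a cell into text before it goes into the Word table. The name `cell_to_text`, the lookup table and the formatting rule are assumptions for illustration. Only the ctype codes come from the snippet above.

```python
# ctype codes as listed above (lookup table added here only for readability)
CTYPE_NAMES = {0: "empty", 1: "string", 2: "number",
               3: "date", 4: "boolean", 5: "error"}


def cell_to_text(value, ctype):
    """Convert a cell value to a string, given its xlrd ctype code."""
    if ctype == 0:                      # empty cell
        return ""
    if ctype == 2 and float(value).is_integer():
        return str(int(value))          # 45.0 -> '45'
    return str(value).strip()           # everything else: plain text, trimmed


print(cell_to_text(45.0, 2))      # 45
print(cell_to_text(45.5, 2))      # 45.5
print(cell_to_text(" 主楼 ", 1))  # 主楼
```

With a real sheet it would be called as `cell_to_text(sheet.cell(5, 6).value, sheet.cell(5, 6).ctype)`.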

Changing values in a table of a docx file

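from docx import Document
from docx.shared import Pt
from docx.oxml.ns import qn

path = "C:\\Code\\Python\\文件名.docx"
document = Document(path)
tables = document.tables
table = tables[0]  # the first table in the document

# Set the font and the font size
document.styles['Normal'].font.name = u'仿宋'
document.styles['Normal'].font.size = Pt(9)
# East Asian text uses a separate font setting, so set it explicitly
document.styles['Normal']._element.rPr.rFonts.set(qn('w:eastAsia'), u'仿宋')

table.cell(0, 4).text = 'the content to write, or any string object'  # row and column indices into the Word table, starting from 0
document.save("C:\\Code\\Python\\" + filename + ".docx")  # filename is the file name as a string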
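
When several cells have to be filled, the assignments can be driven by a mapping from table coordinates to text. This is a sketch only: `fill_table` and the coordinates in the example call are hypothetical, and the function relies on nothing but the `table.cell(row, col).text` attribute used above.

```python
def fill_table(table, values):
    """Write each entry of `values` into the table cell at its (row, col) key."""
    for (row, col), text in values.items():
        table.cell(row, col).text = str(text)  # cell text must be a string


# Example call (coordinates are made up for illustration):
# fill_table(table, {(0, 4): JianZhuMingCheng, (1, 4): ZhuLouGao[0]})
```

Keeping the coordinates in one dictionary makes it easier to adjust the script when the layout of the Word template changes.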