法语助手到kindle系列2.5.1:拆!从网页转换到kindle
前文:
法语助手到kindle系列2.5:半自动化!从网页转换到kindle
本文的输入和输出文件跟上文是一样的,都是从法助的导出单词列表html变为字体大小适于kindle阅读的pdf。
区别在于对本篇对输出文档进行了拆分:
6寸的kindle显然不如文石之类的大屏阅读器舒服。但胜在便携,在拥挤的车厢使用时也不担心,随时可以揣进口袋腾出手,实在方便。
但kindle对pdf的支持能力也是够呛。目前我的单词列表超过两千词,因为pdf字大又加了目录,总页数近两万。实测体验看列表超过500词以上kindle的反应速度就明显受影响,呼出菜单,长按取词标注都等好久,等久了又不确定是不是刚才按屏幕压根没按上,重按一下,结果把正好刚跳出来的菜单又给按没了~~解决方案就一个字:
拆!
一个长文档拆成多个短文档即可。
手动拆太累又费时,自然又是要不惜花费些时间来写脚本。相当于在之前的脚本基础上加了个功能包,叫它2.5.1好了
操作步骤:
1.打开法助导出的网页

2.整个复制,贴到记事本

3.从记事本中再次整个复制,贴到一个新建空白word中。不必保存,脚本会做。

4.运行宏脚本“A_similitude_html_to_kindle”
5.运行结果:

10个docx文件(一个主文档,9个子文档,9号子文档是个副产品里头没单词)
8个pdf,拷入kindle背去吧。
处理2000个单词,分8个文档,脚本运行用时约7分钟。手工做的话估计要四倍时间。
所用脚本
Sub A_similitude_html_to_kindle()
'
'
'输出文件目录
ActiveDocument.SaveAs2 ("E:\Downloads\project similitude\similitude.docx")
Dim i_total_word As Integer
Dim i_num_of_doc As Integer
Dim i_line_per_doc As Integer
Dim i_text As String
'文档拆分个数
i_num_of_doc = 8
Selection.HomeKey Unit:=wdStory
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
' Selection.Find.Replacement.Style = ActiveDocument.Styles("标题 1")
With Selection.Find.Replacement.ParagraphFormat
.SpaceBeforeAuto = False
.SpaceAfterAuto = False
.WidowControl = False
.KeepWithNext = False
.KeepTogether = False
.PageBreakBefore = False
.WordWrap = True
End With
With Selection.Find
'搜索文件内容,找到“共导出xxxx条记录”一行,读取单词数
.Text = "共导出[0-9]{1,}条记录"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindStop
.Format = False
.MatchCase = False
.MatchWholeWord = False
.MatchByte = False
.MatchWildcards = True
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceNone
'对搜索结果string进行处理,去掉文字,只保留数字,转换格式成整型赋值给i_total_word
i_text = Selection.Text
With CreateObject("VBSCRIPT.REGEXP")
.Global = True
.Pattern = "[^!-~]"
i_text = .Replace(i_text, "")
End With
i_total_word = CInt(i_text)
'计算每个子文档有多少词,赋值给i_line_per_doc
i_line_per_doc = i_total_word / i_num_of_doc
' 找到 S:所在行,设置为三级目录,并在前方添加一分页
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
Selection.Find.Replacement.Style = ActiveDocument.Styles("标题 3")
With Selection.Find.Replacement.ParagraphFormat
.SpaceBeforeAuto = False
.SpaceAfterAuto = False
.WidowControl = False
.KeepWithNext = False
.KeepTogether = False
.PageBreakBefore = False
.WordWrap = True
End With
With Selection.Find
.Text = vbTab & "S:"
.Replacement.Text = ""
.Forward = True
.Wrap = wdFindContinue
.Format = True
.MatchCase = False
.MatchWholeWord = False
.MatchByte = True
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
Selection.HomeKey Unit:=wdStory
Selection.EscapeKey
Selection.HomeKey Unit:=wdStory
For i = 1 To i_num_of_doc
'到标题3级别的行中,挑出要加文档分段的行,前面加一行字“temp_similitude”
Selection.Find.Execute Replace:=wdReplaceAll
Selection.Find.ClearFormatting
Selection.Find.Style = ActiveDocument.Styles("标题 3")
Selection.Find.Replacement.ClearFormatting
With Selection.Find
.Text = "(<" & (i - 1) * i_line_per_doc + 1 & ">)"
.Replacement.Text = "temp_similitude^&"
.Forward = True
.Wrap = wdFindContinue
.Format = True
.MatchCase = False
.MatchWholeWord = False
.MatchByte = True
.MatchWildcards = True
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
Next
'在结尾再加一个temp_similitude,做个空文档。不加这个的话,最后一个子文档总是只有标题1“temp_similitude”,而没有后面的内容
Selection.EndKey Unit:=wdStory
Selection.TypeParagraph
Selection.TypeParagraph
Selection.TypeText Text:="temp_similitude"
' 找到所有temp_similitude所在行,设为标题1,并在后面加个换行
Selection.Find.ClearFormatting
Selection.Find.Replacement.ClearFormatting
Selection.Find.Replacement.Style = ActiveDocument.Styles("标题 1")
With Selection.Find.Replacement.ParagraphFormat
.SpaceBeforeAuto = False
.SpaceAfterAuto = False
.WidowControl = False
.KeepWithNext = False
.KeepTogether = False
.PageBreakBefore = False
.WordWrap = True
End With
With Selection.Find
.Text = "temp_similitude"
.Replacement.Text = "similitude^p"
.Forward = True
.Wrap = wdFindContinue
.Format = True
.MatchCase = False
.MatchWholeWord = False
.MatchByte = True
.MatchWildcards = False
.MatchSoundsLike = False
.MatchAllWordForms = False
End With
Selection.Find.Execute Replace:=wdReplaceAll
Selection.HomeKey Unit:=wdStory
Selection.EscapeKey
'文档拆分
ActiveWindow.ActivePane.View.Type = wdOutlineView
ActiveWindow.View.ShowHeading 1
Selection.EndKey Unit:=wdStory
Selection.HomeKey Unit:=wdLine, Extend:=wdExtend
ActiveWindow.View = wdOutlineView
Selection.EndKey Unit:=wdStory
'分i_num_of_doc个子文档,实际会生成i_num_of_doc+1个文档,排最后的是一个没用的空文档。
For i = 1 To i_num_of_doc + 1
Selection.HomeKey Unit:=wdLine
Selection.EndKey Unit:=wdLine, Extend:=wdExtend
ActiveDocument.Subdocuments.AddFromRange Range:=Selection.Range
Selection.MoveUp Unit:=wdLine, Count:=1
Next
ActiveDocument.Save
'脚本会依次打开各个子文档执行格式处理,并导出pdf。子文件的文件名是系统自定的少了个字母
For i = 1 To i_num_of_doc
Documents.Open FileName:="E:\Downloads\project similitude\similitud" & i & ".docx", ConfirmConversions:=False, ReadOnly _
:=False, AddToRecentFiles:=False, PasswordDocument:="", PasswordTemplate _
:="", Revert:=False, WritePasswordDocument:="", WritePasswordTemplate:="" _
, Format:=wdOpenFormatAuto, XMLTransform:=""
Application.Run MacroName:="A_一步到位similitude"
ActiveDocument.Save
ActiveWindow.Close
Next
MsgBox "fin"
End Sub

为啥又浪费时间做这种东西呢,一定是吃饱了撑的。