Converting PDFs to Google Docs using Drive API/DriveApp

Date: 2023-02-07

Problem description

This problem has been successfully resolved. I am editing my post to document my experience for posterity and future reference.

I have 117 PDF files (average size ~238 KB) uploaded to Google Drive. I want to convert them all to Google Docs and keep them in a different Drive folder.

I attempted to convert the files using Drive.Files.insert. However, in most cases only 5 files could be converted this way before the function terminated prematurely with this error:

Limit Exceeded: DriveApp. (line #, file "Code")

The line referenced in the error is the one where the insert function is called. After calling this function for the first time, subsequent calls typically failed immediately, and no additional Google Docs were created.

I used three main approaches to achieve my goal. One was Drive.Files.insert, as mentioned above. The other two were Drive.Files.copy and sending a batch of HTTP requests. These last two were suggested by Tanaike, and I recommend reading his answer below for more information. The insert and copy functions are from the Google Drive REST v2 API, while batching multiple HTTP requests is from Drive REST v3.
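As a rough illustration of the Drive.Files.copy approach, the sketch below copies one PDF into a destination folder and requests a Google Docs target type, in the same way the `resources` object in the functions in this post sets `mimeType`. It assumes the Drive advanced service (v2) is enabled in the Apps Script project. The function name `convertPdfByCopy`, its arguments, and the `drive` parameter are illustrative and not taken from the post or from the referenced answer, whose exact options may differ.

```javascript
/**
 * Sketch: convert one PDF to a Google Doc by copying it with a Docs target type.
 * `drive` is expected to be the Drive advanced service (v2) in Apps Script.
 * It is passed as a parameter (an assumption of this sketch) so the function
 * can be exercised with a stub outside Apps Script.
 */
function convertPdfByCopy(drive, pdfId, title, destFolderId) {
  var resource = {
    title: title,
    // String value of MimeType.GOOGLE_DOCS in Apps Script
    mimeType: 'application/vnd.google-apps.document',
    // Drive v2 expects an array of parent references
    parents: [{id: destFolderId}]
  };
  // Returns the metadata of the newly created file
  return drive.Files.copy(resource, pdfId);
}
```

In Apps Script this could be called as `convertPdfByCopy(Drive, file.getId(), file.getName(), destFolderId)`, where `file` and `destFolderId` are whatever the calling loop provides.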

With Drive.Files.insert, I ran into execution limits (explained in the Problem section above). One solution was to run the functions multiple times. For that, I needed a way to keep track of which files had already been converted. I had two options: a spreadsheet or a continuation token. Therefore, I had 4 different methods to test: the two mentioned in this paragraph, batching HTTP requests, and calling Drive.Files.copy.
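The continuation-token option can be sketched roughly as follows. This is not the exact function used in the post: it assumes a per-run cap (`maxPerRun`) as the stopping rule, and `convertBatch`, `resumeConversion`, the property key `PDF_TOKEN`, and the `SOURCE_FOLDER_ID` placeholder are all hypothetical names. The iterator only needs to behave like an Apps Script FileIterator (`hasNext`, `next`, `getContinuationToken`).

```javascript
/**
 * Sketch: convert at most `maxPerRun` files from an iterator, then return a
 * continuation token so that a later run can pick up where this one stopped.
 * `convert` is a caller-supplied callback (hypothetical) that handles one file.
 */
function convertBatch(files, convert, maxPerRun) {
  var converted = 0;
  while (files.hasNext()) {
    if (converted >= maxPerRun) {
      // Cap reached with files remaining: hand back a token for the next run
      return {converted: converted, token: files.getContinuationToken()};
    }
    convert(files.next());
    converted++;
  }
  // A null token means every file has been processed
  return {converted: converted, token: null};
}

// Possible Apps Script wiring (defined here but not invoked).
var SOURCE_FOLDER_ID = 'SOURCE_FOLDER_ID';  // placeholder, replace with a real ID

function resumeConversion() {
  var props = PropertiesService.getScriptProperties();
  var token = props.getProperty('PDF_TOKEN');
  var files = token
      ? DriveApp.continueFileIterator(token)
      : DriveApp.getFolderById(SOURCE_FOLDER_ID).getFiles();
  var result = convertBatch(files, function (file) {
    // Convert `file` here, for example with Drive.Files.insert or Drive.Files.copy
  }, 5);
  if (result.token) {
    props.setProperty('PDF_TOKEN', result.token);
  } else {
    props.deleteProperty('PDF_TOKEN');
  }
}
```

Running `resumeConversion` repeatedly, by hand or from a time-based trigger, would work through the folder a few files at a time. The cap of 5 simply echoes the number of files that typically converted before the error appeared.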

Because Team Drives behave differently from regular drives, I felt it necessary to try each of those methods twice: once with the PDFs in a regular (non-Team Drive) folder and once with that folder under a Team Drive. In total, this means I had 8 different methods to test.
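One difference that shows up in code is that Drive REST calls touching files in a Team Drive (now called a shared drive) generally need an extra flag. The helper below is only a sketch of how the two test setups could share one code path; `driveCallOptions` is a hypothetical name, and it uses the `supportsAllDrives` parameter, which superseded the older `supportsTeamDrives` parameter in the Drive API.

```javascript
/**
 * Sketch: build the optional arguments for a Drive v2 call such as
 * Drive.Files.insert or Drive.Files.copy.
 * `inTeamDrive` (an assumption of this sketch) says whether the source or
 * destination folder lives in a Team Drive / shared drive.
 */
function driveCallOptions(inTeamDrive) {
  var options = {};
  if (inTeamDrive) {
    // Declares that the caller handles shared-drive files
    options.supportsAllDrives = true;
  }
  return options;
}
```

The result would be passed as the last argument of the call, for example `Drive.Files.copy(resource, fileId, driveCallOptions(true))`.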

These are the exact functions I used. Each of these was used twice, with the only variations being the ID of the source and destination folders (for reasons stated above):

function toDocs() {
  var sheet = SpreadsheetApp.openById(/* spreadsheet id*/).getSheets()[0];
  var range = sheet.getRange("A2:E118");
  var table = range.getValues();
  var len = table.length;
  var resources = {
    title: null,
    mimeType: MimeType.GOOGLE_DOCS,
    parents: [{id: /* destination folder id *