Smart document registry: free Google Drive OCR and Gemini quota tricks

Google Apps Script, hidden Drive OCR, Gemini key rotation, and LockService — a Habr pipeline for cataloging huge archives.

Published: 10 June 2026

In brief

How do you walk gigabytes of PDF and DOCX files on Google Drive, extract paper titles and abstracts, and survive Gemini API quotas? A Habr write-up chains Google Apps Script, built-in Drive OCR, time-driven triggers, LockService, and API key rotation, with no paid document parsers.

What happened

The author needed to catalog a large scientific archive: exact title, short summary, and whether a specific researcher co-authored each paper. A naive Apps Script hit three walls at once.

Six-minute execution limit: OCR plus an LLM call takes 15–40 seconds per heavy PDF, so a run dies around file 20.
Binary formats: Apps Script cannot read PDF or DOCX natively, and paid parsers are expensive.
Gemini free-tier quotas: requests quickly start returning HTTP 429.
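
As a rough check of the first wall, the arithmetic can be sketched as below. The function name filesPerRun is hypothetical, and the estimate ignores startup and Sheet I/O overhead, so treat it as an upper bound rather than a measurement.

```javascript
// Rough estimate of how many files fit into one Apps Script execution.
const EXECUTION_LIMIT_SECONDS = 6 * 60; // the six-minute cap

// filesPerRun is a hypothetical helper, not part of any Google API.
function filesPerRun(secondsPerFile, limitSeconds = EXECUTION_LIMIT_SECONDS) {
  if (secondsPerFile <= 0) {
    throw new RangeError("secondsPerFile must be positive");
  }
  return Math.floor(limitSeconds / secondsPerFile);
}

const bestCase = filesPerRun(15);  // 24 files at 15 s each
const worstCase = filesPerRun(40); // 9 files at 40 s each
```

A run that stops around file 20 sits inside this 9 to 24 range, which is why the work has to be split across several executions.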

The fix stacks several tricks. The first is hidden Google Drive OCR: through the Drive API, copy the PDF or DOCX to a temporary Google Doc with ocr: true, which uses the same engine as opening a scan manually in Google Docs. Read the text with DocumentApp, then delete the temporary file in a finally block, or Drive fills up with junk.
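
The OCR step described above might look roughly like this. It is a minimal sketch, not the author's code: it assumes the Drive advanced service in its v2 flavor (where Files.copy accepts ocr and ocrLanguage), and the function name, the temp-file title, and the injected drive and documentApp parameters are all hypothetical. Depending on the Drive service version, an extra conversion option may be needed to get a Google Doc.

```javascript
/**
 * Copies a PDF/DOCX to a temporary Google Doc with OCR and returns its text.
 * `drive` and `documentApp` are injected so the sketch can run outside
 * Apps Script; inside Apps Script, pass the globals Drive and DocumentApp.
 */
function extractTextWithOcr(fileId, drive, documentApp, ocrLanguage = "ru") {
  let tempId = null;
  try {
    const temp = drive.Files.copy(
      { title: "tmp-ocr-" + fileId },          // hypothetical temp-file title
      fileId,
      { ocr: true, ocrLanguage: ocrLanguage }  // options named in the article
    );
    tempId = temp.id;
    return documentApp.openById(tempId).getBody().getText();
  } finally {
    // Always clean up, even if reading the document failed.
    if (tempId) {
      drive.Files.remove(tempId);
    }
  }
}

// Inside Apps Script (assumed usage):
//   const text = extractTextWithOcr(file.getId(), Drive, DocumentApp);
```

Putting the delete in finally means a failure while reading the text still removes the temporary copy.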

To beat the six-minute cap, a Google Sheet serves as a simple database: it caches the names of processed files, a time-driven trigger restarts the script every minute, and each new run skips finished rows and resumes at the first unprocessed file (file 16, for example). LockService prevents races: while one run spends over a minute on OCR for a single PDF, the next trigger must not start writing duplicate rows.
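
A resumable run along these lines could be sketched as follows. The function runBatch, its option names, and the five-minute default budget are assumptions for illustration. The lock object is expected to behave like LockService.getScriptLock() (tryLock and releaseLock), and the sketch takes the lock once per run as a simplification.

```javascript
/**
 * Processes unfinished files until done or until the time budget is spent.
 * processedNames: names already recorded in the Sheet.
 * processFile: callback that does OCR + LLM and appends the row.
 */
function runBatch(opts) {
  const {
    fileNames,
    processedNames,
    processFile,
    lock,
    now = Date.now,
    budgetMs = 5 * 60 * 1000, // stop early, leaving headroom under the cap
  } = opts;

  // If a previous run is still working, let this trigger firing exit quietly.
  if (!lock.tryLock(1000)) {
    return { status: "skipped", processed: 0 };
  }
  const startedAt = now();
  const done = new Set(processedNames);
  let processed = 0;
  try {
    for (const name of fileNames) {
      if (done.has(name)) continue; // finished in an earlier run
      if (now() - startedAt > budgetMs) {
        return { status: "paused", processed }; // next trigger resumes here
      }
      processFile(name);
      done.add(name);
      processed++;
    }
    return { status: "finished", processed };
  } finally {
    lock.releaseLock();
  }
}
```

A "finished" status is a natural point to delete the time-driven trigger, while "paused" and "skipped" leave it in place for the next firing.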

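Gemini key rotation uses an array of AI Studio keys: on HTTP 429, switch to the next key; once every key in the pool has been tried, sleep 30 seconds so the requests-per-minute (RPM) limit can reset. Ask the LLM for JSON (responseMimeType: application/json) to get the title and summary in one call, with no markdown fences to strip.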
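
The rotation and JSON-mode request can be sketched as below. The helper names, the send and sleep callbacks, and the maxRounds cap are assumptions added for illustration. In Apps Script, send would typically wrap UrlFetchApp.fetch with muteHttpExceptions: true, and sleep would be Utilities.sleep.

```javascript
// Request body for generateContent that asks for raw JSON output.
function buildGeminiPayload(prompt) {
  return {
    contents: [{ parts: [{ text: prompt }] }],
    generationConfig: { responseMimeType: "application/json" },
  };
}

/**
 * Tries each key in turn; on HTTP 429 moves to the next key.
 * After a full pass over the pool, sleeps 30 s and starts over.
 * send(key) must return { status: number, body: string }.
 */
function callWithKeyRotation(keys, send, sleep, maxRounds = 3) {
  if (keys.length === 0) throw new Error("no API keys configured");
  for (let round = 0; round < maxRounds; round++) {
    for (const key of keys) {
      const res = send(key);
      if (res.status === 200) return res.body;
      if (res.status !== 429) {
        throw new Error("request failed with HTTP " + res.status);
      }
      // 429: this key is rate-limited, fall through to the next one.
    }
    if (round < maxRounds - 1) sleep(30000); // wait for the RPM window to reset
  }
  throw new Error("all keys still rate-limited after " + maxRounds + " rounds");
}

// Extracts the structured fields from a generateContent response body.
function parseFields(responseBody) {
  const text = JSON.parse(responseBody).candidates[0].content.parts[0].text;
  return JSON.parse(text); // already plain JSON, no fences to strip
}
```

Capping the number of rounds keeps a fully exhausted pool from spinning until the execution limit; the next trigger firing can simply retry later.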

Why it matters

The pattern shows that Apps Script can run long background pipelines when you chunk the work and guard shared state. For hundreds to thousands of Drive files (not millions), that is cheaper than a dedicated OCR server.

The trade-offs are dependence on Google quotas and temp-file hygiene. The combination of free OCR, Flash Lite, and a key pool can process on the order of 1,500 documents per day on three keys, in about 2 hours of trigger time.

In practice

Enable the Drive API advanced service in the Apps Script editor; DocumentApp alone is not enough.
Run OCR with Drive.Files.copy, ocr: true, and ocrLanguage: "ru"; delete temporary files in try/finally.
Track progress in Google Sheets; load processed names into a lookup and skip them before the loop.
Use time-driven triggers, and delete them when the catalog is finished.
Take LockService.getScriptLock() for each file so parallel runs cannot process it twice.
Keep a GEMINI_API_KEYS pool, rotate on 429, and call Utilities.sleep(30000) when all keys hit the RPM limit.
Set responseMimeType: "application/json" to get structured fields without stripping markdown json fences.
For non-text formats (.pptx, .xlsx), write placeholder rows and spend zero tokens.
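
The last item in the checklist above, placeholder rows for formats that are not sent to OCR or the LLM, can be sketched like this. The function name, the action labels, and the row fields are hypothetical. Treating every extension other than pdf and docx as a placeholder is an assumption of this sketch, not something the article states.

```javascript
// Extensions that go through OCR and the LLM (assumed allowlist).
const TEXT_EXTENSIONS = new Set(["pdf", "docx"]);

// Decides what to do with a file based on its extension alone.
function buildRowPlan(fileName) {
  const dot = fileName.lastIndexOf(".");
  const ext = dot === -1 ? "" : fileName.slice(dot + 1).toLowerCase();
  if (TEXT_EXTENSIONS.has(ext)) {
    return { fileName: fileName, action: "ocr_and_llm" };
  }
  // No OCR and no LLM call, so no tokens are spent on this file.
  return {
    fileName: fileName,
    action: "placeholder",
    title: fileName,
    summary: "(not processed: unsupported format)",
  };
}
```

Writing the placeholder row still marks the file as processed, so later runs skip it instead of retrying.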

Takeaway

The Habr article describes a practical, autonomous Drive document registry: OCR without third-party services, LLM field extraction, and resilience to timeouts and quotas. If your archive lives in Google's cloud, adapt the columns and prompts; code fragments are in the original.