My wife was recently in Taiwan visiting her family for a couple of weeks, and she was facing a long flight home in a few days. She wanted to read a book that had been assigned by our priest at church. Simple enough request, right? Except the book wasn't available as a digital download, wasn't on Kindle, wasn't on any of the usual platforms.
Now, I could have just said "sorry, can't help" and told her to wait to read the physical copy when she got home. But that's not how you solve problems when you work with technology. You start asking: what pieces do I have? What can I combine? What's possible if I string things together?
Let me tell you what happened.
The Problem: A Book That Doesn't Want to Be Digital
I found the book on archive.org, a wonderful resource where people have scanned and uploaded books into a lending library. You can sign up, borrow a book digitally, and it has this reading icon that, when you press it, reads the book aloud in a mechanical voice. Not bad for accessibility purposes, provided you have a good internet connection.
But there was no download option.
The book was available to read on screen or listen to through their player, but I couldn't send it to my wife. I did buy a physical copy (I'm legitimate here), but that didn't solve the immediate problem: getting her something to read on a 13-hour flight. Scanning the book would have taken a day or two.
So I sat there looking at this reading icon on my iPad and thought: what if I treat this like building blocks?
The Chain: Archive → Audio → Otter → Claude → PDF
Here's what I built:
Block 1: Archive.org gave me access to the book with an audio reader — mechanical voice, but it worked.
Block 2: Otter on my iPhone. I put my iPhone next to my iPad, turned on the speaker, hit play on the archive.org reader, and let Otter record and transcribe everything. My iPad was reading the book aloud, and my iPhone was sitting there listening and capturing it all as a transcript. I had to do it in two parts, since Otter Pro has a 4-hour recording limit. The first obstacle was that the transcript came out as one run-together block of text, like a stream of consciousness.
Block 3: Claude for editing. I took the Otter transcript and fed it into Claude (ChatGPT didn't work well for this) and said: "Act as an editor. Put in the paragraph breaks, add the chapter titles and subtitles, clean this up."
That part took some iteration. We had to establish editing rules: how to handle dialogue, where to break paragraphs, how to identify chapter markers. Because the book was read by a mechanical voice, some punctuation cues were lost and some formatting was ambiguous. To aid the process, I provided a scan of the Table of Contents so Claude could better identify where chapter breaks happened. Once we got the rules set up, Claude could process the transcript chapter by chapter.
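To make the Table of Contents idea concrete, here is a minimal Python sketch of splitting a run-together transcript at chapter titles. It is an illustration only, not the prompt-based process described above: the function name `split_into_chapters`, the sample titles, and the assumption that each title is spoken verbatim and in order in the transcript are all hypothetical.

```python
import re


def split_into_chapters(transcript: str, chapter_titles: list[str]) -> dict[str, str]:
    """Split a run-together transcript at the first occurrence of each title.

    Assumes (hypothetically) that the titles from the Table of Contents
    appear verbatim, in order, somewhere in the transcript.
    """
    matches = []
    search_from = 0
    for title in chapter_titles:
        # Case-insensitive literal search, starting after the previous title.
        pattern = re.compile(re.escape(title), re.IGNORECASE)
        found = pattern.search(transcript, search_from)
        if found is None:
            continue  # title not transcribed cleanly; leave for manual review
        matches.append((title, found.start(), found.end()))
        search_from = found.end()

    chapters = {}
    for i, (title, _start, body_start) in enumerate(matches):
        # A chapter body runs until the next title, or to the end of the text.
        body_end = matches[i + 1][1] if i + 1 < len(matches) else len(transcript)
        chapters[title] = transcript[body_start:body_end].strip()
    return chapters
```

In practice a transcription tool may mishear a title, so any chapter missing from the result would still need a manual look.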
The sidebars in the book gave us trouble at first. But we figured out a way to flag them too, based on a list of sidebars I created, and we handled them separately. Claude did a pretty good job with those too.
Block 4: Assembly. I took the edited chapters, assembled them into a single document, converted it to PDF, and sent it to my wife. She loaded the PDF into Kindle to read it on her iPad during the flight.
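The assembly step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the tooling actually used: the `chapter_*.txt` naming scheme (zero-padded so that sorting by name gives reading order), the `assemble_book` function, and the plain-text output are all hypothetical, and the PDF conversion itself is left to whatever converter is on hand.

```python
from pathlib import Path


def assemble_book(chapter_dir: Path, output_file: Path) -> int:
    """Join edited chapter files into one document and return the chapter count.

    Assumes hypothetical file names like chapter_01.txt, chapter_02.txt, ...
    so that sorting by name yields the reading order.
    """
    chapter_files = sorted(chapter_dir.glob("chapter_*.txt"))
    parts = [f.read_text(encoding="utf-8").strip() for f in chapter_files]
    # Separate chapters with a blank line; end the file with a newline.
    output_file.write_text("\n\n".join(parts) + "\n", encoding="utf-8")
    return len(parts)
```

Calling `assemble_book(Path("edited"), Path("book.txt"))` would then produce a single file ready to hand to a PDF converter.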
None of these tools were designed to work together. Archive.org wasn't meant to be an audio source for Otter. Otter wasn't meant to transcribe books. Claude wasn't meant to be a book formatter. But when you put them together in sequence, each one doing what it does well, you get a solution that didn't exist before.
What I Like About This Approach
This is what I call the Lego approach: building blocks of technology that you can snap together in ways their creators never imagined.
Think about it: I didn't need a special "convert protected digital library books to readable PDFs" application. I didn't need to learn complex workarounds or break any digital rights management. I just needed to recognize that I had pieces that could connect to each other.
Archive.org → outputs audio
Otter → inputs audio, outputs text
Claude → inputs text, outputs formatted text
PDF converter → inputs formatted text, outputs readable document
Kindle → inputs readable PDF document, outputs organized book with bookmarks and annotations
Each block does one thing well. The magic is in recognizing how they can connect.
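For readers who think in code, the chain above has the same shape as function composition: each block takes the previous block's output as its input. The Python sketch below is only an analogy. The `chain` helper and the three stand-in blocks (`transcribe`, `edit`, `to_document`) are hypothetical placeholders that manipulate strings; they do not call Otter, Claude, or any real service.

```python
from functools import reduce
from typing import Callable

Block = Callable[[str], str]


def chain(*blocks: Block) -> Block:
    """Snap blocks together so each one feeds its output to the next."""
    def run(data: str) -> str:
        return reduce(lambda value, block: block(value), blocks, data)
    return run


# Hypothetical stand-ins for the real tools; each does one small thing.
def transcribe(audio: str) -> str:
    return audio.replace("[audio] ", "")      # audio in, raw text out


def edit(text: str) -> str:
    return text.capitalize() + "."            # raw text in, cleaned text out


def to_document(text: str) -> str:
    return "BOOK\n" + text                    # cleaned text in, document out


pipeline = chain(transcribe, edit, to_document)
print(pipeline("[audio] once upon a time"))
```

Swapping one block for another (a different transcriber, a different editor) leaves the rest of the chain untouched, which is the point of the Lego approach.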
This is how we've been approaching problems in the Data4Good team too. We don't always have the perfect tool for every job. But we have a growing collection of building blocks — web scrapers, transcription services, AI editors, data analyzers, visualization tools. The question isn't "do we have the exact right tool?" The question is "what combination of tools gets us there?"
The AI Editing Part: Rules Matter
I do want to mention one thing about the Claude editing phase, because it taught me something important.
When I first fed the transcript to ChatGPT, it didn't work well. When I switched to Claude and just said "clean this up," it also struggled. The breakthrough came when we established rules together:
- How to identify chapter breaks
- Where to place paragraph breaks
- How to handle quoted dialogue
- How to format section headers
- What to do with sidebars
Once we had those rules articulated, Claude could apply them consistently across all the chapters. It wasn't about the AI being "smart enough." It was about iterating, with some trial and error, and about me being clear enough about what I wanted and giving the AI enough process clarity to get it right.
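One way to picture the difference between "clean this up" and an explicit rule set is to write the rules down as data and build the same instructions for every chapter. The Python sketch below is a hypothetical illustration: the rule wording is placeholder text invented for the example (not the actual rules from this project), and `build_editing_prompt` only assembles a string; it does not call any AI service.

```python
# Placeholder rules for illustration only; the topics mirror the list above,
# but the wording is invented and is not the project's actual rule set.
EDITING_RULES = {
    "Chapter breaks": "Start a new chapter only at a title from the Table of Contents.",
    "Paragraph breaks": "Break paragraphs where the topic or speaker changes.",
    "Quoted dialogue": "Put spoken lines in quotation marks.",
    "Section headers": "Put each section header on its own line.",
    "Sidebars": "Flag sidebars from the supplied list and keep them separate.",
}


def build_editing_prompt(rules: dict[str, str], chapter_text: str) -> str:
    """Combine a fixed rule set with one chapter of raw transcript."""
    numbered = [
        f"{n}. {topic}: {rule}"
        for n, (topic, rule) in enumerate(rules.items(), start=1)
    ]
    return "\n".join([
        "Act as an editor. Apply these rules consistently:",
        *numbered,
        "",
        "Transcript to edit:",
        chapter_text.strip(),
    ])
```

Because every chapter gets the same numbered rules, revising a rule means changing it in one place and rerunning, which fits the iterate-and-refine process described here.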
This connects back to the building blocks idea: the better you understand what each block does well (and what it doesn't), the better you can connect them. Claude is excellent at applying consistent rules to large volumes of text. But it needed me to establish what those rules were, using evidence from the actual transcript we were working with.
It also underscores the conversational approach to problem solving that I advocate. The back-and-forth dialogue with AI is itself a way to iterate toward a solution, which is why I often approach AI as a conversation.
Confession
Let me be completely honest about the timeline and effort involved. Looking back at the file history, the AI editing phase turned out to be the most difficult and time-consuming building block. The project took about two weeks (late October through mid-November) with at least 23 iterations across chapters and components. I went through 6 versions of the editing rules themselves as we refined the process.
Is that faster than manually editing the raw transcripts? Probably not. The ROI isn't there yet if you're measuring pure efficiency. But the learning value was substantial. I now understand how to structure rules for AI editing, what works and what doesn't, and I have a reusable process. The first book took 13 days with 23 iterations. The next one would hopefully be faster.
What Do You Think?
This project makes me think about how we approach innovation. We often talk about finding "the right application" or waiting for technology to advance enough to solve our problems. But maybe the more valuable skill is recognizing that you can be your own systems integrator. You can build the chain.
The building blocks are already there. Archive.org exists. Otter exists. Claude exists. PDF converters exist. Kindle exists. None of them were designed to work together for this purpose. But they can.
So here's my question for you: What problem are you facing that doesn't have a ready-made solution? What building blocks do you have access to? What happens if you start connecting them? When was the last time you solved a problem by chaining tools together rather than finding the perfect tool? What makes you hesitate to try unconventional combinations of technologies?
The Lego approach isn't about having all the perfect pieces. It's about recognizing that the pieces you have can snap together in ways you haven't tried yet. When the right tool doesn't exist, look for building blocks you can connect. Each tool should do one thing well; the magic is in the connections. Iteration and rule-setting are part of the building process. Being a systems integrator is a valuable skill in the AI age.
For Further Reading
If you're interested in exploring the building blocks approach further, here are some related stories from my Letters to a Young Manager collection:
- "The Lego's Lesson" (Story #9): A management training exercise using a Lego blocks metaphor that reveals how deadline pressure changes our approach to teamwork and process
- "Assemble the Components" (Story #5): How building reusable program subroutines taught me that "assembly is easier and faster than creating from scratch"
- "The Truck" (Story #296): The story of a boy who solved a stuck truck problem with a brilliantly simple solution: "Just let the air out of the tires"
[1] This post was created with AI assistance (Claude), drawing from the author’s documents, meeting transcripts, and lessons learned from the project described. The content was then reviewed, edited, and adapted by the author.