Your Library Data Is Already Being Sold. You're Just Not Getting Paid.

Let me walk you through something that's been bothering me.

In 2025, OverDrive reported 820 million digital checkouts across its library partners. That's 820 million data points about what people are reading, where they're reading it, in what format, and how long they're willing to wait to get it.

That data was generated by library patrons. Using library cards. At libraries funded by tax dollars. Managed by library staff.

OverDrive collected all of it. And here's what they did with it.

The Panorama Project

In 2018, OverDrive created something called the Panorama Project. "Initial funding provided by OverDrive." They put together an advisory council that includes:

Erica Lazzaro: OverDrive's own EVP and General Counsel. Advising the research her employer funds.
Sari Feldman: Former ALA President. Appointed to OverDrive's Board of Directors in July 2020, one month after KKR closed its acquisition. She sits alongside three KKR board members. She's a paid board member of the company funding the research she advises.
Skip Dye: VP of Library Marketing & Digital Sales at Penguin Random House. PRH is OverDrive's largest publisher partner. He has direct commercial interest in the research concluding that library lending doesn't hurt retail sales.

The Panorama Project's first study measured a coordinated national marketing campaign (the Big Library Read) where one book was featured across 14,700 library branches simultaneously for two weeks. Ebook sales jumped 720%. They attributed this to library lending driving retail sales.

That's not a study of library lending. That's a study of a marketing campaign. But 720% became the number OverDrive cites everywhere.

Their consumer survey had 4,300 self-selected library users report their own purchasing behavior. No control group. No verified purchase data. No measurement of books people didn't buy because they borrowed instead.

Meanwhile, actual independent research tells a different story. A German government-commissioned study found e-lending has a "direct negative effect on retail sales." Japanese academic research found no statistically significant positive correlation. Macmillan's internal data was compelling enough that their CEO implemented a library ebook embargo in 2019 while simultaneously participating in the Panorama Project.

The Bookstore Pipeline

In January 2026, OverDrive CEO Steve Potash testified before the DC Council. Buried in footnote 3 on page 10, he disclosed this:

"In partnership with Kent District Library (MI)... we are preparing to test a service that connects KDL patrons that opt-in with a designated bookstore."

Read that again. OverDrive is building a system where a person borrows a book from a public library, and OverDrive uses that borrowing data to connect them to a commercial bookstore to buy books.

The library is the lead generation funnel. The patron data is the product. The bookstore connection is the monetization.

No press release. No public announcement from Kent District Library. The only disclosure is a footnote in government testimony.

Who Owns OverDrive?

KKR. The private equity firm. They bought OverDrive in 2020 for $775 million. They also owned RBmedia (the world's largest audiobook publisher, which they have since sold to H.I.G. Capital for over $1 billion). Richard Sarnoff, KKR's Chairman of Media, shepherded both deals.

Steve Potash's DC testimony never mentions KKR. Not once in 55 pages. He refers to "my family" and "our team." He doesn't mention that three of his board members are KKR appointees or that a KKR managing director sits on both the OverDrive board and the Simon & Schuster board simultaneously.

Every checkout, every hold, every wait time, every format preference across 43,000 library partners: that's not just platform data. That's a valuation asset. When KKR sells OverDrive, and they will, the data moat is a significant part of what the buyer is paying for.

So Here's My Question

Why are libraries generating this data for free?

Not the patron data. I'm not talking about who checked out what. I'm talking about the MARKET INTELLIGENCE.

What's circulating. In what format. In what geography. How many people are waiting. What the hold-to-copy ratio is on frontlist ebooks versus backlist. Where audiobook demand is outpacing print. Which titles are having unexpected second lives three years after publication.

Publishers don't have this. BookScan tracks retail sales. It does not track library circulation. The library channel (over a billion annual transactions, 174 million registered borrowers) is a black box to the publishing industry unless OverDrive chooses to share selected findings, wrapped in OverDrive's narrative, through OverDrive's self-funded research.

Libraries are sitting on the single largest dataset about American reading behavior that doesn't involve a cash register. And the only entity monetizing it is a private equity portfolio company.

What If Libraries Just... Published It?

Not patron records. Not individual checkouts. Not anything that identifies a human being.

Just this:

Title. Format. Circs this month. Holds this month. Copies owned. Hold ratio.

By zip code. Monthly. In a standard, open format anyone can use.

That's it. That's a CSV file. It takes 15 minutes to pull from any ILS.

And it changes everything.

For publishers: You finally see what's circulating in libraries. Not OverDrive's curated press release about their most-borrowed titles. Actual granular data showing demand by geography, format, and intensity. You see hold ratios of 28:1 on frontlist ebooks and you have to reckon with the fact that your $55, 26-checkout licensing model is suppressing transactions that would make you money.

For libraries: You have evidence. Not anecdotes. Not Panorama's self-serving surveys. Actual numbers showing that your community wants these books, you want to buy more copies, and the only thing stopping you is a pricing model designed by publishers who don't have visibility into the demand they're constraining. That's a negotiating position.

For authors: You see how your books actually perform in the library channel. Your royalty statement doesn't break out library sales. Your agent doesn't have library circ data. Nobody does, except OverDrive. And OverDrive doesn't share it with you.

For everyone: You can finally answer the question the industry has been fighting about for 15 years (does library lending help or hurt retail sales?) with independent data, analyzed by independent researchers, without a vendor's thumb on the scale.

The Pilot

I'm doing it. Little Schitt Creek Public Library is publishing monthly circulation intelligence starting now.

The data schema is open. Any library can use it. The sample report is below. It's a CSV file with a metadata header. No patron data. No privacy issue. No new technology required.

If 10 libraries do this, it's interesting. If 100 do it, it's a dataset. If 1,000 do it, it's the library-sector BookScan that should have existed a decade ago.

And the best part? You don't need permission. Not from ALA. Not from OverDrive. Not from your ILS vendor. Not from publishers. This is YOUR data. It was always your data. You just never published it because nobody told you it was worth something.

Someone else figured that out a long time ago. They've been extracting the value ever since.

Time to change that.

What a Report Looks Like

Here's January 2026 from Little Schitt Creek. Population served: 28,500.

Title	Author	Format	Circs	Holds	Copies	Hold Ratio
Onyx Storm	Rebecca Yarros	Print	52	48	6	8.00
Onyx Storm	Rebecca Yarros	eBook	44	112	4	28.00
Onyx Storm	Rebecca Yarros	Audio	28	67	3	22.33
Iron Flame	Rebecca Yarros	Print	47	12	8	1.50
Iron Flame	Rebecca Yarros	eBook	31	89	4	22.25
Iron Flame	Rebecca Yarros	Audio	22	41	3	13.67
Intermezzo	Sally Rooney	Print	38	6	6	1.00
Intermezzo	Sally Rooney	eBook	29	34	3	11.33
Atomic Habits	James Clear	Print	27	2	5	0.40
Atomic Habits	James Clear	eBook	19	8	2	4.00
Atomic Habits	James Clear	Audio	22	11	2	5.50

Look at the ebook and audiobook rows with hold ratios above 10. That means more than 10 patrons waiting for every copy the library owns.
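
If you want to pull those rows out yourself, here is a minimal Python sketch. The rows are copied from the illustrative table above and the threshold of 10 comes from the text; the function names and the tuple layout are my own assumptions for the example, not part of any standard.

```python
# A few rows copied from the illustrative January 2026 table:
# (title, format, holds, copies owned).
SAMPLE_ROWS = [
    ("Onyx Storm", "Print", 48, 6),
    ("Onyx Storm", "eBook", 112, 4),
    ("Onyx Storm", "Audio", 67, 3),
    ("Iron Flame", "eBook", 89, 4),
    ("Intermezzo", "eBook", 34, 3),
    ("Atomic Habits", "Audio", 11, 2),
]


def hold_ratio(holds, copies):
    """Holds divided by copies owned, rounded to two decimals."""
    if copies <= 0:
        raise ValueError("copies must be positive")
    return round(holds / copies, 2)


def flag_high_demand(rows, threshold=10.0):
    """Return (title, format, ratio) for digital rows whose ratio exceeds the threshold."""
    digital = {"eBook", "Audio"}
    return [
        (title, fmt, hold_ratio(holds, copies))
        for title, fmt, holds, copies in rows
        if fmt in digital and hold_ratio(holds, copies) > threshold
    ]


if __name__ == "__main__":
    for title, fmt, ratio in flag_high_demand(SAMPLE_ROWS):
        print(f"{title} ({fmt}): {ratio} holds per copy")
```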

Onyx Storm: 28 people in line for each ebook copy. At $55 a license, we'd need to spend over $600 just to get the wait down to something reasonable. For one title. In one month. At a library serving 28,500 people.
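
Here's that back-of-envelope math as a small Python sketch. I haven't defined "reasonable" anywhere, so the target ratio is a parameter you choose; the function names are made up for this example, and the sketch assumes the hold queue stays at 112 while copies are added, which real demand may not do.

```python
import math


def extra_licenses_needed(holds, copies_owned, target_ratio):
    """Smallest number of additional copies that brings holds / copies to the target or below."""
    if target_ratio <= 0:
        raise ValueError("target_ratio must be positive")
    copies_required = math.ceil(holds / target_ratio)
    return max(0, copies_required - copies_owned)


def cost_to_reach(holds, copies_owned, target_ratio, price_per_license):
    """Total spend for the extra licenses at a flat per-license price."""
    return extra_licenses_needed(holds, copies_owned, target_ratio) * price_per_license


if __name__ == "__main__":
    # Onyx Storm ebook from the sample table: 112 holds, 4 copies, $55 per license.
    for target in (10.0, 7.5, 5.0):
        extra = extra_licenses_needed(112, 4, target)
        print(f"target {target}: {extra} more licenses, ${extra * 55}")
```

With a target of 7.5 holds per copy, this works out to 11 more licenses, or $605. A stricter target of 5 would take 19 licenses, or $1,045.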

Now look at print. Same title, hold ratio of 8.00. Still high, but we have more copies and the per-unit cost is lower. The ebook bottleneck isn't about demand. It's about a licensing model that prices libraries out of meeting demand.

And Atomic Habits? Published in 2018. Still top-circulating more than seven years later. Print ratio is 0.40: no wait, healthy circulation, everyone's happy. But the ebook is 4.00 and the audiobook is 5.50. The publisher could sell us more digital licenses at a reasonable backlist price and generate steady revenue for years. Instead, the pricing keeps us at 2 copies and people wait.

This is what publishers can't see today. And it's what OverDrive has been sitting on.

How to Join

Pull your top-circ report from your ILS. Every system has one.

Format it: Title, Author, ISBN, Format, Circs, Holds, Copies Owned, Hold Ratio.

Add your library name, zip code, service population, and the month.

Post it. Anywhere. Your website. A Google Sheet. GitHub. Your blog. An email to your state library association.

The data standard is published below. It's intentionally simple. If you can export a spreadsheet, you can do this.

15 minutes a month. No budget. No new software. No vendor approval needed.

Just a librarian with a CSV and the willingness to say out loud what the numbers show.
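
If you'd rather script the formatting step than do it by hand, a rough Python sketch follows. The input column names ("Title", "Checkouts", "Copies", and so on) are hypothetical, since every ILS labels its export differently, and leaving the ratio blank when a title has zero copies is my own assumption. It uses only the standard library.

```python
import csv
import io

REPORT_COLUMNS = [
    "title", "author", "isbn", "format",
    "circs", "holds", "copies_owned", "hold_ratio",
]


def to_report_rows(ils_rows):
    """Map rows from a hypothetical ILS export onto the report columns."""
    report = []
    for row in ils_rows:
        holds = int(row["Holds"])
        copies = int(row["Copies"])
        # Assumption: leave the ratio blank when no copies are owned,
        # rather than dividing by zero.
        ratio = f"{holds / copies:.2f}" if copies > 0 else ""
        report.append({
            "title": row["Title"].strip(),
            "author": row["Author"].strip(),
            "isbn": row["ISBN"].strip() or "NONE",
            "format": row["Format"].strip().upper(),
            "circs": int(row["Checkouts"]),
            "holds": holds,
            "copies_owned": copies,
            "hold_ratio": ratio,
        })
    return report


def to_csv_text(report_rows):
    """Serialize report rows as CSV text with a header line."""
    buffer = io.StringIO(newline="")
    writer = csv.DictWriter(buffer, fieldnames=REPORT_COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(report_rows)
    return buffer.getvalue()


if __name__ == "__main__":
    export = [{
        "Title": "Iron Flame", "Author": "Yarros, Rebecca", "ISBN": "",
        "Format": "Print", "Checkouts": "47", "Holds": "12", "Copies": "8",
    }]
    print(to_csv_text(to_report_rows(export)), end="")
```

The csv module quotes the author field automatically because "Last, First" contains a comma, which is one reason to let a library write the file instead of joining strings yourself.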

But What About Privacy?

Title-level circulation counts are not patron records. "Iron Flame circulated 47 times in print in January" identifies zero human beings. It's the same category of data that's in your annual report, your board presentations, and the checkout statistics OverDrive already publishes in their own press releases.

ALA's Privacy Interpretation of the Library Bill of Rights: "Any data collected for analysis should be anonymous or aggregated, it should never be linked to personal information."

This is aggregated. This is anonymous. This is your data.

The privacy question isn't whether libraries should publish circ stats. The privacy question is why a private equity portfolio company has unrestricted access to granular patron lending behavior across 43,000 library systems and nobody's asking what they're doing with it.
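
If you want a mechanical safeguard before posting, one option is to check that the file's header contains only the title-level fields from the schema in this post. This is a sketch of that idea in Python. The check and the function name are my own suggestion rather than part of the standard, and the "patron_id" column in the example is a hypothetical leftover from an ILS export.

```python
# The eleven field names listed in the schema section of this post.
ALLOWED_COLUMNS = {
    "isbn", "title", "author", "format", "genre", "circs", "holds",
    "copies_owned", "hold_ratio", "pub_date", "publisher",
}


def unexpected_columns(header):
    """Return any column names that are not part of the published schema."""
    return [name for name in header if name.strip().lower() not in ALLOWED_COLUMNS]


if __name__ == "__main__":
    header = ["isbn", "title", "format", "circs", "patron_id"]
    extras = unexpected_columns(header)
    if extras:
        print("Do not publish yet, unexpected columns:", ", ".join(extras))
```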

The Schema

I kept it simple on purpose. If it's complicated, nobody will do it.

File: CSV, UTF-8, one file per month.

Name it: [YourCode]_circ_[YYYY]-[MM].csv
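
A tiny Python helper can build that name so the month is always zero-padded. The short code "LSCPL" is a made-up example, and restricting codes to letters, digits, and hyphens is my own assumption, since the spec doesn't say which characters are allowed.

```python
import re
from datetime import date


def report_filename(short_code, report_month):
    """Build '[YourCode]_circ_[YYYY]-[MM].csv' from a short code and any date in the month."""
    # Assumption: short codes use only letters, digits, and hyphens.
    if not re.fullmatch(r"[A-Za-z0-9-]+", short_code):
        raise ValueError("short code should contain only letters, digits, and hyphens")
    return f"{short_code}_circ_{report_month.year:04d}-{report_month.month:02d}.csv"


if __name__ == "__main__":
    print(report_filename("LSCPL", date(2026, 1, 1)))
```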

Metadata (top of your posting or separate file):

Library name
Short code
Zip code
Service population
Report month
ILS platform
Contact email

Each row:

Field	What it is
isbn	ISBN-13. Use "NONE" if unavailable
title	As it appears in your bib record
author	Last, First
format	PRINT, EBOOK, AUDIOBOOK, LARGE_PRINT, MAGAZINE, or OTHER
genre	Optional. BISAC code or plain text
circs	Total checkouts that month
holds	Active holds, last day of the month
copies_owned	Total copies/licenses you have in that format
hold_ratio	holds / copies_owned
pub_date	Optional. Year or full date
publisher	Optional

That's it. That's the whole spec.
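
If you want to sanity-check a file before posting it, here is a minimal row validator in Python based on the field table above. Treat it as a sketch: tolerating hyphens inside an ISBN, allowing a rounding tolerance of 0.005 on the ratio, and skipping the ratio check when copies_owned is 0 are all my assumptions, because the spec doesn't address them.

```python
FORMATS = {"PRINT", "EBOOK", "AUDIOBOOK", "LARGE_PRINT", "MAGAZINE", "OTHER"}
REQUIRED_TEXT = ("isbn", "title", "author", "format")
COUNT_FIELDS = ("circs", "holds", "copies_owned")


def validate_row(row):
    """Return a list of problems found in one report row (empty list means it passes)."""
    problems = []

    for field in REQUIRED_TEXT:
        if not (row.get(field) or "").strip():
            problems.append(f"{field} is empty")

    if row.get("format") and row["format"] not in FORMATS:
        problems.append(f"format {row['format']!r} is not in the allowed list")

    # Assumption: hyphens inside an ISBN-13 are tolerated and ignored.
    isbn = (row.get("isbn") or "").replace("-", "")
    if isbn and isbn != "NONE" and not (len(isbn) == 13 and isbn.isdigit()):
        problems.append("isbn must be an ISBN-13 or NONE")

    counts = {}
    for field in COUNT_FIELDS:
        try:
            counts[field] = int(row.get(field, ""))
        except (TypeError, ValueError):
            problems.append(f"{field} must be a whole number")
            continue
        if counts[field] < 0:
            problems.append(f"{field} must not be negative")

    # Check hold_ratio only when it can be computed. The spec does not say what
    # to publish when copies_owned is 0, so that case is skipped here.
    if len(counts) == 3 and counts["copies_owned"] > 0:
        expected = counts["holds"] / counts["copies_owned"]
        try:
            reported = float(row.get("hold_ratio", ""))
        except (TypeError, ValueError):
            problems.append("hold_ratio must be a number")
        else:
            if abs(reported - expected) > 0.005:
                problems.append("hold_ratio does not equal holds / copies_owned")

    return problems


if __name__ == "__main__":
    row = {
        "isbn": "NONE", "title": "Onyx Storm", "author": "Yarros, Rebecca",
        "format": "EBOOK", "circs": "44", "holds": "112",
        "copies_owned": "4", "hold_ratio": "28.00",
    }
    print(validate_row(row) or "row looks fine")
```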

Who Am I?

I'm a librarian with an MBA who's spent a decade watching vendors sell libraries their own stories back to them. I've built digital media labs, launched lending programs before they were trendy, and led tech teams in libraries, edtech, legal tech, and beyond.

I'm not anti-vendor. Vendors provide services libraries need. But when a vendor funded by private equity is using library-generated data to build proprietary research, develop commercial products that monetize patron behavior, and position for a billion-dollar exit, and the libraries generating that data don't even know it's happening, someone should say something.

Consider this me saying something.

The data is yours. Publish it.

Note: The circulation data shown is illustrative. Little Schitt Creek Public Library is a demonstration project. The data standard is real and ready for adoption by any public library system.