Data Extraction Survival Guide: What Actually Gets Stuck
Why I wrote this: I watched a library discover, three months before migration, that patron checkout history couldn't be exported. Eight years of behavior data was effectively trapped.
Ask about data extraction now, during the contract you already have. Asking at the end of the contract is too late.
Most library vendors will tell you your data belongs to you. It does. But they're counting on you not asking hard questions about how to get it out.
- Data lock-in is intentional: vendors benefit from patron data they control; libraries lose behavioral analytics, circulation history, and patron profiles when switching vendors.
- Data extraction barriers: no export API, proprietary formats, licensing restrictions on extracted data, and contractual language claiming vendor ownership of aggregated analytics.
- Questions to ask during contract negotiation: who owns extracted data, export timelines, format specifications, and what data is genuinely locked (and at what cost to unlock).
- Recovery strategy if stuck: negotiate extraction in contract termination clause, plan migration timelines before crisis, and maintain parallel data stores in vendor-neutral formats.
- You can prevent this. This guide gives you the exact checklist, technical language, and contract clauses to audit and protect your data right now. Use them in your next contract negotiation.
I watched a library discover, three months before their migration was supposed to happen, that the patron checkout history they wanted to move to their new system couldn't be extracted. Not "would be hard to extract." Couldn't be extracted. The vendor's export tool didn't support it. The data was stored in a proprietary format. After 8 years of data collection, it was effectively trapped.
The library had to choose: start over with empty transaction histories, or delay their migration 6 months while the vendor tried to build a workaround that might not be ready even then.
This was preventable. They could have asked these questions three years earlier. They could have negotiated contract language that guaranteed data portability. They could have tested extraction before they were dependent on it.
You don't have to be that library. This guide shows you exactly how to prevent it. Use these checklists and questions with your vendor starting today. Don't wait until you need to move.
The Data Extraction Reality
Your vendor stores your data. You own the intellectual property (the books, the patron records, etc.). But the vendor controls the format and the export tools.
Most vendors have standard export formats for the things they designed for export: bibliographic records (MARC format), patron records (CSV or standard format). But the custom stuff you've accumulated? Historical data? System configuration? That's where things get sticky.
What Vendors Make Easy to Extract
- MARC Records - Standard library format. Every vendor can export this.
- Patron names/addresses - They expect you to leave, so they let this out.
- Current holdings - Standard data. Usually exportable.
- Item barcodes - Essential data. Vendors know you need this.
What Vendors Make Hard to Extract
- Custom fields - A field you added called "preservation notes"? Your vendor didn't design the export tool for it.
- Transaction history - Checkout history, fine history, holds queue history. Often vendor-proprietary format.
- System configuration - Your circulation policies, fine rules, custom item statuses. Usually not exportable at all.
- User permissions and groups - Which staff can do what. Often stored in a vendor-specific way.
- Reports and queries - Custom reports you built. Not exportable; you have to rebuild.
What Vendors Won't Extract
- Integrated third-party data - Your payment processor's fine records. Not the vendor's data to export.
- Archived/deleted records - Historical patrons who were deleted. Often not recoverable.
- System logs and audit trails - Vendor's data, not yours.
- Raw database backups - Too risky. Vendors won't give you unfettered database access.
Questions to Ask Your Vendor RIGHT NOW
Don't wait until your contract is ending. Ask these questions now. If your vendor won't answer clearly, that's a red flag worth escalating.
The Data Export Checklist
| Data Type | Exportable? | Format | Cost? | Timeline? |
|---|---|---|---|---|
| Bibliographic Records | [ ] Yes [ ] No [ ] ? | _____ | [ ] Free [ ] Fee [ ] ? | _____ |
| Holdings Records | [ ] Yes [ ] No [ ] ? | _____ | [ ] Free [ ] Fee [ ] ? | _____ |
| Authority Records | [ ] Yes [ ] No [ ] ? | _____ | [ ] Free [ ] Fee [ ] ? | _____ |
| Patron Records (Current) | [ ] Yes [ ] No [ ] ? | _____ | [ ] Free [ ] Fee [ ] ? | _____ |
| Checkout History | [ ] Yes [ ] No [ ] ? | _____ | [ ] Free [ ] Fee [ ] ? | _____ |
| Patron Fines/Fees History | [ ] Yes [ ] No [ ] ? | _____ | [ ] Free [ ] Fee [ ] ? | _____ |
| Holds Queue | [ ] Yes [ ] No [ ] ? | _____ | [ ] Free [ ] Fee [ ] ? | _____ |
| Custom Fields (List: ____) | [ ] Yes [ ] No [ ] ? | _____ | [ ] Free [ ] Fee [ ] ? | _____ |
| Configuration (Circulation Rules) | [ ] Yes [ ] No [ ] ? | _____ | [ ] Free [ ] Fee [ ] ? | _____ |
| Configuration (Item Types) | [ ] Yes [ ] No [ ] ? | _____ | [ ] Free [ ] Fee [ ] ? | _____ |
| Deleted/Historical Patron Records | [ ] Yes [ ] No [ ] ? | _____ | [ ] Free [ ] Fee [ ] ? | _____ |
Don't accept "yes" or "no" as final answers. Dig deeper on every row. If they say "exportable," ask what format. If they say "fee," ask the amount.
Deeper Questions
On Format
- "What export formats are available?" (Demand MARC for bibliographic, CSV or standard database format for patron data, etc.)
- "Will the format be compatible with standard ILS systems?" (If they give a proprietary format, push back.)
- "Can you provide sample data in the format so we can test our import process?"
- "What character encoding will be used?" (UTF-8, not some vendor-specific character set.)
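When the vendor sends sample data, you can check the encoding and layout yourself before trusting it. The following is a minimal sketch in Python, assuming the sample arrives as a CSV file; the column names used with it (`patron_id`, `name`) are hypothetical placeholders for whatever fields you actually need.

```python
import csv
import io


def check_csv_sample(raw: bytes, required_columns: set[str]) -> list[str]:
    """Return a list of problems found in a sample CSV export (empty = none found)."""
    try:
        # utf-8-sig accepts plain UTF-8 and also strips a leading byte-order mark.
        text = raw.decode("utf-8-sig")
    except UnicodeDecodeError as exc:
        return [f"not valid UTF-8 (first bad byte at offset {exc.start})"]

    problems = []
    reader = csv.DictReader(io.StringIO(text, newline=""))
    missing = required_columns - set(reader.fieldnames or [])
    if missing:
        problems.append(f"missing columns: {sorted(missing)}")

    for row_no, row in enumerate(reader, start=1):
        # DictReader files surplus cells under the key None and fills absent
        # cells with the value None, so either one signals a ragged row.
        if None in row or None in row.values():
            problems.append(f"data row {row_no}: wrong number of fields")
    return problems
```

Run it on the bytes of the sample file, for example `check_csv_sample(open("sample.csv", "rb").read(), {"patron_id", "name"})`. An empty list only means these basic checks passed; it does not prove the data will import cleanly into a new system.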
On Speed
- "How long does a data extract take?" (For 300K records: should be 1-2 days, not weeks.)
- "Can you do incremental extracts or is it full database only?" (Incremental is better for testing.)
- "What happens if we request an extract with 90 days' notice? 30 days' notice?"
On Cost
- "Is data extraction included in the contract or is there a separate fee?"
- "If there's a fee, what's the amount and is it negotiable?"
- "Are there any costs for software licenses or consulting hours during extraction?"
- "What if the extraction has to be re-run? Is that an additional charge?"
On Custom Fields
- "What custom fields have been created in our system?"
- "How are these fields stored in the database?"
- "Will these fields be included in a standard export?"
- "If not, how do we extract them?"
On Historical Data
- "What is the oldest checkout record we can extract?" (Years of data or just current period?)
- "Are deleted patron records recoverable?" (Or are they gone?)
- "How far back does transaction history go?"
The Red Flags
If your vendor says any of these, be very cautious:
- "We don't really extract that kind of data." - Translation: "We don't want you to have it." This is bad.
- "The extract process is very technical, you'll need our staff to do it." - Translation: "$$$" and you can't do it yourself.
- "That data is stored in our proprietary format, not standard MARC." - Translation: You're locked in.
- "We charge per-record for extracts above a certain size." - Translation: Extracting your data costs thousands.
- "Some data from 2005-2010 was archived and isn't easily recoverable." - Translation: You're losing years of history.
- "We'll need 6 months to build the export tool." - Translation: You can't leave for 6 months.
A Technical Glossary (What They Actually Mean)
MARC (Machine Readable Cataloging)
Industry standard for bibliographic records. All ILS systems support MARC. If vendor won't export MARC, they're locking you in.
API (Application Programming Interface)
A technical interface that lets external systems read data from the ILS. Good vendors have APIs. Bad vendors don't (because they want you trapped). Ask: "Do you have a public API? Can we write custom scripts to extract data?"
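If the answer is yes, a custom extraction script usually boils down to paging through records and saving them in a neutral format. Here is a rough sketch in Python. It does not use any real vendor API: `fetch_page(offset, limit)` is a hypothetical stand-in for whatever call your vendor documents, and JSON Lines output is simply one reasonable vendor-neutral choice.

```python
import json
from typing import Callable, Iterable, Iterator, TextIO

# HYPOTHETICAL: fetch_page(offset, limit) returns a list of record dicts and
# an empty list once there is nothing left. Replace it with your vendor's call.
FetchPage = Callable[[int, int], list]


def iter_records(fetch_page: FetchPage, page_size: int = 500) -> Iterator[dict]:
    """Yield every record by requesting one page at a time."""
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:
            return
        yield from page
        offset += len(page)


def write_jsonl(records: Iterable[dict], out: TextIO) -> int:
    """Write one JSON object per line and return how many records were written."""
    count = 0
    for record in records:
        out.write(json.dumps(record, ensure_ascii=False) + "\n")
        count += 1
    return count
```

A typical call would be `write_jsonl(iter_records(fetch_page), open("checkouts.jsonl", "w", encoding="utf-8"))`. Real APIs often add rate limits, authentication, and cursor-based paging, so treat this as the shape of the script and not a finished tool.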
Batch Export
A tool to extract large amounts of data all at once. Standard vendors have this. If they don't, you're extracting one record at a time, which is inefficient and impractical.
Field Mapping
How data from vendor system translates to standard formats. If vendor has a "field mapping tool," they've thought about data portability. If not, you're doing manual conversion.
Backup vs. Export
A backup is a raw copy of the database. An export is a translated version for other systems. Vendors often refuse to give you raw backups (for security reasons, but also to lock you in). You want standard exports, not proprietary backups.
What Actually Gets Stuck: The Uncomfortable Truths
Patron Checkout History
Many vendors store this in vendor-specific format. You can see it in their interface, but you can't export it. Libraries lose years of "who read what" data.
Prevention: During contract negotiation, demand: "Vendor will provide complete checkout history in standard format (CSV or database export) quarterly and at termination."
Custom Item Types
You created 47 custom item types (Book, E-Book, Audiobook, Microfilm, etc.). New system might not support some of these categories. You have to map them manually.
Prevention: Get a list of all custom item types in writing. Ask new vendor if they support them. If not, budget time for remapping all 45K items.
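To size the remapping work before you commit to a timeline, count how many items sit under types the new system has no equivalent for. A small sketch in Python, assuming your item export gives each item an `item_type` value; the field name and the target codes in `ITEM_TYPE_MAP` are made up for illustration.

```python
from collections import Counter

# HYPOTHETICAL mapping from your current item types to the new system's codes.
ITEM_TYPE_MAP = {
    "Book": "BOOK",
    "E-Book": "EBOOK",
    "Audiobook": "AUDIO",
}


def unmapped_item_types(items, mapping):
    """Count items whose current type has no entry in the mapping."""
    return Counter(
        item["item_type"] for item in items if item["item_type"] not in mapping
    )
```

The result tells you which types still need a decision and how many items each decision affects, which is the number to bring to the new vendor.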
Circulation Policies
You have rules: "Adults: 3-week checkout, 25 item limit. Students: 2-week checkout, 20 item limit." These are stored as vendor-specific configurations. You can't export them; you have to manually rebuild them in the new system.
Prevention: Document your policies in writing NOW. When you migrate, you'll rebuild from this documentation. Budget 2-3 weeks of work.
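"In writing" works best when the document is also machine-readable, so it can be diffed and checked later. One possible way to record the example rules above is as a small JSON file; the sketch below is Python, and the field names (`patron_group`, `loan_days`, `item_limit`) are my own choice, not a standard.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LoanPolicy:
    patron_group: str
    loan_days: int
    item_limit: int


# The two example rules from the text: 3 weeks = 21 days, 2 weeks = 14 days.
POLICIES = [
    LoanPolicy("Adults", loan_days=21, item_limit=25),
    LoanPolicy("Students", loan_days=14, item_limit=20),
]


def policies_to_json(policies):
    """Serialize policies to indented JSON suitable for version control."""
    return json.dumps([asdict(p) for p in policies], indent=2)
```

Keep the resulting file somewhere the vendor does not control, and update it whenever a policy changes.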
Fine Calculations
If you have custom fine rules ("Books overdue 7+ days: $0.25/day. Audiobooks: $0.50/day"), you have to rebuild these in the new system. Often the new system calculates fines differently anyway.
Prevention: Document fine rules. Be prepared to simplify during migration (new system might not support complexity you've built).
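Fine rules are easiest to compare across systems when you write them down as a calculation with known answers. The sketch below, in Python, encodes the two example rules under one stated assumption: a book accrues nothing until it is 7 days overdue, and from then on every overdue day is charged. Your real rule may differ (for instance, charging only from day 7 onward), which is exactly the kind of ambiguity worth settling before migration.

```python
from decimal import Decimal

# ASSUMPTION: "min_days_overdue" is the threshold before any fine applies;
# once reached, all overdue days are charged at the daily rate.
FINE_RULES = {
    "Book": {"rate": Decimal("0.25"), "min_days_overdue": 7},
    "Audiobook": {"rate": Decimal("0.50"), "min_days_overdue": 1},
}


def fine_due(item_type: str, days_overdue: int) -> Decimal:
    """Return the fine owed for one item under the documented rules."""
    rule = FINE_RULES[item_type]
    if days_overdue < rule["min_days_overdue"]:
        return Decimal("0.00")
    return rule["rate"] * days_overdue
```

A handful of cases like "Book, 10 days overdue" with their expected amounts can then be run against the new system during testing to see where its fine calculation disagrees with yours.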
Authority Control Data
Library of Congress authorities (names, subjects) are standard. But if you've created local authorities, these often can't be exported or don't make sense in the new system.
Prevention: Use LC authorities whenever possible. Minimize local authorities. If you have local authorities, document them and be prepared to lose some during migration.
Patron Deleted Records
You delete patron accounts (patrons who moved, deceased patrons, etc.). Once deleted, many systems can't recover them. If you need to keep historical patron records for auditing or analytics, you need to export them before deleting.
Prevention: Don't actually delete patron records. Mark them as "inactive" instead. Export archived patrons to a separate file quarterly for safekeeping.
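The quarterly export can be a very small script once you have patron records in hand. A sketch in Python, assuming each patron is a dictionary with a `status` field set to "inactive" for retired accounts; both the field name and the value are assumptions about your data, not something every system provides.

```python
import csv
import io


def inactive_patrons_csv(patrons, fieldnames):
    """Return CSV text containing only the patrons marked inactive."""
    buf = io.StringIO(newline="")
    # extrasaction="ignore" drops any keys not listed in fieldnames.
    writer = csv.DictWriter(buf, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    for patron in patrons:
        if patron.get("status") == "inactive":
            writer.writerow(patron)
    return buf.getvalue()
```

Write the returned text to a dated file with `encoding="utf-8"` and `newline=""`, and store it with the same care you give any other patron data.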
Contract Language: The Escape Clauses You Need
Before you sign your next vendor contract, demand this language:
Data Portability Clause
"Upon contract termination, Vendor shall provide Library with complete, machine-readable export of all data including (but not limited to): bibliographic records in MARC21 format, patron records in CSV format, item records with all fields, transaction history, custom configuration, and any other data stored in Vendor's system. Export shall be provided within 30 days of written request, at no additional cost beyond standard contract terms."
Data Format Clause
"All exported data shall be in industry-standard formats (MARC21, CSV, JSON, or equivalent) compatible with standard library systems. Vendor shall not provide data in proprietary formats. If Vendor's system stores data in proprietary format, Vendor shall convert to standard format at Vendor's cost."
Custom Field Clause
"Any custom fields or local configurations created at Library's request shall be documented in writing with field names, definitions, and usage. Upon termination, Vendor shall provide complete export of all custom fields with data values intact."
Historical Data Clause
"Vendor shall maintain and provide upon request: complete checkout history (minimum 7 years), patron fine and fee history, holds queue history, and any other transactional data. Data shall be provided in standard format (CSV, JSON, or database dump)."
Performance & Availability Clause
"Data export shall be completed within 30 days of written request. Vendor shall provide incremental extracts (monthly) at Library's option to facilitate testing and validation before final migration."
Termination Clause
"Upon contract termination, for any reason, Vendor shall make all data available for export at no cost. Vendor shall not require payment of outstanding balances to release data. Library shall have minimum 60-day access to Vendor's system to complete data extraction."
If your vendor won't agree to these, that's a massive red flag. Vendors who protect data portability are vendors confident in their product. Vendors who create extraction barriers are trying to keep you locked in.
Technical Audit: A Checklist for Your IT Staff
If you have an IT person, have them audit your vendor's actual extraction capability:
- Request sample export: Ask vendor for sample data in the format they claim to support. Load it. Does it work?
- Test MARC validation: If they say they export MARC, validate it with MARC standards checker.
- Check database schema: Ask for database schema documentation. Where are your custom fields stored? Are they in standard tables or vendor-specific extensions?
- API documentation: If vendor has API, review documentation. Can you write a script to export data incrementally?
- Test small extract: Request export of 1,000 records. Validate it. If this fails, full export will fail.
- Ask about limitations: "What's the largest export you've done?" (If they say "500K records," you're ok. If they say "50K records," you might hit limits.)
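For the MARC and small-extract checks above, even a crude structural test catches truncated or mislabeled files early. The sketch below is Python with no third-party libraries, and it assumes the vendor delivers binary MARC (the ISO 2709 layout: a 24-byte leader whose first five digits give the record length and whose bytes 12-16 give the base address of data). It checks structure only, not cataloging content, so it complements a proper MARC validation tool and does not replace one.

```python
RECORD_TERMINATOR = 0x1D
FIELD_TERMINATOR = 0x1E
LEADER_LEN = 24
DIR_ENTRY_LEN = 12


def check_marc_structure(data: bytes) -> list[str]:
    """Walk a binary MARC file and report structural problems per record."""
    problems = []
    pos = 0
    index = 0
    while pos < len(data):
        index += 1
        leader = data[pos:pos + LEADER_LEN]
        if (len(leader) < LEADER_LEN or not leader[:5].isdigit()
                or not leader[12:17].isdigit()):
            problems.append(f"record {index}: unreadable leader at byte {pos}")
            break  # without a valid length the next record cannot be located
        length = int(leader[:5])
        base = int(leader[12:17])
        record = data[pos:pos + length]
        if length < LEADER_LEN + 2 or len(record) < length:
            problems.append(f"record {index}: declared length {length} does not fit the file")
            break
        if record[-1] != RECORD_TERMINATOR:
            problems.append(f"record {index}: missing record terminator")
        if not (LEADER_LEN < base <= length) or record[base - 1] != FIELD_TERMINATOR:
            problems.append(f"record {index}: directory does not end at the base address")
        elif (base - LEADER_LEN - 1) % DIR_ENTRY_LEN != 0:
            problems.append(f"record {index}: directory is not a whole number of entries")
        pos += length
    return problems
```

Run it on the 1,000-record test extract first, for example `check_marc_structure(open("sample.mrc", "rb").read())`. If it reports problems there, raise them with the vendor before anyone schedules a full export.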
Recovery Strategies: When Data Gets Stuck
You asked all the right questions, but your vendor still can't extract something you need. Now what?
Option 1: Hire Consultant to Extract
Consultant with database skills can sometimes extract data directly from vendor's database (with permission). Cost: $5-15K. May work if vendor is uncooperative but legally required to give you data.
Option 2: Rebuild Manually
For some data, manual rebuilding is faster than extraction battles. Circulation policies? Rebuild in new system (takes 1-2 weeks). Custom item types? Map them manually (takes 1-2 weeks). Transaction history older than 3 months? Accept you're losing it; start fresh.
Option 3: Delay Migration
If vendor can build extraction tool, sometimes they will, but it takes 3-6 months. Ask: Is delaying worth it? If checkout history is essential for your analytics, maybe yes. If it's nice-to-have, probably no.
Option 4: Legal Escalation
If contract guarantees data access and vendor won't provide, you might have legal recourse. Consult your library's attorney. This is expensive and slow, but vendors hate having lawyers involved.
Option 5: Accept Loss
Some data you'll lose. Historical patron behavior? Often gone. Deleted patron records? Often gone. Archived holdings from 2005? Often gone. Accept this and move on. Have clear conversation with your board: "We're losing X data. Is that ok?" Usually it is.
Prevention Checklist: Before You Migrate
- [ ] Completed the Data Export Checklist above (all rows filled in)
- [ ] Identified which data is actually important to you (vs. nice-to-have)
- [ ] Documented current system configuration (policies, item types, custom fields)
- [ ] Requested and reviewed sample data extract
- [ ] Validated sample data with IT staff
- [ ] Identified what data will definitely be lost
- [ ] Reviewed contract language on data portability
- [ ] Escalated any "stuck" data issues to management 6+ months before migration
- [ ] Negotiated extraction timeline (should be 30 days max)
- [ ] Confirmed extraction cost (should be free or negotiated separately)
References & Further Reading
- Open Archival Information System (OAIS). (2022). "Reference Model for an Open Archival Information System." Digital Preservation documentation. How to think about data preservation during migration.
- Library of Congress. (2024). "MARC 21 Bibliographic Data Format." Technical documentation. MARC export standards.
- Chada, S. (2025). "Vendor Migration Playbook: The Real Timeline." Unhinged Librarian. Retrieved from unhingedlibrarian.com. Complementary guide to extraction timeline.
Do This Now: Act Today
Don't read this and move on. Your vendor is banking on you not acting.
This week:
- Print the Data Export Checklist above (page with the table)
- Fill it out for YOUR vendor RIGHT NOW
- Send it to your vendor with: "I need answers to all rows. Be specific on timeline and cost."
- If they won't answer, escalate to your director: "Vendor won't answer data portability questions."
This month:
- Have your IT staff review the answers
- Identify what data will get stuck (everyone has some)
- If your contract is up for renewal: demand the data portability language from this guide
- If your contract is mid-term: request an amendment with those clauses
Prevention is infinitely cheaper than recovery. A conversation today saves months of pain later.
Related Reading
Learn more about vendor relationships, contracts, and your data rights:
- Vendor Migration Playbook: The Real Timeline — Full timeline and budget expectations for system migrations. Understand where data extraction fits in the larger project.
- The Beginning of the End, Part 2: Contract Traps — What vendors hide in contracts about data ownership and export permissions. Negotiate these terms before signing.
- The Beginning of the End, Part 5 — Additional insights on vendor relationships and vendor lock-in mechanisms that affect data portability.
- The Beginning of the End, Part 1: Baker & Taylor's Collapse — How we got into this vendor lock-in trap in the first place. Context for understanding data extraction challenges.
Protect Your Data Rights
Beyond vendor contracts, your library needs comprehensive data protection practices. The Data Protection & Compliance Framework covers data inventories, access controls, encryption, and compliance requirements. It helps you know what patron and operational data you have, who can access it, and how to protect it from both external attacks and internal misuse.