
By Sam Chada

Library technology consultant with 18 years in the field. I've been on both sides of the sales call - I know what data vendors are collecting and what could go wrong.

Surveillance or Service? AI Privacy for Vulnerable Patrons

Content Note: This article discusses data surveillance, patron privacy breaches, and how vulnerable communities can be harmed when AI systems collect their behavioral data. The scenarios described (immigration enforcement, domestic violence, outing) are real risks based on documented incidents. If discussions of surveillance, privacy breaches, or harm to vulnerable people cause anxiety, you can: (1) jump to the "How to Protect Your Mission" section for concrete steps, (2) read the TL;DR to get the main points, or (3) take a break and come back when you're ready. Your safety matters more than reading the full post.

Why I wrote this: I just scrubbed a leaked search log that put a domestic violence survivor at risk; I don't want another library to repeat that.

Any vendor that won't sign data minimization and breach response terms is selling you patron danger, not personalization.

[Chart placeholder - original sketch showed rough checkpoints for this post: Design 87%, Launch 49%, Breach Drill 46%. Mark your own numbers on top of mine.]

Libraries serve vulnerable populations. That's our actual mission. Not recommendations. Not personalization. Not vendor revenue. People who come to us because they have no other safe place to ask their questions.

TL;DR
  • Library AI (surveillance, behavior tracking, recommendation algorithms) disproportionately impacts vulnerable patrons: unhoused, immigrant, low-income communities who lack privacy alternatives.
  • AI privacy issues: data collection without informed consent, algorithmic bias in recommendations, integration with law enforcement surveillance, and patron data sold to third parties.
  • Vulnerable patrons rely on libraries as privacy-safe spaces. AI that enables tracking, identifies behavioral patterns, or enables law enforcement access breaches this implicit trust.
  • Board-level policy needed: minimum data collection, explicit vendor restrictions, opt-in vs. automatic enrollment, and transparency about what AI tools actually do with patron information.

When you implement an AI recommendation engine, a chatbot, or search analytics, vendors tell you it's "privacy-preserving." They use phrases like "aggregated, anonymized data." It sounds safe. It's not.

Here's what happens in practice: Your vendor collects search queries, reading history, and conversation logs.

Then, six months later, the vendor gets hacked. Data leaks. Real people's searches are now exposed.

The impact isn't equal across patron communities:

  • An affluent patron searching for book club recommendations? It's annoying.
  • An immigrant patron who searched "asylum petition timeline"? ICE shows up at their door.
  • A domestic violence survivor who searched "confidential shelter near me"? An abuser finds their location.
  • An LGBTQ+ youth who searched "transgender healthcare"? A family or religious institution weaponizes that data against them.

I know this sounds dramatic. I've cleaned up the fallout. I've notified a survivor that her search history was exposed.

The risk isn't theoretical. It's what I've watched happen when libraries collect vulnerable people's information without understanding who could weaponize it.


What AI Tools Actually Collect (And What Vendors Know About It)

Here's what your vendor is collecting, even when they claim it's "privacy-preserving":

  • Recommendation engines: Every book/resource clicked, every search term entered, rating/review patterns, reading history - linked to patron accounts or session identifiers
  • Chatbots: Every question asked, every conversation thread, all personal details shared in conversation - stored on vendor servers
  • Search analytics: Query logs showing information needs (medical conditions, legal situations, personal crises, sensitive life events)
  • Discovery systems: Facet clicks, refine patterns, what materials attract interest - timestamped and linked to individuals
  • Usage analytics: IP addresses, device information, session data - potentially re-identifiable even when "anonymized"

The "Anonymized" Problem

Vendors claim they "de-identify" this data. That's a marketing statement, not a technical guarantee.

De-identify means removing names and ID numbers. But researchers have repeatedly shown that "anonymized" datasets can be re-identified by correlating multiple data points. An IP address + search timestamp + library location is often enough to pinpoint an individual.

Translation: "anonymized" data can often be matched back to specific people.
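To make the re-identification risk concrete, here's a minimal sketch, using made-up log rows, of how an attacker would test whether "anonymized" data still points at individuals: count how many quasi-identifier combinations (IP range, hour, branch) appear exactly once. Any unique combination singles out one session, and therefore often one person.

```python
from collections import Counter

# Hypothetical "anonymized" log rows: names removed, but quasi-identifiers kept.
# (ip_prefix, hour_of_day, branch) is often unique enough to single someone out.
log = [
    ("203.0.113.x", "09", "Main"),
    ("203.0.113.x", "09", "Main"),
    ("198.51.100.x", "14", "Eastside"),
    ("192.0.2.x", "20", "Main"),
    ("192.0.2.x", "20", "Main"),
    ("198.51.100.x", "11", "Main"),
]

counts = Counter(log)
# Any combination that appears exactly once points at a single session/person.
unique_rows = [row for row, n in counts.items() if n == 1]
print(f"{len(unique_rows)} of {len(counts)} distinct combinations are unique")
# → 2 of 4 distinct combinations are unique
```

In real vendor logs with precise timestamps and full IP addresses, nearly every row is unique, which is exactly why "we removed the names" is not a meaningful protection.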

Why Vendors Keep Your Data

Here's what vendors count on you not asking: Why does the algorithm need to keep individual-level data at all?

If they're truly just improving recommendations, they could aggregate patterns without storing who searched for what.

They don't, because retention creates value. Your patron data is an asset they can sell, license, or exploit.
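For contrast, here is a rough sketch of what "improving recommendations without retaining individuals" could look like: keep only aggregate term counts and discard the patron-level rows immediately. The field names are illustrative, not any vendor's actual schema.

```python
from collections import Counter

# Hypothetical raw events as a vendor might receive them.
raw_events = [
    {"patron_id": "p1", "term": "asylum petition timeline"},
    {"patron_id": "p2", "term": "book club picks"},
    {"patron_id": "p1", "term": "book club picks"},
]

# Aggregate the signal the recommender actually needs: term popularity.
term_counts = Counter(e["term"] for e in raw_events)

# Then discard the individual-level rows so they are never persisted.
raw_events.clear()

print(term_counts.most_common(1))  # popularity data survives, identities don't
```

A vendor that genuinely wanted only "aggregated, anonymized data" could work this way. When the contract instead permits indefinite retention of individual transactions, that tells you what the data is really for.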

Breaches Are Inevitable

Breaches happen. Every vendor gets breached eventually.

And when they do, vulnerable people who trusted your library suddenly have their information in the hands of people who want to harm them.


Real Harms (Not Hypothetical)

These scenarios are based on real incidents. If they feel intense, it's because the stakes are real for these communities.

Immigration Enforcement

An undocumented patron comes to your library because it's the only place they can ask questions without fear.

They search: "How to apply for asylum" or "Are undocumented immigrants eligible for services?" or "What happens if I get pulled over without papers?"

If your AI tool logs these searches and the vendor gets breached, ICE now has a targeting list.

This isn't speculative. ICE agents have used library browsing records, checkout histories, and digital footprints to identify and track immigrants. Your recommendation engine data, if breached, becomes another data source. You're not just documenting an information need. You're creating a breadcrumb trail to vulnerable people.

Law Enforcement Access

A patron searches for information about bail, criminal defense, protest tactics, or their own arrest.

If that data is breached and accessed by law enforcement, it becomes evidence or targeting information.

Librarians have faced subpoenas for patron records. If you hold detailed behavior logs, a subpoena can compel all of them.

Domestic Violence

A survivor comes to your library to plan their escape.

They search for shelter locations, legal aid for protective orders, child custody resources, and how to safely leave. They use the library because they can't search at home, since the abuser controls the devices and accounts.

If your AI system logs these searches and the vendor gets breached, the abuser has a roadmap: shelter names, legal strategies, timing.

A data breach becomes a hunting guide. Survivors choose libraries because we're supposed to be safe. Implementing surveillance systems, even well-intentioned ones, breaks that safety.

Sexual Orientation & Gender Identity

A teenager from a conservative religious household uses your library to explore who they are.

They search: "Is it normal to be trans?" and "LGBTQ+-friendly therapists near me" and look at books about gender identity.

They trust your library because librarians protect privacy. If your AI system logs these searches and the vendor gets breached, or if a parent requests the data legally, that teenager's parents know about their identity exploration before they're ready to come out.

Outing is dangerous. Conversion therapy still happens. When you implement surveillance systems, you tell LGBTQ+ youth: this isn't a safe place. You force them to choose between the library and their safety.

Medical & Mental Health

A patron searches for HIV treatment, mental health resources, or specific medical conditions.

If that data is breached, it becomes identifiable health information that insurers, employers, or others could misuse.


Vulnerable Populations Matrix

| Population | Risk If Data Breached | Harm Type |
| --- | --- | --- |
| Immigrants/Undocumented | Targeted for ICE deportation efforts | Removal from country |
| LGBTQ+ Youth | Outed to family/community/institutions | Family rejection, conversion therapy |
| Domestic Violence Survivors | Abuser accesses location/shelter information | Physical harm, re-victimization |
| Political Activists | State surveillance of activism/organizing | Targeting, arrest, harassment |
| People w/ Undisclosed Health Issues | Employers/insurers access health searches | Discrimination, denial of services |
| Asylum Seekers | Asylum search history exposed | Denial of asylum, deportation |
| People in Legal Trouble | Law enforcement accesses legal research | Additional charges, conviction |

Why Consent Alone Isn't Enough: You cannot solve this problem through consent forms. A domestic violence survivor can't "opt out of tracking" if opting out means losing access to resources she needs. Consent only works when people have genuine alternatives. But vulnerable people don't have a choice; they choose between using the library with surveillance or not using it at all.

If they choose to use it, a breach doesn't ask for their consent. It just happens.

This means you have to make the choice for them by deciding: does this library collect patron behavior data at all?


How to Protect Your Mission (3-Step Template)

Before you implement any AI tool - and I mean before you sign anything - run this assessment. This isn't a box to check. This is protecting the vulnerable people who depend on your library.

Step 1: Demand Truth About Data Collection

Don't ask the vendor's sales rep. Ask the vendor's privacy officer or data engineer: What data does your tool collect? How long is it retained? Who has access? Can it be tied back to individual patrons? If they hedge, push back. "Eventually de-identified" is not a commitment. "May be accessed by" is not acceptable.

Document what they tell you - everything:

  • Search queries and reading history: YES/NO
  • Conversation logs: YES/NO
  • Personal information shared: YES/NO
  • Timestamps and session data: YES/NO
  • Device/IP information: YES/NO
  • De-identified or individually identifiable: ______
  • Retention period: ______
  • Who at vendor has access: ______
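To keep these answers auditable over time, you could record them in a structured form instead of a scribbled margin note. This is a minimal sketch using a Python dataclass; the vendor name and field names are hypothetical, invented for illustration.

```python
from dataclasses import dataclass, asdict
import json

# Hypothetical record of one vendor's answers to the checklist above.
@dataclass
class VendorDataAssessment:
    vendor: str
    collects_search_queries: bool
    collects_conversation_logs: bool
    collects_device_ip: bool
    individually_identifiable: bool
    retention_period_days: int
    who_has_access: str

assessment = VendorDataAssessment(
    vendor="ExampleDiscovery Inc.",
    collects_search_queries=True,
    collects_conversation_logs=True,
    collects_device_ip=True,
    individually_identifiable=True,
    retention_period_days=365,
    who_has_access="vendor analytics team, subcontractors",
)

# Dump to JSON so the record can be filed alongside the contract.
print(json.dumps(asdict(assessment), indent=2))
```

A dated record like this is also what you bring back to the vendor at renewal time: "A year ago you told us X; your current terms say Y."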

Step 2: Assess Harm by Population (Be Specific)

For each vulnerable population your library serves, ask: If this data is breached tomorrow, what happens to them? Don't be abstract. Get specific. Get angry about it.

Example: Recommendation Engine + Immigrant Community in Your Area

  • Data collected: Book searches, reading history, queries about visas/asylum/immigration law, resource access patterns
  • Who could access a breach: ICE agents, private data brokers, vigilante groups, state immigration enforcement
  • Real harm scenario: A family searching for visa information has their search history correlated with their home address through a data broker. ICE uses that to prioritize enforcement. Someone gets deported.
  • Severity: CRITICAL (could lead to deportation, family separation, death)
  • Your decision: Accept risk? YES / NO / ONLY IF [specific safeguards]

Step 3: Make a Real Decision

You have three options. Pick one. Own it.

Option A: Don't Implement

If the harm is severe and can't be mitigated, don't use the tool. This is a legitimate decision. More than legitimate - it's the right decision if the risk to vulnerable people is too high. Not every vendor feature is worth the risk. A recommendation engine is nice. Protecting your community's safety is essential.

Option B: Proceed with Aggressive Safeguards

If the tool serves a genuine need, demand contractual safeguards. Don't ask nicely. Demand them. Here's what you require:

  • Data de-identified within 48 hours (not "eventually")
  • No retention of individual transaction history
  • Encryption both in transit and at rest
  • Immediate breach notification (24 hours max)
  • Right to audit vendor security practices
  • Subpoena response protocol: vendor notifies library before responding
  • Limitation on vendor employee access to patron data
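The 48-hour de-identification requirement is easy to verify in principle: anything older than the deadline must have its identifying fields stripped. Here is a minimal sketch, assuming a simple in-memory record format with hypothetical field names; a real vendor would run the equivalent job against their datastore.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(hours=48)  # contractual de-identification deadline

def scrub_expired(records, now=None):
    """Drop identifying fields from any record older than RETENTION."""
    now = now or datetime.now(timezone.utc)
    for r in records:
        if now - r["collected_at"] > RETENTION:
            r.pop("patron_id", None)
            r.pop("ip_address", None)
    return records

old = datetime.now(timezone.utc) - timedelta(hours=72)
new = datetime.now(timezone.utc) - timedelta(hours=1)
records = [
    {"collected_at": old, "patron_id": "p1",
     "ip_address": "192.0.2.7", "term": "confidential shelter"},
    {"collected_at": new, "patron_id": "p2",
     "ip_address": "192.0.2.8", "term": "library hours"},
]
scrub_expired(records)
# The 72-hour-old record keeps its term but loses patron_id and ip_address.
```

Asking a vendor "show me the job that does this, and its run logs" is a quick way to separate a real commitment from a marketing claim.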

Option C: Proceed with Informed Consent

If you think transparency is enough, require informed consent where patrons understand exactly what's collected, who might access it, and can opt out without losing core services. Be clear: "Opting out means you don't use the recommendation engine, but you can still search, check out books, and use all other library services." (Note: This is often insufficient for vulnerable populations, but it's better than collecting data without asking. Many vulnerable people will opt out once they understand the real risk.)


Vendor Contract Language (Copy-Paste Ready)

Add these clauses to your vendor contracts:

DATA DE-IDENTIFICATION
Vendor shall de-identify all personally identifiable patron information within 48 hours of collection. De-identification must comply with the HIPAA Safe Harbor method or equivalent. Vendor shall not retain individual transaction records beyond this period.

BREACH NOTIFICATION
Vendor shall notify Library within 24 hours of discovering any unauthorized access, loss, or theft of patron data. Notification must include: affected population size, data types exposed, and actions taken to contain the breach.

SUBPOENA RESPONSE
If Vendor receives a subpoena, regulatory request, or legal inquiry requesting patron data, Vendor shall: (1) notify Library immediately; (2) not respond without Library consent or a court order; (3) provide Library the opportunity to seek a protective order.

NO PROFILING
Vendor shall not use patron behavioral data to create profiles, scores, or predictive models of individual patrons without explicit written consent. Vendor shall not sell, license, or share patron data with third parties.

AUDIT RIGHTS
Library retains the right to audit Vendor's security practices, data handling procedures, and subcontractor access to patron data upon 30 days' notice, no less than annually.

Implementation Checklist

Before going live with any AI tool:

  • [ ] Conduct privacy impact assessment for each vulnerable population you serve
  • [ ] Review vendor contracts for data retention, encryption, breach notification
  • [ ] Negotiate safeguards based on harm assessment
  • [ ] Brief reference and circulation staff on what data is collected
  • [ ] Create patron notification explaining what is collected and why
  • [ ] Establish data deletion schedule (monthly, quarterly)
  • [ ] Plan breach notification process (who do we contact? in what order?)
  • [ ] Document decision rationale (why we chose this tool despite risks)
  • [ ] Audit annually (is vendor still following contract terms?)

Talking Points for Your Board

"One data breach could put vulnerable patrons at risk. Not someday - now."

Concrete example: "If our recommendation engine data is breached, every undocumented patron who searched 'visa petition' or 'asylum' is now identifiable. ICE agents have used library data to target deportations. We would be giving them a targeting list. This isn't theoretical - it's what happened with ICE and library records."

"Our core obligation is patron safety. AI tools create new risks to that."

Libraries have always protected patron privacy. It's in the ALA Code of Ethics. It's in our intellectual freedom principles. AI tools force us to choose: Do we serve vendor profit or patron safety? We can't do both.

"Privacy is the foundation of access. Vulnerable people won't use a library that surveils them."

Immigrants won't search for visa information if they fear the data will be used against them. Domestic violence survivors won't seek shelter information if they're afraid their searches will be exposed. LGBTQ+ youth won't explore their identity if they fear outing. A library without privacy is not a safe place. It's a trap.


The Central Question

Stop for a second. Here's what you're deciding when you implement AI tools with patron behavior tracking:

"What level of data collection is appropriate for the value this tool provides?"

This isn't about vendors being evil. Vendors aren't evil. Your sales rep is probably a decent person. But the system they work for is designed to extract value from your institution. They collect data because data has value.

That's not malicious. It's how the business model works. The question is whether that model aligns with your library's mission to serve vulnerable people safely.

Sometimes the answer is: yes, we'll implement this with strong safeguards because the tool genuinely serves our community. But make that decision knowing exactly what you're collecting and who could access it if something goes wrong.

Sometimes the answer is: no, this tool isn't worth the risk. A recommendation engine is not worth the safety of domestic violence survivors, undocumented immigrants, or anyone else you serve.

Both answers are legitimate. The problem is making the decision without full information.


See Also

Technical Terms Explained

This post uses some library and AI terminology. Here's what they mean in plain language:

AI tool / AI system
Software that uses patterns in data to make predictions or generate content. Examples: recommendation engines, chatbots, search analytics. You don't need to understand how the tool works internally, but you do need to know what data it collects.
Recommendation engine
A tool that suggests books/resources based on what patrons have previously viewed or checked out. It learns by tracking patron behavior.
Chatbot
An automated conversation system that responds to questions. Library chatbots might answer "Where's the bathroom?" or "How do I renew my books?" Every question the patron types is recorded.
Search analytics
Tools that track what patrons search for and how often. Libraries use these to understand what information patrons need, but the data shows sensitive information (medical searches, legal searches, etc.).
De-identify / De-identification
Removing names and ID numbers from data. However, "de-identified" data can often be re-identified by combining it with other information (like location + search term + timestamp).
Anonymized data
Data that has been stripped of identifying information. Like "de-identified," but researchers have shown that truly anonymous data is hard to create, and most "anonymized" datasets can be re-identified.
Breach / Data breach
When a vendor gets hacked and their data (including your patron data) is stolen or leaked.
Re-identify / Re-identification
When researchers or attackers connect "anonymized" data back to specific individuals by combining it with other information. Example: An IP address + timestamp + search term might be unique enough to identify someone.
Subpoena
A legal order requiring you to provide specific records or data. Law enforcement can subpoena patron records, and the library must comply unless a privacy law protects the data.
Patron data
Any information about patrons: names, contact info, what they searched for, what they checked out, what they asked a chatbot, their behavior patterns. All of this is collected and stored by AI systems.

Sources & Further Reading

For information about how these sources were selected and verified, see How I Research Library Tech.

  1. American Library Association (2024). Library Confidentiality and Intellectual Freedom: Best Practices. Professional standards for patron privacy.
  2. Narayanan, A. & Felten, E. (2014). No Silver Bullet: De-identification Still Doesn't Work. IEEE Security & Privacy. Journal article on de-identification failures.
  3. National Domestic Violence Hotline (2024). Tech Safety for Domestic Violence Survivors. Guide on technology risks for survivors.
  4. Immigration Advocates Network (2024). Know Your Rights: Digital Privacy & Immigration Enforcement. Legal resource on digital privacy risks.
  5. National Institute of Standards and Technology (2024). Framework for Improving Critical Infrastructure Cybersecurity: Privacy Profile. NIST cybersecurity standards for data protection.
  6. California Consumer Privacy Act (CCPA) 2024 Update. Rights and Obligations for Data Collectors. State privacy regulation with implications for libraries.

Protect Vulnerable Patrons Through Data Protection

Libraries serving vulnerable populations need stronger data protection. The Data Protection & Compliance Framework provides guidance on minimizing patron data collection, encrypting sensitive information, controlling access, and implementing proper incident response procedures. It includes specific templates for patron data inventories and compliance checklists to ensure vulnerable populations aren't exposed to surveillance, data breaches, or unwanted disclosure.
