Researchers Warn of Confidentiality Leaks and Fabricated Citations in Commercial AI Research Tools
What Happened – A think‑aloud study of 15 academic researchers found that, when using commercial generative‑AI platforms (e.g., Research Rabbit, Elicit AI), they routinely submit unpublished questions, draft hypotheses, and proprietary domain knowledge. Participants reported two major problems: (1) uncertainty about whether prompts are stored, reused for model training, or exposed to third parties, and (2) an inability to verify the provenance of AI‑generated citations, which lets hallucinated references and “synthetic blending” go undetected.
Why It Matters for TPRM –
- Prompts containing confidential material become a data‑exfiltration vector for any organization that permits employees to feed internal research, code, or strategy into third‑party LLMs.
- Hallucinated citations erode trust in AI‑generated deliverables, increasing manual review effort and the risk of disseminating false information.
- Lack of vendor‑level transparency hampers contractual and compliance controls (e.g., GDPR, IP protection).
Who Is Affected – Higher‑education and research institutions, R&D departments in technology and pharma firms, and any enterprise that allows staff to use external generative‑AI services for knowledge work.
Recommended Actions –
- Conduct a risk assessment of all employee‑facing generative‑AI tools; map data flows and retention policies.
- Update acceptable‑use policies to prohibit uploading unpublished or proprietary material to unvetted AI services.
- Require vendors to provide clear documentation on prompt handling, storage, and training‑data usage; negotiate contractual clauses for data deletion and audit rights.
- Implement verification workflows for AI‑generated citations (e.g., cross‑checking against internal repositories).
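The citation cross‑check in the last action can be sketched as follows. This is a minimal illustration, not a method described in the study: the Citation record, the check_citation helper, the three status labels, and the sample repository entry are all hypothetical, and the repository is assumed to be an in‑memory list of trusted records exported from an internal reference database.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    """Hypothetical record for one reference; fields are illustrative."""
    title: str
    year: int
    doi: str = ""


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation/whitespace so titles compare loosely."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def check_citation(cited: Citation, repository: list[Citation]) -> str:
    """Classify an AI-supplied citation against trusted records.

    Returns 'verified' (all available metadata agrees), 'mismatch'
    (a record matches by DOI or title but other metadata disagrees),
    or 'not_found' (no record matches; possibly fabricated).
    """
    for known in repository:
        both_have_doi = bool(cited.doi) and bool(known.doi)
        same_doi = both_have_doi and cited.doi.lower() == known.doi.lower()
        same_title = normalize(cited.title) == normalize(known.title)
        if not (same_doi or same_title):
            continue  # unrelated record, keep looking
        # A missing DOI on either side is tolerated; a conflicting DOI is not.
        doi_ok = same_doi or not both_have_doi
        if same_title and doi_ok and cited.year == known.year:
            return "verified"
        return "mismatch"  # partial match: route to manual review
    return "not_found"


# Placeholder data only; not a real publication.
repository = [Citation("Example Study on Widget Fatigue", 2021, "10.0000/example-1")]

for cited in [
    Citation("example study on widget fatigue.", 2021, "10.0000/EXAMPLE-1"),
    Citation("Example Study on Widget Fatigue", 2019),
    Citation("A Paper That Is Not in the Repository", 2020),
]:
    print(check_citation(cited, repository), "-", cited.title)
```

Anything other than 'verified' would go to a human reviewer. A 'not_found' result only means the reference is absent from the internal repository, so it signals "check manually" rather than proving fabrication.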
Technical Notes – The study highlights two failure modes: attribution displacement (accurate facts linked to the wrong source) and synthetic blending (fabricated claims mixed with legitimate citations). No specific CVEs or malware were identified; the risk stems from opaque prompt‑retention practices and black‑box retrieval pipelines.
Source: Help Net Security