Independent clinical AI safety research

The first independent safety standard for clinical AI.

AI systems are making high-stakes medication safety decisions in hospitals. No independent, clinically rigorous evaluation measures whether they are safe. Posognos publishes the first continuously updated safety benchmark for clinical AI, built on the standards healthcare already trusts.

Safety PsiBench
01 Clinical Standards: What healthcare trusts
02 Independent Evaluation: Automated, continuous
03 Published Results: Open scorecard
04 Expert Validation: Named peer review

We are the first to independently evaluate
whether clinical AI is safe.

AI is replacing the clinical decision support systems hospitals have relied on for decades. These new models make medication safety recommendations that directly affect patient outcomes, but until now, no independent organization has measured whether they actually work.

33%

Harmful Orders Missed

Hospital systems still miss roughly one in three harmful medication orders. AI is replacing legacy rules engines, but no one is verifying whether it performs better.

96%

Alerts Overridden

Clinicians override up to 96% of drug safety alerts because most are irrelevant. AI that cannot distinguish critical alerts from routine ones makes the problem worse.

$7.5B

Preventable Harm

Annual U.S. exposure from preventable adverse drug events. Independent evaluation is the missing infrastructure to reduce that number.

How independent evaluation works.

PsiBench translates the clinical safety standards hospitals already trust into automated evaluation scenarios, runs them against AI models independently, and publishes the results.

1. Encode Clinical Standards

Domain experts translate established medication safety standards into automated benchmark scenarios. Every scenario is validated by named clinical authorities and grounded in standards the industry already reports against.

2. Evaluate Independently

Posognos evaluates clinical AI models through EHR test environments and API endpoints using synthetic patient scenarios. No protected health information is accessed, generated, or stored.

3. Publish the Results

Aggregate scores are published on the PsiBench scorecard, freely available to the public. Detailed failure analysis, expert annotations, and remediation guidance are available to subscribers.
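
The three steps above can be sketched in a few lines. This is a minimal illustration only: the `Scenario` schema, the `evaluate` function, the metric names, and the toy model are all assumptions made for this example, not PsiBench's actual data model or scoring method. The sample orders are synthetic and are not clinical guidance.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Scenario:
    """A synthetic medication order with an expert-assigned label (hypothetical schema)."""
    scenario_id: str
    order: str
    patient_context: str
    is_harmful: bool  # ground truth assigned by clinical reviewers


def evaluate(model_alerts: Callable[[Scenario], bool],
             scenarios: list[Scenario]) -> dict[str, float]:
    """Score a model on how many harmful orders it flags and how many safe orders it leaves alone."""
    harmful = [s for s in scenarios if s.is_harmful]
    safe = [s for s in scenarios if not s.is_harmful]
    caught = sum(model_alerts(s) for s in harmful)
    quiet = sum(not model_alerts(s) for s in safe)
    return {
        "detection_rate": 100 * caught / len(harmful) if harmful else 0.0,
        "alert_specificity": 100 * quiet / len(safe) if safe else 0.0,
    }


# Synthetic scenarios only; no real patient data is involved.
SCENARIOS = [
    Scenario("s1", "amoxicillin 500 mg", "documented penicillin allergy", True),
    Scenario("s2", "methotrexate 10 mg daily", "rheumatoid arthritis", True),
    Scenario("s3", "acetaminophen 500 mg", "adult, no known issues", False),
    Scenario("s4", "lisinopril 10 mg daily", "adult, hypertension", False),
]


def toy_model(scenario: Scenario) -> bool:
    """Stand-in for a model under test: alerts only when an allergy is documented."""
    return "allergy" in scenario.patient_context


print(evaluate(toy_model, SCENARIOS))
```

In this sketch the toy model catches the allergy conflict but misses the dosing-frequency error, which is the kind of gap a per-scenario failure analysis would surface.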

See how clinical AI models compare.

The scorecard measures what matters to the people who carry liability when AI gets it wrong: Does it catch the orders that could harm a patient? How does it compare to alternatives? Does performance hold across updates?

PsiBench Scorecard: Medication Safety (CPOE)

Preview, first results Q3 2026
AI Model              Contraindication Detection   Alert Specificity   Override Appropriateness   Overall Score
Model A               87                           72                  64                         74
Model B               69                           81                  78                         76
Model C               58                           55                  41                         51
Legacy Rules Engine   44                           22                  30                         32
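
The page does not state how the Overall Score is derived. The preview figures are consistent with an unweighted mean of the three component scores, rounded to the nearest integer. The sketch below checks that assumption against the preview rows; it should not be read as the published scoring methodology.

```python
# Preview rows: (contraindication detection, alert specificity,
# override appropriateness, overall score as shown on the scorecard).
PREVIEW = {
    "Model A": (87, 72, 64, 74),
    "Model B": (69, 81, 78, 76),
    "Model C": (58, 55, 41, 51),
    "Legacy Rules Engine": (44, 22, 30, 32),
}


def overall_score(components: tuple[int, ...]) -> int:
    """Assumed aggregation: unweighted mean, rounded to the nearest integer."""
    return round(sum(components) / len(components))


for name, (*components, published) in PREVIEW.items():
    computed = overall_score(tuple(components))
    status = "matches" if computed == published else "differs"
    print(f"{name}: computed {computed}, shown {published} ({status})")
```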

Independent evaluation for everyone who depends on clinical AI.

Whether you build clinical AI, deploy it, or set the standards it should meet, PsiBench gives you the independent safety data you need to make better decisions.

AI Labs

Prove your models are safe before procurement asks

Hospital systems are starting to require independent safety validation for clinical AI. PsiBench provides third-party evaluation against the standards hospitals already trust, so you can demonstrate safety performance with data, not claims.

  • Independent safety validation recognized by hospital procurement
  • Detailed failure analysis with expert-annotated remediation guidance
  • Continuous regression testing across model versions and updates
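
As one way to picture regression testing across model versions, the sketch below compares two sets of scores and reports any metric that dropped by more than a tolerance. The function name, metric keys, scores, and the 2-point tolerance are hypothetical choices for illustration, not part of PsiBench.

```python
def find_regressions(previous: dict[str, float],
                     current: dict[str, float],
                     tolerance: float = 2.0) -> dict[str, float]:
    """Return the score change for each metric that fell by more than `tolerance` points."""
    return {
        metric: current[metric] - previous[metric]
        for metric in previous
        if metric in current and previous[metric] - current[metric] > tolerance
    }


# Hypothetical scores for two successive versions of the same model.
scores_v1 = {"contraindication_detection": 87, "alert_specificity": 72}
scores_v2 = {"contraindication_detection": 81, "alert_specificity": 73}

print(find_regressions(scores_v1, scores_v2))
```

Here the newer version improves slightly on specificity but loses six points on contraindication detection, so only the latter is flagged.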

Health Systems & Standards Bodies

Evaluate the AI your vendors are selling you

Clinical AI vendors are making safety claims you cannot independently verify. PsiBench provides the evaluation layer that lets you compare products against the standards your organization already uses, without building the testing infrastructure yourself.

  • Independent safety data on the clinical AI products in your environment
  • Side-by-side comparison against the standards you already report on
  • Continuous monitoring as vendors update their models

Built on the standards 2,000+ hospitals already trust.
Expanding across the regulatory landscape.

We do not invent safety metrics. We operationalize the clinical safety standards the industry already uses, so evaluation results are immediately meaningful to the organizations that rely on them.

  • National Medication Safety Standard (Live): CPOE safety evaluation. 2,000+ hospitals.
  • CMS SAFER (In Progress): EHR safety guides. Nine domains. Enforcement begins 2026.
  • Joint Commission (Planned): Accreditation safety standards. Hospital-wide quality evaluation.
  • ISMP (Planned): High-alert medication lists. Pharmacy workflow safety.
  • ICH-GCP (Future): Clinical trial safety. International harmonization.
  • CIOMS (Future): Pharmacovigilance standards. Global drug safety reporting.

2,000+

Hospitals Using the National Standard

Posognos' first benchmark is built on the medication safety evaluation standard already adopted by over 2,000 U.S. hospitals.

90%

Reduction in Testing Burden

Automated, continuous evaluation replaces weeks of manual testing. Synthetic-first methodology. Zero PHI. Privacy by design.

Built for independence from day one.

Credible safety evaluation requires independence from the organizations being evaluated, deep clinical expertise, and access to the standards the industry already trusts. Posognos was built on all three.

Founded by the Standard's Authors

Posognos' founding experts co-created the national medication safety evaluation used by 2,000+ U.S. hospitals. They bring decades of domain authority and direct relationships with the standards bodies that define clinical safety.

Expert-Validated at Every Step

Every PsiBench scenario is built and peer-reviewed by named domain experts with verifiable credentials. Their names appear on the evaluations they validate, because accountability is how trust is built.

Structurally Independent

Posognos is not funded by EHR vendors or AI labs. We do not consult for the entities we evaluate. Evaluation results are published independently. The integrity of the benchmark depends on it.

The people behind the benchmark.

Posognos' credibility comes from who builds and validates the evaluation. The team includes the original authors of the national medication safety standard, clinical informatics leaders, and the engineers who built the clinical decision support platforms used in thousands of hospitals.

Founding Expert Partners

David Classen, MD MS

Co-created the CPOE safety evaluation adopted by 2,000+ U.S. hospitals. Built one of the first computerized physician order entry systems. National patient safety informatics leader.

Stan Pestotnik, MS RPh

Medication safety pioneer. Founding CEO of TheraDoc (acquired). Creator of one of the first real-time clinical decision support systems. Former chief strategy officer at Pascal Metrics.

Validation Network
Practicing clinical pharmacists, informaticists, safety officers, and regulatory specialists who peer-review and certify PsiBench content. Their names appear on every evaluation they validate. Recruitment of founding partners for the CMS SAFER, Joint Commission, and ISMP domains is in progress.
Technical Leadership

Bryce Daines, PhD

Former EIR at the Allen Institute for AI (AI2); led product at PierianDx (45+ clinical genomics labs) and Tute Genomics through its acquisition.

Dan Cramer

SVP Product at Vizient; previously built and scaled the clinical quality platforms at TheraDoc and Safe & Reliable Healthcare through two acquisitions.

Joshua Proulx

Architected TheraDoc's clinical decision-support alerting engine; co-founded Safe & Reliable Healthcare (acquired by Vizient); re-architected the Knome genomics platform at Tute Genomics.

The safety standard clinical AI has been waiting for.

Whether you build clinical AI, evaluate it for procurement, or hold the clinical expertise that should inform how it is measured, we want to hear from you.