Browsing by Author "Holinko, Viktoriia"
Item
From Black-Box Confidence to Measurable Trust in Clinical AI: A Framework for Evidence, Supervision, and Staged Autonomy (https://arxiv.org/, 2026)
Заболотній, Сергій Васильович; Zabolotnii, Serhii; Holinko, Viktoriia; Antonenko, Olha
Trust in clinical artificial intelligence (AI) cannot be reduced to model accuracy, fluency of generation, or overall positive user impression. In medicine, trust must be engineered as a measurable system property grounded in evidence, supervision, and operational boundaries of AI autonomy. This article proposes a practical framework for trustworthy clinical AI built around three principles: evidence, supervision, and staged autonomy. Rather than replacing deterministic clinical logic wholesale with end-to-end black-box models, the proposed approach combines a deterministic core, a patient-specific AI assistant for contextual validation, a multi-tier model escalation mechanism, and a human supervision layer for verification, escalation, and risk control. We demonstrate that trust also depends on selective verification of clinically critical findings, bounded clinical context, disciplined prompt architecture, and careful evaluation on realistic cases. Classifier-driven modular prompting is examined as an incremental path to scaling clinical depth without sacrificing prompt performance and without waiting for complete rule-based coverage. To operationalize trust, a set of trust metrics is proposed, built on metrological principles (measurement uncertainty, calibration, traceability), enabling quantitative rather than subjective assessment of each architectural layer.
In this perspective, trustworthy clinical AI emerges not as a property of an individual model, but as an architectural outcome of a system into which evidence trails, human oversight, tiered escalation, and graduated action rights are embedded from the outset.

Item
Local-First Clinical Text Structuring with Fine-Tuned MedGemma for Readmission Risk Assessment (https://zenodo.org, 2026)
Заболотній, Сергій Васильович; Zabolotnii, Serhii; Holinko, Viktoriia
Background. Unstructured clinical notes remain a bottleneck for deployable healthcare AI; cloud-dependent pipelines raise privacy and infrastructure barriers.
Methods. We present MedGemma StructCore, a local-first two-stage extraction pipeline using compact MedGemma 4B models. Stage 1 applies Schema-Guided Reasoning to summarize notes into structured JSON across nine clinical clusters. Stage 2 projects summaries into canonical KVT4 (Cluster|Keyword|Value|Timestamp) facts via a LoRA-adapted model. Deterministic normalization, a signal-integrity gate, and offline hybrid regeneration audit and reduce silent objective signal-loss between stages. Prompt KV-cache reuse yields +10.6% speedup with bit-exact output [Verified].
Results. On MIMIC-IV (N=50,000; patient-level split; Ntest=9,857), the tabular baseline (A4) achieves AUROC 0.685 (95% CI 0.670–0.699) [Verified]. On the full canonical test split (Ntest=9,857), under a constrained training regime (Ntrain=1,500, Nval=400), A3factlevel achieves AUROC 0.659, AUPRC 0.321, and Brier 0.145. Against a fair tabular refit baseline (LogReg and XGBoost) with the same training split and demographic covariates, A3factlevel improves AUPRC and Brier [Verified], while AUROC uplift is small and not statistically verified [Preliminary]. Notably, XGBoost does not outperform logistic regression on the same feature set, confirming that downstream gains are attributable to KVT4 features rather than estimator choice.
As a post-closure continuation branch, direct typed downstream fusion of four high-signal semantic labels improves the current Stage 2 baseline on the same canonical split and yields a verified AUPRC gain over the canonical A4 tabular arm [Verified], while remaining near-parity rather than clearly superior to A3factlevel. KVT4 format validity is 99.74%; a signal-integrity audit (N=4,000) finds 15.55% doc-level objective loss (among admissions with Stage 1 numeric vitals/labs), reduced to 8.48% by offline hybrid regeneration without additional LLM calls. Structured-reference validation now includes a large LABS benchmark on the full canonical test split and a preliminary VITALS benchmark path with chartevents-backed BP/Weight evaluation. A model scaling pilot replacing Stage 1 with GPT4.1-mini confirms that moderate LABS micro-F1 (≈0.52 ceiling) reflects reference-alignment mismatch rather than model capacity [Preliminary, N=200].
Conclusion. The primary contribution is reliable, auditable local-first clinical text structuring infrastructure running on consumer hardware. On the canonical test split, factlevel KVT tokenization improves precision–recall and probabilistic accuracy metrics (AUPRC, Brier) over a tabular refit baseline [Verified]; AUROC uplift is small [Preliminary]. Direct typed downstream fusion now provides the strongest verified continuation path over the current Stage 2 baseline, suggesting that typed semantic signals are a more promising next optimization target than further free-form Stage 2 generator variants. The current revision package therefore supports a conservative conclusion: notes-derived KVT4 facts add useful predictive signal, but stronger extraction-quality and fairness claims still require further validation.
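
The abstract above describes KVT4 facts only as a pipe-delimited Cluster|Keyword|Value|Timestamp layout and reports a format-validity percentage. A minimal sketch of how such a validity check could look is given below. It is not the authors' implementation: the names `KVT4Fact`, `parse_kvt4`, and `format_validity` are hypothetical, the rule that all four fields must be non-empty is an assumption, and the sample lines are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KVT4Fact:
    """One Cluster|Keyword|Value|Timestamp fact (field names taken from the abstract)."""
    cluster: str
    keyword: str
    value: str
    timestamp: str


def parse_kvt4(line: str) -> Optional[KVT4Fact]:
    """Parse one line; return None unless it has exactly four non-empty fields.

    Assumption: every field, including the timestamp, must be non-empty.
    The source does not state the exact validity rule.
    """
    parts = [p.strip() for p in line.split("|")]
    if len(parts) != 4 or any(not p for p in parts):
        return None
    return KVT4Fact(*parts)


def format_validity(lines: list[str]) -> float:
    """Fraction of non-blank lines that parse as well-formed KVT4 facts."""
    candidates = [ln for ln in lines if ln.strip()]
    if not candidates:
        return 0.0
    valid = sum(parse_kvt4(ln) is not None for ln in candidates)
    return valid / len(candidates)


# Invented sample lines, for illustration only.
sample = [
    "LABS|Creatinine|1.4|Discharge",  # well-formed
    "VITALS|Heart Rate|88",           # only three fields
    "LABS||1.4|Admission",            # empty keyword
]
print(parse_kvt4(sample[0]))
print(format_validity(sample))
```

A stricter check could also validate the cluster name against the nine clinical clusters or the timestamp against a fixed vocabulary, but the abstract lists neither, so this sketch leaves them open.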