iOS Application · ENT Clinical Research

OTOLens

AI-Assisted Middle Ear Diagnosis for Clinical Research

An iOS application that combines on-device machine learning and large language model AI to support the diagnosis of middle ear pathology from otoscopic images — designed for ENT clinical research in resource-constrained settings.

Middle ear disease is a leading cause of hearing impairment, particularly in children. Conditions such as Otitis Media with Effusion (OME), Tympanic Membrane Perforation (TMP), and Myringosclerosis (MYR) require experienced clinical interpretation of otoscopic images — a skill that is unevenly distributed across hospitals and regions.

OTOLens was built to support ENT clinicians and researchers in Thai hospitals, providing structured, reproducible diagnostic assistance and building a labeled dataset for future model development.

OTOLens is an iPad and iPhone application that guides a clinician through a structured case workflow:

  1. Record patient demographics and clinical information
  2. Capture or import an otoscopic image
  3. Run on-device AI inference — no internet required for analysis
  4. Receive an AI-generated clinical summary
  5. Submit the case for expert review — an independent specialist re-examines the image and provides a verified label
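The five-step workflow above is strictly ordered, which lends itself to a simple state-machine model. A minimal sketch (type and case names are illustrative, not taken from the app's source):

```swift
import Foundation

// Illustrative sketch of the five-step case workflow as an ordered state machine.
enum CaseStage: Int, CaseIterable, Comparable {
    case intake          // 1. demographics and clinical information
    case imageCapture    // 2. capture or import an otoscopic image
    case inference       // 3. on-device AI inference
    case aiSummary       // 4. AI-generated clinical summary
    case expertReview    // 5. submitted for independent expert review

    static func < (lhs: CaseStage, rhs: CaseStage) -> Bool {
        lhs.rawValue < rhs.rawValue
    }

    /// The next stage, or nil once the case is with the expert panel.
    var next: CaseStage? { CaseStage(rawValue: rawValue + 1) }
}

// Walking the workflow always terminates at expert review.
var stage = CaseStage.intake
while let next = stage.next { stage = next }
```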

Reviewed cases and expert labels are uploaded to a secure cloud backend, building a growing annotated dataset. When a validated model update is ready, it is automatically delivered to all enrolled devices over-the-air — no manual update required.
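The over-the-air model delivery step can be sketched with CoreML's on-device compilation API. The registry JSON shape and URL are assumptions for illustration; `MLModel.compileModel(at:)` is the real CoreML entry point for turning a downloaded `.mlmodel` file into a loadable model:

```swift
import CoreML
import Foundation

// Assumed shape of a model-registry entry; the actual backend schema is not
// shown in this document.
struct ModelUpdate: Decodable {
    let version: String
    let downloadURL: URL
}

/// Downloads a validated model update, compiles it on-device, and loads it.
func installUpdate(_ update: ModelUpdate) async throws -> MLModel {
    // Fetch the raw .mlmodel file shipped by the backend registry.
    let (tempURL, _) = try await URLSession.shared.download(from: update.downloadURL)

    // Compile into an .mlmodelc bundle on-device, then load it for inference.
    let compiledURL = try await MLModel.compileModel(at: tempURL)
    return try MLModel(contentsOf: compiledURL)
}
```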

Note: The expert review, relabelling, and automatic model deployment pipeline is subject to a data sharing and collaboration agreement between the contributing institution and the expert panel.

Version 2.0 builds on the previous release with updates to the image preprocessing stage, the model training pipeline, and the feedback loop connecting clinician submissions, expert review, and model deployment.

Stage 1

Image Quality Gate

Before any diagnosis, the app verifies that the image is a valid ear image using a binary support vector machine (SVM) classifier (EarImageCheck). Low-quality or non-ear images are rejected with guidance to recapture.
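A gate like this is typically wired up through the Vision framework. The sketch below assumes EarImageCheck is a compiled CoreML classifier; the "ear" label and 0.5 cutoff are illustrative assumptions, not values from the app:

```swift
import Vision
import CoreML

enum GateResult {
    case accepted
    case rejected(reason: String)
}

/// Pure decision rule, separated out so it can be tested without a model.
/// The "ear" label and 0.5 threshold are assumptions for illustration.
func passesGate(label: String, confidence: Float) -> Bool {
    label == "ear" && confidence >= 0.5
}

/// Runs the quality-gate classifier via Vision and applies the decision rule.
func runQualityGate(on image: CGImage, model: VNCoreMLModel) throws -> GateResult {
    let request = VNCoreMLRequest(model: model)
    try VNImageRequestHandler(cgImage: image).perform([request])

    guard let top = (request.results as? [VNClassificationObservation])?.first else {
        return .rejected(reason: "no classification produced")
    }
    return passesGate(label: top.identifier, confidence: top.confidence)
        ? .accepted
        : .rejected(reason: "not recognised as an ear image; please recapture")
}
```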

Stage 2

Parallel Condition Detection

If the image passes the quality gate, three independent binary SVM classifiers run in parallel. Each classifier outputs a confidence score. Results are non-exclusive — a single ear may show multiple concurrent findings.

Classifier      Detects
OMEDetector     Otitis Media with Effusion
TMPDetector     Tympanic Membrane Perforation
MYRDetector     Myringosclerosis
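Because the three detectors are independent and non-exclusive, they map naturally onto structured concurrency. A sketch under that assumption (the detector protocol and `Finding` type are illustrative; only the detector names come from the document):

```swift
import Foundation

/// One per-condition result; a single case may carry several findings at once.
struct Finding: Sendable {
    let condition: String   // "OME", "TMP", or "MYR"
    let confidence: Double  // per-condition score shown to the clinician
}

/// Illustrative interface shared by OMEDetector, TMPDetector, and MYRDetector.
protocol ConditionDetector: Sendable {
    var condition: String { get }
    func score(_ imageData: Data) async throws -> Double
}

/// Runs all detectors in parallel and collects every score: results are
/// non-exclusive, so no finding is discarded in favour of another.
func detectConditions(in imageData: Data,
                      detectors: [any ConditionDetector]) async throws -> [Finding] {
    try await withThrowingTaskGroup(of: Finding.self) { group in
        for detector in detectors {
            group.addTask {
                Finding(condition: detector.condition,
                        confidence: try await detector.score(imageData))
            }
        }
        var findings: [Finding] = []
        for try await finding in group { findings.append(finding) }
        return findings
    }
}
```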

Stage 3

AI Clinical Summary

Classifier outputs, confidence scores, and the patient's clinical data (age, sex, laterality, symptoms, history) are passed to the Claude AI model (Anthropic) via a secure API call. Claude generates a plain-language clinical summary suitable for documentation and communication with non-specialist staff.

All ML inference runs entirely on-device using Apple CoreML. The Claude API call requires an internet connection and is the only network-dependent step in the analysis pipeline.
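The summary step can be sketched in two parts: building a prompt from the classifier outputs and clinical metadata, then calling the Anthropic Messages API. The endpoint and headers below follow Anthropic's published API; the model name, prompt wording, and input struct are illustrative assumptions:

```swift
import Foundation

/// Illustrative bundle of what Stage 3 sends to the LLM.
struct SummaryInput {
    let scores: [String: Double]   // e.g. ["OME": 0.87, "TMP": 0.04, "MYR": 0.12]
    let age: Int
    let sex: String
    let laterality: String
    let symptoms: String
}

/// Flattens scores and metadata into a single plain-text prompt.
func buildPrompt(_ input: SummaryInput) -> String {
    let findings = input.scores
        .sorted { $0.value > $1.value }
        .map { "\($0.key): \(String(format: "%.2f", $0.value))" }
        .joined(separator: ", ")
    return """
    Patient: \(input.age)y \(input.sex), \(input.laterality) ear. Symptoms: \(input.symptoms).
    Classifier confidence scores: \(findings).
    Write a plain-language clinical summary suitable for non-specialist staff.
    """
}

/// Sends the prompt to the Anthropic Messages API; model name is illustrative.
func requestSummary(prompt: String, apiKey: String) async throws -> Data {
    var request = URLRequest(url: URL(string: "https://api.anthropic.com/v1/messages")!)
    request.httpMethod = "POST"
    request.setValue(apiKey, forHTTPHeaderField: "x-api-key")
    request.setValue("2023-06-01", forHTTPHeaderField: "anthropic-version")
    request.setValue("application/json", forHTTPHeaderField: "content-type")
    let body: [String: Any] = [
        "model": "claude-sonnet-4-5",
        "max_tokens": 512,
        "messages": [["role": "user", "content": prompt]]
    ]
    request.httpBody = try JSONSerialization.data(withJSONObject: body)
    let (data, _) = try await URLSession.shared.data(for: request)
    return data
}
```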

The next version aims to bring clinical summary generation on-device using a local LLM, reducing or eliminating the dependency on external API calls.

Capabilities:

  • On-device ML inference: 4 CoreML binary SVMs, no internet required
  • Image quality gate: automatic rejection of non-ear or poor-quality images
  • Multi-condition detection: OME, TMP, MYR — simultaneous, non-exclusive
  • Confidence scoring: per-condition probability scores displayed to the clinician
  • AI clinical summary: Claude (Anthropic) generates structured plain-language findings
  • Structured case intake: laterality, patient sex, age, chief complaints, symptom duration, history
  • Case history: full searchable record of all prior cases with images and results
  • Expert review & relabelling: cases sent to an independent expert for image re-examination and verified labelling
  • Automatic model OTA update: validated model updates shipped to all enrolled devices automatically
  • Secure cloud sync: cases, images, and labels uploaded to a Cloudflare Workers backend
  • Hospital-scoped auth: invitation code → account setup → PIN login
  • iPad + iPhone adaptive UI: NavigationSplitView on iPad, TabView on iPhone
  • Portrait-only: consistent with standard otoscope workflow
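The adaptive UI choice (NavigationSplitView on iPad, TabView on iPhone) is commonly driven by the horizontal size class. A minimal SwiftUI sketch; the view names are illustrative stand-ins for the app's actual screens:

```swift
import SwiftUI

// Illustrative stand-ins for the app's real screens.
struct CaseListView: View { var body: some View { Text("Cases") } }
struct CaseDetailView: View { var body: some View { Text("Select a case") } }
struct SettingsView: View { var body: some View { Text("Settings") } }

/// Splits the layout on size class: regular (iPad) gets a split view,
/// compact (iPhone) gets a tab bar.
struct RootView: View {
    @Environment(\.horizontalSizeClass) private var sizeClass

    var body: some View {
        if sizeClass == .regular {
            NavigationSplitView {
                CaseListView()
            } detail: {
                CaseDetailView()
            }
        } else {
            TabView {
                CaseListView()
                    .tabItem { Label("Cases", systemImage: "list.bullet") }
                SettingsView()
                    .tabItem { Label("Settings", systemImage: "gear") }
            }
        }
    }
}
```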

OTOLens is designed as a data collection and model evaluation platform as much as a clinical tool:

  • Annotated dataset building — every case captures the raw image, extracted features, model confidence scores, clinical metadata, and expert label in a single structured record.
  • Model benchmarking — confidence scores and ground-truth labels are stored together, enabling offline accuracy analysis per condition, per site, and over time.
  • Iterative model improvement — the backend registry tracks model versions per case, so accuracy can be evaluated before and after model updates without contaminating the historical dataset.
  • Closed-loop retraining pipeline — expert-verified labels feed directly into model retraining; approved updates are automatically deployed over-the-air to all enrolled devices without requiring clinicians to manually update the app.
  • Multi-site deployment — hospital invitation codes scope data to institution, enabling multi-centre studies.
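The "single structured record" described above might look like the following `Codable` sketch. All field names are illustrative; the app's actual schema is not shown in this document:

```swift
import Foundation

/// Illustrative sketch of one annotated case record as synced to the backend.
struct CaseRecord: Codable {
    let caseID: UUID
    let imageFilename: String          // reference to the raw otoscopic image
    let features: [Double]             // extracted image features
    let scores: [String: Double]       // per-condition model confidence
    let modelVersion: String           // tracked per case for benchmarking
    let hospitalCode: String           // scopes data to the institution
    let clinicalMetadata: [String: String]
    var expertLabel: [String: Bool]?   // verified ground truth, added at review
}
```

Storing the model version alongside each score is what allows before/after accuracy comparisons across model updates without touching the historical records.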

The expert review and automatic model deployment pipeline operates under a formal data sharing and collaboration agreement between the contributing institution and the expert panel.

  • Platform: iOS 17+ / iPadOS 17+
  • Device: iPhone or iPad with rear camera
  • On-device inference: runs on Apple Neural Engine / CPU — no GPU required
  • Internet: required only for Claude AI summary generation and case upload
  • Language: English (Thai localisation planned)

Department of Otorhinolaryngology
Faculty of Medicine Siriraj Hospital, Mahidol University

  1. Assoc. Prof. Siriporn Limviriyakul, MD.
  2. Assist. Prof. Sarun Prakairungthong, MD.
  3. Wipaluk Thitisomboon, MD.
  4. Kanokrat Suvarnsit, MD.

Division of Information Technology, Office of the President, Mahidol University

  1. Bhattaraprot Bhabhatsatam, Ph.D.