United States | Computer Science and Information Technology | Volume 14 Issue 5, May 2025 | Pages: 1617 - 1619
Scalable ML Framework for Insurance Document Indexing
Abstract: We present a scalable ML pipeline leveraging multimodal OCR, transformer-based architectures (e.g., LayoutLMv3, Donut), and human-in-the-loop active learning for fine-grained classification and structured information extraction across heterogeneous insurance document corpora. The platform incorporates end-to-end data ingestion, annotation, model versioning, drift detection, and continuous retraining to ensure robustness and operational scalability. This approach significantly enhances throughput, reduces manual annotation overhead, and supports compliance in large-scale insurance ecosystems.
Keywords: Multimodal OCR; Transformer architectures; LayoutLMv3; Donut model; Human-in-the-loop learning; Active learning
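
The human-in-the-loop active learning step named in the abstract can be illustrated with a minimal sketch. The abstract does not say which selection strategy the platform uses, so the example below assumes entropy-based uncertainty sampling, one common choice: documents whose predicted class distribution has the highest entropy are routed to human annotators first. The function names, document IDs, class probabilities, and the `budget` parameter are all hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List, Sequence


def predictive_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_annotation(
    predictions: Dict[str, Sequence[float]], budget: int
) -> List[str]:
    """Return the `budget` document IDs whose predictions are most uncertain."""
    ranked = sorted(
        predictions,
        key=lambda doc_id: predictive_entropy(predictions[doc_id]),
        reverse=True,  # highest entropy (least confident) first
    )
    return ranked[:budget]


# Hypothetical classifier outputs over three illustrative document classes
# (for example claim form, policy, invoice). These numbers are made up.
example_predictions = {
    "doc-001": [0.98, 0.01, 0.01],
    "doc-002": [0.40, 0.35, 0.25],
    "doc-003": [0.70, 0.20, 0.10],
    "doc-004": [0.34, 0.33, 0.33],
}

# Send the two least confident documents to the annotation queue.
to_label = select_for_annotation(example_predictions, budget=2)
print(to_label)  # ['doc-004', 'doc-002']
```

Under this assumption, confidently classified documents such as `doc-001` bypass manual review, which is one way a pipeline of this kind could reduce annotation overhead. The newly labeled documents would then feed the retraining loop.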