Document Intelligence Pipeline | Bikash Sapkota
Context: OCR and operational automation
Company: Smart Data Solutions
Problem: Scanned claims required structured extraction and classification before they could move efficiently through operational workflows.
Solution: Combined OCR extraction, text classification, rule-based extraction, NER, and interface improvements to support faster manual review.
Impact: Improved the path from scanned documents to structured operational data. Reduced friction for manual keying workflows. Connected ML extraction with practical back-office usability.
Architecture: Scanned claims -> OCR -> Classification -> Entity extraction -> Review interface
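The stage order above can be sketched as a small chain of functions. This is an illustrative outline only, not the production code: the OCR step is a stub, a keyword check stands in for the trained classifier, and the field patterns, labels, and names (ClaimRecord, run_ocr, classify, extract_entities, process) are all hypothetical.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ClaimRecord:
    """Carries one scanned claim through the pipeline stages."""
    raw_text: str
    doc_type: str = "unknown"
    entities: dict = field(default_factory=dict)
    needs_review: bool = True


def run_ocr(scan: str) -> str:
    # Stub: a real pipeline would call an OCR engine here.
    # This stand-in only normalizes whitespace in already-extracted text.
    return " ".join(scan.split())


def classify(text: str) -> str:
    # Keyword stand-in for a trained classifier (hypothetical labels).
    lowered = text.lower()
    if "dental" in lowered:
        return "dental_claim"
    if "claim" in lowered:
        return "medical_claim"
    return "unknown"


# Hypothetical rule-based patterns; real claim field formats will differ.
PATTERNS = {
    "member_id": re.compile(r"Member ID:\s*([A-Z0-9]+)"),
    "service_date": re.compile(r"Date of Service:\s*(\d{2}/\d{2}/\d{4})"),
}


def extract_entities(text: str) -> dict:
    # Collect the first match for each configured field pattern.
    found = {}
    for name, pattern in PATTERNS.items():
        match = pattern.search(text)
        if match:
            found[name] = match.group(1)
    return found


def process(scan: str) -> ClaimRecord:
    # Scanned claim -> OCR -> classification -> entity extraction -> review flag
    text = run_ocr(scan)
    record = ClaimRecord(raw_text=text)
    record.doc_type = classify(text)
    record.entities = extract_entities(text)
    # Send to manual review when the type is unknown or any field is missing.
    record.needs_review = (
        record.doc_type == "unknown" or len(record.entities) < len(PATTERNS)
    )
    return record


if __name__ == "__main__":
    sample = "Dental  claim form\nMember ID: AB123\nDate of Service: 01/02/2020"
    print(process(sample))
```

Keeping each stage as a separate function means the stub OCR or the keyword classifier could be swapped for a real engine or model without touching the review-routing logic.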
Stack: OCR, Tesseract, FineReader, WEKA, Random Forest, NER