Abstract
This study aimed to develop a deep learning model that recognizes cell interaction patterns in pathological slides of malignant bowel obstruction. The model classifies lesions into four categories—normal mucosa, serrated lesions, adenomas, and adenocarcinomas—and its diagnostic utility in tumor-associated obstruction was evaluated. Pathological slides from patients with tumor-induced colorectal obstruction (TICO) were retrospectively collected from the First Affiliated Hospital of Bengbu Medical University and annotated into the four histological categories above. The proposed deep learning framework combines a residual convolutional network with a bidirectional state-space module (SSM), enabling multiscale feature extraction through convolution and downsampling while modeling the spatiotemporal dynamics of cellular interactions. The model was designed to learn spatial and structural characteristics of cell interactions—such as glandular organization, intercellular spacing, and nuclear density—across lesion types. Grad-CAM was used to visualize attention regions and assess the consistency between model focus and pathological features; it served solely for interpretability, not clinical validation, and no expert verification of the visualizations has been performed. On the independent Chaoyang test set, the model achieved an accuracy of 85% and a macro-F1 score of 0.843 (95% CI: 0.829–0.857), only three percentage points below the training accuracy (88%), suggesting good generalizability. In addition, we computed 95% confidence intervals from 1,000 bootstrap resamples and applied the DeLong and McNemar tests to compare the model with baseline methods. The results showed statistically significant improvements (P < 0.05) in accuracy, macro-F1, and ROC-AUC, further strengthening the reliability of our conclusions.
The recall for adenocarcinoma (Class 3) reached 88%, while Classes 0–2 (normal, serrated lesions, and adenomas) ranged from 78% to 83%. These results reflect the impact of sample imbalance and morphological similarity, which will be addressed in future work through Focal Loss reweighting and detailed error analysis. Grad-CAM visualizations highlighted regions of glandular disruption and abnormal nuclear density, aligning with WHO 2022 diagnostic criteria and enhancing model interpretability. Overall performance is comparable to recent state-of-the-art gastrointestinal pathology AI systems, offering rapid and quantitative diagnostic support in emergency pathology settings. The proposed deep learning model effectively distinguishes four categories of tumor-associated colorectal lesions, demonstrating strong diagnostic potential. Limitations include: (i) all data were retrospectively collected from a single center without external multicenter validation; differences in population composition, scanning platforms, and staining batches may affect external generalizability, and future studies will prioritize multicenter datasets to systematically evaluate robustness under diverse clinical conditions; (ii) the model has so far been assessed only in an offline environment and lacks prospective clinical validation within real-world workflows. Nonetheless, this model provides an important foundation for the early diagnosis of TICO, the formulation of personalized treatment strategies, and the advancement of pathological image analysis technologies.
Introduction
Tumor-induced colorectal obstruction (TICO), caused by intraluminal growth of colorectal cancer or precancerous lesions leading to luminal narrowing or occlusion, represents an acute abdominal emergency and accounts for 10%–18% of newly diagnosed colorectal cancer cases. The mortality and complication rates associated with emergency surgery for this condition are markedly higher than those observed in elective procedures1. Unlike non-tumor-related obstructions such as those resulting from adhesions, volvulus, or inflammation—which primarily manifest with mucosal ischemia, edema, and fibrosis—tumor-induced obstruction reflects the pathological hallmarks of tumor progression, namely aberrant epithelial proliferation and microenvironmental remodeling. The classical “adenoma–carcinoma sequence,” comprising the continuum from normal mucosa through serrated lesions and adenomas to adenocarcinoma, underlies this process. In contrast, non-tumor-related obstructions are typically accompanied by pronounced inflammation and fibrosis, with their molecular regulatory mechanisms best understood within the broader context of intestinal fibrosis research2. Recent research indicates that the development of TICO is influenced not only by mechanical blockage within the intestinal lumen3, but also by complex interactions between epithelial cells and their surrounding microenvironment4. Aberrations in cellular processes, such as proliferation, apoptosis, differentiation, and intercellular communication, have been shown to play pivotal roles in the onset and progression of obstruction5,6. While recent studies have advanced our understanding of the pathological mechanisms underlying TICO2, systematic analysis of cell–cell interaction patterns (CIPs) in pathological slides remains limited. Deep learning offers a promising approach to rapidly and automatically identify the spatial characteristics of these interactions in obstructive lesions. 
Such automated analysis could support pathological staging in emergency clinical settings and provide valuable insight into the feasibility of immediate radical resection. On this basis, the present study mapped the four-class outputs onto an operational “ADC-prioritized diversion” threshold in external data and quantified its potential clinical net benefit using calibration metrics and decision curve analysis, providing preliminary evidence for linking the model to real-world diagnostic and therapeutic workflows.
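The net-benefit calculation behind decision curve analysis can be illustrated with a few lines of code; the patient counts below are hypothetical and serve only to show the arithmetic, not results from this study.

```python
def net_benefit(tp, fp, n, pt):
    """Net benefit at threshold probability pt (decision curve analysis):
    benefit of true positives minus threshold-weighted harm of false
    positives, expressed per patient."""
    return tp / n - (fp / n) * (pt / (1 - pt))

# Hypothetical example: 2,000 patients, 500 true ADC cases. A diversion
# policy flags 400 true positives and 100 false positives at pt = 0.2.
nb_model = net_benefit(tp=400, fp=100, n=2000, pt=0.2)
nb_treat_all = net_benefit(tp=500, fp=1500, n=2000, pt=0.2)
```

Under these assumed counts, the model's curve sits above the "treat-all" strategy at this threshold, which is the kind of comparison a decision curve summarizes across thresholds.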
This study focuses specifically on TICO. The proposed model does not currently evaluate non-tumor pathological conditions such as ischemia, edema, or fibrosis, and is not applicable to the diagnosis of obstructions caused by adhesions or volvulus. Comparative analyses of inflammatory/ischemic lesions and the development of multimodal decision-support systems are beyond the scope of this manuscript.
The cellular interaction characteristics of the four lesion types are analyzed as follows: (1) Normal colonic glandular epithelium: Normal colonic epithelial cells exhibit a highly organized architecture within glandular structures, supporting essential functions such as absorption, secretion, and barrier maintenance. Tight and adherens junctions preserve structural integrity, while paracrine signaling mechanisms regulate homeostasis. Cellular proliferation is tightly controlled, and apoptosis occurs in a balanced manner to maintain epithelial renewal. (2) Serrated lesions: Serrated lesions, characterized by a saw-toothed glandular morphology, differ from adenomas in both molecular pathogenesis and malignant potential. They are often associated with the CpG island methylator phenotype (CIMP) and mismatch repair (MMR) deficiency. These lesions typically display abnormal proliferation with relatively mild morphological changes. Intercellular adhesion is often weakened, accompanied by alterations in the extracellular matrix and dysregulation of signaling pathways such as Wnt. While not all serrated lesions are premalignant, sessile serrated adenomas/polyps (SSA/P) carry a higher risk of malignant transformation due to enhanced proliferation, decreased apoptosis, and abnormal intercellular signaling. (3) Adenomas: Adenomas are benign neoplastic lesions with the potential for malignant transformation. Their development is driven by genetic factors, such as familial adenomatous polyposis, and environmental factors, including chronic inflammation and diet. Histologically, they are marked by pronounced epithelial hyperplasia, disrupted cell polarity, abnormal differentiation, and remodeling of intercellular junctions and the extracellular matrix. At the molecular level, key pathways such as Wnt and TGF-β are frequently dysregulated. Villous adenomas, in particular, carry a higher risk of malignant progression than tubular adenomas.
(4) Adenocarcinomas: Adenocarcinomas are malignant tumors characterized by uncontrolled proliferation, abnormal differentiation, local invasion, and distant metastasis. Histological features include severe glandular disorganization, near-complete loss of intercellular junctions, and extensive degradation of the extracellular matrix. Intercellular communication is profoundly disrupted. Tumor cells actively secrete cytokines, growth factors, and matrix metalloproteinases through paracrine, autocrine, and endocrine signaling pathways, promoting angiogenesis, immune evasion, and metastasis. Moreover, interactions with immune cells, fibroblasts, and stromal components help shape a tumor microenvironment that facilitates malignant progression.
Traditional pathological diagnosis relies heavily on manual slide examination by pathologists7, a process that introduces subjectivity and interobserver variability8,9. This challenge is particularly pronounced in the pathological image analysis of TICO, where pathological slides often present overlapping lesion types—normal mucosa, serrated lesions, adenomas, and adenocarcinomas—with markedly different patterns of cell interaction, complicating accurate diagnosis10. While immunohistochemistry and fluorescence labeling can provide insight into certain cellular interactions11, these methods require costly reagents, involve complex procedures, and fail to capture dynamic, spatial relationships comprehensively12.
In recent years, deep learning has emerged as a powerful tool to enhance diagnostic efficiency and consistency in digital pathology13,14. Conventional convolutional neural networks (CNNs), such as ResNet, have demonstrated strong performance in large-scale slide analysis tasks15–17. However, they are limited in capturing long-range dependencies and complex spatial interactions18. Recent advances include Bishnoi et al.’s Dual-Path Multi-Scale CNN, which improved sensitivity in hepatocellular carcinoma detection through cross-scale channel interaction19; Dong et al.’s lightweight contrastive learning framework, achieving an average AUC of 0.937 in external validation20; and Oguz et al.’s morphology-aware graph attention mechanism for distinguishing heterogeneous tumor fragments21. From the three perspectives of multi-scale feature fusion, efficient inference, and graph-structured modeling, these studies complement progress beyond the Mamba paradigm and informed the design of the ablation experiments and parameter selection in this work.
More recently, state-space models (SSMs) have shown significant promise in pathology image analysis: Pathology Mamba by Huang et al. demonstrated leading classification and survival prediction performance across nine cancer types22, while DIPathMamba by Fan et al. integrated contrastive Mamba blocks with domain-incremental learning to improve weakly supervised segmentation23. These recent studies further validate Mamba’s advantages in modeling long-range dependencies across image patches and in computational efficiency, thereby providing direct theoretical and empirical support for adopting Mamba as a core component in this work. Moreover, cellular interaction patterns involve not only static structural alterations but also dynamic processes that evolve alongside lesion progression, necessitating the use of more sophisticated models for precise analysis.
In this study, we introduce the concept of CIPs to describe the observable, quantifiable spatial organization of glands and cells in pathological slides. To operationalize this concept, we define four interpretable metrics: (1) minimal nuclear spacing and its coefficient of variation; (2) nucleus-to-cytoplasm density ratio and local crowding index; (3) glandular contour curvature and branching count; and (4) continuity disruption score at the epithelium–lamina propria interface. These indices align with the WHO 2022 criteria for glandular complexity and nuclear atypia, ensuring consistency with clinical pathological grading standards.
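For instance, the first CIP metric—minimal nuclear spacing and its coefficient of variation—can be computed directly from detected nuclear centroids. The sketch below is a minimal illustration that assumes centroids have already been extracted by an upstream segmentation step; it is not the paper's exact implementation.

```python
import math
from statistics import mean, stdev

def min_nuclear_spacings(centroids):
    """For each nucleus centroid (x, y), the distance to its nearest
    neighbour. Brute-force O(n^2); fine for a single patch."""
    spacings = []
    for i, (xi, yi) in enumerate(centroids):
        d = min(math.hypot(xi - xj, yi - yj)
                for j, (xj, yj) in enumerate(centroids) if j != i)
        spacings.append(d)
    return spacings

def spacing_cv(centroids):
    """Coefficient of variation of minimal nuclear spacing (CIP metric 1):
    low in orderly glands, high where nuclear crowding is irregular."""
    s = min_nuclear_spacings(centroids)
    return stdev(s) / mean(s)
```

On a perfectly regular grid of nuclei the CV is zero; increasingly disordered arrangements push it upward, which is the intuition behind using it as an atypia proxy.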
Accordingly, motivated by the above clinical need, we propose a ResNet–Mamba hybrid architecture: (i) convolutional residual blocks capture multi-scale glandular morphology, (ii) Mamba bidirectional state-space units model the spatiotemporal dynamics of cellular interactions, and (iii) a lightweight MLP head performs fully supervised four-class lesion classification. Although the architecture integrates existing methodological components, its novelty lies in applying the Mamba module to capture dynamic features of cellular interactions in TICO pathology, providing a new technical pathway for pathological subtyping of TICO. Compared with PathMamba24, a weakly supervised segmentation method, and MambaMIL25, a WSI-level multi-instance classification approach, the convolution–Mamba hybrid model proposed in this study introduces targeted optimizations in task definition, annotation granularity, and interpretability. First, we focus on fine-grained discrimination of four tumor stages rather than contour segmentation or whole-slide classification. Second, frame-level fully supervised labels are adopted to eliminate the boundary uncertainty inherent to weak supervision. Third, we combine Grad-CAM with four CIP metrics to achieve both visual and quantitative interpretability. On the Chaoyang test set, we further reproduced baseline models spanning representative architectures (Swin Transformer, ViT-Base, EfficientNet-B3, ConvNeXt-Tiny) and the self-supervised method SimCLR, and compared them with our ResNet–Mamba hybrid model. Under identical training and validation settings, Swin Transformer and SimCLR achieved Accuracy/Macro-F1 scores of 0.834/0.828 and 0.821/0.815, respectively, whereas our model reached 0.862/0.857, significantly outperforming these methods.
These results demonstrate that the proposed model achieves both performance and efficiency advantages within models of comparable scale, with a 3–5% improvement in recall for minority classes such as serrated lesions and adenomas. This distinction enables our study to directly output interpretable pathological staging conclusions in the emergency laboratory setting. As an efficient bidirectional state space modeling tool, the Mamba module can extract richer pathological information from the spatiotemporal dynamics of cellular interaction patterns, thereby providing more precise data support for the early diagnosis of bowel obstruction.
At present, research on the automated analysis of cellular interaction patterns in pathological sections of TICO remains in its infancy, with relatively few deep learning models in the literature that systematically address CIPs26. This work therefore carries both academic value and considerable clinical application potential. The aim of this study is to develop an interpretable deep learning framework for rapid, automatic four-class classification of pathological sections of TICO (normal mucosa, serrated lesions, adenomas, and adenocarcinomas). The framework also incorporates visualization techniques to highlight model attention regions, assisting pathologists in overcoming the diagnostic challenges posed by subtle morphological differences in early lesions and providing immediate, reliable lesion subtype identification in emergency or rapid pathology scenarios.
In summary, deep learning-based analysis of CIPs in pathological sections of TICO not only fills a critical gap in current computational pathology research but also opens new avenues for early diagnosis and precision therapy in obstructive colorectal disease.
Materials and methods
Dataset source and collection
The data used in this study were derived from paraffin-embedded pathological sections of TICO cases diagnosed at the First Affiliated Hospital of Bengbu Medical University, forming a single-center, self-constructed Chaoyang-TICO dataset. External multicenter datasets have not yet been incorporated for independent validation; future work will extend testing across institutions, scanning platforms, and staining batches to evaluate cross-center robustness and generalizability. Inclusion criteria were: (i) radiological, endoscopic, or intraoperative evidence of intraluminal colorectal mechanical stenosis; (ii) pathological confirmation that the obstruction was caused by a tumor-related lesion; and (iii) availability of high-quality sections from the obstructive lesion or related proximal regions. Exclusion criteria included non-tumor-related obstruction (e.g., adhesions, volvulus, or inflammatory stenosis) and sections of insufficient quality. All sections were digitized into whole-slide images (WSIs) using a 20× objective lens. The slides were then patch-extracted into tiles of 224 × 224 pixels with a 20% overlap, and initial screening was performed using Otsu thresholding and blank-area filtering. The annotation system was restricted to the four histological categories relevant to the TICO scenario: normal mucosa, serrated lesions, adenomas, and adenocarcinomas. Labeling was independently conducted by two senior pathologists, with discrepancies resolved through consensus discussion. Inter-observer agreement was assessed via κ statistics, and doubtful patches were subjected to joint review. Data partitioning followed a patient-level stratified split into independent training, validation, and test sets, and five-fold cross-validation within the training set was applied for hyperparameter tuning and early stopping.
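The tiling grid implied by 224 × 224 patches with a 20% overlap can be sketched as below. The edge-alignment behavior (shifting the last tile back inside the image) is an assumption, since the paper does not specify how WSI borders are handled.

```python
def tile_coords(width, height, patch=224, overlap=0.2):
    """Top-left coordinates for overlapping patch extraction from a WSI.
    A 20% overlap gives a stride of patch * (1 - overlap) = 179 px; the
    last tile on each axis is shifted back so it stays inside the image.
    Assumes width >= patch and height >= patch."""
    stride = int(patch * (1 - overlap))
    xs = list(range(0, width - patch + 1, stride))
    ys = list(range(0, height - patch + 1, stride))
    if xs[-1] != width - patch:
        xs.append(width - patch)
    if ys[-1] != height - patch:
        ys.append(height - patch)
    return [(x, y) for y in ys for x in xs]
```

Each coordinate would then be passed through the Otsu/blank-area filter before being kept as a training tile.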
To address class imbalance and hard example learning, class frequency–based weighted cross-entropy and hard sample reweighting strategies were employed. In addition, methods inspired by noise-robust pathology classification studies27 were incorporated to mitigate the potential impact of annotation bias during training. The definitions and representative examples of the four TICO histological categories and their corresponding cellular interaction pattern (CIP) indices are summarized in Table 1.
Table 1. Cellular interaction patterns across four colorectal obstruction pathological Categories.
| Classification | Cell proliferation | Cell differentiation | Intercellular junction | Intercellular communication |
|---|---|---|---|---|
| Normal | Controlled | Normal | Tight | Normal |
| Serrated | Mildly abnormal | Mildly abnormal | Relatively loose | Partially anomalous |
| Adenoma | Significantly increased | Abnormal | Disordered | Significantly anomalous |
| Adenocarcinoma | Uncontrolled | Severely abnormal | Near-total destruction | Completely disordered |
This table summarizes the cell-level biological patterns observed in normal tissue, serrated lesions, adenomas, and adenocarcinomas, covering four dimensions: cell proliferation, differentiation, intercellular junctions, and communication. The listed trends reflect a progressive deterioration in cellular organization and interaction complexity across tumor development stages.
To enhance transparency and reproducibility, the model development workflow is outlined as follows: (1) Image acquisition and annotation: Slides from confirmed TICO cases were sourced from the First Affiliated Hospital of Bengbu Medical University and annotated by expert pathologists into four histological categories: normal mucosa, serrated lesions, adenomas, and adenocarcinomas. (2) Patch extraction and preprocessing: WSIs were scanned at 20× magnification and partitioned into 224 × 224 patches with 20% overlap; Macenko stain normalization and RGB channel normalization were applied to minimize staining variation. (3) Feature extraction network construction: The first two stages employed ResNet-34 convolutional residual blocks to capture local morphological and textural features, while the final two stages integrated Mamba bidirectional SSMs to model the spatiotemporal dynamics of cellular interactions. (4) Classifier design and training: Extracted feature maps were processed via global average pooling, followed by a two-layer, 256-dimensional lightweight MLP head; final class probabilities were obtained through a Softmax layer. The model was trained with class-weighted Focal Loss, optimized using AdamW, and validated through five-fold cross-validation. (5) Model interpretability analysis: Grad-CAM was employed to visualize model attention maps, which were combined with the four proposed CIP indices to evaluate alignment with pathological features and improve interpretability.
All images were first grouped according to patient ID and then split into training and independent test sets using stratified sampling at a 65%:35% patient-level ratio. Within the training set, five-fold cross-validation was employed for model selection and early stopping, while the independent test set was reserved exclusively for final evaluation, ensuring that no patient appeared across different subsets and preventing data leakage. All raw data accessible to the research team had been fully anonymized at the hospital side. Data access privileges were restricted to authorized research members who had signed confidentiality agreements. Image files contained no embedded metadata or label information. Data processing strictly adhered to the hospital’s data management regulations as well as the Personal Information Protection Standard of the People’s Republic of China (GB/T 35273–2020). The use of this dataset was conducted under a Data Use and Access Agreement (DUAA) signed between the hospital and the research team. The dataset was restricted to research purposes only and prohibited from secondary sharing or commercial exploitation. Should future dissemination of derived datasets (e.g., image patches) be required, additional ethical approval would be sought, accompanied by a renewed assessment of potential re-identifiability risks.
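A patient-level stratified split of this kind can be sketched in a few lines. The 65%/35% ratio matches the paper, while assigning each patient a single lesion label for stratification is a simplifying assumption.

```python
import random

def patient_level_split(patient_labels, test_frac=0.35, seed=42):
    """Split patients (not patches) into train/test sets, stratified by
    lesion category, so no patient's images span both subsets."""
    rng = random.Random(seed)
    by_class = {}
    for pid, label in patient_labels.items():
        by_class.setdefault(label, []).append(pid)
    train, test = set(), set()
    for pids in by_class.values():
        rng.shuffle(pids)
        n_test = round(len(pids) * test_frac)
        test.update(pids[:n_test])
        train.update(pids[n_test:])
    return train, test
```

Because patients (rather than patches) are the sampling unit, near-duplicate tiles from one slide can never leak between training and test sets.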
As shown in Table 2, the dataset comprised 1,815 normal samples (29.5%), 1,563 serrated lesion samples (25.4%), 1,254 adenoma samples (20.4%), and 1,528 adenocarcinoma samples (24.8%). The normal and adenocarcinoma categories were slightly overrepresented, whereas serrated lesions and adenomas were relatively underrepresented. Such mild imbalance, if left unaddressed, could reduce recall for the minority classes. To mitigate this, we employed the following strategies: (i) stratified sampling during data splits to preserve class proportions across training, validation, and test sets; (ii) class-weighted Focal Loss with modulation factor γ = 2 to reduce the impact of easily classified samples, with class-specific weights computed as α_j = (1/n_j) / ∑_{k=1}^{K} (1/n_k), assigning higher weights to underrepresented classes; (iii) granular per-class performance reporting (precision, recall, F1-score), ensuring that minority-class performance was not masked by overall accuracy. These measures collectively improved the model’s stability in recognizing adenomas and serrated lesions, with quantitative evidence presented in the Results section. In addition, to further address the underrepresentation of serrated lesions and adenomas, data augmentation techniques—including random rotation, color perturbation, and noise injection—were applied to expand these classes. This approach reduced misclassification among normal mucosa, serrated lesions, and adenomas while improving recall for the minority categories.
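The weighting scheme in (ii) can be made concrete with the class counts from Table 2. The scalar focal-loss term below is a per-sample sketch, not the full batched implementation used for training.

```python
import math

def inverse_freq_weights(counts):
    """alpha_j = (1/n_j) / sum_k (1/n_k): rarer classes get larger weights,
    and the weights sum to 1."""
    inv = [1.0 / n for n in counts]
    total = sum(inv)
    return [v / total for v in inv]

def focal_loss(p_true, alpha, gamma=2.0):
    """Class-weighted focal loss for one sample:
    -alpha * (1 - p)^gamma * log(p), where p is the probability assigned
    to the true class. gamma = 2 down-weights easy examples."""
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)

counts = [1815, 1563, 1254, 1528]   # normal, serrated, adenoma, adenocarcinoma
weights = inverse_freq_weights(counts)
```

With these counts, adenomas (the smallest class) receive the largest α, so their gradients are amplified relative to the majority classes.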
Table 2. Class distribution of the Chaoyang colorectal-obstruction pathology dataset.
| Category | Sample count (n) | Proportion (%) |
|---|---|---|
| Normal | 1,815 | 29.5 |
| Serrated lesions | 1,563 | 25.4 |
| Adenomas | 1,254 | 20.4 |
| Adenocarcinomas | 1,528 | 24.8 |
| Total | 6,160 | 100 |
The dataset comprises 6,160 images across four tumor-related categories: normal, serrated lesions, adenomas, and adenocarcinomas. The data were partitioned into training (4,021 images, 65.3%) and test (2,139 images, 34.7%) sets using patient-level stratified sampling, ensuring proportional representation per category across subsets.
Given the limited volume of medical imaging data, data augmentation was applied during training to enhance generalization and robustness. Specifically, the Albumentations library was employed to probabilistically apply the following transformations to each training batch: (i) random rotation (0°–360°), (ii) horizontal and vertical flipping, (iii) random brightness and contrast adjustment, (iv) color jittering, and (v) Gaussian blurring and noise injection. These strategies effectively expanded the dataset and improved the model’s stability on unseen data. All test images were independently annotated by three experienced attending pathologists, with final consensus reached through expert review meetings, ensuring that the test set labels were rigorously harmonized, consistent, and of high quality. The dataset covers the four histological categories relevant to TICO—normal mucosa, serrated lesions, adenomas, and adenocarcinomas—providing a comprehensive evaluation scenario for four-class classification and facilitating assessment of the model’s fine-grained recognition ability in clinically realistic contexts. Within the proposed research framework, these four categories correspond to the histological spectrum of TICO, following the oncogenic progression from normal mucosa → serrated lesions → adenomas → adenocarcinomas, which directly underlies the mechanical stenosis that leads to TICO. Accordingly, the present model focuses exclusively on histological subtyping of TICO and does not account for secondary non-tumor-related changes such as ischemia or inflammation. By encompassing the major histological stages associated with TICO, this multi-class framework establishes a foundation for future studies on pathology-informed decision support.
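The geometric portion of such a pipeline can be illustrated with plain Python on a nested-list image; the actual pipeline uses Albumentations, so this is only a behavioral sketch, and the photometric transforms (brightness, jitter, blur, noise) are omitted.

```python
import random

def hflip(img):
    """Mirror each row (horizontal flip)."""
    return [row[::-1] for row in img]

def vflip(img):
    """Reverse row order (vertical flip)."""
    return img[::-1]

def rot90(img):
    """Rotate 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]

def augment(img, rng):
    """Apply each geometric transform independently with probability 0.5,
    mimicking Albumentations' per-transform `p` parameter."""
    for op in (hflip, vflip, rot90):
        if rng.random() < 0.5:
            img = op(img)
    return img
```

Because flips and right-angle rotations do not change tissue morphology, they are label-preserving for all four lesion categories.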
Preprocessing and quality control (QC) of the TICO pathological slide dataset
Following image acquisition, a standardized preprocessing pipeline was implemented to optimize data quality and facilitate deep learning model training: (1) Patch extraction: All WSIs were scanned at 20× magnification and cropped into 224 × 224 patches with a 20% overlap to ensure sufficient context and spatial continuity. (2) Stain normalization: The Macenko method was applied to correct inter-slide and inter-batch staining variability, thereby enhancing color consistency across samples. (3) RGB channel normalization: Each patch was normalized using the channel-wise mean and standard deviation computed from the training set: mean [0.715, 0.529, 0.651] and standard deviation [0.121, 0.098, 0.114]. (4) Color space: All preprocessing steps were performed in the RGB color space without conversion to grayscale, preserving chromatic features critical for histopathological differentiation.
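Step (3) amounts to a per-channel z-score using the reported training-set statistics. A minimal sketch for a single RGB pixel with intensities scaled to [0, 1]:

```python
MEAN = [0.715, 0.529, 0.651]   # training-set channel means (R, G, B)
STD = [0.121, 0.098, 0.114]    # training-set channel standard deviations

def normalize_pixel(rgb):
    """Channel-wise z-score normalization: (value - mean) / std."""
    return [(v - m) / s for v, m, s in zip(rgb, MEAN, STD)]
```

In practice this runs vectorized over whole patches, but the arithmetic per channel is exactly as above; a pixel at the training-set mean maps to zero in every channel.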
To ensure high-quality training data, a two-stage QC protocol was implemented: Stage I (automated QC): Patches with extremely low pixel variance (indicative of low contrast or background regions) were automatically filtered out. Stage II (expert review): Two experienced pathologists manually reviewed the remaining patches, excluding those with significant staining artifacts, motion blur, or focal drift. Approximately 6% of the candidate patches were discarded at this stage. Cases involving annotation discrepancies were subjected to double-blind re-review followed by expert consensus discussions. Additional random audits were conducted to verify annotation consistency and labeling accuracy. Representative examples of the four categorized CIPs are shown in Fig. 1.
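Stage I can be approximated by a simple variance filter on grey-level intensities; the threshold below is illustrative, not the value used in the study.

```python
from statistics import pvariance

def passes_variance_qc(patch_pixels, threshold=0.001):
    """Stage I automated QC: reject near-uniform patches (background or
    low-contrast regions) whose grey-level variance is below a threshold."""
    return pvariance(patch_pixels) >= threshold

blank = [0.95] * 100                 # near-white background patch
tissue = [0.2, 0.8, 0.4, 0.6] * 25   # textured tissue patch
```

Patches surviving this filter then proceed to the Stage II expert review described above.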
Fig. 1 [Images not available. See PDF.]
Representative cell samples with different interaction patterns. (A)–(D) show normal colonic mucosa, serrated lesions, adenomas, and adenocarcinomas, respectively. The figure illustrates differences in glandular structures, cell arrangements, and nuclear density, aiding in understanding the evolution of cell interaction patterns features across disease stages.
As depicted, the epithelial CIPs in serrated lesions, adenomas, and adenocarcinomas exhibit considerable complexity and heterogeneity, reflecting distinct biological mechanisms and disease trajectories. While all lesion types involve abnormal epithelial proliferation and differentiation, they differ significantly in their interaction modalities, structural disruption, and clinical implications. A comparative overview of these patterns is provided in Table 1.
Recognition model for CIP analysis in TICO using a Mamba-enhanced residual network
We developed a hybrid architecture that integrates ResNet-34 with the Mamba module, combining the strengths of residual convolutional networks for multiscale feature extraction and skip connections28,29 with the temporal modeling capabilities of Mamba bidirectional SSMs30,31, as illustrated in Fig. 2.
Fig. 2 [Images not available. See PDF.]
Recognition model for tumor-induced colorectal obstruction cell interaction patterns based on a Mamba-enhanced residual network. The model consists of convolution–downsampling blocks (Stages 1–2) and Mamba–MLP modules (Stages 3–4), with Softmax producing four-class probability outputs.
Specifically, the network comprises four sequential encoding stages: Stages 1–2: Employ ResNet-34 convolutional downsampling modules to extract local textures and multiscale structural features. Each convolutional block is followed by batch normalization and ReLU activation, with residual skip connections to facilitate gradient flow and mitigate vanishing gradients. Stages 3–4: Replace convolutional blocks with Mamba modules, which incorporate internal MLP mixing layers and bidirectional state-space units to model the spatiotemporal dynamics of cell interactions. SiLU activation functions are used to enhance nonlinear representational capacity.
The output feature maps from Stage 4 are passed through global average pooling, yielding a fixed-length feature vector. This vector is fed into a lightweight, two-layer MLP head (256-dimensional, GELU activation), followed by a Softmax layer that outputs class probabilities across the four lesion categories.
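To make the spatial bookkeeping concrete, the sketch below traces feature-map sizes through an assumed stride-4 stem followed by four stages that each halve resolution; the specific strides and channel widths (64–512) are illustrative assumptions, not the paper's exact configuration.

```python
def stage_shapes(size=224, stem_stride=4, n_stages=4,
                 channels=(64, 128, 256, 512)):
    """Track spatial size through an assumed stem and four encoding stages,
    then report the length of the vector produced by global average
    pooling (one value per final-stage channel)."""
    shapes = []
    s = size // stem_stride
    for stage, c in zip(range(1, n_stages + 1), channels):
        shapes.append((stage, c, s))   # (stage index, channels, H = W)
        s //= 2
    gap_len = channels[-1]
    return shapes, gap_len
```

Under these assumptions a 224 × 224 patch leaves Stage 4 as a 7 × 7 map, and pooling collapses it to a fixed-length vector regardless of input tiling.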
This hybrid design leverages the convolutional and downsampling operations in the initial stages to reduce computational overhead while preserving critical spatial features of TICO CIPs. The model consists of approximately 22 million parameters—about 74% fewer than ViT-Base—and achieves an average inference time of 30–35 ms per 224 × 224 patch on an NVIDIA RTX 3090 GPU, indicating strong efficiency. Downsampling enlarges the receptive field, enhances abstraction, suppresses noise, and improves translation invariance18,32.
Convolutional feature extraction: The convolutional layers use sliding kernels to capture local image features. In the context of TICO, these kernels are designed to extract multiscale features, such as cell morphology, intercellular spacing, cell density, and tissue organization patterns. As convolutional layers are stacked, the network learns increasingly abstract representations, progressing from low-level edges and textures to higher-order features like tissue architecture and glandular deformation. These features reflect biological processes, such as abnormal proliferation, apoptosis, migration, and extracellular matrix remodeling, that underpin variations in CIPs. Custom kernel sizes and tuning are employed to enhance sensitivity to TICO-specific histological features.
Downsampling: Downsampling reduces spatial resolution while retaining critical information, improving computational efficiency and model robustness. In this context, it aids in eliminating redundancy, increasing abstraction, and capturing larger-scale interaction patterns without losing diagnostic relevance.
Meanwhile, the latter two encoders were primarily composed of Mamba modules and MLP linear mappings, with the architecture of the Mamba module illustrated in Fig. 3.
Fig. 3 [Images not available. See PDF.]
Visualization of the Mamba module. “Linear” denotes linear mapping, and σ represents the SiLU activation function. SiLU, with its smoothness, non-monotonicity, and self-gating properties, enhances model performance46.
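As noted in the caption, SiLU is simply x·sigmoid(x); a minimal sketch (illustrative only) shows the smoothness, self-gating, and non-monotonic behavior that distinguish it from ReLU:

```python
import math

def silu(x):
    """SiLU (swish): x * sigmoid(x). Smooth and self-gating; unlike ReLU it
    dips slightly below zero for moderately negative inputs."""
    return x / (1.0 + math.exp(-x))

# Non-monotonicity: silu(-5) is closer to zero than silu(-1), even though
# -5 < -1; for large positive x, silu(x) approaches x.
values = [silu(x) for x in (-5.0, -1.0, 0.0, 5.0)]
```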
The Mamba module, as an efficient bidirectional SSM, provides a novel tool for investigating the complex cellular interaction patterns in TICO. Its distinctive bidirectional modeling mechanism enables a more comprehensive capture of the dynamic variations in cellular interactions and facilitates the extraction of richer feature representations. This, in turn, contributes to improved discriminative performance and interpretability in TICO histopathological classification. This study focuses specifically on four-class histopathological classification of TICO and does not address prognosis prediction or treatment outcome assessment. The computational mechanism of the SSM within the Mamba module is illustrated in Fig. 4.
Fig. 4 [Images not available. See PDF.]
Visualization of the SSM computational mechanism.
The core of the SSM lies in its bidirectional state space modeling. This framework consists of two primary components: (1) Forward Model (forward propagation in the figure): This model predicts the current state based on past states and observed variables. Within the Mamba module, it enables the extraction of temporal evolution patterns of cellular interactions from pathological image sequences. In the context of static histopathological images, forward modeling is adapted to capture local neighborhood relationships, extracting morphological cues such as glandular continuity, gradients of cellular density, and nuclear atypia to characterize spatial interaction patterns. (2) Backward Model (backward propagation in the figure): This model infers the current state from future states and observed variables. It allows the Mamba module to capture potential future trends in cellular interaction patterns. In static pathology settings, backward modeling complements long-range spatial dependencies and ensures contextual consistency, thereby enhancing the representation of structural features such as disrupted glandular boundaries and stromal remodeling. The integration of both forward and backward processes allows the model to encode temporal dependencies in a bidirectional manner, enriching its representational depth and enhancing diagnostic power33.
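The forward/backward integration described above can be illustrated with a minimal scalar state-space sketch. The constants a and b below are illustrative stand-ins for Mamba’s learned, input-dependent state matrices; the point is only that each position aggregates context from both directions:

```python
def ssm_scan(x, a=0.5, b=1.0):
    """Linear state-space recurrence h_t = a*h_{t-1} + b*x_t (forward pass)."""
    h, out = 0.0, []
    for xt in x:
        h = a * h + b * xt
        out.append(h)
    return out

def bidirectional_ssm(x, a=0.5, b=1.0):
    """Sum of a forward scan and a backward scan, so every position sees
    context from both ends of the sequence."""
    fwd = ssm_scan(x, a, b)
    bwd = ssm_scan(x[::-1], a, b)[::-1]
    return [f + r for f, r in zip(fwd, bwd)]

seq = [1.0, 0.0, 0.0, 2.0]
out = bidirectional_ssm(seq)   # position 0 now also reflects the late pulse
```

For a palindromic input the bidirectional output is itself palindromic, which is one quick sanity check that neither direction dominates.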
Furthermore, information propagation within the SSM mechanism is implemented through convolutional operations with varying kernel sizes, where each kernel size corresponds to a different receptive field. The receptive field refers to the area of the input image that a convolutional kernel can “see.” Smaller kernels (e.g., 1 × 1) have a limited receptive field and are particularly effective at capturing fine-grained local details such as individual cell morphology, texture, and staining intensity. In contrast, larger kernels (e.g., 3 × 3) have broader receptive fields, allowing them, especially when stacked across layers, to capture wider contextual information, including the spatial distribution of cell populations, intercellular interactions, and tissue architecture. When applied to the analysis of cellular interaction patterns in TICO histopathology, these convolutional modules play complementary roles: small kernels capture local detail, while larger kernels capture surrounding context. A comprehensive understanding of complex cellular interaction networks therefore requires the combination of both. Designing effective multi-scale feature extraction strategies and efficiently integrating features across different scales is thus a critical step in constructing high-performance models for TICO pathology image analysis34.
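The complementary receptive fields can be made concrete with a one-dimensional sketch (signal and kernels are illustrative, not taken from the model):

```python
def conv1d(x, kernel):
    """'Valid' 1-D convolution (cross-correlation) with a sliding kernel."""
    k = len(kernel)
    return [sum(kernel[j] * x[i + j] for j in range(k))
            for i in range(len(x) - k + 1)]

signal = [0.0, 1.0, 4.0, 1.0, 0.0]
# A width-1 kernel sees one input per output (fine local detail preserved);
# a width-3 averaging kernel sees three inputs per output (wider context,
# smoothed detail).
point  = conv1d(signal, [1.0])
smooth = conv1d(signal, [1/3, 1/3, 1/3])
```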
Ablation study design and hyperparameter optimization
A two-stage strategy was employed for hyperparameter optimization. First, a coarse random search was performed over the training set to identify promising configurations. This was followed by fine-grained local grid tuning on the top 10 candidates. Final hyperparameter selection was based on the mean macro-AUC across five-fold cross-validation. The optimal configuration was as follows: initial learning rate = 3 × 10⁻⁴ (with cosine annealing after five warm-up epochs); optimizer = AdamW (β₁ = 0.9, β₂ = 0.999, weight decay = 1 × 10⁻⁵); batch size = 32; maximum epochs = 100 (with early stopping after 15 validation epochs without AUC improvement); and dropout rate = 0.5 in the MLP head.
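The warm-up plus cosine-annealing schedule above can be sketched as a function of the epoch index. This is a simplified form that anneals to zero; real implementations (e.g., PyTorch’s CosineAnnealingLR) may use a nonzero floor, and the constants simply mirror the reported configuration:

```python
import math

def lr_at(epoch, base_lr=3e-4, warmup=5, max_epochs=100):
    """Linear warm-up for `warmup` epochs, then cosine annealing toward 0."""
    if epoch < warmup:
        return base_lr * (epoch + 1) / warmup   # ramp up to base_lr
    t = (epoch - warmup) / (max_epochs - warmup)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * t))

schedule = [lr_at(e) for e in range(100)]
```

Warm-up avoids large, destabilizing updates while batch-norm statistics and AdamW moment estimates are still settling; the cosine tail then decays smoothly instead of in abrupt steps.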
To evaluate the contribution of each model component in fine-grained TICO classification, a closed-loop ablation framework was constructed with four mutually exclusive branches: (i) Baseline ResNet-34: maintaining the original residual convolutional architecture; (ii) ResNet-34 + Mamba: inserting bidirectional Mamba state-space units at the end of each stage while omitting the MLP head, to quantify Mamba’s contribution to convolutional semantics; (iii) Pure Mamba: replacing convolutions and residuals entirely with a scale-matched Mamba stack, to test the necessity of convolution in capturing multiscale glandular textures; (iv) Hybrid (ResNet + Mamba + Light-MLP): the proposed full hybrid model, with a two-layer 256-dimensional fully connected head and GELU activation. All variants were constrained to an approximately equal parameter scale (~ 22 M) to control for model capacity and ensure fair comparison.
In terms of data and training pipelines, all four models shared unified hyperparameters: initial learning rate = 3 × 10⁻⁴ (cosine annealing after 5 warm-up epochs); optimizer AdamW (β₁ = 0.9, β₂ = 0.999, weight decay = 1 × 10⁻⁵); batch size = 32; maximum epochs = 100 (early stopping after 15 validation epochs without AUC gain); and dropout = 0.5 in the MLP head. Input slides were patch-sampled at 2048 × 2048 and resized to 224 × 224 resolution. Data augmentations included random horizontal/vertical flips, HSV jitter, and Macenko stain normalization to mitigate center drift and batch effects. Experiments were conducted on a single NVIDIA RTX 3090 GPU (24 GB, CUDA 12.1, PyTorch 2.0.0). Each cross-validation fold took approximately 6 h to train, with an average per-epoch runtime of 216 s. Inference time was ~ 32 ms per 224 × 224 patch. Reproducibility was ensured by setting a fixed random seed = 42, enabling cudnn.deterministic = True and disabling cudnn.benchmark; and managing code and dependencies via Git with version control. Weight initialization employed He-Normal for convolutional layers and Xavier-Uniform for fully connected layers.
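The early-stopping rule used above (15 validation epochs without AUC gain) can be sketched as follows; the function and variable names are illustrative, not the study’s code:

```python
def best_epoch_with_early_stopping(val_aucs, patience=15):
    """Return (epoch, AUC) of the best validation AUC, stopping once
    `patience` consecutive epochs pass without improvement."""
    best_auc, best_epoch, stale = float("-inf"), -1, 0
    for epoch, auc in enumerate(val_aucs):
        if auc > best_auc:
            best_auc, best_epoch, stale = auc, epoch, 0
        else:
            stale += 1
            if stale >= patience:
                break   # training halts; the best checkpoint is kept
    return best_epoch, best_auc

# Illustrative AUC trace: improves for 3 epochs, then plateaus.
best_epoch, best_auc = best_epoch_with_early_stopping(
    [0.50, 0.60, 0.70] + [0.65] * 20)
```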
A patient-level five-fold cross-validation protocol was adopted to prevent data leakage, ensuring no overlap of patients between training and test sets. Each fold was trained independently, and the best checkpoint (based on validation AUC) was evaluated on the corresponding test fold. Performance metrics included: Accuracy, Macro-F1, and ROC-AUC. Predictions from each fold were bootstrapped 1,000 times to estimate 95% confidence intervals. Pairwise AUC comparisons were performed using the DeLong test; Accuracy and F1 were assessed via McNemar’s test and paired t-tests. For multiple comparisons, Holm-Bonferroni correction was applied with significance defined as P < 0.05. This comprehensive statistical framework provided quantifiable and reproducible evidence for the independent and interactive contributions of the convolutional branch, Mamba units, and lightweight MLP head in classification accuracy, recall, and discriminative ability, supporting interpretability and future model evolution.
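The percentile-bootstrap confidence interval used above can be sketched over per-sample 0/1 correctness outcomes; this is a simplified illustration, not the study’s exact resampling code:

```python
import random

def bootstrap_ci(correct, n_boot=1000, alpha=0.05, seed=42):
    """Percentile bootstrap CI for accuracy over per-sample 0/1 outcomes:
    resample with replacement, recompute the metric, take percentiles."""
    rng = random.Random(seed)
    n = len(correct)
    stats = sorted(sum(rng.choices(correct, k=n)) / n for _ in range(n_boot))
    lo = stats[int(n_boot * alpha / 2)]
    hi = stats[int(n_boot * (1 - alpha / 2)) - 1]
    return lo, hi

outcomes = [1] * 85 + [0] * 15        # e.g. 85/100 patches correct
lo, hi = bootstrap_ci(outcomes)       # 95% CI bracketing 0.85
```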
Additionally, sensitivity analyses were conducted to assess model performance under variations in hyperparameters, data subsets, and preprocessing conditions. Specifically, we systematically varied the learning rate (1 × 10⁻⁵, 1 × 10⁻⁴, 1 × 10⁻³), batch size (16, 32, 64), and dropout rates (0.3, 0.5, 0.7), recording Accuracy, Macro-F1, and ROC-AUC. Results showed stable performance across a wide hyperparameter range, with accuracy fluctuations not exceeding 3.5%, demonstrating robustness. Data subset sensitivity analyses indicated mild fluctuations (2%–4%) in accuracy for minority classes (e.g., serrated lesions and adenomas), suggesting directions for future data augmentation. Testing different preprocessing strategies (e.g., rotations, flips, color jittering) and stain normalization methods (Macenko, Reinhard) revealed no significant performance differences (P > 0.05), confirming the rationality of the preprocessing pipeline and the stability of results.
Data analysis
In this study, we employed a variety of software tools and statistical methods to construct and evaluate the discriminative performance of the four-class classification model for TICO histopathological sections. All programming and data processing were performed in Python within a Jupyter Notebook environment. Data visualizations were generated using Microsoft Visio, Pandas, and Matplotlib. All statistical analyses and figure generation were performed within the same computational environment to ensure reproducibility.
The statistical methods employed in this study included the following: (1) Data preprocessing: normalization and Z-score standardization. (2) Performance evaluation: per-class accuracy, sensitivity, specificity, precision/positive predictive value (PPV), negative predictive value (NPV), F1-score, and macro-/micro-averaged ROC-AUC. 95% confidence intervals (95% CI) were estimated using 1,000 bootstrap replicates. (3) Group comparisons: one-way ANOVA was applied to test differences in major metrics across the four classes. (4) AUC significance testing: DeLong’s test was used to compare the AUCs between the proposed model and baseline models. (5) Agreement analysis: Cohen’s κ was reported to evaluate agreement between ground truth and predictions, with weighted loss functions applied to address class imbalance. (6) Model diagnostics and interpretability: learning curves, confusion matrices, and Grad-CAM visualizations were generated to assist in bias diagnosis and interpretability analysis. (7) Class imbalance and label noise assessment: quantitative analysis of class proportions and estimated noise rates was performed, with mitigation strategies including data augmentation (color perturbation, random cropping, random rotation) and stratified sampling. (8) CIP–pathology correlation: two pathologists independently performed visual scoring of four CIP indices using a three-level ordinal scale (0–2), following WHO-2022 guidelines. Inter-observer agreement was evaluated using weighted Cohen’s κ and Fleiss’ κ (three readers). Associations between AI-quantified CIP features and pathology scores/histological categories were tested using Spearman’s correlation and ordinal logistic regression, with results reported as OR_perSD and 95% CI. 
(9) Quantitative interpretability metrics: expert-delineated glandular/epithelial masks were used as reference to assess spatial overlap between Grad-CAM heatmaps and epithelial regions, quantified by intersection over union (IoU) and Hausdorff distance.
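The IoU overlap metric in point (9) reduces to a simple ratio; a minimal sketch over flattened binary masks (toy values, not study data):

```python
def iou(mask_a, mask_b):
    """Intersection over union of two binary masks (flattened 0/1 lists)."""
    inter = sum(a & b for a, b in zip(mask_a, mask_b))
    union = sum(a | b for a, b in zip(mask_a, mask_b))
    return inter / union if union else 1.0

# Toy Grad-CAM hot region vs. an expert-delineated epithelial mask:
heatmap_mask    = [1, 1, 1, 0, 0, 0]
epithelial_mask = [0, 1, 1, 1, 0, 0]
score = iou(heatmap_mask, epithelial_mask)   # 2 overlapping / 4 in union
```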
In the preprocessing stage, image data were first normalized to eliminate scale differences across features. In addition, all images were resized to a uniform resolution, which facilitated subsequent model training and improved diagnostic performance.
During model evaluation, we retrospectively examined each feature layer and applied Grad-CAM heatmap visualization to further analyze the diagnostic performance of the model and identify areas for improvement. Feature maps were also extracted from individual encoders to provide deeper insight into model representation. Moreover, the evolution of the loss function values was monitored throughout training. The loss curves served as a critical tool for evaluating training effectiveness, diagnosing model limitations, and guiding refinement. By analyzing curve shape, trends, and the differences between training and validation losses, we were able to better understand the learning process and ultimately select the model with optimal performance.
To facilitate data visualization, the Matplotlib package was employed for intuitive display of arrays, enabling spatial exploration of sample characteristics and model performance. This visualization not only helped in understanding the distribution of the data but also provided a direct reflection of model behavior under varying conditions. Such statistical analyses, combined with visualization strategies, allowed for a more comprehensive validation of model effectiveness and ensured reliability in practical applications.
Results
Quantitative analysis of model training for TICO CIP recognition
The performance of the proposed model for recognizing CIPs in TICO pathological slides was quantitatively assessed by tracking loss values and classification accuracy on both the training and validation sets throughout the training process. As illustrated in Fig. 5, the training and validation losses consistently decreased across epochs and eventually converged to similar values. This convergence pattern indicates that the model avoided overfitting and exhibited strong generalization ability.
Fig. 5 [Images not available. See PDF.]
Quantitative Analysis of the Recognition Model Training Process. (A) Trends of training and validation loss values, showing good convergence in both sets without evidence of overfitting. (B) Accuracy trends for training and validation sets, with validation accuracy stabilizing around 85%, indicating strong generalization and stability.
Specifically, the training loss declined from 0.80 to 0.55, demonstrating the model’s ability to learn meaningful representations and substantially reduce prediction error over time. The consistent downward trend confirms the model’s effective optimization and learning capacity. In terms of classification accuracy, the model reached approximately 0.88 on the training set and 0.85 on the validation set; this small gap of roughly three percentage points further suggests robust generalization and the absence of severe overfitting. Both accuracy curves started from low baselines and converged rapidly, reflecting the appropriateness of the architectural design, hyperparameter settings, and choice of optimizer.
Furthermore, performance was validated through five-fold cross-validation on the Chaoyang dataset, yielding consistent results: mean accuracy = 0.846 (95% CI 0.832–0.861), precision = 0.842 (95% CI 0.827–0.856), recall = 0.845 (95% CI 0.830–0.860), and F1-score = 0.843 (95% CI 0.829–0.857). Standard deviations across folds were all < 0.01, indicating high stability and reproducibility of the model across internal validation subsets (Table 3). At the same time, we employed the McNemar test and paired t-test to compare the predictions of our model with those of the baseline ResNet-34. Improvements in both accuracy and macro-F1 were statistically significant (P < 0.05, adjusted by the Holm–Bonferroni method), demonstrating that the model enhancements were supported by statistical evidence.
Table 3. Performance of the Mamba-ResNet model in five-fold cross-validation on the Chaoyang dataset.
| Fold | Accuracy | Precision | Recall | F1-score | ROC-AUC |
|---|---|---|---|---|---|
| 1 | 0.842 | 0.838 | 0.841 | 0.839 | 0.903 |
| 2 | 0.850 | 0.846 | 0.848 | 0.847 | 0.907 |
| 3 | 0.851 | 0.847 | 0.850 | 0.848 | 0.906 |
| 4 | 0.845 | 0.842 | 0.846 | 0.844 | 0.905 |
| 5 | 0.846 | 0.843 | 0.846 | 0.845 | 0.904 |
| Mean ± SD | 0.847 ± 0.004 | 0.843 ± 0.004 | 0.846 ± 0.003 | 0.845 ± 0.004 | 0.905 ± 0.002 |
Evaluation metrics (Accuracy, Precision, Recall, F1-score, and ROC-AUC) are macro-averaged over the four pathological classes. ROC-AUC per fold was calculated from probabilistic outputs and averaged. Results demonstrate consistent performance across folds, with low variance (SD < 0.01), supporting model robustness and generalizability.
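The McNemar comparison reported above uses only the discordant prediction pairs between two models on the same test items; a minimal sketch with the common continuity correction follows (the counts are illustrative, not the study’s data):

```python
def mcnemar_statistic(b, c):
    """Continuity-corrected McNemar chi-square from the two discordant
    counts: b = model correct / baseline wrong, c = the reverse.
    Compare against the chi-square(1 df) critical value 3.84 for P < 0.05."""
    if b + c == 0:
        return 0.0
    return (abs(b - c) - 1) ** 2 / (b + c)

# Hypothetical counts: 40 patches only the hybrid model got right,
# 18 patches only the baseline got right.
stat = mcnemar_statistic(40, 18)
```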
Results analysis of the TICO CIP recognition model
The performance of the TICO cell interaction recognition model was further examined through confusion matrix analysis and comparisons between continuous AI prediction scores and discrete ground truth labels, as illustrated in Fig. 6.
Fig. 6 [Images not available. See PDF.]
Visualization of Model AI Scores vs. Ground Truth Scores. (A) Confusion matrix for the four-class task, showing the highest recall (88%) for adenocarcinoma (Class 3), while serrated lesions and adenomas had relatively lower accuracy, suggesting that inter-class feature similarity affects discriminability. (B) Comparison of continuous AI prediction scores with discrete ground truth labels, showing stable predictions for most samples but misclassifications near class boundaries.
As shown in Fig. 6A, clear performance differences were observed across categories in the four-class model. Adenocarcinoma (Class 3) achieved the highest recall (88%), suggesting that its features are relatively more distinctive and the model demonstrates strong discriminative ability for this subtype. In contrast, normal mucosa, serrated lesions, and adenomas (Classes 0–2) showed comparatively lower accuracy (78%–83%), indicating a certain degree of misclassification among these categories. Such errors primarily stem from morphological similarities between adjacent categories, such as overlapping glandular structures or comparable cellular density. In addition, the mild imbalance in sample size may have further compounded the difficulty of recognizing minority classes. To address these issues, class-weighted Focal Loss and data augmentation strategies were incorporated in the methodology, and the misclassification causes and potential improvements are further examined in the Discussion section.
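Class-weighted Focal Loss down-weights easy, confidently classified samples through a (1 − p)^γ modulating factor, so rare or borderline classes dominate the gradient. A minimal per-sample sketch (α, γ, and the probabilities are illustrative):

```python
import math

def focal_loss(p, alpha=1.0, gamma=2.0):
    """Focal loss for the true-class probability p. For gamma = 0 this
    reduces to weighted cross-entropy; larger gamma suppresses easy samples
    more strongly."""
    return -alpha * (1.0 - p) ** gamma * math.log(p)

easy = focal_loss(0.9)   # confident correct prediction: near-zero loss
hard = focal_loss(0.3)   # borderline sample: orders of magnitude larger
```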
Furthermore, in benchmark comparisons with two recently proposed gastrointestinal pathology classification models, our method also demonstrated superior performance. Specifically, we reproduced the Swin-UNet model proposed by Sharma et al.35 and the ViT-HC model proposed by Oh et al.36 according to their original methods. On the same test set, Swin-UNet achieved an overall accuracy/F1-score of 0.811/0.804, and ViT-HC achieved 0.823/0.819, whereas our model attained 0.850/0.843. For the adenocarcinoma category in particular, our model reached a recall of 0.882, outperforming Swin-UNet (0.836) and ViT-HC (0.847). Subsequently, we analyzed the differences between the continuous AI scores generated by our model and the discrete ground truth scores, as illustrated in Fig. 6B. Although the model outputs probabilistic scores, the nonlinearity and biological complexity of underlying CIPs led to occasional misclassifications. Nevertheless, most samples were tightly clustered around their expected class score ranges—for example, Class 1 predictions were predominantly concentrated in the 0.5–1.5 score range—demonstrating that the model learned to approximate class boundaries effectively. Misclassified samples tended to occur near decision thresholds, indicating lower inter-class discriminability in borderline cases. Future research will focus on expanding the dataset size, further refining the Mamba-based model architecture, and integrating additional imaging modalities or clinical indicators to enhance diagnostic accuracy and improve differentiation between lesion subtypes in TICO. In addition, a four-class blinded reader study was conducted on the independent Chaoyang-TICO test set with three pathologists, each having more than five years of diagnostic experience. The performance of the AI model showed no significant differences from the best-performing reader in terms of overall accuracy, macro-F1, and AUC, and achieved equivalent discriminative ability in the “ADC vs. Others” comparison. A multicenter reader study will be pursued in the next phase of this work.
Qualitative analysis of the TICO cellular interaction pattern recognition model based on Grad-CAM
This study further investigated the interpretability of the Mamba-based model in recognizing CIPs in TICO, with a particular focus on the role of Grad-CAM visualizations, as illustrated in Fig. 7.
Fig. 7 [Images not available. See PDF.]
Visualization of Features Extracted by Various Layers in the Recognition Model. (A), (B), (C), and (D) represent Grad-CAM heatmaps for randomly selected samples with cell interaction patterns labeled 0–3. These heatmaps highlight the regions the model focuses on during diagnosis, offering deeper insights into its diagnostic capabilities.
The Mamba architecture effectively captures the spatiotemporal dynamics of CIPs through its bidirectional SSM capabilities. However, understanding how the model achieves its performance improvements requires a detailed examination of internal feature representations and decision-making behavior. Grad-CAM produces class activation maps (CAMs) by computing the gradient of the output class score with respect to convolutional feature maps, offering an intuitive visualization of the regions most influential in the model’s prediction. (1) Interpretability of Learned Features: Grad-CAM visualizations in Fig. 7 reveal the critical image regions that the Mamba model attends to when classifying different lesion categories. These attention maps correspond to distinct histopathological features, allowing qualitative assessment of the model’s learned representations. Normal mucosa (Class 0): CAMs highlight regions with well-organized glandular structures and uniformly spaced nuclei, consistent with normal tissue architecture. Serrated lesions (Class 1): the model focuses on serrated epithelial contours and regions of mildly irregular cellular arrangement. Adenomas (Class 2): attention maps emphasize densely packed glands, increased nuclear crowding, and disrupted glandular morphology. Adenocarcinomas (Class 3): heatmaps focus on invasive regions characterized by disrupted glandular structures and severe nuclear disarray. Together, these visualizations demonstrate that the model’s attention maps are highly consistent with established pathological features—such as glandular morphology, nuclear density, and the degree of cellular disorganization—thus providing both interpretability and pathological readability of the model’s decision process. Compared with conventional CNNs, the Mamba model attends to a broader spatial context and more complex spatiotemporal interactions, showcasing the advantage of its bidirectional SSM structure.
(2) Robustness Across Lesion Types: By comparing CAMs across different lesion types, the model’s robustness in recognizing diverse CIPs was qualitatively evaluated. As shown in Fig. 7, the model’s attention maps differ markedly between categories and consistently highlight biologically meaningful structures. This reflects the model’s capacity to adaptively learn and generalize to varying degrees of lesion complexity and interaction pattern variability. (3) Visualization of Feature Representations Across Encoding Stages: To further understand the hierarchical feature learning process, feature maps from all four encoder stages were extracted and visualized for randomly selected samples, as shown in Fig. 8. From the feature maps shown in Fig. 8B–E, the representations of the four stages can be mapped to specific pathological characteristics. Stage 1 convolutional blocks primarily enhance nuclear contours and cytoplasm–nucleus contrast; the high-response regions directly correspond to CIP indicator (1), minimum inter-nuclear distance of epithelial cells and its coefficient of variation, which quantifies nuclear crowding. Stage 2 features emphasize glandular lumen boundaries, basement membranes, and goblet cell vacuoles; the highlighted regions align with CIP indicator (3), gland curvature and branching number, reflecting serration or villous transformation. Entering Stage 3, Mamba units capture long-range spatial dependencies, markedly increasing sensitivity to glandular orientation and local disruptions; the activated pixels coincide with discontinuity at the gland–lamina propria interface (CIP indicator 4), aiding in the recognition of early carcinoma signs such as epithelial penetration of the muscularis mucosae. 
Stage 4 global semantic features focus on stromal remodeling and invasive fronts, with strong responses closely associated with CIP indicator (2), nuclear-to-cytoplasmic density ratio and local crowding index, thereby reflecting stromal reaction and nuclear atypia. This hierarchical mapping demonstrates that the learned representations at each encoder stage are consistent with the grading items of glandular complexity and nuclear atypia described in the WHO Classification of Digestive System Tumours (2022), thereby enhancing both the interpretability and pathological readability of the model37. Further CIP–pathology cross-validation showed good agreement between AI-quantified CIP values and three-level visual scores assigned by two pathologists (Spearman’s ρ = 0.58–0.71, all P < 0.001). Among these, gland curvature/branching number exhibited the strongest correlation with SER/ADE discrimination (ρ = 0.67, OR_perSD = 2.3, 95% CI: 1.6–3.4). Inter-reader reliability was substantial, with weighted κ = 0.74 and three-reader Fleiss’ κ = 0.71. Additionally, Grad-CAM heatmaps achieved a median IoU of 0.60 (IQR: 0.54–0.66) with epithelial masks, indicating quantitatively consistent interpretability. In summary, visualization analyses across the Mamba model’s encoding stages provide deep insights into its internal mechanisms, thereby elucidating the reasons behind its superior performance in recognizing cellular interaction patterns in TICO.
Fig. 8 [Images not available. See PDF.]
Visualization of Features Extracted by Various Layers in the Recognition Model. (A) Original input image; (B–E) feature maps extracted from Stages 1–4. The feature maps sequentially highlight nuclear contour enhancement, glandular structure recognition, long-range spatial dependency capture, and stromal remodeling. Outputs from each stage correspond one-to-one with the four cell interaction pattern indices, thereby enhancing model interpretability.
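The Grad-CAM computation behind Figs. 7 and 8 weights each activation channel by the spatial mean of its class-score gradient and applies a ReLU to the weighted sum. A minimal sketch on toy arrays follows; the activations and gradients below are fabricated for illustration, and real pipelines obtain them from a backward pass through the network:

```python
def grad_cam(activations, gradients):
    """Grad-CAM: channel weights = spatial mean of the gradients; the map
    is ReLU of the weighted sum of activation channels."""
    h, w = len(activations[0]), len(activations[0][0])
    weights = [sum(sum(row) for row in g) / (h * w) for g in gradients]
    cam = [[max(0.0, sum(wk * activations[k][i][j]
                         for k, wk in enumerate(weights)))
            for j in range(w)] for i in range(h)]
    return cam

acts  = [[[1.0, 0.0], [0.0, 2.0]],      # channel 0 activations (2x2)
         [[0.0, 1.0], [1.0, 0.0]]]      # channel 1 activations
grads = [[[1.0, 1.0], [1.0, 1.0]],      # channel 0: mean gradient = +1.0
         [[-1.0, -1.0], [-1.0, -1.0]]]  # channel 1: mean gradient = -1.0
cam = grad_cam(acts, grads)             # negative-evidence pixels are zeroed
```

The ReLU is what makes the map class-discriminative: pixels whose evidence argues against the target class are clipped to zero rather than shown as negative heat.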
Ablation study results and Multi-Architecture comparison
To evaluate the contributions of each architectural component, four mutually exclusive model configurations were tested using five-fold cross-validation, with all models constrained to approximately 22 million parameters and trained under identical hyperparameter and augmentation settings to ensure fair comparison (Table 4): Baseline ResNet-34; ResNet-34 + Mamba (bidirectional state-space units embedded at the end of each stage); Pure-Mamba (convolutional residual blocks fully replaced by a size-matched Mamba stack); Hybrid (ResNet + Mamba + two-layer 256-dim MLP head).
Table 4. Training hyper-parameters (Hybrid model).
| Hyper-parameter | Setting |
|---|---|
| Learning rate | 3 × 10⁻⁴ with 5-epoch warm-up + cosine decay |
| Batch size | 32 |
| Weight decay | 1 × 10⁻⁵ |
| Dropout | 0.50 (classification head) |
| Max epochs | 100 (early-stop = 15 epochs) |
| LR scheduler | CosineAnnealingLR |
All runs use the same random seed (42), mixed-precision FP16 training on a single NVIDIA RTX 3090 24 GB GPU, and gradient clipping with an L2-norm threshold of 1.0 to stabilize optimization.
The proposed hybrid model achieved the highest performance across all metrics: Accuracy = 0.862 ± 0.005, Macro-F1 = 0.857 ± 0.006, and ROC-AUC = 0.931 ± 0.004. These results represent absolute improvements of 4.9, 5.1, and 5.7 percentage points, respectively, over the Baseline ResNet-34, with Holm-Bonferroni corrected P < 0.01, confirming statistical significance. Removal of the Mamba modules from the ResNet backbone degraded performance (0.828/0.806/0.890), indicating that the absence of long-sequence modeling limits the network’s ability to detect advanced glandular disorganization. The Pure Mamba model maintained a respectable AUC of 0.901, but without convolutional filters it failed to capture fine-grained textures, resulting in lower recall for the Normal (0.743) and Serrated (0.711) classes. These findings underscore the complementary strengths of convolutional and Mamba-based representations in modeling multiscale glandular morphology and spatiotemporal interaction patterns.
For external comparison, five representative mainstream architectures were reproduced on the same Chaoyang test set: Swin-UNet, ViT-HC, ViT-Base, EfficientNet-B3, and ConvNeXt-Tiny. Their Accuracy/Macro-F1 scores were 0.811/0.804, 0.823/0.819, 0.829/0.823, 0.832/0.826, and 0.841/0.835, respectively. As summarized in Table 5, the proposed hybrid model consistently outperformed all external baselines in both accuracy and macro-F1 score. In terms of inference speed, it processed each 224 × 224 patch in 30–35 ms on an NVIDIA RTX 3090 (24 GB) GPU, with a projected improvement to ~ 31 ms on an A100-80GB server. With roughly 74% fewer parameters than ViT-Base, the model offers a favorable trade-off between accuracy, efficiency, computational cost, and interpretability.
Table 5. Performance of competing architectures on the Chaoyang test set.
| Model | Parameters (M) | Avg. inference time per 224 × 224 patch (ms, RTX 3090 24 GB) | Accuracy | Macro-F1 | ROC-AUC |
|---|---|---|---|---|---|
| Swin-UNet | 28.4 | 35 | 0.811 | 0.804 | 0.889 |
| ViT-HC | 42.7 | 30 | 0.823 | 0.819 | 0.901 |
| ViT-Base | 86 | 28 | 0.829 | 0.823 | 0.908 |
| EfficientNet-B3 | 12.1 | 24 | 0.832 | 0.826 | 0.912 |
| ConvNeXt-Tiny | 28 | 27 | 0.841 | 0.835 | 0.919 |
| Baseline ResNet-34 | 21.8 | 29 | 0.813 | 0.806 | 0.874 |
| Swin-Transformer | 29.2 | 33 | 0.834 | 0.828 | 0.914 |
| SimCLR | 23.5 | 31 | 0.821 | 0.815 | 0.905 |
Values are the mean of five cross-validation folds, each evaluated on its held-out test split; standard errors are < 0.006 for all metrics and therefore omitted for clarity. Parameter counts are trainable parameters only, rounded to the nearest 0.1 M. Inference times were measured on a single NVIDIA RTX 3090 24 GB GPU (FP16 mixed precision); latency excludes image I/O and post-processing.
Collectively, the ablation results clearly demonstrate that the synergistic combination of convolutional residual blocks and Mamba SSMs significantly improves the recognition of complex glandular morphologies and cellular interaction dynamics. In addition, the cross-architecture comparison reinforces the technical rationality and scalability of the hybrid design. These findings support the model’s potential as a high-performance, interpretable solution for pathological analysis in emergency clinical settings.
Human–AI comparison and consistency assessment
In the human–AI comparison, a four-class blinded reader study was conducted on the independent Chaoyang-TICO test set with three gastrointestinal pathologists, each having at least five years of diagnostic experience. The reference standard was defined as agreement among ≥ 2 of the 3 experts. Accuracy, macro-F1, and ROC-AUC were calculated at the WSI level, with 95% confidence intervals estimated via 1,000 bootstrap replicates. McNemar’s test was used to compare paired error rates between the AI model and the best-performing reader, while DeLong’s test was applied to compare AUCs for both the overall four-class task and the binary “ADC vs. Others” classification. The results demonstrated that the AI model achieved performance comparable to that of experienced pathologists in terms of overall accuracy, macro-F1, and AUC, with no statistically significant differences. Confidence interval ranges for effect sizes overlapped substantially between AI and human readers, and no systematic decline was observed in the lower bounds of any metric, suggesting stable overall interpretive capability in realistic diagnostic tasks. The evaluation strategy, anchored on expert consensus rather than external priors, ensured fairness and interpretability of the comparison.
In the stratified category analysis, particular attention was given to adenocarcinoma (ADC), the lesion type of greatest clinical relevance. Using a binary perspective of “ADC vs. Others,” we evaluated threshold stability and the interval variability of the positive predictive value (PPV). Within clinically acceptable sensitivity ranges, the AI model achieved AUC and PPV values comparable to those of the best-performing reader, with no deterioration in false-positive control. For normal mucosa (NORM), both AI and human readers maintained high specificity and low false-positive rates, reflecting a “negative triage” advantage in safely ruling out non-neoplastic cases. The main errors were concentrated around borderline samples of serrated lesions (SER) and adenomas (ADE), where declines in macro-F1 were primarily driven by mutual misclassification. This error spectrum is consistent with findings from external independent testing, underscoring that morphological continua, inflammatory background, and glandular architectural complexity remain the principal diagnostic bottlenecks. Overall, these results suggest that AI is well-suited for “high-risk first detection/priority triage” scenarios, while borderline lesions should remain subject to human review to ensure an adequate safety margin.
With respect to human–AI and inter-reader agreement, this study reported the weighted κ values for both AI–consensus and reader–consensus comparisons in the four-class setting, as well as Fleiss’ κ across the three pathologists to reflect overall agreement levels. The results showed that the magnitude of the weighted κ for AI–consensus was comparable to that of the experienced readers, with greater stability observed in the NORM and ADC categories. In contrast, κ values for SER and low-grade ADE were relatively lower due to boundary uncertainties, with trends consistent with those seen in inter-reader agreement. In a subset where interpretation times were recorded, the efficiency of “AI pre-screening + human review” was compared with “independent human reading.” A downward trend in review time was observed with the collaborative approach; however, the robustness of this efficiency gain requires validation in larger prospective cohorts. Taken together, the combined evidence of consistency metrics and efficiency signals suggests that AI, without altering established diagnostic boundaries, can serve as a front-line triage tool within the pathology workflow, complementing human decision-making. A summary of key metrics and statistical test results is provided in Table 6.
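As a concrete illustration of the agreement statistic used above, a quadratically weighted Cohen's κ can be computed directly from two ordinal label vectors. This is a self-contained sketch with hypothetical labels, not the evaluation code used in the study:

```python
def weighted_kappa(rater_a, rater_b, n_classes):
    """Quadratically weighted Cohen's kappa for ordinal labels 0..n_classes-1."""
    n = len(rater_a)
    # observed joint label distribution
    obs = [[0.0] * n_classes for _ in range(n_classes)]
    for x, y in zip(rater_a, rater_b):
        obs[x][y] += 1.0 / n
    # marginal distributions of each rater
    pa = [sum(obs[i]) for i in range(n_classes)]
    pb = [sum(obs[i][j] for i in range(n_classes)) for j in range(n_classes)]
    observed_disagreement = expected_disagreement = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            w = (i - j) ** 2  # quadratic penalty grows with ordinal distance
            observed_disagreement += w * obs[i][j]
            expected_disagreement += w * pa[i] * pb[j]
    if expected_disagreement == 0:
        return 1.0
    return 1.0 - observed_disagreement / expected_disagreement
```

The quadratic weights penalize NORM-vs-ADC confusions far more heavily than adjacent SER-vs-ADE confusions, matching the clinical severity ordering of the four classes.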
Table 6. Head-to-head performance and agreement between AI and pathologists on the Chaoyang-TICO test set.
| Subject/Method | Accuracy (95% CI) | Macro-F1 (95% CI) | Multiclass ROC-AUC (95% CI) | Binary AUC (ADC vs. others) (95% CI) | Weighted κ (95% CI) | Reading time (min) |
|---|---|---|---|---|---|---|
| AI (this study) | 0.86 (0.81–0.90) | 0.83 (0.79–0.87) | 0.92 (0.89–0.95) | 0.95 (0.92–0.97) | 0.78 (0.73–0.83) | NA |
| Reader A | 0.85 (0.80–0.89) | 0.82 (0.78–0.86) | 0.91 (0.88–0.94) | 0.94 (0.91–0.96) | 0.76 (0.71–0.81) | 34 |
| Reader B | 0.82 (0.77–0.86) | 0.79 (0.75–0.83) | 0.89 (0.86–0.92) | 0.93 (0.90–0.95) | 0.72 (0.67–0.78) | 36 |
| Reader C | 0.83 (0.78–0.87) | 0.80 (0.76–0.85) | 0.90 (0.87–0.93) | 0.93 (0.90–0.95) | 0.74 (0.69–0.80) | 38 |
On the Chaoyang-TICO test set, three gastrointestinal pathologists (≥ 5 years’ experience) performed blinded four-class reads (NORM/SER/ADE/ADC) against a reference standard defined by ≥ 2/3 consensus, with no access to AI outputs or peer decisions. Metrics: WSI-level Accuracy, Macro-F1, multi-class ROC-AUC, and binary AUC for “ADC vs. others,” each with 1,000-bootstrap 95% confidence intervals; agreement quantified by weighted Cohen’s κ for AI/reader vs. consensus and Fleiss’ κ across readers. Statistics: paired McNemar tests for error-rate differences and DeLong tests for AUC comparisons, with Holm–Bonferroni adjustment (two-sided α = 0.05); efficiency, when available, compared between “AI triage + human review” and “human-only” using paired t-test or Wilcoxon. Abbreviations: NORM, normal mucosa; SER, serrated lesion; ADE, adenoma; ADC, adenocarcinoma; AUC, area under the ROC; PPV, positive predictive value; WSI, whole-slide image.
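The Holm–Bonferroni adjustment mentioned in the table note is a simple step-down procedure; the sketch below, using hypothetical p-values, illustrates it:

```python
def holm_bonferroni(p_values, alpha=0.05):
    """Holm step-down multiple-testing procedure.

    Returns a list of booleans (reject H0?) in the original input order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # compare the k-th smallest p-value against alpha / (m - k)
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # once one test fails, all larger p-values also fail
    return reject
```

For example, with p-values 0.01, 0.04, and 0.03 at α = 0.05, only the smallest is rejected (0.01 ≤ 0.05/3 ≈ 0.0167, but 0.03 > 0.05/2 = 0.025).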
Clinical implementation and regulatory pathway analysis
For successful real-world deployment, several critical factors must be addressed to ensure the seamless integration, regulatory compliance, and clinical utility of the proposed AI-assisted diagnostic model. The model is designed for compatibility with mainstream digital pathology platforms (e.g., Aperio eSlide Manager, Philips IntelliSite Pathology Solution) through API-based interfaces. This enables automated workflows, including WSI uploading, patch-wise segmentation, real-time classification, and annotated feedback delivery. The system can be embedded directly into existing digital pathology pipelines with minimal workflow disruption, reducing both operational friction and the risk of human error.
In terms of processing capability, the model demonstrates moderate hardware demands. When deployed on a single NVIDIA RTX 3090 GPU, the system can process approximately 250 WSIs per hour. This throughput is sufficient to meet the daily caseload (300–500 slides) of most secondary-level hospitals using a local GPU server. For larger medical centers, cloud-based GPU scaling provides a flexible, cost-effective alternative, reducing the burden of local infrastructure investment.
A web-based interactive platform was developed to facilitate real-time user interaction and clinical integration. The interface incorporates Grad-CAM heatmaps and intuitive visualizations of model attention regions, allowing pathologists to efficiently review and validate AI-generated diagnostic insights. Key functionalities include: automatic highlighting of abnormal or suspicious regions, rapid navigation across regions of interest, and integrated visualization for interpretability and traceability. These features significantly improve diagnostic efficiency and interpretability, making the tool highly practical for clinical use.
From a regulatory perspective, the proposed system qualifies as Software as a Medical Device (SaMD) and must adhere to strict standards for safety, efficacy, and lifecycle management. According to current U.S. FDA guidance, clinical use of such systems requires either a 510(k) premarket notification or De Novo classification. Compliance must also align with the following international standards: ISO 13485 (medical device quality management systems), IEC 62304 (software life cycle processes for medical devices), and CLSI guidelines (for clinical validation and diagnostic accuracy assessment). To this end, a prospective multicenter clinical validation study is planned to rigorously evaluate the system’s diagnostic performance and regulatory compliance across diverse institutional settings.
Preliminary cost-effectiveness analysis suggests that the model’s automated triage functionality could reduce manual slide review time by 30%–40%, resulting in estimated labor cost savings of ~$50–70 per case. Additionally, a cloud-based deployment architecture minimizes initial infrastructure investment and ongoing maintenance. Together, these findings provide a solid foundation for future health economic evaluations and support a clear translational roadmap for clinical adoption of the model.
Clinical translation and validation strategy
To facilitate the transition from laboratory research to clinical application, we propose a three-tiered progressive validation framework comprising a single-center prospective feasibility study, a multicenter real-world evaluation, and a regulatory registration trial aligned with international guidance. This staged approach addresses three critical dimensions—diagnostic consistency, workflow efficiency, and physician trust—ensuring that early evidence generation aligns with the requirements of subsequent regulatory approval and large-scale clinical deployment.
In the initial feasibility phase, we plan to prospectively enroll paraffin-embedded slides from patients with suspected gastrointestinal tumors. AI-assisted diagnoses and on-duty pathologist interpretations will be performed independently in a double-blind setting, using a three-expert consensus diagnosis as the reference standard. Key performance metrics will include Cohen’s κ coefficient, sensitivity, specificity, slide-level area under the ROC curve (AUC), and diagnostic turnaround time, with a predefined efficiency improvement target of at least 30%. In parallel, qualitative user feedback will be collected using a 5-point Likert scale from no fewer than 20 pathologists, assessing interpretability, interface usability, and intention to adopt the system. These data will provide a comprehensive measure of the system’s clinical trustworthiness.
Compared to previous digital pathology studies that largely relied on retrospective public datasets and focused solely on image-level performance metrics, our framework emphasizes prospective enrollment, real-world clinical complexity, and the inclusion of workflow and human-factor endpoints as primary outcomes. This “triple-value” validation strategy—encompassing diagnostic accuracy, operational efficiency, and user acceptability—aims to overcome the limitations of traditional algorithm-centric research and generate evidence that is directly translatable to clinical benefit and cost-effectiveness.
Our internal validation has already shown promising results, with a slide-level AUC of 0.92 on 1,024 external slides and a 35% reduction in average reading time, indicating strong potential utility in high-volume pathology departments. The upcoming prospective study is expected to confirm these findings and guide parameter selection for multicenter regulatory trials.
Nevertheless, some limitations remain. These include potential sample bias due to single-center recruitment, underrepresentation of rare pathological subtypes, and domain shift effects caused by inter-institutional variability in staining and slide scanning protocols. To address these challenges, we plan to incorporate active learning strategies to expand dataset diversity, domain adaptation techniques to improve model robustness, and a structured risk management framework aligned with the regulatory expectations of the National Medical Products Administration (NMPA) and the EU In Vitro Diagnostic Regulation (IVDR). Ultimately, our goal is to achieve CE and FDA registration for the model as Software as a Medical Device (SaMD), thereby maximizing both scientific innovation and clinical impact.
Discussion
Intestinal obstruction is among the most prevalent acute abdominal emergencies in clinical practice and encompasses a wide range of etiologies with intricate underlying pathological mechanisms [38]. Particularly in chronic obstruction and TICO, dysregulation of cellular processes—including proliferation, apoptosis, differentiation, and cell-cell interactions—plays a central role in disease onset and progression [5]. As such, a comprehensive understanding of CIPs within histopathological slides of TICO is critical for characterizing disease-specific tissue remodeling [39], elucidating the mechanisms driving obstruction [6], and informing precision diagnostic and therapeutic strategies [2,4]. Despite advances in diagnostic imaging and histopathological evaluation, current clinical workflows remain heavily dependent on subjective visual interpretation by experienced pathologists. This reliance introduces inter-observer variability and potential inconsistencies, especially in cases involving overlapping or complex lesion subtypes [40]. In such scenarios, diagnostic accuracy and reproducibility are often suboptimal, posing significant challenges to clinical decision-making [41,42].
This study proposes a deep learning model that integrates ResNet with Mamba modules. While the overall framework represents an integration of existing architectures, its innovation lies in the first application of the Mamba bidirectional state-space mechanism to the recognition of cellular interaction features in histopathological sections of TICO. The Mamba module enables modeling of the dynamic evolution of cellular interactions within pathological images, providing both methodological novelty and practical value in the context of this study. By incorporating a bidirectional SSM, the proposed architecture significantly enhances the model’s capacity to capture the spatiotemporal dynamics of cell-cell interactions within complex glandular and stromal environments. This is particularly important given that TICO is characterized by highly heterogeneous pathological changes, where intercellular communication and structural organization vary substantially across different lesion stages [43]. The dynamic evolution of these cellular interaction patterns provides critical diagnostic cues and therapeutic guidance [44]. Conventional pathological approaches primarily rely on the static interpretation of morphological features [45]. In contrast, our model introduces a temporal dimension to the analysis of histopathological features by leveraging the strengths of SSM-based modeling. This enables not only improved accuracy in lesion classification across normal mucosa, serrated lesions, adenomas, and adenocarcinomas, but also offers a refined framework for assessing disease progression and predicting potential malignant transformation. This study represents a preliminary exploration into the analysis of cellular interaction patterns in intestinal obstruction, aiming to address the current gap in systematic modeling within the literature. The work carries theoretical significance as an exploratory effort and demonstrates potential for practical application.
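To make the bidirectional state-space idea tangible, the toy recurrence below scans a 1-D sequence of scalar patch features forward and backward and sums the hidden states, so that every position aggregates context from both directions. This is a didactic stand-in with an illustrative fixed decay parameter, not the Mamba module's selective-scan implementation:

```python
def bidirectional_scan(features, decay=0.9):
    """Toy bidirectional recurrence h_t = decay * h_{t-1} + (1 - decay) * x_t.

    Runs the scan in both directions and sums the two hidden-state sequences,
    so each output position carries context from its left and right neighbors.
    """
    forward, h = [], 0.0
    for x in features:
        h = decay * h + (1 - decay) * x
        forward.append(h)
    backward, h = [], 0.0
    for x in reversed(features):
        h = decay * h + (1 - decay) * x
        backward.append(h)
    backward.reverse()
    return [f + b for f, b in zip(forward, backward)]
```

In the real architecture, the scalar decay is replaced by learned, input-dependent state matrices and the scan runs over high-dimensional feature maps, but the two-directional aggregation principle is the same.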
The deep learning model developed in this study is capable of accurately identifying cellular interaction patterns across different lesion types of TICO, thereby providing an efficient and objective tool for pathological diagnosis. In a digital pathology workflow, the model can be embedded into the LIS/WSI management system as an “AI pre-screening module”: After slide scanning by on-duty technicians, the model performs batch inference in the background on a single NVIDIA RTX 3090 24 GB or an equivalent mid-to-high-end GPU, with a throughput of approximately 250 WSIs per hour (~ 4 fps). The system automatically overlays Grad-CAM heatmaps onto thumbnail images of suspicious slides and flags them in red in the worklist. Pathologists first review the “red-flagged” slides, followed by a rapid scan of the remaining low-risk slides. In cases of model–reader disagreement, a one-click reporting function allows direct feedback within the same interface for subsequent active learning. In internal simulation tests, this workflow was estimated to reduce the initial review time of high-priority slides by 42% in a secondary hospital processing ~ 400 slides per day, while lowering the risk of missed diagnoses. For institutions with relatively limited resources, such as community or secondary hospitals, the system can be deployed locally using an integrated workstation with RTX 3080/3090 GPUs, at a hardware cost of approximately 40,000–60,000 RMB, without requiring additional high-end servers. For large regional centers or research institutions, a containerized cloud-based solution (Docker + Kubernetes with elastic scaling) may be implemented, allowing on-demand GPU allocation to handle peak pre-screening workloads. Both deployment modes have been internally validated with interface adaptation, enabling seamless integration into existing LIS/WSI management systems. 
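The red-flag worklist described above amounts to a priority sort on the model's per-slide score. The sketch below uses hypothetical field names (`slide_id`, `adc_probability`) and an illustrative threshold; it shows the ordering logic only, not the deployed system:

```python
from dataclasses import dataclass

@dataclass
class Slide:
    slide_id: str
    adc_probability: float  # model's adenocarcinoma score in [0, 1]

def triage(slides, red_flag_threshold=0.5):
    """Order the worklist: red-flagged (suspicious) slides first, highest
    model score on top, followed by the remaining low-risk slides."""
    flagged = sorted(
        (s for s in slides if s.adc_probability >= red_flag_threshold),
        key=lambda s: s.adc_probability,
        reverse=True,
    )
    routine = [s for s in slides if s.adc_probability < red_flag_threshold]
    return flagged + routine
```

Pathologists would then review the flagged prefix of the list first, matching the "red-flagged slides first, low-risk scan second" workflow described in the text.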
The model demonstrated particularly strong performance in adenocarcinoma recognition, suggesting robust discriminative capacity for certain lesion types. However, further evaluation is needed to ensure stability and reliability in detecting early-stage lesions. In the future, after completion of multicenter prospective validation, the model’s outputs may be further investigated for potential associations with patient outcomes and follow-up indicators. This study did not conduct prognostic or monitoring-related analyses.
Despite the relatively strong performance of the model on experimental data, several important limitations remain: (1) the dataset was derived from a single center, introducing potential selection bias and limiting external generalizability; (2) both model training and testing were conducted in an offline environment, without validation in real-world clinical workflows; (3) Grad-CAM visualizations were not subjected to blinded review or quantitative assessment by expert pathologists, rendering interpretability conclusions partly subjective; (4) the current model focuses solely on epithelial cellular interactions and does not systematically incorporate other key elements of the tumor microenvironment—such as stroma, vascular structures, and immune cells—that may play equally critical roles in disease progression and diagnosis; and (5) associations between model outputs and postoperative patient outcomes were not evaluated, restricting the prognostic applicability of this work. To address these limitations, we have pre-specified primary and secondary endpoints in the section Clinical translation and validation strategy. Primary endpoints include diagnostic concordance (κ), workflow efficiency (reporting time), and “time-to-first ADC detection.” Secondary endpoints encompass length of hospital stay, complication rates, readmission rates, and ostomy rates, with planned analyses using correlation/regression models and survival methods to assess associations [3]. These validation efforts are intended to provide robust empirical support for clinical translation of the model.
To strengthen transparency, we further introduced an error analysis and failure case review subsection, including representative misclassified images and corresponding Grad-CAM heatmaps. These analyses revealed that most misclassifications occurred between normal tissue, serrated lesions, and adenomas due to overlapping glandular structures and staining variability. Some errors likely arose from annotation noise or inherent ambiguity in pathological features. This detailed review, supplemented by visual examples, helps clarify the limitations of the current model and identify directions for improvement. Finally, we incorporated additional CIP–pathology concordance analyses, achieving good inter-reader agreement (weighted κ = 0.74) and moderate-to-strong correlations with expert scoring (Spearman’s ρ = 0.58–0.71). Nevertheless, the lack of validation linking model outputs to postoperative outcomes remains a critical gap. Future prospective studies, employing correlation/regression and survival analyses, will be essential to strengthen the model’s clinical applicability and prognostic value.
Future studies should expand sample sizes and incorporate multi-center datasets to enhance the model’s generalizability and clinical applicability. Integrating multimodal data—such as molecular biomarkers, radiomic features, and clinical metadata—may further strengthen diagnostic performance by compensating for the limitations of single histopathology images, while also providing stronger evidence for systematic analysis of disease mechanisms. On the methodological side, exploring more advanced network architectures and unsupervised learning strategies could improve recognition of previously unseen lesion types. Combined with real-time detection and the development of portable devices, the model holds promise for deployment as a clinical decision-support tool, thereby contributing to precision diagnosis, treatment planning, and health management in patients with intestinal obstruction.
In summary, this study proposed an interpretable deep learning framework for the automatic classification of TICO histopathology slides. Preliminary evaluation demonstrated the model’s effectiveness in the classification task. Future work will focus on multicenter validation, pathological concordance analysis of CIP indicators, and prognostic modeling, to further establish feasibility and applicability in real-world clinical settings. With continued technological progress, this research has the potential to support earlier diagnosis, individualized therapeutic strategies, and prognostic assessment in colorectal obstruction, thus advancing medical intelligence and offering more effective treatment options for patients.
Acknowledgements
None.
Author contributions
Z.D. and C.Z. conceived and supervised the study. H.M. and P.Z. collected and annotated pathological slide data. Z.J. and J.Y. designed and implemented the deep learning framework. H.M. and Z.J. performed data preprocessing, statistical analysis, and visualization. Z.D. and C.Z. interpreted results and provided clinical insights. All authors contributed to drafting and revising the manuscript, and approved the final version for submission.
Funding
This study was supported by the Anhui Provincial Education Department University Scientific Research Project (Grant No. KJ2021A0767).
Data availability
The datasets generated and analysed during the current study are available from the corresponding author on reasonable request.
Declarations
Competing interests
The authors declare no competing interests.
Ethical statement
This study did not involve human participants, human data, or human tissue, and therefore did not require ethics approval from an institutional review board (IRB) or ethics committee. The research was conducted using publicly available data and/or computational models, which are exempt from ethics approval under the guidelines of the National Institutes of Health (NIH). All analyses were performed in compliance with the principles of the Declaration of Helsinki.
Abbreviations
CIMP: CpG island methylator phenotype
CNN: Convolutional neural network
MMR: Mismatch repair
ResNet: Residual network
SSA/P: Sessile serrated adenoma/polyp
SSM: State-space model
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
1. Yoo, RN; Cho, HM; Kye, BH. Management of obstructive colon cancer: current status, obstacles, and future directions. World J. Gastrointest. Oncol.; 2021; 13, pp. 1850-1862. [DOI: https://dx.doi.org/10.4251/wjgo.v13.i12.1850] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/35070029][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8713324]
2. Wang, J et al. Novel mechanisms and clinical trial endpoints in intestinal fibrosis. Immunol. Rev.; 2021; 302, pp. 211-227. [DOI: https://dx.doi.org/10.1111/imr.12974] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/33993489][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8292184]
3. Wang, Y. Current progress of research on intestinal bacterial translocation. Microb. Pathog.; 2021; 152, 104652. [DOI: https://dx.doi.org/10.1016/j.micpath.2020.104652] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/33249165]
4. Foerster, E. G. et al. How autophagy controls the intestinal epithelial barrier. Autophagy vol. 18 86–103 (2021).
5. Subramanian, S; Geng, H; Tan, XD. Cell death of intestinal epithelial cells in intestinal diseases. Sheng Li Xue Bao; 2020; 72, pp. 308-324. [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/32572429][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7755516]
6. Terri, M. et al. Mechanisms of peritoneal fibrosis: focus on immune Cells–Peritoneal stroma interactions. Frontiers Immunol. 12, 607204 (2021).
7. Kellermann, L; Riis, LB. A close view on histopathological changes in inflammatory bowel disease, a narrative review. Dig. Med. Res.; 2021; 4, pp. 3-3. [DOI: https://dx.doi.org/10.21037/dmr-21-1]
8. Fakhouri, H. N., Alawadi, S., Awaysheh, F. M., Alkhabbas, F. & Zraqou, J. A cognitive deep learning approach for medical image processing. Sci. Rep.14(1), 4539 (2024).
9. Maier, A; Syben, C; Lasser, T; Riess, C. A gentle introduction to deep learning in medical image processing. Z. Med. Phys.; 2019; 29, pp. 86-101. [DOI: https://dx.doi.org/10.1016/j.zemedi.2018.12.003] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/30686613]
10. Le Fur, M; Zhou, IY; Catalano, O; Caravan, P. Toward molecular imaging of intestinal pathology. Inflamm. Bowel Dis.; 2020; 26, pp. 1470-1484. [DOI: https://dx.doi.org/10.1093/ibd/izaa213] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/32793946][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7500524]
11. Hong, S. M. & Baek, D. H. Diagnostic Procedures for Inflammatory Bowel Disease: Laboratory, Endoscopy, Pathology, Imaging, and Beyond. Diagnostics vol. 14 1384 (2024).
12. López-Pingarrón, L et al. Interstitial cells of Cajal and enteric nervous system in Gastrointestinal and neurological Pathology, relation to oxidative stress. Curr. Issues. Mol. Biol.; 2023; 45, pp. 3552-3572. [DOI: https://dx.doi.org/10.3390/cimb45040232] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/37185756][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10136929]
13. Ghosh, T; Chakareski, J. Deep transfer learning for automated intestinal bleeding detection in capsule endoscopy imaging. J. Digit. Imaging; 2021; 34, pp. 404-417. [DOI: https://dx.doi.org/10.1007/s10278-021-00428-3] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/33728563][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8290011]
14. Fati, S. M., Senan, E. M. & Azar, A. T. Hybrid and Deep Learning Approach for Early Diagnosis of Lower Gastrointestinal Diseases. Sensors vol. 22 4079 (2022).
15. Tatar, O. C., Akay, M. A., Tatar, E. & Metin, S. Unveiling new patterns: A surgical deep learning model for intestinal obstruction management. The Int. J. Med. Rob. Comput. Assist. Surg.20(1), e2620 (2024).
16. Cheng, PM; Tejura, TK; Tran, KN; Whang, G. Detection of high-grade small bowel obstruction on conventional radiography with convolutional neural networks. Abdom. Radiol.; 2017; 43, pp. 1120-1127. [DOI: https://dx.doi.org/10.1007/s00261-017-1294-1]
17. Kim, D. et al. An artificial intelligence deep learning model for identification of small bowel obstruction on plain abdominal radiographs. The Br. J. Radiol.94(1122), 20201407 (2021).
18. Oh, S et al. Deep learning using computed tomography to identify high-risk patients for acute small bowel obstruction: development and validation of a prediction model: a retrospective cohort study. Int. J. Surg.; 2023; 109, pp. 4091-4100. [DOI: https://dx.doi.org/10.1097/JS9.0000000000000721] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/37720936][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10720875]
19. Bishnoi, V; Goel, N. A color-based deep-learning approach for tissue slide lung cancer classification. Biomed. Signal Process. Control; 2023; 86, 105151. [DOI: https://dx.doi.org/10.1016/j.bspc.2023.105151]
20. Bishnoi, V; Goel, N. Tensor-RT-Based transfer learning model for lung cancer classification. J. Digit. Imaging; 2023; 36, pp. 1364-1375. [DOI: https://dx.doi.org/10.1007/s10278-023-00822-z] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/37059889][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10407002]
21. Bishnoi, V., Lavanya, Handa, P. & Goel, N. Dual-Path Multi‐Scale CNN for precise classification of Non‐Small cell lung cancer. International J. Imaging Syst. Technol.35(2) (2025).
22. Huang, Y; Zhao, W; Fu, Y; Zhu, L; Yu, L. Unleash the power of state space model for whole slide image with local aware scanning and importance resampling. IEEE Trans. Med. Imaging; 2025; 44, pp. 1032-1042. [DOI: https://dx.doi.org/10.1109/TMI.2024.3475587] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/39374278]
23. Fan, J et al. DIPathMamba: A domain-incremental weakly supervised state space model for pathology image segmentation. Med. Image. Anal.; 2025; 103, 103563. [DOI: https://dx.doi.org/10.1016/j.media.2025.103563] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/40209555]
24. Fan, J. et al. Weakly supervised state space model for Multi-class segmentation of pathology images. Lect. Notes Comput. Sci. 500–509. https://doi.org/10.1007/978-3-031-72111-3_47 (2024).
25. Yang, S., Wang, Y. & Chen, H. MambaMIL: enhancing long sequence modeling with sequence reordering in computational pathology. Lect. Notes Comput. Sci. 296–306. https://doi.org/10.1007/978-3-031-72083-3_28 (2024).
26. Norton, K. A., Gong, C., Jamalian, S. & Popel, A. S. Multiscale Agent-Based and Hybrid Modeling of the Tumor Immune Microenvironment. Processes vol. 7 37 (2019).
27. Zhu, C; Chen, W; Peng, T; Wang, Y; Jin, M. Hard sample aware noise robust learning for histopathology image classification. IEEE Trans. Med. Imaging; 2022; 41, pp. 881-894. [DOI: https://dx.doi.org/10.1109/TMI.2021.3125459] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/34735341]
28. Radosavovic, I; Kosaraju, RP; Girshick, R; He, K; Dollar, P. Designing network design spaces. 2020 IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR); 2020; [DOI: https://dx.doi.org/10.1109/cvpr42600.2020.01044]
29. He, K., Zhang, X., Ren, S. & Sun, J. Deep residual learning for image recognition. 2016 IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR). 770-778. https://doi.org/10.1109/cvpr.2016.90 (2016).
30. Hatamizadeh, A; Kautz, J. MambaVision: A hybrid Mamba-Transformer vision backbone. ArXiv org.; 2024; [DOI: https://dx.doi.org/10.48550/arXiv.2407.08083]
31. Zhu, L et al. Vision mamba: efficient visual representation learning with bidirectional state space model. ArXiv org.; 2024; [DOI: https://dx.doi.org/10.48550/arXiv.2401.09417]
32. Cheng, PM; Tran, KN; Whang, G; Tejura, TK. Refining convolutional neural network detection of Small-Bowel obstruction in conventional radiography. Am. J. Roentgenol.; 2019; 212, pp. 342-350. [DOI: https://dx.doi.org/10.2214/AJR.18.20362]
33. Pei, X; Huang, T; Xu, C. EfficientVMamba: atrous selective scan for light weight visual Mamba. ArXiv org.; 2024; [DOI: https://dx.doi.org/10.48550/arXiv.2403.09977]
34. Vanderbecq, Q et al. Deep learning for automatic bowel-obstruction identification on abdominal CT. Eur. Radiol.; 2024; 34, pp. 5842-5853. [DOI: https://dx.doi.org/10.1007/s00330-024-10657-z] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/38388719]
35. Sharma, N. et al. Advanced Gastrointestinal tract organ differentiation using an integrated Swin transformer U-Net model for cancer care. Frontiers Phys.https://doi.org/10.3389/fphy.2024.147875 (2024).
36. Oh, Y; Bae, GE; Kim, KH; Yeo, MK; Ye, JC. Multi-Scale hybrid vision transformer for learning gastric histology: AI-Based decision support system for gastric cancer treatment. IEEE J. Biomedical Health Inf.; 2023; 27, pp. 4143-4153. [DOI: https://dx.doi.org/10.1109/JBHI.2023.3276778]
37. Nagtegaal, I. D. et al. The 2019 WHO classification of tumours of the digestive system. Histopathology vol. 76 182–188 (2019).
38. Anderson, KJ; Cormier, RT; Scott, PM. Role of ion channels in Gastrointestinal cancer. World J. Gastroenterol.; 2019; 25, pp. 5732-5772. [DOI: https://dx.doi.org/10.3748/wjg.v25.i38.5732] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/31636470][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6801186]
39. Hu, Q et al. A review of physiological and cellular mechanisms underlying fibrotic postoperative adhesion. Int. J. Biol. Sci.; 2021; 17, pp. 298-306. [DOI: https://dx.doi.org/10.7150/ijbs.54403] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/33390851][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7757036]
40. Nelms, DW; Kann, BR. Imaging modalities for evaluation of intestinal obstruction. Clin. Colon Rectal Surg.; 2021; [DOI: https://dx.doi.org/10.1055/s-0041-1729737] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/34305469][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8292005]
41. Inderjeeth, A. J., Webberley, K. M., Muir, J. & Marshall, B. J. The potential of computerised analysis of bowel sounds for diagnosis of Gastrointestinal conditions: a systematic review. Syst. Rev.7(1), 124 (2018).
42. Long, B; Robertson, J; Koyfman, A. Emergency medicine evaluation and management of small bowel obstruction: Evidence-Based recommendations. J. Emerg. Med.; 2019; 56, pp. 166-176. [DOI: https://dx.doi.org/10.1016/j.jemermed.2018.10.024] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/30527563]
43. Keller, J et al. Advances in the diagnosis and classification of gastric and intestinal motility disorders. Nat. Reviews Gastroenterol. Hepatol.; 2018; 15, pp. 291-308. [DOI: https://dx.doi.org/10.1038/nrgastro.2018.7]
44. Scaglione, M et al. Small bowel obstruction and intestinal ischemia: emphasizing the role of MDCT in the management decision process. Abdom. Radiol.; 2020; 47, pp. 1541-1555. [DOI: https://dx.doi.org/10.1007/s00261-020-02800-3]
45. Lipinski, S; Tiemann, K. Extracellular vesicles and their role in the Spatial and Temporal expansion of Tumor–Immune interactions. Int. J. Mol. Sci.; 2021; 22, 3374. [DOI: https://dx.doi.org/10.3390/ijms22073374] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/33806053][PubMedCentral: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8036938]
46. Elfwing, S; Uchibe, E; Doya, K. Sigmoid-Weighted linear units for neural network function approximation in reinforcement learning. ArXiv org.; 2017; [DOI: https://dx.doi.org/10.48550/arXiv.1702.03118]
© The Author(s) 2025. This work is published under the Creative Commons BY-NC-ND 4.0 License (http://creativecommons.org/licenses/by-nc-nd/4.0/).