
Chinese Journal of Magnetic Resonance, 2025, 42(2): 143-153 doi: 10.11938/cjmr20243130

Research Article

Application of Generative Adversarial Networks Based on Global and Local Feature Information in Hippocampus Segmentation

WEI Zhihong1, KONG Xudong1, KONG Yan1, YAN Shiju2, DING Yang1, WEI Xianding1, KONG Dong1, YANG Bo1,*

1. Department of Radiation Oncology, Affiliated Hospital of Jiangnan University, Wuxi 214122, China

2. School of Health Sciences and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China

Corresponding author: *Tel: 13506177792, E-mail: wuxiyangbo@163.com.

Received: 2024-09-03   Online: 2024-11-18

Abstract

The hippocampus has a complex structure and a small volume, which makes precise segmentation difficult. To address this, this study proposes a segmentation method using a generative adversarial network (GAN) based on global and local feature information (GLGAN). First, to improve network stability and segmentation accuracy while reducing problems such as information loss and gradient explosion, we modified the generator and loss function of the GAN to obtain the global GAN (GGAN). Second, because the discriminator is essentially a binary classifier and is insensitive to small local changes, we introduced a GAN with a dual-discriminator structure that integrates global and local feature information. Finally, a total loss function was designed to balance the GAN adversarial loss and the 3D u-net segmentation loss. The experimental results show that the GLGAN-based method supports denser evaluation of the hippocampus and lets the discriminators push the mask values produced by the generator toward a more realistic distribution, thereby improving segmentation accuracy. The Dice coefficient and IOU for hippocampus segmentation using GLGAN are 0.804 and 0.672, respectively.

Keywords: generative adversarial network (GAN); 3D CNN; segmentation; hippocampus; 3D u-net


Cite this article

WEI Zhihong, KONG Xudong, KONG Yan, YAN Shiju, DING Yang, WEI Xianding, KONG Dong, YANG Bo. Application of Generative Adversarial Networks Based on Global and Local Feature Information in Hippocampus Segmentation[J]. Chinese Journal of Magnetic Resonance, 2025, 42(2): 143-153 doi:10.11938/cjmr20243130

Introduction

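The hippocampus is an important part of the human brain. It is a gray-matter structure located between the medial temporal lobe and the thalamus, with one on each side of the brain, and its main functions are orientation and the storage and conversion of long-term memory [1]. In clinical practice, changes in the hippocampus play an important role in diagnosing schizophrenia, Alzheimer's disease (AD), and various other diseases [2]. Accurate hippocampus segmentation from 3D medical images, such as magnetic resonance imaging (MRI) [3] and computed tomography [4], is therefore an indispensable part of improving disease diagnosis. However, the complex brain background, heavy noise, low contrast, blurred hippocampal boundaries, and structural variability make automatic and accurate hippocampus segmentation a challenging task [5].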

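With the development of computer-aided techniques, hippocampus segmentation methods fall into three main categories. (1) Atlas registration-based methods, divided into single-atlas and multi-atlas approaches. For example, Haller et al. [6] were the first to segment the hippocampus with a single atlas, but their method relies on manually specifying the hippocampus; Kwak et al. [7] combined the graph-cuts algorithm with single-atlas registration and improved segmentation accuracy; Heckemann et al. [8] were the first to propose a segmentation method based on multi-atlas registration; Wu et al. [9] described hippocampal image patches with multiscale feature representations and refined the label fusion results hierarchically. (2) Deformable model-based methods. For example, Hu et al. [10] designed a parallel multimodal MRI segmentation algorithm based on the active appearance model (AAM); Zarpalas et al. [11] added adaptive gradient-distribution boundary maps to the active contour model (ACM). (3) Supervised information-based methods. For example, Hao et al. [12] segmented the hippocampus with a support vector machine; Khan et al. [13] proposed atlas selection that combines supervised learning with dynamic information; Alejo et al. [14] combined supervised and unsupervised neural networks; Cao et al. [15] proposed a multi-task deep learning method; Ataloglou et al. [16] used transfer learning and deep convolutional neural networks; Hazarika et al. [17] used u-net; Deng et al. [18] proposed a method based on the Pixel2Pixel generative adversarial network; and Ou et al. [19] proposed a U-shaped network based on spatial attention and self-attention mechanisms (SA-TF-uNet) for MRI hippocampus segmentation.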

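This study proposes a new hippocampus segmentation method, a generative adversarial network based on global and local feature information (GLGAN). The main contributions are as follows. (1) The generator of the generative adversarial network (GAN) is improved. By modifying the loss function, activation function, and other components of the original 3D u-net, the 3D u-net becomes better suited to serve as the GAN generator, and network stability and segmentation accuracy improve. (2) The discriminator of the GAN is improved, yielding a GAN with a dual-discriminator structure that carries global and local feature information. The dual-discriminator structure fuses the global features of the brain with the local features of the hippocampus and strengthens the local features, enabling denser evaluation of the hippocampus. In addition, with the dual discriminators, the gradients of the network loss are back-propagated, which benefits the update of the generator parameters and in turn improves segmentation accuracy. (3) A total loss function is designed to balance the GAN adversarial loss and the 3D u-net segmentation loss contained in the GLGAN segmentation loss. (4) The GAN is applied to hippocampus segmentation in 3D MRI. Combining the GAN with 3D MRI allows richer spatial structural features to be learned.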

1 Experimental

1.1 Dataset

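All 3D brain MRI volumes and their corresponding bilateral hippocampus mask images used in this study come from the Huawei-sponsored challenge of the 9th National College Students' Service Outsourcing Innovation and Entrepreneurship Competition [20]. The data comprise fully skull-stripped images of 132 subjects (Table 1: 46 healthy subjects, 25 subjects with Alzheimer's disease, and 61 subjects with mild cognitive impairment), all of which originate from the central database of the Alzheimer's Disease Neuroimaging Initiative (ADNI) [20]. The 132 original samples were split into 110 training samples and 22 test samples. To enrich the experimental data and prevent overfitting [21], the dataset was augmented 50-fold by flipping, translation, rotation, and similar operations. A 3D brain MRI image and a hippocampus mask image are shown in Fig. 1.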
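The augmentation step can be illustrated with a minimal NumPy sketch. The function name `augment_pair`, the shift range, the use of a circular shift as a stand-in for translation, and the restriction to 90-degree rotations are all assumptions made for illustration; the text does not specify the actual augmentation parameters. The point shown is only that the image and its mask must receive the identical transform.

```python
import numpy as np

def augment_pair(image, mask, rng, max_shift=4):
    """Apply the same random flip, shift and rotation to an image and its mask.

    Hypothetical helper: parameter ranges are illustrative, not the paper's.
    """
    # Random flip along one of the three spatial axes.
    axis = int(rng.integers(0, 3))
    image, mask = np.flip(image, axis=axis), np.flip(mask, axis=axis)
    # Circular shift by a few voxels along the same axis (stand-in for translation).
    shift = int(rng.integers(-max_shift, max_shift + 1))
    image, mask = np.roll(image, shift, axis=axis), np.roll(mask, shift, axis=axis)
    # Rotation by a multiple of 90 degrees in the plane of axes 0 and 1;
    # the shape is preserved only when those two axes have equal length.
    k = int(rng.integers(0, 4))
    image = np.rot90(image, k=k, axes=(0, 1))
    mask = np.rot90(mask, k=k, axes=(0, 1))
    return image, mask

rng = np.random.default_rng(0)
image = np.arange(8 * 8 * 6, dtype=np.float32).reshape(8, 8, 6)
mask = (image % 7 == 0).astype(np.uint8)
aug_image, aug_mask = augment_pair(image, mask, rng)
```

Because every operation only permutes voxels, the number of foreground voxels in the mask is unchanged and the mask stays aligned with the image.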

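Table 1  Dataset statistics

| Category | Age | Female | Male | Total |
|---|---|---|---|---|
| Normal | 76.52±5.79 | 20 | 26 | 46 |
| Mild cognitive impairment | 75.24±7.13 | 13 | 48 | 61 |
| Alzheimer's disease | 75.24±6.35 | 13 | 12 | 25 |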



Fig. 1   (a) 3D brain MRI image; (b) hippocampus mask image


1.2 Data preprocessing

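The 3D brain MRI volumes used in this study come in three resolutions (192×192×160, 256×256×166, and 256×256×180). The volumes were cropped to remove background voxels outside the brain and thereby improve segmentation accuracy. First, the maximum brain extent over the 132 brain images was determined (156×188×117). Second, taking the image center as the origin, each original image and its corresponding mask image were cropped to 164×196×124 and resized to 128×128×100; the resized image serves as the first input to the GAN. Again taking the image center as the origin, a 64×64×50 block was extracted from the resized image as the second input to the GAN. Finally, the left and right hippocampus mask images were binarized.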
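The centered cropping and the mask binarization can be sketched as follows. The helper names `center_crop` and `binarize` and the threshold of 0 are assumptions for illustration; the resizing (interpolation) step is not shown, so the example starts from a volume that already has the 128×128×100 size.

```python
import numpy as np

def center_crop(volume, size):
    """Return a block of shape `size` cut around the centre of `volume`."""
    if any(s > dim for s, dim in zip(size, volume.shape)):
        raise ValueError("crop size exceeds volume size")
    starts = [(dim - s) // 2 for dim, s in zip(volume.shape, size)]
    window = tuple(slice(st, st + s) for st, s in zip(starts, size))
    return volume[window]

def binarize(mask, threshold=0):
    """Map every voxel above `threshold` to 1 and all others to 0."""
    return (mask > threshold).astype(np.uint8)

# Stand-in for a resized volume (first GAN input size).
resized = np.zeros((128, 128, 100), dtype=np.float32)
resized[64, 64, 50] = 3.0  # marker voxel at the volume centre

# Second GAN input: a 64x64x50 block around the centre.
local = center_crop(resized, (64, 64, 50))
local_mask = binarize(local)
```

The marker voxel placed at the center of the large volume ends up at the center of the cropped block, which is a quick way to check that the crop window is positioned as intended.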

1.3 3D u-net segmentation model

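This study first segments the hippocampus with a 3D u-net. The original u-net consists of an encoder (which analyzes the input image), a decoder (which produces the full-resolution segmentation), and skip connections [22,23]. To make the 3D u-net segment the hippocampus more effectively and accurately, to better match its role as the GAN generator, and to improve the quality of the generated images, the 3D u-net was modified as follows. ① The ReLU activation in all convolutional layers is replaced with Leaky ReLU, so that units whose output is below 0 (inactive units) still receive a small gradient. ② Before the activation function, batch normalization [24] [Eq. (3)] re-parameterizes the model using the mean [Eq. (1)] and variance [Eq. (2)] of the features in each batch, which gives the network gradients a suitable back-propagation path, speeds up convergence, and thereby helps optimize the model. ③ Average pooling replaces max pooling, because max pooling may produce sparse gradients that hamper training. ④ The Dice coefficient loss replaces the cross-entropy loss. The cross-entropy loss biases the decision boundary toward the majority class, and because the hippocampus is very small relative to the brain, information may be lost during downsampling, which degrades the segmentation. Compared with the cross-entropy loss, the Dice coefficient loss effectively handles the foreground-background class imbalance.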

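Mean:

$$w_x = \frac{1}{n}\sum_{i=1}^{n} x_i \tag{1}$$

Variance:

$$v_x^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - w_x)^2 \tag{2}$$

Normalization:

$$\hat{x}_i = \frac{x_i - w_x}{\sqrt{v_x^2 + e}} \tag{3}$$

where $x_i$ is the $i$-th voxel value and $e$ is a constant.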
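Eqs. (1) to (3) can be checked numerically with a short NumPy sketch. It is a simplification under stated assumptions: the statistics are taken over all values of the array at once rather than per feature channel, the learned scale and shift of a full batch-normalization layer are omitted, and the value `e=1e-5` is an assumed default rather than a value given in the text.

```python
import numpy as np

def batch_normalize(x, e=1e-5):
    """Normalize `x` with its own mean and variance, following Eqs. (1)-(3)."""
    w = x.mean()                       # Eq. (1): mean
    v2 = ((x - w) ** 2).mean()         # Eq. (2): variance
    return (x - w) / np.sqrt(v2 + e)   # Eq. (3): normalized values

x = np.array([1.0, 2.0, 3.0, 4.0])
x_hat = batch_normalize(x)
```

After normalization the values have zero mean and a standard deviation very close to 1; the small deviation from 1 comes from the constant $e$ in the denominator.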

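The encoder of the modified 3D u-net consists of eight feature-extracting convolution blocks (Fig. 2), each composed of a 3D convolutional layer, batch normalization, and a Leaky ReLU activation. The eight blocks use 64, 64, 64, 64, 128, 128, 256, and 256 filters, with a kernel size of (3,3,3), a stride of (1,1,1), and "same" padding. Every two convolution blocks are followed by an average pooling layer whose size and stride are both (2,2,2). The decoder contains eight deconvolution blocks with 256, 256, 128, 128, 64, 64, 64, and 64 filters; their other parameters match those of the convolution blocks. In addition, the feature maps from the convolution blocks are copied and cropped onto the decoder feature maps through skip connections [25]. Finally, the output of the 3D u-net is fed to a convolutional layer whose spatial filter size and stride are both (1,1,1), followed by a sigmoid activation, to generate the mask image.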
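Two of the building blocks named above, the Leaky ReLU activation and the (2,2,2) average pooling, can be sketched in plain NumPy. The negative slope `alpha=0.2` is an assumption, since the text does not state the slope, and `avg_pool3d` handles only a single channel with even side lengths.

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    """Pass positive values through; scale negative values by `alpha`."""
    return np.where(x > 0, x, alpha * x)

def avg_pool3d(x):
    """Average pooling with size and stride (2,2,2) on a single-channel volume.

    Assumes every side length of `x` is even.
    """
    d, h, w = x.shape
    blocks = x.reshape(d // 2, 2, h // 2, 2, w // 2, 2)
    return blocks.mean(axis=(1, 3, 5))

activated = leaky_relu(np.array([-1.0, 2.0]))
pooled = avg_pool3d(np.arange(8.0).reshape(2, 2, 2))
```

Unlike max pooling, every voxel in a 2×2×2 block contributes to the pooled value, so each of them receives a gradient during back-propagation.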

Fig. 2   3D u-net convolution block


1.4 Generative adversarial network (GAN)

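A GAN consists of a generator and a discriminator. The generator produces images [26]; this study uses the modified 3D u-net as the generator. The discriminator decides whether an input image is synthetic or real [27]; the discriminator in this study is a 3D convolutional neural network. At the output of the GAN [28], the features are fused by concatenation, and the output after the concatenation layer is a continuous probability that the input image is real.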

1.4.1 Global generative adversarial network (GGAN)

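The generator of the GGAN is the modified 3D u-net, which alleviates, to some extent, problems such as unstable training, vanishing gradients, and exploding gradients, and thereby improves hippocampus segmentation accuracy. The input to the GGAN generator is the brain image and its corresponding mask image, and the output is the segmented hippocampus mask image. The input to the GGAN discriminator is the image produced by the generator and the corresponding mask image (128×128×100 voxels), and the global discriminator outputs a decision on whether its input is a real or a generated hippocampus mask image. The global discriminator is a 3D convolutional neural network with six convolutional layers and one fully connected layer. Each convolutional layer is followed by batch normalization and a Leaky ReLU activation; the convolutional layers have 64, 128, 256, 512, 512, and 512 filters, a kernel size of (5,5,5), "same" padding, and a stride of (2,2,2). The convolutional layers are followed by a fully connected layer with 512 output nodes, which represent the global feature information. The GGAN is shown in Fig. 3.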

Fig. 3   Global generative adversarial network


1.4.2 Generative adversarial network based on global and local feature information (GLGAN)

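Although the GGAN improves network stability and hippocampus segmentation through its modified generator, the GAN discriminator is essentially a binary classifier that is insensitive to small local changes, which is unfavorable for hippocampus segmentation [29]. This study therefore proposes the GLGAN. By modifying the GGAN discriminator, feature information from both the global image and the local region of interest is fused in the discriminator, which improves hippocampus segmentation. The GLGAN discriminator adds a local discriminator to the GGAN discriminator; the input to the local discriminator is an image cropped around the mask (64×64×50 voxels). Because the resolution of the region-of-interest mask image is half that of the brain image, the first convolutional layer of the global discriminator is removed in the local discriminator. The fully connected layer of the local discriminator has 512 output nodes, which represent the local feature information. The outputs of the global and local discriminators are then concatenated into a 1024-dimensional feature vector, followed by a fully connected layer so that the output is a continuous value. Finally, a sigmoid activation outputs a confidence score used to judge whether the segmented mask is real or fake. The GLGAN structure is shown in Fig. 4, and the parameters of the global discriminator, the local discriminator, and the fully connected layers are listed in Tables 2, 3, and 4.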
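The fusion of the two discriminator branches can be sketched in NumPy. The function `fuse_and_score`, the random stand-in features, and the zero-initialized weights are hypothetical and serve only to show the data flow: two 512-dimensional vectors are concatenated into a 1024-dimensional vector, passed through a single-output fully connected layer, and squashed by a sigmoid.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def fuse_and_score(global_feat, local_feat, weights, bias=0.0):
    """Concatenate both feature vectors and map them to one confidence score."""
    fused = np.concatenate([global_feat, local_feat])  # 512 + 512 -> 1024
    score = sigmoid(fused @ weights + bias)            # FC layer with 1 output
    return fused, score

rng = np.random.default_rng(0)
global_feat = rng.normal(size=512)  # stand-in for the global discriminator output
local_feat = rng.normal(size=512)   # stand-in for the local discriminator output
weights = np.zeros(1024)            # untrained weights, for illustration only
fused, score = fuse_and_score(global_feat, local_feat, weights)
```

With all-zero weights the score is 0.5, i.e. the head is undecided between real and generated; training moves the weights so that both global and local evidence shift this score.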

Fig. 4   Generative adversarial network of global and local feature information


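Table 2  Global discriminator parameters

| Type | Stride | Kernel | Output |
|---|---|---|---|
| Conv | 2,2,2 | 5,5,5 | 64 |
| Conv | 2,2,2 | 5,5,5 | 128 |
| Conv | 2,2,2 | 5,5,5 | 256 |
| Conv | 2,2,2 | 5,5,5 | 512 |
| Conv | 2,2,2 | 5,5,5 | 512 |
| Conv | 2,2,2 | 5,5,5 | 512 |
| FC | - | - | 512 |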



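Table 3  Local discriminator parameters

| Type | Stride | Kernel | Output |
|---|---|---|---|
| Conv | 2,2,2 | 5,5,5 | 64 |
| Conv | 2,2,2 | 5,5,5 | 128 |
| Conv | 2,2,2 | 5,5,5 | 256 |
| Conv | 2,2,2 | 5,5,5 | 512 |
| Conv | 2,2,2 | 5,5,5 | 512 |
| FC | - | - | 512 |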



Table 4  Fully connected layer parameters

| Type | Stride | Kernel | Output |
|---|---|---|---|
| Concatenate | - | - | 1024 |
| FC | - | - | 1 |
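The layer counts in Tables 2 and 3 can be related to the two input sizes with a small calculation. The sketch assumes TensorFlow-style "same" padding, under which a convolution with stride s maps a side length n to ceil(n / s); the helper name `conv_output_shape` is hypothetical.

```python
import math

def conv_output_shape(shape, n_layers, stride=2):
    """Spatial shape after `n_layers` strided convolutions with 'same' padding."""
    for _ in range(n_layers):
        shape = tuple(math.ceil(side / stride) for side in shape)
    return shape

global_shape = conv_output_shape((128, 128, 100), n_layers=6)  # global branch
local_shape = conv_output_shape((64, 64, 50), n_layers=5)      # local branch
flattened = global_shape[0] * global_shape[1] * global_shape[2] * 512
```

Under this assumption both branches end with the same 2×2×2 spatial grid, which illustrates why the local discriminator, whose input has half the side length, needs one convolutional layer fewer. With 512 filters in the last layer, each branch would feed 4096 values into its 512-node fully connected layer.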



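1.5 Loss functions

1.5.1 3D u-net loss function

The cross-entropy loss of the 3D u-net biases the decision boundary toward the majority class, and because the hippocampus is very small relative to the brain [30], information is lost during downsampling in the convolutional neural network, which degrades the segmentation. The Dice coefficient loss [Eq. (4)] used in this study effectively handles the foreground-background class imbalance and improves the segmentation accuracy of hippocampal tissue.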

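$$\mathrm{Dice} = \frac{2|V_1 \cap V_2|}{|V_1| + |V_2|} \tag{4}$$

where $V_1$ denotes the volume of the label and $V_2$ denotes the volume of the segmented image.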
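Eq. (4) can be turned into a loss in several ways; a common choice, assumed here because the text does not give the exact form, is 1 − Dice computed on soft predictions, with the intersection taken as the sum of element-wise products. The constant `eps` is an added safeguard against division by zero and is not part of Eq. (4).

```python
import numpy as np

def dice_coefficient(label, pred, eps=1e-7):
    """Dice overlap of Eq. (4); works for binary masks and soft predictions."""
    label = np.asarray(label, dtype=np.float64).ravel()
    pred = np.asarray(pred, dtype=np.float64).ravel()
    intersection = np.sum(label * pred)
    return 2.0 * intersection / (label.sum() + pred.sum() + eps)

def dice_loss(label, pred):
    """Assumed loss form: 1 - Dice, so perfect overlap gives a loss of 0."""
    return 1.0 - dice_coefficient(label, pred)

label = np.array([1, 1, 0, 0])  # two foreground voxels in the label
pred = np.array([1, 0, 0, 0])   # one of them is found by the segmentation
d = dice_coefficient(label, pred)
loss = dice_loss(label, pred)
```

In this toy case the intersection is 1 voxel and the two volumes sum to 3, so Dice is 2/3. Background voxels that are correctly left empty do not enter the ratio, which is why the measure is less affected by the large background than voxel-wise cross-entropy.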

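1.5.2 Loss function of the generative adversarial network

This study builds the adversarial loss with the least-squares method. The least-squares loss favors gradient convergence, makes training more stable, and mitigates vanishing gradients. The adversarial losses of the global and local discriminators are given in Eqs. (5) and (6):

$$L(D) = \frac{1}{2} E_{y \sim p_{data}(y)}\left[\|y - 1\|^2\right] + \frac{1}{2} E_{x \sim p_{data}(x)}\left[\|G(x)\|^2\right] \tag{5}$$

where $G(x)$ is the image generated from the original image $x$ by the generator $G$, $y$ is the target image, $E_{x \sim p_{data}(x)}$ denotes the expectation over $x$ sampled from the real data distribution $p_{data}$, and $E_{y \sim p_{data}(y)}$ denotes the expectation over $y$ sampled from $p_{data}$.

$$L(D) = \frac{1}{2} E_{y' \sim p_{data}(y')}\left[\|y' - 1\|^2\right] + \frac{1}{2} E_{x' \sim p_{data}(x')}\left[\|G(x')\|^2\right] \tag{6}$$

where $y'$ is the real patch cropped from the hippocampus mask image and $G(x')$ is the patch cropped from the synthesized hippocampus mask image.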

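The global and local adversarial losses used to train the generator are given in Eqs. (7) and (8):

$$L(G) = \frac{1}{2} E_{x \sim p_{data}(x)}\left[\|G(x) - 1\|^2\right] \tag{7}$$

$$L(G) = \frac{1}{2} E_{x' \sim p_{data}(x')}\left[\|G(x') - 1\|^2\right] \tag{8}$$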

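The total loss for GLGAN-based hippocampus segmentation combines the segmentation loss of the 3D u-net with the adversarial loss of the GAN, as shown in Eqs. (9) and (10) (m = 1 was chosen empirically):

$$L_{\mathrm{total}} = L(G) + m\,L(G,D) \tag{9}$$

$$L_{\mathrm{total}} = E_{x \sim p_{data}(x)}\log[D(x)] + E_{z \sim p(z)}\log\{1 - D[G(z)]\} + \frac{2|V_1 \cap V_2|}{|V_1| + |V_2|} \tag{10}$$

where $L(G,D)$ denotes the adversarial loss of the GAN, $z$ denotes noise, $G(z)$ is the image generated by the generator from the noise $z$, $E_{z \sim p(z)}$ is the expectation over the noise distribution $p(z)$, and $D(x)$ is the discriminator output for an input sample $x$.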
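The least-squares adversarial terms and the weighted total loss can be sketched numerically. The sketch makes two assumptions that go beyond the printed equations: the squared-error terms are applied to discriminator scores for real and generated masks, as in the usual least-squares GAN formulation, and the first term of Eq. (9) is taken to be the segmentation (Dice) loss. All function names and the example scores are illustrative.

```python
import numpy as np

def discriminator_loss(score_real, score_fake):
    """Least-squares form of Eqs. (5)/(6): real scores -> 1, fake scores -> 0."""
    return 0.5 * np.mean((score_real - 1.0) ** 2) + 0.5 * np.mean(score_fake ** 2)

def generator_adv_loss(score_fake):
    """Least-squares form of Eqs. (7)/(8): the generator wants fake scores -> 1."""
    return 0.5 * np.mean((score_fake - 1.0) ** 2)

def total_loss(seg_loss, adv_loss, m=1.0):
    """Eq. (9): segmentation loss plus m times the adversarial loss."""
    return seg_loss + m * adv_loss

# Scores a discriminator might assign to a batch of two real and two generated masks.
score_real = np.array([1.0, 1.0])
score_fake = np.array([0.5, 0.5])

d_loss = discriminator_loss(score_real, score_fake)
g_loss = generator_adv_loss(score_fake)
loss = total_loss(seg_loss=0.2, adv_loss=g_loss, m=1.0)
```

The quadratic penalty keeps producing a gradient even for samples the discriminator already classifies correctly, which is the property the text relies on when it says the least-squares loss mitigates vanishing gradients.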

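2 Results and discussion

The model was trained from scratch using the TensorFlow deep learning framework, with an initial learning rate of 0.0001, a batch size of 2, and 7300 iterations.

2.1 Experimental results

GLGAN training can be divided into three stages: the generator is trained first, the discriminators are trained next, and finally the generator and discriminators are trained jointly. Only when the three networks are trained jointly can the GLGAN segment hippocampal tissue well. Through joint training, the discriminators become better at distinguishing real labels from synthetic labels, and the joint discriminator loss is fed back to the generator, reducing the difference between real and synthetic labels.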

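The segmentation performance was evaluated with the IOU [Eq. (11)] and the Dice coefficient [31]. Both take values between 0 and 1, where 0 means no overlap between segmentation and label and 1 means complete overlap. Table 5 lists the segmentation performance of the different methods (the 3D v-net-based GGAN is identical to the 3D u-net-based GGAN except that the u-net is replaced with a v-net, and their segmentation results are compared with each other). As Table 5 shows, segmentation with u-net [17] gives a Dice coefficient of 0.631 and an IOU of 0.461; the 3D u-net-based GGAN gives a Dice coefficient of 0.716 and an IOU of 0.558; the 3D v-net-based GGAN gives a Dice coefficient of 0.678 and an IOU of 0.513; and the GLGAN gives a Dice coefficient of 0.804 and an IOU of 0.672.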

Table 5  Segmentation performance of different methods

| Method | Dice | IOU |
|---|---|---|
| u-net[17] | 0.631 | 0.461 |
| GGAN(u-net) | 0.716 | 0.558 |
| GGAN(v-net) | 0.678 | 0.513 |
| GLGAN (this work) | 0.804 | 0.672 |



$$\mathrm{IOU} = \frac{|V_1 \cap V_2|}{|V_1 \cup V_2|} \tag{11}$$
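Both evaluation metrics can be computed from a pair of binary masks as in the following NumPy sketch; the function name `overlap_metrics` is hypothetical, and the masks are assumed to be non-empty. For a single pair of masks the two metrics are linked by IOU = Dice / (2 − Dice), which follows directly from Eqs. (4) and (11).

```python
import numpy as np

def overlap_metrics(v1, v2):
    """Return (Dice, IOU) for two non-empty binary masks of equal shape."""
    v1 = np.asarray(v1, dtype=bool)
    v2 = np.asarray(v2, dtype=bool)
    intersection = np.logical_and(v1, v2).sum()
    union = np.logical_or(v1, v2).sum()
    dice = 2.0 * intersection / (v1.sum() + v2.sum())  # Eq. (4)
    iou = intersection / union                          # Eq. (11)
    return dice, iou

label = np.array([1, 1, 1, 0])
segmentation = np.array([0, 1, 1, 1])
dice, iou = overlap_metrics(label, segmentation)
```

Here the masks share 2 voxels and cover 4 voxels in total, so the IOU is 0.5 and the Dice coefficient is 2/3. The same relation also maps the Dice values in Table 5 to the listed IOU values to three decimals, although for averages over many cases it holds only approximately.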

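Fig. 5 shows the accuracy curve of the generator based on global and local feature information; the generator reaches an accuracy of 0.786. Fig. 6 shows the GLGAN segmentation results in the axial, sagittal, and coronal planes (red: gold standard; green: segmentation result). Fig. 7 overlays the gold standard, GGAN (u-net), and GLGAN segmentations in the axial, sagittal, and coronal planes, where the yellow line is the gold standard, the red line is the GGAN result, and the green line is the GLGAN result. Fig. 7 shows visually that the GLGAN result is closest to the gold standard and better than the GGAN result.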

Fig. 5   Accuracy curve of the generator based on global and local feature information


Fig. 6   Axial, sagittal, and coronal views of hippocampus segmentation by the generative adversarial network based on global and local feature information (red: gold standard; green: segmentation result)


Fig. 7   (a) Axial, (b) coronal, and (c) sagittal results of the gold standard, GGAN (u-net), and GLGAN segmentations of the hippocampus. The yellow line represents the gold standard, the red line the GGAN segmentation, and the green line the GLGAN segmentation


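2.2 Discussion

The experimental results show that segmenting the hippocampus with the 3D u-net alone gives the worst result. Comparing the GGAN that uses the modified 3D u-net as its generator with the GGAN that uses a 3D v-net as its generator, GGAN (u-net) is more accurate than GGAN (v-net): the Dice coefficient is higher by 0.038 and the IOU by 0.045. Compared with the SA-TF-UNet method recently proposed by Ou et al. [19], SA-TF-UNet extracts global and local features through a Transformer and a spatial attention mechanism and applies targeted optimization to hard samples, whereas the present method extracts global and local features by modifying the GAN generator, discriminator, and loss function and focuses on a small amount of important information, so that the hippocampus is evaluated more densely and the segmentation accuracy improves. This study uses 132 original samples, more than the 104 used for SA-TF-UNet, and the datasets differ. Moreover, the samples in this study include hippocampi from subjects with Alzheimer's disease, with mild cognitive impairment, and with normal cognition, and thus cover representative samples across these conditions; this places higher demands on the model and also explains why its segmentation accuracy is lower than that of SA-TF-UNet. Relative to u-net, the proposed GLGAN improves the Dice coefficient by 27.4% and the IOU by 45.8%; relative to GGAN (u-net), by 12.3% and 20.4%; and relative to GGAN (v-net), by 18.6% and 31%. The results indicate that, with the dual discriminators added, the gradients of the network loss are back-propagated, which benefits the update of the generator parameters, effectively addresses the sample class imbalance, and improves hippocampus segmentation accuracy.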
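The relative improvements quoted above can be reproduced from the Table 5 values with a few lines of Python. The dictionary layout and the helper `relative_gain` are illustrative; the numbers are those reported in Table 5.

```python
# (Dice, IOU) per method, as listed in Table 5.
results = {
    "u-net": (0.631, 0.461),
    "GGAN(u-net)": (0.716, 0.558),
    "GGAN(v-net)": (0.678, 0.513),
    "GLGAN": (0.804, 0.672),
}

def relative_gain(new, old):
    """Relative improvement of `new` over `old`, in percent."""
    return 100.0 * (new - old) / old

glgan_dice, glgan_iou = results["GLGAN"]
gains = {
    name: (round(relative_gain(glgan_dice, dice), 1),
           round(relative_gain(glgan_iou, iou), 1))
    for name, (dice, iou) in results.items()
    if name != "GLGAN"
}
```

For example, the Dice gain over u-net is (0.804 − 0.631) / 0.631, about 27.4%, and the IOU gain is (0.672 − 0.461) / 0.461, about 45.8%. These are relative gains; the absolute differences are 0.173 and 0.211.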

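This study applies a GAN to hippocampus segmentation in 3D MRI and improves segmentation accuracy by modifying the GAN discriminator, generator, and loss function. The original data were processed effectively, and the discriminators do not need to be executed at inference time. The proposed network was trained from scratch and proved effective; it is also applicable to segmentation tasks on other labeled 3D images. The diversity and sufficiency of data are critical for training a good neural network model [32], but the small sample size, class imbalance, and insufficient variability of the dataset posed major challenges for this study. In addition, accurate segmentation of hippocampal subfields would provide better clinical reference value, but this study does not perform subfield segmentation.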

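3 Conclusion

This study proposes the GLGAN method, which overcomes limitations of existing methods and evaluates the hippocampus more densely to achieve accurate segmentation. The 3D u-net model is modified to make it better suited as the GAN generator, addressing problems such as unstable network training, vanishing gradients, and exploding gradients. The dual-discriminator structure fuses global and local feature information, so that the discriminators push the mask values produced by the generator toward a more realistic distribution, making real and generated images difficult to tell apart. A total loss function is designed to balance the adversarial loss and the 3D u-net loss. Although the adversarial loss used here is not necessarily the optimal choice, it drives the generator network to seek the best segmentation result for the dataset in this study.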

Conflict of interest

References

[1] 张鹏. 人类海马体功能剖分及连接模式分析[D]. 长沙: 国防科学技术大学, 2019.

[2] INSAUSTI R, MUÑOZ-LÓPEZ M, INSAUSTI A M. The CA2 hippocampal subfield in humans: A review[J]. Hippocampus, 2023, 33(6): 712-729. DOI:10.1002/hipo.23547 PMID:37204159

[3] AKKUS Z, GALIMZIANOVA A, HOOGI A, et al. Deep learning for brain MRI segmentation: state of the art and future directions[J]. J Digit Imaging, 2017, 30(4): 449-459. DOI:10.1007/s10278-017-9983-4 PMID:28577131

[4] LI W. Automatic segmentation of liver tumor in CT images with deep convolutional neural networks[J]. J Comput Commun, 2015, 3(11): 146-151.

[5] MONDAL A K, DOLZ J, DESROSIERS C. Few-shot 3D multi-modal medical image segmentation using generative adversarial learning[J]. arXiv preprint arXiv: 1810.12241, 2018.

[6] HALLER J W, CHRISTENSEN G E, JOSHI S C, et al. Hippocampal MR imaging morphometry by means of general pattern matching[J]. Radiology, 1996, 199(3): 787-791. PMID:8638006

[7] KWAK K, YOON U, LEE D K, et al. Fully-automated approach to hippocampus segmentation using a graph-cuts algorithm combined with atlas-based segmentation and morphological opening[J]. Magn Reson Imaging, 2013, 31(7): 1190-1196. DOI:10.1016/j.mri.2013.04.008 PMID:23684964

[8] HECKEMANN R A, HAJNAL J V, ALJABAR P, et al. Automatic anatomical brain MRI segmentation combining label propagation and decision fusion[J]. NeuroImage, 2006, 33(1): 115-126. DOI:10.1016/j.neuroimage.2006.05.061 PMID:16860573

[9] WU G, SHEN D. Hierarchical label fusion with multiscale feature representation and label-specific patch partition[C]// International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, Cham, 2014.

[10] HU S, COUPÉ P, PRUESSNER J C, et al. Appearance-based modeling for segmentation of hippocampus and amygdala using multi-contrast MR imaging[J]. NeuroImage, 2011, 58(2): 549-559. DOI:10.1016/j.neuroimage.2011.06.054 PMID:21741485

[11] ZARPALAS D, GKONTRA P, DARAS P, et al. Hippocampus segmentation through gradient based reliability maps for local blending of ACM energy terms[C]// 2013 IEEE 10th International Symposium on Biomedical Imaging (ISBI). IEEE, 2013.

[12] HAO Y F, WANG T Y, ZHANG X Q, et al. Local label learning (LLL) for subcortical structure segmentation: Application to hippocampus segmentation[J]. Hum Brain Mapp, 2014, 35: 2674-2697. DOI:10.1002/hbm.22359 PMID:24151008

[13] KHAN A R, CHERBUIN N, WEN W, et al. Optimal weights for local multi-atlas fusion using supervised learning and dynamic information (SuperDyn): Validation on hippocampus segmentation[J]. NeuroImage, 2011, 56(1): 126-139. DOI:10.1016/j.neuroimage.2011.01.078 PMID:21296166

[14] DE ALEJO R P, RUIZ-CABELLO J, CORTIJO M, et al. Computer-assisted enhanced volumetric segmentation magnetic resonance imaging data using a mixture of artificial neural networks[J]. Magn Reson Imaging, 2003, 21(8): 901-912. PMID:14599541

[15] CAO L, LI L, ZHENG J, et al. Multi-task neural networks for joint hippocampus segmentation and clinical score regression[J]. Multimedia Tools Appl, 2018, 77(1): 1-18.

[16] ATALOGLOU D, DIMOU A, ZARPALAS D, et al. Fast and precise hippocampus segmentation through deep convolutional neural network ensembles and transfer learning[J]. Neuroinformatics, 2019, 17(4): 563-582. DOI:10.1007/s12021-019-09417-y PMID:30877605

[17] HAZARIKA R A, MAJI A K, SYIEM R, et al. Hippocampus segmentation using u-net convolutional network from brain magnetic resonance imaging (MRI)[J]. J Digit Imaging, 2022, 35(4): 893-909. DOI:10.1007/s10278-022-00613-y PMID:35304675

[18] DENG H, ZHANG Y, LI R, et al. Combining residual attention mechanisms and generative adversarial networks for hippocampus segmentation[J]. Tsinghua Sci Technol, 2022, 27(1): 68-78.

[19] OU Y X, GAO M, ZHAO D, et al. SA-TF-UNet: MRI hippocampus segmentation based on spatial attention mechanism and Transformer[J]. Journal of Image and Graphics, 2023, 28: 3191-3202.
欧宇轩, 高敏, 赵地, 等. SA-TF-UNet: 基于空间注意力机制和Transformer的MRI海马体分割[J]. 中国图象图形学报, 2023, 28: 3191-3202.

WEI Z H, YAN S J, HAN B S, et al.

Multi-output 3D convolutional neural network for diagnosis of Alzheimer's disease

[J]. Chinese J Magn Reson, 2021, 38(1): 92-100.

[Cited in this article: 2]

魏志宏, 闫士举, 韩宝三, 等.

基于多输出的3D卷积神经网络诊断阿尔兹海默病

[J]. 波谱学杂志, 2021, 38(1): 92-100.

DOI:10.11938/cjmr20202808      [Cited in this article: 2]

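As population aging deepens, Alzheimer's disease has become increasingly common in daily life, and early, accurate diagnosis followed by active intervention can effectively delay its progression. Accurate diagnosis of Alzheimer's disease from magnetic resonance images requires combining information from multiple regions of interest (ROIs), because a single ROI cannot reflect the connections and interactions among different ROIs. This paper first proposes a three-input 3D convolutional neural network (CNN) that jointly uses information from three ROIs in 3D brain magnetic resonance images: the hippocampus, gray matter (excluding the hippocampus), and white matter. In addition, as a neural network deepens, important feature information from the original image is partly lost; we therefore also propose a multi-output 3D CNN that adds connections and outputs at intermediate layers, shortening the distance between input and output, strengthening feature propagation, and reducing the loss of feature information. The results show that the multi-output 3D CNN model achieves, for three-class classification on the entire test set, an accuracy of 90.5%, a precision of 91.0%, a sensitivity of 90.4%, a specificity of 95.2%, and an F1-score of 90.5%, outperforming the single-output 3D CNN model in diagnostic performance.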

LECUN Y, BENGIO Y, HINTON G.

Deep learning

[J]. Nature, 2015, 521(7553): 436-444.

[Cited in this article: 1]

ZHONG Z S, KIM Y S, PLICHTA K, et al.

Simultaneous co-segmentation of tumors in PET-CT images using deep fully convolutional networks

[J]. Med Phys, 2019, 46: 619-633.

[Cited in this article: 1]

ZHAO X, ZHANG X, LI X J, et al.

Multimodal glioma segmentation with fusion of multiple self-attention and deformable convolutions

[J]. Chinese J Magn Reson, 2023, 40(3): 280-292.

[Cited in this article: 1]

赵欣, 张鑫, 李鑫杰, 等.

融合多重自注意力和可变形卷积的多模态脑胶质瘤分割

[J]. 波谱学杂志, 2023, 40(3): 280-292.

DOI:10.11938/cjmr20233059      [Cited in this article: 1]

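Segmentation of gliomas in magnetic resonance images is of great importance for brain tumor diagnosis, surgical planning, and the choice of treatments such as radiotherapy. To address the low segmentation accuracy, imprecise edge segmentation, and tendency toward false positives of existing brain tumor segmentation algorithms, this paper proposes an improved Unet model based on multiple self-attention and deformable convolutions. The model replaces the standard convolutions of the original Unet framework with residual modules to prevent vanishing gradients during training; adds a Transformer-based multiple self-attention module at the bottleneck layer to extract local features and global context information and thereby better exploit correlations between pixels; and uses deformable convolutions at the skip connections to increase the model's sensitivity to shape and improve extraction of tumor edge features. Experimental results show that the proposed algorithm scores higher on the segmentation evaluation metrics than the other compared models using the same dataset and segments tumor edges more precisely. This indicates that the proposed algorithm is an effective method for automatic glioma segmentation.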

IOFFE S, SZEGEDY C.

Batch normalization: Accelerating deep network training by reducing internal covariate shift

[J]. arXiv preprint arXiv:1502.03167, 2015.

[Cited in this article: 1]

LIAO X, QIAN Y, CHEN Y, et al.

MMTLNet: Multi-modality transfer learning network with adversarial training for 3D whole heart segmentation

[J]. Comput Med Imaging Graphics, 2020, 85: 101785.

[Cited in this article: 1]

MADANI A, MORADI M, SYEDA-MAHMOOD T F. Medical image classification based on a generative adversarial network trained discriminator: US, 10937540B2[P]. 2021-03-02.

[Cited in this article: 1]

ARUN PANDIAN J, KANCHANADEVI K, KUMAR D, et al.

Deep convolutional generative adversarial network for metastatic tissue diagnosis in lymph node section

[M]// System Design for Epidemics Using Machine Learning and Deep Learning. Springer, Cham, 2023: 153-166.

[Cited in this article: 1]

REN H J, MA Y, XIAO L.

Knee joint model construction and local specific absorptivity estimation based on generative adversarial network

[J]. Chinese J Magn Reson, 2023, 40(4): 410-422.

[Cited in this article: 1]

任宏晋, 马岩, 肖亮.

基于生成对抗网络的膝关节模型构建与局部比吸收率估计

[J]. 波谱学杂志, 2023, 40(4): 410-422.

DOI:10.11938/cjmr20233053      [Cited in this article: 1]

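Local specific absorption rate (SAR) is an important index of the safety of high-field magnetic resonance imaging. The prevailing approach is to segment the tissues in acquired magnetic resonance images, build an individual-specific model, and run electromagnetic simulations on it to calculate local SAR. To address the problem that the length of the knee joint model in simulation affects the accuracy of local SAR estimation, this paper proposes a method for knee joint magnetic resonance image segmentation and field-of-view extension based on a conditional generative adversarial network (CGAN). Knee joint images are simplified into three tissue classes (muscle, fat, and bone); the CGAN performs pixel-wise semantic segmentation, uses an attention mechanism to improve segmentation accuracy, and generates extension regions at both ends of the image along the head-foot direction to construct a longer model. In the experiments, electromagnetic simulations were run on the knee joint models obtained with the proposed method and with various comparison methods, and the relative errors of their local SAR with respect to the manually annotated model were calculated. The results confirm that the proposed method yields a relatively accurate estimate of local knee joint SAR.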

CHAKRABORTY S, CHATTERJEE S, DEY N, et al.

Modified cuckoo search algorithm in microscopic image segmentation of hippocampus

[J]. Microsc Res Techniq, 2017, 80(10): 1051-1072.

DOI:10.1002/jemt.22900      PMID:28557041      [Cited in this article: 1]

Microscopic image analysis is a challenging task because weak correlation and the presence of different segments of interest may lead to ambiguity. It is also valuable in leading fields of technology and medicine. Identification and counting of cells play a vital role in feature extraction for diagnosing particular diseases precisely. Different segments must be identified accurately in order to identify and count cells in a microscope image. Consequently, in the current work, a novel method for cell segmentation and identification is proposed that incorporates marking cells. The method applies cuckoo search after a pre-processing step. It is developed and evaluated on light microscope images of rat hippocampus, which were used as a sample of brain cells. The proposed method can be applied to color images directly. The proposed approach incorporates McCulloch's method for Lévy flight production in the cuckoo search (CS) algorithm. In the cuckoo search process, Otsu's between-class variance, Kapur's entropy and Tsallis entropy are employed as the objective functions to be optimized for segmentation. Experimental results are validated by different metrics, namely the peak signal-to-noise ratio (PSNR), mean square error, feature similarity index and CPU running time, for all the test cases. The experimental results established that the Kapur's entropy segmentation method based on the modified CS required the least computational time compared with Otsu's between-class variance segmentation method and the Tsallis entropy segmentation method. Nevertheless, the Tsallis entropy method with optimized multi-threshold levels achieved superior performance compared with the other two segmentation methods in terms of the PSNR. © 2017 Wiley Periodicals, Inc.

SMALL G W, HARRISON T M, BURGGREN A C, et al.

Altered memory-related functional connectivity of the anterior and posterior hippocampus in older adults at increased genetic risk for Alzheimer's disease

[J]. Hum Brain Mapp, 2013, 37(1): 366-380.

[Cited in this article: 1]

PHAM D L, XU C, PRINCE J L.

Current methods in medical image segmentation

[J]. Annu Rev Biomed Eng, 2000, 2: 315-337.

PMID:11701515      [Cited in this article: 1]

Image segmentation plays a crucial role in many medical-imaging applications, by automating or facilitating the delineation of anatomical structures and other regions of interest. We present a critical appraisal of the current status of semi-automated and automated methods for the segmentation of anatomical medical images. Terminology and important issues in image segmentation are first presented. Current segmentation approaches are then reviewed with an emphasis on the advantages and disadvantages of these methods for medical imaging applications. We conclude with a discussion on the future of image segmentation methods in biomedical research.

SHIN H C, TENENHOLTZ N A, ROGERS J K, et al.

Medical image synthesis for data augmentation and anonymization using generative adversarial networks

[C]// International Workshop on Simulation and Synthesis in Medical Imaging. Springer, Cham, 2018: 1-11.

[Cited in this article: 1]
