RBM Cloud Computing Implementation Based on Hadoop-GPU

刘凯; 张立民; 吴莉强

doi:10.19304/j.cnki.issn1000-7180.2015.09.014


Realization of RBM Training Based on Hadoop-GPU


Abstract: To address the slow training of restricted Boltzmann machines (RBMs) on large data sets, an accelerated RBM training method is designed on a Hadoop-GPU framework that combines the Hadoop cloud computing platform with GPU parallel acceleration. Based on an analysis of the MapReduce mechanism and the RBM training procedure, training is decomposed into two steps that are chained through Hadoop's combined-job mechanism: the Map side performs block Gibbs sampling, the Reduce side performs the parameter update, and the computation is accelerated in parallel on the GPU. Experiments on the MNIST handwritten digit data set show that the Hadoop-GPU platform is feasible for RBM training on large-scale data: the speedup exceeds 20, and it grows more markedly as the data size increases.
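The Map/Reduce split described in the abstract can be illustrated with a minimal single-process sketch. It is not the authors' implementation: it assumes one-step contrastive divergence (CD-1) as the training rule, uses plain Python lists where the paper uses GPU-accelerated operations, and the names `map_gibbs` and `reduce_update` are hypothetical. Each simulated mapper runs Gibbs sampling over its data shard and emits summed gradient statistics; the reducer sums the partial results and updates the parameters.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_hidden(v, W, b_h, rng):
    """Return (activation probabilities, binary samples) of the hidden units."""
    probs = [sigmoid(b_h[j] + sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(b_h))]
    return probs, [1.0 if rng.random() < p else 0.0 for p in probs]


def sample_visible(h, W, b_v, rng):
    """Return (activation probabilities, binary samples) of the visible units."""
    probs = [sigmoid(b_v[i] + sum(h[j] * W[i][j] for j in range(len(h))))
             for i in range(len(b_v))]
    return probs, [1.0 if rng.random() < p else 0.0 for p in probs]


def map_gibbs(shard, W, b_v, b_h, rng):
    """Map step (assumed CD-1): one Gibbs step per sample, summed statistics out."""
    n_v, n_h = len(b_v), len(b_h)
    dW = [[0.0] * n_h for _ in range(n_v)]
    db_v = [0.0] * n_v
    db_h = [0.0] * n_h
    for v0 in shard:
        ph0, h0 = sample_hidden(v0, W, b_h, rng)   # positive phase
        _, v1 = sample_visible(h0, W, b_v, rng)    # reconstruction
        ph1, _ = sample_hidden(v1, W, b_h, rng)    # negative phase
        for i in range(n_v):
            db_v[i] += v0[i] - v1[i]
            for j in range(n_h):
                dW[i][j] += v0[i] * ph0[j] - v1[i] * ph1[j]
        for j in range(n_h):
            db_h[j] += ph0[j] - ph1[j]
    return dW, db_v, db_h, len(shard)


def reduce_update(partials, W, b_v, b_h, lr=0.1):
    """Reduce step: add every mapper's statistics, averaged over all samples."""
    total = sum(p[3] for p in partials)
    for dW, db_v, db_h, _ in partials:
        for i in range(len(b_v)):
            b_v[i] += lr * db_v[i] / total
            for j in range(len(b_h)):
                W[i][j] += lr * dW[i][j] / total
        for j in range(len(b_h)):
            b_h[j] += lr * db_h[j] / total
    return W, b_v, b_h


# Toy run: 3 visible units, 2 hidden units, two shards standing in for two mappers.
rng = random.Random(0)
W = [[0.0] * 2 for _ in range(3)]
b_v, b_h = [0.0] * 3, [0.0] * 2
shards = [[[1.0, 0.0, 1.0], [1.0, 1.0, 0.0]], [[0.0, 0.0, 1.0]]]
partials = [map_gibbs(s, W, b_v, b_h, rng) for s in shards]
W, b_v, b_h = reduce_update(partials, W, b_v, b_h)
```

Because the mappers only read the parameters and emit additive statistics, shards can be processed independently, which is what makes the Gibbs sampling phase a natural fit for the Map side. In a real Hadoop job, one such map/reduce pass would correspond to one training iteration, and the inner sums over units are the part that would be offloaded to the GPU.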
