Research on Efficient Data Filtering Based on FPGA

Chen Baoyuan (陈宝远); Zhang Xiuzhi (张秀芝); Liang Zhuang (梁状)

Abstract

As data volumes grow explosively, processing massive data sets within a reasonable time has become increasingly important for extracting the value they contain. Hadoop has recently attracted great attention from both industry and academia thanks to its high reliability, high scalability, and MapReduce programming model. However, Hadoop was originally designed for large-scale aggregation tasks, which typically need to scan the entire data set, and this scan-based style of processing is relatively inefficient. To improve Hadoop's data processing efficiency, an FPGA is added to the Hadoop cluster to filter the data stored in the Hadoop Distributed File System (HDFS), so that a MapReduce program only has to apply its user-defined operations to the filtered data and avoids scanning all of it. Experiments show that, when processing massive data, the FPGA filters out a significant amount of useless data and thereby improves data processing efficiency.
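The filter-then-process idea described in the abstract can be illustrated with a minimal software sketch. This is not the authors' implementation: the FPGA stage is replaced by a plain Python generator, and the record format, the `prefilter` and `run_job` names, and the sample predicate are all assumptions made for illustration only.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Iterator, Tuple


def prefilter(records: Iterable[str], predicate: Callable[[str], bool]) -> Iterator[str]:
    """Software stand-in (assumption) for the FPGA filter: pass only matching records."""
    for record in records:
        if predicate(record):
            yield record


def map_phase(records: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Toy user-defined map step: emit (first token, 1) for each record."""
    for record in records:
        key, _, _ = record.partition(" ")
        yield key, 1


def reduce_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Toy user-defined reduce step: sum the values per key."""
    counts: Counter = Counter()
    for key, value in pairs:
        counts[key] += value
    return dict(counts)


def run_job(records: Iterable[str], predicate: Callable[[str], bool]):
    """Filter first, then run map/reduce only on the records that survive."""
    all_records = list(records)
    kept = list(prefilter(all_records, predicate))
    result = reduce_phase(map_phase(kept))
    return result, len(kept), len(all_records)


# Hypothetical sample data; the real system reads blocks stored in HDFS.
SAMPLE = [
    "ERROR disk full",
    "INFO heartbeat",
    "INFO heartbeat",
    "ERROR timeout",
    "WARN slow response",
]

if __name__ == "__main__":
    result, kept, total = run_job(SAMPLE, lambda r: r.startswith("ERROR"))
    print(result, f"map/reduce saw {kept} of {total} records")
```

In this toy run the map and reduce steps touch only the records that pass the predicate, which is the effect the abstract attributes to placing the filter ahead of MapReduce; the hardware offload itself is not modeled here.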
