
[PDF] Towards a Deduplication Framework utilising Apache Spark

[Date: 2015-03-19]

Author: Niklas Wilcke



This paper presents a new framework called DeduPlication (DduP). DduP aims to solve large-scale deduplication problems on arbitrary data tuples and to bridge the gap between big data, high performance, and duplicate detection.


Related news: Record Linkage, Duplicate Detection, Deduplication
