Distributed Filesystem Ning Peng

Definition
A file system (often also written as filesystem) is a method of storing and organizing computer files and their data. Essentially, it organizes these files into a database for storage, organization, manipulation, and retrieval by the computer's operating system.

Challenge
- NTFS, ext3, and similar file systems work well.
- NFS is also good enough in many situations.
- But…

Challenge: higher capacity
- Large media files: petabytes (PB)
- Massive numbers of files: trillions

Challenge: stability and reliability
- Disk failures
- Component failures: hosts, interfaces, controllers, power supplies, network ...

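How to scale
- This talk should really be called "Scale Your Storage".
- That topic is too big: it involves applications, hardware, and networking.
- So we start with a simple look at distributed file systems.
- We also cover some poor man's solutions.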

A look at the file systems we have
There are too many of them, so let's look at the categories:
- Disk file systems: ext3, NTFS, ZFS, WAFL (most of the ones we are familiar with)
- File systems with built-in fault tolerance: ZFS, Btrfs
- File systems optimized for flash memory and solid-state media
- Record-oriented file systems
- Shared disk file systems
- Distributed file systems
- Distributed fault-tolerant file systems
- Distributed parallel file systems
- Distributed parallel fault-tolerant file systems: Google File System, CloudStore, Lustre, HDFS
- Peer-to-peer file systems
- Special-purpose file systems
- Pseudo and virtual file systems
- Encrypted file systems

Google File System and similar systems
- Google File System
- KFS
- HDFS
Why "similar"?
- Master/chunk architecture
- POSIX-like interface
- The same design goals

GFS Goal
- Originally designed for applications such as web crawlers.
- The system is built from many inexpensive commodity components that often fail.
- The system stores a modest number of large files.
- The workloads consist mainly of large streaming reads and small random reads.
- The workloads also have many large, sequential writes that append data to files.
- High sustained bandwidth is more important than low latency.

GFS Architecture
- Files are split into chunks of 64 MB. Why this size?
- The master manages the metadata: namespace, file attributes, chunks and their locations.
- Single master? Yes, because it is simple.
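With a fixed chunk size, a client can work out which chunk a byte offset falls in before it ever asks the master for locations. The following is a minimal sketch of that arithmetic only; the function names `locate` and `chunks_for_range` are made up for illustration and are not part of any GFS API.

```python
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB, the chunk size named on the slide


def locate(offset: int) -> tuple[int, int]:
    """Map a byte offset in a file to (chunk index, offset inside that chunk)."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return divmod(offset, CHUNK_SIZE)


def chunks_for_range(offset: int, length: int) -> range:
    """Return the chunk indexes touched by reading `length` bytes at `offset`."""
    if length <= 0:
        return range(0)
    first, _ = locate(offset)
    last, _ = locate(offset + length - 1)
    return range(first, last + 1)
```

A read that straddles a chunk boundary touches two chunks, so the client would need locations for both.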

Master design
- All metadata is kept in memory.
- Each 64 MB chunk -> 64 bytes of metadata
- Operation log and checkpoints
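The 64 MB to 64 bytes ratio makes it easy to estimate how much master memory the chunk metadata needs. This is a back-of-the-envelope sketch under the assumption that every chunk costs exactly 64 bytes; it ignores namespace entries and any other per-file overhead, and the function name is hypothetical.

```python
CHUNK_SIZE = 64 * 2**20      # 64 MB per chunk
METADATA_PER_CHUNK = 64      # bytes of master memory per chunk (figure from the slide)


def master_memory_bytes(stored_bytes: int) -> int:
    """Estimate chunk-metadata memory on the master for a given amount of data."""
    chunks = -(-stored_bytes // CHUNK_SIZE)  # ceiling division: partial chunks count
    return chunks * METADATA_PER_CHUNK


PIB = 2**50
print(master_memory_bytes(PIB) / 2**30, "GiB of chunk metadata per PiB stored")
```

Under these assumptions, one PiB of data needs about 1 GiB of chunk metadata, which is why keeping everything in memory is feasible at first and becomes a limit as capacity grows.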

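Consistency
- Everything controlled by the master is consistent: creating files, deleting files, and so on.
- Serial writes: consistent and defined
- Concurrent writes: consistent but undefined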

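GFS limitations
- An obvious single point of failure and performance bottleneck
- Growing from hundreds of TB to tens of PB means more metadata, more capacity, and more time-consuming housekeeping work.
- Limits on client concurrency
Solution
- Multi-cell / partitioning: solve the problem at the namespace level.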

MogileFS
- Application level
- No single point of failure
- Automatic file replication
- "Better than RAID"
- Flat namespace
- Shared-nothing / no RAID required
- Local filesystem agnostic

MogileFS Overview
- Application
- Tracker and DB: metadata and location; need HA
- Storage nodes: NFS or HTTP transport

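MogileFS characteristics
- Similar to the Google File System-like systems
- The metadata it keeps is different.
- Chunk replication vs. file replication
Limitation
- The DB is the main bottleneck.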

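FastDFS
- FastDFS is a lightweight, open-source distributed file system.
- It mainly addresses large-capacity file storage and highly concurrent access, and it load-balances file reads and writes.
- It implements RAID in software and can use inexpensive IDE disks for storage.
- Storage servers can be added online.
- Files with identical content are stored only once, which saves disk space.
- It can only be accessed through the client API; POSIX access is not supported.
- It is especially suited to medium and large websites for storing resource files (images, documents, audio, video, and so on).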
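Storing files with identical content only once is commonly done by keying the stored bytes on a content digest. The sketch below illustrates that general idea with an in-memory store; it is an assumption about the technique, not a description of how FastDFS implements it, and `DedupStore` is a hypothetical class.

```python
import hashlib


class DedupStore:
    """Toy in-memory store that keeps one copy of each distinct content."""

    def __init__(self):
        self._blobs = {}  # content digest -> bytes, one entry per distinct content
        self._names = {}  # file name -> content digest

    def put(self, name: str, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(digest, data)  # keep the existing copy if present
        self._names[name] = digest
        return digest

    def get(self, name: str) -> bytes:
        return self._blobs[self._names[name]]

    def stored_bytes(self) -> int:
        """Bytes actually held, counting duplicate content once."""
        return sum(len(blob) for blob in self._blobs.values())
```

Two names that point to the same bytes share one stored copy, so `stored_bytes()` grows only when new content arrives.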

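FastDFS components
- Tracker server: mainly does scheduling and load-balances access. It records the state of the storage servers and is the hub that connects clients and storage servers.
- Storage server: both the physical file content and the metadata are kept on the storage servers.
- Storage servers are organized into groups (volumes).
- Storage servers in the same group hold the same files.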

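FastDFS basic flow: PUT
1. The client asks the tracker which storage server to upload to; no extra parameters are needed.
2. The tracker returns an available storage server.
3. The client talks to the storage server directly to upload the file.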

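FastDFS basic flow: GET
1. The client asks the tracker which storage server to download from; the parameter is the file identifier (volume name and file name).
2. The tracker returns an available storage server.
3. The client talks to the storage server directly to download the file.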
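The PUT and GET flows can be modeled in a few lines to show who talks to whom. This is an in-memory simulation under simplifying assumptions: the classes `Tracker` and `Storage` and the functions `put` and `get` are hypothetical and do not reflect the real FastDFS client API, the tracker picks servers round-robin or at random, and replication inside a group is done synchronously.

```python
import itertools
import random
import uuid


class Storage:
    """A storage server; servers in the same group hold the same files."""

    def __init__(self, group: str, name: str):
        self.group = group
        self.name = name
        self.files = {}   # file name -> bytes
        self.peers = []   # other servers in the same group

    def upload(self, data: bytes) -> str:
        file_name = uuid.uuid4().hex          # the server, not the client, names the file
        self.files[file_name] = data
        for peer in self.peers:               # simplification: synchronous replication
            peer.files[file_name] = data
        return f"{self.group}/{file_name}"    # file identifier: group plus file name

    def download(self, file_name: str) -> bytes:
        return self.files[file_name]


class Tracker:
    """Only schedules: it never touches file content."""

    def __init__(self, storages):
        self.by_group = {}
        for server in storages:
            self.by_group.setdefault(server.group, []).append(server)
        for members in self.by_group.values():
            for server in members:
                server.peers = [m for m in members if m is not server]
        self._upload_cycle = itertools.cycle(storages)

    def storage_for_upload(self) -> Storage:
        return next(self._upload_cycle)

    def storage_for_download(self, group: str) -> Storage:
        return random.choice(self.by_group[group])


def put(tracker: Tracker, data: bytes) -> str:
    storage = tracker.storage_for_upload()    # steps 1-2: ask the tracker
    return storage.upload(data)               # step 3: talk to storage directly


def get(tracker: Tracker, file_id: str) -> bytes:
    group, file_name = file_id.split("/", 1)
    storage = tracker.storage_for_download(group)
    return storage.download(file_name)
```

Because every server in a group holds the same files, the tracker can hand back any member of the group for a download.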

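FastDFS characteristics
- Extremely simple and efficient
- Peer structure, with access organized by group
- The tracker's job is simple: it only does scheduling, so its performance is not a problem.
- Files are not split into chunks, which suits storing large numbers of small files.
- Servers in the same group hold exactly the same files.
- There is no dynamic load balancing between groups at run time.
- File names are assigned by FastDFS, not by the client.
- It places higher demands on the client application, which limits where it can be used.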

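DFS design considerations
Tradeoff … tradeoff
- High throughput, IOPS
- Massive numbers of files, large files
- Reliability, availability
- Ease of development, ease of management, cost
- My files: type, size, access pattern
- My need for shared access, especially consistency
- My need for parallel access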

Where is my performance bottleneck?
- Storage media
- Read/write flow
- Storage path
- Cache

Analysis of some DFS factors
- First: do I really need it?
- POSIX interface
- Access mode support
- RAID / chunk replication / file replication
- Namespace management
- Metadata management
- Out-of-band mode
- Cache
- Access interface
- Metadata server topology: single / active-standby / peer / hash or ……

Reference & Thanks
- Wikipedia
- gfs-sosp2003.pdf
- http://www.danga.com/mogilefs/
- FastDFS: 余庆

Q & A
msn: pengning_cn #AT# hotmail.com
Thank you, everyone!
