This morning I applied for a Summer of Code project. My proposal follows:
0 Background
============
Since Nov 2005, Sun Microsystems' Zettabyte Filesystem (ZFS) has proven to be a
major breakthrough in filesystem technology, supporting features such as disk
pools, snapshots, and "unlimited" scalability, along with various performance
improvements over prior art. Such advancements will make ZFS a major force in
industry in coming years.
Writing a BSD-licensed implementation of ZFS from scratch is impractical within
the constraints of Google's Summer of Code (SoC). However, porting the CDDL'd
ZFS implementation present in OpenSolaris is feasible. In particular, Pawel
Jakub Dawidek has been able to port a significant portion [1] of ZFS to
FreeBSD. His work provides a good basis for porting ZFS to NetBSD.
Furthermore, OpenSolaris has published a high-level roadmap [3] for porting ZFS
to other platforms. This outlines a path for success, which I shall follow.
1 Objectives
============
While Pawel's work provides a valuable starting point for much of my work,
numerous FreeBSD interfaces differ from those in NetBSD, including the virtual
memory (vm(9)/uvm(9)), scheduler(9), driver(9), and file system (vfs(9)) kernel
interfaces. However, most userland tools require only a minimal amount of work
to bring into NetBSD.
The primary objectives of this project are:
- A functional zpool implementation
- Documentation and plan for future development
Secondary objectives include:
- /dev/zfs
- libzfs
- ZVOL
1.1 zpool and testing functionality
-----------------------------------
Sun's storage pool abstraction, zpool, offers significant advantages over
classical logical volume interfaces such as NetBSD's ld(4). Porting this
component is the first and foremost objective of this project. It comprises
roughly 80% of the kernel-land code necessary for a functional ZFS
implementation.
Unfortunately, Pawel's port of zpool relies on FreeBSD's geom(4) interface,
which is not present in NetBSD, for virtual device (VDEV) functionality. While
NetBSD's VDEV implementation will be fairly straightforward compared to
FreeBSD's, it will still require a significant amount of newly written code.
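To make the shape of that new code more concrete, here is a minimal userland
sketch of a virtual-device layer built around a per-backend table of
operations. It is an illustration only: every name in it (xvdev, xvdev_ops,
mem_backing, and so on) is hypothetical and is not taken from OpenSolaris,
FreeBSD, or NetBSD, and a memory buffer stands in for a real disk.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct xvdev;

/* Hypothetical per-backend operations table (not the OpenSolaris one). */
struct xvdev_ops {
    int  (*op_open)(struct xvdev *vd, size_t *sizep);
    int  (*op_read)(struct xvdev *vd, void *buf, size_t off, size_t len);
    int  (*op_write)(struct xvdev *vd, const void *buf, size_t off, size_t len);
    void (*op_close)(struct xvdev *vd);
};

/* Generic device handle: the pool code would only ever see this. */
struct xvdev {
    const struct xvdev_ops *ops;
    void   *priv;     /* backend-private state */
    size_t  size;     /* filled in by op_open */
    int     is_open;
};

/* Toy backend: a caller-supplied memory buffer stands in for a disk. */
struct mem_backing {
    unsigned char *base;
    size_t         size;
};

/* Returns nonzero when [off, off + len) lies inside the buffer. */
static int mem_in_range(const struct mem_backing *mb, size_t off, size_t len)
{
    return off <= mb->size && len <= mb->size - off;
}

static int mem_open(struct xvdev *vd, size_t *sizep)
{
    const struct mem_backing *mb = vd->priv;
    *sizep = mb->size;
    return 0;
}

static int mem_read(struct xvdev *vd, void *buf, size_t off, size_t len)
{
    const struct mem_backing *mb = vd->priv;
    if (!mem_in_range(mb, off, len))
        return -1;
    memcpy(buf, mb->base + off, len);
    return 0;
}

static int mem_write(struct xvdev *vd, const void *buf, size_t off, size_t len)
{
    struct mem_backing *mb = vd->priv;
    if (!mem_in_range(mb, off, len))
        return -1;
    memcpy(mb->base + off, buf, len);
    return 0;
}

static void mem_close(struct xvdev *vd)
{
    (void)vd; /* nothing to release for a borrowed buffer */
}

const struct xvdev_ops mem_ops = { mem_open, mem_read, mem_write, mem_close };

/* Generic layer: dispatches through the table and tracks open state. */
int xvdev_open(struct xvdev *vd)
{
    int error;

    assert(vd != NULL && vd->ops != NULL);
    error = vd->ops->op_open(vd, &vd->size);
    vd->is_open = (error == 0);
    return error;
}

int xvdev_read(struct xvdev *vd, void *buf, size_t off, size_t len)
{
    return vd->is_open ? vd->ops->op_read(vd, buf, off, len) : -1;
}

int xvdev_write(struct xvdev *vd, const void *buf, size_t off, size_t len)
{
    return vd->is_open ? vd->ops->op_write(vd, buf, off, len) : -1;
}

void xvdev_close(struct xvdev *vd)
{
    if (vd->is_open) {
        vd->ops->op_close(vd);
        vd->is_open = 0;
    }
}
```

In this sketch, a caller fills in a struct xvdev with &mem_ops and a
mem_backing, calls xvdev_open(), and then reads and writes by offset. A disk-
or file-backed backend would supply a different table without touching the
generic layer.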
The ZFS I/O Pipeline (ZIO) and Adaptive Replacement Cache (ARC) must be adapted
to NetBSD's virtual memory and mutex interfaces.
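As a rough illustration of the mutex side of that work, the sketch below wraps
a native lock behind a Solaris-style call shape, in which initialization takes
a name, a type, and an opaque argument. It is a userland stand-in built on
POSIX threads, not kernel code: the zmutex_* names are hypothetical, and the
type and argument parameters are accepted and ignored purely to show where a
kernel shim would translate them for NetBSD's mutex(9).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical shim type; a kernel port would wrap the native mutex instead. */
typedef struct zmutex {
    pthread_mutex_t lock;
    pthread_t       owner;  /* valid only while held != 0 */
    int             held;
    const char     *name;   /* kept for debugging output only */
} zmutex_t;

/* Solaris-style shape: name, type, and arg. Type and arg are ignored here. */
void zmutex_init(zmutex_t *mp, const char *name, int type, void *arg)
{
    (void)type;
    (void)arg;
    assert(mp != NULL);
    pthread_mutex_init(&mp->lock, NULL);
    mp->held = 0;
    mp->name = name;
}

/* Nonzero only when the calling thread currently holds the lock. */
int zmutex_held(zmutex_t *mp)
{
    return mp->held && pthread_equal(mp->owner, pthread_self());
}

void zmutex_enter(zmutex_t *mp)
{
    pthread_mutex_lock(&mp->lock);
    mp->owner = pthread_self();
    mp->held = 1;
}

void zmutex_exit(zmutex_t *mp)
{
    assert(zmutex_held(mp));  /* catch unlock by a non-owner early */
    mp->held = 0;
    pthread_mutex_unlock(&mp->lock);
}

void zmutex_destroy(zmutex_t *mp)
{
    assert(!mp->held);
    pthread_mutex_destroy(&mp->lock);
}
```

Tracking the owner in the wrapper lets ported code keep its "do I hold this
lock?" assertions even where the underlying primitive does not expose one.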
Porting ztest, and therefore zdb, will give me userland components with which I
can accurately test my work throughout the remainder of the project.
1.2 Userland management tools
-----------------------------
ZFS's userland management tools, zpool(1) and zfs(1), rely on the /dev/zfs
device. The kernel device will differ from prior implementations; however, the
userland tools should not.
1.3 ZFS Emulated Volume
-----------------------
The ZFS Emulated Volume component (ZVOL) enables raw devices from the storage
pool to be presented to userland device consumers. It therefore requires a
NetBSD-specific device driver.
1.4 Filesystem Internals
------------------------
The ZFS POSIX Layer (ZPL) is significantly nuanced and OS-specific. Compared
with the rest of ZFS, it is a very small but very difficult component. I shall
compose a detailed blueprint, in the form of a development plan (as described
below), so that minimal work will be required to complete the ZFS filesystem
layer.
1.5 Development Plans
---------------------
As the adage goes, he who fails to plan plans to fail. Throughout the
entire project, I shall author an iterative development plan.
Development plans will include the following:
- Function-call diagrams (as needed)
- Updated Gantt charts illustrating dependencies and contingencies within the
  project timeframe
- Required functionality upon completion
- Optional functionality upon completion
- File- and function-level summaries of work needed to successfully port each
  portion of code
- Other specific documentation requested by mentors
By the end of the project, the iterative development plans shall outline the
work that has been completed and thus provide future ZFS maintainers with
detailed documentation of NetBSD-specific features and bugs.
2 Deliverables
==============
In addition to project milestones, I shall maintain a project journal that
outlines day-to-day goals and achievements. Weekly summaries based on my
written logs and CVS commit messages will enable the project mentors to provide
meaningful feedback regarding my work and prevent potential problems from going
unnoticed.
2.1 Milestones
--------------
2007/05/28: [Project start]
2007/06/03: Initial development plan
2007/06/17: Storage Pool Allocator (VDEV, ARC, & ZIO) complete
2007/06/24: Data Management Unit complete
2007/07/01: libzpool, ztest, and zdb complete
            [Primary zpool implementation complete]
2007/07/08: Phase 2 development plan outlining userland management tools,
            including:
            - zfs(1)
            - zpool(1)
            - /dev/zfs
2007/07/09: [Project midterm]
2007/07/22: Phase 2 implementation complete
2007/07/29: Phase 3 development plan outlining ZVOL
2007/08/18: 'Final' development plan blueprinting ZPL
2007/08/20: Project summary and lessons learned
            [Project complete]
3 About Me
==========
I received my bachelor's degree in Computer Science from Stevens Institute of
Technology (SIT), where I am currently pursuing a master's degree in Systems
Engineering while working on the School of Sciences and Arts academic system
administration team. Through the university, I have access to hardware on which
to test my work under Solaris and NetBSD. My current resume is available here:
http://www.olix0r.net/resume.pdf
I have been tracking NetBSD-current for over a year and follow many NetBSD
lists regularly. I have taken courses with Jan Schaumann and Hubert Feyrer,
worked under Jan for over a year, and worked with Thor Simon while he consulted
for SIT. I have contacted each of these NetBSD developers while preparing this
proposal. I also live in close proximity to Thor, which enables me to meet with
him to discuss project matters.
While completing my undergraduate degree, I worked on an eight-month senior
design project designing, implementing (10K+ LOC), and documenting an intrusion
prevention system in C. I have also worked on selective summer-long research
projects with university faculty.
Recently, I have begun to port Fabrice Bellard's KQEMU Loadable Kernel Module
from FreeBSD to NetBSD. Through this project, I have gained familiarity with
NetBSD's lkm(9), driver(9), uvm(9), and scheduler(9) interfaces. At this point,
my port of KQEMU has been delayed due to changing customer (SIT) priorities.
4 References
============
[1] Pawel Dawidek's initial post to freebsd-current regarding his ZFS port
[2] NetBSD's ZFS SoC project description
[3] ZFS Porting page
[4] ZFS Source Tour
[5] Solaris ZFS Administration Guide
[6] ZFS On-Disk Specification [Draft]
$Id: ZFS_PROPOSAL.txt,v 1.9 2007/03/26 11:41:20 ogould Exp $