Non-Volatile Memories (SSD Design and Applications)
NAND flash memory has emerged as a promising storage technology with enormous potential as an alternative to traditional hard disks. SSDs (Solid State Drives) have recently been widely adopted as the main storage media not only in mobile devices but also in large-scale enterprise servers, owing to their better random access performance, low power consumption, high shock resistance, and portability.
- A New FTL Design: The Flash Translation Layer (FTL) is a software/hardware interface inside NAND flash memory and a core part of SSD design. We designed a new hybrid FTL (called CFTL) that adapts to workloads, and we developed efficient caching schemes to further improve its performance.
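To make the role of an FTL concrete, the following is a minimal sketch of plain page-level address mapping with out-of-place writes. It is not CFTL or its caching schemes; the class name `PageMappedFTL` and all of its fields are hypothetical, and garbage collection is deliberately left out.

```python
class PageMappedFTL:
    """Toy page-level FTL: maps logical page numbers (LPN) to physical ones (PPN)."""

    def __init__(self, num_physical_pages):
        self.num_physical_pages = num_physical_pages
        self.mapping = {}      # LPN -> PPN of the current valid copy
        self.flash = {}        # PPN -> data (stands in for the NAND array)
        self.invalid = set()   # PPNs holding stale data, waiting to be erased
        self.next_free = 0     # next never-written physical page

    def write(self, lpn, data):
        if self.next_free >= self.num_physical_pages:
            # A real FTL would reclaim invalid pages here (garbage collection).
            raise RuntimeError("no free pages left")
        old_ppn = self.mapping.get(lpn)
        if old_ppn is not None:
            # Flash pages cannot be overwritten in place, so the old copy
            # is only marked invalid and the update goes to a fresh page.
            self.invalid.add(old_ppn)
        ppn = self.next_free
        self.next_free += 1
        self.flash[ppn] = data
        self.mapping[lpn] = ppn
        return ppn

    def read(self, lpn):
        # The host only sees logical addresses; the FTL resolves them.
        return self.flash[self.mapping[lpn]]
```

Writing the same logical page twice lands on two different physical pages, and only the mapping table tells a read which copy is current. The size of that table and the cost of looking it up are the kind of overhead that FTL caching schemes aim to reduce.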
Hot and Cold Data Identification
Hot data identification is of paramount importance in storage systems: it has a great impact on overall performance and strong potential for use in many other fields, such as cache algorithms, data placement schemes, sensor networks, and shingled write disks. Despite its impact on the design and integration of storage systems, it has received little investigation. The challenges are not only classification accuracy but also limited resources (i.e., SRAM) and computational overhead.
- Multiple Bloom Filter-based Scheme: We adopt multiple Bloom filters to capture recency as well as frequency. In addition to this novel scheme, we propose a more reasonable baseline algorithm, the Window-based Direct Address Counting (WDAC) scheme, to approximate ideal hot data identification.
- Sampling-based Scheme: We adopt a sampling technique for hot data identification. Sampling enables our scheme (called HotDataTrap) to discard some cold items early, which reduces runtime overhead and wasted memory space. HotDataTrap is therefore also well suited to storage systems with very limited resources.
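The idea of capturing both frequency and recency with several Bloom filters can be sketched as follows. This is a simplified illustration rather than the published scheme: the class name, the round-robin insertion and clearing policy, the use of SHA-256 as the hash, and every parameter value are assumptions chosen for readability.

```python
import hashlib


class MultiBloomHotDetector:
    """Toy hot-data detector built from several small Bloom filters."""

    def __init__(self, num_filters=4, bits=1024, num_hashes=2,
                 decay_interval=64, hot_threshold=3):
        # One byte per bit keeps the code simple; a real design would pack bits.
        self.filters = [bytearray(bits) for _ in range(num_filters)]
        self.bits = bits
        self.num_hashes = num_hashes
        self.decay_interval = decay_interval
        self.hot_threshold = hot_threshold
        self.writes = 0
        self.next_insert = 0   # round-robin cursor for recording writes
        self.next_clear = 0    # round-robin cursor for decay

    def _positions(self, lba):
        positions = []
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{lba}".encode()).digest()
            positions.append(int.from_bytes(digest[:8], "big") % self.bits)
        return positions

    @staticmethod
    def _contains(bloom, positions):
        return all(bloom[p] for p in positions)

    def record_write(self, lba):
        positions = self._positions(lba)
        n = len(self.filters)
        # Frequency: record the write in the next filter that has not seen
        # this address, so repeated writes spread across more filters.
        for step in range(n):
            idx = (self.next_insert + step) % n
            if not self._contains(self.filters[idx], positions):
                for p in positions:
                    self.filters[idx][p] = 1
                break
        self.next_insert = (self.next_insert + 1) % n
        self.writes += 1
        # Recency: periodically erase one filter so old history fades out.
        if self.writes % self.decay_interval == 0:
            self.filters[self.next_clear] = bytearray(self.bits)
            self.next_clear = (self.next_clear + 1) % n

    def hotness(self, lba):
        positions = self._positions(lba)
        return sum(self._contains(bloom, positions) for bloom in self.filters)

    def is_hot(self, lba):
        return self.hotness(lba) >= self.hot_threshold
```

Memory use is fixed at `num_filters * bits` regardless of how many addresses are seen, which is the appeal when SRAM is scarce. The price is that Bloom filter false positives can occasionally overstate the hotness of a cold address.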
Shingled Write Disks (SWD)
Due to the areal density limitation of traditional hard disk drives (HDDs), the Shingled Write Disk (SWD) has been proposed as a promising solution for achieving higher areal density without significantly changing manufacturing materials and processes. Specifically, an SWD achieves higher capacity by partially overlapping tracks. However, because the tracks are shingled, writing one track inevitably overwrites data stored on subsequent tracks, and this characteristic prevents an SWD from supporting in-place updates. Therefore, innovative architectures and solutions for improving SWD performance are required.
- A New SWD Algorithm Design: We are designing a new hot data identification scheme and applying it to a novel shingled write disk design (named H-SWD). This technique can significantly reduce total block movements and thereby improve garbage collection performance compared to the existing SWD design. Furthermore, we extend our H-SWD design for better performance.
- A New Mapping Scheme Design: Because an SWD does not support in-place updates, it requires an address mapping scheme similar to the Flash Translation Layer (FTL) in NAND flash-based SSDs. We explore a novel address mapping scheme that best fits our H-SWD design.
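The cost of the missing in-place update, and the reason a mapping layer helps, can be illustrated with a small model. It assumes tracks are grouped into bands and that rewriting a track forces every later track in the same band to be rewritten as well. It is not the H-SWD design or its mapping scheme, and the names `tracks_rewritten_in_place` and `LogStructuredBand` are hypothetical.

```python
def tracks_rewritten_in_place(track_index, tracks_per_band):
    """Tracks that must be written to update one track in place.

    Writing a track clobbers the next one, which must then be rewritten,
    and so on until the end of the band.
    """
    if not 0 <= track_index < tracks_per_band:
        raise ValueError("track_index outside the band")
    return tracks_per_band - track_index


class LogStructuredBand:
    """Toy band that appends updates and remaps instead of updating in place."""

    def __init__(self, tracks_per_band):
        self.tracks = [None] * tracks_per_band
        self.write_pointer = 0   # next physical track; always moves forward
        self.mapping = {}        # logical track -> physical track
        self.stale = set()       # physical tracks holding outdated data

    def write(self, logical_track, data):
        if self.write_pointer >= len(self.tracks):
            # A real design would clean the band (garbage collection) here.
            raise RuntimeError("band is full")
        old = self.mapping.get(logical_track)
        if old is not None:
            self.stale.add(old)
        physical = self.write_pointer
        self.tracks[physical] = data
        self.mapping[logical_track] = physical
        self.write_pointer += 1
        return physical

    def read(self, logical_track):
        return self.tracks[self.mapping[logical_track]]
```

Under this model, an in-place update near the start of a band rewrites almost the whole band, while the appended update writes a single track and defers the cost to a later cleaning pass. The stale tracks left behind are what garbage collection must eventually reclaim, which is where separating hot from cold data can reduce block movements.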
Data Deduplication
Data deduplication (dedupe for short) is a specialized data compression technique that eliminates coarse-grained redundant data. It is widely adopted to reduce storage consumption by retaining only one unique instance of data on storage media and replacing subsequent redundant data with a pointer to that unique instance. Given the exponential data growth that most data centers are experiencing these days, data deduplication has received a great deal of attention from storage vendors and IT administrators alike. Traditionally, most dedupe research has focused only on improving write performance during the dedupe process, and little attention has been paid to improving read performance.
- A Novel Dedupe Algorithm Design: We designed a novel dedupe algorithm that guarantees a required read performance in dedupe storage by monitoring a read performance indicator (named Chunk Fragmentation Level, CFL) while also preserving write performance. Our dedupe system judiciously writes non-unique (shared) data chunks to storage together with unique chunks according to a selective deduplication threshold value.
- A Novel Dedupe Cache Design: We are designing a novel dedupe cache to significantly improve read performance in dedupe storage. This design tries to make the best use of a special feature of the data deduplication process. In addition, we are extending our basic scheme based on our extensive observations of dedupe processes.
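The basic mechanism described above, keeping one unique instance of each chunk and replacing duplicates with pointers, can be sketched as follows. Fixed-size chunking, SHA-256 fingerprints, and the `DedupeStore` name are assumptions made for the example. The sketch does not implement the CFL-based selective deduplication or the dedupe cache.

```python
import hashlib


class DedupeStore:
    """Toy chunk store: each unique chunk is kept once, keyed by its fingerprint."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size
        self.chunks = {}   # fingerprint -> chunk bytes (the single unique instance)

    def write(self, data):
        """Store data and return its recipe: the ordered list of chunk pointers."""
        recipe = []
        for offset in range(0, len(data), self.chunk_size):
            chunk = data[offset:offset + self.chunk_size]
            fingerprint = hashlib.sha256(chunk).hexdigest()
            # A redundant chunk is not stored again; only a pointer is recorded.
            self.chunks.setdefault(fingerprint, chunk)
            recipe.append(fingerprint)
        return recipe

    def read(self, recipe):
        """Rebuild the original data by following every pointer in order."""
        return b"".join(self.chunks[fingerprint] for fingerprint in recipe)
```

A read must follow every pointer in the recipe, and on real media the shared chunks it refers to may have been written long before, far from the stream's own unique chunks. That scattering is the read-side cost that writing some shared chunks again alongside unique ones is meant to limit.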
Smart Home Storage
As home digital data grow rapidly and some of these data require storage management by household members, smart home storage design becomes a critical issue. High-quality multimedia content will further accelerate the growth of home digital data. These data, mostly multimedia data for entertainment and backup data, require almost the same types of storage management as enterprise digital data, such as backup, capacity planning, and even long-term data preservation. However, unlike in enterprise storage systems, family members themselves must be in charge of home storage management.
- A Virtual USB Drive: We introduce a virtual USB drive as a key component of the smart home storage design. The virtual USB drive adopts a USB interface and works exactly like typical USB flash memory. Through the virtual USB drive, any home digital device can not only synchronize its digital data to pre-designated storage but also access them seamlessly.