Welcome!

I am Baoquan Zhang, a Ph.D. candidate (Sep. 2015 - present) in Computer Science at the University of Minnesota Twin Cities, advised by Professor David H.C. Du. I am currently a member of CRIS (Center for Research in Intelligent Storage) at UMN, working on application and system co-design, memory/storage systems, key-value stores, etc.

Before joining UMN, I received B.E. (Sep. 2008 - July 2012) and M.S. (Sep. 2012 - April 2015) degrees in Computer Science from Harbin Engineering University, China. As a Research Assistant (June 2013 - April 2015), I worked on cloud storage and distributed systems in the storage systems group at the Research Institute of Information Technology, Tsinghua University, Beijing, advised by Prof. Dongsheng Wang.

Email: zhan4281@umn.edu

Working Experience

Dell EMC, Summer Intern      Eden Prairie, Minnesota, May 22, 2017 -- Aug. 11, 2017

Performance improvement of the I/O tracing module in the OS of Dell storage controllers (C++)
The existing solution traces I/O requests with a global log, introducing over 80% performance degradation. To improve its performance, my project makes the following contributions:

  • Identify the global lock used for tracing serialization as the most significant bottleneck.
  • Implement a scalable log pool that traces and searches in parallel, eliminating the global lock.
  • Reduce the performance degradation to less than 5% for certain workloads.
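
The log-pool idea in the bullets above can be sketched as follows. This is only an illustration in Python, not the C++ code from the project: the class `LogPool`, its method names, and the choice of mapping threads to shards by thread id are all hypothetical. It shows the general technique of replacing one globally locked log with several independently locked logs that are searched concurrently and merged by timestamp.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class LogPool:
    """Hypothetical trace log split into independently locked shards."""

    def __init__(self, num_shards=8):
        self._shards = [[] for _ in range(num_shards)]
        self._locks = [threading.Lock() for _ in range(num_shards)]

    def trace(self, record):
        # Pick a shard from the calling thread's id, so a thread only
        # contends with other threads that map to the same shard.
        idx = threading.get_ident() % len(self._shards)
        with self._locks[idx]:
            self._shards[idx].append((time.monotonic_ns(), record))

    def _search_shard(self, idx, predicate):
        with self._locks[idx]:
            return [e for e in self._shards[idx] if predicate(e[1])]

    def search(self, predicate):
        # Scan the shards concurrently, then merge the hits by timestamp.
        n = len(self._shards)
        with ThreadPoolExecutor(max_workers=n) as pool:
            parts = list(pool.map(lambda i: self._search_shard(i, predicate), range(n)))
        merged = sorted((e for part in parts for e in part), key=lambda e: e[0])
        return [record for _, record in merged]
```

In CPython the shard scans interleave rather than run truly in parallel; the sketch is meant to show the locking structure, not to reproduce the reported numbers.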
Tsinghua University, Full-Time Research Assistant      Beijing, China, May 2013 -- April 2015

Construction of geo-distributed data management systems (Python, Java)
This project aims to construct a geo-distributed data management system that supports data retrieval (including selection and join) across different data centers. My contributions are the following:

  • Deploy hybrid systems combining MySQL with Impala on HDFS in the same data center and migrate data among databases and data warehouses.
  • Implement data selections and joins across different data centers.
  • The system is used by the government and deployed in data centers in Suzhou and Beijing.

Research Experience

Large-Scale Non-Volatile Memory (NVM) Systems (Since Sep. 2017)
Certain types of NVM, e.g., STT-RAM, provide DRAM-like speed and HDD-like persistence. With these new features, large-scale systems that converge memory and storage using NVM become feasible for processing large amounts of data, providing memory-speed and byte-addressable services. Based on this architecture, my research plan includes:
  • NVMLevel: A design of LevelDB on single-level store
    • Merge Level0 with the MemTable and partition Level0 into disjoint key ranges.
    • Move Bloom filters from SSTables into level indexes to accelerate search.
    • Compact levels based on key ranges instead of SSTables to reduce write amplification.
  • Space-Efficient Data Redundancy: Tradeoff between replicas and erasure codes
    • Utilize hybrid replicas/erasure codes for large key-value pairs.
    • Keep the full data in the primary replica and erasure codes in the secondary replicas.
  • Improving atomic updates in NVM Library (NVML)
    • Deploy a dynamic strategy combining undo logging with copy-on-write to achieve atomic updates.
    • Predict overheads to decide whether to log the old value or use copy-on-write.
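
The choice between undo logging and copy-on-write described in the last item can be illustrated with a toy cost comparison. Everything in this sketch is an assumption made for illustration: the `CostModel` fields, their default values, and the two cost formulas are hypothetical and are not taken from NVML or from this research. The only point is that a small change to a large object favors logging the old bytes, while a change that touches most of the object favors writing a fresh copy.

```python
from dataclasses import dataclass


@dataclass
class CostModel:
    """Hypothetical costs in arbitrary units (assumed, not measured)."""
    write_per_byte: float = 1.0
    log_fixed: float = 64.0   # log-entry metadata and ordering barriers
    cow_fixed: float = 16.0   # allocation plus atomic pointer swap


def undo_log_cost(dirty_bytes, model):
    # Old bytes are copied to the log, then new bytes are written in place.
    return model.log_fixed + 2 * dirty_bytes * model.write_per_byte


def cow_cost(object_bytes, model):
    # The whole object is written to a new location, then the pointer is swapped.
    return model.cow_fixed + object_bytes * model.write_per_byte


def choose_strategy(object_bytes, dirty_bytes, model=None):
    """Return the cheaper strategy name under the assumed cost model."""
    if not 0 <= dirty_bytes <= object_bytes:
        raise ValueError("dirty_bytes must be between 0 and object_bytes")
    model = model or CostModel()
    if undo_log_cost(dirty_bytes, model) < cow_cost(object_bytes, model):
        return "undo-log"
    return "copy-on-write"
```

For example, under these assumed defaults, updating 64 bytes of a 4096-byte object selects the undo log, while rewriting all 4096 bytes selects copy-on-write.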
Heterogeneous and Reliable Storage Systems (Since Sep. 2015)
Current large-scale storage systems deploy different types of storage devices, e.g., SSDs, HDDs, and SMR drives, which differ in performance and I/O characteristics. To efficiently utilize and manage these heterogeneous storage resources, my research includes:
  • SmartRAID -- A RAID-5/6 design on Shingled Magnetic Recording (SMR) drives
    • Identify the performance degradation of RAID-5 on SMR drives caused by data cleaning.
    • Design a RAID-5/6 that mitigates the performance degradation using an alternating spare drive, so that drives can be cleaned without influencing the performance of the whole array.
    • Evaluation shows that SmartRAID eliminates over 99% of the performance degradation.
  • Empirical Evaluations for the data cleaning in Host-Aware SMR drives (HA-SMR)
    • Study different types of data cleaning, including trigger conditions and performance impact.
    • Identify the data cleaning strategy and the factors influencing data cleaning time.
    • Propose an approach that artificially creates idle time and mitigates over 40% of the performance degradation.
  • Improving data integrity in Linux Software RAID using Protection Information (T10-PI)
    • Identify that the current Linux Multiple Device (MD) module loses the integrity data while passing data.
    • Deploy dedicated buffers to store, verify, and pass integrity data along with data.
    • Evaluations show that the integrity guarantee introduces about 5% -- 20% performance overhead.
  • Optimizing I/O scheduling in distributed systems for data-intensive computing
    • Realize load balancing using a mathematical model that considers device performance.
    • Merge data retrievals targeting the same data set within user-defined time windows.
    • Evaluations indicate that our method achieves over 30% performance improvement.
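
The merging of retrievals within a time window, mentioned in the last item above, can be sketched as follows. This is a simplified illustration, not the published scheduler: the function `merge_requests`, the request tuple layout, and the rule that a batch stays open for `window` time units after its first request are all assumptions, and the load-balancing model is not covered.

```python
def merge_requests(requests, window):
    """Group (arrival_time, dataset, query) requests into shared batches.

    Assumed rule: a batch for a data set stays open for `window` time
    units after its first request; a later request for the same data
    set opens a new batch.
    """
    batches = []
    open_batch = {}  # data set name -> most recent batch for it
    for arrival, dataset, query in sorted(requests, key=lambda r: r[0]):
        current = open_batch.get(dataset)
        if current is None or arrival - current["start"] > window:
            current = {"dataset": dataset, "start": arrival, "queries": []}
            open_batch[dataset] = current
            batches.append(current)
        # Requests in the same batch can be served by one pass over the data set.
        current["queries"].append(query)
    return batches
```

Each returned batch lists the queries that one scan of the data set would serve; a larger `window` trades extra waiting time for fewer scans.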

Publications

[1] Baoquan Zhang, XuChao Xie, David H.C. Du, "Evaluating Media Cache Cleaning and Improving I/O performance in Shingled Magnetic Recording (SMR) Drives" (under revision)
[2] Baoquan Zhang, David H.C. Du, "SmartRAID: A RAID-5 with Alternating Idle Shingled Magnetic Recording (SMR) Drives" (under revision)
[3] F. Wu, Z. Fan, M. C. Yang, B. Zhang, X. Ge and D. H. C. Du, "Performance Evaluation of Host Aware Shingled Magnetic Recording (HA-SMR) Drives," in IEEE Transactions on Computers, vol. 66, no. 11, pp. 1932-1945, Nov. 1 2017.
[4] Baoquan Zhang, Raghunath Raja Chandrasekar, Lance Evans, and David H.C. Du, "Improving Data Integrity in Linux MD RAID with T10 Protection Information (T10 PI)", Technical Report in CRIS Meeting, May 2016
[5] Zhang, Baoquan, Jingmei Li, Tao Xu, Dongsheng Wang, and Nan Ding. "Shared I/O scheduling in cloud for structured data processing." In Big Data and Cloud Computing (BdCloud), 2014 IEEE Fourth International Conference on, pp. 159-166. IEEE, 2014.
[6] Chinese Patent: A method and a system for data joining among geo-distributed cloud, Patent No. ZL201410081163.7

Miscellaneous

ADC Graduate Fellowship, University of Minnesota – Twin Cities, 2015 – 2016
National Fellowship of China, Harbin Engineering University, 2014 – 2015