Tang ShanJiang


Year of Graduation:

Project Title:
Performance and Reliability Modeling for Hadoop in Cloud Computing




Personal Website:

Area(s) of Expertise:
Parallel and Distributed Computing, Cloud Computing

Research Keyword(s):
Speedup, Replication, Hadoop, Amdahl’s Law

About My Research:

Hadoop is a file system-based batch processing framework. It can concurrently execute multiple jobs submitted by multiple tenants. Replicating data blocks is essential in a Hadoop system, both to guarantee fault tolerance and data availability and to increase concurrency. How to replicate the data, and how replication affects performance, are questions worth investigating. Our work focuses on modeling the performance and reliability of the Hadoop framework for various applications, with the aim of providing guidance to users.

My Projects:

  • MROrder: Automated MapReduce Job Ordering Optimization Prototype System.
  • DHFS: A dynamic slot allocation technique to improve performance for HFS.
  • DynamicMR: A Dynamic Slot Allocation and Scheduling System for MapReduce Clusters.