My research interests focus on performance tuning of parallel software and GPU computing.
Email: traits (dot) zhang (at) gmail (dot) com
1/2015–present, Research Fellow, Institute for Computational Engineering and Sciences (ICES), The University of Texas at Austin, Texas, U.S.
11/2009–9/2014, Research Associate, Institute of Software, Chinese Academy of Sciences, Beijing, China
7/2007–10/2009, Engineer, Institute of Software, Chinese Academy of Sciences, Beijing, China
2012.1-2014.9, Sub-project: High Performance Math Library for Domestic CPU, the National High-tech R&D Program of China (Grant No. 2012AA010903), Institute of Software, Chinese Academy of Sciences
2009.1-2011.12, Research and Development of Compiler System and Toolchain for Domestic CPU (No.2009ZX01036-001-002), National S&T Major Projects: Core Electronic Devices, High-end General Chips and Fundamental Software, Institute of Software, Chinese Academy of Sciences
2008.1-2012.4, Sub-project: High Performance LC-MS-based Protein Quantification Software Package, the Knowledge Innovation Program of the Chinese Academy of Sciences (No. KGCX1-YW-13), Institute of Software, Chinese Academy of Sciences
2009.1-2010.11, Sub-project: Optimization of Linpack on GPGPU Cluster, supported by the Ministry of Finance under Grant No. ZDYZ2008-2, Institute of Software, Chinese Academy of Sciences. (IPE Mole-8.5 ranked No. 19 on the June 2010 TOP500 list.)
OpenBLAS is an optimized BLAS library based on the BSD version of GotoBLAS2 1.13. It is released under the BSD license.
An auto-tuning SpMV (sparse matrix-vector multiplication) library for AMD Brook+
MZ-Analyzer is a tool for visualizing and analyzing multiple mass spectrometry data sets in 2D and 3D modes.
Xianyi Zhang, Chao Yang, Fangfang Liu, Yiqun Liu, and Yutong Lu, Optimizing and Scaling HPCG on Tianhe-2: Early Experience, ICA3PP 2014.
Wang Qian, Zhang Xianyi, Zhang Yunquan, Qing Yi, AUGEM: Automatically Generate High Performance Dense Linear Algebra Kernels on x86 CPUs, International Conference for High Performance Computing, Networking, Storage and Analysis (SC'13), Denver, CO, November 2013. [pdf]
Zhang Xianyi, Wang Qian, Zhang Yunquan, Model-driven Level 3 BLAS Performance Optimization on Loongson 3A Processor, 2012 IEEE 18th International Conference on Parallel and Distributed Systems (ICPADS), 17-19 Dec. 2012.
Wang Lei, Zhang Yunquan, Zhang Xianyi, Liu Fangfang, Accelerating Linpack Performance with Mixed Precision Algorithm on CPU+GPGPU Heterogeneous Cluster, 2010 10th IEEE International Conference on Computer and Information Technology, Bradford, UK, June 2010. [pdf]
Xianyi Zhang, Yunquan Zhang, Early Linpack Performance Benchmarking on IPE Mole-8.5 Fermi GPU Cluster, GTC 2010, San Jose, CA, USA, Sep 2010. (poster) [pptx]
Jing Wang, Yunquan Zhang, Xianyi Zhang, Xiangzheng Sun, Zelin Hu, Sujun Li, Rong Zeng, "QuantWiz: A Parallel Software Package for LC-MS-based Label-Free Protein Quantification," pp. 683-687, 2009 11th IEEE International Conference on High Performance Computing and Communications, Seoul, Korea, June 25-27, 2009. [pdf]
9/2005-7/2007, Computer Architecture, Dept. of Computer Science and Technology, Graduate School of Beijing Institute of Technology, M.E.
9/2001-7/2005, Dept. of Computer Science and Technology, Beijing Institute of Technology, B.E.