sarje :: abhinav () {
:: research
interests
My research interests include the following:
High Performance Computing
Parallel Algorithms
Scientific Computing
Algorithms & Applications on Emerging Architectures
String and Text Algorithms
Cloud Computing & Abstractions
select primary projects [very old]

A cloud computing based abstract framework for computations on trees
Along the lines of functional-style programming frameworks, we propose and demonstrate an abstract framework for performing computations on tree data structures. All the computational patterns are abstracted away, which makes the framework general enough to implement numerous applications. The user provides the application-specific information, and this is cleanly separated from the system implementation, which may be distributed or parallel. Our demonstration implementation of the framework is designed as a generic-programming C++ library.

Accelerating pairwise computations in parallel on multi- and many-cores
Emerging high-performance architectures have architectural limitations: small Local Stores on the SPEs and explicit DMA transfers on the Cell processor, and small on-chip shared memory on graphics processors. Given these limitations, our parallel approaches schedule all-pairs computations between vectors efficiently by minimizing the number of memory transfers and making efficient use of the on-chip memories. These approaches can be easily applied to dense matrix computations based on pairwise calculations. Applications lie in various fields, including materials science, fluid dynamics and systems biology. Our C/C++ libraries, libpnorm and libpairwise, are available under LGPL and BSL, and can be downloaded from our group website.

Parallel algorithms for genomic alignments on the heterogeneous multi-cores
Two prominent parallel algorithms for sequence alignment exist. One uses the wavefront pattern, and the other is based on parallel-prefix. Comparing the performance of these algorithms on a low-cost high-performance processor, the Cell BE, leads to a hybrid algorithm based on the two approaches. This scheme gives a speedup of four on a Playstation 3 (six SPEs) compared to a Pentium 4 processor. Furthermore, it gives a speedup of seven on a QS20 Cell blade (sixteen SPEs). These results hold for all three varieties of alignment: global/local, spliced and syntenic.

Construction of gene regulatory networks in parallel on clusters of Cell processors
A parallel algorithm for the construction of gene regulatory networks, based on mutual information computations using B-splines and implemented on a small cluster of Cell processors, gives performance comparable to a high-end massively parallel processing system such as the IBM Blue Gene/L. One QS20 Cell blade gives a speedup of more than seven relative to a Pentium D processor. A cluster of eight QS20 Cell blades gives a twofold performance gain over 128 nodes of Blue Gene/L.
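
The all-pairs computations described above, which also underlie the pairwise mutual information step of network construction, depend on visiting pairs tile by tile so that two small blocks of vectors stay in fast memory while every pair between them is processed. The following is a minimal serial sketch of that scheduling idea, not the actual libpairwise or libpnorm interface: the function name `all_pairs_tiled`, the row-major layout, the `tile` parameter and the choice of Euclidean distance as the pairwise function are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration; not part of libpairwise or libpnorm.
// Computes the Euclidean distance between every pair of `n` vectors of
// dimension `dim`, stored row-major in `data`, and returns an n x n
// symmetric matrix (row-major). Pairs are visited one tile of rows at a time.
std::vector<double> all_pairs_tiled(const std::vector<double>& data,
                                    std::size_t n, std::size_t dim,
                                    std::size_t tile) {
    if (tile == 0) tile = 1;  // guard against a zero tile size
    std::vector<double> out(n * n, 0.0);

    for (std::size_t bi = 0; bi < n; bi += tile) {
        const std::size_t i_end = std::min(bi + tile, n);
        // Only the upper triangle of tiles is needed: distance is symmetric.
        for (std::size_t bj = bi; bj < n; bj += tile) {
            const std::size_t j_end = std::min(bj + tile, n);
            // On a Cell SPE or a GPU, rows [bi, i_end) and [bj, j_end) would
            // be copied into the local store or shared memory once at this
            // point, and all pairs between them computed from there.
            for (std::size_t i = bi; i < i_end; ++i) {
                // Within a diagonal tile, skip pairs with j <= i.
                for (std::size_t j = std::max(i + 1, bj); j < j_end; ++j) {
                    double sum = 0.0;
                    for (std::size_t k = 0; k < dim; ++k) {
                        const double d = data[i * dim + k] - data[j * dim + k];
                        sum += d * d;
                    }
                    const double dist = std::sqrt(sum);
                    out[i * n + j] = dist;
                    out[j * n + i] = dist;
                }
            }
        }
    }
    return out;
}
```

With a tile of `t` rows, each block of vectors is loaded once per tile pair rather than once per vector pair, which cuts the number of transfers by roughly a factor of `t`. In practice the tile size would be chosen so that two blocks fit in the available on-chip memory.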