The time-ordered linear model (TOLM) is a feature-reduction method based on the idea that a co-bisector can represent the main tendency of a series of vectors. The co-bisector model has two main advantages. First, unlike existing methods such as PCA, it conserves the temporal properties of a series of vectors, because the vectors have order-restricted projection locations on the co-bisector. Second, TOLM preserves the spatial distance ratio between neighboring samples, which have fixed locations in an abstract cell state space.
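
To make the idea of order-restricted projection concrete, the following is a minimal Python sketch. It only projects time-ordered samples onto a given direction and checks that the projected positions follow the temporal order. It does not fit a co-bisector: the direction `axis`, the toy data, and both function names are hypothetical and chosen purely for illustration.

```python
import math


def project_onto_axis(samples, axis):
    """Return the scalar position of each sample along the unit-normalized axis."""
    norm = math.sqrt(sum(a * a for a in axis))
    if norm == 0:
        raise ValueError("axis must be non-zero")
    unit = [a / norm for a in axis]
    return [sum(x * u for x, u in zip(sample, unit)) for sample in samples]


def is_time_ordered(positions):
    """True if positions never decrease when samples are listed in time order."""
    return all(later >= earlier for earlier, later in zip(positions, positions[1:]))


# Hypothetical toy data: three samples listed in time order, two features each.
samples = [[1.0, 0.0], [2.0, 1.0], [3.0, 3.0]]
axis = [1.0, 1.0]  # assumed direction, NOT a fitted co-bisector

positions = project_onto_axis(samples, axis)
print(positions, is_time_ordered(positions))
```

A direction that satisfies this ordering check for every consecutive pair of samples is the kind of constraint the description above refers to. How the published method actually selects the co-bisector is not shown here.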


Estimating developmental states of tumors and normal tissues using a linear time-ordered model. BMC Bioinformatics. 2011, 12:53


M&M (for integration of MeDIP-seq and MRE-seq) is a novel integrative statistical framework that dynamically scales, normalizes, and combines MeDIP-seq and MRE-seq data to detect differentially methylated regions in a genome-wide fashion. M&M leverages the complementary nature of MeDIP-seq and MRE-seq data to allow rapid comparative analysis between whole methylomes at a fraction of the cost of WGBS.

Functional DNA methylation differences between tissues, cell types, and across individuals discovered using the M&M algorithm. Genome Research. 23 (9), 1522-1540


FeatSNP is an online tool and a curated database for exploring the potential functional impact of 81 million common SNPs on the human brain. FeatSNP uses the brain transcriptomes of the human population to improve functional annotation of human SNPs by integrating transcription factor binding prediction, public eQTL information, and the brain-specific epigenetic landscape, as well as information on Topologically Associating Domains (TADs). FeatSNP supports both single and batched SNP searches, and its interactive user interface enables users to explore the functional annotations and generate publication-quality visualizations.


FeatSNP: an interactive database for brain-specific epigenetic annotation of human SNPs. Frontiers in Genetics. 10, 262


We optimized the analysis strategy for ATAC-seq and defined a series of QC metrics, including reads under peak ratio (RUPr), background (BG), promoter enrichment (ProEn), subsampling enrichment (SubEn), and other measurements. We incorporated these QC tests into our recently developed ATAC-seq Integrative Analysis Package (AIAP) to provide a complete ATAC-seq analysis system, including quality assurance, improved peak calling, and downstream differential analysis. Processing paired-end ATAC-seq datasets with AIAP significantly improved sensitivity (by 20% to 60%) in both peak calling and differential analysis. AIAP is packaged as a Docker/Singularity image and generates a comprehensive QC report with a single command-line execution. We used ENCODE ATAC-seq data to benchmark and generate QC recommendations, and developed qATACViewer for user-friendly interaction with the QC report.
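
As an illustration of what a metric like reads under peak ratio measures, here is a minimal Python sketch. It assumes that RUPr is the fraction of reads overlapping at least one called peak by at least one base, on a single chromosome with half-open intervals. AIAP's exact definition and implementation may differ, and the function names and toy coordinates are hypothetical.

```python
from bisect import bisect_left


def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals into a sorted, disjoint list."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def reads_under_peak_ratio(reads, peaks):
    """Fraction of reads that overlap at least one peak (assumed definition)."""
    if not reads:
        return 0.0
    merged = merge_intervals(peaks)
    starts = [start for start, _ in merged]
    hits = 0
    for read_start, read_end in reads:
        # Index of the last merged peak that starts before the read ends.
        i = bisect_left(starts, read_end) - 1
        # The read overlaps that peak only if the peak ends after the read starts.
        if i >= 0 and merged[i][1] > read_start:
            hits += 1
    return hits / len(reads)


# Hypothetical toy data on one chromosome.
peaks = [(100, 200), (500, 600)]
reads = [(90, 110), (150, 180), (300, 350), (590, 640), (600, 650)]
rupr = reads_under_peak_ratio(reads, peaks)
print(rupr)
```

In this toy example three of the five reads overlap a peak. The read starting at position 600 does not count, because the intervals are treated as half-open.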


Improving ATAC-seq Data Analysis with AIAP, a Quality Control and Integrative Analysis Package


We developed a user-friendly package, BeCorrect, to perform batch-effect correction and to visualize corrected ATAC-seq signals in a genome browser. BeCorrect takes bedGraph files as input. After reading the raw read counts table and the corrected read counts table, which is generated using the RUVSeq package, BeCorrect calculates the correction weights across the genome. BeCorrect can also be used to correct the within-group variation caused by different sequencing depths, which can be considered a technical batch effect.
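
The weighting step described above might look roughly like the following Python sketch. It assumes that a region's weight is the ratio of its corrected count to its raw count and that bedGraph intervals lying inside a region are scaled by that weight. This is an assumption made for illustration, not BeCorrect's actual algorithm or interface, and all names and values are hypothetical.

```python
def correction_weights(raw_counts, corrected_counts):
    """Per-region weight = corrected / raw; regions with zero raw count keep weight 1.0."""
    weights = {}
    for region, raw in raw_counts.items():
        corrected = corrected_counts[region]
        weights[region] = corrected / raw if raw > 0 else 1.0
    return weights


def apply_weights(bedgraph, weights):
    """Scale each bedGraph interval by the weight of the region that fully contains it."""
    scaled = []
    for chrom, start, end, value in bedgraph:
        weight = 1.0  # intervals outside every region are left unchanged
        for (r_chrom, r_start, r_end), w in weights.items():
            if chrom == r_chrom and start >= r_start and end <= r_end:
                weight = w
                break
        scaled.append((chrom, start, end, value * weight))
    return scaled


# Hypothetical count tables keyed by (chrom, start, end) region.
raw = {("chr1", 100, 200): 50, ("chr1", 500, 600): 0}
corrected = {("chr1", 100, 200): 40.0, ("chr1", 500, 600): 0.0}

# Hypothetical bedGraph records: (chrom, start, end, signal).
bedgraph = [("chr1", 100, 150, 10.0), ("chr1", 150, 200, 5.0), ("chr1", 300, 400, 2.0)]

weights = correction_weights(raw, corrected)
corrected_track = apply_weights(bedgraph, weights)
print(corrected_track)
```

The linear scan over regions keeps the sketch short. A genome-scale version would use an interval index instead.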


Comparison of differential accessibility analysis strategies for ATAC-seq data. Scientific Reports. Volume 10, Article number: 10150 (2020)