Welcome to Enhlink’s documentation!
Introduction
We developed Enhlink, a fast, accurate, and efficient algorithm for inferring enhancer-promoter linkages from scATAC-seq datasets. It can also be applied to single-cell methylation, DNase, MNase, or ChIP-seq datasets, or to any sparse, high-dimensional genomic dataset with binary features. Enhlink lets the user model the influence of covariates during inference and can easily be integrated into existing analytical workflows. It is also designed to take advantage of multi-omic datasets, inferring enhancer-promoter links while incorporating complementary omic measurements such as gene expression.
Enhlink models the surrounding enhancers and the covariates as predictive features of promoter accessibility (or gene expression). It uses a modified Information Gain score, a random forest framework, and a bootstrap procedure to estimate the features significantly associated with a given promoter. Optionally, Enhlink performs a second-order analysis that identifies linkages specific to a given covariate. Enhlink is not limited to proximal enhancers; it can also infer distal enhancer-gene linkages. Finally, Enhlink integrates a simulation workflow, designed using experimentally validated enhancer-promoter signals, that improves prediction accuracy.
- Introduction and installation
- Tutorial 1: Simple processing of snATAC-seq data
- Tutorial 2: Use of simulated variables to estimate expected accuracy
- Tutorial 3: Create sparse matrices compatible with Enhlink
- Tutorial 4: Processing multi-omics data with Enhlink
- Tutorial 5: Enhlink advanced functionalities
- Inferring positively/negatively correlated linkage only
- Using an external feature matrix with Enhlink
- Regularization and adjusting sensitivity / specificity
- Distal interactions inference
- Using Enhgrid to distribute computation, test the robustness of results, and improve speed
- Filter links based on TAD domains
- Tutorial 6: Generating simulated matrices for method comparison
- License
Access
The package is accessible at this link: https://gitlab.com/grouumf/enhlinktools/
Citation
Enhlink is currently described in our preprint: https://www.biorxiv.org/content/10.1101/2023.05.11.540453v1
License
GNU GENERAL PUBLIC LICENSE Version 3, 29 June 2007
Support
If you are having issues, please let us know. You can reach us at the following email addresses:
Olivier Poirion, Ph.D. o.poirion@gmail.com (personal) or olivier.poirion@jax.org (current work)
Data availability
The PCHi-C data used in this manuscript are currently available with limited access through GEO under the following accession: GSE214107. The data are planned for public release on 31 December 2023. In the meantime, they are available on request. The linkage atlases for the islet and adipose datasets are available as Figshare datasets with the following DOIs: https://doi.org/10.6084/m9.figshare.22336033.v1 (adipose) and https://doi.org/10.6084/m9.figshare.22335919.v1 (islet).
- enhgrid
- Library that compiles the enhgrid executable
- Index
- Variables
- func analyseOneGeneList
- func getGeneBucketsFromGene
- func getGeneBucketsFromPromoter
- func launchOneIterThread
- func main
- func mergeBucketResultsFile
- func mergeOneSetOfBucketFiles
- func processNGeneLists
- func reduce
- func splitGenesToBucket
- func stringToFloatArray
- func stringToIntArray
- func stringToMaxFeatTypeArray
- func testIfRequiredFilesExist
- type paramArrays
- enhlink
- enhlinkobject
- enhtools
- Library that compiles the enhtools executable
- Index
- Variables
- func checkIfLineCanBeSplitIntoPeaks
- func filterBedpe
- func filterWithBed
- func getPosFromOption
- func incompleteIntervalIntersect
- func intersectBedpeWithBedFile
- func intersectBedpes
- func intersectBedpes2
- func intersectBedpes3
- func intersectBedpes4
- func intersectOneInput
- func intersectWithBed
- func intervalIntersect
- func isIntersecting
- func loadPeakFile
- func main
- func writeHeader
- func writeOneInterToBuffer
- func writeStats
- type geneStats
- type geneStatsMap
- type intervalResults
- type matching
- type mergeFunc
- type metaMap
- type peakMeta
- type peakPair
- type scoreMerging
- type twoKeysBoolMap
- matrix
- Index
- Constants
- Variables
- func GetRandomBootstrapIndex
- func LoadIndexFileToIndex
- func LoadPeakDictsToIndex
- func Mean
- func Std
- func TestStringToPeak
- func TtestPval
- func diffMap
- func maxUintMap
- func minIndexSliceInt
- func minInt
- func processMtxHeader
- func reverseIndex
- func reverseIndexC
- type Attributes
- type Format
- type MatColFloatHash
- type MatColHash
- type SparseBoolMatrix
- type SparseFloatMatrix