# enhgrid

```go
import "gitlab.com/Grouumf/enhlinktools/enhgrid"
```

### Library that compiles the enhgrid executable

enhgrid runs enhlink on multiple processes for a range of hyperparameter values. enhgrid generates output files for each hyperparameter combination. The following parameters can accept multiple values:

```
-downsample
-n_boot
-depth
-max_features
-secondOrderMaxFeat
-threshold
-min_matsize
-min_leafsize
-merging_cutoff
-neighborhood
-maxFeatType
-lambda1
-lambda2
-threads
```

Multiple values can be passed as input, separated by either commas or spaces: for example \-depth 2,3,4 or \-depth "2 3 4"

enhgrid accepts the same parameters as enhlink, with additional functionalities:

\#\# Parameters unique to enhgrid:

```
-randomNTargets   picks, for each grid iteration, N targets at random from the index
                  and processes them instead of the full list of targets
-repetition       Number of repetitions to be performed for each iteration (default: 1)
-processes        Number of Enhlink processes to be launched in parallel (default: 1)
-splitTargetList  Split the list of genes across the n processes
```

\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\< WARNING \>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>

As of March 20, 2024 \(Enhlink v0.21.0\), some of enhgrid's parameter names were changed for clarity and consistency.
The changes are listed below:

\(version \< 0.21.0\) \-\> \(version \>= 0.21.0\)

- cluster \-\> clusters
- promoter \-\> gtf
- genes \-\> targets
- gene \-\> target
- isGeneExpr \-\> isExpr
- rmPeaksInPromoter \-\> rmPeaksInTargets
- splitGeneList \-\> splitTargetList
- randomNGenes \-\> randomNTargets
- onlyPositiveLink \-\> linkType

\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>

USAGE:

```
enhgrid -mat -xgi -ygi -gtf -out -tag
        -mat2 -xgi2 -ygi2                                          # IF PASSING A GENE MATRIX FILE
        -target                                                    # IF FOCUSING ON ONE TARGET
        -targets                                                   # IF FOCUSING ON A LIST OF TARGETS
        -isExpr                                                    # IF THE GENE MATRIX IS AN EXPRESSION MATRIX
        -covariates -xgi_subset -ygi_subset -clusters              # OPTIONAL
        -downsample -threads -n_boot -depth -max_features          # OPTIONAL
        -threshold -min_matsize -min_leafsize -merging_cutoff      # OPTIONAL
        -format {coo, mtx, cellRanger} -keep_sparse -maxFeatType   # OPTIONAL
        -rmPeaksInTargets -linkType {"all", "positive", "negative"}
        -secondOrder -ignoreEnhancerWeight                         # OPTIONAL
        -neighborhood -secondOrderMaxFeat -uniformSampling         # OPTIONAL
        -randomNTargets -repetition -processes -splitTargetList    # OPTIONAL and specific to enhgrid
```

Please check enhgrid \-h and the tutorial and introduction sections for a more precise description of the input parameters.

## Index

- [Variables](<#variables>)
- [func analyseOneGeneList(enhObj enhlinkobject.EnhlinkObject, processID int, bucket map[string]bool, waiting *sync.WaitGroup, guard chan bool)](<#func-analyseonegenelist>)
- [func getGeneBucketsFromGene(geneFile utils.Filename, processes int) ([]map[string]bool, int)](<#func-getgenebucketsfromgene>)
- [func getGeneBucketsFromPromoter(plist *enhlinkobject.PromoterList, processes int) ([]map[string]bool, int)](<#func-getgenebucketsfrompromoter>)
- [func launchOneIterThread(isOver bool, count int, attributes enhlinkobject.TreeAttributes, enhMat, geneMat, covMat *matrix.SparseBoolMatrix, floatMat *matrix.SparseFloatMatrix, plist *enhlinkobject.PromoterList, writer *io.WriteCloser, waiting *sync.WaitGroup, mutex *sync.Mutex, guard chan bool)](<#func-launchoneiterthread>)
- [func main()](<#func-main>)
- [func mergeBucketResultsFile(outTag string, clusterList []string, nbBuckets int)](<#func-mergebucketresultsfile>)
- [func mergeOneSetOfBucketFiles(outTag, cluster, ext string, nbBuckets int)](<#func-mergeonesetofbucketfiles>)
- [func processNGeneLists(attributes enhlinkobject.TreeAttributes, enhMat, geneMat, covMat *matrix.SparseBoolMatrix, floatMat *matrix.SparseFloatMatrix, plist *enhlinkobject.PromoterList, outTag string)](<#func-processngenelists>)
- [func reduce(combinations [][][]int) (res [][]int)](<#func-reduce>)
- [func splitGenesToBucket(geneMap map[string]uint, processes int) (geneBuckets []map[string]bool, nbGenes int)](<#func-splitgenestobucket>)
- [func stringToFloatArray(stringArr, option string) (outArr []float64)](<#func-stringtofloatarray>)
- [func stringToIntArray(stringArr, option string) (outArr []int)](<#func-stringtointarray>)
- [func stringToMaxFeatTypeArray(stringArr, option string) (outArr []enhlinkobject.MaxFeaturesType)](<#func-stringtomaxfeattypearray>)
- [func testIfRequiredFilesExist()](<#func-testifrequiredfilesexist>)
- [type paramArrays](<#type-paramarrays>)
  - [func (pa *paramArrays) generateAllCombination()](<#func-paramarrays-generateallcombination>)
  - [func (pa *paramArrays) init()](<#func-paramarrays-init>)
  - [func (pa *paramArrays) initIterators()](<#func-paramarrays-inititerators>)
  - [func (pa *paramArrays) iter(attributes enhlinkobject.TreeAttributes, tStart time.Time) (newAttr enhlinkobject.TreeAttributes, isOver bool)](<#func-paramarrays-iter>)
  - [func (pa *paramArrays) returnLastThreadVal() int](<#func-paramarrays-returnlastthreadval>)

## Variables

CLUSTERFILE cluster file

```go
var CLUSTERFILE utils.Filename
```

DOWNSAMPLEARR Downsample the number of samples to use

```go
var DOWNSAMPLEARR string
```

GENE gene

```go
var GENE string
```

IGNOREENHANCERWEIGHT Ignore Enhancers weight \(the ratio of accessibility\) in the computation of the modified Information Gain

```go
var IGNOREENHANCERWEIGHT bool
```

INPUTFORMAT input matrix format

```go
var INPUTFORMAT string
```

INPUTGENEMAT input matrix name for the gene matrix \(input\)

```go
var INPUTGENEMAT utils.Filename
```

INPUTMAT input matrix name \(input\)

```go
var INPUTMAT utils.Filename
```

ISGENEEXPR using gene expression for the gene mat

```go
var ISGENEEXPR bool
```

KEEPSPARSE Keep the main ColMat matrix sparse. Useful for memory reasons if the background is very large

```go
var KEEPSPARSE bool
```

LAMBDA1ARR Lambda parameter of a Poisson distribution that controls the amount of dropouts in the simulated variables

```go
var LAMBDA1ARR string
```

LAMBDA2ARR Lambda parameter of a Poisson distribution that controls the amount of false positives in the simulated variables

```go
var LAMBDA2ARR string
```

LINKTYPE Which links to keep \{"all", "positive", "negative"\}

```go
var LINKTYPE string
```

MAXFEATURESARR Maximum number of explanatory features per bootstrap model.
```go
var MAXFEATURESARR string
```

MAXFEATURESTYPEARR max feature type

```go
var MAXFEATURESTYPEARR string
```

MERGINGCUTOFF merging cutoff for close\-by promoters

```go
var MERGINGCUTOFF int
```

METADATA optional covariate matrix

```go
var METADATA utils.Filename
```

MINLEAFSIZEARR Min size of leaf

```go
var MINLEAFSIZEARR string
```

MINMATSIZEARR Min matrix size \(int\)

```go
var MINMATSIZEARR string
```

NBBOOTARR Number of bootstraps

```go
var NBBOOTARR string
```

NBPROCESSES Number of Enhlink processes to be launched in parallel

```go
var NBPROCESSES int
```

NBSIMFEATURESARR Number of simulated features to use

```go
var NBSIMFEATURESARR string
```

NBTHREADSARR number of internal threads for each enhlink computation

```go
var NBTHREADSARR string
```

NEIGHBORHOODARR size of the neighborhood, in base pairs, used to define the surrounding enhancers

```go
var NEIGHBORHOODARR string
```

ONLYSIM only perform simulation

```go
var ONLYSIM bool
```

OUTDIR output directory

```go
var OUTDIR string
```

OUTTAG output files tag

```go
var OUTTAG string
```

PROMOTERFILE promoter file

```go
var PROMOTERFILE utils.Filename
```

RANDOMNBGENES random subset of genes to analyze

```go
var RANDOMNBGENES int
```

REPETITION Number of repetitions to be performed for each iteration \(default: 1\)

```go
var REPETITION int
```

RMPEAKSINPROMOTERS Remove peaks within promoter boundaries

```go
var RMPEAKSINPROMOTERS bool
```

SECONDORDER compute second order links \- covar correlation

```go
var SECONDORDER bool
```

SECONDORDERMAXFEATURESARR Maximum number of explanatory features per bootstrap model for the second order model.

```go
var SECONDORDERMAXFEATURESARR string
```

SHOWVERSION show version and quit

```go
var SHOWVERSION bool
```

SPLITGENELIST Split the gene list across the processes

```go
var SPLITGENELIST bool
```

THRESHOLDARR Significance level

```go
var THRESHOLDARR string
```

TREEDEPTHARR Max tree level

```go
var TREEDEPTHARR string
```

UNIFORMSAMPLING Randomly sample the cells to have a uniform covariate distribution for each bootstrap.
Needs a covariate matrix

```go
var UNIFORMSAMPLING bool
```

XGI row index for input mat

```go
var XGI utils.Filename
```

XGIGENE row index for input gene mat

```go
var XGIGENE utils.Filename
```

XGISUBSET row index subset for input mat

```go
var XGISUBSET utils.Filename
```

YGI column index for input mat

```go
var YGI utils.Filename
```

YGIGENE column index for input gene mat

```go
var YGIGENE utils.Filename
```

YGIGENESUBSET column index subset for input gene mat

```go
var YGIGENESUBSET utils.Filename
```

YGISUBSET column index subset for input mat

```go
var YGISUBSET utils.Filename
```

## func [analyseOneGeneList]()

```go
func analyseOneGeneList(enhObj enhlinkobject.EnhlinkObject, processID int, bucket map[string]bool, waiting *sync.WaitGroup, guard chan bool)
```

## func [getGeneBucketsFromGene]()

```go
func getGeneBucketsFromGene(geneFile utils.Filename, processes int) ([]map[string]bool, int)
```

## func [getGeneBucketsFromPromoter]()

```go
func getGeneBucketsFromPromoter(plist *enhlinkobject.PromoterList, processes int) ([]map[string]bool, int)
```

## func [launchOneIterThread]()

```go
func launchOneIterThread(isOver bool, count int, attributes enhlinkobject.TreeAttributes, enhMat, geneMat, covMat *matrix.SparseBoolMatrix, floatMat *matrix.SparseFloatMatrix, plist *enhlinkobject.PromoterList, writer *io.WriteCloser, waiting *sync.WaitGroup, mutex *sync.Mutex, guard chan bool)
```

## func [main]()

```go
func main()
```

## func [mergeBucketResultsFile]()

```go
func mergeBucketResultsFile(outTag string, clusterList []string, nbBuckets int)
```

## func [mergeOneSetOfBucketFiles]()

```go
func mergeOneSetOfBucketFiles(outTag, cluster, ext string, nbBuckets int)
```

## func [processNGeneLists]()

```go
func processNGeneLists(attributes enhlinkobject.TreeAttributes, enhMat, geneMat, covMat *matrix.SparseBoolMatrix, floatMat *matrix.SparseFloatMatrix, plist *enhlinkobject.PromoterList, outTag string)
```

## func [reduce]()

```go
func reduce(combinations [][][]int) (res [][]int)
```

## func [splitGenesToBucket]()

```go
func splitGenesToBucket(geneMap map[string]uint, processes int) (geneBuckets []map[string]bool, nbGenes int)
```

## func [stringToFloatArray]()

```go
func stringToFloatArray(stringArr, option string) (outArr []float64)
```

## func [stringToIntArray]()

```go
func stringToIntArray(stringArr, option string) (outArr []int)
```

## func [stringToMaxFeatTypeArray]()

```go
func stringToMaxFeatTypeArray(stringArr, option string) (outArr []enhlinkobject.MaxFeaturesType)
```

## func [testIfRequiredFilesExist]()

```go
func testIfRequiredFilesExist()
```

## type [paramArrays]()

```go
type paramArrays struct {
    downsample             []int
    nbBoot                 []int
    depth                  []int
    maxFeatures            []int
    secondOrderMaxFeatures []int
    minMatsize             []int
    minLeafsize            []int
    nbThreads              []int
    neighborhood           []int
    nbSimFeatures          []int
    maxFeatType            []enhlinkobject.MaxFeaturesType
    threshold              []float64
    lambda1, lambda2       []float64

    iterators            map[string]int
    nbSteps, currentStep int
    keys                 []string
    paramCombinations    [][]int
}
```

### func \(\*paramArrays\) [generateAllCombination]()

```go
func (pa *paramArrays) generateAllCombination()
```

### func \(\*paramArrays\) [init]()

```go
func (pa *paramArrays) init()
```

### func \(\*paramArrays\) [initIterators]()

```go
func (pa *paramArrays) initIterators()
```

### func \(\*paramArrays\) [iter]()

```go
func (pa *paramArrays) iter(attributes enhlinkobject.TreeAttributes, tStart time.Time) (newAttr enhlinkobject.TreeAttributes, isOver bool)
```

### func \(\*paramArrays\) [returnLastThreadVal]()

```go
func (pa *paramArrays) returnLastThreadVal() int
```

# enhlink

```go
import "gitlab.com/Grouumf/enhlinktools/enhlink"
```

### Library that compiles the enhlink executable

enhlink infers enhancer / promoter co\-accessibilities \(links\) using random forests of ID3 trees and Information Gain.
enhlink's main inputs are:

```
a) a (cell x peak) sparse matrix,
b) a 4-column promoter TSV file,
c) an optional (cell x gene) sparse matrix if the gene activity cannot be
   inferred from the peaks of the first matrix and the promoter regions.
   This matrix can either be interpreted as boolean (e.g. the promoter of a
   given gene is either accessible or not for a given cell), or as a float
   matrix using the -isExpr option, which reflects the gene expression (for
   example in a context of a scATAC-seq/RNA-seq multi-omic study)
```

In addition, covariates \(cell x covariates\) and clusters \(cell x clusterID\) TSV files can be provided. Finally, multiple optional parameters can be set to fine\-tune the speed, accuracy, and range of the models.

\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\< WARNING \>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>

As of March 20, 2024 \(Enhlink v0.21.0\), some of enhlink's parameter names were changed for clarity and consistency. The changes are listed below:

\(version \< 0.21.0\) \-\> \(version \>= 0.21.0\)

- cluster \-\> clusters
- promoter \-\> gtf
- genes \-\> targets
- gene \-\> target
- isGeneExpr \-\> isExpr
- rmPeaksInPromoter \-\> rmPeaksInTargets
- onlyPositiveLink \-\> linkType

\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\<\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>\>

USAGE:

```
enhlink -mat -xgi -ygi -gtf -out -tag
        -mat2 -xgi2 -ygi2                                          # IF PASSING A GENE MATRIX FILE
        -target                                                    # IF FOCUSING ON ONE TARGET
        -targets                                                   # IF FOCUSING ON A LIST OF TARGETS
        -isExpr                                                    # IF MATRIX 2 IS AN EXPRESSION MATRIX
        -covariates -xgi_subset -ygi_subset -clusters              # OPTIONAL
        -downsample -threads -n_boot -depth -max_features          # OPTIONAL
        -threshold -min_matsize -min_leafsize -merging_cutoff      # OPTIONAL
        -format {coo, mtx, cellRanger} -keep_sparse -maxFeatType   # OPTIONAL
        -rmPeaksInTargets -linkType {"all", "positive", "negative"}
        -secondOrder -ignoreEnhancerWeight                         # OPTIONAL
        -neighborhood -secondOrderMaxFeat -uniformSampling         # OPTIONAL
```

Please check enhlink \-h and the tutorial and introduction sections for a more precise description of the input parameters.

## Index

- [Variables](<#variables>)
- [func main()](<#func-main>)
- [func testIfRequiredFilesExist()](<#func-testifrequiredfilesexist>)

## Variables

CLUSTERFILE cluster file

```go
var CLUSTERFILE utils.Filename
```

DOWNSAMPLE Downsample the number of samples to use

```go
var DOWNSAMPLE int
```

GENE gene

```go
var GENE string
```

IGNOREENHANCERWEIGHT Ignore Enhancers weight \(the ratio of accessibility\) in the computation of the modified Information Gain

```go
var IGNOREENHANCERWEIGHT bool
```

INPUTFORMAT input matrix format

```go
var INPUTFORMAT string
```

INPUTGENEMAT input matrix name for the gene matrix \(input\)

```go
var INPUTGENEMAT utils.Filename
```

INPUTMAT input matrix name \(input\)

```go
var INPUTMAT utils.Filename
```

ISGENEEXPR using gene expression for the gene mat

```go
var ISGENEEXPR bool
```

KEEPSPARSE Keep the main ColMat matrix sparse. Useful for memory reasons if the background is very large

```go
var KEEPSPARSE bool
```

LAMBDA1 Lambda parameter of a Poisson distribution that controls the amount of dropouts in the simulated variables

```go
var LAMBDA1 float64
```

LAMBDA2 Lambda parameter of a Poisson distribution that controls the amount of false positives in the simulated variables

```go
var LAMBDA2 float64
```

LINKTYPE Which links to keep \{"all", "positive", "negative"\}

```go
var LINKTYPE string
```

MAXFEATURES Maximum number of explanatory features per bootstrap model.

```go
var MAXFEATURES int
```

MAXFEATURESTYPE Maximum of features to be considered for a given tree.
\{"all", "sqrt", "log"\}

```go
var MAXFEATURESTYPE enhlinkobject.MaxFeaturesType
```

MERGINGCUTOFF merging cutoff for close\-by promoters

```go
var MERGINGCUTOFF int
```

METADATA optional covariate matrix

```go
var METADATA utils.Filename
```

MINLEAFSIZE Min size of leaf

```go
var MINLEAFSIZE int
```

MINMATSIZE Min matrix size \(int\)

```go
var MINMATSIZE int
```

NBBOOT Number of bootstraps

```go
var NBBOOT int
```

NBSIMFEATURES Number of simulated features to use

```go
var NBSIMFEATURES int
```

NBTHREADS number of internal threads

```go
var NBTHREADS int
```

NEIGHBORHOOD size of the neighborhood, in base pairs, used to define the surrounding enhancers

```go
var NEIGHBORHOOD int
```

ONLYSIM only perform simulation

```go
var ONLYSIM bool
```

OUTDIR output directory

```go
var OUTDIR string
```

OUTTAG output files tag

```go
var OUTTAG string
```

PROMOTERFILE promoter file

```go
var PROMOTERFILE utils.Filename
```

RMPEAKSINPROMOTERS Remove peaks within promoter boundaries

```go
var RMPEAKSINPROMOTERS bool
```

SECONDORDER compute second order links \- covar correlation

```go
var SECONDORDER bool
```

SECONDORDERMAXFEATURES Maximum number of explanatory features per bootstrap model for second order models

```go
var SECONDORDERMAXFEATURES int
```

SHOWVERSION show version and quit

```go
var SHOWVERSION bool
```

THRESHOLD Significance level

```go
var THRESHOLD float64
```

TREEDEPTH Max tree level

```go
var TREEDEPTH int
```

UNIFORMSAMPLING Randomly sample the cells to have a uniform covariate distribution for each bootstrap.
Needs a covariate matrix

```go
var UNIFORMSAMPLING bool
```

XGI row index for input mat

```go
var XGI utils.Filename
```

XGIGENE row index for input gene mat

```go
var XGIGENE utils.Filename
```

XGISUBSET row index subset for input mat

```go
var XGISUBSET utils.Filename
```

YGI column index for input mat

```go
var YGI utils.Filename
```

YGIGENE column index for input gene mat

```go
var YGIGENE utils.Filename
```

YGIGENESUBSET column index subset for input gene mat

```go
var YGIGENESUBSET utils.Filename
```

YGISUBSET column index subset for input mat

```go
var YGISUBSET utils.Filename
```

```go
var maxfeaturestypeStr string
```

## func [main]()

```go
func main()
```

## func [testIfRequiredFilesExist]()

```go
func testIfRequiredFilesExist()
```

# enhlinkobject

```go
import "gitlab.com/Grouumf/enhlinktools/enhlinkobject"
```

Package enhlinkobject is a library to create an Enhlink Object and perform Enhlink analysis.

## Index

- [Variables](<#variables>)
- [func AssertIfFileExists(filename, tag string)](<#func-assertiffileexists>)
- [func MergeClosePromoterRegions(mergingCutoff int, plist *PromoterList)](<#func-mergeclosepromoterregions>)
- [func pickNGenesAtRandom(nbGenes int, geneSet map[string]uint) (newGeneSet map[string]bool)](<#func-pickngenesatrandom>)
- [type EnhlinkObject](<#type-enhlinkobject>)
  - [func (eo *EnhlinkObject) AnalyseAllGenesFromGeneMat()](<#func-enhlinkobject-analyseallgenesfromgenemat>)
  - [func (eo *EnhlinkObject) AnalyseAllPromoters(geneSubset utils.Filename)](<#func-enhlinkobject-analyseallpromoters>)
  - [func (eo *EnhlinkObject) AnalyseNGenes(geneMap map[string]bool, verbose bool)](<#func-enhlinkobject-analysengenes>)
  - [func (eo *EnhlinkObject) AnalyseOneGene(gene string)](<#func-enhlinkobject-analyseonegene>)
  - [func (eo *EnhlinkObject) AnalyseRandomSubsetFromGeneMat(nSamples int)](<#func-enhlinkobject-analyserandomsubsetfromgenemat>)
  - [func (eo *EnhlinkObject) AnalyseRandomSubsetOfPromoters(geneSubsetFile utils.Filename, nSamples int)](<#func-enhlinkobject-analyserandomsubsetofpromoters>)
  - [func (eo *EnhlinkObject) Init(mat matrix.SparseBoolMatrix, geneMat, covMat *matrix.SparseBoolMatrix, floatMat *matrix.SparseFloatMatrix, plist *PromoterList, attributes TreeAttributes)](<#func-enhlinkobject-init>)
  - [func (eo *EnhlinkObject) analyseOneGene(gene string)](<#func-enhlinkobject-analyseonegene>)
  - [func (eo *EnhlinkObject) blacklistAllPeaksInPromoter(targetPeaks []utils.Peak)](<#func-enhlinkobject-blacklistallpeaksinpromoter>)
  - [func (eo *EnhlinkObject) computeOnePvalue(arr []float64, ygi uint, pvals *[]pvalPoint)](<#func-enhlinkobject-computeonepvalue>)
  - [func (eo *EnhlinkObject) computePvalues(scoreArr map[uint][]float64) (pvals []pvalPoint)](<#func-enhlinkobject-computepvalues>)
  - [func (eo *EnhlinkObject) computeRecursiveIGFloat(xgiArr []uint, ygiMap map[uint]bool, bestScoreMap map[uint]float64, depth, lenXgi int)](<#func-enhlinkobject-computerecursiveigfloat>)
  - [func (eo *EnhlinkObject) computeRecursiveInformationGain(xgiArr []uint, ygiMap map[uint]bool, bestScoreMap map[uint]float64, depth, lenXgi int)](<#func-enhlinkobject-computerecursiveinformationgain>)
  - [func (eo *EnhlinkObject) computeTrees()](<#func-enhlinkobject-computetrees>)
  - [func (eo *EnhlinkObject) computeTreesCovar()](<#func-enhlinkobject-computetreescovar>)
  - [func (eo *EnhlinkObject) computeTreesOneThreads(cluster string, ygiMap map[uint]bool, scoreArr map[uint][]float64)](<#func-enhlinkobject-computetreesonethreads>)
  - [func (eo *EnhlinkObject) computeTreesSim(ygiMap map[uint]bool)](<#func-enhlinkobject-computetreessim>)
  - [func (eo *EnhlinkObject) createYgiMapForCovar(ygiToFocus uint, validYgi, validCovar []uint) (ygiMap map[uint]bool)](<#func-enhlinkobject-createygimapforcovar>)
  - [func (eo *EnhlinkObject) deferCloseFiles()](<#func-enhlinkobject-deferclosefiles>)
  - [func (eo *EnhlinkObject) defineBoolYgiVectorFromPeakMat(intervals []interval.IntInterface)](<#func-enhlinkobject-defineboolygivectorfrompeakmat>)
  - [func (eo *EnhlinkObject) defineClusterFloatYgiSum()](<#func-enhlinkobject-defineclusterfloatygisum>)
  - [func (eo *EnhlinkObject) defineClusterYgiSum()](<#func-enhlinkobject-defineclusterygisum>)
  - [func (eo *EnhlinkObject) defineYgiVectorFromFloatMat(gene string) (isValid bool)](<#func-enhlinkobject-defineygivectorfromfloatmat>)
  - [func (eo *EnhlinkObject) defineYgiVectorFromGeneMat(gene string) (isValid bool)](<#func-enhlinkobject-defineygivectorfromgenemat>)
  - [func (eo *EnhlinkObject) defineYgiVectorFromPeakMat(targetPeak utils.Peak) (isValid bool)](<#func-enhlinkobject-defineygivectorfrompeakmat>)
  - [func (eo *EnhlinkObject) getIGFloat(xgiArr *[]uint, ygi uint) (infGainScore float64)](<#func-enhlinkobject-getigfloat>)
  - [func (eo *EnhlinkObject) getInformationGain(xgiArr *[]uint, ygi uint) (infGainScore float64)](<#func-enhlinkobject-getinformationgain>)
  - [func (eo *EnhlinkObject) initIntervals()](<#func-enhlinkobject-initintervals>)
  - [func (eo *EnhlinkObject) initRandomYgiFor2ndOrder(totnbRealFeat int)](<#func-enhlinkobject-initrandomygifor2ndorder>)
  - [func (eo *EnhlinkObject) initSimFloatMat()](<#func-enhlinkobject-initsimfloatmat>)
  - [func (eo *EnhlinkObject) initSimMat()](<#func-enhlinkobject-initsimmat>)
  - [func (eo *EnhlinkObject) initSimWriter()](<#func-enhlinkobject-initsimwriter>)
  - [func (eo *EnhlinkObject) initSurroundingEnhancersMat(peak utils.Peak)](<#func-enhlinkobject-initsurroundingenhancersmat>)
  - [func (eo *EnhlinkObject) initWriters()](<#func-enhlinkobject-initwriters>)
  - [func (eo *EnhlinkObject) initWritersWithHeader()](<#func-enhlinkobject-initwriterswithheader>)
  - [func (eo *EnhlinkObject) initYgiVectCovar(ygi uint)](<#func-enhlinkobject-initygivectcovar>)
  - [func (eo *EnhlinkObject) initbucketCovariates()](<#func-enhlinkobject-initbucketcovariates>)
  - [func (eo *EnhlinkObject) writePvals(pvals []pvalPoint, cluster string)](<#func-enhlinkobject-writepvals>)
  - [func (eo *EnhlinkObject) writePvals2ndOrder(pvals []pvalPoint, cluster string, currentYgi uint)](<#func-enhlinkobject-writepvals2ndorder>)
  - [func (eo *EnhlinkObject) writePvalsSim(pvals []pvalPoint, cluster string)](<#func-enhlinkobject-writepvalssim>)
- [type LinkType](<#type-linktype>)
  - [func (t LinkType) IsValid() LinkType](<#func-linktype-isvalid>)
- [type MaxFeaturesType](<#type-maxfeaturestype>)
  - [func (mf *MaxFeaturesType) SelectFeatures(ygiMap map[uint]bool) map[uint]bool](<#func-maxfeaturestype-selectfeatures>)
  - [func (mf *MaxFeaturesType) Set(v string) error](<#func-maxfeaturestype-set>)
  - [func (mf *MaxFeaturesType) String() string](<#func-maxfeaturestype-string>)
  - [func (mf *MaxFeaturesType) check()](<#func-maxfeaturestype-check>)
- [type PromoterList](<#type-promoterlist>)
  - [func LoadPromotersFile(fname utils.Filename) (plist PromoterList)](<#func-loadpromotersfile>)
  - [func (pl *PromoterList) Len() int](<#func-promoterlist-len>)
- [type TreeAttributes](<#type-treeattributes>)
- [type pvalPoint](<#type-pvalpoint>)

## Variables

VERSION version of the current software

```go
var VERSION = "0.21.4"
```

linkTypes possible options for the link type

```go
var linkTypes = [...]LinkType{allLink, posLink, negLink}
```

## func [AssertIfFileExists]()

```go
func AssertIfFileExists(filename, tag string)
```

AssertIfFileExists panics if os.Stat returns an error for filename

## func [MergeClosePromoterRegions]()

```go
func MergeClosePromoterRegions(mergingCutoff int, plist *PromoterList)
```

MergeClosePromoterRegions merges close promoters according to the cutoff

## func [pickNGenesAtRandom]()

```go
func pickNGenesAtRandom(nbGenes int, geneSet map[string]uint) (newGeneSet map[string]bool)
```

## type [EnhlinkObject]()

EnhlinkObject is the main enhlink object. It holds the input files and matrices, the internal and simulated variables, the tree attributes, and the output file writers.

```go
type EnhlinkObject struct {
    //////////////// files and matrices ////////////
    // promoter file
    promoterFile utils.Filename
    // sparse matrix
    SparseMatrix matrix.SparseBoolMatrix
    // sparse matrix for gene activity
    SparseMatrixGene *matrix.SparseBoolMatrix
    // sparse float matrix for gene expression (substitute SparseMatrixGene )
    SparseMatrixFloat *matrix.SparseFloatMatrix
    // sparse matrix for covariates
    SparseMatrixCovar *matrix.SparseBoolMatrix

    //////////////// Internal variables //////////
    // current gene under study
    currentGene string
    // internal promoter map that defines all the current promoter regions
    // If matrix is constructed from peakMat, it is only 1 region
    currentPeaks map[utils.Peak]bool
    // peaks banned from being in the neighborhood matrix
    // because they are in a current promoter region
    blacklistedPeaks map[uint]bool
    // features on which to perform the analysis
    relevantFeatures []int
    // endog response binary vector
    ygiVector    []int //map[xgiID]value
    ygiCovVector []int //map[xgiID]value
    // endog response float vector
    ygiVectorFloat    []float64 //map[xgiID]value
    ygiCovVectorFloat []float64 //map[xgiID]value
    // Sum of ygi for all cluster
    ygiClusterSum map[string]float64
    // Remove peaks within promoter boundaries
    rmPeaksInPromoters bool
    // surrounding matrix
    surroundingPeaks []uint
    // Number of additional random features
    nbRandFeat int
    // Number of features used for the model
    nbFeatUsed int
    // is gene matrix provided
    isGeneMat bool
    // is gene expression matrix provided
    isFloatMat bool
    // is cov matrix provided
    isCovMat bool
    //starting time
    tStart           time.Time
    bucketCovariates map[string][][]uint
    // valid peak and covariates before
    validYgi, validCovar map[string][]uint
    xgiCovMap            []map[int]bool
    // Internal variable to indicate whether the 2nd order inference mode is activated
    isInferring2nd bool
    //verbose status
    verbose bool

    //////////////// Simulated variables ////////
    simColMat         matrix.MatColHash
    simYgiVector      []int
    simYgiVectorFloat []float64
    nbSimFeat         int
    isSim             bool
    lambda1           float64 // poisson param for dropout level
    lambda2           float64 // poisson param for false positive level

    //////// Float matrix attributes ///////////
    nonNullMean float64

    //////////////// TREE attributes //////////
    //treeAttributes object passed during the init
    attributes TreeAttributes
    // Number of internal threads to perform the multiple tasks
    nbThreads int
    // region in number of base pairs to define the surrounding enhancers
    surroundingSize int
    //Min matrix size
    minMatSize int
    // Max depth
    maxDepth int
    //Number of classes for ygi vector
    nbClass int
    // min leaf size of the tree
    minLeafSize int
    // number of bootstraps
    nbBoot int
    // P-value threshold
    threshold float64
    // downsample the number of samples
    downsample int
    // Maximum number of explanatory features per bootstrap model.
    maxNbFeatures int
    // Maximum number of explanatory features per bootstrap model for second order models.
    secondOrderMaxFeat int
    //Ignore Enhancers weight (the ratio of accessibility) in the computation of the modified Information Gain
    ignoreEnhancerWeight bool
    // Keep the main ColMat matrix sparse. Useful for memory reasons if the background is very large
    keepSparse bool
    // Identify the covariates associated with each inferred enhancer-promoter links
    secondOrder bool
    // Maximum of features to be considered for a given tree: {"all", "sqrt", "log"}, or an int / float value
    maxFeatType MaxFeaturesType
    // Only perform simulation
    onlySim bool
    // which links to keep: all, positive, or negative
    LinkType LinkType
    // uniform covariate sampling for each tree
    uniformSampling bool

    ////////// Sync objects /////////////////
    guard         chan bool
    mutex, mutex2 sync.Mutex
    waiting       sync.WaitGroup

    //promoter list map[gene]list
    Promoters *PromoterList
    // Reduced Intervals for ygi index map[chrID]interval
    YgiIntervalReduced utils.PeakIntervalTreeObject
    // Intervals for ygi index map[chrID]interval
    YgiInterval utils.PeakIntervalTreeObject
    // refined index of ygis not in promoters
    ygisNotInPromoters map[string]uint

    //////////////// Files objects //////////
    outDir, outTag string
    // map[cluster] -> file
    writers, writersCov, writers2ndOrder map[string]*io.WriteCloser
    // map[cluster] file name
    files, filesCov, files2ndOrder map[string]string
    // writer of simulated features results
    writerSim *io.WriteCloser
    fileSim   string
}
```

### func \(\*EnhlinkObject\) [AnalyseAllGenesFromGeneMat]()

```go
func (eo *EnhlinkObject) AnalyseAllGenesFromGeneMat()
```

AnalyseAllGenesFromGeneMat analyses all genes from GeneMat

### func \(\*EnhlinkObject\) [AnalyseAllPromoters]()

```go
func (eo *EnhlinkObject) AnalyseAllPromoters(geneSubset utils.Filename)
```

AnalyseAllPromoters analyses all promoters of the promoter list

### func \(\*EnhlinkObject\) [AnalyseNGenes]()

```go
func (eo *EnhlinkObject) AnalyseNGenes(geneMap map[string]bool, verbose bool)
```

AnalyseNGenes analyses N genes and closes the output files

### func \(\*EnhlinkObject\) [AnalyseOneGene]()

```go
func (eo *EnhlinkObject) AnalyseOneGene(gene string)
```

AnalyseOneGene analyses one gene and closes the output files

### func \(\*EnhlinkObject\) [AnalyseRandomSubsetFromGeneMat]()

```go
func (eo *EnhlinkObject) AnalyseRandomSubsetFromGeneMat(nSamples int)
```

AnalyseRandomSubsetFromGeneMat picks n genes at random from the gene mat and analyses them

### func \(\*EnhlinkObject\) [AnalyseRandomSubsetOfPromoters]()
```go func (eo *EnhlinkObject) AnalyseRandomSubsetOfPromoters(geneSubsetFile utils.Filename, nSamples int) ``` AnalyseRandomSubsetOfPromoters analyse all genes from GeneMat ### func \(\*EnhlinkObject\) [Init]() ```go func (eo *EnhlinkObject) Init(mat matrix.SparseBoolMatrix, geneMat, covMat *matrix.SparseBoolMatrix, floatMat *matrix.SparseFloatMatrix, plist *PromoterList, attributes TreeAttributes) ``` Init init enhlinkObject with a sparse matrix and a promoter list ### func \(\*EnhlinkObject\) [analyseOneGene]() ```go func (eo *EnhlinkObject) analyseOneGene(gene string) ``` analyseOneGene analysis one gene and close output files ### func \(\*EnhlinkObject\) [blacklistAllPeaksInPromoter]() ```go func (eo *EnhlinkObject) blacklistAllPeaksInPromoter(targetPeaks []utils.Peak) ``` blacklistAllPeaksInPromoter init blacklistedPeaks with all peaks within any current prom region ### func \(\*EnhlinkObject\) [computeOnePvalue]() ```go func (eo *EnhlinkObject) computeOnePvalue(arr []float64, ygi uint, pvals *[]pvalPoint) ``` ### func \(\*EnhlinkObject\) [computePvalues]() ```go func (eo *EnhlinkObject) computePvalues(scoreArr map[uint][]float64) (pvals []pvalPoint) ``` ### func \(\*EnhlinkObject\) [computeRecursiveIGFloat]() ```go func (eo *EnhlinkObject) computeRecursiveIGFloat(xgiArr []uint, ygiMap map[uint]bool, bestScoreMap map[uint]float64, depth, lenXgi int) ``` ### func \(\*EnhlinkObject\) [computeRecursiveInformationGain]() ```go func (eo *EnhlinkObject) computeRecursiveInformationGain(xgiArr []uint, ygiMap map[uint]bool, bestScoreMap map[uint]float64, depth, lenXgi int) ``` ### func \(\*EnhlinkObject\) [computeTrees]() ```go func (eo *EnhlinkObject) computeTrees() ``` computeTrees Compute tree ### func \(\*EnhlinkObject\) [computeTreesCovar]() ```go func (eo *EnhlinkObject) computeTreesCovar() ``` ### func \(\*EnhlinkObject\) [computeTreesOneThreads]() ```go func (eo *EnhlinkObject) computeTreesOneThreads(cluster string, ygiMap map[uint]bool, scoreArr 
map[uint][]float64) ``` computeTreesOneThreads computes the tree for one bootstrap index ### func \(\*EnhlinkObject\) [computeTreesSim]() ```go func (eo *EnhlinkObject) computeTreesSim(ygiMap map[uint]bool) ``` computeTreesSim computes trees using simulated variables ### func \(\*EnhlinkObject\) [createYgiMapForCovar]() ```go func (eo *EnhlinkObject) createYgiMapForCovar(ygiToFocus uint, validYgi, validCovar []uint) (ygiMap map[uint]bool) ``` ### func \(\*EnhlinkObject\) [deferCloseFiles]() ```go func (eo *EnhlinkObject) deferCloseFiles() ``` ### func \(\*EnhlinkObject\) [defineBoolYgiVectorFromPeakMat]() ```go func (eo *EnhlinkObject) defineBoolYgiVectorFromPeakMat(intervals []interval.IntInterface) ``` ### func \(\*EnhlinkObject\) [defineClusterFloatYgiSum]() ```go func (eo *EnhlinkObject) defineClusterFloatYgiSum() ``` defineClusterFloatYgiSum defines the number of xgi ### func \(\*EnhlinkObject\) [defineClusterYgiSum]() ```go func (eo *EnhlinkObject) defineClusterYgiSum() ``` defineClusterYgiSum defines the number of xgi ### func \(\*EnhlinkObject\) [defineYgiVectorFromFloatMat]() ```go func (eo *EnhlinkObject) defineYgiVectorFromFloatMat(gene string) (isValid bool) ``` defineYgiVectorFromFloatMat defines the endogenous ygi vector using the gene float matrix. Returns whether the vector is valid ### func \(\*EnhlinkObject\) [defineYgiVectorFromGeneMat]() ```go func (eo *EnhlinkObject) defineYgiVectorFromGeneMat(gene string) (isValid bool) ``` defineYgiVectorFromGeneMat defines the endogenous ygi vector using the gene matrix. Returns whether the vector is valid ### func \(\*EnhlinkObject\) [defineYgiVectorFromPeakMat]() ```go func (eo *EnhlinkObject) defineYgiVectorFromPeakMat(targetPeak utils.Peak) (isValid bool) ``` defineYgiVectorFromPeakMat defines the endogenous ygi vector using the peak matrix.
Returns whether the vector is valid ### func \(\*EnhlinkObject\) [getIGFloat]() ```go func (eo *EnhlinkObject) getIGFloat(xgiArr *[]uint, ygi uint) (infGainScore float64) ``` getIGFloat returns the weighted information gain for a float ygi vector. It dichotomizes ygi using nonNullMean and computes the IG. The final score is IG x non\-null ygi ratio x non\-null feature ratio ### func \(\*EnhlinkObject\) [getInformationGain]() ```go func (eo *EnhlinkObject) getInformationGain(xgiArr *[]uint, ygi uint) (infGainScore float64) ``` getInformationGain returns the weighted information gain for an integer ygi vector. The final score is IG x non\-null ygi ratio x non\-null feature ratio ### func \(\*EnhlinkObject\) [initIntervals]() ```go func (eo *EnhlinkObject) initIntervals() ``` initIntervals initialises \(\*eo\).YgiInterval. If \(\*eo\).rmPeaksInPromoters is true, ygis intersecting promoters are removed from the index ### func \(\*EnhlinkObject\) [initRandomYgiFor2ndOrder]() ```go func (eo *EnhlinkObject) initRandomYgiFor2ndOrder(totnbRealFeat int) ``` ### func \(\*EnhlinkObject\) [initSimFloatMat]() ```go func (eo *EnhlinkObject) initSimFloatMat() ``` ### func \(\*EnhlinkObject\) [initSimMat]() ```go func (eo *EnhlinkObject) initSimMat() ``` ### func \(\*EnhlinkObject\) [initSimWriter]() ```go func (eo *EnhlinkObject) initSimWriter() ``` ### func \(\*EnhlinkObject\) [initSurroundingEnhancersMat]() ```go func (eo *EnhlinkObject) initSurroundingEnhancersMat(peak utils.Peak) ``` initSurroundingEnhancersMat ### func \(\*EnhlinkObject\) [initWriters]() ```go func (eo *EnhlinkObject) initWriters() ``` ### func \(\*EnhlinkObject\) [initWritersWithHeader]() ```go func (eo *EnhlinkObject) initWritersWithHeader() ``` ### func \(\*EnhlinkObject\) [initYgiVectCovar]() ```go func (eo *EnhlinkObject) initYgiVectCovar(ygi uint) ``` ### func \(\*EnhlinkObject\) [initbucketCovariates]() ```go func (eo *EnhlinkObject) initbucketCovariates() ``` ### func \(\*EnhlinkObject\) [writePvals]() ```go func (eo *EnhlinkObject) writePvals(pvals
[]pvalPoint, cluster string) ``` ### func \(\*EnhlinkObject\) [writePvals2ndOrder]() ```go func (eo *EnhlinkObject) writePvals2ndOrder(pvals []pvalPoint, cluster string, currentYgi uint) ``` ### func \(\*EnhlinkObject\) [writePvalsSim]() ```go func (eo *EnhlinkObject) writePvalsSim(pvals []pvalPoint, cluster string) ``` ## type [LinkType]() LinkType is the type of link to keep, one of \{"all", "positive", "negative"\} ```go type LinkType string ``` ```go const ( allLink LinkType = "all" posLink LinkType = "positive" negLink LinkType = "negative" ) ``` ### func \(LinkType\) [IsValid]() ```go func (t LinkType) IsValid() LinkType ``` IsValid checks whether the link type is valid ## type [MaxFeaturesType]() MaxFeaturesType max features type ```go type MaxFeaturesType struct { mfString string fracFeat float64 nbFeat int } ``` ### func \(\*MaxFeaturesType\) [SelectFeatures]() ```go func (mf *MaxFeaturesType) SelectFeatures(ygiMap map[uint]bool) map[uint]bool ``` SelectFeatures creates a feature map according to the chosen strategy ### func \(\*MaxFeaturesType\) [Set]() ```go func (mf *MaxFeaturesType) Set(v string) error ``` Set sets the value ### func \(\*MaxFeaturesType\) [String]() ```go func (mf *MaxFeaturesType) String() string ``` ### func \(\*MaxFeaturesType\) [check]() ```go func (mf *MaxFeaturesType) check() ``` ## type [PromoterList]() PromoterList map\[geneID\] \-\> list of peaks ```go type PromoterList map[string][]utils.Peak ``` ### func [LoadPromotersFile]() ```go func LoadPromotersFile(fname utils.Filename) (plist PromoterList) ``` LoadPromotersFile loads the promoter file ### func \(\*PromoterList\) [Len]() ```go func (pl *PromoterList) Len() int ``` Len returns the length ## type [TreeAttributes]() TreeAttributes attributes for enhlink ```go type TreeAttributes struct { // Number of internal threads to perform the multiple tasks NbThreads int // Remove peaks within promoter boundaries RmPeaksInPromoters bool // region in number of base pairs to define the surrounding enhancers
SurroundingSize int //Min matrix size MinMatSize int // Max depth MaxDepth int // min leaf size of the tree MinLeafSize int // Number of bootstraps NBboot int // P-value threshold Threshold float64 // Downsample the number of samples Downsample int // output directory and files tag OutDir, OutTag string // Maximum number of explanatory features per bootstrap model. MaxNbFeatures int // Maximum number of explanatory features per bootstrap model for second order models. SecondOrderMaxFeat int // Number of simulated features to use NbSimFeat int // Poisson parameter to control the amount of dropouts of the simulated variables Lambda1 float64 // Poisson parameter to control the amount of false positives of the simulated variables Lambda2 float64 // Keep the main ColMat matrix sparse. Useful for memory reasons if background is very large KeepSparse bool // Maximum of features to be considered for a given tree. {"all", "sqrt", "log"}* or int/float //Which links to keep {all, pos, neg} LinkType LinkType MaxFeatType MaxFeaturesType // only perform simulation OnlySim bool //Identify the covariates associated with each inferred enhancer-promoter links SecondOrder bool //Ignore Enhancers weight (the ratio of accessibility) in the computation of the modified IF IgnoreEnhancerWeight bool // For each tree, Randomly sample the cells to have a uniform covariate distribution UniformSampling bool //////////////// Arguments used only for header writing //////////// Version string MatAttr, GmatAttr matrix.Attributes // mergingCutoff only used for header writing MergingCutoff int IsGeneExpr bool //// Files //// PromoterFile, Metadata utils.Filename // verbose Verbose bool } ``` ## type [pvalPoint]() ```go type pvalPoint struct { pval, fdr, score float64 index uint isValid bool } ``` # enhtools ```go import "gitlab.com/Grouumf/enhlinktools/enhtools" ``` ### Library that compiles the enhtools executable enhtools intersects bedpe files and computes accuracy metrics \(TPR, FPR,
F\-score...\) USAGE: ``` # with -intersect (Intersection of output directories obtained from enhlink) enhtools -intersect -in -in2 -out (optional: -tag/tag2/outtag -scorePos/pvalPos/geneIDPos -mergeScore {left, right, mean} -prec ) # with -intersect2 (Intersection of two bedpe files) enhtools -intersect2 -in -in2 -out (optional: -tag/tag2/outtag -scorePos/pvalPos/geneIDPos -mergeScore {left, right, mean} -prec -stdout) # with -intersect3 (Intersect input bedpe files, based on imperfect matches. A match occurs if both regions of the 2 bedpe files intersect respectively) enhtools -intersect3 -in -in2 -out (optional: -stdout) ``` ``` # with -intersect4 (Intersect input bedpe files, based on imperfect matches. A match occurs if at least one of two regions of the first bedpe file intersects with one of the two regions of the second bedpe file) enhtools -intersect4 -in -in2 -out (optional: -stdout -scorePos/pvalPos) ``` ``` # with -diff (Difference between (file1 - file2) input bedpe files, based on imperfect matches. A match occurs if both regions of the 2 bedpe files intersect respectively) enhtools -diff -in -in2 -out (optional: -stdout) # with -filter (filter links from the bedpe file (-in) which are not within at least one of the regions defined in (-bed). If -diff is added, only the links not within at least one region of -bed will be output. -filter is well adapted for filtering links not in TAD regions, defined in a BED file) enhtools -filter -in -bed (optional: -stdout -diff) ``` TIPS: scorePos/pvalPos/geneIDPos can be used to set different column IDs for in and in2.
Use ":" to separate the values for the files from in and in2 ## Index - [Variables](<#variables>) - [func checkIfLineCanBeSplitIntoPeaks(line, sep string, peakPos []int, peakMax, nbPeaks int)](<#func-checkiflinecanbesplitintopeaks>) - [func filterBedpe()](<#func-filterbedpe>) - [func filterWithBed(bedpeFile, bedFile utils.Filename, outFile string)](<#func-filterwithbed>) - [func getPosFromOption(ps, option string, left bool) (pos []int)](<#func-getposfromoption>) - [func incompleteIntervalIntersect(file1, file2 utils.Filename, outFile string)](<#func-incompleteintervalintersect>) - [func intersectBedpeWithBedFile()](<#func-intersectbedpewithbedfile>) - [func intersectBedpes()](<#func-intersectbedpes>) - [func intersectBedpes2()](<#func-intersectbedpes2>) - [func intersectBedpes3()](<#func-intersectbedpes3>) - [func intersectBedpes4()](<#func-intersectbedpes4>) - [func intersectOneInput(file1, file2, clTag string, waiting *sync.WaitGroup)](<#func-intersectoneinput>) - [func intersectWithBed(bedpeFile, bedFile utils.Filename, outFile string)](<#func-intersectwithbed>) - [func intervalIntersect(file1, file2 utils.Filename, outFile string)](<#func-intervalintersect>) - [func isIntersecting(peak1, peak2 utils.Peak) bool](<#func-isintersecting>) - [func loadPeakFile(fname utils.Filename, sepIn, sepOut string, isLeft bool) (promPeakDict twoKeysBoolMap, pairsMeta metaMap, promGeneMap map[string]string)](<#func-loadpeakfile>) - [func main()](<#func-main>) - [func writeHeader(writer *io.WriteCloser)](<#func-writeheader>) - [func writeOneInterToBuffer(buffer *bytes.Buffer, meta1, meta2 peakMeta, enh1, prom1 string)](<#func-writeoneintertobuffer>) - [func writeStats(foutStat, clTag string, genestats geneStatsMap)](<#func-writestats>) - [type geneStats](<#type-genestats>) - [type geneStatsMap](<#type-genestatsmap>) - [func (gsm *geneStatsMap) Init()](<#func-genestatsmap-init>) - [func (gsm *geneStatsMap) incLeft(gene string, inc int)](<#func-genestatsmap-incleft>) -
[func (gsm *geneStatsMap) incRight(gene string, inc int)](<#func-genestatsmap-incright>) - [func (gsm *geneStatsMap) incTwoSides(gene string, inc int)](<#func-genestatsmap-inctwosides>) - [type intervalResults](<#type-intervalresults>) - [func loadTree(bedpeFile utils.Filename) (treeResult intervalResults)](<#func-loadtree>) - [type matching](<#type-matching>) - [func (t matching) isValid() matching](<#func-matching-isvalid>) - [type mergeFunc](<#type-mergefunc>) - [type metaMap](<#type-metamap>) - [type peakMeta](<#type-peakmeta>) - [type peakPair](<#type-peakpair>) - [type scoreMerging](<#type-scoremerging>) - [func (sm *scoreMerging) check(mergingType string)](<#func-scoremerging-check>) - [func (sm *scoreMerging) merge(score1, score2 float64, mergingType string) float64](<#func-scoremerging-merge>) - [type twoKeysBoolMap](<#type-twokeysboolmap>) ## Variables BEDFILE bed file containing regions used to filter links ```go var BEDFILE string ``` DIFF bedpe difference for intersect3 ```go var DIFF bool ``` FILTER filter bedpe if they are not within one of the regions defined in the bed file ```go var FILTER bool ``` GENESID bedpe column ID\(s\) for the gene ID ```go var GENESID string ``` INPUT1 output directory ```go var INPUT1 string ``` INPUT2 output directory ```go var INPUT2 string ``` INTAG output files tag ```go var INTAG string ``` INTAG2 output files tag ```go var INTAG2 string ``` INTERSECT intersect inputs ```go var INTERSECT bool ``` INTERSECT2 intersect inputs ```go var INTERSECT2 bool ``` INTERSECT3 intersect inputs based on imperfect matches if both regions of the 2 bedpe files intersect ```go var INTERSECT3 bool ``` INTERSECT4 Intersect input bedpe files, based on imperfect matches.
A match occurs if at least one of two regions of the first bedpe file intersects with one of the two regions of the second bedpe file ```go var INTERSECT4 bool ``` INTERSECTWITHBED Intersect bedpe with bed \(at least one region of the bedpe intersects one of the BED regions\) ```go var INTERSECTWITHBED bool ``` MATCHING matching type for \-intersectBed: "either" \(default\), "left" \(left region of the bed\), "right" \(right region\), or "both" \(both regions match\) ```go var MATCHING string ``` MATCHINGTYPE possible matching options ```go var MATCHINGTYPE = [...]matching{either, left, right, both} ``` MERGINGTYPE how to merge scores ```go var MERGINGTYPE string ``` OUTDIR output directory ```go var OUTDIR string ``` OUTTAG output files tag ```go var OUTTAG string ``` PREC precision ```go var PREC int ``` PVALSPOS bedpe column ID\(s\) for the pvals ```go var PVALSPOS string ``` SCORESPOS column ID\(s\) for the scores ```go var SCORESPOS string ``` SHOWVERSION show version and quit ```go var SHOWVERSION bool ``` STDOUT write output to stdout ```go var STDOUT bool ``` ## func [checkIfLineCanBeSplitIntoPeaks]() ```go func checkIfLineCanBeSplitIntoPeaks(line, sep string, peakPos []int, peakMax, nbPeaks int) ``` ## func [filterBedpe]() ```go func filterBedpe() ``` ## func [filterWithBed]() ```go func filterWithBed(bedpeFile, bedFile utils.Filename, outFile string) ``` ## func [getPosFromOption]() ```go func getPosFromOption(ps, option string, left bool) (pos []int) ``` ## func [incompleteIntervalIntersect]() ```go func incompleteIntervalIntersect(file1, file2 utils.Filename, outFile string) ``` ## func [intersectBedpeWithBedFile]() ```go func intersectBedpeWithBedFile() ``` ## func [intersectBedpes]() ```go func intersectBedpes() ``` ## func [intersectBedpes2]() ```go func intersectBedpes2() ``` ## func [intersectBedpes3]() ```go func intersectBedpes3() ``` ## func [intersectBedpes4]() ```go func intersectBedpes4() ``` ## func [intersectOneInput]() ```go func
intersectOneInput(file1, file2, clTag string, waiting *sync.WaitGroup) ``` ## func [intersectWithBed]() ```go func intersectWithBed(bedpeFile, bedFile utils.Filename, outFile string) ``` ## func [intervalIntersect]() ```go func intervalIntersect(file1, file2 utils.Filename, outFile string) ``` ## func [isIntersecting]() ```go func isIntersecting(peak1, peak2 utils.Peak) bool ``` ## func [loadPeakFile]() ```go func loadPeakFile(fname utils.Filename, sepIn, sepOut string, isLeft bool) (promPeakDict twoKeysBoolMap, pairsMeta metaMap, promGeneMap map[string]string) ``` ## func [main]() ```go func main() ``` main main function ## func [writeHeader]() ```go func writeHeader(writer *io.WriteCloser) ``` ## func [writeOneInterToBuffer]() ```go func writeOneInterToBuffer(buffer *bytes.Buffer, meta1, meta2 peakMeta, enh1, prom1 string) ``` ## func [writeStats]() ```go func writeStats(foutStat, clTag string, genestats geneStatsMap) ``` ## type [geneStats]() ```go type geneStats struct { total, left, right int leftHit, rightHit int } ``` ## type [geneStatsMap]() ```go type geneStatsMap struct { smap map[string]geneStats all geneStats } ``` ### func \(\*geneStatsMap\) [Init]() ```go func (gsm *geneStatsMap) Init() ``` ### func \(\*geneStatsMap\) [incLeft]() ```go func (gsm *geneStatsMap) incLeft(gene string, inc int) ``` ### func \(\*geneStatsMap\) [incRight]() ```go func (gsm *geneStatsMap) incRight(gene string, inc int) ``` ### func \(\*geneStatsMap\) [incTwoSides]() ```go func (gsm *geneStatsMap) incTwoSides(gene string, inc int) ``` ## type [intervalResults]() ```go type intervalResults struct { chrIntervalTree map[string]*interval.IntTree intervalMapping map[uintptr]utils.Peak metaMapping map[uintptr][]string } ``` ### func [loadTree]() ```go func loadTree(bedpeFile utils.Filename) (treeResult intervalResults) ``` ## type [matching]() matching type ```go type matching string ``` ```go const ( // matching type either matching = "either" both matching = "both" left matching = 
"left" right matching = "right" ) ``` ### func \(matching\) [isValid]() ```go func (t matching) isValid() matching ``` isValid is the matching type valid ## type [mergeFunc]() ```go type mergeFunc func(score1, score2 float64) float64 ``` ## type [metaMap]() ```go type metaMap map[peakPair]peakMeta ``` ## type [peakMeta]() ```go type peakMeta struct { pvals, scores []float64 gene string } ``` ## type [peakPair]() ```go type peakPair [2]string ``` ## type [scoreMerging]() ```go type scoreMerging struct { mergingType string isInit bool mfunc mergeFunc } ``` ### func \(\*scoreMerging\) [check]() ```go func (sm *scoreMerging) check(mergingType string) ``` ### func \(\*scoreMerging\) [merge]() ```go func (sm *scoreMerging) merge(score1, score2 float64, mergingType string) float64 ``` ## type [twoKeysBoolMap]() ```go type twoKeysBoolMap map[string]map[string]bool ``` # matrix ```go import "gitlab.com/Grouumf/enhlinktools/matrix" ``` Package matrix is a library to load sparse matrices from single\-cell data and/or mtx and COO format and create efficient indexing. 
## Index - [Constants](<#constants>) - [Variables](<#variables>) - [func GetRandomBootstrapIndex(arr []uint, downsample int) (index []uint)](<#func-getrandombootstrapindex>) - [func LoadIndexFileToIndex(fname utils.Filename, downcase bool, refMapping map[string]uint) (celliddict map[string]uint, maxIndex int)](<#func-loadindexfiletoindex>) - [func LoadPeakDictsToIndex(fname utils.Filename, sepIn, sepOut string) (featiddict map[string]uint, maxIndex int)](<#func-loadpeakdictstoindex>) - [func Mean(arr []float64, size int) (mean float64)](<#func-mean>) - [func Std(arr []float64, mean float64, size int) (std float64)](<#func-std>) - [func TestStringToPeak(str string) error](<#func-teststringtopeak>) - [func TtestPval(mean, std float64, size int) (pval float64)](<#func-ttestpval>) - [func diffMap(m1, m2 map[string]uint) (missing []string)](<#func-diffmap>) - [func maxUintMap(imap map[string]uint) (maxMap int)](<#func-maxuintmap>) - [func minIndexSliceInt(slice []int, validIndexes []int) int](<#func-minindexsliceint>) - [func minInt(a, b int) int](<#func-minint>) - [func processMtxHeader(ismtx, transpose bool, reader *bufio.Scanner, maxLengthX, maxLengthY int, xgi, ygi utils.Filename) (splitChar string)](<#func-processmtxheader>) - [func reverseIndex(index map[string]uint, lenIndex int) (indexC []string)](<#func-reverseindex>) - [func reverseIndexC(indexC []string) (index map[string]uint)](<#func-reverseindexc>) - [type Attributes](<#type-attributes>) - [type Format](<#type-format>) - [func (t Format) isValid() Format](<#func-format-isvalid>) - [type MatColFloatHash](<#type-matcolfloathash>) - [func (mc *MatColFloatHash) GetRow(ygi uint) map[uint]float64](<#func-matcolfloathash-getrow>) - [func (mc *MatColFloatHash) Init(matCol []map[uint]float64, xDim uint)](<#func-matcolfloathash-init>) - [type MatColHash](<#type-matcolhash>) - [func (mc *MatColHash) Get(ygi, xgi uint) bool](<#func-matcolhash-get>) - [func (mc *MatColHash) GetCol(ygi uint) 
[]bool](<#func-matcolhash-getcol>) - [func (mc *MatColHash) GetDim() (xDim, yDim int)](<#func-matcolhash-getdim>) - [func (mc *MatColHash) GetIndex(ygi uint) uint](<#func-matcolhash-getindex>) - [func (mc *MatColHash) GetRow(ygi uint) map[uint]bool](<#func-matcolhash-getrow>) - [func (mc *MatColHash) GetRowDense(ygi uint) (vect []bool)](<#func-matcolhash-getrowdense>) - [func (mc *MatColHash) Init(matCol *[]map[uint]bool, xDim uint)](<#func-matcolhash-init>) - [func (mc *MatColHash) InitDense(matColDense [][]bool)](<#func-matcolhash-initdense>) - [func (mc *MatColHash) IsDense() bool](<#func-matcolhash-isdense>) - [func (mc *MatColHash) Len(ygiIndex uint) int](<#func-matcolhash-len>) - [func (mc *MatColHash) RmDense()](<#func-matcolhash-rmdense>) - [func (mc *MatColHash) ToDense()](<#func-matcolhash-todense>) - [func (mc *MatColHash) ToDenseFromSubset(ygis []uint) (newYgis []uint)](<#func-matcolhash-todensefromsubset>) - [func (mc *MatColHash) ToDenseFromSubsetAlreadyLoaded(ygis []uint) (newYgis []uint)](<#func-matcolhash-todensefromsubsetalreadyloaded>) - [type SparseBoolMatrix](<#type-sparseboolmatrix>) - [func (sbm *SparseBoolMatrix) CreateRandMat(nbFeat int, refFeats []uint)](<#func-sparseboolmatrix-createrandmat>) - [func (sbm *SparseBoolMatrix) GetMatColT() []map[uint]bool](<#func-sparseboolmatrix-getmatcolt>) - [func (sbm *SparseBoolMatrix) GetUniformSampling(downsample, totXgi int, matColBucket [][]uint) (xgiIndex []uint)](<#func-sparseboolmatrix-getuniformsampling>) - [func (sbm *SparseBoolMatrix) Init(attributes Attributes)](<#func-sparseboolmatrix-init>) - [func (sbm *SparseBoolMatrix) Init2(attributes Attributes)](<#func-sparseboolmatrix-init2>) - [func (sbm *SparseBoolMatrix) InitMeta(xgiMap map[string]uint, attributes Attributes, skipFirst bool)](<#func-sparseboolmatrix-initmeta>) - [func (sbm *SparseBoolMatrix) InitTranspose()](<#func-sparseboolmatrix-inittranspose>) - [func (sbm *SparseBoolMatrix) 
LoadClustersFile()](<#func-sparseboolmatrix-loadclustersfile>) - [func (sbm *SparseBoolMatrix) LoadMatrix()](<#func-sparseboolmatrix-loadmatrix>) - [func (sbm *SparseBoolMatrix) LoadMatrix2(xgiMap, ygiMap map[string]uint)](<#func-sparseboolmatrix-loadmatrix2>) - [func (sbm *SparseBoolMatrix) initThreading(attributes Attributes)](<#func-sparseboolmatrix-initthreading>) - [func (sbm *SparseBoolMatrix) loadMatrixCoo(matFormat Format, xgiSubset, ygiSubset map[string]uint)](<#func-sparseboolmatrix-loadmatrixcoo>) - [func (sbm *SparseBoolMatrix) loadMatrixCooOneTh(count, nblines, thID int, lines *[buffsize]string, xgiSubset, ygiSubset map[string]uint, matColMain *[]map[uint]bool, splitChar string, transpose, delOne bool)](<#func-sparseboolmatrix-loadmatrixcoooneth>) - [func (sbm *SparseBoolMatrix) loadMetaMatrix(xgiMap map[string]uint, skipFirst bool)](<#func-sparseboolmatrix-loadmetamatrix>) - [type SparseFloatMatrix](<#type-sparsefloatmatrix>) - [func (sfm *SparseFloatMatrix) Init(attributes Attributes)](<#func-sparsefloatmatrix-init>) - [func (sfm *SparseFloatMatrix) LoadMatrix(xgiMap, ygiMap map[string]uint)](<#func-sparsefloatmatrix-loadmatrix>) - [func (sfm *SparseFloatMatrix) initThreading(attributes Attributes)](<#func-sparsefloatmatrix-initthreading>) - [func (sfm *SparseFloatMatrix) loadMatrixFloat(xgiSubset, ygiSubset map[string]uint)](<#func-sparsefloatmatrix-loadmatrixfloat>) - [func (sfm *SparseFloatMatrix) loadMatrixFloatOneTh(count, nblines, thID int, lines *[buffsize]string, xgiSubset, ygiSubset map[string]uint, matColMain *[]map[uint]float64, splitChar string, transpose, delOne bool)](<#func-sparsefloatmatrix-loadmatrixfloatoneth>) ## Constants ```go const ( // Matrix format coo Format = "coo" mtx Format = "mtx" cellRanger Format = "cellRanger" buffsize int = 120000 nbSteps int = 100 ) ``` ## Variables MATRIXFORMATS possible options for matrix format ```go var MATRIXFORMATS = [...]Format{coo, mtx, cellRanger} ``` ## func [GetRandomBootstrapIndex]() 
```go func GetRandomBootstrapIndex(arr []uint, downsample int) (index []uint) ``` GetRandomBootstrapIndex gets a random index with repetition ## func [LoadIndexFileToIndex]() ```go func LoadIndexFileToIndex(fname utils.Filename, downcase bool, refMapping map[string]uint) (celliddict map[string]uint, maxIndex int) ``` LoadIndexFileToIndex creates the cell ID index dict. Also returns the max index ## func [LoadPeakDictsToIndex]() ```go func LoadPeakDictsToIndex(fname utils.Filename, sepIn, sepOut string) (featiddict map[string]uint, maxIndex int) ``` LoadPeakDictsToIndex creates the feature ID index dict ## func [Mean]() ```go func Mean(arr []float64, size int) (mean float64) ``` Mean returns the mean of arr given total size ## func [Std]() ```go func Std(arr []float64, mean float64, size int) (std float64) ``` Std returns the std of arr given total size ## func [TestStringToPeak]() ```go func TestStringToPeak(str string) error ``` TestStringToPeak tests if string is a valid peak ## func [TtestPval]() ```go func TtestPval(mean, std float64, size int) (pval float64) ``` TtestPval returns the Student test pval using the T CDF ## func [diffMap]() ```go func diffMap(m1, m2 map[string]uint) (missing []string) ``` ## func [maxUintMap]() ```go func maxUintMap(imap map[string]uint) (maxMap int) ``` ## func [minIndexSliceInt]() ```go func minIndexSliceInt(slice []int, validIndexes []int) int ``` ## func [minInt]() ```go func minInt(a, b int) int ``` ## func [processMtxHeader]() ```go func processMtxHeader(ismtx, transpose bool, reader *bufio.Scanner, maxLengthX, maxLengthY int, xgi, ygi utils.Filename) (splitChar string) ``` ## func [reverseIndex]() ```go func reverseIndex(index map[string]uint, lenIndex int) (indexC []string) ``` reverseIndex Internal function to reverse a map index ## func [reverseIndexC]() ```go func reverseIndexC(indexC []string) (index map[string]uint) ``` reverseIndexC Internal function to reverse an index ## type [Attributes]() Attributes matrix attributes parsed during init ```go type
Attributes struct { Xgi utils.Filename Ygi utils.Filename MatFile utils.Filename XgiSubset utils.Filename YgiSubset utils.Filename ClustersFile utils.Filename MatrixFormat string NbThreads int } ``` ## type [Format]() Format matrix format type ```go type Format string ``` ### func \(Format\) [isValid]() ```go func (t Format) isValid() Format ``` isValid checks whether the matrix format is valid ## type [MatColFloatHash]() MatColFloatHash matrix column class for sparse float matrix ```go type MatColFloatHash struct { // Column matrix mat[ygi][xgi] matCol []map[uint]float64 // Dense column matrix mat[ygi][xgi] // Index subIndexHash map[int]uint xDim, yDim uint } ``` ### func \(\*MatColFloatHash\) [GetRow]() ```go func (mc *MatColFloatHash) GetRow(ygi uint) map[uint]float64 ``` GetRow gets a row from matCol as a sparse map\[uint\]float64 ### func \(\*MatColFloatHash\) [Init]() ```go func (mc *MatColFloatHash) Init(matCol []map[uint]float64, xDim uint) ``` Init initialises MatColFloatHash ## type [MatColHash]() MatColHash matrix column class that can allocate dense submatrices ```go type MatColHash struct { // Column matrix mat[ygi][xgi] matCol *[]map[uint]bool // Dense column matrix mat[ygi][xgi] matColDense [][]bool // Index subIndexHash map[int]uint xDim uint isDense bool } ``` ### func \(\*MatColHash\) [Get]() ```go func (mc *MatColHash) Get(ygi, xgi uint) bool ``` Get gets a matrix value ### func \(\*MatColHash\) [GetCol]() ```go func (mc *MatColHash) GetCol(ygi uint) []bool ``` GetCol gets a matrix column as a dense bool vector ### func \(\*MatColHash\) [GetDim]() ```go func (mc *MatColHash) GetDim() (xDim, yDim int) ``` GetDim returns the dimensions ### func \(\*MatColHash\) [GetIndex]() ```go func (mc *MatColHash) GetIndex(ygi uint) uint ``` GetIndex gets the index from a hashed ygi ### func \(\*MatColHash\) [GetRow]() ```go func (mc *MatColHash) GetRow(ygi uint) map[uint]bool ``` GetRow gets a row from matCol as a sparse map\[uint\]bool ### func \(\*MatColHash\) [GetRowDense]() ```go func (mc *MatColHash)
GetRowDense(ygi uint) (vect []bool) ``` GetRowDense returns the row vector as a dense bool array. If the matrix is not dense, the vector is constructed ### func \(\*MatColHash\) [Init]() ```go func (mc *MatColHash) Init(matCol *[]map[uint]bool, xDim uint) ``` Init initialises MatColHash ### func \(\*MatColHash\) [InitDense]() ```go func (mc *MatColHash) InitDense(matColDense [][]bool) ``` InitDense initialises MatColHash with a dense matrix ### func \(\*MatColHash\) [IsDense]() ```go func (mc *MatColHash) IsDense() bool ``` IsDense returns whether the dense structure is initialised ### func \(\*MatColHash\) [Len]() ```go func (mc *MatColHash) Len(ygiIndex uint) int ``` Len returns the number of non\-zero elements of a column ### func \(\*MatColHash\) [RmDense]() ```go func (mc *MatColHash) RmDense() ``` RmDense removes the dense matrix, if any ### func \(\*MatColHash\) [ToDense]() ```go func (mc *MatColHash) ToDense() ``` ToDense converts sparse to dense ### func \(\*MatColHash\) [ToDenseFromSubset]() ```go func (mc *MatColHash) ToDenseFromSubset(ygis []uint) (newYgis []uint) ``` ToDenseFromSubset converts sparse to dense for a subset of ygis ### func \(\*MatColHash\) [ToDenseFromSubsetAlreadyLoaded]() ```go func (mc *MatColHash) ToDenseFromSubsetAlreadyLoaded(ygis []uint) (newYgis []uint) ``` ToDenseFromSubsetAlreadyLoaded converts sparse to dense but does not recreate matColDense because it is already loaded \(used when neighborhood == 0\) ## type [SparseBoolMatrix]() SparseBoolMatrix class ```go type SparseBoolMatrix struct { // Input files xgi, ygi, matFile, clustersFile utils.Filename xgiSubset, ygiSubset utils.Filename matrixFormat Format XgiIndex, YgiIndex []string XgiIndexC, YgiIndexC map[string]uint Clusters map[string][]uint // cluster key -> list of cell IDs Xdim, Ydim int // Dimension of the matrix MatCol MatColHash // mat.Get(posy, posx) RandMatCol MatColHash // mat[posy][posx] with random posx from matCol, matColT []map[uint]bool // Original matCol value and passed as reference to MatCol.
MatcolT is the transpose //////// Sync utils //////// nbThreads int waiting sync.WaitGroup guard chan int mutex sync.Mutex } ``` ### func \(\*SparseBoolMatrix\) [CreateRandMat]() ```go func (sbm *SparseBoolMatrix) CreateRandMat(nbFeat int, refFeats []uint) ``` CreateRandMat creates a random matrix of size nbFeat x len\(XgiIndex\) ### func \(\*SparseBoolMatrix\) [GetMatColT]() ```go func (sbm *SparseBoolMatrix) GetMatColT() []map[uint]bool ``` GetMatColT gets MatColT ### func \(\*SparseBoolMatrix\) [GetUniformSampling]() ```go func (sbm *SparseBoolMatrix) GetUniformSampling(downsample, totXgi int, matColBucket [][]uint) (xgiIndex []uint) ``` GetUniformSampling gets a uniform sampling of the xgi indexes according to the ygi ### func \(\*SparseBoolMatrix\) [Init]() ```go func (sbm *SparseBoolMatrix) Init(attributes Attributes) ``` Init initialises the matrix. The ygi index is regarded as peak regions and the clusters file is loaded ### func \(\*SparseBoolMatrix\) [Init2]() ```go func (sbm *SparseBoolMatrix) Init2(attributes Attributes) ``` Init2 is the initialisation dedicated to the gene matrix, without loading the clusters file.
The ygi index is not regarded as peak region and is loaded with LoadIndexFileToIndex ### func \(\*SparseBoolMatrix\) [InitMeta]() ```go func (sbm *SparseBoolMatrix) InitMeta(xgiMap map[string]uint, attributes Attributes, skipFirst bool) ``` InitMeta init Metadata matrix, drop first binary attributes ### func \(\*SparseBoolMatrix\) [InitTranspose]() ```go func (sbm *SparseBoolMatrix) InitTranspose() ``` InitTranspose create a transpose matrix of matCol and instantiate matColBucket ### func \(\*SparseBoolMatrix\) [LoadClustersFile]() ```go func (sbm *SparseBoolMatrix) LoadClustersFile() ``` LoadClustersFile load cluster file for sparse matrix ### func \(\*SparseBoolMatrix\) [LoadMatrix]() ```go func (sbm *SparseBoolMatrix) LoadMatrix() ``` LoadMatrix load matrix ### func \(\*SparseBoolMatrix\) [LoadMatrix2]() ```go func (sbm *SparseBoolMatrix) LoadMatrix2(xgiMap, ygiMap map[string]uint) ``` LoadMatrix2 load matrix with xgi and ygi Index. If ygiMap is empty, use the default ygi index ### func \(\*SparseBoolMatrix\) [initThreading]() ```go func (sbm *SparseBoolMatrix) initThreading(attributes Attributes) ``` ### func \(\*SparseBoolMatrix\) [loadMatrixCoo]() ```go func (sbm *SparseBoolMatrix) loadMatrixCoo(matFormat Format, xgiSubset, ygiSubset map[string]uint) ``` loadMatrixCoo load function with either MTX header or not. if xgiSubset is provided, replace xgi index by the index present in xgiSubset ### func \(\*SparseBoolMatrix\) [loadMatrixCooOneTh]() ```go func (sbm *SparseBoolMatrix) loadMatrixCooOneTh(count, nblines, thID int, lines *[buffsize]string, xgiSubset, ygiSubset map[string]uint, matColMain *[]map[uint]bool, splitChar string, transpose, delOne bool) ``` ### func \(\*SparseBoolMatrix\) [loadMetaMatrix]() ```go func (sbm *SparseBoolMatrix) loadMetaMatrix(xgiMap map[string]uint, skipFirst bool) ``` Load meta file with a header and in dense tsv format. 
If skipFirst is set, the first value of each field is skipped to avoid a singular matrix ## type [SparseFloatMatrix]() SparseFloatMatrix class ```go type SparseFloatMatrix struct { // Input files xgi, ygi, matFile utils.Filename xgiSubset, ygiSubset utils.Filename matrixFormat Format XgiIndex, YgiIndex []string XgiIndexC, YgiIndexC map[string]uint Xdim, Ydim int // Dimension of the matrix MatCol MatColFloatHash // mat.Get(posy, posx) //////// Sync utils //////// nbThreads int waiting sync.WaitGroup guard chan int mutex sync.Mutex } ``` ### func \(\*SparseFloatMatrix\) [Init]() ```go func (sfm *SparseFloatMatrix) Init(attributes Attributes) ``` Init is the initialisation dedicated to the gene float matrix, without loading the clusters file ### func \(\*SparseFloatMatrix\) [LoadMatrix]() ```go func (sfm *SparseFloatMatrix) LoadMatrix(xgiMap, ygiMap map[string]uint) ``` LoadMatrix loads the float matrix with the xgi and ygi indexes. If ygiMap is empty, the default ygi index is used ### func \(\*SparseFloatMatrix\) [initThreading]() ```go func (sfm *SparseFloatMatrix) initThreading(attributes Attributes) ``` ### func \(\*SparseFloatMatrix\) [loadMatrixFloat]() ```go func (sfm *SparseFloatMatrix) loadMatrixFloat(xgiSubset, ygiSubset map[string]uint) ``` loadMatrixFloat is the load function for files with or without an MTX header. If xgiSubset is provided, the xgi index is replaced by the index present in xgiSubset ### func \(\*SparseFloatMatrix\) [loadMatrixFloatOneTh]() ```go func (sfm *SparseFloatMatrix) loadMatrixFloatOneTh(count, nblines, thID int, lines *[buffsize]string, xgiSubset, ygiSubset map[string]uint, matColMain *[]map[uint]float64, splitChar string, transpose, delOne bool) ``` Generated by [gomarkdoc]()