# Introduction and installation

Enhlink is a novel computational approach that uses single-cell signals to infer linkages between regulatory sequences such as enhancers and promoters. To accomplish this, Enhlink employs an ensemble strategy that integrates cell covariates and produces robust p-values for each link and each covariate-specific link.

* Designed to leverage multi-omic datasets, Enhlink incorporates complementary omic measurements, such as gene expression, to infer enhancer-target links.
* By modeling the surrounding enhancers and covariates as predictive features of the target accessibility (or gene expression), Enhlink uses a modified Information Gain score, a random forest framework, and a bootstrap procedure to estimate the significant features associated with a given target.
* If desired, Enhlink performs a second-order analysis that identifies linkages specific to a given covariate.
* Enhlink is not limited to proximal enhancers and can also infer distal enhancer-gene linkages.
* Enhlink incorporates a simulation workflow designed using experimentally validated enhancer-promoter signals, which improves prediction accuracy.

![](img/figure_workflow.png)

## Enhlink software

Enhlink is an analytical framework developed in Go (https://go.dev/) and compiled into three executables:

* enhlink
* enhgrid
* enhtools

The command-line manual and arguments of each executable can be accessed with the -h flag (e.g. enhlink -h). _enhlink_ is the main executable that launches the Enhlink pipeline, while _enhgrid_ distributes the processing of a list of genes across multiple CPUs and launches a range of input values for every hyperparameter that accepts a numerical value. For example, enhgrid is useful for automating a grid search over combinations of hyperparameters or for testing different noise levels. _enhtools_ intersects the results from multiple runs and outputs either the common links or the links unique to a particular run.
It also computes the accuracy between two runs (F-score, precision, recall). The source code can be compiled with a Go compiler, and the Unix/Linux executables can be downloaded here: (https://gitlab.com/Grouumf/enhlinktools)

## Installation

### Using precompiled executables

We precompiled the three executables, _enhlink_, _enhgrid_, and _enhtools_, and made them available in an associated [_figshare_](https://figshare.com/articles/software/Compiled_executables_for_Enhlink_/22807103) project. They are compatible with Linux x86_64 and OSX arm64 platforms.

```bash
# Download the executables for Enhlink's suite v0.21.4 and decompress them
wget https://figshare.com/ndownloader/articles/22807103/versions/5 -O enhlink_exec.zip
unzip enhlink_exec.zip

# For Linux x86_64
tar -xvf ./enhlink_linux_v0.21.4.tar.gz
# For OSX arm64
tar -xvf ./enhlink_osx_v0.21.4.tar.gz

# Make the binaries executable
chmod +x ./enhlink_linux_v0.21.4/*

# Enhlink is now ready to be used
./enhlink_linux_v0.21.4/enhlink -h
# enhgrid is now ready to be used
./enhlink_linux_v0.21.4/enhgrid -h
# enhtools is now ready to be used
./enhlink_linux_v0.21.4/enhtools -h

# Link or copy the executables into an accessible local path:
ln -s `pwd`/enhlink_linux_v0.21.4/* /usr/local/bin/

# Enhlink is now accessible everywhere
enhlink -h
enhgrid -h
enhtools -h
```

### Using go installation mechanisms

#### Install a Go compiler (if not already installed)

* Download binaries: [https://golang.org/dl/](https://golang.org/dl/)
* Configure $GOPATH/$GOBIN

```bash
# In .bashrc or .zshrc
export GOROOT=$HOME/go # or wherever your go folder is
export GOBIN=$HOME/go/local/bin # or wherever your local bin folder for go executables is
export GOPATH=$HOME/go/code/:$HOME/code

PATH=$GOPATH:$GOROOT:$PATH
PATH=$HOME/go/bin/:$GOBIN:$PATH
```

* Source your init file: `source ~/.bashrc`

#### Install the packages

```
# go version >= 1.18.
# This might work too with older go versions

## Install enhlink
go install gitlab.com/Grouumf/enhlinktools/enhlink@latest

## Install enhgrid
go install gitlab.com/Grouumf/enhlinktools/enhgrid@latest

## Install enhtools
go install gitlab.com/Grouumf/enhlinktools/enhtools@latest
```

## Installation (from repository)

```bash
git clone https://gitlab.com/Grouumf/enhlinktools.git
cd ./enhlinktools/enhlink
go install .
cd ../enhgrid/
go install .
cd ../enhtools/
go install .
```

Alternatively, one can use `go build` to directly build each executable in its own source folder.

```bash
git clone https://gitlab.com/Grouumf/enhlinktools.git
cd ./enhlinktools/enhlink
# this command compiles the executable into the local directory
go build .
cd ../enhgrid/
go build .
cd ../enhtools/
go build .
```

## enhlink

### Simple processing from cell-ranger ATAC processed files

In its simplest form, Enhlink requires a sparse matrix (_-mat_) with the cell barcode index (_-xgi_), the peak index (_-ygi_) as a bed file, and a four-column bed file (_-gtf_) indicating the target regions, with the name ID of each region in the fourth column. The command below infers links for all the regions defined in the _Homo_sapiens.GRCh38.99.TSS.2K.bed_ promoter file, using the peaks within +/-250kb of each target region as features.

```bash
enhlink -mat matrix.mtx \
        -xgi barcodes.tsv \
        -ygi peaks.bed \
        -gtf Homo_sapiens.GRCh38.99.TSS.2K.bed \
        -out output_example_1/ \
        -threads 6 \
        -format cellRanger
```

Please have a look at the different tutorial sections to see more examples.
* [Tutorial 1](tutorial_1_atac_only.md) describes the processing of one (ATAC) cell x peak matrix
* [Tutorial 2](tutorial_2_simulated_variables.md) describes how to use simulation to estimate the expected accuracy at a given target region
* [Tutorial 3](tutorial_3_create_mat.md) describes how to create a sparse cell x peak matrix compatible with Enhlink
* [Tutorial 4](tutorial_4_multiomics.md) describes how to process multi-omic data from one of the single-cell RNA/ATAC-seq datasets of 10X Genomics
* [Tutorial 5](tutorial_5_advanced_functionalities.md) describes Enhlink's advanced functionalities, such as distal interaction and hyperparameter selection
* [Tutorial 6](tutorial_6_generate_simulated_matrix.md) describes how to visualize the links and how to create bigwig files.

### Arguments

Here are _enhlink_'s arguments.

```bash
#################### Module to Link enhancers to promoters ########################

enhlink inferes enhancer / target co-accessibilities (links) using random forests of ID3 trees and Information gain.

<<<<<<<<<<<<<<<<<<<< WARNING >>>>>>>>>>>>>>>>>>>>
As of March 20 2024, Enhlink v0.21.0, we Changed some of Enhlink's parameters names for clarity and consistency purpose. Below are the list of changes:
(version < 0.21.0) -> (version >= 0.21.0)
cluster -> clusters
promoter -> gtf
genes -> targets
gene -> target
isGeneExpr -> isExpr
rmPeaksInPromoter -> rmPeaksInTargets
<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>

enhlink main inputs are: a) a (cell x peak) sparse matrix, b) a 4-columns promoter TSV file , c) an optional (cell x gene) sparse matrix if the gene activity cannot be inferred from the peaks of the the first matrix and the promoter regions. This matrix can either be interpreted as boolean (e.g.
the promoter of a given gene is either accessible or not for a given cell), or as a float matrix using the -isExpr option, which reflects the gene expression (for example in a context of a scATAC-seq/RNA-seq multi-omic study)

In addition, covariates (cell x covariates) and clusters (cell x clusterID) TSV file can be provided. Finally, multiple optional parameters can be set to fine tune the speed, accuracies, and range of the models.

USAGE: enhlink -mat -xgi -ygi -gtf -out -tag \
               [-mat2 -xgi2 -ygi2 ] \ # IF PASSING A GENE MATRIX FILE
               [-target ] \ # IF FOCUSING ON ONE TARGET
               [-targets ] \ # IF FOCUSING ON A LIST OF TARGETS
               [-isExpr] # IF THE GENE MATRIX IS A GENE EXPRESSION MATRIX
               [-covariates -xgi_subset -ygi_subset -cluster ] \ # OPTIONAL
               [-downsample -threads -n_boot -depth -max_features -threshold min_matsize -min_leafsize -merging_cutoff -neighborhood -format {coo, mtx} -keep_sparse -maxFeatType -rmPeaksInPromoter -linkType {"all", "positive", "negative"} -secondOrder -ignoreEnhancerWeight --uniformSampling] \ # OPTIONAL

  -clusters value
        Clusters file
  -covariates value
        optional covariate dense TSV matrix (row:cell x col:covariates). a column header with the name of the covariates must be the first line and the cell IDs must be the first column. Only categorical covariates can be used for now.
  -depth int
        Max tree depth. If negative, no max tree depth will be used (To use the maximum depth, set to: -1) (default 2)
  -downsample int
        Downsample the number of samples used for each bootstrap if > 0 and if > len(cluster)
  -format string
        Input matrix format {coo, mtx, cellRanger}. "coo" is a numeric 3-columns (cell-Index feat-index value) file with indices starting at 0. "mtx" is similar to "coo" but the first two lines indicates the number of rows and columns (see documentation). "cellRanger" is similar to "mtx" but indices start at 1 and rows and columns are switched (default "coo")
  -gtf value
        BED File containing the genomic coordinates (three first columns) of each target and the target ID (e.g. gene symbol)
  -ignoreEnhancerWeight
        Ignore Enhancers weight (the ratio of accessibility) in the computation of the modified Information Gain. Useful to not overweight widely accessible enhancers when performing distal inference with neighborhood very large (or == 0)
  -isExpr
        Use a expression float matrix for mat2 instead of a boolean matrix
  -keep_sparse
        Keep the main ColMat matrix sparse in the memory. Usefull for memory reason if neighboorhood is very wide (or ==0), but slow down computation (default: false)
  -lambda1 float
        Lambda parameter of a poisson distribution, that controls the amount of dropouts of the simulated variables (default 1.8)
  -lambda2 float
        Lambda parameter of a poisson distribution, that controls the amount of false positives of the simulated variables (default 0.05)
  -mat value
        Input matrix file
  -mat2 value
        Input expression matrix file
  -maxFeatType string
        Maximum of features to be considered for a given tree. accepted value: {"all", "sqrt", "log"}. can be an Int (number of features) or a float between 0 and 1 (fraction of features) (default "all")
  -max_features int
        Maximum number of explanatory features per bootstrap model. If set, only K features are kept for each model based on the coefficient, even if the number of non null coefs is larger (default 4)
  -merging_cutoff int
        Cutoff (bp) for merging close target peak (default: 1500) (default 1500)
  -min_leafsize int
        Min size of a tree leaf (default: 10) (default 10)
  -min_matsize int
        Min matrix size to use for computation.
        If too small, create random features from the surrounding enhancers (default: 0) (default 100)
  -n_boot int
        Number of boostrap regressions to perform for each target (default 100)
  -nb_sim_features int
        if set to K and K > 0, Perform a simulation on K features, using the other variables as background and a poisson distribution to control the amount of noise in the simulated variables. The accuracy metrics (TPR, TNR, FPR, FNR), are reported for each target in a separated file (default: 0)
  -neighborhood int
        Neighborhood regions in (+/-) BP to explore (default 2.5e5) (default 250000)
  -linkType string
        Which links to keep {"all", "positive", "negative"}. Positive and negative links refers to positively or negatively correlated linkages (default "all")
  -onlySim
        Only perform simulation.
  -out string
        Output directory
  -rmPeaksInTargets
        For each target model, Remove peaks which are within the target boundaries
  -secondOrder
        Identify the covariates associated with each inferred enhancer-target links
  -secondOrderMaxFeat int
        Maximum number of explanatory features per bootstrap model. If set, only K features are kept for each model based on the coefficient, even if the number of non null coefs is larger (default 2)
  -tag string
        Output files tag
  -target string
        Perform computation for one single target (e.g. one gene)
  -targets value
        File containing list of targets (e.g. gene IDs) to use for computation
  -threads int
        Number of internal threads (default: 2) (default 2)
  -threshold float
        P-value threshold (default: 0.05) (default 0.05)
  -uniformSampling
        Randomly sample the cells to have an uniform covariate distribution for each bootstrap.
        Needs a covariate matrix
  -version
        Show current version
  -xgi value
        row index for input mat
  -xgi2 value
        row index for input expression mat
  -xgi_subset value
        Subset of xgi (cells) to use for computation
  -ygi value
        column index for input mat
  -ygi2 value
        column index for input expression mat
  -ygi_subset value
        Subset of ygi (peaks) to use for computation
```

### enhgrid

_enhgrid_ can be used in place of _enhlink_ and produces exactly the same results. In addition, _enhgrid_ can parallelize the processing of a large number of genes with the _-processes_ option. For example, with 6 x 3 = 18 CPUs available, we can use the following command:

```bash
enhgrid -mat matrix.mtx \
        -xgi barcodes.tsv \
        -ygi peaks.bed \
        -gtf Homo_sapiens.GRCh38.99.TSS.2K.bed \
        -out output_example_3/ \
        -threads 6 \
        -processes 3 \
        -splitTargetList \
        -format cellRanger
```

This would be faster than running _enhlink_ with 18 threads, since the threads of a single process all work on one target region at a time. Another reason to use _enhgrid_ is to try multiple values of a hyperparameter, such as _lambda1_ and _lambda2_ described in [Tutorial 2](tutorial_2_simulated_variables.md). For example, below we use enhgrid to generate linkages for different neighborhood sizes (in bp) and different forest sizes:

```
# this is just an example: if you launch it, it will take a long time to compute.
enhgrid -mat matrix.mtx \
        -xgi barcodes.tsv \
        -ygi peaks.bed \
        -gtf Homo_sapiens.GRCh38.99.TSS.2K.bed \
        -out output_example_3/ \
        -threads 6 \
        -n_boot 50,100,150 \
        -neighborhood 250000,500000,1e6 \
        -processes 3 \
        -format cellRanger
```

enhgrid will generate output files for each combination of _-neighborhood_ and _-n_boot_.

Here are _enhgrid_'s arguments.

```bash
#################### Module to Link enhancers to promoters ########################

enhgrid performs enhlink on multiple processes for a range of hyperparameter values. enhgrid generates output files for each hyperparameter combination.
The following parameters can accept multiple values:
    -downsample
    -n_boot
    -depth
    -max_features
    -secondOrderMaxFeat
    -threshold
    -min_matsize
    -min_leafsize
    -merging_cutoff
    -neighborhood
    -maxFeatType
    -lambda1
    -lambda2
    -threads

Multiple values can be passed as input using either comma or space: for example -depth 2,3,4 or -depth "2 3 4"

Enhgrid can accept the exact same parameters than Enhlink with additional functionalities:

## Parameters unique to enhgrid:
    -randomNTargets which allows to pick, for each grid iteration, N tatgets at random from the index and process them instead of the full list of targets
    -repetition Number of repetition to be performed for each iteration (default: 1)
    -processes Number of Enhlink processes to be launched in parallel (default: 1)
    -splitTargetList Split the list of genes through the n processes

<<<<<<<<<<<<<<<<<<<< WARNING >>>>>>>>>>>>>>>>>>>>
As of March 20 2024, Enhlink v0.21.0, we Changed some of Enhgrid's parameters names for clarity and consistency purpose.
Below are the list of changes:
(version < 0.21.0) -> (version >= 0.21.0)
cluster -> clusters
promoter -> gtf
genes -> targets
gene -> target
isGeneExpr -> isExpr
rmPeaksInPromoter -> rmPeaksInTargets
splitGeneList -> splitTargetList
randomNGenes -> randomNTargets
<<<<<<<<<<<<<<<<<<<<<<<<>>>>>>>>>>>>>>>>>>>>>>>>>

USAGE: enhgrid -mat -xgi -ygi -gtf -out -tag \
               [-mat2 -xgi2 -ygi2 ] \ # IF PASSING A GENE MATRIX FILE
               [-target ] \ # IF FOCUSING ON ONE TARGET
               [-targets ] \ # IF FOCUSING ON A LIST OF TARGETS
               [-isExpr] # IF THE GENE MATRIX IS A GENE EXPRESSION MATRIX
               [-covariates -xgi_subset -ygi_subset -clusters ] \ # OPTIONAL
               [-downsample -threads -n_boot -depth -max_features -threshold min_matsize -min_leafsize -merging_cutoff -neighborhood -format {coo, mtx} -keep_sparse -maxFeatType -rmPeaksInTargets -linkType {"all", "positive", "negative"} -secondOrder -ignoreEnhancerWeight -uniformSampling] \ # OPTIONAL
               [-randomGenes -repetition -processes --splitTargetList] # OPTIONAL and specific to enhgrid

  -clusters value
        Clusters file
  -covariates value
        optional covariate dense TSV matrix (row:cell x col:covariates). a column header with the name of the covariates must be the first line and the cell IDs must be the first column. Only categorical covariates can be used for now.
  -depth string
        Max tree depth. If negative, no max tree depth will be used (default: -1) (default "2")
  -downsample string
        Downsample the number of samples used for each bootstrap if > 0 and if > len(cluster) (default "0")
  -format string
        Input matrix format {coo, mtx, cellRanger}. "coo" is a numeric 3-columns (cell-Index feat-index value) file with indices starting at 0. "mtx" is similar to "coo" but the first two lines indicates the number of rows and columns (see documentation). "cellRanger" is similar to "mtx" but indices start at 1 and rows and columns are switched (default "coo")
  -gtf value
        BED File containing the genomic coordinates (three first columns) of each target and the target ID (e.g.
        gene symbol)
  -ignoreEnhancerWeight
        Ignore Enhancers weight (the ratio of accessibility) in the computation of the modified Information Gain. Useful to not overweight widely accessible enhancers when performing distal inference with neighborhood very large (or == 0)
  -isExpr
        Use an expression float matrix for -mat2
  -keep_sparse
        Keep the main ColMat matrix sparse in the memory. Usefull for memory reason if neighboorhood is very wide (or ==0), but slow down computation (default: false)
  -lambda1 string
        Lambda parameter of a poisson distribution, that controls the amount of dropouts of the simulated variables (default "1.8")
  -lambda2 string
        Lambda parameter of a poisson distribution, that controls the amount of false positives of the simulated variables (default "0.05")
  -mat value
        Input matrix file
  -mat2 value
        Input target matrix file, defining the value of each target for each cell
  -maxFeatType string
        Maximum of features to be considered for a given tree. accepted value: {"all", "sqrt", "log"}. can be an Int (number of features) or a float between 0 and 1 (fraction of features) (default "all")
  -max_features string
        Maximum number of explanatory features per bootstrap model. If set only K features are kept for each model based on the coefficient, even if the number of non null coefs is larger (default "4")
  -merging_cutoff int
        Cutoff (bp) for merging close promoter peak (default: 1500) (default 1500)
  -min_leafsize string
        Min size of a tree leaf (default: 10) (default "10")
  -min_matsize string
        Min matrix size to use for computation. If too small, create random features from the surrounding enhancers (default: 0) (default "100")
  -n_boot string
        Number of boostrap regressions to perform for each target (default "100")
  -nb_sim_features string
        if set to K and K > 0, Perform a simulation on K features, using the other variables as background and a poisson distribution to control the amount of noise in the simulated variables.
        The accuracy metrics (TPR, TNR, FPR, FNR), are reported for each promoter in a separated file (default: 0) (default "0")
  -neighborhood string
        Neighborhood regions in (+/-) BP to explore (default 2.5e5) (default "250000")
  -linkType string
        Which links to keep {"all", "positive", "negative"}. Positive and negative links refers to positively or negatively correlated linkages (default "all")
  -onlySim
        Only perform simulation.
  -out string
        Output directory
  -processes int
        Number of Enhlink processes to be launched in parallel. The total number of cpus used would be threads x processes (default 1)
  -randomNTargets int
        Number of random targets to draft and process for each iteration instead of analyzing the full set of targets
  -repetition int
        Number of repetition to be performed for each iteration (default 1)
  -rmPeaksInTargets
        For each target model, Remove peaks which are within the target boundaries
  -secondOrder
        Identify the covariates associated with each inferred enhancer-target links
  -secondOrderMaxFeat string
        Maximum number of explanatory features per bootstrap model. If set, only K features are kept for each model based on the coefficient, even if the number of non null coefs is larger (default "2")
  -splitTargetList
        Split the list of targets through the n processes
  -tag string
        Output files tag
  -target string
        Perform computation for one single target
  -targets value
        File containing a subset of targets to focus on for computation
  -threads string
        Number of internal threads (default: 2) (default "2")
  -threshold string
        P-value threshold (default: 0.05) (default "0.05")
  -uniformSampling
        Randomly sample the cells to have an uniform covariate distribution for each bootstrap.
        Needs a covariate matrix
  -version
        Show current version
  -xgi value
        row index for input mat
  -xgi2 value
        row index for input target mat
  -xgi_subset value
        Subset of xgi (cells) to use for computation
  -ygi value
        column index for input mat
  -ygi2 value
        column index for input target mat
  -ygi_subset value
        Subset of ygi (peaks) to use for computation
```

### enhtools

_enhtools_ can compute intersections and differences between bedpe files, and can filter or intersect a bedpe file according to genomic regions defined in a bed file:

```bash
# -intersect is based on perfect matches between files with identical file names from two folders
enhtools -intersect -in -in2 -out

# -intersect2 is based on perfect matches (the matched regions are identical) between file1 and file2
enhtools -intersect2 -in file1.bedpe -in2 file2.bedpe -out file3.bedpe

# -intersect3 is based on imperfect matches between file1 and file2
enhtools -intersect3 -in file1.bedpe -in2 file2.bedpe -out file3.bedpe

# -diff outputs the set of non-overlapping links from file1.
# file4.bedpe will contain links from file1 that don't intersect links from file2
enhtools -diff -in file1.bedpe -in2 file2.bedpe -out file4.bedpe

# file1.filtered.bedpe will contain links from file1 that are also within one of the regions of TAD.bed
enhtools -filter -in file1.bedpe -bed TAD.bed -out file1.filtered.bedpe

# -intersectBed intersects links with at least a matching region with a bed file
enhtools -intersectBed -in file1.bedpe -bed region.bed -out file1.intersect.bedpe
```

Here are _enhtools_'s arguments.

```bash
#################### Post processing of enhlink results ########################

enhtools interescts bedpe files and compute accuracy metrics (TPR, FPR, F-score...)
USAGE:

* with -intersect (Intersection of output directory obtained from enhlink)
enhtools -intersect -in -in2 -out (optional: -tag/tag2/outtag -scorePos/pvalPos/geneIDPos -mergeScore {left, right, mean} -prec )

* with -intersect2 (Intersection of two bedpe files)
enhtools -intersect2 -in -in2 -out (optional: -tag/tag2/outtag -scorePos/pvalPos/geneIDPos -mergeScore {left, right, mean} -prec -stdout)

* with -intersect3 (Intersect input bedpe files, based on unperfect matches. A match occurs if both regions of the 2 bedpe files intersect respectively)
enhtools -intersect3 -in -in2 -out (optional: -stdout)

* with -intersect4 (Intersect input bedpe files, based on unperfect matches. A match occurs if at least one of two regions of the first bedpe file intersects with one of the two regions of the second bedpe file)
The metadata columns to annotate each match are defined with the scorePos, pvalPos, and geneIDPos options.
enhtools -intersect4 -in -in2 -out (optional: -stdout -scorePos -pvalPos -geneIDPos)

* with -intersectBed (Intersect bedpe with bed, at least one bedpe region of intersect one of the BED region)
Matching type defined by the -maching option ("either", "left", "right", "both") controls which bedpe region should have a match.
enhtools -intersectBed -in -bed -out (optional: -matching -stdout -scorePos -pvalPos -geneIDPos)

* with -diff (Difference between (file1 - file2) input bedpe files, based on unperfect matches. A match occurs if both regions of the 2 bedpe files intersect respectively)
enhtools -diff -in -in2 -out (optional: -stdout)

* with -filter (filter links from bedpe file (-in) which are not within at least one of the region defined in (-bed). If -diff is added, only the links not within at least one region of -bed will be outputed.
-filter is well adapted for filtering links not in TAD regions, defined in a BED file
enhtools -filter -in -bed (optional: -stdout -diff)

TIPS: scorePos/pvalPos/geneIDPos can be used to se different column IDs for in and in2! Use the ":" to delimitate the seprators for the files from in and in2

  -bed string
        BED file used to filter bedpe links
  -diff
        Output the difference between the bedpes given by -in and -in2 with -intersect3
  -filter
        filter links from bedpe file (-in) which are not within at least one of the region defined in (-bed). If -diff is added, only the links not within at least one region of -bed will be outputed. -filter is well adapted for filtering links not in TAD regions, defined in a BED file
  -geneIDPos string
        column ID(s) from bedpe file1 (in1), starting from 0, that define the gene ID.if genePos different between -in and -in2, use the following format ":", example: "8:9"
  -in string
        Input results directory (-intersect) or input bedpe file (-intersect2)
  -in2 string
        2nd Input results directory (-intersect) or input bedpe file (-intersect2)
  -intersect
        intersect the outputs of enhlink by retaining the intersection of the results
  -intersect2
        intersect two bedpe files
  -intersect3
        Intersect input bedpe files, based on unperfect matches. A match occurs if both regions of the 2 bedpe files intersect respectively
  -intersect4
        Intersect input bedpe files, based on unperfect matches.
        A match occurs if at least one of two regions of the first bedpe file intersects with one of the two regions of the second bedpe file
  -intersectBed
        Intersect bedpe with bed (at least one bedpe region of intersect one of the BED region)
  -matching string
        matching type for -intersectBed {"either" (default) "left" (left region of the bed), "right" (right region), "both" (both regions)} (default "either")
  -mergeScore string
        Strategy to merge scores: {left, right, mean} (default "left")
  -out string
        output (directory / file)
  -outtag string
        Output files tag
  -prec int
        Float precision (default 4)
  -pvalPos string
        column ID(s) from bedpe file1 (in1), starting from 0, indicating the pvalues. if pvalPos different between -in and -in2, use the following format ":", example: "7:6"
  -scorePos string
        column ID(s) from bedpe file1 (in1), starting from 0, indicating the scores. if scorePos different between -in and -in2, use the following format ":", example: "8:9"
  -stdout
        write to stdout
  -tag string
        Input files tag
  -tag2 string
        2nd input files tag
  -version
        Show current version
```
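
To make the notions of perfect match and accuracy between two runs more concrete, here is a minimal sketch written with standard Unix tools. It is not how _enhtools_ is implemented and it does not call _enhtools_. It assumes that a link is identified by the first six columns of a bedpe file (the coordinates of its two regions), and the files `pred.bedpe` and `truth.bedpe`, their coordinates, and the `link_keys` helper are hypothetical names created only for this illustration.

```bash
#!/usr/bin/env bash
set -euo pipefail
export LC_ALL=C  # same byte-wise ordering for sort and comm

# Toy inputs (hypothetical coordinates): "pred" holds 3 links, "truth" holds 2.
workdir=$(mktemp -d)
printf 'chr1\t100\t200\tchr1\t5000\t5200\n'  > "$workdir/pred.bedpe"
printf 'chr1\t300\t400\tchr1\t5000\t5200\n' >> "$workdir/pred.bedpe"
printf 'chr2\t100\t200\tchr2\t9000\t9200\n' >> "$workdir/pred.bedpe"
printf 'chr1\t100\t200\tchr1\t5000\t5200\n'  > "$workdir/truth.bedpe"
printf 'chr2\t100\t200\tchr2\t9000\t9200\n' >> "$workdir/truth.bedpe"

# Assumption: a link is identified by its first six columns (the two regions).
link_keys() { cut -f1-6 "$1" | sort -u; }

link_keys "$workdir/pred.bedpe"  > "$workdir/pred.keys"
link_keys "$workdir/truth.bedpe" > "$workdir/truth.keys"

# Perfect matches are the keys present in both files.
tp=$(comm -12 "$workdir/pred.keys" "$workdir/truth.keys" | wc -l | tr -d ' ')
n_pred=$(wc -l < "$workdir/pred.keys" | tr -d ' ')
n_truth=$(wc -l < "$workdir/truth.keys" | tr -d ' ')

# Precision = TP / predicted, recall = TP / reference, F-score = harmonic mean.
metrics=$(awk -v tp="$tp" -v np="$n_pred" -v nt="$n_truth" 'BEGIN {
    p = (np > 0) ? tp / np : 0
    r = (nt > 0) ? tp / nt : 0
    f = (p + r > 0) ? 2 * p * r / (p + r) : 0
    printf "precision=%.4f recall=%.4f fscore=%.4f", p, r, f
}')
echo "$metrics"

rm -rf "$workdir"
```

Comparing keys rather than whole lines means that score or p-value columns, which can differ between runs, do not affect the match. An imperfect match, as used by -intersect3, would instead require an interval-overlap test on both regions, which this sketch does not attempt.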