National Cancer Institute, U.S. National Institutes of Health
SAGE Genie

Extract SAGE Tags From Sequence Files

What the SAGE Tag Extraction Tool Can Do

The tag extraction tool allows you to extract 10-bp or 17-bp SAGE tags from sequence files that you upload on this page. You may request that linker-similar tags be removed from the results; for this option you may use your own list of linker-similar tags or use default lists. The tag extraction tool will return to you the list of extracted tags as well as a report on the process. The extraction tool also allows you to extract 10-bp tags from a list of 17-bp tags by taking the first 10 base pairs of each 17-bp tag and then collating results.

1. Extract Tags From Sequence Files

1. Prepare a compressed file containing all your sequences in FASTA format; each sequence file must have the extension '.seq' before compression. The only accepted compression formats are (1) zip files produced by WinZip on Windows, and (2) .zip and .gz files produced on Unix/Linux systems. You do not need a separate file for each FASTA sequence: a single '.seq' file may contain multiple FASTA sequences, and you may submit multiple '.seq' files that each contain multiple FASTA sequences. If you are submitting multiple '.seq' files from a Unix/Linux machine, first use tar to create a single file, which can then be compressed (for example, xxx.tar.gz). We only process WinZip zip files from Windows and tar.gz files from Unix.

2. Enter the name of the compressed file containing your sequence file(s), or use the "Browse" button to locate the file in a local directory.

3. Choose one of the following options: specify your own linker-similar sequences, use the default linker-similar sequences, or don't exclude any linker-similar sequences. The default linker-similar lists contain every tag that is a one-bp substitution, insertion, or deletion variant of TCCCTATTAA or TCCCCGTACA (short SAGE), or of TCGGACGTACATCGTTA or TCGGATATTAAGCCTAG (long SAGE).

Use default: default short linker-similar list

8. Click the "Extract Tags" button.
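
To make the default linker-similar lists from step 3 more concrete, here is a minimal Python sketch that enumerates every one-bp substitution, insertion, and deletion variant of a linker sequence. It is an illustration only, not the tool's actual code. The function name one_bp_variants is hypothetical. The sketch also leaves insertion and deletion variants at their natural length (one base longer or shorter than the linker); how the tool normalizes them to 10-bp or 17-bp tags is not stated on this page.

```python
BASES = "ACGT"


def one_bp_variants(tag):
    """Return every sequence that differs from `tag` by exactly one
    substitution, one insertion, or one deletion (hypothetical helper)."""
    variants = set()
    n = len(tag)
    for i in range(n):
        # Substitution: replace the base at position i with each other base.
        for base in BASES:
            if base != tag[i]:
                variants.add(tag[:i] + base + tag[i + 1:])
        # Deletion: drop the base at position i.
        variants.add(tag[:i] + tag[i + 1:])
    for i in range(n + 1):
        # Insertion: add one base before position i (or at the end).
        for base in BASES:
            variants.add(tag[:i] + base + tag[i:])
    return variants


if __name__ == "__main__":
    # Short SAGE linker sequences quoted in step 3.
    for linker in ("TCCCTATTAA", "TCCCCGTACA"):
        similar = one_bp_variants(linker)
        print(linker, len(similar), "one-bp variants")
```

A set is used so that variants reachable in more than one way (for example, deleting either base of a repeated pair) are listed only once.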

2. Extract Short Tags From Long Tags

1. Prepare a file containing long tags with their frequencies. Each line in the file must contain one tag and its numeric frequency, separated by a TAB. Do not compress the file.

3. Click the "Extract Tags" button.
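
The short-from-long conversion can be sketched in a few lines of Python, using the input format from step 1 (one tag and its frequency per line, separated by a TAB). This is a rough illustration rather than the tool's implementation. It assumes that "collating" means summing the frequencies of long tags that share the same first 10 bases. The function name collapse_long_tags and the example tags are made up for illustration.

```python
from collections import Counter


def collapse_long_tags(lines, short_len=10):
    """Collapse long tags to short tags (hypothetical helper).

    Each input line holds a tag and its numeric frequency separated by a TAB.
    Assumption: frequencies of long tags sharing a prefix are summed.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        tag, freq = line.split("\t")
        counts[tag[:short_len]] += int(freq)
    return counts


if __name__ == "__main__":
    # Made-up 17-bp tags, for illustration only.
    example = [
        "AAAAAAAAAACCCCCCC\t3",
        "AAAAAAAAAAGGGGGGG\t2",
        "CCCCCCCCCCAAAAAAA\t5",
    ]
    for short_tag, freq in sorted(collapse_long_tags(example).items()):
        print(f"{short_tag}\t{freq}")
```

To process a real file, pass an open file handle as `lines`, for example `collapse_long_tags(open("long_tags.txt"))`, where the file name is a placeholder.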