2 Dec 2014: Domain architecture images (PNG and SVG) are now drawn using pfam27 hits.
10 Oct 2014: SeqDepot has been updated with the non-redundant database.
SeqDepot massively simplifies the retrieval of primary sequences and associated secondary data by:
- consolidating protein sequences and precomputed data,
- providing access via a well-documented REST API, and
- supplying tools to rapidly integrate knowledge of interest into your research.
Despite many excellent existing resources and databases, gathering biological information remains a tedious and error-prone task. SeqDepot streamlines this process by minimizing the data retrieval and consumption burden, so that you can focus on what's important.
Virtually every bioinformatics project involves the analysis of numerous sequences in conjunction with diverse functional attributes. For example, knowledge of a protein's domain architecture - the set of domains, coiled-coil regions, conserved repeats, etc. - significantly improves understanding of its function.
Merely organizing the base data set likely requires downloading data from multiple servers, installing and configuring tools, ad hoc programming, and various computational runs. Moreover, this approach is vastly inefficient. Numerous analytical results produced by popular tools such as HMMER depend strictly upon the raw sequence, so identical sequences always yield the same result. You can imagine the computational waste going on!
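The redundancy described above can be avoided by keying results on a digest of the raw sequence, so that each distinct sequence is analyzed only once. The following is a minimal sketch of that idea and not SeqDepot's actual implementation: the normalization rules (whitespace removed, uppercased), the choice of an MD5 hex digest as the key, and the names `normalize`, `sequence_digest`, and `analyze_unique` are assumptions made for illustration, and `expensive_analysis` is a hypothetical stand-in for a tool run such as HMMER.

```python
import hashlib
from typing import Callable, Dict, Iterable


def normalize(sequence: str) -> str:
    """Assumed normalization: drop all whitespace and uppercase the residues."""
    return "".join(sequence.split()).upper()


def sequence_digest(sequence: str) -> str:
    """Return the MD5 hex digest of the normalized sequence."""
    return hashlib.md5(normalize(sequence).encode("ascii")).hexdigest()


def analyze_unique(
    sequences: Iterable[str],
    analysis: Callable[[str], object],
) -> Dict[str, object]:
    """Run `analysis` once per distinct sequence and key the results by digest."""
    results: Dict[str, object] = {}
    for sequence in sequences:
        key = sequence_digest(sequence)
        if key not in results:  # identical sequences reuse the stored result
            results[key] = analysis(normalize(sequence))
    return results


calls = []


def expensive_analysis(sequence: str) -> int:
    """Hypothetical stand-in for a costly tool run; it only returns the length."""
    calls.append(sequence)
    return len(sequence)


# Three spellings of the same sequence plus one different sequence.
inputs = ["MKTAYIAK", "mktayiak", "MKT AYIAK\n", "MSEQ"]
results = analyze_unique(inputs, expensive_analysis)
print(len(results), len(calls))  # prints: 2 2
```

Because the key depends only on the sequence itself, the same lookup works regardless of which source database a sequence came from.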
SeqDepot can benefit anyone working with protein sequence data; however, because all access must be done via the REST API, basic programming skills are helpful. Thus, bioinformaticians and computational biologists will likely benefit the most.
I want to do X, but do not know how to program!
We have also developed a handy Perl program, sdQuery, that performs many useful tasks (e.g. retrieving precomputed data for FASTA sequences, downloading images, etc.). Give it a try and read the documentation to learn its capabilities.
To demonstrate SeqDepot's capabilities and how to use it effectively, we developed a query interface that communicates with the SeqDepot database using the public API.
We recommend you start by experimenting with different parameters and different sample inputs to develop a feel for how it works.
Resources such as InterPro and SIMAP provide comprehensive coverage and complex interfaces for searching this space. Though powerful and highly useful, working with these databases is not completely streamlined. SeqDepot makes obtaining and utilizing predicted protein features (e.g. transmembrane regions, domains, etc.) incredibly easy. Interested? Find out more about how SeqDepot can accelerate your research.
- SOURCE AGNOSTIC
- Intrinsic identifiers derived solely from the raw sequence enable rapid, fool-proof querying independent of external cross-references (although SeqDepot handles that too).
- Duplicate sequences are merged into a single record (along with any relevant metadata). This improves database performance and facilitates data retrieval.
- REST API
- Using the REST API, easily fetch entire or partial records in a single request to a semantically meaningful URI. Batch retrieval of up to 1,000 queries per request is also supported.
- PRECOMPUTED DATA
- Each sequence record includes predictions derived from 19 distinct analytical tools - Pfam / SMART domains, transmembrane regions, signal peptides, SuperFamily, and several more.
- RAPIDLY CONSUMABLE
- Results are encoded in JSON and may readily be converted into native data structures for downstream manipulation.
- Query using a wide variety of external database identifiers - GenBank (GI), UniProt, PDB, or MD5 hexadecimal digests.
- LOCAL DATABASE
- Want to run SeqDepot locally? Download the database and crank up your own local copy powered by MongoDB.
- Easily generate PNG or SVG images visualizing a sequence's domain architecture.
- Download sdQuery to perform many common tasks (e.g. retrieving precomputed data for FASTA sequences, downloading images, etc.). Perl / Python modules simplify server interaction with several easy-to-use methods.
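The batch limit and JSON encoding mentioned in the feature list suggest a simple client-side pattern: split a long identifier list into chunks of at most 1,000, then decode each response into native data structures. The sketch below makes no network requests. The helper names `chunked` and `decode_records`, the placeholder identifiers, and the sample payload with its field names are all hypothetical and do not reflect SeqDepot's actual response format or the methods of its Perl / Python modules.

```python
import json
from typing import Iterator, List, Sequence

MAX_BATCH = 1000  # per-request limit stated in the feature list


def chunked(ids: Sequence[str], size: int = MAX_BATCH) -> Iterator[List[str]]:
    """Yield consecutive slices of `ids`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be positive")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


def decode_records(payload: str) -> list:
    """Decode a JSON array of records into Python lists and dicts."""
    records = json.loads(payload)
    if not isinstance(records, list):
        raise ValueError("expected a JSON array of records")
    return records


# 2,500 placeholder identifiers need three requests under the limit.
ids = [f"id{n}" for n in range(2500)]
batches = list(chunked(ids))
print([len(batch) for batch in batches])  # prints: [1000, 1000, 500]

# Hypothetical payload; real field names may differ.
sample_payload = '[{"id": "id0", "length": 120}, {"id": "id1", "length": 87}]'
records = decode_records(sample_payload)
print(records[0]["length"])  # prints: 120
```

Each batch would be sent as one request, and the decoded records can then be handled like any other list of dictionaries.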