Ulrich, L.E. and Zhulin, I.B. Bioinformatics (2014)

SeqDepot

SeqDepot Overview

News:

2 Dec 2014: Domain architecture images (PNG and SVG) are now drawn using Pfam 27 hits.

10 Oct 2014: SeqDepot has been updated with the non-redundant database.


What

SeqDepot massively simplifies the retrieval of primary sequences and associated secondary data by:

  1. consolidating protein sequences and precomputed data,
  2. providing access via a well-documented REST API, and
  3. supplying tools to rapidly integrate knowledge of interest into your research.

Why

Despite many excellent existing resources and databases, gathering biological information remains a tedious and error-prone task. SeqDepot streamlines this process by minimizing the data retrieval and consumption burden, so that you can focus on what's important.

Who

SeqDepot can benefit anyone working with protein sequence data; however, because all access must be done via the REST API, basic programming skills are helpful. Thus, bioinformaticians and computational biologists will likely benefit the most.

I want to do X, but do not know how to program!

We have also developed a handy Perl program, sdQuery, that performs many useful tasks (e.g. retrieving precomputed data for FASTA sequences, downloading images, etc.). Give it a try and read the documentation to learn its capabilities.

Get Started

To demonstrate SeqDepot's capabilities and how to use it effectively, we developed a query interface that communicates with the SeqDepot database using the public API.

We recommend you start by experimenting with different parameters and different sample inputs to develop a feel for how it works.

Awesome features!

SOURCE AGNOSTIC
Intrinsic identifiers derived solely from the raw sequence enable rapid, fool-proof querying independent of external cross-references (although SeqDepot handles that too).
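As a rough illustration of a sequence-derived key, the sketch below computes an MD5 hexadecimal digest (one of the query keys listed under CROSS-REFERENCES) from a raw sequence. The function name `sequence_md5_hex` and the normalization step (removing whitespace and upper-casing) are assumptions made for this example, not SeqDepot's documented procedure; check the API documentation for the exact rules.

```python
import hashlib


def sequence_md5_hex(sequence: str) -> str:
    """Return the MD5 hexadecimal digest of a protein sequence.

    Assumption: whitespace is removed and letters are upper-cased before
    hashing. SeqDepot's own normalization rules may differ.
    """
    normalized = "".join(sequence.split()).upper()
    return hashlib.md5(normalized.encode("ascii")).hexdigest()


if __name__ == "__main__":
    # Arbitrary example sequence, split across lines as it might be in FASTA.
    print(sequence_md5_hex("MKTAYIAKQR\nQISFVKSHFSRQ"))
```

Because the key depends only on the residues, the same sequence maps to the same digest regardless of which database it came from.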
NON-REDUNDANT
Duplicate sequences are merged into a single record (along with any relevant metadata). This improves database performance and facilitates data retrieval.
REST API
Using the REST API, easily fetch an entire record or only part of one in a single request to a semantically meaningful URI. Batch retrieval of up to 1,000 queries per request is also supported.
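When working with more identifiers than a single batch request allows, one option is to split them client-side before sending. The following is a minimal sketch of that splitting step only; the helper name `batches` is hypothetical, and no SeqDepot endpoint or request format is assumed here.

```python
from typing import Iterator, List, Sequence

MAX_BATCH_SIZE = 1000  # per-request limit stated in the feature description


def batches(ids: Sequence[str], size: int = MAX_BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive chunks of `ids`, each holding at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])


if __name__ == "__main__":
    # Placeholder identifiers; real queries would use actual sequence IDs.
    example_ids = [f"id{n}" for n in range(2500)]
    for chunk in batches(example_ids):
        print(len(chunk))
```

Each yielded chunk can then be submitted as one batch request.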
PRECOMPUTED DATA
Each sequence record includes predictions from 19 distinct analytical tools, including Pfam / SMART domains, transmembrane regions, signal peptides, SuperFamily, and several more.
RAPIDLY CONSUMABLE
Results are encoded in JSON and may readily be converted into native data structures for downstream manipulation.
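To show what converting a JSON result into native data structures can look like, here is a small sketch using Python's standard `json` module. The payload in `SAMPLE_RESPONSE` is invented for illustration: its field names (`id`, `length`, `tools`, `name`, `start`, `stop`) and values are assumptions, not SeqDepot's actual response schema.

```python
import json
from typing import List, Tuple

# Hypothetical payload: field names and values are made up for this example.
SAMPLE_RESPONSE = """
{
  "id": "example-id",
  "length": 120,
  "tools": {
    "pfam": [{"name": "ExampleDomain", "start": 5, "stop": 80}]
  }
}
"""


def domain_spans(response_text: str, tool: str) -> List[Tuple[str, int, int]]:
    """Parse a JSON record and return (name, start, stop) tuples for one tool."""
    record = json.loads(response_text)  # JSON object -> Python dict
    hits = record.get("tools", {}).get(tool, [])
    return [(hit["name"], hit["start"], hit["stop"]) for hit in hits]


if __name__ == "__main__":
    for name, start, stop in domain_spans(SAMPLE_RESPONSE, "pfam"):
        print(name, start, stop)
```

Once parsed, the record is an ordinary dictionary, so filtering or reshaping it needs no special client library.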
CROSS-REFERENCES
Query using a wide variety of external database identifiers - GenBank (GI), UniProt, PDB, or MD5 hexadecimal digests.
LOCAL DATABASE
Want to run SeqDepot locally? Download the database and crank up your own local copy powered by MongoDB.
VISUALIZATION
Easily generate PNG or SVG images visualizing a sequence's domain architecture.
TOOLS
Download sdQuery to perform many common tasks (e.g. retrieving precomputed data for FASTA sequences, downloading images, etc.). Perl / Python modules simplify server interaction with several easy-to-use methods.