Aldo-Keto Reductase (AKR) Superfamily Database

This site contains existing and potential protein sequences of the AKR protein superfamily, as well as tools to visualize aligned sequences and their conservation across species. In addition, scientists are encouraged to submit newly identified AKRs.

AKR

AKRs share similar three-dimensional structures involving a parallel β-8/α-8-barrel fold, and they function as enzymes that catalyze the reduced nicotinamide adenine dinucleotide (phosphate) (NAD(P)H)-dependent oxido-reduction of carbonyl groups. Over 190 members have been identified in species ranging from prokaryotes to plants, fungi, and animals. These proteins, which are grouped into 16 families named AKR1-AKR16, have unique structural features that influence their substrates and kinetics.

Contributors

Coding and design by Jaehyun Joo, Blanca Himes, Anisha Tehim, and Trevor Penning. Full code for this Shiny app is available on GitHub.

References

Mindnich RD, Penning TM. Aldo-keto reductase (AKR) superfamily: genomics and annotation. Hum Genomics. 2009 Jul;3(4):362-70. doi: 10.1186/1479-7364-3-4-362. PMID: 19706366; PMCID: PMC3206293.

Penning TM. The aldo-keto reductases (AKRs): Overview. Chem Biol Interact. 2015 Jun 5;234:236-46. doi: 10.1016/j.cbi.2014.09.024. Epub 2014 Oct 7. PMID: 25304492; PMCID: PMC4388799.

Funding

This work was supported by the University of Pennsylvania Center of Excellence in Environmental Toxicology (P30 ES013508).

Nomenclature

The general format for AKR names is as follows: the root symbol 'AKR' for Aldo-Keto Reductase; an Arabic numeral designating the family; a letter indicating the subfamily when multiple subfamilies exist; and an Arabic numeral representing the unique protein sequence. Under this system, the protein AKR1A1 would be the first AKR in family 1, subfamily A, and in this instance corresponds to human aldehyde reductase.
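The name format described above can be illustrated with a small parser. This is a minimal sketch, not code from the database: the names `AKRName` and `parse_akr_name` are hypothetical, and the pattern assumes that a member numeral only ever follows a subfamily letter, so a digits-only suffix such as `AKR11` is read as a family.

```python
import re
from typing import NamedTuple, Optional


class AKRName(NamedTuple):
    """Parts of an AKR designation (hypothetical helper type)."""
    family: int
    subfamily: Optional[str]
    member: Optional[int]


# Root symbol, family numeral, then an optional subfamily letter that may be
# followed by the numeral of the unique protein sequence.
# Assumption: a member numeral never appears without a subfamily letter.
_AKR_PATTERN = re.compile(r"^AKR(\d+)(?:([A-Z])(\d+)?)?$")


def parse_akr_name(name: str) -> AKRName:
    """Split a designation such as 'AKR1A1' into family, subfamily, member."""
    match = _AKR_PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"not a recognised AKR designation: {name!r}")
    family, subfamily, member = match.groups()
    return AKRName(int(family), subfamily, int(member) if member else None)


print(parse_akr_name("AKR1A1"))
```

Here `AKR1A1` yields family 1, subfamily A, member 1, matching the human aldehyde reductase example.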

Definition of Families. Delineation of families occurs at the 40% amino acid identity level. Members of an AKR family should have < 40% amino acid identity with any other family. At present, the sixteen families defined by our cluster analysis satisfy this criterion.

Definition of Subfamilies. Within a given family, subfamilies may be defined by a > 60% identity in amino acid sequence among subfamily members. By this definition, nine of sixteen AKR families include multiple subfamilies. For example, family AKR1 includes the subfamilies AKR1C (AKR1C1-AKR1C4) and AKR1D (AKR1D1), whose members play critical roles in the metabolism of all steroid hormones, conjugated steroids, synthetic therapeutic steroids, and the synthesis of neurosteroids and bile acids. Numbering of the known members of each subfamily was assigned in an arbitrary fashion. For example, AKR1A1, AKR1A2, and AKR1A3 are the aldehyde reductases from human, pig, and rat, respectively. Any new additions to a subfamily are numbered chronologically.
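The 40% and 60% identity levels can be sketched as a pairwise check. This is only an illustration under stated assumptions: the database assigns families by cluster analysis rather than by a single pairwise comparison, the function names are hypothetical, identity is computed here over columns where both aligned sequences have a residue, and the handling of values exactly at 40% or 60% is a choice made for this sketch.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumption: only columns with a residue in both sequences are counted.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def relationship(identity: float) -> str:
    """Map a percent identity onto the family/subfamily levels."""
    if identity < 40.0:
        return "different family"
    if identity > 60.0:
        return "same subfamily"
    return "same family, different subfamily"


# Toy aligned fragments, not real AKR sequences.
print(relationship(percent_identity("ACDE-G", "ACDFHG")))
```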

Allelic Variants versus Isoforms. Allelic variation may occur between superfamily members. We propose that proteins with > 97% amino acid sequence identity are alleles of the same gene unless: they have different enzyme activities; they are encoded by different cDNAs, usually evident by a distinct 3'-untranslated region (UTR); and they are derived from genes of different structure. While AKR1C1 [human dihydrodiol dehydrogenase 1 (DD1)] and AKR1C2 [human dihydrodiol dehydrogenase 2 (DD2)] are 98% identical in amino acid sequence and have 3'-UTRs which are 97% identical, the substrate specificity and function of these proteins are quite different. AKR1C1 is predominantly a 20α-HSD, while AKR1C2 is the major bile acid binding protein in human liver. Based on these functional differences, we have assigned AKR1C1 and AKR1C2 as unique members of the AKR superfamily.
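The allele rule can be written as a simple decision function. This is a sketch with hypothetical names (`PairEvidence`, `likely_alleles`). It assumes that any one of the listed exceptions is enough to treat two proteins as distinct members, which is how the AKR1C1/AKR1C2 case above was decided (on functional differences); the text does not state this explicitly.

```python
from dataclasses import dataclass


@dataclass
class PairEvidence:
    """Evidence about a pair of AKR proteins (hypothetical record)."""
    percent_identity: float
    different_enzyme_activity: bool = False
    distinct_cdna: bool = False  # e.g. a distinct 3'-UTR
    different_gene_structure: bool = False


def likely_alleles(evidence: PairEvidence) -> bool:
    """Return True if the pair would be treated as alleles of one gene.

    Assumption: a single exception is sufficient to rule out allelism.
    """
    if evidence.percent_identity <= 97.0:
        return False
    return not (
        evidence.different_enzyme_activity
        or evidence.distinct_cdna
        or evidence.different_gene_structure
    )


# AKR1C1 vs AKR1C2: 98% identical, but with different enzyme activities.
print(likely_alleles(PairEvidence(98.0, different_enzyme_activity=True)))
```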

Multimeric Proteins. Multimers are proteins that consist of multiple monomers. Although the majority of AKR proteins are monomeric, approximately 320 amino acids in length, the AKR2 (which includes the xylose reductases), AKR6 (which includes the β-subunits of the voltage-gated potassium channel), and AKR7 (which contains the aflatoxin dialdehyde reductases) families have been shown to form multimers. To expand the nomenclature to accommodate multimers, we recommend that the composition and stoichiometry be listed. For example, AKR7A1:AKR7A4 (1:3) would designate a tetramer of the composition indicated.
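The multimer designation can be assembled mechanically from the subunit names and their counts. This is a minimal sketch with a hypothetical function name. It assumes the counts are written as given, without reduction, and it only covers the heteromultimer format shown in the example, since the text does not specify one for homomultimers.

```python
from typing import Sequence, Tuple


def multimer_designation(subunits: Sequence[Tuple[str, int]]) -> str:
    """Build a designation such as 'AKR7A1:AKR7A4 (1:3)'.

    `subunits` is an ordered sequence of (AKR name, copies in the multimer).
    """
    if not subunits:
        raise ValueError("at least one subunit is required")
    for name, count in subunits:
        if count < 1:
            raise ValueError(f"subunit count for {name} must be positive")
    names = ":".join(name for name, _ in subunits)
    ratio = ":".join(str(count) for _, count in subunits)
    return f"{names} ({ratio})"


# One AKR7A1 subunit and three AKR7A4 subunits form a tetramer.
print(multimer_designation([("AKR7A1", 1), ("AKR7A4", 3)]))
```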

AKR Genes. The designation for an AKR superfamily gene should be noted in italics to distinguish between the gene and the protein. For example, the gene AKR1A1 encodes the protein AKR1A1.

The above nomenclature system was adopted at the 8th International Workshop on the Enzymology and Molecular Biology of Carbonyl Metabolism. It is similar to that for the cytochrome P450 superfamily, but, unlike that system, amino acid sequences are used for comparisons. For historical reasons, the AKR1A subfamily represents the aldehyde reductases and the AKR1B subfamily represents the aldose reductases. We recommend that authors referencing members of the AKR superfamily use any previous names along with the new designation in parentheses, for example, human aldehyde reductase (AKR1A1).

Protein Structures

AKRs are characterized by an (αβ)8-barrel structure:

The (αβ)8-barrel motif of AKRs


Loop Structure. Using the structure of AKR1C9W, CHO Reductase with NADP+, three large loops can be assigned.

Loop Structure


Cofactor Binding Site. Cofactor binding site for 3α-HSD (AKR1C9); taken from PDB 1LWI. Distances are in angstroms.

Loop structures of AKRs


Typical Catalytic Tetrad. Blue sphere indicates the position of a water molecule and the probable position of the substrate carbonyl. Taken from 3α-HSD (AKR1C9). See PDB 1LWI.

Loop structures of AKRs

AKR Family Descriptions

AKR Family 1. AKR1 enzymes control the concentrations of active ligands for nuclear receptors and control their ligand occupancy and transactivation. Furthermore, AKR1 enzymes regulate the amount of neurosteroids that can regulate the activity of GABA-A and NMDA receptors. Therefore, AKR1 enzymes are typically involved in the pre-receptor regulation of nuclear as well as membrane-bound receptors. In addition, altered expression of individual AKR1C genes is related to the development of prostate, breast, and endometrial cancer. Mutations in AKR1C1 and AKR1C4 are responsible for sexual development dysgenesis, and mutations in AKR1D1 are causative in bile acid deficiency.

AKR Family 2. Members of the AKR2 family are microbial enzymes consisting of xylose reductases. AKR2B1-AKR2B6 are found in yeasts, while AKR2C and AKR2D are found in fungi. Xylose reductases catalyze the first step of xylose metabolism: xylose reductase, an NADPH- or NADH-dependent enzyme, reduces xylose to xylitol. In the following step, the NAD+-dependent enzyme xylitol dehydrogenase (XDH) oxidizes xylitol to xylulose. Under microaerophilic conditions, yeasts like Kluyveromyces, Pichia, and Pachysolen are capable of fermenting xylose to ethanol.

AKR Family 3. Similar to the AKR2 family, members of the AKR3 family are microbial enzymes and are found in yeasts. For example, AKR3A1, known as Gcy1p, catalyzes the reduction of several aldehyde substrates including D,L-glyceraldehyde. In addition, YPR1 (AKR3A2) has been shown to contribute 50% of in vivo 2-methylbutyraldehyde reductase activity. Furthermore, deletion of the GRE3 gene encoding S. cerevisiae aldose reductase (AKR2B6) leads to a decrease in xylitol formation from xylose by 50%.

AKR Family 4. Members of AKR Family 4 are plant AKRs. These AKRs have many functions, including biotic and abiotic stress defense, production of commercially important secondary metabolites, iron acquisition from soil, and plant-microbe interactions. Plant AKRs have not been well studied; however, the functions of a few AKR4 members are known. For instance, the AKR4C subfamily is known to be involved in aldehyde detoxification and stress defense, osmolyte production, secondary metabolism, and membrane transport. For example, AKR4C8 and AKR4C9 from Arabidopsis thaliana (Arabidopsis) can reduce a range of toxic compounds containing reactive aldehyde groups. In contrast, AKR4C7 from maize (Zea mays) catalyzes the oxidation of sorbitol to glucose.

AKR Family 5. Similar to Family 2 and Family 3, AKR Family 5 consists of microbial AKRs. AKR5C, AKR5D, and AKR5B are found in bacteria; AKR5E and AKR5F are found in yeasts; finally, AKR5E is found in a fungus. While the functions of most AKR5 family members are unknown, AKR5D is responsible for the NADPH-dependent stereospecific reduction of 2,5-diketo-D-gluconate to 2-keto-L-gulonate, a precursor in the industrial production of vitamin C.

AKR Family 6. The AKR6 family is found in humans and contains the beta-subunits of the voltage-gated potassium channel. These AKR6 family members are structurally characterized by an extra helix attached to a long loop between β9 and α7, and the proteins form tetramers. Furthermore, in the AKR6 family, the N-terminal β1-β2 hairpin (Y39-G46) forms a part of the tetramer intersubunit interface: together with the closely located R109-S111 segment (Kvβ2-AKR6A5 numbering) from the α2-β5 loop at the bottom of the barrel, it interacts with the loop β5-α3 (consisting of amino acid residues K124-R129) at the top (C-terminal end) of another barrel. Interestingly, the regions involved in the intersubunit interaction are >92% conserved within the AKR6 family, but do not share homology with AKRs from other families.

AKR Family 7. The AKR7 family contains the aflatoxin dialdehyde reductases. Members of the AKR7 family can be found in several species. For example, AKR7A2 and AKR7A3 are found in humans, AKR7A1 and AKR7A4 are found in rats, and AKR7A5 is found in mice. AKR7 family members typically reduce aldehydes to alcohols; however, some have more specific functions. For example, AKR7A2 is involved in the metabolism of daunorubicin to the cardiotoxic daunorubicinol.

AKR Family 8. AKR Family 8 is also classified as a microbial AKR family. Both AKR8A1 and AKR8A2 are known as pyridoxal reductases and can be found in yeast. Pyridoxal reductase catalyzes the NADPH-mediated reduction of pyridoxal to pyridoxine.

AKR Family 9. AKR Family 9 is classified as a microbial AKR family. AKR9C is found in archaebacteria, AKR9B1-B4 are found in yeasts, and AKR9A1-A3 are found in fungi. These enzymes vary widely in function. For example, AKR9A1, known as sterigmatocystin dehydrogenase, is involved in the biosynthesis of a fungal secondary metabolite. Another example is AKR9A2, which is involved in aflatoxin biosynthesis. In contrast, AKR9C, known as an oxidoreductase, catalyzes the transfer of electrons from one molecule (the reductant, the hydrogen or electron donor) to another molecule (the oxidant, the hydrogen or electron acceptor).

AKR Family 11. Members of the AKR11 family, AKR11A and AKR11B, are both classified as bacterial AKRs. Their respective enzyme names are IolS and GSP69, and both reduce the substrates DL-glyceraldehyde, D-erythrose, and methylglyoxal in the presence of NADPH.

AKR Family 12. Members of the AKR12 family are known as Streptomyces sugar aldehyde reductases. AKR12A from Streptomyces fradiae and AKR12C from Streptomyces avermitilis are also thought to play roles as reductases in L-mycarose and L-oleandrose biosynthesis, respectively. However, a role for these latter two enzymes has not been proven through gene mutation or deletion.

AKR Family 13. Members of AKR13 are classified as hyperthermophilic bacteria reductases. These enzymes are involved in protein thermostabilization, including ion pairs, hydrogen bonds, hydrophobic interactions, disulfide bridges, packing, decrease of the entropy of unfolding, and intersubunit interactions.

This table contains all known members of the AKR Superfamily that are contained in our database.


1 Where no reference is given, please refer to the accession number in the appropriate database.
2 Trichosporonoides megachiliensis, also known as Moniliella megachiliensis

This table lists potential AKR superfamily members. These are currently excluded from the nomenclature because either no functional data exists for the protein or the sequence is a partial cDNA or derived from a genomics project.


This table contains AKR sequences grouped according to Protein Data Bank (PDB) structures. Note that more than one AKR can map to the same PDB entry.


This dendrogram replaces the older version constructed in the GCG program and was constructed using the multialign program, which enables any user to conduct their own pairwise comparison. As a result of this enhancement, some families have shifted. However, the nomenclature of the individual AKR families, subfamilies, and their members is essentially unchanged.

For the AKR tree diagrams organized by family, note that while there are 16 AKR families, the lack of available sequences in Families 14-16 (as well as AKR8 and AKR10) makes it infeasible to construct a phylogenetic tree for these families. The method used to create the phylogenetic trees, organized by AKR family, can be found here.

This page enables visualization of aligned protein sequences for various groups of AKRs that are stored in our database. To use it, select a group of interest from the interactive menu to adjust the set of sequences output. The alignment is created with MSAViewer, which provides an interactive JavaScript-based visualization of multiple sequence alignment. Options for the aligner can be set with the blue buttons, including the color scheme. Further details can be found in the MSAViewer user manual and this GitHub repository on available color schemes.


Select a set of AKR proteins to visualize:


  1. Since the proposed nomenclature system is protein-based, a newly identified AKR requires that its amino acid sequence has been obtained by either cDNA cloning or by direct methods. The protein encoded by a cDNA should have been either overexpressed or purified from its natural source. Investigators should provide GenBank, Swiss-Prot, or PIR accession numbers.

  2. Upon submission of a complete protein sequence, it will be matched against the AKRs in the database and placed within the cluster analysis. When submitting sequences, investigators should provide the following information:

    • Trivial name if one has been assigned
    • Species of origin
    • Expression system used
    • Substrate used to assign enzyme activity
    • Accession number
    • Status of publication
    • Citation if exists
    • Complete contact information for the submitter
  3. The location of the sequence within the superfamily cluster analysis will determine its assigned designation. As needed, new families and subfamilies will be added to the existing system.

  4. The sequence, the assigned designation, and position within the cluster analysis will be returned to the submitter, but the database will not be updated until the submission has been published. We encourage the submitter to use the new assignment in their publication. It is the investigator's responsibility to notify the website that the information submitted has been published and to provide the appropriate citation.
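The information requested in step 2 can be collected in a simple record and checked for completeness before submission. This is a minimal sketch, not the site's own form logic: the class and field names are hypothetical, the trivial name and citation are treated as optional because the list qualifies them with "if", and the values in the example are placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AKRSubmission:
    """Information requested from submitters (hypothetical record)."""
    sequence: str
    species: str
    expression_system: str
    substrate: str
    accession_number: str
    publication_status: str
    contact: str
    trivial_name: Optional[str] = None  # only if one has been assigned
    citation: Optional[str] = None      # only if one exists


# Fields assumed to be mandatory for this sketch.
REQUIRED_FIELDS = (
    "sequence", "species", "expression_system", "substrate",
    "accession_number", "publication_status", "contact",
)


def missing_fields(submission: AKRSubmission) -> List[str]:
    """Return the names of required fields that are empty or blank."""
    return [
        name for name in REQUIRED_FIELDS
        if not getattr(submission, name).strip()
    ]


# Placeholder values only; the substrate has been left blank on purpose.
example = AKRSubmission(
    sequence="PLACEHOLDERSEQ",
    species="Homo sapiens",
    expression_system="E. coli",
    substrate="",
    accession_number="PLACEHOLDER0001",
    publication_status="unpublished",
    contact="submitter@example.org",
)
print(missing_fields(example))
```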

