Experimental procedures

Specific objective #1.  A comprehensive set of tools required for functional analysis of the gene set will be created.  These include (a) mutants defective for each gene in the network, (b) E. coli strains overexpressing each protein in the study set, and (c) isoform-specific antibodies for each protein.

Loss-of-function mutations of each gene in the study set:  The enabling aspect of the functional genomics approach on which this proposal is based is a collection of mutant Arabidopsis stocks each lacking the function of one gene in the study set listed in Appendix A-2.  To date two mutations in the gene set have been described in the literature, specifically in DBE1 (AtISO-1) (36), coding for an isoamylase, and DPE1 (AtDE-1) (41), coding for a D enzyme.  To expand this collection and obtain a complete set of mutants, we will make use simultaneously of two NSF-funded projects that provide resources for identification of stocks lacking a particular gene function.  The Salk Institute Genome Analysis Laboratory (SIGnAL) (http://signal.salk.edu/tabout.html) has recently begun publication of the insertion sites of T-DNA elements within the Arabidopsis genome.  BLAST searches with all 28 members of the gene set have so far identified only one gene that contains a T-DNA insertion in the SIGnAL population, specifically At-BAM-1, coding for a β-amylase.  Mutant seeds have been ordered, and propagation of the mutant plants in our laboratory will begin shortly.  Genotyping by PCR will identify homozygous mutant and heterozygous seed, and standard outcrossing procedures will be used to isolate the mutation of interest from any deleterious alleles that might exist in the genetic background.  Based on results from other species, and the fact that multiple isozymes exist for each enzyme in the network, we expect lethal mutations in this study set to be rare.  The SIGnAL data set will be searched frequently as it is updated, and each mutant that appears in that population will be collected.

We will also utilize the services of the Arabidopsis Knockout Facility at the University of Wisconsin-Madison (77) (http://www.biotech.wisc.edu/Arabidopsis/).  In accordance with procedures described in detail by the facility, gene-specific forward and reverse PCR primers will be synthesized and verified to amplify selected gene sequences.  These primers will be used by the facility to amplify pools of genomic DNA from a large collection of T-DNA-transformed Arabidopsis lines, using primer pairs containing one gene-specific primer and one T-DNA-specific primer.  Each PCR product pool will be screened in our laboratory by DNA gel blot hybridization using the full-length cDNA as a probe.  PCR amplified fragments that hybridize to the cDNA will be sequenced, to directly verify that a T-DNA insertion is located within the target genetic element.  DNA subpools from the primary pools that contain such a verified fragment will then be amplified at the Wisconsin facility in a second round screen, and the verification procedure will be repeated in our laboratory.  Seeds from the smallest pool will then be obtained from the facility, and continued PCR screening eventually will lead to identification of a single plant that carries a T-DNA insertion in the gene of interest.  Co-PI Wurtele has successfully used this strategy to obtain insertion mutations in three genes, as specified in a previous section of this proposal.

The screening required to work through this set of genes will be an arduous task, although we expect it should be possible to present a nearly complete mutant collection by the end of the proposed four-year project period.  Randomness in the T-DNA insertion sites, and in the detection of insertions, prevents accurate advance determination of how quickly this work will proceed.  The time frame of the project allows 20 genes to be screened at the Wisconsin facility, within the limit of five per year.  We are also aware of a European project addressing the same aim using a different population of T-DNA insertional mutants.  Working collaboratively with the international community, using several different populations, and taking full advantage of the NSF-funded resources that have already been established, we are confident that the complete mutant collection is a practical goal within the proposed project period.

Protein expression and antibody production:  Each enzyme encoded by the selected set of genes will be expressed in E. coli to produce purified recombinant protein for the purposes of generating antibodies and establishing affinity chromatography matrices.  These studies will utilize the glutathione S-transferase (GST) gene fusion system (Pharmacia) or the pET fusion vectors (Novagen) that introduce a 15 amino acid "S-tag" sequence at the amino terminus of the protein.  We recognize that insolubility and other expression problems will arise in some instances.  However, both the GST fusion and S-tag approaches have been successfully applied in our laboratory to generate polyclonal antibodies to the maize DU1, SU1, ZPU1, and BEIIa polypeptides.  Three of those recombinant proteins have been purified in active form to near homogeneity (29, 31).  This previous experience indicates that successful production of 6-10 proteins per year is a feasible proposition.

Specific antibodies are required for a comprehensive functional genomic analysis, as monitors of gene expression at the protein level, and as probes of the biochemical activities responsible for starch granule assembly and disassembly.  We propose to raise a set of monoclonal antibodies that are specific for the polypeptide products of the 28 identified genes.  Monoclonal antibody production will be contracted to the ISU Cell and Hybridoma facility, which will work in collaboration with the project personnel for screening the antibody produced by individual hybridomas and selection of specific cell lines for bulk preparation.  The antigens will be recombinant fusion proteins expressed in E. coli from pGEX or pET fusion vectors, purified in sufficient quantity for immunization of mice.  An effective means of reducing the costs of this resource is to immunize each mouse with a mixture of three antigens, and subsequently screen the hybridoma clones for production of individual monoclonal antibodies.  Each monoclonal antibody will be tested for reaction with the original recombinant antigen, and also with soluble extracts from various Arabidopsis tissues.  The mutant plants will be a useful resource, such that the absence of an immunoblot signal would indicate specificity of any protein recognized by one of the antibodies in wild type plants.  Standard polyclonal antibodies will also be raised in rabbits in those instances when large amounts of antigen are available.  Like the other resources generated in this project, the antibodies would be publicized on the web site as soon as their efficacy and specificity were definitively verified, and made available upon request to any publicly funded research project.

A note on the comprehensive nature of this specific objective:  We are well aware of the amount of work proposed in this specific objective, having completed analogous or identical studies on four genes involved in maize endosperm starch production over the past five years.  From that experience, we know that meaningful investigation of the starch biosynthesis process requires the ability to examine the pathway as a comprehensive unit.  The Arabidopsis 2010 project offers the opportunity to approach this broad problem with all the tools that will be required, as opposed to the piecemeal approach that has been the only choice prior to the availability of a complete plant genome sequence.  Thus, in this proposal we suggest going all the way, so that specific genetic and biochemical probes will be available for nearly all of the factors involved in starch granule assembly and disassembly.  With steady, programmatic work over the four-year project period, this specific objective is feasible.

Specific objective #2.  Starch synthesis and degradation in each mutant will be characterized regarding levels and rates of accumulation, and the molecular architecture of amylopectin.

Total glucans isolated from immature and mature leaves, roots, and siliques of the 28 knockout mutants identified in objective #1 will be subjected to detailed analysis. The relative proportions of sugars, water-soluble polysaccharides (WSP), and granular polysaccharides will be determined by methods used routinely in our laboratory for analysis of plant glucans (63).  Briefly, WSP is defined as the polysaccharide present in the 12,000g supernatant from an aqueous extract, and the granular starch fraction is defined as the polysaccharides present in the pellet collected by centrifugation at 600g.  Samples of each fraction will be analyzed for total glucose equivalents using a standard commercial assay kit (Boehringer-Mannheim) that employs amyloglucosidase to completely hydrolyze polysaccharide to glucose, followed by hexokinase and glucose-6-phosphate dehydrogenase reactions to quantitatively determine the amount of glucose present.  Control reactions in which amyloglucosidase is omitted will reveal the free glucose content in each fraction.  The amount of glucose equivalents from polysaccharides present in each fraction will be calculated relative to the total glucan in the aqueous extract.  Additionally, specific assays will be performed on extracts from each tissue to determine the concentrations of free sucrose, fructose, and glucose (all assays available in kit form, Boehringer Mannheim).
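The fraction calculation described above can be sketched in a few lines of Python.  This is an illustrative sketch only and not part of the assay kit protocol: the function names, the dictionary layout, and the numerical readings are hypothetical, and the sketch assumes that total glucan is the sum of the polysaccharide glucose found in the WSP and granular fractions, with all readings expressed in the same units.

```python
def polysaccharide_glucose(with_amg, without_amg):
    """Glucose equivalents released from polysaccharide in one fraction.

    with_amg:    glucose measured after amyloglucosidase digestion
                 (free glucose plus glucose released from polymer)
    without_amg: glucose measured in the control reaction lacking
                 amyloglucosidase (free glucose only)
    """
    if without_amg > with_amg:
        raise ValueError("control reading exceeds digested reading")
    return with_amg - without_amg


def fraction_shares(readings):
    """Express each fraction's polysaccharide glucose relative to total glucan.

    readings maps a fraction name to a (with_amg, without_amg) pair.
    Assumption: total glucan is the sum over the fractions supplied.
    """
    glucan = {name: polysaccharide_glucose(*pair)
              for name, pair in readings.items()}
    total = sum(glucan.values())
    if total == 0:
        raise ValueError("no polysaccharide detected in any fraction")
    return {name: value / total for name, value in glucan.items()}


# Hypothetical readings (mg glucose per g fresh weight), for illustration only.
example = {"WSP": (1.5, 0.5), "granular": (9.5, 0.5)}
shares = fraction_shares(example)
```

Subtracting the control before computing proportions keeps free glucose from inflating the apparent polysaccharide content of either fraction.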

The composition of granular starch will be determined according to standard procedures (78), in which the granular starch fraction is disrupted by boiling in 90% DMSO, and the glucans present collected by precipitation with alcohol.  This material will be fractionated by gel permeation chromatography (GPC) on Sepharose CL-2B and complexed with iodine (I2/KI).  Absorbance spectra from 400 to 700 nm will assess the maximal absorbance wavelength (λmax).  From these data, the amylopectin and amylose peaks will be ascertained, as well as peaks representing undefined intermediate material.  The peak fractions will be pooled, and glucose assays will be used to determine the amylose/amylopectin ratio.

Amylopectin structure will be determined as follows.  Amylopectin in pooled GPC fractions will be dialyzed extensively in water and lyophilized, then digested to completion with Pseudomonas isoamylase, a widely applied enzyme that specifically hydrolyzes α-(1→6) branch linkages.  Reducing end values will be determined and quantified by comparisons to maltose standards, thus revealing the concentration of branch linkages in the sample (79).  The distributions of chain lengths in the amylopectin will be determined by fluorescence-assisted capillary electrophoresis (FACE) of linear chains that have been labeled at the reducing end, as described previously (80).  This instrumentation is available through the ISU Metabolomics Research Facility, and has been in routine use in our laboratory for several months.  Figure 4 presents the results we have obtained using this analysis to characterize amylopectin from wild type Arabidopsis leaves.

Specific objective #3.  The effects of each mutation on the complement of specific isoforms of each enzyme in the network will be determined using high-resolution, two-dimensional zymograms.

In maize, pleiotropic changes in starch-metabolizing enzymes occur in response to a mutation in a single starch biosynthetic gene (63).  For example, specific alleles altering two different DBEs each cause the loss of activity of one BE isoform but do not alter expression of the BE protein.  These responses suggest that functional interactions occur between specific starch metabolic enzymes, possibly resulting from physical associations.  We propose to further investigate these interactions by employing a two-dimensional activity gel method to separate starch-metabolizing enzymes in total protein extracts from specific tissues.  Using the mutant collection, this technique will be applied to determine whether loss of one protein in the gene set affects the enzymatic activity of others in the set.  For the first separation, total proteins will be extracted from approximately 10-20 g of tissue (e.g., light-harvested mature leaves) by grinding the tissue to a fine powder in liquid nitrogen and suspending it in extraction buffer.  Proteins in a high-speed supernatant will be separated by anion exchange chromatography on a MonoQ column using AKTA FPLC instrumentation.  This method is routinely applied in our laboratory for similar separations of proteins from maize kernel extracts.  The second-dimension separation of proteins in each MonoQ fraction is achieved by non-denaturing (native) PAGE.  Following electrophoresis, proteins are transferred by electroblotting to a polyacrylamide gel of the same size containing 0.3% (w/v) starch, Ap, or other glucan substrate.  Starch metabolic activities are observed after staining the gel with I2/KI solution.  Against a purple background, regions of the gel in which the substrate has become more highly branched are seen as red-staining bands, regions in which the starch is less branched are seen as blue-staining bands, and regions in which starch has been hydrolyzed are seen as white (unstained) bands.

This zymogram method resolves the activities of approximately 30 different enzymes that alter the structure of starch.  Figure 5 shows an optimized, high-resolution result from maize endosperm tissue, as well as an initial analysis of Arabidopsis leaves.  Proteins on duplicate native PAGE gels will be transferred by electroblotting to nitrocellulose, for hybridization with isoform-specific antibodies generated in objective #1.  Thus, alterations in the activities of starch metabolic enzymes on activity gels will be correlated with specific polypeptides.  From these analyses we expect to comprehensively describe the pleiotropic effects on other enzymes caused by eliminating each protein in the gene network.

Specific objective #4.  Selected starch assembly- or disassembly factors will be tested for physical interaction with other components of the network.  Additional proteins that interact with components of the network will be identified.

Owing to the indications of functional interactions between starch-metabolizing enzymes, we will seek to identify multisubunit complexes involving members of the gene set using an affinity chromatography approach.  The proposed methods will follow those that have been successful in other laboratories in identifying, for example, interactions between components of the cyclin-dependent kinase complex (81), cytoskeletal elements (82), or various transcription factor complexes (83).  The advantage of this approach is in maintaining a high local concentration of one of the potential binding partners, so that even weakly interacting proteins can be retained on the column.

Expressed fusion proteins from objective #1 will be purified and bound to the Affigel 10 matrix (BioRad), which will be packed into a small affinity column in a 1 ml syringe barrel.  Protein extracts from various Arabidopsis tissues will be prefiltered to remove proteins that bind non-specifically to the column matrix by passage through an Affigel 10 matrix bound with BSA.  The lysates will then be passed through the affinity column in a low-salt buffer and eluted in steps with increasing concentrations of KCl.  Proteins that elute at higher salt concentrations will be separated by SDS-PAGE and subjected to immunoblot analysis using isoform-specific antibodies from objective #1.  To investigate unknown factors that bind to the columns, proteins will be eluted from the polyacrylamide gel, digested with proteases, and masses of the proteolytic fragments determined by mass spectrometry (84-87).  Computational algorithms are available to compare these data to the masses predicted from the nucleotide sequences of known proteins.  In addition to crude extracts, purified recombinant proteins will be passed over the columns and tested for the ability to bind to the affinity matrix in moderate salt concentrations.
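The mass-matching step mentioned above can be illustrated with a minimal Python sketch of peptide mass fingerprinting.  It is not the algorithm of any particular software package: the function names, the 0.2 Da tolerance, and the toy sequences are hypothetical, trypsin is assumed as the protease (cleavage after K or R, except before P), and observed values are assumed to have already been converted to neutral monoisotopic masses.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_peptides(sequence):
    """Cleave after K or R unless the following residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def count_matches(observed_masses, sequence, tolerance=0.2):
    """Number of observed masses explained by a predicted tryptic peptide."""
    predicted = [peptide_mass(p) for p in tryptic_peptides(sequence)]
    return sum(
        any(abs(obs - mass) <= tolerance for mass in predicted)
        for obs in observed_masses
    )


# Toy candidate sequences and observed masses, for illustration only.
candidates = {"candidate_A": "GKAAR", "candidate_B": "WWWK"}
observed = [203.127, 316.186]
scores = {name: count_matches(observed, seq) for name, seq in candidates.items()}
best = max(scores, key=scores.get)
```

In practice the candidate sequences would be the proteins predicted from the genome sequence, and a real search would also account for missed cleavages and modifications, which this sketch omits.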

In a variation of this strategy, the affinity matrix will bear the monoclonal antibody specific for one component of the pathway.  The antibodies will be bound to covalently activated Sepharose beads.  Endosperm extracts will then be applied to that affinity matrix and eluted as described above.  This strategy is feasible using monoclonal antibodies, because only a single epitope on the target protein will be bound by one antibody molecule, and the relative weakness of this interaction allows elution from the column in mild conditions.  In a slightly different approach the antibody beads will be incubated with extracts in batch solution and the entire bound complex will be denatured in SDS and applied to PAGE gels for analysis.  In either instance, the immunoaffinity-purified proteins will be analyzed with the battery of monoclonal antibodies against the known components of the system to reveal any direct interactions.

This is a labor-intensive approach that most likely will not be feasible for all members of the study set.  The approach will be focused primarily by the results of objective #3, in which proteins that have pleiotropic effects on other enzyme activities will be identified.  Such proteins are likely candidates to interact with other starch assembly- or disassembly factors.  Another factor that will focus this approach, which is both trivial and practical, is the ability of any protein to be expressed and purified from E. coli in active form.  In other words, with such a large target set, our initial choices for examination of protein-protein interactions will be those members that are easy to work with.  We anticipate that affinity columns containing about one-half of the proteins in the target set as the binding ligand can be constructed and exploited over the course of the project period.

Specific objective #5.  The temporal, tissue, and cellular-specific patterns of gene expression will be determined for each member of the network.

Immunoblot, RNA gel blot, and RT-PCR methods will be used to characterize the expression specificities of each member of the gene set.  For transcript analysis, total RNA will be isolated from wild type and mutant tissues (immature seeds, siliques, leaves, and roots) by means of a Tri-Reagent protocol (Sigma), and analyzed on gel blots using each cDNA as a hybridization probe.  This analysis will indicate tissue specificity, and also discriminate between null mutations and potentially less severe mutations, which may contain residual or altered transcripts.  Also, forward and reverse primers from the cDNA of each gene in the set will be used for RT-PCR amplification of the RNA from each tissue from both mutant and wild type plants.  Quantitative RT-PCR equipment is available to the project, so that PCR cycle data can be used to accurately determine the relative levels of mRNA in various tissues.  These analyses also will be applied to RNAs from immature vs. mature leaves, and leaves harvested at various times in a diurnal cycle.  Additionally, crude extracts of proteins isolated from each of the tissues examined for mRNA expression will be subjected to immunoblot analysis, using appropriate isoform-specific antibodies.  We recognize that this is a relatively general description of gene expression and that more detailed analysis involving cell-specific expression studies will be required in the long run.  Studies of this type are not included in the current proposal to keep the project feasible within the requested time frame and resources.
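One common way to convert PCR cycle data into relative mRNA levels is the comparative threshold-cycle (delta-delta Ct) calculation, sketched below in Python.  The proposal does not specify a quantification method, so this is an assumption for illustration: the function name, the use of a reference transcript for normalization, the default amplification efficiency of 2.0 (perfect doubling per cycle), and the Ct values are all hypothetical.

```python
def relative_expression(ct_target_sample, ct_reference_sample,
                        ct_target_control, ct_reference_control,
                        efficiency=2.0):
    """Relative level of a target mRNA in a sample versus a control.

    Each Ct is the threshold cycle for the target or reference transcript.
    A lower Ct means more starting template, hence the negative exponent.
    """
    delta_sample = ct_target_sample - ct_reference_sample
    delta_control = ct_target_control - ct_reference_control
    return efficiency ** -(delta_sample - delta_control)


# Hypothetical Ct values: target transcript in mutant leaf vs. wild type leaf.
fold = relative_expression(
    ct_target_sample=22.0, ct_reference_sample=18.0,
    ct_target_control=24.0, ct_reference_control=18.0,
)
```

Normalizing to a reference transcript corrects for differences in the amount of RNA loaded per reaction, so the result reflects a change in the target alone under the stated assumptions.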

Specific objective #6.  The effects of eliminating each member of the network on the expression of all other Arabidopsis genes will be examined by global mRNA profiling.

Genomics technology provides the ability to simultaneously monitor the expression levels of all the genes in an organism.  As such, microarrays provide a unique approach to assessing the changes in genome-wide expression at the level of RNA accumulation in response to the environment, development, or genotype.  Such assays will provide a novel, comprehensive means of identifying factors involved in starch assembly or degradation that would not be revealed by other strategies.  Genes whose transcription is altered in each of the knockout mutants will be identified in an effort to determine transcriptional networks, and also as a means of finding other candidate genes potentially involved in starch metabolism. Furthermore, microarray experiments may lead to the identification of novel complements of genes that were not hitherto suspected to be intimately involved with the starch granule metabolic network.

The mRNA profiling approach is now becoming widely applied (88-91).  For the proposed experiments, commercially manufactured oligonucleotide gene chips from Affymetrix will be utilized.  These chips comprise non-redundant sequences covering approximately one-third of the Arabidopsis genome.  We anticipate whole-genome chips will be available commercially to the academic community within the next two years.  The microarray chips will be incubated with biotinylated RNA sets ("targets") generated from Arabidopsis lines that differ with respect to a mutation in a starch granule metabolism gene.  After hybridization and washing, the chips will be reacted with a streptavidin-phycoerythrin derivative, which binds to biotin on the target molecules.  Preparation of RNA targets, labeling, and hybridization will be conducted according to Affymetrix protocols (www.affymetrix.com/).  Currently, we are conducting such experiments on a fee basis at the University of Iowa microarray analysis facility, and have processed over 50 chips.  We propose to continue use of this service for the current project.

The Affymetrix Microarray Suite software package will be used to measure differences in gene expression from the fluorescence values at individual registers on the chip, as follows.  Each gene is represented on the chip by numerous oligonucleotides, and for each of those there is also on the chip, at a different register, a corresponding "mismatch" oligonucleotide that differs at a single base.  The difference between the fluorescence values for the pair of perfect match- and mismatch oligonucleotides indicates in general the amount of hybridization above background to that oligonucleotide sequence.  The average difference between the perfect- and mismatch oligonucleotides for all test sequences from a given gene is calculated.  To allow comparison between different chips, these average difference values are normalized so that the median value for all genes is defined as 1.0.  Values for any gene below the background, i.e., a negative average difference value, are considered as 0 values in calculation of the median.  Comparison of these normalized average difference values reveals changes in steady state mRNA levels between conditions, e.g., in a given tissue from mutant or congenic wild type lines.
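The average difference and median normalization steps described above can be sketched in Python as follows.  This is a simplified illustration of the arithmetic as stated in the text, not the actual Microarray Suite implementation: the function names, gene labels, and fluorescence values are hypothetical, and any outlier handling performed by the commercial software is omitted.

```python
from statistics import median


def average_difference(perfect_match, mismatch):
    """Mean of (perfect match - mismatch) over all probe pairs for one gene."""
    if not perfect_match or len(perfect_match) != len(mismatch):
        raise ValueError("probe pair lists must be non-empty and equal in length")
    differences = [pm - mm for pm, mm in zip(perfect_match, mismatch)]
    return sum(differences) / len(differences)


def normalize_chip(average_differences):
    """Scale one chip so that the median gene value is 1.0.

    Negative average differences count as 0 when the median is computed,
    as described in the text; the values themselves are only rescaled.
    """
    floored = [max(value, 0.0) for value in average_differences.values()]
    chip_median = median(floored)
    if chip_median <= 0:
        raise ValueError("median is not positive; chip cannot be normalized")
    return {gene: value / chip_median
            for gene, value in average_differences.items()}


# Hypothetical fluorescence values for three genes on one chip.
chip = {
    "gene_1": average_difference([150.0, 170.0], [50.0, 70.0]),
    "gene_2": average_difference([300.0, 320.0], [100.0, 120.0]),
    "gene_3": average_difference([40.0, 60.0], [90.0, 110.0]),
}
normalized = normalize_chip(chip)
```

Once two chips have been normalized in this way, the ratio of a gene's normalized value in the mutant to its value in the congenic wild type gives the fold change in steady state mRNA level.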

As a preliminary study, the microarray approach was used to identify changes in expression patterns of some genes in the proposed study set as the result of a particular mutation.  Differences in mRNA levels were compared between an ATP citrate lyase (ACL) antisense mutant and the congenic wild type line.  ACL functions to generate cytosolic acetyl-CoA (92), and thus is predicted to function in lipid biosynthesis.  Expression of several starch metabolism genes was affected in response to the ACL reduction (Fig. 6).  Notable differences included an 8-fold increase in expression of AtBE-2, but only a 2-fold increase in expression of the closely related sequence AtBE-3.  Other genes in the set apparently are not significantly affected by the ACL deficiency (Fig. 6).  These preliminary results indicate that ACL enzyme activity somehow influences expression of genes coding for specific starch assembly enzymes.  This experiment illustrates the power of microarray profiling to uncover previously unknown relationships between seemingly divergent pathways.

Leaves 5-8 will be harvested from 25-day-old seedlings grown in an 8 h dark/16 h light cycle.  These leaves are mature and actively conducting photosynthesis.  All material will be harvested at precisely the same time of the diurnal cycle, 4 h into the light phase.  Harvested plant material will be used for RNA extraction.  In addition, aliquots of the plant material will be set aside for subsequent metabolite determinations, protein analyses, enzyme activity assays, and PCR analyses.  These samples can later be used to determine the levels of starch, starch-related metabolites, enzyme activities, and proteins (see objective #5).  Proteins will be detected using antibodies from objective #1.

Leaf mRNA from three individual plants collected under the same experimental condition will be pooled, and that combined sample will be subjected to microarray analysis.  Two independent sample sets will be analyzed for each condition (e.g., harvest time in the diurnal cycle).  Analysis of both pooled sample sets by microarray achieves a true biological replication.  Expression levels of individual genes, as determined by microarray analysis, will subsequently be confirmed by quantitative RT-PCR.

The proposed budget includes resources for analysis of 15-20 mRNA populations.  Profiling of specific knockout mutants will be prioritized based on the results from objectives #1-5, in accordance with phenotypic severity, expression data, effects on starch synthesis or degradation, and indications of protein-protein interactions.  Depending on results from the initial microarray experiments, we will focus subsequent microarray studies on the genetic networks associated with particular knockouts, and in response to particular stimuli.  For example, we will monitor selected mutants at selected time points over a 24-hour time course, during starch assembly and disassembly in the mesophyll plastids.  This approach is expected to identify novel and unpredicted genes that are differentially expressed in the mutants compared to the wild type, and to delineate pathways that respond differently under differing mutant backgrounds.  We also anticipate discovering previously unknown genetic interconnections that enable a plant to maintain homeostasis in the event of disruptions in specific steps in the starch metabolic network.

We are currently using GeneSpring (Silicon Genetics) to normalize the data, generate probe lists, and display gene expression profiles.  In addition to the standard analysis packages, we are shifting to software with multidimensional view capabilities for data analyses; these analyses are being conducted using the visual and statistical package that we are currently developing, GeneExpressionToolkit (http://www.public.iastate.edu/~zcox/fcmodeler/) (93).  As this software moves towards a beta version, we anticipate using it as our predominant tool. The visualization tools in GeneExpressionToolkit are based on the statistical data visualization software GGobi written by Dr. Dianne Cook, Dept. of Statistics, ISU, and colleagues (http://www.ggobi.org/).