A High Throughput Screen to Identify Novel
Secreted and Transmembrane Proteins Involved in Drosophila
Embryogenesis
Casey C. Kopczynski¹, Jasprina N. Noordermeer¹, Thomas L.
Serano, Wei-Yu Chen, John D. Pendleton, Suzanna Lewis, Corey S.
Goodman and Gerald M. Rubin¹.
¹ These authors contributed equally to this work.
Howard Hughes Medical Institute, Department of Molecular and Cell
Biology
University of California
Berkeley, CA 94720-3200
USA
ABSTRACT
Secreted and transmembrane proteins play an essential role in
intercellular communication during the development of multicellular
organisms. As only a small number of these genes have been
characterized, we developed a screen for genes encoding
extracellular proteins that are differentially expressed during
Drosophila embryogenesis. Our approach utilizes a new method for
screening large numbers of cDNAs by whole embryo in situ
hybridization. The cDNA library for the screen was prepared from
rough endoplasmic reticulum-bound mRNA, and is therefore enriched
in clones encoding membrane and secreted proteins. To increase the
prevalence of rare cDNAs in the library, the library was normalized
using a novel method based on cDNA hybridization to genomic
DNA-coated beads. In total, 2518 individual cDNAs from the
normalized library were screened by in situ hybridization,
and 917 of these cDNAs represent genes differentially expressed
during embryonic development. Sequence analysis of 1001 cDNAs
indicated that 811 represent genes not previously described in
Drosophila. Expression pattern photographs and partial DNA
sequences have been assembled in a database publicly available at
the Berkeley Drosophila Genome Project website
(http://www.fruitfly.org). The identification of a large number of
genes encoding proteins involved in cell-cell contact and signaling
will advance our knowledge of the mechanisms by which multicellular
organisms and their specialized organs develop.
INTRODUCTION
A major goal of developmental biology is to elucidate the molecular
mechanisms that govern cell-cell interactions in higher eukaryotes.
Genetic analysis of development in Drosophila has proven to be a
powerful approach for studying these mechanisms. For example, most
of the genes known to be involved in the hedgehog (1, 2), dpp/BMP
(3), and Wnt (4) signaling pathways were identified through
classical genetic screens in Drosophila. The characterization of
these genes and their vertebrate homologs has greatly advanced our
understanding of the cell signaling pathways that regulate
development.
Genetic screens, however, have significant limitations. Genes with
subtle loss-of-function phenotypes or genes whose function can be
compensated for by other genes or pathways are unlikely to be
found. These two classes of genes may represent the majority of
genes in Drosophila, since it is estimated that two-thirds of
Drosophila genes are not required for viability (5). In addition,
screens designed to identify specific phenotypic defects often do
not recover genes with pleiotropic roles during development, since
the requirement for gene function in one developmental process can
mask its requirement in another.
To identify all classes of developmentally important genes,
expression-based and other molecular screens are needed to
supplement classical genetic screens. In Drosophila, the most
productive such screens to date have utilized P element-based
enhancer traps (6-9), but P element insertion is not random and
enhancer trap screens are biased towards identifying genes that are
favored for insertion by P elements (10). Other expression-based
screens to specifically identify extracellular proteins have
involved generating monoclonal antibodies against crude membrane
preparations and screening by immunostaining of embryos (11, 12).
Unfortunately, antibody screens are biased towards identifying the
most abundant or highly immunogenic proteins and thus typically
identify only a small subset of proteins.
We present a novel, large scale screen for genes encoding secreted
and transmembrane proteins that are expressed in specific tissue or
cell types during embryonic development in Drosophila. The approach
combines a cDNA library enriched for genes encoding extracellular
proteins with a high throughput whole embryo in situ
hybridization procedure and subsequent sequence analysis. The
results have been compiled in a publicly available database.
MATERIALS AND METHODS
All protocols used in this study are available in a more detailed
form at http://www.fruitfly.org.
RNA isolation from rough endoplasmic
reticulum
Rough endoplasmic reticulum membranes or rough microsomes (RMs)
were isolated from 10 g of 8-16 hr (25°C) embryos using a
sucrose gradient sedimentation procedure (13, 14) with some
modifications. PolyA+ RNA was purified from the RM RNA
preparation using the PolyA Select kit (Promega).
cDNA library construction
A directionally cloned RM cDNA library was prepared from RM
polyA+ RNA using standard techniques (15), except that
the RNA was annealed with a Pst-T15 primer/adaptor
(5'-CACCTTGTCTCACTGCAG(T)15) and the first strand cDNA was synthesized in
the presence of 5-methyl dCTP (Pharmacia) to protect internal Pst I
sites from subsequent digestion. Double-stranded cDNA was then
repaired with T4 DNA polymerase, ligated with Hind III/Xmn I
adaptors (New England Biolabs), digested with Pst I, size-selected
to remove cDNAs smaller than 500 bp (15), and cloned into Hind
III/Pst I-digested pBluescript SK(+) (Stratagene). The ligated
plasmid was transformed into XL-1 Blue MRF' (Stratagene) to obtain
a library of 5 × 10^5 independent cDNA clones.
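As an illustrative check of the primer/adaptor design described above, the following Python sketch (not part of the original protocol) confirms that the adaptor carries a Pst I recognition site (CTGCAG) immediately 5' of the oligo(dT) tract. The function and variable names are hypothetical.

```python
# Illustrative sketch only; names are hypothetical and not from the protocol.
PST_I_SITE = "CTGCAG"  # Pst I recognition sequence


def build_primer(adaptor: str, t_count: int) -> str:
    """Return the adaptor followed by an oligo(dT) tract of length t_count."""
    return adaptor + "T" * t_count


# Adaptor sequence as given in the text, followed by 15 T residues.
primer = build_primer("CACCTTGTCTCACTGCAG", 15)

# Position of the Pst I site within the primer (0-based).
site_start = primer.find(PST_I_SITE)

# True when the Pst I site ends exactly where the oligo(dT) tract begins.
site_abuts_oligo_dt = primer[site_start + len(PST_I_SITE):] == "T" * 15
```

Because the site sits directly next to the oligo(dT) tract, Pst I digestion of the double-stranded cDNA leaves a defined 3' cloning end, which is what makes the library directional.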
The normalized RM cDNA library was prepared from single-stranded RM
cDNA eluted from genomic DNA beads (see below). Single-stranded
cDNA was converted to double-stranded cDNA using the Bluescript KS
primer, cloned into pBluescript SK(+) and transformed into XL-1
Blue MRF' as described above. A normalized library of 4.4 × 10^4
independent cDNA clones was obtained.
Preparation of genomic DNA-coated magnetic beads and
normalization of the RM cDNA library
Genomic Drosophila DNA was partially digested with Sau3A and Mae
III, size fractionated and a Klenow "fill in" reaction (15) was
used to incorporate biotin-dUTP (ENZO Biochem) into the ends of the
Sau 3A and Mae III fragments. The biotin-labeled genomic DNA was
immobilized on streptavidin-coated magnetic beads (Dynal) using a
modification of the manufacturer's instructions. The beads were
collected, washed and used immediately for cDNA
hybridization.
To prepare single-stranded cDNA "driver" for hybridization to the
genomic DNA "target", the RM library was transcribed in vitro
and the product RNA was subsequently converted into
single-stranded cDNA. The genomic DNA beads were resuspended in
hybridization mix containing single-stranded RM cDNA as driver and
free polysome polyA+ RNA as competitor to block the
hybridization of free polysome cDNA to the beads. The beads were
hybridized at 65°C for 16 hrs with rocking. After
hybridization the beads were washed extensively and subsequently
the hybridized cDNA was eluted and recovered by ethanol
precipitation. The protocol used to construct the library is shown
schematically in Figure 1.
Figure 1
Whole-mount RNA in situ hybridization of
Drosophila embryos in 96 well plates
The non-radioactive whole embryo in situ hybridization
method described by Tautz and Pfeifle (16) was adapted to the use
of RNA probes to achieve maximum sensitivity. To allow expedient
screening with large numbers of probes, the protocol was further
modified for hybridization in 96 well plates. Staging of embryos
and description of expression domains was performed as described
(17) using a standardized vocabulary
(http://flybase.bio.indiana.edu/docs/flydocs/flybase/controlled-vocabularies.txt).
Photography and digital imaging
Between 10 and 15 individually staged embryos were selected for
photography for each RM cDNA clone. Expression domains were
examined using Nomarski optics on an Axiophot microscope (Zeiss)
and photographed using standard 35mm film. Digital images were
generated and written onto compact discs (Eastman Kodak
Company).
DNA Sequencing and Analysis
The cDNAs were sequenced using either the ABI Prism Dye Terminator
Cycle Sequencing Ready Reaction Kit or the Pharmacia Autoread
Sequencing Kit and the products run on an ABI Prism 373 DNA
Sequencer or a Pharmacia ALF Express DNA Sequencer, respectively.
The resulting DNA sequences were trimmed and edited using
Sequencher 3.1 software. Edited sequences average about 350-400
nucleotides in length and contain 3% or less ambiguity. In cases
where sequences from the 5' and 3' ends of the insert overlapped,
contigs were constructed. Database searches were carried out using
the BLASTN and TBLASTX programs (18).
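The 3% ambiguity criterion mentioned above can be expressed as a simple check. The sketch below is an assumption about how such a filter might look; it is not the procedure implemented in Sequencher, and the function names are hypothetical. It treats any character other than A, C, G or T as ambiguous.

```python
def ambiguity_fraction(seq: str) -> float:
    """Fraction of positions that are not an unambiguous A, C, G or T."""
    if not seq:
        raise ValueError("empty sequence")
    ambiguous = sum(base not in "ACGT" for base in seq.upper())
    return ambiguous / len(seq)


def passes_quality(seq: str, max_ambiguity: float = 0.03) -> bool:
    """True when the edited read contains 3% or less ambiguous positions."""
    return ambiguity_fraction(seq) <= max_ambiguity
```

For a 400-nucleotide read, this threshold corresponds to at most 12 ambiguous positions.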
Database and Software
We implemented the cDNA database in Illustra version 3.2, an
object-oriented relational database. The network browser interface
was supported by the Apache v1.2.5 HTTP server. Common Gateway
Interface (CGI) scripts were written in Perl v1.0.5. Assemblies of
the cDNA sequences are publicly viewable using a Java applet. The
applet was compiled with Java 1.0.3 and utilized the
BDGP/Neomorphic Software Inc. widget set. The cDNA sequences were
analyzed using gapped WU-BLAST v2.0 (Warren Gish). Consensus
sequences from multiple cDNAs (tentatively the same gene) were
assembled using PHRAP (P. Green, in preparation).
RESULTS
Isolation of mRNA from rough microsomes
Most mRNAs that encode membrane and secreted proteins are bound to
the rough endoplasmic reticulum through ribosomes engaged in
cotranslational secretion of their nascent polypeptides. We
isolated rough endoplasmic reticulum membranes, or rough microsomes
(RMs), from embryos as a source of mRNAs encoding membrane and
secreted proteins. We found that only a small fraction of polysomal
mRNA (<10%) is present in the RM preparation; the vast majority
of embryonic mRNA appears to be translated on "free" polysomes
encoding cytosolic proteins. This result is consistent with
sequencing data obtained from an embryo cDNA library prepared from
unfractionated mRNA, which revealed that 94% of clones with matches
to known proteins encoded intracellularly-localized proteins (see
below).
Northern blot analysis was used to determine the extent to which
mRNAs encoding membrane and secreted proteins are enriched in the
RM RNA preparation (Figure 2A and B). The results show that the
mRNA encoding the membrane protein Fasciclin II (Fas II) is
approximately 10-fold enriched in the RM RNA preparation relative
to the mRNA encoding the cytosolic protein rp 49. Similar results
were obtained using probes representing other membrane and
cytosolic proteins (data not shown). Although these results confirm
that the RM RNA preparation is enriched for mRNAs encoding membrane
and secreted proteins, they also reveal that the RM preparation was
contaminated with significant amounts of free polysomes. The low
yield of RMs obtained from embryos and the RNA degradation suffered
on sucrose gradients precluded further purification of the RM
preparation.
Figure 2
Preparation of a normalized cDNA library
PolyA+ RNA was prepared from RM RNA and used to
generate a directionally cloned RM cDNA library (Materials and
Methods). To increase the chances of identifying genes that encode
low abundance mRNAs, it was important to normalize the
representation of cDNAs in this library. A method of normalization
was needed that would increase the prevalence of rare cDNAs
encoding membrane and secreted proteins without increasing the
prevalence of cDNAs encoding cytosolic proteins. The normalization
procedure we developed is based upon hybridizing a large excess of
single stranded cDNA to a limiting amount of genomic DNA that is
attached to magnetic beads (Figure 1). To prevent cDNAs encoding
cytosolic proteins from hybridizing to the genomic DNA-coated
beads, free polysome polyA+ RNA was added as a
competitor. Once the hybridization was complete, the unbound cDNA
was discarded and the normalized library was prepared from the cDNA
that hybridized to the genomic DNA. Thus the representation of
cDNAs in the normalized library should reflect gene copy number,
rather than mRNA abundance.
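The reasoning behind this expectation can be illustrated with a toy saturation model. This is a deliberately simplified sketch, not a kinetic description of the hybridization: it assumes that each genomic copy of a gene can capture a fixed maximum number of cDNA molecules (the hypothetical parameter `sites_per_copy`) and that the cDNA driver is in excess. All numbers are invented for illustration.

```python
def normalize_by_genomic_capture(cdna_abundance, copy_number, sites_per_copy):
    """Toy model: each gene captures at most sites_per_copy * copy_number
    cDNA molecules; any excess cDNA remains unbound and is discarded."""
    return {
        gene: min(count, sites_per_copy * copy_number[gene])
        for gene, count in cdna_abundance.items()
    }


# Invented input: three single-copy genes spanning a 100-fold abundance range.
before = {"abundant": 10_000, "moderate": 1_000, "rare": 100}
copies = {"abundant": 1, "moderate": 1, "rare": 1}

after = normalize_by_genomic_capture(before, copies, sites_per_copy=50)
```

Under these assumptions all three genes end up equally represented, because the limiting genomic target, not the input abundance, determines how much cDNA is recovered. The same model also predicts that a high copy number sequence captures proportionally more cDNA.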
The effectiveness of this method was determined by colony blot
hybridization using probes to a moderately abundant RM-bound mRNA
(Fas II), a low abundance RM-bound mRNA (connectin) and a cytosolic
mRNA (Ras 1). As expected, normalization had the greatest effect on
the frequency of clones representing the low abundance connectin
mRNA, which showed a 13-fold increase from an initial frequency of
1 in 90,000 clones to 1 in 6900. By comparison, the frequency of
Fas II clones in the normalized library increased only 2-fold from
an initial frequency of 1 in 10,000 clones to 1 in 4300.
Unexpectedly, the frequency of Ras 1 clones in the library also
increased substantially (6-fold from an initial frequency of 1 in
130,000 clones to 1 in 21,000). This suggests that the addition of
free polysome RNA as a competitor in the hybridization mix was only
partially effective at preventing normalization of cDNAs encoding
cytosolic proteins. Given that typical embryo cDNA libraries
contain similar numbers of Fas II and Ras 1 clones (data not
shown), the results suggest that the normalized RM cDNA library is
approximately 5-fold enriched for clones encoding membrane and
secreted proteins.
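The fold changes quoted above follow directly from the reported clone frequencies. The short Python sketch below reproduces the arithmetic; the variable and function names are ours, and the only inputs are the "1 in N" frequencies given in the text.

```python
# Clone frequencies from the text as (before, after) values of N in "1 in N".
frequencies = {
    "connectin": (90_000, 6_900),
    "Fas II": (10_000, 4_300),
    "Ras 1": (130_000, 21_000),
}


def fold_increase(n_before: int, n_after: int) -> float:
    """Fold increase in frequency when 1-in-n_before becomes 1-in-n_after."""
    return n_before / n_after


folds = {gene: fold_increase(*pair) for gene, pair in frequencies.items()}

# Fas II clones relative to Ras 1 clones in the normalized library. This is
# the enrichment estimate if both are equally frequent in a typical library.
enrichment = frequencies["Ras 1"][1] / frequencies["Fas II"][1]
```

The ratio of Ras 1 to Fas II clone spacing in the normalized library (21,000 / 4,300) is the basis of the approximately 5-fold enrichment estimate.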
Since normalization of the RM library resulted in an increase in
the representation of cDNAs encoding cytosolic proteins, we devised
a rapid Northern blot assay to determine if a cDNA of interest is
likely to encode a membrane or secreted protein or a cytosolic
protein (Figure 2C and D). Specifically, the cDNA is hybridized to
a blot containing one lane of unfractionated mRNA and one lane of
free polysome mRNA: if the hybridization signal is decreased in the
free polysome lane, this suggests that the mRNA was bound to rough
microsomes and thus encodes a membrane or secreted protein. To
date, this assay has produced accurate predictions for 11/12 cDNAs
tested (data not shown).
RNA in situ hybridization of cDNA clones to
Drosophila embryos.
Spatial and temporal embryonic expression profiles of the genes
represented by RM cDNAs were determined by RNA in situ
hybridization to whole mount Drosophila embryos. To evaluate large
numbers of cDNA probes, we developed an RNA in situ
hybridization protocol that allows the simultaneous screening of 96
different RNA probes in a single multi-well plate.
A total of 2518 RNA probes prepared from individual, randomly
picked cDNA clones were screened on 0- to 24-hour-old whole-mount
embryos. Of these clones, 917 (36%) were expressed in specific
patterns during embryogenesis, while 1206 (48%) of the cDNAs showed
apparent uniform expression throughout the embryo. The remaining
395 clones (16%) did not produce detectable levels of staining in
the embryo. For every cDNA clone with specific expression patterns,
10 to 15 embryos covering a range of different embryological stages
(starting at the fertilized egg to stage 16) were evaluated and
photographed. As expected, a wide variety of temporal and spatial
expression patterns was observed (examples in Figure 3).
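As a consistency check on the screen totals reported above, the following sketch recomputes the percentages from the clone counts. The category labels are shorthand of our own.

```python
# Clone counts reported in the text for each outcome of the in situ screen.
counts = {
    "specific pattern": 917,
    "uniform expression": 1206,
    "no detectable staining": 395,
}

total = sum(counts.values())

# Percentage of all screened clones in each category, rounded to whole numbers.
percentages = {label: round(100 * n / total) for label, n in counts.items()}
```

The three categories account for all 2518 screened clones.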
Figure 3
The frequency with which cDNAs were found to be expressed in
various embryonic organs is summarized in Table I (ubiquitously
expressed cDNAs are not included). The numbers shown in Table I are
adjusted for multiple occurrences of cDNAs representing a single
gene. A disproportionately large number of cDNAs are expressed in
the embryonic gut, the CNS and the muscle, while only a small
percentage of cDNAs are found in tissues such as the amnioserosa,
glands, trachea, imaginal discs and gonads. A possible explanation
for this observation is that expression in a tissue such as the gut
is more easily scored than, for example, that in the embryonic
imaginal discs; these consist of only 10-25 cells and are
considerably more difficult to identify.
Only a small percentage of the clones were found to be expressed
during early zygotic stages of development (blastoderm, gastrula
and segmented germband stages). The vast majority are expressed
during stages when the internal organs, like the gut, the central
nervous system and the muscles are formed. As the embryos that were
used to make the cDNA library were taken from an 8- to 16-hour
collection, the period when these tissues are developing, the bias
towards cDNAs expressed in the internal organs is not unexpected.
In addition, a large number of cDNAs show hybridization to early
stage embryos prior to the onset of zygotic gene expression. This
hybridization presumably represents maternal contribution of the
cognate mRNAs.
Table I. Expression domains of RM clones during embryogenesis
| Spatial Expression Domain | Number of RM clones* | %¹ |
| fertilized egg | 167 (282) | 7 |
| blastoderm | 13 (18) | <1 |
| gastrula | 9 (9) | <1 |
| segmented germ band | 4 (5) | <1 |
| epidermis | 86 (134) | 4 |
| mesoderm | 379 (638) | 16 |
| | 87 (160) | 4 |
| | 228 (329) | 9 |
| | 28 (84) | 1 |
| | 36 (65) | 2 |
| nervous system | 210 (317) | 9 |
| - stomatogastric nervous system | 6 (8) | <1 |
| - peripheral nervous system | 13 (27) | <1 |
| | 191 (282) | 8 |
| embryonic gut | 418 (642) | 17 |
| | 99 (129) | 4 |
| | 169 (284) | 7 |
| | 94 (136) | 4 |
| | 38 (72) | 2 |
| | 18 (21) | <1 |
| amnioserosa | 28 (41) | 1 |
| embryonic glands | 69 (95) | 3 |
| embryonic tracheal system | 25 (32) | 1 |
| reproductive system | 24 (43) | 1 |
| imaginal disc | 3 (6) | <1 |
* The first number given is the number of cDNAs that represent
unique sequences, while the number in parentheses is the total
number of clones. Individual clones are usually expressed in more
than one tissue. Uniformly expressed cDNAs are not included.
¹ The percentage of unique clones in the database expressed in
a particular tissue.
Sequence Analysis
We next set out to sequence the 5' and 3' ends of the 917 cDNAs
that represent genes with tissue- and stage-specific expression
patterns, as such genes are good candidates to play important roles
in development. In addition, we sequenced a subset (381) of the
cDNAs that represent uniformly expressed genes. Based upon sequence
analysis, we were able to identify 297 recurring cDNAs. The largest
class of repetitive cDNAs corresponded to mitochondrial genes,
which we found to be strongly expressed in the visceral mesoderm.
The relatively high prevalence of mitochondrial cDNAs is likely due
to the fact that mitochondria are a significant contaminant of
rough microsome preparations and mitochondrial DNA is present at a
very high copy number in embryos. After taking redundancies into
account, the 1298 sequenced cDNAs represent 1001 unique sequences.
This is likely to be a slight overestimate of the number of
different genes represented, however, since a single gene can
produce transcripts with different 3' ends and "false" 3' ends can
be generated by internal priming during cDNA synthesis. Thus, we
expect the number of different genes examined to be between 800 and
900.
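The clone accounting in this paragraph can be verified with a few lines of arithmetic. The sketch below uses only the counts stated in the text; the variable names are ours.

```python
specific_pattern_cdnas = 917  # cDNAs with tissue- and stage-specific patterns
uniform_subset = 381          # sequenced subset of uniformly expressed cDNAs
recurring = 297               # cDNAs identified as repeats of another clone

sequenced = specific_pattern_cdnas + uniform_subset
unique_sequences = sequenced - recurring

# Share of unique sequences expected to be distinct genes, using the
# 800 to 900 range given in the text.
gene_fraction_range = (800 / unique_sequences, 900 / unique_sequences)
```

The estimate of 800 to 900 genes therefore corresponds to roughly 80 to 90% of the unique sequences.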
This sequence data provided us with another opportunity to assess
the enrichment of the library for cDNAs encoding membrane-targeted
proteins. Of the 1001 different sequences, 124 correspond to known
Drosophila genes for which we could predict a subcellular
localization based on protein similarity or published protein
localization data; 47 of these genes encode membrane proteins and
77 encode either nuclear or cytoplasmic proteins. Thus,
approximately 38% of the cDNAs that correspond to known genes
encode membrane proteins. For comparison, we carried out a
similar analysis on sequences from an unfractionated embryonic cDNA
library, the LD library (sequence data made available by the
Berkeley Drosophila Genome Project; http://www.fruitfly.org). We
analyzed 326 LD cDNAs that correspond to known Drosophila genes.
These cDNAs represent 147 different genes, of which 16 (11%) encode
membrane proteins and 131 (89%) encode nuclear or cytoplasmic
proteins. These results suggest that the RM library is
approximately 3.5-fold enriched for cDNAs encoding
membrane-targeted proteins, similar to the 5-fold enrichment
suggested by our colony blot hybridization results (discussed
above). It should be noted that sequence analysis may underestimate
the overall representation of clones encoding membrane-targeted
proteins in the RM library due to a bias for cytosolic and nuclear
proteins in the Drosophila sequence database. To date, 6/8 RM cDNAs
characterized solely on the basis of expression pattern have been
found to encode membrane or secreted proteins (data not
shown).
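The 3.5-fold figure is the ratio of the membrane-protein fractions in the two libraries. The sketch below recomputes it from the gene counts given above; the function name is hypothetical.

```python
def membrane_fraction(membrane: int, intracellular: int) -> float:
    """Fraction of classified genes that encode membrane proteins."""
    return membrane / (membrane + intracellular)


rm_fraction = membrane_fraction(47, 77)    # RM library: 47 of 124 genes
ld_fraction = membrane_fraction(16, 131)   # LD library: 16 of 147 genes

# Enrichment of the RM library relative to the unfractionated LD library.
fold_enrichment = rm_fraction / ld_fraction
```

Note that the comparison is made per gene rather than per clone, so highly redundant clones do not inflate either fraction.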
The 811 sequences that did not correspond to previously described
Drosophila genes were analyzed for homology to translated
nucleotide databases using the TBLASTX program (18). We found that
267 of these sequences show significant similarity to characterized
genes in other species (i.e., homologies that have a probability of
10^-5 or less and that are not the result of simple repetitive
sequences). As expected, many of these cDNAs encode homologs of
mammalian membrane and secreted proteins, including growth factors,
transmembrane receptors, ion transporters and proteins that
function in the endoplasmic reticulum (Table II). Another 125
sequences show significant homology to identified but
uncharacterized sequences in other organisms, typically to human
and mouse ESTs and to C. elegans genomic DNA. The remaining 419
sequences have no significant homology to any sequence in the
databases. Since the majority of the cDNAs are relatively small
(approximately 1kb in length), it is likely that many of the
sequences consist mainly of 3' untranslated region and therefore
would not be useful for searching databases for protein homologies.
Therefore, the percentage of Drosophila genes that have homologs in
other species is likely to be significantly higher than these
results suggest.
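The significance criterion and the resulting breakdown can be summarized as follows. This is a minimal sketch: the `is_significant` function and its `simple_repeat` flag are hypothetical stand-ins for the manual filtering described in the text, and only the cutoff and the counts come from the paper.

```python
SIGNIFICANCE_CUTOFF = 1e-5  # probability threshold stated in the text


def is_significant(probability: float, simple_repeat: bool) -> bool:
    """A match counts only if it meets the cutoff and is not explained
    by simple repetitive sequence."""
    return probability <= SIGNIFICANCE_CUTOFF and not simple_repeat


# Classification of the sequences without a previously described
# Drosophila match, using the counts reported above.
breakdown = {
    "similar to characterized genes": 267,
    "similar to uncharacterized sequences": 125,
    "no significant similarity": 419,
}
novel_total = sum(breakdown.values())
shares = {label: round(100 * n / novel_total) for label, n in breakdown.items()}
```

About half of the novel sequences have no significant match, which is consistent with the suggestion that many clones consist mainly of 3' untranslated region.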
Table II. Selected RM cDNAs with Homologies to Known Mammalian Genes
| CK no. | Highly similar mammalian gene |
| 02126 | Human epidermal surface antigen (M60922) |
| 02288 | Human plasma membrane calcium ATPase isoform 3x/b (U60414) |
| 01423 | Human stomatin (X60067) |
| 01140 | Human adenosine triphosphatase (M95541) |
| 00230 | Human KDEL receptor (X55885) |
| 00459 | Rat purine specific Na+ nucleoside cotransporter (U25055) |
| 01227 | Human multidrug resistance-associated protein (L05628) |
| 02656 | Mouse ABC8 (Z48745) |
| 00309 | Canine docking protein (SRP receptor) (X06272) |
| 01110 | Human testican (X73608) |
| 00043 | Human SEC13R membrane protein (L09260) |
| 00325 | Human sulfonylurea receptor (L40625) |
| 01510 | Human K-Cl cotransporter, hKCC1 (U55054) |
| 01296 | Rat TRAP complex gamma subunit (Z14030) |
| 02248 | Rat Dri42 (Y07783) |
| 01027 | Human bumetanide-sensitive Na-K-Cl cotransporter (U30246) |
| 02682 | Mouse reticulocalbin (D13003) |
| 00198 | Mouse macrophage scavenger receptor (M59445) |
| 01823 | Human E16 (M80244) |
| 00539 | Human LDL-receptor related protein (X13916) |
| 01577 | Mouse scavenger receptor class B type I (mSR-BI) (U37799) |
| 02137 | Rat zinc transporter, ZnT-2 (U50927) |
| 02567 | Mouse thrombospondin, THBS2 (M64866) |
These clone-gene combinations show TBLASTX values between e-18 and e-59.
For each mammalian gene, the GenBank accession number is shown in
parentheses.
Data Availability over the Internet
A database describing the expression patterns and DNA sequences of
the cDNAs compiled in this study that were expressed in specific
tissues is accessible at http://www.fruitfly.org. The web page
describing each EST shows the sequence, accession numbers, and a
summary of gene expression data, together with a low resolution
expression image and a summary of similarity to other sequences. A
high resolution digital image is available for downloading. Several
types of searches are available to query this information: 1)
Expression Domain Keyword Search: Every expression image has been
annotated using the standardized set of terms developed by Flybase
for the description of
Drosophila anatomy
(http://flybase.bio.indiana.edu). Therefore, keyword searches for
cDNAs that are expressed in a particular embryonic organ, or
combination of organs, may be performed; 2) Sequence Keyword
Search: A BLAST similarity search was performed on each EST and the
results stored in the database, including the accession number of
the GenBank entries of similar sequences. cDNAs that show
similarity to a particular class of gene may be found by searching
for words or phrases that are likely to be found in the gene's
GenBank description; 3) Clone Identifier Search: unique
identifiers, such as the clone name (CK number) or accession
number, can be used to retrieve an individual cDNA record; 4)
Sequence Similarity Search: Using a public BLAST server available
at the same site as the database, searches for ESTs similar to any
query sequence can be performed.
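The expression domain keyword search described in (1) amounts to selecting records whose annotation contains every requested term. The sketch below shows that logic on an in-memory list. It is not the Illustra schema or the CGI code used for the actual database, and the clone names and annotations are invented for illustration.

```python
# Hypothetical records: clone names and annotations are invented.
records = [
    {"clone": "example-1", "domains": {"embryonic gut", "mesoderm"}},
    {"clone": "example-2", "domains": {"nervous system"}},
    {"clone": "example-3", "domains": {"embryonic gut", "nervous system"}},
]


def search_by_domains(records, required_domains):
    """Return clone names annotated with every term in required_domains."""
    required = set(required_domains)
    return [r["clone"] for r in records if required <= r["domains"]]


gut_hits = search_by_domains(records, ["embryonic gut"])
gut_and_ns_hits = search_by_domains(records, ["embryonic gut", "nervous system"])
```

A search of this kind depends on every image having been annotated with the same controlled vocabulary, since free-text descriptions would not match reliably across clones.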
DISCUSSION
We have used high-throughput whole embryo in situ
hybridization and a normalized cDNA library prepared from RM-bound
mRNA to identify membrane and secreted proteins whose expression is
associated with specific developmental processes during
embryogenesis. The expression patterns of 1003 individual cDNAs and
sequence information for 1298 cDNAs are available in a public
database (http://www.fruitfly.org). This database makes it possible
to rapidly identify new developmentally regulated genes and, based
on the sequence and expression pattern, formulate testable
hypotheses for the function of the genes. For example, based on a
motoneuron-specific expression pattern in the developing nerve
cord, we identified the first Drosophila member of the tetraspanin
family of transmembrane proteins, late bloomer (19).
Through subsequent genetic analysis, we determined that late bloomer
function facilitates neuromuscular synapse formation
in the embryo (19). Similarly, characterization of a cDNA expressed
specifically in muscle led to the identification of a new
Drosophila glutamate receptor (20).
Although the RM cDNA library is 4- to 5-fold enriched for membrane
and secreted proteins, this library also contains a large fraction
of cDNAs encoding cytosolic and nuclear proteins. This is due in
part to the fact that embryonic mRNAs encoding membrane and
secreted proteins appear to be much less abundant than mRNAs
encoding cytosolic and nuclear proteins. In addition, normalization
of the RM library decreased the enrichment for membrane and
secreted proteins by partially restoring the prevalence of clones
encoding cytosolic and nuclear proteins. In spite of this drawback
to normalization, we chose to screen the normalized RM cDNA library
to reduce the number of recurrent cDNAs and thereby increase the
chances of identifying less abundant mRNAs whose expression is
limited to a small number of cells in the embryo.
The normalization method we describe has both advantages and
disadvantages relative to the more standard methods of normalizing
by limited cDNA self-hybridization (21). The main advantage of
normalizing by hybridization to genomic DNA is that the method
requires no optimization of hybridization times or titration of
hydroxyapatite elution conditions. However, genomic DNA
hybridization normalizes on the basis of gene copy number, which
means that high copy number genes are overrepresented in the cDNA
library. We found mitochondrial genes were particularly
problematic; approximately 15% of the clones in the library
represent mitochondrial genes. This could be resolved by further
purification of the genomic DNA to ensure that mitochondrial DNA is
not present on the magnetic beads. Another limitation of the
technique is the need for relatively large amounts of genomic DNA
target in the hybridization to capture enough cDNA to prepare a
library. The amount of DNA needed for genomes of higher complexity
than Drosophila would necessitate a much larger amount of genomic
DNA-coated beads, which would increase the amount of contamination
in the library due to nonspecific hybridization. Also, the larger
amount of interspersed repetitive DNA in vertebrate genomes would
cause rapid annealing of the genomic DNA and could cause vast
overrepresentation of mRNAs containing repetitive elements in their
untranslated regions. For these reasons, this normalization
technique may not be appropriate for vertebrate genomes.
Subcellular fractionation of RM-bound mRNA is a convenient way to
prepare mRNA enriched for membrane and secreted proteins. However,
it requires a relatively large amount of tissue in order to isolate
enough mRNA to generate a library that does not require
amplification by PCR. It is also difficult to normalize an RM
library without increasing the prevalence of mRNAs encoding
cytosolic and nuclear proteins. In the course of this work, two
alternative methods for identifying cDNAs encoding membrane and
secreted proteins were described that have some advantages over
subcellular fractionation (22, 23). These methods are based on
transforming tissue culture cells (22) or yeast (23) with a vector
that will express an assayable reporter protein only when a cDNA
encoding a signal sequence is cloned into the vector. This approach
allows cDNA libraries to be prepared from small amounts of
unfractionated mRNA, and the library of positive cDNAs that is
generated is highly specific for membrane and secreted
proteins.
The Drosophila genome is estimated to contain approximately 12,000
genes (5). The fact that we were able to carry out in situ
hybridization to embryos for over 2,500 different cDNA clones in
this study argues that the methodology we describe could be used to
collect similar data for all Drosophila genes. Suitable probes
could be derived by using PCR to amplify segments of sequenced
genomic DNA or cDNA clones as templates. The highly sensitive and
rapid in situ hybridization method employed here allows
the detailed visualization of gene expression and provides a level
of spatial and temporal resolution that is not currently obtainable
by methods that require RNA isolation and hybridization to clone
(24) or oligonucleotide (25) arrays. Such expression data, along
with the more quantitative data provided by hybridization to
arrays, will be essential for deciphering gene regulatory
networks.
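To give a sense of the scale involved, the sketch below estimates how many 96-well plates a genome-wide survey would require. It assumes one probe per well with no control wells or repeats, which is our simplification and not a statement about how the plates were actually laid out.

```python
import math

WELLS_PER_PLATE = 96


def plates_needed(n_probes: int, wells_per_plate: int = WELLS_PER_PLATE) -> int:
    """Minimum number of plates if every well holds a different probe."""
    return math.ceil(n_probes / wells_per_plate)


plates_this_study = plates_needed(2518)      # probes screened in this study
plates_whole_genome = plates_needed(12_000)  # estimated gene count (5)
```

Under this assumption, covering all estimated genes would take a little under five times as many plates as the present screen.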
ACKNOWLEDGMENTS
We thank Fred Wolf for his help with the initial RNA in situ
screens, Rick Fetter and Lee Fradkin for helping prepare
the figures and Lee Fradkin and the members of the Rubin and
Goodman laboratories for critical review of the manuscript. C. C.
K. was supported as a Jane Coffin Childs postdoctoral fellow and a
Howard Hughes Medical Institute (HHMI) postdoctoral associate. T.
L. S. is a Jane Coffin Childs postdoctoral fellow. J. N. N. is a
postdoctoral associate and C. S. G. and G. M. R. are investigators
with the HHMI. This work was supported in part by NIH grant
HG00750.
REFERENCES
1. Burke, R., & Basler, K. (1997) Curr. Opin. Neurobiol. 7, 55-61.
2. Perrimon, N. (1996) Cell 86, 513-516.
3. Derynck, R., & Zhang, Y. (1996) Curr. Biol. 6, 1226-1229.
4. Cavallo, R., Rubenstein, D., & Peifer, M. (1997) Curr. Opin.
Genet. Dev. 7, 459-466.
5. Miklos, G. L., & Rubin, G. M. (1996) Cell 86, 521-529.
6. Wilson, C., Pearson, R.K., Bellen, H.J., O'Kane, C.J.,
Grossniklaus, U. & Gehring, W.J. (1989) Genes Dev. 3, 1301-1313.
7. Bier, E., Vaessin, H., Shepherd, S., Lee, K., McCall, K.,
Barbel, S., Ackerman, L., Carretto, R., Uemura, T., Grell, E., Jan,
L.Y. & Jan, Y.N. (1989) Genes Dev. 3, 1273-1287.
8. Torok, T., Tick, G., Alvarado, M. & Kiss, I. (1993) Genetics
135, 71-80.
9. Spradling, A. C., Stern, D. M., Kiss, I., Roote, J., Laverty,
T., & Rubin, G. M. (1995) Proc. Natl. Acad. Sci. USA 92,
10824-10830.
10. Kidwell, M.G. (1986) in Drosophila: A Practical Approach, ed.
Roberts, E.D. (I.R.L. Press, Washington, D.C.), pp. 59-83.
11. Bastiani, M.J., Harrelson, A.L., Snow, P.M., & Goodman,
C.S. (1987) Cell 48, 745-755.
12. Zipursky, S.L., Venkatesh, T.R., Teplow, D.B., & Benzer, S.
(1984) Cell 36, 15-26.
13. Gaetani, S., Smith, J.A., Feldman, R.A., & Morimoto, T.
(1983) Methods Enzymol. 96, 3-24.
14. Natzle, J.E., Hammonds, A.S., & Fristrom, J.W. (1986)
J. Biol. Chem. 261, 5575-5583.
15. Sambrook, J., Fritsch, E.F., & Maniatis, T. (1989)
Molecular Cloning: A Laboratory Manual, Second Edition
(Cold Spring Harbor, New York).
16. Tautz, D., & Pfeifle, C. (1989) Chromosoma 98, 81-85.
17. Hartenstein, V. (1993) Atlas of Drosophila Development
(Cold Spring Harbor, New York).
18. Altschul, S.F., Gish, W., Miller, W., Myers, E.W., &
Lipman, D.J. (1990) J. Mol. Biol. 215, 403-410.
19. Kopczynski, C. C., Davis, G.W., & Goodman, C.S. (1996)
Science 271, 1867-1870.
20. Petersen, S.A., Fetter, R.D., Noordermeer, J.N., Goodman, C.S.,
& DiAntonio, A. (1997) Neuron 19, 1237-1248.
21. de Fatima Bonaldo, M., Lennon, G. & Soares, M.B. (1996)
Genome Res. 6, 791-806.
22. Tashiro, K., Tada, H., Heilker, R., Shirozu, M., Nakano, T.
& Honjo, T. (1993) Science 261, 600-603.
23. Klein, R.D., Gu, Q., Goddard, A. & Rosenthal, A. (1996)
Proc. Natl. Acad. Sci. USA 93, 7108-7113.
24. Schena, M., Shalon, D., Davis, R.W., & Brown, P.O. (1995)
Science 270, 467-470.
25. Lockhart, D.J., Dong, H., Byrne, M.C., Follettie, M.T., Gallo,
M.V., Chee, M.S., et al. (1996) Nature Biotechnology 14,
1675-1680.
Figure Legends
Figure 1
Schematic representation of the cDNA normalization procedure.
The normalization method is described in detail in the text.
Figure 2
mRNAs encoding transmembrane proteins are selectively
enriched in the rough microsome RNA fraction and decreased in the
free polysome fraction.
(A, B) Northern blots containing 20 µg RNA
from the total (T) or rough microsome (M) fractions were hybridized
with the genes encoding the transmembrane protein Fas II (A) (4500
nucleotide transcript) or the rp49 ribosomal protein (B) (600
nucleotide transcript). (C, D) Northern blots containing 10 µg
polyA+ RNA from the total (T)
or free polysome (F) fractions were hybridized with genes encoding
the transmembrane protein latebloomer (Lbm) (C) (1300 nucleotide
transcript) or the cytosolic protein actin 57B (D) (2000 nucleotide
transcript).
Figure 3
Expression domains of a subset of RM clones.
The RNA expression patterns of selected RM clones in distinct parts
of the Drosophila embryo are shown. A typical image assigned to
each RM clone in the database is shown in A, while panels B through
L show a detail of these images. In panels B through L, anterior is
to the left.
(A) Expression of CK02213 in the anterior and posterior midgut
primordium (arrows), the midgut (arrowhead) and the visceral
mesoderm. This clone shows homology to the human NMDA receptor
glutamate-binding subunit.
(B) Expression of CK02262 in the ventral nerve cord and brain.
This clone shows homology to the B. taurus gene for
Na/Ca,K-exchanger protein.
(C) Expression of CK02467 in the proventriculus, a part of the
stomodeum. This clone does not show homology to any genes in the
existing gene databases.
(D) Expression of CK01670 in the developing tracheal system. This
clone does not show homology to any genes in the existing gene
databases.
(E) Expression of CK01209 in the brain. This clone shows homology
to human serine/threonine kinase.
(F) Expression of CK02623 in the salivary glands and
proventriculus. This clone shows homology to the rat
Na+-dependent inorganic phosphate cotransporter.
(G) Expression of CK00246 in the central nervous system, ventral
nerve cord and brain. This clone shows homology to mouse and human
ESTs.
(H) Expression of CK01174 in the reproductive system (gonads).
This clone does not show homology to any genes in the existing
gene databases.
(I) Expression of CK00490 in the anterior and posterior midgut
primordium. This clone shows homology to several human ESTs.
(J) Expression of CK01593 in the dorsal vessel and lymph gland.
This clone does not show homology to any genes in the existing
gene databases.
(K) Expression of CK02229 in the epidermis, the visceral mesoderm,
the tracheal system and the fore and hindgut. This clone shows
homology to human laminin.
(L) Uniform expression of CK02318 throughout the epidermis. This
clone shows homology to a C. elegans EST.