A detailed look at the T4 bacteriophage
The T4 bacteriophage is a complex and highly evolved virus which infects the bacterium Escherichia coli.
The T4 phage is quite large for a virus: the head (capsid) is 119.5 nm tall and 86 nm in diameter, whilst
the tail is 100 nm long and 21 nm in diameter. The virion is composed of over 2000 protein subunits,
which are the products of over 50 different genes.

The T4 Genome

The T4 genome is dsDNA which is wound tightly within the head (capsid). The genome consists of 168
903 bp and is thought to code for 289 proteins, though not all have yet been characterised, in addition to
coding for 8 tRNAs and at least 2 small RNA molecules of unknown function. The DNA is specially
modified HMC-DNA, meaning that (16%) of the cytosine bases are chemically modified into glucosylated
hydroxymethyl cytosine (HMC). This makes the DNA resistant to endonucleases (nucleic acid degrading
enzymes), such as host endonucleases which digest foreign DNA or the T4 endonucleases which digest
host DNA.

The glucose molecules added to the cytosine residues also increase the stability of the DNA since the
-OH and -H groups of the glucose can hydrogen-bond to neighbouring bases. This may be especially
important as the genome is low in G+C base pairs at 34.5% G+C (G-C pairs stabilise DNA since they are
joined by three hydrogen bonds whereas T-A pairs are joined by only two). In some
regions both strands are transcribed and both may be translated into proteins. The T4 dsDNA
approximates D-form DNA (poly(dA-dT)) which is overwound with only 8 bp per turn and a wider and
shallower major groove and a deeper and narrower minor groove. This form of DNA is possibly
transcribed and replicated faster as it may unzip more easily.
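
The base-pairing arithmetic above can be made concrete with a small sketch. It counts only standard Watson-Crick hydrogen bonds (3 per G-C pair, 2 per A-T pair) for a genome of T4's length and G+C content, and ignores any extra bonds contributed by the glucosylated HMC bases. The function name is hypothetical and chosen for illustration.

```python
def watson_crick_h_bonds(length_bp: int, gc_fraction: float) -> float:
    """Estimate inter-strand hydrogen bonds: 3 per G-C pair, 2 per A-T pair."""
    gc_pairs = length_bp * gc_fraction
    at_pairs = length_bp - gc_pairs
    return 3 * gc_pairs + 2 * at_pairs


T4_LENGTH_BP = 168_903   # genome length quoted in the text
T4_GC = 0.345            # G+C fraction quoted in the text

t4_bonds = watson_crick_h_bonds(T4_LENGTH_BP, T4_GC)
mean_bonds_per_bp = t4_bonds / T4_LENGTH_BP                      # equals 2 + G+C fraction
balanced_mean = watson_crick_h_bonds(T4_LENGTH_BP, 0.5) / T4_LENGTH_BP

print(f"T4: {mean_bonds_per_bp:.3f} bonds per bp; 50% G+C genome: {balanced_mean:.3f}")
```

On this simple count a low G+C genome has fewer hydrogen bonds per base pair than a balanced one, which is why additional stabilisation from the glucose groups may matter.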

Viruses economise on genetic material. In order to find a new host cell, viruses have to produce many
progeny which means minimising the quantity of protein used in capsid construction which reduces the
space available for packaging the genome. T4 has several mechanisms to make the most of its DNA.
These are explained below.

1. Nested Genes

Some genes may actually encode several proteins (e.g. genes 16, 17 and 49) by having multiple START
codons. At least five T4 genes have multiple STARTs. Such nested genes may have diverse functions.
For example, in phage lambda one open-reading frame (ORF) has two nested genes whose proteins
differ by only two amino acids. One of these proteins forms pores in the host cell membrane to induce
cell lysis when the progeny are ready to escape. The other delays formation of the pore and so has an
opposite or antagonistic function. Together these two proteins help regulate the time of lysis.
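
As a minimal sketch of how one open-reading frame can yield nested products, the code below lists every in-frame START codon in a toy sequence and the coding sequence that would begin at each one. The sequence is invented for illustration (it is not a real T4 or lambda gene) and the function names are hypothetical. Only ATG is treated as a START codon here.

```python
def in_frame_starts(orf: str, start_codon: str = "ATG") -> list[int]:
    """Return the 0-based nucleotide offsets of every in-frame start codon."""
    orf = orf.upper()
    return [i for i in range(0, len(orf) - 2, 3) if orf[i:i + 3] == start_codon]


def nested_products(orf: str) -> list[str]:
    """Return the coding sequence beginning at each in-frame start codon."""
    return [orf[i:] for i in in_frame_starts(orf)]


# Toy ORF: ATG AAA ATG GCT TGG TAA (two in-frame STARTs, two codons apart)
toy_orf = "ATGAAAATGGCTTGGTAA"
starts = in_frame_starts(toy_orf)
products = nested_products(toy_orf)
codon_difference = (len(products[0]) - len(products[1])) // 3

print(starts, codon_difference)
```

In this toy case the two products differ by two codons at the start, mirroring the kind of two-amino-acid difference described for the lambda example.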

2. Frameshifting

Nested genes may have different reading frames, meaning that a frame-shift can produce an entirely
new coded protein. Sometimes the different frames may be read in the same direction (e.g. 30.3') or in
opposite directions (e.g. repEA, repEB involved in initiation of replication at OriE). Some viruses also
utilise programmed frameshifts in which the nucleotide sequence codes for a frameshift of 1 bp forwards
or backwards, creating a new code from the same base sequence (recoding). This does not occur in T4,
however.
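
The effect of reading the same bases in different frames or directions can be sketched as follows. The sequence is a made-up example and the helper names are hypothetical. The code only splits the sequence into codons and does not translate them.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def codons(seq: str, frame: int = 0) -> list[str]:
    """Split seq into complete codons, starting at offset frame (0, 1 or 2)."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


def reverse_complement(seq: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


toy_seq = "ATGGCATTAGC"                               # invented 11-base example
frame0 = codons(toy_seq, 0)                           # same strand, frame 0
frame1 = codons(toy_seq, 1)                           # shifted forwards by 1 base
opposite = codons(reverse_complement(toy_seq), 0)     # read in the opposite direction

print(frame0, frame1, opposite)
```

A shift of a single base changes every codon downstream, which is why overlapping frames can encode entirely different proteins.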

3. Introns

At least three T4 genes have introns: td, which codes for thymidylate synthase; nrdB, which codes for a
subunit of the aerobic ribonucleotide reductase; and nrdD, which codes for the anaerobic ribonucleotide
reductase. These introns are group I self-splicing introns and are designated I-TevI (td), I-TevIII
(nrdB) and I-TevII (nrdD). The first two introns themselves code for protein: each codes for an
endonuclease. These endonucleases recognise a 'homing site' and cleave it to form a DSB (double-
strand break) which can recombine with the intron, allowing it to 'infect' any intron-free strains that may be
present in the same host cell. Thus these endonucleases are called homing endonucleases.

These introns may also have a function in regulating deoxyribonucleotide synthesis. In order to splice
themselves efficiently they require efficiently translating ribosomes in the upstream exon. If the host cell
enters stationary phase (meaning that nutrients are limiting and the cells are not dividing - unfavourable
conditions for phage synthesis and for the progeny to infect new cells) then T4 replication pauses until
favourable conditions return. The lack of nutrients may slow translation as the ribosomes have to wait for
amino acids, which prevents intron splicing such that the mRNA of the intron-containing gene is not
translated into functional protein. Since these proteins are used in DNA synthesis this halts DNA replication.

4. Translational Bypassing

This occurs, for example, in T4 gene 60. A 50 bp sequence in the coding region of the mRNA may be
skipped, resulting in an alternative protein product. This is a high-efficiency bypass site, meaning that it
is bypassed often. An additional low-efficiency bypass occurs at the junction of gene 56 with gene 69.
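
A minimal sketch of bypassing, treating it purely as skipping a stretch of the message: the decoded sequence is the mRNA with the bypassed segment left out. The sequences are placeholders rather than the real gene 60 sequence, the function name is hypothetical, and the signals that tell the ribosome where to take off and land are not modelled.

```python
def bypass(mrna: str, takeoff: int, gap: int) -> str:
    """Return the sequence decoded when `gap` nucleotides from index `takeoff` are skipped."""
    if takeoff < 0 or gap < 0 or takeoff + gap > len(mrna):
        raise ValueError("bypassed segment must lie within the mRNA")
    return mrna[:takeoff] + mrna[takeoff + gap:]


upstream = "AUGGCUAAA"      # toy coding sequence before the gap
skipped = "C" * 50          # placeholder for the 50-nucleotide bypassed segment
downstream = "GAUUUCUAA"    # toy coding sequence after the gap

mrna = upstream + skipped + downstream
decoded = bypass(mrna, takeoff=len(upstream), gap=50)

print(len(mrna), len(decoded))
```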

5. High Gene Density

The T4 genome has a high gene density, twice as high as that of Escherichia coli, which also has a
compact genome. Only 9 kb (5.3% of the genome) are non-coding. In contrast, about 2% of bacterial
genomes and 98% of the human genome are non-coding.
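
The density figures can be checked with simple arithmetic. This sketch uses only numbers quoted in this article (genome length, 9 kb of non-coding DNA, 289 predicted proteins). The variable names are illustrative.

```python
GENOME_BP = 168_903        # T4 genome length
NONCODING_BP = 9_000       # approximate non-coding DNA
PREDICTED_PROTEINS = 289   # predicted protein-coding genes

noncoding_percent = 100 * NONCODING_BP / GENOME_BP
genes_per_kb = PREDICTED_PROTEINS / (GENOME_BP / 1000)

print(f"non-coding: {noncoding_percent:.1f}%  genes per kb: {genes_per_kb:.2f}")
```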

6. Proteins with Multiple Functions

Additionally a single gene may code for a protein with multiple roles, e.g. the T4 RNA ligase A (coded for
by gene 63, rnlA) also catalyses tail fiber attachment to the baseplate during assembly.

Of all these genes only 62 (occupying about half of the genome) are essential for successful replication
under controlled laboratory conditions; the others probably serve to increase efficiency of
replication in variable and unpredictable natural conditions. These essential genes code for the
replisome, nucleotide-precursor complex, several transcriptional regulatory factors and most structural
and assembly proteins. Amber mutations in these genes (in which a coding codon is mutated into the UAG
STOP codon, causing premature termination of translation) prevent successful replication.
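
The effect of an amber mutation can be sketched by counting how many codons are read before the first in-frame STOP. The coding sequence below is invented for illustration and the function name is hypothetical. The mutant differs from the wild type by a single C to T change that turns CAG into the amber codon TAG (UAG in the mRNA).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons_before_stop(cds: str) -> int:
    """Count sense codons read from the start until the first in-frame stop codon."""
    cds = cds.upper()
    count = 0
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            break
        count += 1
    return count


wild_type = "ATGCAGGCTTGGAAATAA"       # ATG CAG GCT TGG AAA TAA
amber_mutant = "ATGTAGGCTTGGAAATAA"    # CAG -> TAG at the second codon

print(codons_before_stop(wild_type), codons_before_stop(amber_mutant))
```

The truncated product is usually non-functional, which is why such mutations in essential genes block replication.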

Non-essential genes are involved in nucleotide biosynthesis, recombination, DNA repair, nucleases
which degrade host DNA, proteins which prevent superinfection, proteins which inhibit lysis and progeny
escape when the phage/host ratio is high and proteins which inhibit host replication and transcription. In
some cases the missing functions may be provided by host enzymes or these proteins may increase the
efficiency of phage replication (increase the burst size) and are clearly advantageous if not critical.
Mutations in the primase (gene 61) and topoisomerase (genes 39, 52 and 60) may be compensated for
by other mechanisms which prime DNA synthesis.

Finally a few genes appear to be identical copies of one another: {58, 61}, {2, 64} and {4, 50, 65}.

Structural Proteins and Virion Assembly

More than 40% of the genome encodes 53 of the 54 proteins required for phage particle synthesis: 24
for head morphogenesis, 22 for tail morphogenesis and 7 for tail fibre synthesis (including one for tail
fibre attachment). Of these 54, 5 are catalysts and not actual structural components. The head, tail, tail
fibres and whiskers are synthesised by separate pathways, then the head and tail join before 6 tail fibers
are attached to form a complete virion.

Of the 24 proteins required for head synthesis, 16 are needed for prohead formation and further
maturation to form a mature head, of which 10 are absolutely essential and only one host-encoded
(GroEL); 5 are for DNA-packaging and 3 complete and stabilise the head.

Head Assembly

The T-even phage head is an icosahedron with triangulation number (T) = 13, but which is elongated
along the fivefold axis of symmetry into a prolate icosahedron by the insertion of a near-equatorial band
of 20 capsomeres in the T2 phage. Prolate icosahedra are defined by both T and Q. There are 3T
protein subunits (protein molecules or protomers) per face in an isometric icosahedron (i.e. an
icosahedron whose height = its width) and 60T in total. For T = 13 this would give 780 protomers. At
each vertex 5 protomers assemble into a pentameric capsomere, whilst the faces consist of structural
units of 6 protomers each, forming hexagonal capsomeres. This would give a total of 12 x 5 = 60
protomers at the vertices (12 pentameric capsomeres), leaving 720 protomers in 120 capsomeres of 6
protomers each. This would give 132 capsomeres in total. However, in prolate icosahedra additional
hexameric capsomeres are inserted. For such geometries the number of capsomeres is given by
5(T + Q) + 2, where for T4 Q = 20, giving 5(13 + 20) + 2 = 167 capsomeres. However, one vertex is the
portal vertex which attaches to the tail, so T4 has 166 capsomeres in its capsid. Eleven of the 12
vertices are formed of pentamers of gp24 (5 x 11 = 55 copies of gp24). This leaves 155 hexameric
capsomeres, made up of 155 x 6 = 930 copies of gp23.
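
The capsomere bookkeeping above can be reproduced in a few lines. This is a sketch of the arithmetic only, using the formulas and the values T = 13 and Q = 20 given in the text. The function names are hypothetical.

```python
def isometric_capsid(T: int) -> dict:
    """Subunit bookkeeping for an isometric icosahedral capsid of triangulation number T."""
    protomers = 60 * T                            # 3T per face x 20 faces
    pentamers = 12                                # one per vertex
    hexamers = (protomers - 5 * pentamers) // 6   # remaining protomers, 6 per capsomere
    return {"protomers": protomers, "pentamers": pentamers,
            "hexamers": hexamers, "capsomeres": pentamers + hexamers}


def prolate_capsomeres(T: int, Q: int) -> int:
    """Capsomere count for a prolate icosahedron: 5(T + Q) + 2."""
    return 5 * (T + Q) + 2


T, Q = 13, 20
iso = isometric_capsid(T)
sites = prolate_capsomeres(T, Q)                        # includes the portal vertex
capsid_capsomeres = sites - 1                           # portal vertex excluded
gp24_pentamers = 11
gp24_copies = 5 * gp24_pentamers
gp23_hexamers = capsid_capsomeres - gp24_pentamers
gp23_copies = 6 * gp23_hexamers

print(iso, sites, capsid_capsomeres, gp24_copies, gp23_hexamers, gp23_copies)
```

As a consistency check, setting Q equal to T in the prolate formula recovers the isometric total of 132 capsomeres.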

During head assembly, an initiator complex forms first, which is a dodecamer (12-mer) of gp20 (gp =
gene product, i.e. a protein) - that is, a ring of 12 copies of gp20 which attaches to the inside of the inner
membrane of the host. A scaffold of 576 copies of gp22 and 72 copies of gp21 in a complex attaches to
the initiator complex. The initiator complex forms one vertex of the head to be, called the portal vertex (as
DNA will enter the capsid through it). Pentamers of gp24 form the other 11 vertices of the prolate
icosahedron. The capsid structural proteins, gp23 and gp24, assemble around the scaffold. The scaffold
is then removed by the gp21 T4 prohead protease which also cleaves gp23 and gp24 to increase the
space inside the prohead to accommodate more DNA. The prohead then detaches from the host cell
membrane to become an 'empty small particle' (ESP) which becomes an ISP (initiated small particle)
when DNA begins packing inside it. The prohead then expands by 15% in linear dimension, resulting in a
doubling of internal volume, to form an ILP (initiated large particle) packed with DNA, which later
develops into a mature head.

T4 proteins gp13, gp14 and 6 trimers of gp wac (whisker or fibritin protein which forms the whiskers) bind
the portal vertex to complete the head which then binds an assembled tail. The phage proteins soc and
hoc are non-essential but are added to the head after expansion and may serve to stabilise it; indeed it
is established that soc acts as a protein clamp to stabilise the head. It has been shown that soc stabilises
the capsid against highly alkaline pH, extreme temperatures and osmotic shock. The function of hoc is
less clear.

DNA Packaging

The replicated phage DNA is formed as a double-stranded concatemer (several copies of the genome
joined along the same DNA molecule). A terminase complex (gp16, gp17, gp17' and gp17'') binds to the
DNA and then binds to the portal vertex to form the packasome. ATP hydrolysis provides the energy
needed by the packasome to package the DNA into the head and then the terminase cuts the DNA when
the head is full. It is thought that the dodecameric gp20 neck rotates as the DNA is packaged. The head
is packaged with one complete copy of the DNA plus about 3%. The genome is circularly permuted so
the precise location of the cut does not matter as long as a complete genome is packaged. The
terminase complex apparently inserts the DNA into the phage head, translocates the DNA and then cuts
the DNA when packaging is complete.
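
Headful packaging from a concatemer can be illustrated with a toy model. Here a 'genome' of 100 numbered markers stands in for the real sequence and each head takes 103% of a genome length, as described above. The function name and the toy genome are assumptions made for illustration, and the real terminase machinery is not modelled.

```python
def headful_cuts(genome: list, n_heads: int, headful: float = 1.03) -> list:
    """Cut n_heads consecutive headfuls from a concatemer built from `genome`."""
    chunk = round(len(genome) * headful)             # markers packaged per head
    copies = -(-(chunk * n_heads) // len(genome))    # ceiling division
    concatemer = genome * copies
    return [concatemer[i * chunk:(i + 1) * chunk] for i in range(n_heads)]


toy_genome = list(range(100))          # 100 numbered markers stand in for genes
heads = headful_cuts(toy_genome, n_heads=3)
start_markers = [head[0] for head in heads]

print(start_markers, [len(head) for head in heads])
```

Each toy head starts at a different marker (circular permutation), still contains every marker at least once, and carries the same few markers at both ends because it holds slightly more than one genome length.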

Sometimes errors in assembly occur. Defective virions may be produced with unusually long heads or
unusually short heads. Those with short heads will lack part of the genome and so cannot infect and
complete replication alone, but may do so in a host which is superinfected (infected by more than one
phage) with one of the phages having the missing genes. Those with isometric capsids can only hold
about 70% of the genome, whilst some giant mutants can hold up to 12 copies of the genome!

Tail Assembly

The tail consists of two concentric tubes - the contractile outer tail-sheath tube and the inner tube.
The outer tail sheath consists of 144 copies of gp18 arranged into 24 hexameric rings with each ring
rotated 17 degrees to the one beneath it (in a right-handed helix). This sheath is 98.4 nm long when non-
contracted but contracts to 36 nm (with the twist or angle between adjacent rings increasing to 32
degrees) when the needle is deployed.
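
The sheath figures quoted above imply a simple geometry, sketched here as arithmetic. All numbers come from the text and the variable names are illustrative. The rise per ring is taken as sheath length divided by the number of rings, which is a simplifying assumption.

```python
RINGS = 24
SUBUNITS_PER_RING = 6
EXTENDED_NM = 98.4
CONTRACTED_NM = 36.0

gp18_copies = RINGS * SUBUNITS_PER_RING        # total gp18 subunits in the sheath
rise_extended = EXTENDED_NM / RINGS            # nm per ring before contraction
rise_contracted = CONTRACTED_NM / RINGS        # nm per ring after contraction
length_ratio = CONTRACTED_NM / EXTENDED_NM     # contracted length / extended length

print(gp18_copies, round(rise_extended, 2), rise_contracted, round(length_ratio, 3))
```

On these figures the sheath shortens to a little over a third of its extended length as the rings pack closer together.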

The baseplate consists of a central hub (possibly utilising genes 5, 27, 29, 26, 28 and 51) to which 6
pre-assembled wedges bind. Hub assembly is initiated by gp29. Each wedge is formed when gp10 and
gp11 bind one another, followed by the sequential binding of gp7, gp8, gp6, gp53 and gp25. Once the
wedges bind the hub, gp9, gp12, gp48 and gp54 also bind (6 trimers of gp9, gp10 and gp11, 2 trimers of
gp3 and single trimers of gp5 and gp27; 12 non-trimeric copies of gp8 bind). The baseplate is formed in
association with the inner surface of the host cell inner membrane with a 30 nm fibre (apparently part of
gp7) attaching each of the 6 corners to the membrane. When host cell lysis occurs the baseplates
detach from the host membrane.

Six short tail fibres, trimers of gp12 with gp11 at the tip, are incorporated into the baseplate; these are
responsible for secondary and irreversible binding to the target cell surface when initiating infection.
A zinc ion at the centre helps to hold the gp12 trimer together. A lysozyme, consisting of a gp5-gp27
complex, is apparently inserted in the baseplate as a (hetero)hexameric structure attached to the needle
end of the inner tube. Protein gp5 is cleaved during assembly with the new C-terminus covering the
active site until penetration, when the C-terminus moves out of the way to expose the active site.


The six long tail fibres, or 'legs', consist of the upper or proximal fibre, knee joint / articulator and distal
fibre ending in a needle-like foot. The proximal fibre or proximal rod (proximal = closer to main body,
distal = further from main body) consists of three copies of the gp34 protein (a homotrimer or group of 3
identical subunits bonded together) and is about 70 nm long. Each gp34 subunit is quite a large protein,
consisting of 1289 residues (i.e. amino acids incorporated into a polypeptide). The articulator at the
'knee' consists of a single copy of the protein gp35 (consisting of 372 residues). The distal fibre / rod is
also about 70 nm long and is divided into an upper or more proximal section, which is a trimer of gp36
(each gp36 consists of 221 residues), and a more distal part of the shaft, which is a trimer of gp37
(gp37: 1026 residues). The distal end of the gp37 trimer (from residues 785 to 1026) forms the 'foot'.
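
To give a sense of scale, the subunit counts and chain lengths quoted above can be tallied. This sketch uses only the figures in the text (copies per fibre and residues per chain). The dictionary layout and names are illustrative.

```python
# (copies per fibre, residues per chain), as quoted in the text
FIBRE_PROTEINS = {
    "gp34": (3, 1289),   # proximal fibre, homotrimer
    "gp35": (1, 372),    # knee / articulator, single copy
    "gp36": (3, 221),    # proximal part of the distal fibre, trimer
    "gp37": (3, 1026),   # distal shaft and foot, trimer
}

residues_per_fibre = sum(copies * length for copies, length in FIBRE_PROTEINS.values())
residues_all_fibres = 6 * residues_per_fibre          # six long tail fibres per virion
foot_residues_per_chain = 1026 - 785 + 1              # residues 785 to 1026 inclusive

print(residues_per_fibre, residues_all_fibres, foot_residues_per_chain)
```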

Each foot consists of a proximal globular collar domain (a domain is a region of a protein or polypeptide
with a specific function) where the three monomers are not connected, a common needle domain 17.5
nm long and a terminal head domain which contains the actual adhesin region which adheres to the
outer membrane of Escherichia coli. (See: Bartual et al. 2010. Structure of the bacteriophage T4 long tail
fiber receptor-binding tip. PNAS 107: 20287–20292, for more information on the structure of the long tail
fibres). The adhesin can bind to LPS (lipopolysaccharide or endotoxin, a glycolipid unique to Gram
negative bacterial envelopes) or to the outer membrane porin (protein channel spanning the membrane)
OmpC of Escherichia coli. OmpC is a trimer of three membrane-spanning subunits forming a membrane
channel in between them (porins allow water-soluble materials to cross the fatty membrane). Computer
molecular docking experiments suggest that the tip of the foot (head domain adhesin) can dock (at an
angle as it would in life) to the depression formed by the outer pore of the channel in the OmpC trimer.
The foot can also bind LPS, possibly through positively-charged residues.

The needle domain of the foot consists of 6 highly interwoven polypeptide beta-strands arranged in a
barrel structure (an anti-parallel beta-barrel) forming a 'hollow' tube with a row of seven spaced iron ions
in the centre (the iron atoms are octahedrally coordinated by 6 histidine residues, 2 from each chain)
which hold the strands together to form the structure of the foot. Some phages, such as lambda, have
8 such iron atoms to bind them, which it has been suggested indicates that these structures evolved from
multiplication of an original iron-binding domain. Phylogeny shows us that proteins often evolve by
piecing together domains from other proteins to make new proteins, in a kind of flexible kit building
manner.

Above: when at least three of the long tail fibres bind to sugars on the LPS (lipopolysaccharide)
of the target cell outer membrane, the baseplate undergoes a conformational change -
opening out into a star configuration which deploys the six short tail fibres (otherwise hidden
in the baseplate) which form a secondary and more stable adhesion with the target cell
surface. Simultaneously, the tail sheath contracts (below) as a wave of contraction travels
along its length (presumably from bottom to top) and the inner tail-tube with its 'needle-like' tip
penetrates the target cell envelope, with the help of lysozyme in its tip, injecting the DNA into
the host. This mechanism presumably uses stored elastic or strain energy in the various
proteins. DNA entry requires a membrane potential (possibly uses the proton gradient?).
Evidence also suggests that the whiskers bend back out of the way of the contracting tail
sheath (not shown here).

Chaperone-like proteins (gp51, gp38 and gp57A) are needed for complete tail assembly. Chaperones are
proteins which help other proteins to fold correctly. Hub assembly is assisted by gp51, short and long tail-
fiber assembly by gp57A.

Long Tail-Fiber assembly

The socket of each long tail-fiber consists of gp9 which provides a flexible socket when the tail fibers are
down in the expanded position ready for adhesion to a suitable host. The proximal segment of each long
tail-fiber consists of gp34, the distal segment of gp37 whilst gp35 and gp36 attach the distal fiber to the
proximal fiber. The gp37 tip has a hypervariable region which differs between different T-even phages
and confers host specificity (recognising specific sugar residues in the LPS of a specific host bacterial
species or strain).

Whiskers

The 6 whiskers are made of the protein wac (fibritin) and are thought to bind to the knee of the tail fiber
during assembly to facilitate attachment of the long tail-fiber to the baseplate. The whiskers also appear
to bind the knee when the phage is in the retracted configuration. When environmental conditions are
not suitable the long tail-fibers retract, being drawn up towards the tail and head. The whiskers act as an
environmental sensor and allow the long tail-fibers to drop down into the expanded configuration when
conditions are suitable. The flexible socket protein gp9 is presumably also involved in these changes of
position.

Figures: Pov-Ray models of the T4 bacteriophage, including the bound, contracted and retracted states.
Above: T4 in its retracted configuration. When environmental
conditions are unfavourable, the adhesive tail fibers are
retracted - the 'knee' pulls up to the whisker. In this 'dormant'
phase T4 can neither bind to a target cell nor initiate infection.
Above: a more detailed model of T4, showing the structure of the tail fibers.
Article updated: 25/3/2018
Suggested Reading

Bartual, S.G., Otero, J.M., Garcia-Doval, C., Llamas-Saiz, A.L., Kahn, R., Fox, G.C. and M.J. van Raaij,
2010. Structure of the bacteriophage T4 long tail fiber receptor-binding tip. PNAS 107: 20287–20292.

Labrie, S.J., Samson, J.E. and S. Moineau, 2010. Bacteriophage Resistance Mechanisms. Nature
Reviews: Microbiology 8: 317–327.

Miller, E.S., Kutter, E., Mosig, G., Arisaka, F., Kunisawa, T. and W. Ruger, 2003. Bacteriophage T4
Genome. Microbiology and Molecular Biology Reviews 67: 86–156.

Mesyanzhinov, V.V., Leiman, P.G., Kostyuchenko, V.A., Kurochkina, L.P., Miroshnikov, K.A., Sykilinda,
N.N. and M.M. Shneider, 2004. Molecular Architecture of Bacteriophage T4. Biochemistry (Moscow) 69:
1190–1202.

Baumann, R.G. and L.W. Black, 2003. Isolation and Characterization of T4 Bacteriophage gp17
Terminase, a Large Subunit Multimer with Enhanced ATPase Activity. J. Biol. Chem. 278: 4618–4627.

Fokine, A., Zhang, Z., Kanamaru, S., Bowman, V.D., Aksyuk, A.A., Arisaka, F., Rao, V.B. and M.G.
Rossmann, 2013. The Molecular Architecture of the Bacteriophage T4 Neck. J. Mol. Biol. 425:
1731–1744.

Iwasaki, K., Trus, B.L., Wingfield, P.T., Cheng, N., Campusano, G., Rao, V.B. and A.C. Steven, 2000.
Molecular Architecture of Bacteriophage T4 Capsid: Vertex Structure and Bimodal Binding of the
Stabilizing Accessory Protein, Soc. Virology 271: 321–333.

Avoiding Superinfection

Superinfection is the infection of a host by more than one parasite of a given species. This reduces the
resources available to each parasite and, although superinfection does sometimes occur, parasites often
try to avoid it. The T4 imm (immunity) gene is expressed about 4 minutes post-infection and if DNA from
another T4 or a related phage attempts to gain entry to the same host then the product of imm causes
the newly injected DNA to remain in the periplasm where it is degraded by nucleases.
The Imm protein has been modeled below using the Phyre2 web portal for protein modeling, prediction
and analysis (Kelley LA et al. 2015. Nature Protocols 10: 845-858):
Above: the DNA injection process of T4. The long tail fibres bind to LPS or OmpC in the outer membrane
of Escherichia coli, signalling to the short tail fibres to change conformation and bind LPS irreversibly to
stabilise the attachment. This signals the tail outer sheath to contract, driving the hollow needle-like core
through the outer membrane. With the help of viral lysozyme which digests the tough peptidoglycan layer
in the bacterial cell wall (the middle layer between the outer and inner membranes) this punctures the
peptidoglycan. The DNA is ejected under considerable pressure when the tail sheath contracts. A host
cell inner membrane transporter is probably largely responsible for transporting the viral DNA across the
inner membrane and into the host cell.
Viral DNA Replication

T4 encodes all the components of its own replisome (unusual amongst viruses). It encodes its own
DNA polymerase (gene 43), sliding clamp loader (genes 44 and 62), sliding clamp (gene 45), helicase
loading protein (gene 59, at least in vitro), helicase (gene 41), primase for lagging strand synthesis (gene
61) and single-stranded DNA binding protein (gene 32). T4 proteins RNase H (gene rnh) and DNA ligase
(gene 30) process and seal Okazaki fragments on the lagging strand (though host enzymes can
substitute for these). Otherwise replication is very similar to that of DNA in bacteria such as
Escherichia coli. The number of replisomes is limited and only one of several origins of replication
(Oris, sing. Ori) is used. Although the onset of DNA replication depends on Oris, most T4 replication
forks are initiated at more-or-less random positions along the genome using intermediates of
recombination as DNA primers.

Viral Gene Transcription

Transcription of viral genes occurs in three main phases: early, middle and late. Early transcription
occurs almost immediately after infection and involves at least 39 promoters (Pe, early promoters).
These promoters are stronger than host promoters with which they compete for the host's sigma-70
transcription factor (a sigma factor is a detachable component of DNA-dependent RNA polymerase, RNA-
P, which detaches soon after RNA synthesis begins and which determines the specificity of the
polymerase for different promoters). There are about 650 molecules of sigma-70 and 2000 of host core
RNA-P, which the virus utilises as well as the host. The T4 protein gpAlt enters the host along with the
T4 DNA; this is a mono-ADP-ribosyltransferase which ADP-ribosylates one alpha subunit of RNA-P to make
transcription of early T4 genes more favourable. Early gene products include the proteins ModA and
ModB, which are also ADP-ribosyltransferases and which ADP-ribosylate both alpha subunits of host
RNA-P to reduce host gene transcription (they replace a positive charge with a negative one). Thus
early transcription focuses on taking control of the host cell to divert host resources to T4 synthesis.
Many early proteins are lethal to the host.

Middle transcription occurs a few minutes later into infection and involves 30 T4 promoters (Pm,
middle promoters). These depend on T4 MotA (a transcriptional activator protein, not to be confused
with the E. coli protein of that name) and the protein AsiA, which modifies RNA-P for middle gene
transcription, reducing transcription of both host and early T4 genes. Thus, a switch occurs to the
transcription of the middle set of genes.

Late transcription focuses on virion synthesis: head, tail and fiber synthesis and the production of
structural proteins, assembly proteins and recombination genes (see below). It involves about 50 late T4
promoters (Pl, late promoters) which are activated by gp33 and sigma-55, a T4 sigma-factor.

Host Cell Lysis

When about 100-150 progeny T4 phages are assembled, host cell lysis is triggered - the host cell is
ruptured to release the new T4 particles so that they may be carried by advection and diffusion to new
target cells. (We say that the burst size is 100-150 for T4). Lysis involves a T4 lysozyme (coded for by
the e gene) and a T4 holin (coded for by the t gene). The T4 holin forms a pore in the inner
membrane to allow the lysozyme to reach the peptidoglycan cell wall and degrade it. If more phages
attack the cell more than five minutes after infection then lysis is delayed, since this indicates a lack of
new hosts for the progeny. This involves the rI protein which regulates T4 holin assembly, and T4 gprIII
extends lysis inhibition further.
The Imm protein is thought to insert itself into the host inner cell membrane. It consists of two
membrane-spanning (lipophilic alpha) helices (shown at the bottom) joined by a hydrophilic linker and
both the N-terminus (in green) and the C-terminus are in the periplasmic space between the inner and
outer membranes. On the left is a ribbon model and on the right is a space-filling model of the
polypeptide chain. This is a small protein of 65 residues. Similar proteins are produced by other
bacteriophages of Gram negative bacteria.

The Imm protein blocks entry of T4 DNA across the inner membrane, and so is thought to inhibit the
inner membrane DNA transporter of E. coli, as shown in fig. B below:
Anti-phage Biological Weapons
Imm accounts for about 80% of superinfection inhibition. Another key player is the T4 protein Sp which
inhibits the viral lysozyme to prevent the virus from breaching the tough peptidoglycan layer, or at least
making it very difficult to do so (since a combination of mechanical force and enzyme activity is most
likely employed) as shown in A above.

Bacterial Resistance

It is not just bacteriophages that can prevent further infection of their host cell by other phages; the
bacterial hosts also develop defence mechanisms and engage the phages in a molecular arms race!
Some bacteria develop various slime layers (consisting of carbohydrate chains which hydrate in water to
form slime) to physically block the phage from reaching the cell envelope (as shown in C above). Some
viruses overcome this by utilising an enzyme incorporated into their tail to digest the carbohydrate
chains of the slime and so dissolve their way through, leading to some bacteria modifying the chemical
make-up of the carbohydrates to render the enzyme ineffective. Other phages may actually use the
carbohydrate chains for initial capture from flow and adhesion to the bacterial cell surface. Bacteria can
also modify the targets to which the long tail fibres bind, such as modifying the LPS or membrane porins
or producing proteins which mask the binding target site. Of course, some viruses adapt by changing the
nature of their adhesin.

Even if the phage does successfully inject its DNA, the bacterial host has further lines of defence. Many,
if not all, bacteria produce restriction enzymes to digest foreign DNA and RNA. These enzymes
recognise specific short codes in the viral DNA (restriction sites) and cut the DNA at these points. The
host's own DNA is protected by being methylated. Some phages have mutated their DNA over time to rid
it of restriction sites, though bacteria have evolved quite a wide range of restriction enzymes. Others
have their own methylase to methylate their own DNA to disguise it as host DNA. Some phages have
unusual bases in their genetic code to prevent recognition by restriction enzymes. The DNA of T4
contains an unusual base: hydroxymethylcytosine (HMC) instead of the usual cytosine. Any restriction
enzyme whose recognition site contains a cytosine will not recognise or cut T4 DNA. Of course, some
bacteria have evolved restriction enzymes that can recognise HMC! T4 further resists these new
restriction enzymes by adding glucose molecules to (glucosylating) its HMC bases to physically block
binding of the restriction enzymes (the enzyme cannot get its mouth around the target, so to speak).
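
The idea that modified bases hide restriction sites can be shown with a toy model. Here every cytosine is replaced by a placeholder symbol standing in for HMC, so any recognition site containing a C no longer matches, while a cytosine-free site is unaffected. The DNA string, the site strings and the function names are illustrative assumptions, and real enzymes differ in how sensitive they are to modified bases.

```python
def restriction_sites(dna: str, site: str) -> list[int]:
    """Return the 0-based positions at which `site` occurs in `dna`."""
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions


def modify_cytosines(dna: str, symbol: str = "X") -> str:
    """Toy stand-in for HMC: replace every cytosine with a placeholder symbol."""
    return dna.replace("C", symbol)


toy_dna = "TTGAATTCAATTAAGGAATTCTT"      # invented sequence
protected = modify_cytosines(toy_dna)

cuts_unmodified = restriction_sites(toy_dna, "GAATTC")     # site containing a C
cuts_modified = restriction_sites(protected, "GAATTC")     # same site after modification
cuts_c_free = restriction_sites(protected, "TTAA")         # cytosine-free site

print(cuts_unmodified, cuts_modified, cuts_c_free)
```

In this simplified picture only sites that include a cytosine are protected, which matches the statement above about recognition sites containing cytosine.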

The phage-bacteria arms race is complex and the review by Labrie et al. (2010) contains more examples
and more details.