Molecular determinants of surface colonisation in diarrhoeagenic Escherichia coli (DEC): from bacterial adhesion to biofilm formation

Escherichia coli is primarily known as a commensal colonising the gastrointestinal tract of infants very early in life, but some strains are responsible for diarrhoea, which can be especially severe in young children. Intestinal pathogenic E. coli include six pathotypes of diarrhoeagenic E. coli (DEC), namely, the (i) enterotoxigenic E. coli, (ii) enteroaggregative E. coli, (iii) enteropathogenic E. coli, (iv) enterohaemorrhagic E. coli, (v) enteroinvasive E. coli and (vi) diffusely adherent E. coli. Prior to human infection, DEC can be found in natural environments, animal reservoirs, food processing environments and contaminated food matrices. From an ecophysiological point of view, DEC thus deal with very different biotopes and biocoenoses all along the food chain. In this context, this review focuses on the wide range of surface molecular determinants acting as surface colonisation factors (SCFs) in DEC. In the first instance, SCFs can be broadly discriminated into (i) extracellular polysaccharides, (ii) extracellular DNA and (iii) surface proteins. Surface proteins constitute the most diverse group of SCFs and are broadly discriminated into (i) monomeric SCFs, such as autotransporter (AT) adhesins, inverted ATs, heat-resistant agglutinins or some moonlighting proteins, (ii) oligomeric SCFs, namely, the trimeric ATs and (iii) supramolecular SCFs, including flagella and numerous pili, e.g. the injectisome, type 4 pili, curli, chaperone-usher pili or conjugative pili. This review also details the gene regulatory network of these numerous SCFs, which operates at various stages from pre-transcriptional to post-translocational levels and remains to be fully elucidated in many cases.


INTRODUCTION
Most recent phylogenetic analyses have revealed that the Escherichia genus is subdivided into eight groups containing three species, namely, Escherichia coli, Escherichia fergusonii and Escherichia albertii, as well as five clades numbered from I to V (Lawrence and Hartl 1991;Walk et al. 2009). Escherichia coli is undoubtedly the most investigated bacterial species and is used as a model organism in microbiology. This lipopolysaccharidic (LPS) diderm bacterium (archetypical Gram-negative bacterium) is primarily known as a harmless commensal of the gastrointestinal tract (GIT) (Mason and Richardson 1981;Chagnot et al. 2013). While E. coli is prevalently an inhabitant of the gut of warm-blooded animals, especially mammals but also birds, it is worth mentioning this bacterial species can also be isolated from fish, frogs or reptiles, such as crocodiles, turtles or snakes, but also insects, such as flies (Janisiewicz et al. 1999;Souza et al. 1999;Gordon and Cowling 2003;Escobar-Paramo et al. 2006;Blazar, Allard and Lienau 2011); E. coli generally appears more prevalent in herbivores and omnivores than carnivores. In humans, E. coli colonises the GIT of young children early in life and usually represents less than 1% of the human intestinal microbiota in adults (Eckburg et al. 2005).
Nevertheless, some E. coli strains possess virulence factors that enable them to cause a broad range of human extraintestinal and intestinal infections. On the one hand, extraintestinal pathogenic E. coli (ExPEC) mainly comprise the uropathogenic E. coli (UPEC), neonatal meningitis E. coli (NMEC), necrotoxigenic E. coli (NTEC) and sepsis-associated E. coli (SEPEC). On the other hand, and in addition to the adherent invasive E. coli (AIEC) associated with Crohn's disease (Mann and Saeed 2012), the intestinal pathogenic E. coli (InPEC) essentially encompass six pathotypes of diarrhoeagenic E. coli (DEC), namely, the (i) enterotoxigenic E. coli (ETEC), (ii) enteroaggregative E. coli (EAEC), (iii) enteropathogenic E. coli (EPEC), (iv) enterohaemorrhagic E. coli (EHEC), (v) enteroinvasive E. coli (EIEC) and (vi) diffusely adherent E. coli (DAEC) (Kaper et al. 2004;Croxen and Finlay 2010); of note, EHEC belong to the larger group of shigatoxin-encoding E. coli (STEC) or shigatoxin-producing E. coli, which are not all considered pathogenic as they can exhibit widely varying virulence levels, ranging from avirulence to hyper-virulence (Karmali et al. 2003;Laing et al. 2009;Monteiro et al. 2016). The pathogenicity of DEC strains is well documented and their main virulence factors are also well defined (Croxen and Finlay 2010). Some of these pathotypes are not restricted to human infections, but can be responsible for diarrhoea in animals, for instance (i) ETEC in porcines (piglets), bovines (calves) or ovines (lambs), (ii) EPEC in rabbits, dogs, cats, pigs, calves, lambs and goats and (iii) STEC in calves and piglets (Beutin 1999;DebRoy and Maddox 2001); to date, EAEC, EIEC and DAEC have not been reported as etiological agents of diarrhoea in animals. Despite the high genome plasticity demonstrating intensive gene flow, the population structure of E. coli remains mostly clonal (Touchon et al. 2009), with a clear delineation into seven principal phylogenetic groups (A, B1, B2, C, D, E and F) (Jaureguy et al. 2008;Walk et al. 2009;Tenaillon et al. 2010;Clermont et al. 2013;Beghain et al. 2018). Commensal E. coli strains generally belong to phylogroup A, whereas DEC usually belong to phylogroups A, B1, C, D and E (Jaureguy et al. 2008;Okeke et al. 2010;Croxen et al. 2013;Hazen et al. 2016;Rossi et al. 2018): (i) ETEC can be found in phylogroups A and B1 and to a lesser extent in D, (ii) EAEC are found within phylogroup A but also B1, D and to a lesser extent in B2, (iii) EPEC can belong to phylogroups E and B2, (iv) EHEC strains are mostly found in phylogroups B1 and D but also in E (with serotype O157:H7 or O104:H4), (v) EIEC are mainly present in phylogroups A, B1 and E, together with Shigella, which are essentially E. coli from phylogenetic and taxonomic perspectives (Brenner et al. 1972;Lan and Reeves 2002;Chaudhuri and Henderson 2012;Pettengill, Pettengill and Binet 2015) and (vi) DAEC mostly belong to phylogroups B2 and D (Servin 2014;Mosquito et al. 2015;Walczuk et al. 2019). This distinct grouping suggests a parallel evolution of the different pathotypes on multiple occasions, possibly with the intervention of mobile elements enabling the acquisition of specific combinations of virulence factors (Chaudhuri and Henderson 2012;Croxen et al. 2013).
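The pathotype-to-phylogroup distribution enumerated above can be captured as a small lookup table, which makes it easy to see that no single phylogroup maps onto a single pathotype. The following is only an illustrative sketch: the dictionary transcribes the enumeration in the text without weighting "mostly" or "to a lesser extent", and the function names are hypothetical helpers introduced here, not part of any published tool.

```python
# Phylogroups reported for each DEC pathotype, transcribed from the
# enumeration in the text (frequency qualifiers are deliberately ignored).
PHYLOGROUPS_BY_PATHOTYPE = {
    "ETEC": {"A", "B1", "D"},
    "EAEC": {"A", "B1", "D", "B2"},
    "EPEC": {"E", "B2"},
    "EHEC": {"B1", "D", "E"},
    "EIEC": {"A", "B1", "E"},
    "DAEC": {"B2", "D"},
}


def pathotypes_in_phylogroup(phylogroup):
    """Return the alphabetically sorted pathotypes listed for a phylogroup."""
    return sorted(
        pathotype
        for pathotype, groups in PHYLOGROUPS_BY_PATHOTYPE.items()
        if phylogroup in groups
    )


def shared_phylogroups(pathotype_a, pathotype_b):
    """Return the set of phylogroups listed for both pathotypes."""
    return PHYLOGROUPS_BY_PATHOTYPE[pathotype_a] & PHYLOGROUPS_BY_PATHOTYPE[pathotype_b]
```

For instance, `pathotypes_in_phylogroup("B1")` returns four pathotypes, which is consistent with the idea that pathotypes emerged several times in different genetic backgrounds rather than once in a single lineage.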
DEC can be found all along the food chain (Giaouris et al. 2014;Kim, Cho and Rhee 2017). They can have various environmental reservoirs, such as ruminants for EHEC, and are mainly transmitted to humans by the faecal-oral route through the consumption of contaminated food, including water, or contact with contaminated surfaces (Croxen et al. 2013). Besides anthropozoonosis, transmission can also occur from host to host between humans. In any case, the colonisation of the food chain by DEC is a major issue for the agri-food and public health sectors alike. The surface colonisation process can occur via bacterial adhesion to and/or biofilm formation on various biotic or abiotic surfaces. When the reversible adhesion to the surface by low-energy linkages (e.g. electrostatic and van der Waals interactions) is overcome, some bacteria can grow at the surface. As such, biofilm formation can be broadly defined as the sessile development of microorganisms at a surface or interface (Azeredo et al. 2017). Biofilms can be monospecies but are more generally multispecies in the natural environment, forming a complex multicellular community, which is often embedded in an exopolymeric matrix (EPM) (Costerton 1995;Costerton, Stewart and Greenberg 1999). The biofilm confers on bacterial cells an increased resistance against environmental stress, antibiotics and/or immunological defences of the host. Once the reversible adhesion is overcome, bacterial biofilm formation is per se divided into several steps: (i) initial and irreversible adhesion of bacterial cells to the surface, (ii) bacterial division at the site of adhesion resulting in the formation of microcolonies, (iii) maturation of the biofilm architecture into a three-dimensional structure and (iv) bacterial dispersion enabling the colonisation of other sites (O'Toole, Kaplan and Kolter 2000;Hall-Stoodley and Stoodley 2002). 
Biofilm formation can thus play a key role in DEC ecophysiology by enabling colonisation of various environmental niches (soil, water, vegetables, agri-food surfaces, etc.) and the asymptomatic and direct colonisation of some hosts, as well as by contributing to transmission through the food chain and ultimately human infection (Ahmed et al. 2013).
Most information about the colonisation process in E. coli is focused on the domesticated laboratory strain K12, commonly considered as representative of the E. coli species. However, this notion is biased due to the numerous and very significant genotypic and/or phenotypic differences with commensal and pathogenic E. coli isolates (Hobman, Penn and Pallen 2007). Indeed, E. coli K12 has one of the smallest genomes compared to other genome-sequenced strains of E. coli due to the loss of a large variety of genes during its domestication (Lenski 2017). With regard to the selective pressures that shape genome evolution, E. coli K12 has been replicated and studied for a long time under laboratory conditions, far from those encountered in natural environments (Hobman, Penn and Pallen 2007); some molecular determinants, including some surface colonisation factors (SCFs), could thus be lacking or misregulated in domesticated laboratory strains of E. coli compared to commensal and pathogenic E. coli isolates. As the interface between the bacterial cell and its surroundings, the molecular surface determinants are key players in the initial adhesion and sessile development processes and this review aims at summarising exhaustively the SCFs present in DEC. The complexity of the regulation network occurring at various stages, from pre-transcriptional to post-translocational levels, is also highlighted. A greater understanding of the parameters that influence adhesion and biofilm formation may inform the development of interventions to minimise DEC dissemination in the food chain, from the environment, through animals and food, to humans.
Figure 1. Schematic representation of the exopolymeric matrix (EPM) in E. coli biofilm. By analogy with the extracellular matrix (ECM) in mammalian tissue, the EPM in bacterial biofilm can be further discriminated between (i) the EPM closely associated with the bacterial cells, i.e. the cell-associated EPM (caEPM) (purple shade background), and (ii) the interstitial EPM (iEPM) (white background). Molecular determinants of the caEPM are attached, anchored or linked to the bacterial cell surface. Besides cell-surface proteinaceous determinants including monomeric proteins (not depicted in the picture) and supramolecular protein structures, such as the flagella and pili, molecular components of the caEPM further comprise extracellular polysaccharides (EPS), namely, some lipopolysaccharides (LPS) as well as poly-β-1,6-N-acetyl-D-glucosamine (PNAG) and colanic acid, which both form a capsule. Together with colanic acid that can be released from the bacterial cell surface, cellulose can compose the EPS part of the iEPM. Besides extracellular DNA (eDNA), some exoproteins (not depicted in the picture) and outer membrane vesicles (OMV) may also constitute the iEPM in E. coli biofilm.

MOLECULAR DETERMINANTS INVOLVED IN SURFACE COLONISATION BY DEC
The colonisation processes along the food chain, from natural environments, such as soil, plants and animals, to food environments, including the industrial processing food chain and food matrices, and ultimately infection or asymptomatic carriage in humans, are very complex and involve many molecular determinants. Sessile development at a surface or interface is generally accompanied by the formation of an EPM embedding the bacterial cells in biofilms (Fig. 1). These exopolymers can act as glue for adherence of the bacterial cell to the support and shape the architecture of the biofilm (Hobley et al. 2015). Furthermore, the EPM provides protection by shielding the bacteria from desiccation and antimicrobial compounds but also participates in the channelling of nutrients and signalling molecules (Sutherland 2001;Starkey et al. 2004). As such, the EPM contributes to the survival strategy and persistence of bacteria in various environmental conditions (Branda et al. 2005). Molecular determinants participating in the surface colonisation by DEC can either be closely associated with the bacterial cell surface and form the cell-associated EPM (caEPM) or be present in the extracellular milieu, namely the interstitial EPM (iEPM) (Fig. 1). At a biochemical level, EPM components can be broadly discriminated between (i) extracellular polysaccharides (EPS), (ii) extracellular DNA (eDNA) and (iii) surface proteins. Depending on the different DEC pathotypes, these various determinants can be either present or absent (Table 1). Outer membrane vesicles (OMVs) have been reported to be components of the EPM in E. coli K12 (Schooling and Beveridge 2006) and their presence in biofilm from DEC is likely, although it remains to be demonstrated. To date, there is no report of their contribution to biofilm formation in DEC, as observed in Pseudomonas aeruginosa or Helicobacter pylori (Yonezawa et al. 2009;Wang, Chanda and Zhong 2015), but it is an aspect that would deserve further investigation in DEC. Of note, poly-γ-glutamate (PGA) can be found as a component of the EPM of numerous bacteria, especially parietal monoderm bacteria (archetypical Gram-positive bacteria) and only a few LPS-diderm bacteria, where it can either be released or cell-surface attached to form a capsule (Candela and Fouet 2006;Ogunleye et al. 2015;Radchenkova et al. 2018) but, to date, this has never been reported in any E. coli strain.

Exopolysaccharides (EPS)
EPS are one of the main components of the EPM in E. coli biofilms (Beloin, Roux and Ghigo 2008). DEC can biosynthesise a variety of EPS, namely, (i) lipopolysaccharide (LPS), (ii) poly-β-1,6-N-acetyl-D-glucosamine (PNAG), (iii) colanic acid and (iv) cellulose. Because of their intimate association with the bacterial cell surface, several of these EPS can contribute to the caEPM and the formation of a so-called capsule. E. coli harbours some serotype-specific polysaccharides, namely lipopolysaccharides (LPS) (O antigen) and capsular polysaccharides (K antigen). E. coli capsules are composed of high-molecular-weight polysaccharides embedding the bacterial cells and linked to the cell surface via covalent attachments (Whitfield 2006). More than 80 capsular antigens have been reported in E. coli, which are divided into four groups, from G1 to G4 (Whitfield 2006;Yaron and Romling 2014). DEC (including EPEC, ETEC and EHEC) produce G1 and G4 capsules that share a common assembly system and can be associated with the lipid A of LPS (KLPS) or be structurally similar to the O-polysaccharides of the LPS (O-antigen capsules). During an infection, these capsules protect bacteria from opsonophagocytosis and complement-mediated killing (Whitfield 2006). In EHEC O104:H4, the capsule has been shown to play a role in bacterial survival in the environment and in direct bacterial interaction with plants (Jang and Matthews 2018).

Lipopolysaccharide (LPS)
LPS is located at the outer leaflet of the outer membrane (OM) and part of the caEPM (Raetz and Whitfield 2002). This glycolipidic polymer is formed around a toxic component, lipid A, and for this reason is also considered an endotoxin; the LPS is further composed of the core region linked to the lipid A (divided into an inner and outer part) and the O-antigen that is linked to the outer part of the core region (Raetz and Whitfield 2002). Biosynthesis and assembly pathways of LPS have been fully described and involve more than 50 genes encoded in operons or monocistrons scattered on the bacterial chromosome (Sandkvist 2001;Szalo, Taminiau and Mainil 2006). The structures of lipid A and its core region are highly conserved in E. coli but the core region has five basic structures, called R1, R2, R3, R4 and K12. Among these, R1 is the most prevalent in non-STEC clinical isolates of E. coli and R3 is more associated with STEC strains (Gibb et al. 1992;Appelmelk et al. 1994;Currie and Poxton 1999;Amor et al. 2000). In E. coli clinical isolates, R1 is most prevalent, whilst the K12 core is not detected (Gibb et al. 1992;Appelmelk et al. 1994). More than 170 O-antigens have been identified and consist of 10-25 repeating units containing one to eight sugar residues (Stenutz, Weintraub and Widmalm 2006). The O-antigen can be present (smooth LPS, also called S-LPS or LPS I, resulting in colonies with a smooth phenotype) or absent (rough LPS, also called R-LPS or LPS II, resulting in colonies with a rough phenotype) depending on the E. coli strain; if the core region is also absent, it is called deep-rough LPS (Hitchcock et al. 1986). Smooth strains are the most commonly found in nature, including in DEC, whereas the rough phenotype is more commonly found in laboratory strains (Whitfield and Keenleyside 1995;Nataro and Kaper 1998). For smooth strains, the LPS length is positively correlated with the force of adhesion (Strauss, Burnham and Camesano 2009). 
The O-antigen assists adhesion through hydrogen bonding (Tomme et al. 1996). For example, it has been demonstrated that the O-antigen enables EHEC O157:H7 strains to colonise animal hosts (Sheng et al. 2008). Mutations in LPS biosynthesis genes have been shown to affect the adhesion of E. coli to abiotic surfaces and its biofilm formation ability (Bilge et al. 1996;Genevaux et al. 1999;Landini and Zehnder 2002;Beloin et al. 2006). Additionally, LPS can either promote or inhibit biofilm formation, mainly by interacting with cell-surface-exposed adhesion factors. On the one hand, alteration of LPS synthesis can impair type 1 pili and colanic acid expression as well as bacterial motility; on the other hand, a reduction in LPS expression may unmask E. coli adhesins and thus promote adhesion or biofilm formation, as observed for an EHEC O157:H7 strain (Bilge et al. 1996;Beloin et al. 2006;Beloin, Roux and Ghigo 2008).
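The smooth, rough and deep-rough nomenclature described in this section follows a simple decision rule based on which LPS parts are present. The sketch below restates that rule only; the function name and boolean parameters are hypothetical and chosen for illustration, and the check for an O-antigen without a core simply reflects the statement that the O-antigen is linked to the outer part of the core region.

```python
def lps_phenotype(has_o_antigen, has_core):
    """Classify an LPS form from the presence of its O-antigen and core region.

    Lipid A is assumed to be present in every case.
    """
    if has_o_antigen and not has_core:
        # The O-antigen is attached to the outer core, so this combination
        # is not described in the text and is rejected here.
        raise ValueError("an O-antigen without a core region is not expected")
    if has_o_antigen:
        return "smooth LPS (S-LPS, LPS I)"
    if has_core:
        return "rough LPS (R-LPS, LPS II)"
    return "deep-rough LPS"
```

Under this rule, a typical DEC isolate would be classified as smooth, whereas many laboratory strains lacking the O-antigen would be classified as rough.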

Poly-N-acetyl glucosamine (PNAG)
PNAG is an EPS attached to the bacterial surface and is involved in biofilm formation on abiotic surfaces (Wang, Preston and Romeo 2004). The biosynthetic pathway for PNAG is encoded by the pgaABCD locus (formerly ycdSRQP). PNAG production is initiated by PgaDC, a glycosyltransferase localised on the cytoplasmic side of the inner membrane that uses UDP-N-acetyl-D-glucosamine as substrate (Wang et al. 2004;Itoh et al. 2005, 2008). The PNAG polymer is exported and anchored to the bacterial surface through the β-barrel formed by two outer membrane proteins (OMPs), namely PgaB and PgaA. Although PNAG forms a surface capsule and is one of the main components of the caEPM in diverse bacterial biofilms, the pga locus is not present in all E. coli strains (Cerca et al. 2007;Cimdins et al. 2017). In DEC, PNAG plays a role in the stabilisation of biofilm architecture (Wang et al. 2004;Al Safadi et al. 2012). It has been demonstrated to be important for biofilm formation of EHEC on sprouts and tomato roots (Matthysse et al. 2008). In vivo expression of pgaA during infection by EHEC O104:H4 suggests that biofilm formation is a key step in pathogenesis (Al Safadi et al. 2012). PNAG is also expressed by some ETEC strains and is often induced by conditions found in the environment (Gonzales-Siles and Sjoling 2016).

Colanic acid
Colanic acid is a negatively charged polymer of glucose, galactose, fucose and glucuronic acid produced by most E. coli strains, including DEC (Obadia et al. 2007). The wca operon (or cps) encodes 19 proteins, including polymerases, involved in colanic acid synthesis from sugar residues (Stevenson et al. 1996). Colanic acid actually forms the G1 capsule but a significant portion of the colanic acid produced can also be released into the extracellular milieu to contribute to the iEPM (Whitfield and Roberts 1999;Beloin, Roux and Ghigo 2008). The exact contribution of colanic acid to biofilm formation is still unclear (Matthysse et al. 2008;May and Okabe 2008). Nonetheless, by forming a protective capsule around the bacterial cell, it acts as a physical barrier that helps bacteria to survive outside the host. This capsule allows E. coli biofilms to resist osmotic and oxidative stresses as well as temperature variations (Whitfield and Roberts 1999;Chen, Lee and Mao 2004). In EHEC O157:H7, it has been shown to play a role in bacterial survival in simulated GIT fluids (Mao, Doyle and Chen 2006). In EAEC, the presence of colanic acid has been linked with the formation of large biofilm structures on the surface of sprouts (Borgersen et al. 2018). In contrast, the production of colanic acid could also mask some cell-surface adhesins and consequently impair initial adhesion to some supports (Hanna et al. 2003;Schembri et al. 2004).

Cellulose
Cellulose is a linear homopolysaccharide composed of D-glucopyranose units linked by β-1→4 glycosidic bonds. While this widespread biopolymer is generally related to plant biology, it is also present in the iEPM in some bacterial species where it plays a role in protection, maturation and structure of the biofilm (Solano et al. 2002;Ude et al. 2006). In E. coli, cellulose biosynthesis genes are located in two operons, namely, bcsQABZC and bcsEFG (Zogaj et al. 2001;Solano et al. 2002;Le Quere and Ghigo 2009). The cellulose synthase is formed by BcsAB, which catalyses cellulose biosynthesis from UDP-glucose subunits and forms a transmembrane pore across the inner membrane for cellulose export prior to secretion across the OM via a β-barrel pore formed by BcsC (Keiski et al. 2010;Omadjela et al. 2013). The role of the bcsEFG operon is still unclear but its presence is necessary for cellulose production (Solano et al. 2002). These genes are found in both commensal and pathogenic E. coli strains. Although cellulose production is essential for biofilm maturation, over-production negatively impacts biofilm formation and bacterial aggregation, possibly by coating and thus masking the adhesive properties of surface proteins such as curli (Gualdi et al. 2008). In EHEC O157:H7 and EPEC O127:H6, cellulose production has been shown to contribute to biofilm formation and, consequently, to host colonisation and survival in different environments. The involvement of cellulose in E. coli colonisation of plant materials has also been demonstrated but it depends on the vegetable: it seems dispensable for biofilm formation by E. coli O157:H7 on spinach leaves, but is required for bacterial adhesion to alfalfa sprouts (Matthysse et al. 2008;Macarisin et al. 2012). Expression of these genes in some ETEC strains is often induced at ambient temperatures, low ionic strength and nutrient limitation (Bokranz et al. 2005;Szabo et al. 2005).

Extracellular DNA (eDNA)
The importance of eDNA in biofilm maturation has been demonstrated in numerous bacterial species (Muto and Goto 1986;Kadurugamuwa and Beveridge 1995;Steinberger et al. 2002), including E. coli (Xi and Wu 2010;Nakao et al. 2012). As a component of the iEPM, eDNA serves as a structural component of the biofilm but can also contribute to a cation gradient, act as a nutrient source, induce antibiotic resistance and aid horizontal gene transfer (Bockelmann et al. 2006;Palchevskiy and Finkel 2006;Sanchez-Torres, Hu and Wood 2011). However, the role of eDNA in DEC strains remains to be elucidated. The molecular mechanism explaining the presence of eDNA has been a subject of investigation for some time; it essentially results from the release of genomic DNA upon cell lysis, following the bacteriophage lytic cycle or bacterial cell apoptosis (Palmen and Hellingwerf 1995;Steinmoen, Knutsen and Havarstein 2002;Qin et al. 2007). Nonetheless, the lysis of outer membrane vesicles (OMVs) containing DNA (Kadurugamuwa and Beveridge 1996;Whitchurch et al. 2002), as well as DNA secretion through the conjugative Type IV, subtype b, secretion system (T4bSS) (Hamilton et al. 2005;Chagnot et al. 2013), could also contribute to the presence of eDNA. The extent and respective contribution of these different mechanisms to the presence of eDNA require further investigation, especially in DEC, also considering the impact of the apparent presence of pancreatic nuclease in the intestine (Maturin and Curtiss 1977).

Cell-surface proteins
The cell surface of LPS-diderm bacteria can display a number of proteins associated with the OM. Proteinaceous determinants found at the bacterial cell surface and acting as SCFs can be broadly discriminated into (i) monomeric proteins and (ii) multimeric proteins (Fig. 2).
In the scientific literature, E. coli adhesins have generally been discriminated between fimbrial and afimbrial (or non-fimbrial). However, as in animal classification, a group is much better defined by features that are present than by the absence of some features. As such, the term afimbrial adhesins does not say anything about the nature of these adhesins. In addition, some afimbrial adhesins later appeared to be atypical fimbriae secreted by the same family of protein secretion system, e.g. the CS31A (coli surface associated 31a antigen) pili (Adams et al. 1997). For these reasons, we here propose to regroup those cell-surface proteins under the term of monomeric proteinaceous adhesins, or monomeric proteinaceous colonisation factors. Besides, the term fimbriae is not very well defined across the Bacteria kingdom when considering different bacterial species. On the contrary, the term pili can be used as a generic term encompassing the various types of pili and fimbriae, including curli or the injectisome. In addition, some cell-surface appendages contributing to surface colonisation in bacteria cannot be categorised as fimbrial adhesins per se, e.g. the flagella and the trimeric autotransporters. To avoid any ambiguity, these different cell-surface appendages are proposed to be regrouped under the term of multimeric proteinaceous colonisation factors.

Monomeric proteinaceous surface colonisation factors
In E. coli, monomeric proteins acting as SCFs include some autotransporters (ATs), inverted autotransporters (IATs) and some OMPs, but also the surface-exposed lipoprotein SslE, Efa-1 (E. coli factor adherence 1), dispersin, as well as some moonlighting proteins. Of note, the ATs (also sometimes called classical ATs) only belong to the Type V, subtype a, secretion system (T5aSS) and correspond to monomeric polypeptides with a modular organisation into at least three main regions, i.e. (i) an N-terminal signal peptide, (ii) a central passenger and (iii) a translocator at the C-terminus (Desvaux, Parham and Henderson 2003;Leo et al. 2012). ATs (T5aSS) should not be mistaken for the trimeric ATs, hybrid ATs and inverted ATs, which belong to the T5cSS, T5dSS and T5eSS, respectively.
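Because the different autotransporter families are easily confused, their assignment to Type V secretion system subtypes can be kept as a simple lookup. This is a minimal sketch: the family labels and the function name are informal choices made for this example, and trimeric ATs are assigned to T5cSS following the usual T5SS nomenclature.

```python
# T5SS subtype for each autotransporter family named in the text.
# The string keys are informal labels chosen for this sketch.
T5SS_SUBTYPE = {
    "classical AT": "T5aSS",
    "trimeric AT": "T5cSS",
    "hybrid AT": "T5dSS",
    "inverted AT": "T5eSS",
}

# Modular organisation of a classical AT, ordered from N- to C-terminus.
CLASSICAL_AT_REGIONS = ("signal peptide", "passenger", "translocator")


def secretion_subtype(family):
    """Return the T5SS subtype for a family label (case-insensitive)."""
    lookup = {name.lower(): subtype for name, subtype in T5SS_SUBTYPE.items()}
    key = family.strip().lower()
    if key not in lookup:
        raise KeyError(f"unknown autotransporter family: {family!r}")
    return lookup[key]
```

A call such as `secretion_subtype("inverted AT")` makes explicit that an inverted AT is not a classical (T5aSS) autotransporter, even though both are treated as monomeric SCFs in this section.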

Autotransporter adhesins (ATAs)
ATAs enable direct adhesion to abiotic supports, e.g. glass, stainless steel or plastic ware, and/or biotic surfaces, e.g. mammalian cells or extracellular matrix (ECM) components such as collagens (Vo et al. 2017). As such, they can also belong to the MSCRAMM (microbial surface components recognizing adhesive matrix molecules) proteins (Chagnot et al. 2012).
In EHEC, several enterohaemorrhagic E. coli autotransporters (Eha) have been identified (Wells et al. 2008). Among them, EhaB has been shown to promote bacterial cell binding to laminin and collagen I (Wells et al. 2008;Wells et al. 2009), whereas EhaJ causes strong adherence to fibronectin, fibrinogen, collagens II, III and V, and laminin (Easton et al. 2011). EhaB has also been identified in EPEC and ETEC (Zude, Leimbach and Dobrindt 2014). Immediately adjacent to the ehaJ gene, egtA encodes a glycosyltransferase. EhaJ requires glycosylation to mediate strong biofilm formation but not for adhesion to ECM components (Easton et al. 2011). Following genomic analysis, ehaJ appears to be also present in EAEC, EIEC and ETEC, where its function is still unknown. In EPEC, its exact function in the colonisation process remains unclear, as it does not seem to be required for bacterial adhesion and biofilm formation (Easton et al. 2011). While EhaD has been shown to mediate biofilm formation, its role in bacterial adhesion has not been determined yet and its contribution to sessile development in DEC would require more in-depth investigation (Wells et al. 2008). In the laboratory strain E. coli K12, the EhaD homologue YpjA has been shown to promote adhesion to glass and polyvinyl chloride (PVC), as well as biofilm formation together with the EhaC homologue YfaL and YcgV (Roux, Beloin and Ghigo 2005). In EHEC, however, EhaC was not shown to promote biofilm formation (Wells et al. 2008). A homologue of ycgV has been genetically identified in several DEC, namely, EPEC, ETEC, EAEC and EIEC (Wells, Totsika and Schembri 2010;Zude, Leimbach and Dobrindt 2014). Altogether, this information emphasises the need for further experimental characterisation of the adhesive functions of Eha proteins, particularly considering the diversity of DEC.
Some ATs originally identified in UPEC and acting as adhesins have been identified in DEC, namely, UpaB (uropathogenic E. coli autotransporter B) and UpaI (Zude, Leimbach and Dobrindt 2014). From UPEC investigations, these proteins appeared to promote adhesion to a wide range of ECM components (Allsopp et al. 2012;Zude, Leimbach and Dobrindt 2014), whilst UpaI was further demonstrated to mediate biofilm formation (Zude, Leimbach and Dobrindt 2014). Although the genes are found in EPEC and STEC, none of them have been functionally characterised in any DEC to date (Zude, Leimbach and Dobrindt 2014).
Following genomic analysis, AatA (avian pathogenic E. coli autotransporter A) appears to be also present in some DEC strains (Zude, Leimbach and Dobrindt 2014). In APEC (avian pathogenic E. coli), AatA is important for pathogenesis as it enhances adhesion to chicken fibroblast cells (Dai et al. 2010;Li et al. 2010). However, its role and contribution in DEC are still unknown.

Self-associating autotransporters (SAATs)
SAATs are primarily able to associate with one another, resulting in bacterial cell autoaggregation (Klemm, Vejborg and Sherlock 2006). In E. coli, the SAATs regroup ATs from the Ag43 (antigen 43), AIDA-I (adhesin involved in diffuse adherence phenotype) and TibA (toxigenic invasion locus b) families (Trunk, Khalil and Leo 2018). Of note, SAATs differ from ATAs as they do not necessarily play a role in direct adhesion to biotic or abiotic surfaces but can nonetheless contribute directly or indirectly to surface colonisation.
Ag43 is probably the SAAT that has triggered the most research to date, with most of the information resulting from investigations in the E. coli K12 laboratory strain (van der Woude and Henderson 2008). Besides autoaggregation, Ag43 has been demonstrated to increase biofilm formation on abiotic surfaces (Kjaergaard et al. 2000) and adhesion to epithelial cells (de Luna et al. 2008) but to decrease bacterial motility (Ulett et al. 2007a,b). The gene encoding Ag43 has been shown to be highly expressed during the early stage of biofilm formation (Schembri, Kjaergaard and Klemm 2003) but not in mature biofilms (Beloin et al. 2004). While biofilm formation is favoured by the autoaggregation phenomenon (van der Woude and Henderson 2008), Ag43 is not involved in gut colonisation (de Luna et al. 2008). It is also known that the expression of pili can shield the interaction between Ag43 molecules and thus prevent autoaggregation (Korea et al. 2010). Phylogenetic analysis revealed that the agn43 gene is distributed into two subfamilies, namely, subfamily I (SF-I) and SF-II, and is found in some, but not all, E. coli (including some Shigella spp.) (van der Woude and Henderson 2008). It has been suggested that agn43 is more prevalent in pathogenic E. coli strains than in commensal E. coli strains (van der Woude and Henderson 2008). It can be detected as a single gene copy, as in E. coli K12, or in multiple alleles, as in EHEC O157:H7 EDL933 where two identical copies are found in two different pathogenicity islands, namely the O-island 43 (OI-43) and OI-48. In UPEC CFT073, Ag43 is encoded by two different alleles, namely agn43a and agn43b (Ulett et al. 2007b). Compared to the Ag43 encoded by allele a, Ag43 from allele b showed slower autoaggregation kinetics and a lower propensity for biofilm formation.
Autoaggregation results from the L-shaped structure of the Ag43 passenger region, which drives molecular interaction via salt bridges and hydrogen bonds along the β-helix structure in a molecular Velcro-like handshake mechanism (Heras et al. 2014). In E. coli O157:H7 EDL933, Ag43 was shown to promote autoaggregation, calcium binding and biofilm formation but was unable to mediate adhesion to epithelial cells. While Ag43 is present in other DEC, such as EPEC, ETEC and EAEC (Zude, Leimbach and Dobrindt 2014;Vo et al. 2017), its function in these different pathotypes has not been examined in detail to date. Most recently, phylogenetic network analysis revealed that the Ag43 passengers are distributed into four distinct classes, namely, C1, C2, C3 and C4 (Ageorges et al. 2019). Structural alignment and modelling analyses indicated that the N-terminal and C-terminal regions of the passengers each belong to two different subtypes, which gave rise to these four distinct Ag43 classes upon domain shuffling. Functional analyses demonstrated that expression of Ag43 C3 (to which both agn43a and agn43b from UPEC CFT073 belong) induced slower sedimentation kinetics of bacterial cells and smaller aggregates compared to the three other Ag43 classes (Ageorges et al. 2019). Using prototypical Ag43 C1 from E. coli K12 MG1655, Ag43 C2 from EHEC EDL933, Ag43 C3 from UPEC CFT073 (allele agn43b) and Ag43 C4 from ETEC H10407, it appeared that heterotypic interactions occurred in a very limited number of cases compared to homotypic interactions. This ability of Ag43 variants to specifically identify genetic copies of themselves in other bacterial cells through Ag43-Ag43 interactions further suggests a greenbeard effect (Gardner and West 2010;Wall 2016), the ecophysiological relevance of which undoubtedly requires further investigation (Ageorges et al. 2019).
AIDA-I is involved in the diffuse adherence of DEC strains (Benz and Schmidt 1989;Benz and Schmidt 1992) and also in bacterial autoaggregation, biofilm formation and adherence to a wide range of human and non-human cells (Benz and Schmidt 1989;Sherlock et al. 2006). While the function of AIDA-I is quite similar to that of Ag43, they clearly belong to different protein families (Vo et al. 2017). The gene encoding AIDA-I is especially prevalent in ETEC and STEC strains of porcine origin, which suggests pigs as a main animal reservoir for this gene (Niewerth et al. 2001;Ha et al. 2003). In EPEC, the AIDA-I gene (aidA) is associated with aah, which encodes a 45-kDa heptosyltransferase (Benz and Schmidt 2001). These genes are plasmid located and transcribed as a bicistronic mRNA, but their expression seems to be restricted to a small number of DEC strains (Owen et al. 1996;Sherlock et al. 2004). Aah (adhesin associated heptosyltransferase) modifies AIDA-I by addition of 19 heptose residues on average, which enables EPEC to adhere to human cells (Benz and Schmidt 1992;Benz and Schmidt 2001;Laarmann and Schmidt 2003;Schembri, Dalsgaard and Klemm 2004). In EHEC O157:H7, though, AIDA-I does not play a role in adherence to cultured cells or to pig intestinal epithelial cells (Yin et al. 2009). This suggests that different subfamilies or classes of AIDA-I could exist, as observed for Ag43, which would require further in-depth investigation.
TibA has been found to self-aggregate, promote biofilm formation and facilitate colonisation of the intestinal epithelia (Sherlock, Vejborg and Klemm 2005; Cote and Mourez 2011). In ETEC, TibA is encoded by the tib operon, which also encodes the glycosyltransferase TibC (Lindenthal and Elsinghorst 1999). Glycosylation of TibA is important for its function since its unglycosylated form is less stable, cannot oligomerise properly and in turn cannot promote bacterial adhesion to epithelial cells (Cote, Charbonneau and Mourez 2013); nonetheless, it can still autoaggregate, promote biofilm formation and cell invasion. Interestingly, TibA, AIDA-I and Ag43 have been reported to interact with one another, resulting in the formation of mixed bacterial aggregates (Klemm, Vejborg and Sherlock 2006). These findings deserve further in-depth characterisation, especially with regard to recent findings where the interactions between Ag43 variants appear quite specific (Ageorges et al. 2019).
In E. coli O157:H7, EhaA has been shown to mediate autoaggregation and adhesion to primary epithelial cells derived from the bovine terminal rectum, as well as biofilm formation (Wells et al. 2008). As such, EhaA can be considered an additional member of the SAATs, which is also found in EAEC, EPEC and ETEC (Vo et al. 2017). Similarly, UpaC was reported to promote autoaggregation, as well as biofilm formation (Zude, Leimbach and Dobrindt 2014). UpaC is found in a wide range of InPEC (Zude, Leimbach and Dobrindt 2014). Of note, some ATAs such as UpaI can further promote autoaggregation to some extent (Zude, Leimbach and Dobrindt 2014).

Serine protease autotransporters from Enterobacteriaceae (SPATEs)
SPATEs correspond to a subfamily of protease autotransporters that specifically exhibit a serine protease domain (IPR034061) in the passenger region (Rojas-Lopez et al. 2017). While their primary function is associated with the degradation of various proteins, such as mucin or haemoglobin, they can contribute to bacterial virulence via their cytotoxic effect, and some can even be involved in bacterial colonisation (Dautin 2010).
In EHEC, EspP (extracellular serine protease plasmid-encoded), also known as PssA (protein secreted by Stx-producing E. coli), contributes to biofilm formation, to bacterial adherence to intestinal epithelial cells, including bovine primary rectal cells, and to colonisation of the bovine intestine (Dziva et al. 2007;Puttamreddy, Cornick and Minion 2010;Farfan and Torres 2012). EspP is encoded on the pO157 plasmid and can be found in diverse STEC isolates (van Diemen et al. 2005;Dziva et al. 2007;Ruiz-Perez and Nataro 2014). At the bacterial cell surface, EspP passenger domains self-assemble to form supramolecular structures, called ropes (Xicohtencatl-Cortes et al. 2010). Besides cytopathic activities, the EspP ropes have strong adhesive properties to host epithelial cells and can further serve as a substratum for bacterial adherence and biofilm formation. Similar observations have also been made for EspC from EPEC (Xicohtencatl-Cortes et al. 2010).
In EAEC, Pic (protein involved in colonisation) is involved in mucin degradation but also directly in mucin binding (Gutierrez-Jimenez, Arciniega and Navarro-Garcia 2008; Andrade et al. 2017). It thus participates in intestinal colonisation and may also be involved in the formation of bacterium-mucus biofilm (Navarro-Garcia and Elias 2011). Pic is also expressed by the hybrid EHEC/EAEC E. coli O104:H4 but its exact contribution to the colonisation process in this genetic background remains to be ascertained (Harrington et al. 2009;Abreu et al. 2015;Abreu et al. 2016). Of note, Shmu is a mucinase identical to Pic found in Shigella (Rajakumar, Sasakawa and Adler 1997).

Inverted autotransporters (IATs)
In IATs, which correspond to the Type V, subtype e, secretion system (T5eSS), the translocator is located in the N-terminal region and the passenger at the C-terminal, which is the opposite of the modular organisation found in ATs (Tsai et al. 2010;Oberhettinger et al. 2012). In DEC, there are several IATs acting as SCFs, namely, intimin, FdeC (Factor adherence of E. coli) and YeeJ. More recently, additional IATs have been identified in E. coli, where iatA appeared quite prevalent but the functional characterisation of the gene product is still awaited (Goh et al. 2019). IatB, IatC and IatD from an environmental E. coli strain were further shown to be involved in strong biofilm formation when overexpressed in a recombinant E. coli K12 background, but not in autoaggregation nor adhesion to ECM proteins (Goh et al. 2019). While identified in several DEC, their role and contribution in their native genetic background is still unknown.

Intimin
Intimin is the prototypical member of IATs (Leo, Grin and Linke 2015). In EPEC and EHEC, intimin is encoded by the eae (for E. coli attachment effacement) gene in the locus of enterocyte effacement (LEE) (Nataro and Kaper 1998). This protein interacts specifically with its receptor Tir (translocated intimin receptor), allowing the establishment of the intimate attachment of the bacteria to the host cell, pedestal formation and attaching/effacing (A/E) lesions (Schmidt 2010). In addition, intimin contributes to intestinal colonisation in a Tir-independent manner (Mallick et al. 2012). Intimin may also bind to alternative receptors such as β1 integrins or nucleolin but this remains to be clarified (Liu, Magoun and Leong 1999;Leo et al. 2015).

Factor adherence of E. coli (FdeC)
FdeC is a widespread IAT in E. coli and is present in all DEC pathotypes (Nesta et al. 2012;Easton et al. 2014). In EHEC O26:H11, FdeC was shown to contribute to biofilm formation and potentially to colonisation of the terminal rectum of cattle (Easton et al. 2014).

YeeJ
More recently, the gene encoding YeeJ has been reported to be present in some DEC, namely, EHEC, EPEC, ETEC and EIEC (Martinez-Gil et al. 2017). In E. coli K12, this IAT has been shown to participate in biofilm formation. While YeeJ exists as two distinct variants of different lengths, no functional difference could be detected between them. However, the contribution of YeeJ to biofilm formation in DEC remains to be established.

Outer membrane protein A (OmpA)
While OmpA was originally considered a pore-forming protein (Sugawara and Nikaido 1992), whether the OmpA β-barrel offers a channel for the continuous passage of water or solutes remains controversial. Nowadays, OmpA is rather viewed as a multifaceted protein with functions of an adhesin as well as an invasin. In EHEC O157:H7, OmpA is involved in adhesion to intestinal epithelial cells (Torres and Kaper 2003;Kudva et al. 2015). OmpA further appears to be the key molecular determinant for bacterial adhesion to plant surfaces, such as alfalfa sprouts. The role of OmpA as an invasin was demonstrated in NMEC (Prasadarao et al. 1996) but remains to be established in DEC.
Interestingly, OmpA can be encoded by at least two different alleles, namely ompA1 and ompA2 (Power et al. 2006). Many of the interaction properties of OmpA emanate from protein loops external to the OM, which are displayed on the bacterial cell surface; in the two alleles, differences in these regions could influence the adhesin and/or invasin properties of the protein. Of note, OmpA further serves as a receptor for bacteriophages and bacteriocins (Smajs, Pilsl and Braun 1997;Power et al. 2006). Regarding biofilm formation, the direct contribution of OmpA remains controversial; while OmpA from E. coli K12 has been shown to bind to abiotic surfaces and to significantly influence biofilm formation (Lower et al. 2005;Barrios et al. 2006), the role of OmpA in EHEC O157:H7 biofilm formation appears to be minor and it acts as a modulator rather than a contributor to sessile development (Kudva et al. 2015). Keeping in mind that OmpA is an important contributor to the structural integrity of the bacterial cell envelope by bridging the OM and cell wall, along with lipoproteins (Wang 2002), phenotypes of OmpA mutants must be interpreted with caution because of possible confounding pleiotropic effects. Further investigations on these various aspects are clearly needed, and in particular the allelic variation of OmpA should be more carefully considered to decipher the exact role of each allele.

Heat-resistant agglutinin (Hra)
The Hra family of OMPs was first described with Hek (haemagglutinin from E. coli K1) in NMEC, where it was reported to promote autoaggregation, interactions with human erythrocytes and epithelial cells, as well as adhesion to, and invasion of, epithelial cells (Fagan and Smith 2007). Hek was originally identified because of its homology with Tia (toxigenic invasion protein A) (Bhargava et al. 2009). In ETEC, Tia mediates attachment to intestinal epithelial cells as well as their invasion (Fleckenstein et al. 1996;Sjoling, von Mentzer and Svennerholm 2015). It also appears to bind several mammalian heparan sulphate binding proteins, suggesting that ETEC use these ubiquitous cell surface heparan sulphate proteoglycans as receptors to adhere to and invade host epithelial cells (Fleckenstein, Holland and Hasty 2002).
In EAEC 042, Hra1 (heat-resistant agglutinin 1) was demonstrated to be responsible for autoaggregation and aggregative adherence, as well as biofilm formation (Bhargava et al. 2009). While these observations were made upon protein expression in nonadherent and nonpathogenic laboratory E. coli strains, an EAEC 042 hra1 deletion mutant was not deficient in these phenotypes, indicating that Hra1 is an accessory colonisation factor in this genetic background. While hra1/hek was originally considered to be absent from DEC and restricted to UPEC, NMEC and sepsis E. coli (Dobrindt et al. 2002;Cooke et al. 2010), it later became clear that hra1 and tia are common among DEC, especially EAEC but also EPEC (Fleckenstein et al. 1996;Mancini et al. 2011). In the EAEC strain 60A, Hra2 is not involved in autoaggregation or invasion, but only in adherence to epithelial cells (Mancini et al. 2011); its involvement in bacterial adhesion to abiotic supports and biofilm formation remains to be elucidated. The prevalence of hra2, however, seems to be very low among DEC.
More recently, a novel member of the Hra family has been identified in STEC, namely, Hes (Hemagglutinin from shigatoxin-encoding E. coli) (Montero et al. 2017). Hes was shown to promote autoaggregation and biofilm formation as well as erythrocyte agglutination and adherence to epithelial cells, but not invasion. The gene was observed to be present in LEE-negative STEC but not LEE-positive STEC (Montero et al. 2017).

Iron-regulated protein A homologue adhesin (Iha)
Iha is an adherence-conferring protein homologous to IrgA (iron-regulated protein A) found in Vibrio cholerae (Tarr et al. 2000). In addition to a β-barrel structure enabling membrane anchoring as in any OMP, Iha has externally exposed domains. Rather than localised adherence, Iha confers a diffuse adherence pattern in E. coli O157:H7. Besides STEC, iha has been identified in EPEC and UPEC (Szalo et al. 2002;Kanamaru et al. 2003;Gomes et al. 2011). In UPEC, Iha was shown to further act as a catecholate siderophore receptor and a virulence factor (Johnson et al. 2005) but these roles in DEC remain to be established. In EHEC, Iha has been clearly demonstrated to be involved in intestinal colonisation and to contribute to pathogenesis by promoting adherence to the intestinal epithelium (Yin et al. 2009).

Secreted and surface-associated lipoprotein of E. coli (SslE)
SslE, formerly known as YghJ (Iguchi et al. 2009), was recently described as a novel E. coli mucinase thanks to its zinc metallopeptidase motif (Luo et al. 2014;Nesta et al. 2014). This protein is secreted by a Type II, subtype a, secretion system (T2aSS) but the molecular mechanisms of its maturation as a surface lipoprotein remain unclear. The gene encoding SslE is present in different DEC pathotypes such as EPEC, ETEC and EHEC (Decanio, Landick and Haft 2013). In EPEC, SslE was shown to mediate biofilm formation and intestinal colonisation (Baldi et al. 2012;Vermassen et al. 2019). This protein can be divided into two main variants, and antibodies raised against variant I (from ExPEC strain IHE3034) are able to inhibit translocation of E. coli strains through a mucin-based matrix. In addition, immunisation of animals with SslE I significantly reduces gut colonisation by strains of different pathotypes expressing SslE II (Nesta et al. 2014). These observations make SslE a key factor in E. coli colonisation of the mucosal surface in humans and suggest that it could serve as a component of a protective vaccine against DEC (Naili et al. 2016;Naili et al. 2017;Rojas-Lopez et al. 2018;Rojas-Lopez et al. 2019).

E. coli factor adherence 1 (Efa-1)
Efa-1, also known as LifA (lymphostatin A), present in EPEC and some non-O157 EHEC strains, is known to inhibit the proliferation of mitogen-activated lymphocytes and the synthesis of proinflammatory cytokines and gamma interferon (Klapproth et al. 2000;Abu-Median et al. 2006). Efa-1 has been shown to mediate colonisation of the calf intestine independently of its glycosyltransferase and cysteine protease motifs (Deacon et al. 2010). In EHEC O157 strains, ToxB is homologous to Efa-1 and appears to contribute to adherence to cultured intestinal epithelial cells (Tatsuno et al. 2001). However, no lymphostatin-like activity has been associated with this protein and it is not involved in intestinal colonisation in animal models (Abu-Median et al. 2006). While Efa-1 has an extracytoplasmic domain and is presumably cell-surface exposed (Nicholls, Grant and Robins-Browne 2002), the molecular mechanisms at play for its secretion and cell-surface display remain unknown.

Dispersin
Dispersin is an anti-aggregation protein (Aap) involved in the spreading of bacterial cells along the host intestinal mucosa (Sheikh et al. 2002). This protein contributes to adherence and colonisation of EAEC by preventing hyper-aggregation and collapse of AAF (aggregative adherence fimbriae). Dispersin is present at the bacterial cell-surface via binding to LPS in a non-covalent manner after secretion through a Type I secretion system (T1SS) (Velarde et al. 2007). This secretion system and cognate-secreted protein are encoded in the aat (aggregative ABC transporter) locus located in the pAA plasmid of some EAEC (Nishi et al. 2003). Dispersin is also present in some STEC strains (Muniesa et al. 2012).

Moonlighting proteins
At the bacterial cell surface of E. coli, some unexpected proteins primarily known to be localised in the cytoplasm have been reported. Among these unexpected cell surface proteins, glycolytic enzymes are frequently uncovered (Henderson and Martin 2011). These so-called moonlighting proteins have been demonstrated to exhibit a secondary function at the bacterial cell-surface, completely unrelated to their primary function in the cytoplasm (Khan et al. 2014). As a common glycolytic enzyme frequently found at the bacterial cell surface, GAPDH (glyceraldehyde 3-phosphate dehydrogenase) has been demonstrated to bind plasminogen and fibrinogen in EHEC and EPEC (Egea et al. 2007), although there is no evidence of GAPDH acting directly as a plasminogen activator (Coleman and Benach 1999;Seidler 2013). In addition, GAPDH is clearly involved in adhesion to intestinal epithelial cells upon infection. A common theme for moonlighting proteins present at the bacterial cell surface is that these proteins lack an N-terminal signal peptide for translocation across the CM and that the protein secretion systems enabling their translocation across the OM are often unknown, which is covered by the generic term of non-classical protein secretion (Bendtsen and Wooldridge 2009;Desvaux et al. 2009b). For GAPDH, though, secretion has been strongly suggested to occur via piggybacking through the Type III, subtype a, secretion system (T3aSS) (Aguilera et al. 2012). While enolase is also known to be extracellularly located in E. coli (Boel et al. 2004), its contribution to bacterial adhesion remains to be determined. The elongation factor Tu (EF-Tu) is also found at the bacterial cell surface and has been reported to be involved in bacterial aggregation (Amimanan et al. 2017). In DEC, the contribution of putative moonlighting glycolytic enzymes and other moonlighting proteins to the colonisation process deserves more thorough investigation.

Multimeric proteinaceous surface colonisation factors
Multimeric protein complexes acting as SCFs can be classified as (i) homooligomeric proteins, namely, the trimeric autotransporter adhesins (TAAs) and (ii) cell-surface supramolecular structures, including flagella and numerous pili.

UPEC autotransporter G (UpaG)
While UpaG was originally identified in UPEC, it was also found in the EAEC 042 strain (Zude, Leimbach and Dobrindt 2014). UpaG is involved in autoaggregation, biofilm formation, adhesion to fibronectin and laminin, as well as human epithelial cells (Valle et al. 2008). In EHEC, EhaG (EHEC autotransporter G) is a positional orthologue of UpaG, which is also involved in autoaggregation, biofilm formation, adhesion to laminin, fibronectin and collagens I, II, III and IV as well as some epithelial cells (Valle et al. 2008;Totsika et al. 2012;Zude, Leimbach and Dobrindt 2014). The gene encoding EhaG has also been identified in a wide range of DEC including EPEC, EIEC, ETEC and EAEC (Zude, Leimbach and Dobrindt 2014).

E. coli immunoglobulin-binding protein (Eib)
Eibs were originally characterised for their ability to bind immunoglobulin fractions, especially to the Fc (fragment crystallisable) region of IgA and IgG (Sandt and Hill 2000;Sandt and Hill 2001;Leo and Goldman 2009); up to seven different Eibs have been identified to date, namely, EibA, B, C, D, E, F and G. In LEE-negative STEC O91, it further appeared that EibG is involved in adherence to epithelial cells in a chain-like adhesion (CLA) pattern (Lu et al. 2006). CLA corresponds to the formation of a long chain cell aggregate, which EibG induces on both human and bovine intestinal epithelial cells. The gene encoding EibG is distributed into 21 different alleles clustered into three eibG subtypes, namely, eibG-α, -β and -γ (Merkel et al. 2010). While EibG-α and EibG-β are responsible for the typical CLA phenotype, EibG-γ induces adherence in much shorter cell chains and smaller cell aggregates, corresponding to an atypical CLA. EibD has been further shown to promote autoaggregation and biofilm formation (Leo et al. 2011). Considering their structural similarity, other Eibs have been suggested to have similar biological functions but experimental confirmation is still required to ascertain this. Eib genes are found in some STEC strains, as well as some E. coli commensal strains (Lu et al. 2006).

STEC-autotransporter mediating biofilm formation (Sab)
Sab contributes to the diffuse adherence of STEC to human epithelial cells and to biofilm formation on abiotic surfaces (Farfan and Torres 2012). Genes encoding Sab are especially present in LEE-negative STEC.

STEC autoagglutinating adhesin (Saa)
Saa promotes adhesion to HEp-2 cells in a semilocalised adherence pattern (Paton et al. 2001). So far, the saa gene has only been reported in some STEC, including some LEE-negative EHEC strains (Paton and Paton 2002;Jenkins et al. 2003;Monaghan et al. 2011).

Cell-surface supramolecular structures
Flagella and pili are organelles resulting from the supramolecular assembly of different protein subunits to form heteromultimeric protein complexes on the bacterial cell-surface.

Flagella
Flagellar components are secreted and assembled via the Type III, subtype b, secretion system (T3bSS) and more than 50 genes divided in three hierarchical classes are involved in the flagellar apparatus formation (Young, Schmiel and Miller 1999;Chilcott and Hughes 2000). The main component of the flagellum filament is the flagellin, which has considerable diversity in ultrastructure and is responsible for the H-antigen variability (H1 to H56) (Zhou et al. 2015). In E. coli, the flagellation is peritrichous but the sites of cell surface localisation and the number of flagella (typically around 6-10) are considered random (Macnab 1987a,b). Nonetheless, it must be stressed that when swimming, the flagella in motion coalesce into an undulating bundle, forming one rigid helical ponytail about 14 nm in diameter and 10 μm long that appears as polarly localised in E. coli (Bray 2001). A swimming bacterial cell has a run-and-tumble behaviour, where it progresses linearly (run) and then changes direction abruptly (tumble), but also a slow-random-walk behaviour, where it moves at a relatively low speed (Qu et al. 2018). Upon chemotaxis, the rotational direction of the flagellar motor can be switched to control motility, a factor that might help the cell approach the intestinal mucosa in a more coordinated movement (Kitao and Hata 2018;Rossi et al. 2018). The approach to the surface is an important step towards initial bacterial adhesion and subsequent sessile development. Active motility involving the flagella allows the bacterial cells to overcome repulsive electrostatic and hydrodynamic forces at the adhesion site (Donlan 2002).
Besides swimming, flagella can participate in an alternative type of motility called swarming, where bacterial cells move and spread on a surface (Kaiser 2007). Swarming directly contributes to the surface colonisation process and is associated with the expression of an alternative system, the lateral flagella (Merino, Shaw and Tomas 2006). In EAEC 042, the Flag-2 locus encodes such a system (Ren et al. 2005), although a frameshift mutation has likely inactivated this system in this strain. Nonetheless, the Flag-2 cluster appeared to be present in about 20% of E. coli strains from the ECOR collection. In the environmental strain E. coli SMS-3-5, although the Flag-2 gene cluster is complete and intact, swarming motility could not be observed (Fricke et al. 2008); to date, the functionality of this system in E. coli remains to be elucidated. In the absence of polar flagella, E. coli is not as efficient at surface colonisation but is still considered a temperate swarmer, enabling it to swarm over surfaces with rheology corresponding to 0.5%-0.8% agar (in comparison to ≥1.5% agar for robust swarmers) (Partridge and Harshey 2013).
Besides motility, flagella can directly act as adhesins, as shown in EPEC, where they are involved in adhesion to epithelial cells (Giron et al. 2002;Cleary et al. 2004). In EAEC, flagella contribute to adhesion to plant leaves (Berger et al. 2009). In EHEC, the flagellin FliC favours initial attachment, adhesion to epithelial cells and biofilm formation on abiotic surfaces as well as spinach leaves (McNeilly et al. 2008;Mahajan et al. 2009;Vikram et al. 2013;Nagy et al. 2015). In ETEC, flagella contribute to bacterial adhesion to salad leaves and intestinal epithelial cells, as well as biofilm formation (Shaw et al. 2011;Duan et al. 2012;Zhou et al. 2013;Zhou et al. 2014). Interestingly, in this pathotype, flagella can also mediate indirect adhesion through EtpA (ETEC two-partner secretion protein A), a protein secreted by a T5bSS (two-partner secretion system), which bridges the flagella with host cell receptors, thus allowing bacterial cell attachment to some epithelial cells and mucin-expressing regions in the mouse small intestine (Fleckenstein et al. 2006;Roy et al. 2009). In EHEC and EPEC, the adhesion of H6 and H7 flagella to the intestinal epithelium and epithelial cells has been suggested to occur through mucins (Giron et al. 2002;Mahajan et al. 2009) as reported for H1 flagella from the probiotic E. coli Nissle 1917 (Troge et al. 2012). In some EHEC/STEC strains, namely LEE-negative EHEC O113:H21 and STEC O139:H1:F18ab strains, flagella can also contribute to bacterial invasion of intestinal epithelial cells but the molecular mechanisms at work remain to be clarified (Luck et al. 2006;Rogers et al. 2012). These latter aspects undoubtedly deserve further in-depth investigation. While different flagellin variants have been shown to be involved in direct binding to host cells, such as H1 and H19 flagella in ETEC (Duan et al. 2012), a systematic analysis of the colonisation properties of all of the different H-antigens in E. coli has not yet been performed. Except for EIEC, which are generally considered nonmotile (Nataro and Kaper 1998), the contribution of flagella as a motility factor versus an adhesion factor in the colonisation processes has not yet been clearly resolved in DEC, particularly regarding bacterial adhesion to, and biofilm formation on, biotic and abiotic surfaces (Servin 2014).

Pili
Pili, also referred to in the E. coli literature as fimbriae, are key actors during the initial attachment of bacteria to surfaces, which is characterised by a stronger and longer interaction coupled with a decrease of bacterial motility (Pruss et al. 2006). While binding can be considered reversible, as evidenced for the binding of chaperone-usher fimbriae to lectin (Hultgren et al. 1989;Lin et al. 2002), bacterial binding can also be very strong due to the numerous pili expressed simultaneously by a single cell creating an avidity effect, as well as the flexibility of the stalk itself (Andersson et al. 2006). These pili can be secreted and assembled by different protein secretion systems, namely, the Type II, subtype c (T2cSS), Type III, subtype a (T3aSS), Type IV, subtype b (T4bSS), Type VII (T7SS) or Type VIII (T8SS) secretion systems (Figure 2). It should be stressed that this numerical protein secretion nomenclature was intended for, and restricted to, the LPS-diderm bacteria in the first place (Desvaux et al. 2009a). In mycolate diderm bacteria (archetypical acid-fast bacteria, namely, mycobacteria) and some parietal monoderm bacteria, the ESX (ESAT-6) system involved in protein export across the IM (or cytoplasmic membrane) was also termed T7SS, which is (i) misleading when considering that no ESX component enabling protein translocation across the mycolic outer membrane has yet been identified (Converse and Cox 2005;Bitter et al. 2009;Groschel et al. 2016;Bosserman and Champion 2017;Unnikrishnan et al. 2017;Vaziri and Brosch 2019) and (ii) a misnomer with respect to both the bacterial export systems (and especially parietal monoderm bacteria), which do not follow the numerical nomenclature (e.g. Sec or Tat), and the numerical nomenclature for protein secretion systems in LPS-diderm, which is primarily based on the presence of a translocon at the OM (Desvaux et al. 2009a,b;Sutcliffe 2011). In diderm bacteria, the ESX is truly an export system in the same vein as the Sec or Tat systems (van der Woude, Luirink and Bitter 2013) but not a secretion system per se. In the present review, the T7SS refers exclusively to the chaperone-usher pathway in LPS-diderm bacteria (Desvaux et al. 2009a,b;Chagnot et al. 2013;Abby et al. 2016;Gagic et al. 2016;Monteiro et al. 2016), which is the main pathway responsible for the secretion of a wealth of pili in E. coli (Wurpel et al. 2013). Of note, P pili have been well investigated in UPEC infection (Kuehn et al. 1992; Lillington, Geibel and Waksman 2014; Behzadi 2020) but their prevalence in DEC and potential contribution (or not) to diarrhoeic infection is much less documented, although they contribute to intestinal colonisation by commensal E. coli (Nowrouzian, Wold and Adlerberth 2001) and have been detected in some strains causing bovine diarrhoea (Dozois et al. 1997).

The injectisome
The injectisome is a bacterial molecular syringe assembled and secreted by the T3aSS (Galan and Waksman 2018). The injectisome forms a needle, which is functionally closer to the Hrp (hypersensitive response and pathogenicity) pilus in Pseudomonas syringae than to a flagellum (He and Jin 2003;Tampakaki et al. 2004;Cornelis 2006). This cell-surface appendage can vary in size depending on the bacterial species and even bacterial strains (Cornelis 2006); in a controlled process, the pilus length can further adapt for cell surface contact. In DEC, this peculiar pilus is encoded by genes located in the LEE pathogenicity island (McDaniel, Donnenberg and Kaper 1995), a landmark for all EPEC that is also present in some EHEC strains (namely, the LEE-positive strains), such as E. coli O157:H7, and EIEC (including Shigella spp.) (Hueck 1998;Galan and Wolf-Watz 2006;Coburn, Sekirov and Finlay 2007). Tir (translocated intimin receptor) is encoded by the tir gene located in the LEE and is injected in the host cell by the injectisome (Hueck 1998). This protein is then exposed at the host cell surface and serves as the receptor for the intimin, enabling intimate bacterial interaction with the intestinal epithelia (Donnenberg et al. 1993). In EPEC, the injectisome is involved in cell adhesion and pedestal formation that occurs during the formation of attaching and effacing (A/E) lesions upon actin rearrangement in the infected eukaryotic cell (Wong et al. 2011). Of note, while A/E lesions are observed in vitro from infected epithelial cell cultures or colonic epithelium with LEE-positive EHEC (Lewis et al. 2015), these kinds of lesions are never observed from clinical samples of EHEC infections (Nataro and Kaper 1998); why this is the case remains unclear and undoubtedly deserves further investigation to reconcile laboratory experiments with clinical observations (Lewis et al. 2015). 
In addition to the infection of mammalian cells, the injectisome is involved in adhesion to plants with a marked tropism for the stomata (Schroeder and Hilbi 2008;Shaw et al. 2008;Berger et al. 2010;Croxen et al. 2013). EspA, the main component of the injectisome filament, is directly involved in adhesion, as well as in biofilm formation, in EPEC (Knutton et al. 1998;Moreira et al. 2006). In EIEC, the injectisome contributes to the invasion capabilities (Hueck 1998).

Type 4 pili (T4P)
T4P are assembled and secreted by the T2cSS (Ramer et al. 2002;Chagnot et al. 2013). T4P have been demonstrated to play roles in several E. coli pathotypes, including in host cell adherence and bacterial aggregation (Craig, Pique and Tainer 2004). Some of these pili can exhibit a unique feature in their ability to extend and retract, which results in twitching motility further contributing to biofilm formation (Mattick 2002;Craig, Forest and Maier 2019). In EPEC, T4P are also known as BFP (bundle-forming pili) and their subunits assemble in a helical manner to form polymeric fibres and can further interact to create higher-order bundles or tangled aggregates (Giltner, Nguyen and Burrows 2012; Melville and Craig 2013). These T4P are involved in the colonisation of the GIT and contribute to bacterial virulence (Bieber et al. 1998;Tacket et al. 1998). BFP are encoded by the bfp operon comprising 14 genes, including bfpA, which encodes the major repeating subunit of the pilus fibre (Ramer, Bieber and Schoolnik 1996;Sohel et al. 1996). In EHEC strains, the T4P are called HCP (haemorrhagic E. coli pili). Inactivation of the hcpA gene in EHEC O157:H7 reduces adherence to human and bovine epithelial cells. HCP is also able to bind to fibronectin and laminin, to agglutinate rabbit red blood cells, to mediate biofilm formation and to promote twitching motility. HCP are also encoded in some STEC strains (Farfan and Torres 2012). Peculiar T4P, called longus pili because of their size, have been reported in ETEC (Giron, Levine and Kaper 1994). The N-terminal part of the major subunit LngA is homologous with Bfp of EPEC, CofA subunit of CFA/III (colonisation factor antigen) of ETEC and TCP (the toxin-coregulated pilin) of V. cholerae (Giron et al. 1995;Gomez-Duarte and Kaper 1995). Longus pili are involved in colonisation of the human gut (Clavijo, Bai and Gomez-Duarte 2010; Mazariego-Espinosa et al. 2010), in bacterium-bacterium interaction and in resistance to antimicrobial agents as a result of biofilm formation (Clavijo, Bai and Gomez-Duarte 2010).

Conjugative pili (CP)
CP are assembled and secreted through T4bSS (Lawley et al. 2003). Classically, the genes for F-plasmid transfer are encoded in the tra operon located on the conjugative F plasmid (Manwaring, Skurray and Firth 1999). CP are responsible for nucleoprotein transfer between a donor bacterial cell (harbouring the F plasmid) and a recipient bacterial cell via the T4bSS (Lawley et al. 2003). Bacterial conjugation is a well-known process enabling horizontal transfer of genes including virulence or colonisation factors (Manwaring, Skurray and Firth 1999;Mazel and Davies 1999;Llosa et al. 2002;Sorensen and Mortensen 2005). Gene transfer is especially promoted in biofilm where physical contact between sessile donor and recipient cells is favoured (Lebaron et al. 1997;Hausner and Wuertz 1999;Dionisio et al. 2002;Molin and Tolker-Nielsen 2003;Maeda et al. 2006). Besides the transfer of genetic material, CP can be directly involved in bacterial adhesion (May and Okabe 2008;May, Tsuruta and Okabe 2011). In biofilm, this can be further amplified as cells carrying a conjugative F plasmid promote the establishment of F pili mating pairs and consequently induce adhesion and biofilm formation between abiotic surfaces and poor biofilm-forming cells. EAEC strains expressing F pili have been demonstrated to improve mixed biofilm formation (Pereira et al. 2010). In EAEC C1096, pili encoded on the conjugative plasmid Incl1 further contributed to adherence to abiotic surfaces and epithelial cells (Dudley et al. 2006b). In EHEC O157:H7 Xuzhou, a novel conjugative plasmid called pO157-Sal encoding a complete set of genes for the T4bSS was identified, but its involvement in the colonisation process has not been investigated as yet (Zhao et al. 2013).

Type 1 pili (T1P)
T1P (also called Type 1 fimbriae) are the most investigated pili secreted and assembled via a T7SS (Capitani et al. 2006). The expression of T1P is induced during the initial bacterial adhesion step (Harris et al. 1990;Pratt and Kolter 1998;Cookson, Cooley and Woodward 2002;Orndorff et al. 2004;Reisner et al. 2014) and they are involved in the early and late stages of biofilm formation (Schembri, Kjaergaard and Klemm 2003;Beloin et al. 2004;Reisner et al. 2014). T1P also have a role in the formation of SIgA (secretory IgA) mediated biofilm of the normal flora within the gut (Bollinger et al. 2003;Orndorff et al. 2004;Bollinger et al. 2006). T1P are composed of FimA (fimbrillin A), which constitutes the pilus rod, and FimH at the apex of the pilus tip. FimH is the key adhesin component in T1P as it can bind to mannose residues of some receptors on eukaryotic cells (Kaper, Nataro and Mobley 2004;Duncan et al. 2005) but also has nonspecific binding activity to abiotic surfaces (Pratt and Kolter 1998;Beloin et al. 2008). The absence of the FimH adhesin has been shown to hinder biofilm formation by preventing cell-to-surface and cell-to-cell contacts (Danese et al. 2000). In E. coli, different fimH alleles have been reported as conferring distinct colonisation abilities and thus playing different roles in biofilm formation (Martinez et al. 2000;Weissman et al. 2006). It was shown that contact between T1P and abiotic surfaces alters the composition of the OM and changes some physicochemical properties of the bacterial surface, which in turn influences adhesion (Otto et al. 2001;Orndorff et al. 2004). While the laboratory E. coli K12 strain and UPEC NU14 strain are the focus of the majority of the investigations about T1P, their involvement in bacterial adhesion and/or biofilm formation has been further demonstrated in EPEC, EAEC, ETEC and STEC strains (Elliott and Kaper 1997;Cookson, Cooley and Woodward 2002;Moreira et al. 2003;Sheikh et al. 2017).
T1P are encoded in the fimBEAICDGHF gene cluster, which is quite widespread in E. coli in both commensal and pathogenic isolates (Sauer et al. 2000;Kaper, Nataro and Mobley 2004;Wurpel et al. 2013). While present in EHEC O157:H7 (Abraham et al. 1988;Li et al. 1997;Roe et al. 2001;McWilliams and Torres 2014), their contribution to the colonisation process has yet to be demonstrated.
Genes encoding the F1C pili are present in approximately 7% of E. coli faecal isolates (Werneburg and Thanassi 2018). F1C pili have been characterised in UPEC strains where they are encoded in the foc (fimbriae of serotype 1C) operon, which is homologous to the fim locus (Klemm et al. 1994). In UPEC, F1C pili are involved in adherence to the bladder and kidney cells, as well as in biofilm formation (Werneburg and Thanassi 2018). Their prevalence and contribution to the colonisation process in DEC remain to be investigated.

CS31A pili
The CS31A (coli surface associated 31a antigen) plays a key role in the virulence of septicemic E. coli and ETEC, as well as some EPEC and DAEC (Girardeau et al. 1988;Contrepois et al. 1989;Jallat et al. 1994;Adams et al. 1997). Because of their thin structure, as well as their close and packed association with the bacterial cell surface, CS31A were initially described as capsule-like or even nonfimbrial antigens (Bertin et al. 1993;Mechin, Rousset and Girardeau 1996) before being clearly identified as thin capsular pili secreted and assembled by a chaperone-usher pathway (T7SS) (Thanassi, Saulino and Hultgren 1998). These pili are synthesised from the clp operon located on a high-molecular-weight self-transmissible R plasmid, called p31A (Martin, Boeuf and Bousquet 1991;Jallat et al. 1994;Martin 1996). CS31A are considered homologous to the K88/F4 (fae operon) and F41 pili but with some functional dissimilarities, e.g. CS31A does not exhibit haemagglutinin activity (Girardeau et al. 1991). In ETEC, F4 pili allow bacterial adherence to F4-specific receptors present on the brush borders of villous enterocytes, thus promoting the colonisation of the small intestine (Snoeck et al. 2008). The locus for diffuse adherence (ldaCDEFGHI) (Scaletsky et al. 2005) from EPEC is homologous to the K88 fae and ETEC CS31A clp operons. LdaH mediates diffuse adherence to Hep-2 cells. The LdaH encoding gene has also been found in STEC strains but no functional characterisation has been reported as yet (Scaletsky et al. 2005).

Aggregative adherence fimbriae (AAF)
AAF belong to the Afa/Dr (afimbrial adhesin/decay-accelerating factor receptor) haemagglutinin family together with F1845 pili (Nowicki et al. 1990;Le Bouguenec and Servin 2006). In DAEC and EIEC, Afa and Dr hemagglutinins recognise the Dr blood group antigen (Nowicki et al. 1990). Among the five genes encoded in the afa cluster, afaB, afaC and afaE are required for mannose-resistant hemagglutination (MRHA) (Servin 2005). The Dr hemagglutinin is encoded by the draABCDE operon, where draA, draB, draC, and draD encode accessory proteins and draE encodes the adhesin part (Nowicki et al. 1987;Servin 2005). In addition, the Dr hemagglutinin specifically binds collagen IV (Nowicki et al. 1988). Afa and Dr haemagglutinins can bind to decay-accelerating factor (DAF) and to carcinoembryonic antigen-related cellular adhesion molecules (CEA-CAMs) (Nowicki et al. 1988;Westerlund et al. 1989;Berger et al. 2004). While some members of the Afa/Dr family were believed not to form pili as they could not be observed by electron microscopy examination, it is now clear they are secreted as AAF and F1845 by T7SS, to form pili of various architecture depending on the pilin subunits (Pettigrew et al. 2004).
In EAEC, the colonisation of the gut occurs through aggregative adherence (AA) due to AAF, which binds to ECM proteins such as fibronectin, laminin and collagen IV (Farfan, Inman and Nataro 2008;Berry et al. 2014) and then promotes biofilm formation (Hicks, Candy and Phillips 1996;Wakimoto et al. 2004). To date, five AAFs (AAF/I to AAF/V) have been identified, all encoded by virulence plasmids of EAEC (pAA) and the main subunits of which are AggA, AafA, Agg3A, Agg4a and Agg5a respectively (Nataro et al. 1992;Czeczulin et al. 1997;Boisen et al. 2008;Jonsson et al. 2015). Another hypothetical Dr-related pilin called HdaA (HUS-associated diffuse adherence) also appears to confer the capacity to cause the AA phenotype in EAEC (Boisen et al. 2008). In DAEC and EIEC, F1845 pili are involved in gut colonisation (Servin 2005). F1845 pili are responsible for diffuse adherence to epithelial cells of the gut and are encoded by the daaABCDE operon (Bilge et al. 1989;Bilge et al. 1993).

Colonisation factor antigens (CFA)
In ETEC, colonisation factor antigens (CFA), also called coli surface antigens (CS), form pili that take part in adhesion to the small intestine and are critical for virulence (Gaastra and Svennerholm 1996). CFA/I, CFA/II (CS1, 2 and 3) and CFA/IV (CS4, 5 and 6) are the most virulent (Sjoberg et al. 1988;Knutton et al. 1989;Taniguchi et al. 1995;Gaastra and Svennerholm 1996;Svennerholm and Lundgren 2012) but CS12, 14, 17, 18, 19, 20 and 31 can also adhere to intestinal cells (Werneburg and Thanassi 2018). CFA/CS are encoded in operons; taking CFA/I as an example, it is encoded by the cfaABCE operon, where cfaB encodes the main subunit, cfaE the distal subunit, cfaA a chaperone and cfaC the usher involved in pilin transport across the OM (Jordi et al. 1992). Cell adhesion is enabled by CfaB through its ability to bind glycosphingolipid (Jansson et al. 2006).

F9 pili
In EHEC O157:H7, F9 pili are involved in the colonisation of bovine epithelial cells and bovine gastrointestinal tissue explants, and can also bind to fibronectin (Low et al. 2006). Mutants of the main subunit of F9 pili are still able to colonise the terminal rectum, indicating that the adhesin is not solely responsible for the rectal tropism observed but may contribute to colonisation at other sites, especially in young animals (Low et al. 2006). These pili are short but are able to form longer bundles (Low et al. 2006). They are encoded in the F9 gene cluster, a six-gene operon located on the pathogenicity island O161 (Low et al. 2006;Wurpel et al. 2013). This operon has also been identified in EPEC, as well as EAEC (Wurpel et al. 2013). F9 pili are secreted and assembled by a T7SS (Wurpel et al. 2013).

E. coli YcbQ laminin-binding fimbriae (ELF)
In EHEC O157:H7, it has been shown that E. coli YcbQ laminin-binding fimbriae (ELF) bind laminin and are involved in adherence to epithelial cells in humans, cows and pigs (Samadder et al. 2009). ELF form peritrichous flexible fine fibres and are encoded by the elfADCG operon, originally called the ycbQRST operon, which was previously identified in UPEC and some commensal E. coli strains (Spurbeck et al. 2011). This operon is homologous to the F17 pili biogenesis genes found in ETEC, which are assembled and secreted by a T7SS (Lintermans et al. 1988;Lintermans et al. 1991;Bertin et al. 1996;Bertin et al. 2000). More generally, ELF are also homologous to 20 K, K99 and G pili found in various pathogenic E. coli (Guinee, Jansen and Agterberg 1976;Contrepois et al. 1983). These pili have been shown to mediate binding to intestinal mucosal cells, especially to N-acetyl-D-glucosamine-containing receptors (Bertin et al. 1996). The composition of the pili and the sequence of the tip-adhesin differ between the strains and could explain the phenotypic divergence associated with the expression of this family of pili in different E. coli strains (Korea et al. 2010).

Long polar fimbriae (LPF)
LPF are encoded by two operons, lpf1 and lpf2, located on the pathogenicity islands O141 and O154 in EHEC O157:H7, respectively (Perna et al. 2001). LPF are also present in other DEC, e.g. LEE-negative EHEC, EPEC, rabbit-specific EPEC, EAEC and ETEC, as well as in several commensal strains (Doughty et al. 2002;Wurpel et al. 2013). They share homology with the LPF of Salmonella enterica serovar Typhimurium, which are involved in adherence to Peyer's patches and M cells in the human gut (Baumler and Heffron 1995;Baumler, Tsolis and Heffron 1996). The lpf1 operon is composed of five genes, with lpfA encoding the main pilus subunit, lpfD and lpfE encoding minor subunits, and lpfB and lpfC encoding the chaperone and usher, respectively (Doughty et al. 2002;Torres et al. 2004). The lpf2 operon also contains five genes with a duplication of lpfD called lpfD' but with no lpfE paralogue (Torres et al. 2004). In E. coli O157:H7, it has been proposed that LPF2 is expressed in early stages whereas LPF1 is expressed in late stages of growth (Torres et al. 2004). LPF are secreted and assembled by a T7SS and can bind fibronectin, laminin and collagen IV, as well as the follicle-associated epithelium (FAE) of Peyer's patches in humans (Fitzhenry et al. 2006;Farfan and Torres 2012;McWilliams and Torres 2014). Expression of lpf2 is increased under conditions similar to those for biofilm formation (Torres et al. 2007). Recently, it has been demonstrated that STEC isolates positive for lpf2 formed significantly more biofilm than lpf2-negative isolates (Vogeleer et al. 2015). In EPEC, LPF have been shown to contribute to the early stages of colonisation of rabbits and the severity of diarrhoea (Newton et al. 2004).

E. coli common pilus (ECP)
In EHEC, ECP (previously called Mat for meningitis-associated temperature dependent pilus) provides adherence to HEp-2, HeLa and HT-29 cells and allows interaction between bacterial cells (Rendon et al. 2007). Secreted and assembled by a T7SS, ECP expression is increased under environmental conditions that are experienced in the GIT, e.g. low oxygen and high CO2 concentrations (Rendon et al. 2007). However, its role seems to be secondary in the colonisation of the human or bovine gut (Tatsuno et al. 2000;Dziva et al. 2004). The ecp operon has been identified in numerous commensal and pathogenic E. coli, including DEC (Rendon et al. 2007).

Sorbitol-fermenting fimbriae protein (SFP)
In EHEC, the expression of sorbitol-fermenting fimbriae protein (SFP) pili is induced in anaerobic conditions and leads to an increased adherence to Caco-2 and HCT-8 cells, with a mannose-resistant hemagglutination phenotype (Brunder et al. 2001;Musken et al. 2008;Bielaszewska et al. 2009). These pili are encoded on the sfpABDCDJG operon harboured in the virulence plasmid pSFO157 (Brunder, Karch and Schmidt 2006). SFP pili are secreted and assembled by a T7SS (Brunder et al. 2001). Besides E. coli O157, sfp has been identified in other EHEC serotypes, such as O165 (Bielaszewska et al. 2009), but its prevalence among STEC in general is thought to be quite low (Toma et al. 2004). Distribution of the sfp operon in other DEC has not been investigated in detail as yet.

Curli
Curli are thin aggregative pili generally considered as one of the major proteinaceous components of the E. coli biofilm matrix (Smyth et al. 1996;Stathopoulos et al. 2000;Kostakioti et al. 2005;Evans and Chapman 2014). These peculiar pili are secreted and assembled by the T8SS through the extracellular nucleation pathway (ENP). Curli are helical filamentous amyloid fibres that facilitate cell-surface and cell-cell interactions and promote biofilm formation (Olsen et al. 1993;Cookson, Cooley and Woodward 2002;Szabo et al. 2005;McCrate et al. 2013). In EHEC O157:H7, curli are associated with cellulose production, adherence to spinach leaves and Hep-2 cells as well as abiotic surfaces (Kim and Kim 2004;Pawar, Rossman and Chen 2005;Macarisin et al. 2012). In ETEC, curli facilitate adherence to plastic surfaces (Szabo et al. 2005). Although curli were originally thought not to be expressed by EPEC (Ben Nasr et al. 1996), some strains were later reported to synthesise curli, playing a role in bacterial adhesion and biofilm formation in conditions mimicking human or bovine hosts. However, curli do not seem to be required for biofilm formation and/or adhesion of EAEC strains (Sheikh et al. 2001;Berger et al. 2009;Pereira et al. 2010). In Shigella spp. and EIEC, CsgD and curli expression is often inactivated (Sakellaris et al. 2000). Two operons are involved in curli production, (i) the csgBAC operon, encoding the structural components of curli (CsgA and CsgB) and an accessory protein (CsgC) and (ii) the csgDEFG operon, encoding a transcriptional regulator (CsgD) and the secretion machinery for transport across the OM (CsgE-G) (Arnqvist, Olsen and Normark 1994;Hammar et al. 1995). In the current model, CsgB is proposed to be embedded in the OM where it acts as a nucleator for the polymerisation of the major CsgA curlin (Van Gerven et al. 2015;Jain and Chapman 2019).
While the exact structure of curli fibres has not yet been elucidated with molecular resolution (Van Gerven et al. 2015;Jain and Chapman 2019), the fibres have been reported to display irregular thin branches, which would result from minor incorporation of CsgB along the curli and promoting the formation of branched fibres (Bian and Normark 1997;Soto and Hultgren 1999;Shu et al. 2012;DeBenedictis, Ma and Keten 2017). Recently, CsgC and CsgE were demonstrated to highly inhibit CsgA aggregation and CsgE was shown to prevent pellicle biofilm formation when added exogenously (Andersson et al. 2013;Evans et al. 2015).

Haemolysin-coregulated protein (Hcp)
In EAEC, the haemolysin-coregulated protein (Hcp) tube formed by the Type VI secretion system (T6SS) was suggested to be of importance for biofilm formation (Aschtgen et al. 2008). More than ten orthologues of the T6SS components have been identified in EHEC and EPEC strains. This system can also contribute to bacterial aggregation at the host cell surface (Dudley et al. 2006a;Shrivastava and Mande 2008;Lloyd et al. 2009;Aschtgen et al. 2010;Moriel et al. 2010). Further investigations are required in DEC to determine the exact role and molecular mechanisms involved in the colonisation processes by the Hcp and T6SS.

THE DIFFERENT REGULATION LEVELS INVOLVED IN THE EXPRESSION OF COLONISATION FACTORS
In general, the expression of genes encoded on genomes into proteins can be regulated at pre-transcriptional, transcriptional, post-transcriptional, translational and/or post-translational levels, as well as at translocational and post-translocational levels, the latter of which are especially relevant and important for molecular determinants expressed at the bacterial cell surface (Fig. 3). With the rise of omic approaches, however, some basic bacterial physiology concepts may sometimes be overlooked and gene/protein expression is very often considered as being limited to regulatory networks involving transcriptional repressors or activators. However, when it comes to functions and activities, it is primarily proteins that can help to comprehend bacterial physiology. It must also be kept in mind that mRNA and protein abundances correlate only very partially; mRNA levels are just a proxy for the presence of a protein but are not directly proportionate to the fold increase or decrease in protein expression and even less to its activity when we consider an enzyme for instance (Vogel and Marcotte 2012). Here, the different regulatory levels involved in bacterial adhesion and biofilm formation are highlighted using key examples of different SCFs.
Figure 3. With respect to the biochemical process, the sequential steps and events for gene/protein expression flow from pre-transcriptional, transcriptional, post-transcriptional, translational to post-translational regulation levels (as depicted by blue arrows). Thus, at least five regulation levels can be considered in bacteria and at each level, different control mechanisms can be at play. Besides, for the same protein-encoding gene, different regulation levels and regulatory mechanisms can intervene, e.g. the expression of Ag43 is regulated at pre-transcriptional level by DNA methylation, at transcriptional level by OxyR, at post-transcriptional level by antitermination of transcription and translation initiation in the leader mRNA, and also at post-translational levels with its autoaggregative activity modulated by pH, its native folding requiring chaperones and final subcellular localisation by translocation across the OM. Besides rRNA, tRNA and sRNA, biological functions and activities are essentially represented by proteins and the hierarchy of regulation levels and control mechanisms (as depicted by shades of red) is opposite to the gene/protein expression flow; e.g. whatever the pre-transcriptional (with DNA replication), transcriptional (with mRNA synthesis), post-transcriptional (with the modulation of transcripts) or translational (with protein synthesis) levels, they all strictly depend on enzyme activities, which can be regulated at post-translational levels in the first place with direct and immediate effect due to modulation of their catalytic activity by temperature or pH for instance.

Regulation at the pre-transcriptional level: phase variation
Prior to transcription, some regulatory mechanisms can already be at work at the DNA level, through phase variation. There are four main mechanisms of phase variation: (i) DNA inversion, (ii) slipped-strand mispairing, (iii) DNA methylation and (iv) DNA deletion. As a commonality, all these regulatory mechanisms primarily occur at the stage of DNA replication and a large majority of genes regulated by phase variation encode bacterial cell surface molecular determinants (Owen et al. 1996;Holden and Gally 2004).
In E. coli K12, T1P are well known to be subjected to phase variation following DNA inversion (Blomfield 2001). The expression of the fim operon is under the control of the fim promoter, which is located within the fimS-invertible element (Abraham et al. 1985;Wright, Seed and Hultgren 2007). The orientation of the promoter determines the ON or OFF phase and thus whether or not the downstream genes are expressed. Two tyrosine recombinases, FimB and FimE, are known to control the orientation of the fimS-invertible region. FimB predominantly switches the fim operon transcription from OFF to ON, while FimE mediates ON to OFF phase switching (Klemm 1986;Gally, Leathart and Blomfield 1996;Hannan et al. 2008). Of note, two DNA topological effectors participate in this regulation, namely H-NS (histone-like nucleoid-structuring protein) and IHF (integration host factor); these histone-like proteins play complementary roles, as the DNA inversion is absolutely dependent upon IHF, whereas the inversion rate is slowed down with high levels of H-NS and vice versa (Dorman and Ni Bhriain 1993). The existence of this regulation in DEC has not been examined as yet.
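The consequence of such a reversible ON/OFF switch at the population level can be illustrated with a minimal two-state sketch. It assumes, for illustration only, that each cell flips from OFF to ON (the direction attributed predominantly to FimB) and from ON to OFF (FimE) with fixed per-generation probabilities; the function names and the numerical values are hypothetical and are not taken from the cited studies.

```python
def steady_state_on_fraction(p_off_to_on: float, p_on_to_off: float) -> float:
    """Long-run fraction of ON-phase cells for a reversible two-state switch."""
    total = p_off_to_on + p_on_to_off
    if total <= 0:
        raise ValueError("at least one switching probability must be positive")
    return p_off_to_on / total


def simulate_on_fraction(p_off_to_on: float, p_on_to_off: float,
                         generations: int, start_on: float = 0.0) -> float:
    """Iterate the ON-phase fraction of the population generation by generation."""
    on = start_on
    for _ in range(generations):
        # Gain: OFF cells switching ON; loss: ON cells switching OFF.
        on += (1.0 - on) * p_off_to_on - on * p_on_to_off
    return on


if __name__ == "__main__":
    # Hypothetical per-generation probabilities, chosen only for illustration.
    p_on, p_off = 0.01, 0.03
    print(steady_state_on_fraction(p_on, p_off))
    print(simulate_on_fraction(p_on, p_off, generations=1000))
```

In this simplified picture, the proportion of piliated cells depends only on the ratio of the two switching probabilities, so any factor that biases one direction of inversion (such as the levels of IHF or H-NS mentioned above) would shift the population without any change in promoter strength.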
Slipped-strand mispairing occurs in the course of DNA replication in repetitive DNA regions, which can be positioned either upstream of a coding DNA sequence (CDS), where it influences transcription (e.g. the promoter efficiency), or within a CDS, where it can affect the translational reading frame and result in a frameshift mutation. In E. coli, phase variation resulting from strand slippage has not been reported as yet; nonetheless, there is no molecular mechanistic constraint preventing it from occurring (Torres-Cruz and van der Woude 2003).
Phase variation resulting from DNA methylation corresponds to a bacterial epigenetic mechanism. Ag43 is probably one of the most investigated surface proteins subjected to such a regulatory mechanism (van der Woude and Henderson 2008). This epigenetic regulation involves two proteins, the DNA adenine methylase (Dam) and the OxyR transcriptional regulator (van der Woude and Henderson 2008). When Dam has methylated the GATC sites present in the operator region in the course of DNA replication, the repressor OxyR cannot bind, transcription by the RNA polymerase occurs and Ag43 is expressed (ON phase); however, if OxyR binds the GATC sites before they are methylated by Dam, there is no transcription and no Ag43 expression (OFF phase). Besides Ag43, several pili secreted and assembled by the T7SS have been reported to be subjected to such an epigenetic regulation in E. coli (Blomfield 2001). The pap (pyelonephritis-associated pilus) operon in UPEC is considered as a paradigm where the Dam methylation of a GATC-II site in the operator region prevents binding of the repressor Lrp (leucine-responsive regulatory protein), and consequently the papBA operon is transcribed and the pili are expressed (ON phase). In the absence of methylation at GATC-II, Lrp can bind to the operator, repress the transcription and ultimately prevent pili formation (OFF phase). Additionally, this repression can be lifted when Lrp binds to another site called GATC-I. Among DEC, CS31A pili are subjected to this same regulatory mechanism (Crost et al. 2003;Graveline et al. 2014).
As a general trend, phase variation due to DNA deletion is irreversible due to the loss of the genetic element bearing the gene of interest. In E. coli, DNA deletion is responsible for unilateral flagellar phase variation as reported in the H3, H47 and H17 strains (Zhou et al. 2015). While most flagellins are encoded by fliC in E. coli, H3 and H47 are encoded by flkA and H17 is encoded by flnA. For H3 and H47, their production results from the expression of the flkAB operon, where the transcriptional regulator FlkB represses fliC (Feng et al. 2008). Upon excision of the flk region from the chromosome, flkAB is irreversibly deleted, the repression of fliC is released and the FliC flagellin is produced. Similarly, the H17 strain can irreversibly switch flagellar antigens to H4 (Ratiner 1967). It appears this flagellar phase variation can be caused by excision of flnA (Liu et al. 2012). When flnA is present in the chromosome, the translation of FliC H4 is inhibited and only FlnA H17 is produced; once flnA is excised, the repression of fliC is released and only the FliC H4 is produced. The ∼35 kb DNA deletion region containing the flnA gene is excised as a covalently closed extrachromosomal circular form. While some DNA deletion can occur through homologous recombination, flagellar phase variation is mediated by non-homologous recombination via an integrase of the tyrosine recombinase family (Feng et al. 2008). The flagellar phase variation mechanisms in some other E. coli H variants and especially in DEC remain to be defined.

Regulation at the transcriptional level: regulators and effectors
Regulation at the transcriptional level is the best-known level of gene regulation and quite often the only one really considered as a proxy for protein expression levels. Transcriptional regulators can either be repressors or activators but it is wrong to assume a repressor will systematically repress transcription or an activator will activate transcription. A second crucial partner in the process must also be considered, that is the effector, which can be of two types, either an inducer or a corepressor. Four possibilities for regulation at the transcriptional level can be discriminated: (i) positive control of an inducible gene, where an activator is activated by an inducer, (ii) positive control of a repressible gene, where an activator is inactivated by an inhibitor, (iii) negative control of an inducible gene, where a repressor is inactivated by an inducer or (iv) negative control of a repressible gene, where a repressor is activated by a co-repressor. Additionally, a so-called repressor can act as an activator for some genes and vice versa. In other words, the up-expression or down-expression of a regulator is not sufficient to know what kind of transcriptional regulation is taking place without knowing the nature and level of the effector.
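The four combinations listed above can be written down as a small truth table. The sketch below is a deliberately simplified Boolean model (a single regulator, a single effector, all-or-none responses), and the function and argument names are hypothetical labels introduced only for this illustration.

```python
from itertools import product


def is_transcribed(regulator: str, effector_action: str, effector_present: bool) -> bool:
    """Boolean outcome for one regulator/effector pair.

    regulator: 'activator' or 'repressor'.
    effector_action: 'activates' or 'inactivates' (effect of the effector on the regulator).
    """
    if regulator not in ("activator", "repressor"):
        raise ValueError(f"unknown regulator type: {regulator}")
    if effector_action not in ("activates", "inactivates"):
        raise ValueError(f"unknown effector action: {effector_action}")

    # The regulator is functional when an activating effector is present
    # or when an inactivating effector is absent.
    if effector_action == "activates":
        regulator_active = effector_present
    else:
        regulator_active = not effector_present

    # An active activator turns transcription on; an active repressor turns it off.
    return regulator_active if regulator == "activator" else not regulator_active


if __name__ == "__main__":
    for regulator, action, present in product(
        ("activator", "repressor"), ("activates", "inactivates"), (True, False)
    ):
        state = "ON" if is_transcribed(regulator, action, present) else "OFF"
        print(f"{regulator:9s} | effector {action:11s} | present={present!s:5s} -> {state}")
```

Even in this minimal form, the same regulator type yields opposite transcriptional outcomes depending on the effector, which is the point made above: the abundance of a regulator alone does not predict whether its target gene is transcribed.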
Bacteria can sense and respond to environmental cues thanks to a large range of two-component signal transduction systems where a sensor activates a transcriptional regulator, which further represses or activates gene expression (Hoch 2000;Zschiedrich, Keidel and Szurmant 2016). Some of these systems participate in cell-to-cell communication (CTCC) via a signal molecule called auto-inducer (AI) (Bassler 2002). Quorum sensing (QS) is only one of the different functions of CTCC, which specifically refers to the sensing of the cell density (quorum); QS should not be considered synonymous with CTCC because some sensing can be unrelated to QS sensu stricto and relate instead to diffusion sensing, confinement or efficiency sensing for instance (Redfield 2002;Platt and Fuqua 2010;West et al. 2012). This semantic issue is of particular importance in biofilm formation, since by definition, bacterial cells are at a high density following sessile development and therefore the notion of QS makes little sense. Transcriptional regulators of virulence and SCFs have been the subject of intense and extensive research and scientific literature in DEC (Tobe 2008;Pruss 2017;Rossi et al. 2018). For these reasons only some key examples will be provided to illustrate the relevance of differentiating the regulation at different levels.
At the transcriptional level, PNAG production is regulated by NhaR, a transcriptional regulator of the LysR family, which activates the transcription of the pgaABCD operon by binding to two sites near the -35 region of the promoter (Goller et al. 2006). In EPS, the production of colanic acid is consistently upregulated within biofilms by the RcsA transcriptional activator (Matthysse et al. 2008;May and Okabe 2008). The transcription of the wca operon is regulated by the rcsABCF locus that encodes a two-component system (Gervais and Drapeau 1992;Ebel and Trempy 1999). However, the signal sensed by the RcsC sensor kinase remains unknown (Whitfield and Roberts 1999;Oropeza, Salgado-Bravo and Calva 2015). H-NS is known to act as a transcriptional repressor in bacteria, a so-called bacterial transcriptional silencing, analogous to eukaryotic silencing by histones (Landick, Wade and Grainger 2015;Grainger 2016). While RcsA is present at a low amount in the cell, this was found to be partially due to transcriptional silencing by H-NS (Sledjeski and Gottesman 1995). Cellulose synthesis is under the control of the CsgD transcriptional regulator (Romling et al. 2000;Zorraquino et al. 2013). Interestingly in EIEC, csgD expression is often inactivated (Sakellaris et al. 2000), suggesting that biofilm formation can interfere with pathogenesis, making these strains poor biofilm formers.
While no specific transcriptional regulator has been identified for the expression of AIDA-I, it was shown that transcription was enhanced in the absence of H-NS and RfaH transcriptional regulators (Benz et al. 2010). Similarly, the transcription of ehaG and fdeC is regulated by H-NS (Totsika et al. 2012;Easton et al. 2014).
CS31A synthesis is dramatically reduced in media containing alanine or leucine, suggesting that these amino acids can play a role as effectors (Crost et al. 2003). The ON/OFF switch is locked in the OFF phase by alanine, whilst leucine represses transcription but without affecting the switch frequency. Analysis of clp expression indicated that alanine and leucine could repress clp transcription by a methylation-independent mechanism but also by either promoting methylation or methylation protection of GATC-II and GATC-I respectively, which increased the methylation pattern characteristic of repressed cells. Furthermore, alanine prevented the AfaF-dependent methylation protection and thus the appearance of cells in the ON phase. Additional regulatory proteins, including ClpB, cAMP receptor protein (CRP) and H-NS, also play important roles in the transcriptional expression of the operons of the pap family combined with regulation at a pre-transcriptional level by phase variation (Blomfield and van der Woude 2007).
For the T4P in EPEC, the expression of the bfp operon is controlled by the BfpT (also called PerA) transcriptional regulator, a member of the AraC family, encoded on the enteroadherence factor plasmid (Tobe et al. 1992;Gomez-Duarte and Kaper 1995). The expression of CFA/I is positively regulated by CfaR, whereas the expression of CFA/II, CS1 and CS2 is positively regulated by the rns gene product (a homologue to cfaR with 96% identity) (Caron and Meyer 1989;Caron and Scott 1990;Savelkoul et al. 1990). The expression of AAF is induced by the transcriptional activator AggR (a homologue of AraC) also located on pAA (Nataro et al. 1994); YafK and Fis (factor for inversion stimulation) have also been reported to regulate AAF/II transcription (Sheikh et al. 2001). From a transcriptional regulation point of view, lpf1 is repressed by H-NS and activated by Ler in response to different environmental conditions (Torres et al. 2007;Rojas-Lopez et al. 2011), whereas lpf2 transcription appears to be activated by Fur (Torres et al. 2007). Regulation of curli biogenesis is complex and involves several two-component systems, such as EnvZ/OmpR, CpxA/CpxR or CpxR/H-NS/RstA/IHF/OmpR (Vidal et al. 1998;Prigent-Combaret et al. 2000;Prigent-Combaret et al. 2001;Ogasawara et al. 2010;Laverty, Gorman and Gilmore 2014). In EPEC, Fis has been identified as a negative transcriptional regulator of csgA expression. Curli expression can be triggered by a large range of environmental signals such as the temperature, osmolarity or redox potential (Olsen et al. 1993;Prigent-Combaret et al. 1999;Gerstel and Romling 2001;Evans and Chapman 2014).
The transcriptional regulatory control of the locus of enterocyte effacement (LEE) encoding the injectisome is undoubtedly one of the most extensively investigated in DEC, and in particular in EPEC and EHEC (Schmidt 2010;Stevens and Frankel 2014;Franzin and Sircili 2015). For additional information about the complex regulation networks of specific, global and phage encoded regulators, as well as environmental signals such as nutrient sources or metabolic products from the host or microbiota that can affect the transcription of the LEE-encoded genes, readers are referred to recent, specific reviews on the topic (Connolly, Finlay and Roe 2015;Furniss and Clements 2018;Platenkamp and Mellies 2018;Turner, Connolly and Roe 2018).

Regulation at a post-transcriptional level
At least three main regulation mechanisms can occur post-transcriptionally: (i) the stability of mRNA, which can be quantified by determining its half-life, (ii) a riboswitch, where a molecule such as a metabolite can change the folding of an mRNA with the formation of a termination hairpin that stops the on-going transcription by the RNA polymerase or (iii) attenuation based on the formation of terminator/anti-terminator loops, which couple or uncouple the transcription by the RNA polymerase with the translation of the mRNA. Such post-transcriptional regulations are important regulatory mechanisms that are generally overlooked and underestimated, most likely because they cannot be easily investigated and estimated by transcriptomic analysis on its own (Vogel and Marcotte 2012).
Recently, it was shown that the expression level of agn43 can be controlled by antitermination of transcription and translation initiation in the leader mRNA (Wallecha et al. 2014). Among EPS determinants, PNAG production is regulated post-transcriptionally by the RNA-binding protein CsrA (carbon storage regulatory protein A) (Boles and Horswill 2011;Wang, Yang and Yang 2017), where CsrA binds cooperatively to the pgaA mRNA and competes for recognition with the 30S ribosomal subunit. By binding to sites located in the mRNA leader, CsrA can further destabilise the pgaA transcript. The transcription of yeeJ is increased in the absence of the mRNA regulator PNPase, an exoribonuclease polynucleotide phosphorylase component of the degradosome (Martinez-Gil et al. 2017).
Pili produced by the pap operon appear to be regulated post-transcriptionally as a result of differential mRNA stability (Baga et al. 1988). The study demonstrated that the papBA transcript is processed and the resulting mRNA encoding the major pilin subunit accumulated. The difference in abundance of the two mRNA species could be readily explained by differences in their half-life. In E. coli, RNA degradation occurs via the degradosome thanks to the combination of endoribonuclease and exoribonuclease activities (Burger, Whiteley and Boshoff 2011;Bandyra et al. 2013).
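To illustrate how mRNA stability can be quantified by a half-life, the following minimal sketch estimates a half-life from a time course of transcript abundance. It assumes simple first-order decay after transcription has been blocked, which is a common simplification and not a method taken from the studies cited here. The function name, the time points and the abundance values are all hypothetical and synthetic.

```python
import math


def half_life_from_timecourse(times, abundances):
    """Estimate a half-life assuming first-order decay.

    Fits ln(abundance) = intercept + slope * time by ordinary least squares
    and returns ln(2) / (-slope), in the same unit as `times`.
    """
    if len(times) != len(abundances) or len(times) < 2:
        raise ValueError("need at least two paired measurements")
    logs = [math.log(a) for a in abundances]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("abundance does not decay over time")
    return math.log(2) / -slope


# Hypothetical, synthetic data: minutes after blocking transcription.
times = [0.0, 2.0, 4.0, 8.0]
stable = [100.0 * 0.5 ** (t / 8.0) for t in times]    # built with an 8 min half-life
unstable = [100.0 * 0.5 ** (t / 2.0) for t in times]  # built with a 2 min half-life

print(half_life_from_timecourse(times, stable))
print(half_life_from_timecourse(times, unstable))
```

Under this assumption, two transcripts synthesised at the same rate would differ in steady-state abundance in proportion to their half-lives, which is the kind of reasoning behind explaining differential abundance by differential stability.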

Regulation at the translational level
While attenuation collaterally affects the translation, three main mechanisms are directly involved in the regulation of translation, (i) anti-sense RNAs (including the small RNAs), which hybridise with mRNA and thus block the binding of the ribosome, (ii) riboregulation, where a ligand changes the mRNA folding, which consequently prevents the binding of the ribosome and (iii) translational efficiency depending on the codon usage.
In addition to CsrA, PNAG synthesis is regulated by two small RNAs, CsrB and CsrC, which actually sequester CsrA and thus activate the translation of the pgaABCD transcript (Liu et al. 1997;Weilbacher et al. 2003). For colanic acid production, the low level of expression from the rcsA promoter by H-NS transcriptional silencing is alleviated by the DsrA small RNA (Sledjeski and Gottesman 1995).
In E. coli, the OmpA protein is expressed to very high levels, is growth rate dependent and is a paradigm for riboregulation (Lugtenberg et al. 1976;Koebnik, Locher and Van Gelder 2000). Actually, the ompA mRNA half-life increases proportionally with the bacterial growth rate (Nilsson et al. 1984). While a specific region of the transcript is targeted by the RNaseE (endoribonuclease E), binding of the ribosome induces conformational changes that mitigate the mRNA degradation (Emory and Belasco 1990;Emory, Bouvet and Belasco 1992;Hansen et al. 1994). As an antagonist, Hfq can bind the transcript to decrease its stability, thus inducing RNA decay (Nilsson et al. 1984;Vytvytska et al. 2000). Hfq facilitates the binding of a small RNA called MicA in the vicinity of the ribosome-binding site, thus preventing ribosomal recruitment (Udekwu et al. 2005).

Regulation at the post-translational level
Regulation at the post-translational level comprises the most diverse range of molecular mechanisms and is hierarchically the most important (Fig. 3). In metabolic pathways, regulation at the post-translational level is a key mechanism, particularly in relation to the modulation of the enzymatic activity, which can be influenced by physical parameters (pH, temperature, ionic strength, redox, etc.), inducers and inhibitors (irreversible or reversible: competitive, non-competitive, uncompetitive or mixed inhibition) (Guedon et al. 2000;Desvaux and Petitdemange 2002;Desvaux 2004); retro-inhibition and pro-activation can also occur and may also involve allosteric enzymes. Protein activity can be further altered by numerous post-translational modifications, namely, (i) proteolytic cleavage and (ii) chemical modifications such as disulphide bonds, phosphorylation, acetylation, methylation, adenylation or uridylation. Post-translational regulation also includes protein folding, the association/dissociation of homo- and heteromers and the degradation of proteins following the N-end rule by the ClpAP proteolytic complex, which can all influence the protein half-life, as well as the protein translocation to a final subcellular location. Indeed, the maturation of a protein can also occur at translocational and post-translocational levels.
As an example of post-translational regulation, the decreased production of colanic acid at 37 °C results from the degradation of the RcsA transcriptional activator by the Lon protease (Ebel and Trempy 1999). This post-translational regulation reduces wca transcription and explains the low amount of RcsA in the cell (Sailer, Meberg and Young 2003). As a two-component system, the RcsA regulator is activated by the transfer of a phosphate group from the RcsC sensor, which is per se another post-translational regulation level (Desai and Kenney 2017). For cellulose biosynthesis, the catalytic activity of the BcsA-B complex using UDP-glucose as a substrate is allosterically controlled by cyclic-di-GMP (c-di-GMP) on the PilZ domain of the cellulose synthetase BcsA (Omadjela et al. 2013). Actually, the PilZ domain was the first effector identified that is activated upon binding of c-di-GMP (Ryan, Tolker-Nielsen and Dow 2012). Furthermore, the diguanylate cyclase AdrA exhibiting a GGDEF domain regulates c-di-GMP production (Romling et al. 2000;Zorraquino et al. 2013). C-di-GMP is a ubiquitous second messenger produced by diguanylate cyclases exhibiting a GGDEF domain, which is antagonistically degraded by phosphodiesterases exhibiting an EAL domain (Romling and Amikam 2006). This molecule controls the motility and virulence of planktonic cells, as well as cell adhesion and persistence of multicellular communities (Jenal and Malone 2006;Romling and Amikam 2006). As an autotransporter, Ag43 exhibits a signal peptide, which drives the preprotein to the Sec export system for translocation across the CM before being cleaved off after translocation into the periplasm. In the periplasm, several chaperones participate in the folding prior to the translocation across the OM through a cooperative mechanism involving the translocation and assembly module (TAM) and the β-barrel assembly machinery (BAM) (Selkrig et al. 2014).
Additionally, the passenger of Ag43 is glycosylated, which stabilises its conformation. These different post-translational, translocational and post-translocational levels all contribute to the regulation of the expression of this surface protein. While glycosylation is not that important for the functions of Ag43 (Reidl et al. 2009), in TibA it is necessary for autoaggregation, adhesion to epithelial cells and biofilm formation (Cote, Charbonneau and Mourez 2013).

CONCLUSION AND PERSPECTIVES
Reviewing the different cell-surface molecular determinants that can participate in the surface colonisation process in DEC, from bacterial adhesion to biofilm formation, the wealth of SCFs at play is clearly highlighted. While some of these molecular determinants still remain to be fully characterised, their interplay in surface colonisation must also be carefully considered. The flagella, as force-generating cell-surface organelles, have been demonstrated to be important for biofilm formation (Hobley et al. 2015), but expression of strong adherence factors could replace motility in the early stages of biofilm formation (Pratt and Kolter 1998;Donlan 2002). Although flagella expression is repressed during the switch from the planktonic to sessile lifestyle to reduce the motility capacity of the bacteria, these surface organelles have a structural and architectural role in the EPM (Hung et al. 2013;Serra, Richter and Hengge 2013). While the expression of flagellar genes is repressed, genes involved in the biosynthesis of the EPM components are generally activated during the biofilm maturation step (Guttenplan and Kearns 2013). In E. coli K12, capsule polysaccharide and T1P appear to block the autoaggregation mediated by Ag43 by physically shielding intercellular Ag43-Ag43 interaction (Hasman, Chakraborty and Klemm 1999;Schembri, Dalsgaard and Klemm 2004), whilst, in turn, the autoaggregation overrides bacterial motility (Ulett, Webb and Schembri 2006). In some ExPEC, T1P expression appears to be further modulated and influenced by OmpA or OmpX, together with an increase of exopolysaccharide production, as well as a decrease in bacterial motility (Otto and Hermansson 2004;Teng et al. 2006). In NMEC, OmpA would act together with Hek in the invasion of epithelial cells (Fagan, Lambert and Smith 2008).
All-in-all, this suggests the OMPs' composition of the OM may act as a signal in physiological adaptation of bacteria for surface adhesion and colonisation; this research direction is one of the next frontiers to be explored in DEC.
As a general trend, the average number of pili types appears lower in commensal compared to pathogenic E. coli (Spurbeck et al. 2011). For instance, curli or conjugative pili can compensate for motility during initial adhesion and biofilm development (Prigent-Combaret et al. 2000;Ghigo 2001;Reisner et al. 2003). Plasmids in general can encode numerous SCFs as shown in ETEC and EAEC (Amabile-Cuevas and Chicurel 1996; Mainil et al. 1998;Ghigo 2001;Molin and Tolker-Nielsen 2003;Kaper, Nataro and Mobley 2004;Wuertz, Okabe and Hausner 2004;Ong et al. 2009). While conjugative plasmids can confer initial adhesion capacity and modulate the biofilm architecture (Ghigo 2001;Wuertz, Okabe and Hausner 2004), the genetic mobility of this extrachromosomal gene pool and its contribution to biofilm formation remain poorly investigated in DEC (Dudley et al. 2006b). In Pseudomonas aeruginosa, T4P were primarily regarded as involved in the attachment to epithelial cells in the course of an infection but were later demonstrated to also bind to abiotic surfaces such as polyvinyl chloride, polystyrene and stainless steel (Giltner et al. 2006); they even appeared to exhibit a much higher affinity towards steel than the mucosal epithelial surface, which emphasises the relevance of examining T4P in both environmental and clinical conditions (Yu et al. 2007;Burgess, Desvaux and Olmez 2014). In the human and animal cutaneous pathogen Erysipelothrix rhusiopathiae, the RspA (rhusiopathiae surface protein A) and RspB surface proteins have been shown to specifically bind several ECM components, namely, fibronectin, collagens I and IV, but also polystyrene, shedding light on the ecophysiology of this microorganism through its ability to adhere to both biotic and abiotic surfaces (Shimoji et al. 2003).
These aspects have not been reported or examined as yet in DEC but are particularly relevant considering the presence of T4P and ECM-binding proteins, especially some ATs, in the various E. coli enteropathotypes.
The regulatory network for the production of colonisation factors is often depicted as being restricted to the transcriptional level. However, this review clearly demonstrates that the range of regulation levels is much broader and even more complex (Fig. 3). As a general trend, it is important to stress that the primary functional and regulation level is post-translational and not transcriptional, as is sometimes assumed. Whenever DNA replication, RNA polymerisation or protein synthesis occur, enzymes are required for these physiological processes at pre-transcriptional, transcriptional and translational regulation levels, respectively; any abrupt changes in the environmental conditions, such as some physicochemical parameters (e.g. pH, temperature, redox potential), will have a first and direct effect on the enzyme activity before the cell can even change its transcription profile. For the SCFs, the interplay taking place at the other regulation levels is extremely complex and their hierarchy is extremely difficult to establish at a global scale. In addition, some regulatory mechanisms in the expression of SCFs in DEC have not been fully investigated, such as attenuation, riboswitches or translational efficiency, but their involvement cannot be excluded. As molecular cell-surface determinants, the SCFs in DEC need to be translocated across an LPS-diderm bacterial cell envelope to be functional and active, which involves further translocational and post-translocational regulation levels that should not be overlooked in a regulatory network. At present, our view of the regulatory network for the production of SCFs in E. coli remains incomplete and an integrated view of all regulation mechanisms is still far off. Moreover, findings from investigations using domesticated laboratory strains of E. coli must be interpreted with caution and reinvestigation in DEC genetic backgrounds would be wise (Hobman, Penn and Pallen 2007).
This will undoubtedly lead to new discoveries in the field in the years to come and contribute to our understanding of DEC colonisation mechanisms.
In DEC, SCFs have often been examined for their contribution to bacterial virulence and thus investigated in conditions related to human infection (Nataro and Kaper 1998;Kaper et al. 2004;Rossi et al. 2018). In addition to humans, the GIT of a wide range of animals also harbours E. coli strains, both commensal and pathogenic (Escobar-Paramo et al. 2006;Croxen et al. 2013;Smati et al. 2015;Torres 2017). Following shedding from these animal reservoirs, E. coli is also found in the environment. Outside the host, the range of extraintestinal environmental conditions that can be encountered by this species is wide, ranging from soil and water to plants, as well as food matrices and food processing facilities (van Elsas et al. 2011;Giaouris et al. 2014;Jang et al. 2017). As DEC are foodborne zoonotic pathogens, understanding their ecophysiology necessitates considering their lifestyle outside the human host. In fact, the role of SCFs should be placed in a context much broader than the colonisation of the GIT, as they can also play an important role in the colonisation of other environmental niches. A focus solely on the physiopathology and GIT environment may bias and limit a full understanding of the wide diversity of SCFs in E. coli. While the notion of virulence factors is a major contribution to the field of microbial pathogenesis (Falkow 1988;Finlay and Falkow 1989), a change of paradigm with the concept of coincidental by-products of commensalism (Le Gall et al. 2007;Diard et al. 2010;Leimbach, Hacker and Dobrindt 2013) or niche factors (Hill 2012) is necessary to understand more accurately the ecophysiology of pathogenic species in the food chain and in a one-health approach.
Taking a one-health approach considering the whole food chain, the physiology of DEC should be considered not only with respect to human infection but also in conditions representative of the upstream stages, i.e. natural environments, animal/human reservoirs, agri-food environments and foodstuffs (Burgess, Desvaux and Olmez 2014). Investigating the ecophysiology of the DEC with respect to the various biotopes and biocoenoses encountered in different ecosystems from natural environments, animal reservoirs, food matrices, food-processing environments, to human ingestion should shed new light on the relevance and contribution of the SCFs for this species and inform the design of strategic, targeted interventions to improve public health.
Training Network) EID (European Industrial Doctorate) DISCo (A multidisciplinary doctoral industrial school on novel preventive strategies against E. coli infections) project (n° FP7-PEOPLE-607611), by the RA-FEDER-Bpifrance ("Région Auvergne-Fonds Européen de Développement Régional-Banque Publique d'Investissement des entrepreneurs français") FRI ("Fonds Régional Innovation") Cluster IRP ("Institut de Recherche Pharmabiotique") CoMBa ("Colonisation de la matrice par les bactéries") project (n° AV0003483 and n° DOS0019690/00), by ANR (Agence National de la Recherche Agronomique) PathoFood (Deciphering the cell heterogeneity and the spatial distribution of foodborne bacterial pathogens in food matrices in interaction with the microbial communities) project (n° ANR-17-CE21-0002) and by France-Ireland PHC ("Programme Hubert Curien") ULYSSES 2016 Campus France-Irish Research Council ARBiC project (n° 36044YD). RM was a Marie Curie PhD research fellow granted by the ITN EID DISCo. VA was a PhD research fellow granted by the RA-FEDER-Bpifrance FRI IRP CoMBa.
All authors declare that the work was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

AUTHORS CONTRIBUTION STATEMENT
VA and MD wrote the first overall draft of the manuscript and drew the original pictures; RM, SL, MP, CMB and FCD wrote sections of the manuscript. MD contributed to conceptualising the overarching aims and had management as well as coordination responsibility for the execution of the work. MD, MP, CMB and FCD contributed to the acquisition of the financial support and resources leading to this publication. All authors contributed to the critical revision of the manuscript, and read and approved the submitted version.