Paraphelidium transcriptome: some code

This is a quick post to highlight the publication of Global transcriptome analysis of the aphelid Paraphelidium tribonemae supports the phagotrophic origin of fungi by Guifré Torruella, Purificación López-García (Université Paris-Sud) et al., in Communications Biology.

In parallel, we released the R code we used to analyse the functional profile of its gene content, which you can find in this GitHub repository. We used the presence/absence profile of a set of genes linked to primary metabolism in the genomes (or transcriptomes) of unicellular eukaryotes to highlight similarities between the gene content of fungi and Paraphelidium.

This code can be used to produce plots such as the following (Figure 3 in the paper):

(a) PCoA (Principal Coordinate Analysis) of gene presence for orthologs related to primary metabolism, across 41 eukaryotes. (b) The same data shown as a binary presence/absence heatmap, with species clustering.
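For readers who want to see the idea behind panel (a) in code, here is a minimal Python sketch of a PCoA on a binary presence/absence matrix. It is not the R code from the repository: the choice of Jaccard distance, the function names (`jaccard_distances`, `pcoa`) and the toy species and gene columns are all assumptions made for illustration only.

```python
import numpy as np

def jaccard_distances(presence):
    """Pairwise Jaccard distances between the rows of a binary matrix."""
    presence = np.asarray(presence, dtype=bool)
    n = presence.shape[0]
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            union = np.logical_or(presence[i], presence[j]).sum()
            shared = np.logical_and(presence[i], presence[j]).sum()
            # Two empty profiles are treated as identical (distance 0).
            d = 0.0 if union == 0 else 1.0 - shared / union
            dist[i, j] = dist[j, i] = d
    return dist

def pcoa(dist, n_axes=2):
    """Classical PCoA: double-centre the squared distances, then eigendecompose."""
    n = dist.shape[0]
    centring = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centring @ (dist ** 2) @ centring
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Keep the axes with the largest eigenvalues.
    order = np.argsort(eigvals)[::-1][:n_axes]
    # Negative eigenvalues (non-Euclidean distances) are clipped to zero.
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)

# Hypothetical toy data: rows are species, columns are metabolic orthologs.
species = ["fungus_A", "fungus_B", "aphelid", "reduced_parasite"]
presence = np.array([
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 1, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0, 0],
])

dist = jaccard_distances(presence)
coords = pcoa(dist)
for name, (x, y) in zip(species, coords):
    print(f"{name}\t{x:.3f}\t{y:.3f}")
```

In this toy example, the species with a rich gene repertoire end up close to each other in the distance matrix, while the species with a reduced repertoire sits far from all of them. That is the kind of pattern the figure is meant to show.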

These results highlight the similarities between the rich primary metabolism of Paraphelidium and ‘canonical’ fungi. Thus, unlike other early-branching fungal allies such as Rozella, Paraphelidium does not have a simplified metabolism. This is consistent with what is actually the most interesting result of the paper (in my opinion): the phylogenomic analysis of Paraphelidium consolidates its position as sister-group to fungi and breaks up its association with Rozella and microsporidians. This has important implications regarding the lifestyle of the ancestor of fungi, which was likely phagotrophic (rather than osmotrophic).

But you should read the paper to get the whole picture.


Ich fürchte mich so vor der Menschen Wort

Ich fürchte mich so vor der Menschen Wort.
Sie sprechen alles so deutlich aus:
Und dieses heißt Hund und jenes heißt Haus,
und hier ist Beginn, und das Ende ist dort.

Mich bangt auch ihr Sinn, ihr Spiel mit dem Spott,
sie wissen alles, was wird und war;
kein Berg ist ihnen mehr wunderbar;
ihr Garten und Gut grenzt grade an Gott.

Ich will immer warnen und wehren: Bleibt fern.
Die Dinge singen hör ich so gern.
Ihr rührt sie an: sie sind starr und stumm.
Ihr bringt mir alle die Dinge um.


The words of men fill me with fear.
They speak of all things in clear words:
so this is a house, and this is a dog,
and here is the beginning, and here is the end.

The way they think frightens me, their sardonic game;
they know what will happen, and they know what happened;
and on no mountain does anything sacred remain:
goods and gardens border on the divine.

I must always warn them: do not even come near!
It is the music of things that I want to hear.
You touch them: but they stay still and mute.
All of you kill the things for me.

Rainer Maria Rilke

«L’església catòlica espanyola»

Puta paparra, carronya on fermenta
La claveguera de la llum del dia,
Apunta el seu coet lluna opulenta
I implora no fallar la punteria.

Teixeix sotanes una aranya lenta.
Com ballen amb les vides per la via
Que va del militar a la serventa!
Despullen amb les ungles pedreria.

Ens fa de mare i de pare, i s’engreixa
De tèrbola tenebra, i no desdenya
De beneir la reixa de la queixa.

Be mossegaire, mal de tots nosaltres,
Aquesta activitat d’ensenyar els altres
Aplica-te-la, porca, a tu mateixa.

L’església catòlica espanyola (Joan Brossa, Poesia rasa I, 1950-1955)


Havies d’haver fet una altra fi;
et mereixies, hipòcrita, un mur a
un altre clos. La teva dictadura,
la teva puta vida d’assassí,

quin incendi de sang! Podrit botxí,
prou t’havia d’haver estovat la dura
fosca dels pobles, donat a tortura,
penjat d’un arbre al fons d’algun camí.

Rata de la més mala delinqüència,
t’esqueia una altra mort amb violència,
la fi de tants des d’aquell juliol.

Però l’has feta de tirà espanyol,
sol i hivernat, gargall de la ciència
i amb tuf de sang i merda. Sa Excremència!—

Glòria del bunyol,
ha mort el dictador més vell d’Europa.
Una abraçada, amor, i alcem la copa!

Final! (Joan Brossa, 20 de novembre de 1975)

The evolution of alternative splicing in eukaryotes (and the animal ‘revolution’)

After a few years of work with Iñaki Ruiz-Trillo (IBE and ICREA) and Manuel Irimia (CRG), the last paper from my PhD thesis has just been published in Genome Biology — you can read it here (open access, of course):

Origin of exon skipping-rich transcriptomes in animals driven by evolution of gene architecture (Grau-Bové et al., Genome Biology 2018)

In this article, we survey the role of alternative splicing in the transcriptomes of 65 different eukaryotes. Alternative splicing (AS) is a mechanism of transcriptomic regulation by which multiple transcripts (isoforms) can be produced from a single gene. Some of these ‘additional’ possible transcripts can end up becoming new evolutionarily significant protein variants, but AS also affects gene expression levels, and has been linked to a fairly substantial amount of biological noise.

In any case, AS bears the potential to increase the biological complexity of eukaryotes by tuning the usage and function of the genes encoded in their genomes (provided they have introns, as many eukaryotic genes do).

We made a great effort to cover as much eukaryotic diversity as possible, in order to make our evolutionary inferences more robust. To the best of our knowledge, this is the first comparative analysis of alternative splicing that includes at least one genome from each of the main eukaryotic groups, and covers all available early-branching animals.

Our taxon sampling

For the most part, we focus on a particular form of AS known as exon skipping (ES): the alternative inclusion of a given exon in the final transcript. Exon skipping has long been considered a fixture of animal transcriptomes (and maybe even one of the reasons behind animal multicellular complexity). Of course, we wanted to dig deeper into this idea.

First things first: all eukaryotic groups show at least some level of exon skipping, which suggests that it appeared in the last eukaryotic common ancestor (at the same time as introns).

Second, we indeed found that animals had generally higher levels of exon skipping than other eukaryotes (in the figure below, we compare frequencies in animal groups, unicellular holozoans, and other opisthokonts like fungi). But this was not the case for all animals: early-branching non-bilaterians, like sponges or the placozoan Trichoplax adhaerens, had low, ‘protist-like’ ES levels.

ES frequency in animals and some unicellular eukaryotic groups. Non-bilaterians include sponges, cnidarians, Mnemiopsis (ctenophore) and Trichoplax (placozoan).

Does that mean that the great increase in exon skipping usage seen in animals was actually something specific to bilaterians alone? Quantitatively, it may seem so. Qualitatively, not quite so.

Indeed, even if they had fairly low exon skipping, early-branching sponges and Trichoplax shared a unique feature with most other animals: skipped exons tended to have 3-divisible lengths, which means that their occasional exclusion has a lesser effect on the final transcript and protein (it does not break the ORF):

In most bilaterians (red), higher ES implies an overabundance of ORF-preserving alternative splicing events.

This means that exon skipping in early animals became enriched for frame-preserving events (with 3-divisible exon lengths) before the increase in exon skipping in bilaterian animals. We see this as a two-step evolutionary process.
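The frame-preservation argument can be made concrete with a few lines of Python. This is only an illustrative sketch, not the pipeline used in the paper: the function names and the toy exon lengths are hypothetical.

```python
def is_frame_preserving(exon_length):
    """A skipped exon keeps the downstream reading frame if its length is a multiple of 3."""
    return exon_length % 3 == 0

def frame_preserving_fraction(skipped_exon_lengths):
    """Fraction of skipped exons whose exclusion does not shift the ORF."""
    if not skipped_exon_lengths:
        return None  # no skipped exons, so the fraction is undefined
    preserved = sum(1 for length in skipped_exon_lengths if is_frame_preserving(length))
    return preserved / len(skipped_exon_lengths)

# Hypothetical skipped exon lengths (in nucleotides) for one species.
toy_lengths = [90, 91, 92, 120]
print(frame_preserving_fraction(toy_lengths))
```

If exon lengths were spread evenly across the three possible remainders modulo 3, about one third of skipped exons would be frame-preserving by chance, so an enrichment corresponds to a fraction clearly above that baseline.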

So far, this is the first part of the paper. But this analysis came with a surprise: when comparing exon skipping in different eukaryotes, we saw that some protists, like Sphaeroforma arctica, had small increases in frequency compared to their sister species. Interestingly, this coincided with protists that had some ‘animal-like’ features in their genome architectures, like enlarged genomes with abundant and long introns.

The relationship between gene architecture and alternative splicing frequency is well known and has been demonstrated in a number of species. But Sphaeroforma’s oddity raised a question: are gene architectural effects on AS ‘universal’ to all eukaryotes? And, if so, can we use them to understand the evolution of alternative splicing?

Well, as it turns out, yes.

In this figure, we show that whenever a given trait of gene architecture is associated with exon skipping, this correlation tends to be conserved across multiple eukaryotes. For example, having longer introns facilitates skipping of the middle exon, and this can be seen in most species (blue boxes all along the horizontal axis). Same thing for negative relationships (in red): e.g. short exons are more frequently involved in skipping.

Positive (blue) and negative (red) associations of various gene architectural features with higher ES levels, for some holozoans. All 65 species are shown in the paper.
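To give a flavour of what a positive or negative association means here, the following Python sketch compares a gene architectural feature between skipped and constitutive exons and reports the direction of the difference. It is a deliberate simplification and not the statistical approach used in the paper: comparing medians, the function names and the toy numbers are all assumptions for illustration.

```python
from statistics import median

def association_sign(values_skipped, values_constitutive):
    """Direction of the difference in a feature between skipped and constitutive exons."""
    difference = median(values_skipped) - median(values_constitutive)
    if difference > 0:
        return "positive"
    if difference < 0:
        return "negative"
    return "none"

def summarise_features(features):
    """Map each feature name to the sign of its association with exon skipping."""
    return {
        name: association_sign(skipped, constitutive)
        for name, (skipped, constitutive) in features.items()
    }

# Hypothetical measurements (in nucleotides) for a single species:
# each feature maps to (values for skipped exons, values for constitutive exons).
toy_features = {
    "flanking_intron_length": ([1200, 3400, 2100], [300, 450, 800, 600]),
    "exon_length": ([60, 90, 75], [150, 180, 140]),
}

summary = summarise_features(toy_features)
print(summary)
```

Repeating this per species and per feature would give a grid of signs similar in spirit to the blue and red boxes in the figure, although a real analysis would need a proper statistical test and many more exons.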

Thus, Sphaeroforma’s increase in exon skipping is mirrored by changes in its genome organization that facilitate this form of alternative splicing. These same changes occurred independently (and to a greater extent) in animals and plants.

In essence, we find that gene architecture is a pan-eukaryotic ‘soft code’ of alternative splicing determination. Thus, we can approximate the evolution of this part of the transcriptome by studying the evolutionary history of genomes themselves. And genome evolution is way easier to study.

And this is our final throw of perfume on the violet: we use data from gene architecture and AS in living eukaryotes to approximate the levels of AS in ancestral eukaryotes, for which transcriptomic data obviously does not exist. We use this model to pinpoint shifts in AS usage in the animal ancestry.

Climbing mount AS using genome data as a GPS. A terrible metaphor, but I like the plot.

This ancestral reconstruction signals that animal evolutionary innovations involving AS mostly appeared at the same time as multicellular animals themselves (quite unlike what happens for other genome innovations, like many quintessential ‘animal’ gene families that are actually older than one might expect — see here, and here, and here, and here…).

And that is it.

In the paper we cover other topics as well, such as intron retention, the effect of nucleotide composition in AS, a more detailed analysis of intron length evolution in Volvox and Sphaeroforma, and more. To find out about that, I encourage you to read it 😉

And if you have any comments or questions, let me know!

Sphaeroforma arctica, by Arnau Sebé-Pedrós.

PS: It’s also been argued that AS is not a central contributor to ‘regulated proteome diversity’ after all. Instead, it’d be a consequence of a substantial amount of ‘noisy splicing’, and most genes would actually be producing just one main isoform. I personally think this idea has merit at the micro-evolutionary level, although our macro-evolutionary analysis is necessarily recovering more signal from long-standing adaptive effects. This is a very interesting read in that respect:

Alternative Splicing May Not Be the Key to Proteome Complexity (Tress et al., Trends in Biochemical Sciences 2016).

Sageta de foc / «Junteu-vos, junteu-vos!»

LLUITA X BELLES GESTES I ACCIONS: Eterna espiral vers l’Infinit.


VOLUNTAT X UN DESIG BOIG DE CÓRRER; i córrer sempre als cims, així com fuig la cérvola.


Un jaç arran de la carretera
per als vells i els que cauen.
si acàs, que ells mateixos s’aixequin.

Hi ha un HOME a la presó
dels que avancen.
traieu-li l’embaràs que li oprimeix les mans.

Al desencorajat no l’atieu.
Ni al fanàtic absurd.
Deixeu-los barallar,
que es destorbin a ells sols.

sistemes de govern,
sistemes filosòfics,

Sofismes els sofismes per als qui només veuen amb els ulls del cervell.

Mes… si cal governar i dirigir,
agafeu una tralla.
Us estimaran més, i àdhuc obeiran.

Amunt! Amunt! Encara més…
A on anem? No és bo preocupar-se’n.

Suara ha sortit del niu un oronell.



Poemes en ondes hertzianes (Papasseit, 1919).