LINE-1 retrotransposons drive human neuronal transcriptome complexity and functional diversification

The genetic mechanisms underlying the expansion in size and complexity of the human brain remain poorly understood. Long interspersed nuclear element–1 (L1) retrotransposons are a source of divergent genetic information in hominoid genomes, but their importance in physiological functions and their contribution to human brain evolution are largely unknown. Using multiomics profiling, we here demonstrate that L1 promoters are dynamically active in the developing and the adult human brain. L1s generate hundreds of developmentally regulated and cell type–specific transcripts, many of which are co-opted as chimeric transcripts or regulatory RNAs. One L1-derived long noncoding RNA, LINC01876, is a human-specific transcript expressed exclusively during brain development. CRISPR interference silencing of LINC01876 results in reduced size of cerebral organoids and premature differentiation of neural progenitors, implicating L1s in human-specific developmental processes. In summary, our results demonstrate that L1-derived transcripts provide a previously undescribed layer of primate- and human-specific transcriptome complexity that contributes to the functional diversification of the human brain.


Figure S2 L1 expression in neurons in the adult human brain.
A) Cell type composition in the snRNA-seq of adult samples. B) Expression (RPKM) over full-length (>6kbp) L1HS, L1PA2, L1PA3 and L1PA4, plus 6kbp flanking regions, in each cluster for one of the adult samples. Blue heatmaps showing the signal per cluster in sense of the annotated element. Red heatmaps showing signal in antisense. Top annotation indicates the cell type of the cluster in question. D) Single-read mappability score for full-length (>6kbp) young L1 subfamilies (read length of 100) as reported for hg38 by Karimzadeh et al. 2018 (86) subfamilies on UMAP.

Figure S1 Quality control for the validation of L1 expression in the adult human brain. A) Number of reads quantified as genes or TEs per sample, as quantified by TEcounts. B) Expression (RPKM) over full-length (>6kbp) L1HS, L1PA2, L1PA3 and L1PA4, plus 6kbp flanking regions. Blue heatmaps showing the signal per sample in sense of the annotated element. Red heatmaps showing signal in antisense. C) Genome browser tracks showing adult-specific expression of a >6kbp L1PA4 with antisense transcription initiated in its promoter. Transcription is split by strand (blue = forward; red = reverse).

Figure S3 Quality control for the validation of L1 expression in bulk and different cell types of the fetal forebrain. A) Number of reads quantified as genes or TEs per sample, as quantified by TEcounts. B) Expression (RPKM) over full-length (>6kbp) L1HS, L1PA2, L1PA3 and L1PA4, plus 6kbp flanking regions. Blue heatmaps showing the signal per sample in sense of the annotated element. C) Red heatmaps showing signal in antisense. D) Genome browser tracks showing fetal-specific expression of a >6kbp L1PA4 with antisense transcription initiated in its promoter. Transcription is split by strand (blue = forward; red = reverse). E) Comparison of the pseudo-bulk cluster expression of young L1 subfamilies among the different cell types (AP = apical progenitors; BP = basal progenitors; CR = Cajal-Retzius; EBN = early-born neurons; IN = interneurons; M = microglia). F) Cluster expression of young L1 subfamilies (quantified per sample), grouped per cell type. G) L1 expression of cycling vs non-cycling cells from each cluster, grouped per cell type (p-value as per paired Wilcoxon test).

Table 1 Demographics of adult and fetal samples including the brain region of sample collection and sequencing approach
A) Adult cortex samples. PMI = post-mortem interval. B) Fetal forebrain samples.

Supplemental Table 2 Statistical analysis of organoid growth. The size of 10 organoids was measured at each time point in three independent replicates of the experiment, for a total of 30 organoids per time point, per condition. A) Per CRISPRi guide RNA: statistical analysis was performed using two-way ANOVA and a Dunnett correction for multiple comparisons. B) Pooled CRISPRi guide RNAs: statistical analysis was performed using mixed-effects analysis and a Sidak correction for multiple comparisons.

Table 3 L1-chimera transcripts.
Transcript IDs and gene names as annotated in GENCODE v38 (or de novo), transcript expression level, and coordinates of the L1 residing in the transcript's promoter site. A) Transcripts expressed in adult samples. B) Transcripts expressed in fetal samples.