1 Introduction

Targeting T lymphocytes was among the first applications of monoclonal antibody (mAb) based immunotherapy. Anti-CD3 therapy was used to control T cell activity, suppress the immune response, and replace the polyclonal anti-thymocyte antibody preparation previously used against graft rejection [17]. Muromonab, a monoclonal antibody specific for the human CD3 antigen, was the first mAb used in clinical studies, but its use was discontinued due to its overall toxicity [21]. More recently, novel and less toxic CD3-specific antibodies have reemerged as promising therapeutics for controlling autoimmune diseases.

To better understand human T cell reprogramming after anti-CD3 treatment, we previously investigated the protein-coding (PTC) genes [25] regulated ex vivo in a PBMC milieu. Here, using a new transcript prediction algorithm, we reannotated the non-coding transcriptome of human T cells treated with anti-CD3 antibodies to unveil differentially expressed lncRNAs (DELs) that may be involved in CD3-targeted antibody therapy. We observed several novel non-coding transcripts along with previously annotated ones, and we discuss their possible participation in T cell fate and the suppressive phenotype.

2 Materials and Methods

2.1 Sample Preparation and Sequencing

Here we extend the analysis of the RNA-seq data deposited in the GEO database (GSE112899) and originally described in [25]. In brief, Ficoll-Paque purified PBMC from human volunteers were cultured with or without anti-CD3 antibodies. We used three antibody preparations: the anti-CD3\(\epsilon \) monoclonal antibody OKT3, and two recombinant antibodies derived from OKT3 and produced in a human IgG1 scFvFc format: a humanized antibody (FvFcR) and a chimeric antibody (FvFcM). After 72 h in culture, CD3\(^+\) cells were enriched by negative selection. The RNA-seq data were produced with an Illumina HiSeq in 2 \(\times \) 150 bp paired-end mode [25]. All human blood experiments were performed in accordance with the guidelines of the Ethics Committee of the University of Brasilia, which approved the study protocol (CAAE: 32874614.4.0000.0030).

2.2 Genome Mapping and Transcript Prediction

All sequenced reads produced by Illumina were analyzed for quality control using FASTQC [1]. The reads were filtered using BBDuk [3] at \(k=31\) against a reference of ribosomal k-mers provided by the developers. Adapters were trimmed with cutadapt [15], and reads were then aligned to the hg19/GRCh37 human genome using HISAT2 [13] at standard settings. Ryūtō [7] was run on the alignments to predict individual transcripts for each set. Transcript predictions were joined using the TACO meta-assembler [16]. Results were compared to the GENCODE v19 annotation to identify novel transcripts. Salmon [18] was used to realign the filtered reads of each sample to the full set of GENCODE transcripts together with the additional predicted novel transcripts, and a count matrix was created from these quantifications. DESeq2 [14] was used to identify differentially expressed genes (DEGs) on all replicates, applying a significance threshold of 0.05 for the adjusted p-values.
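The final selection step above can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg correction, which is the default p-value adjustment in DESeq2, and applies the 0.05 threshold to the adjusted values. The transcript names and raw p-values are hypothetical, and the sketch omits the DESeq2 model fitting and independent filtering entirely.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


def select_significant(results, alpha=0.05):
    """Keep the transcripts whose adjusted p-value is below alpha."""
    ids = list(results)
    padj = benjamini_hochberg([results[t] for t in ids])
    return {t: q for t, q in zip(ids, padj) if q < alpha}


# Hypothetical raw p-values, for illustration only.
raw = {"tx_a": 0.001, "tx_b": 0.02, "tx_c": 0.03, "tx_d": 0.8}
significant = select_significant(raw)
```

In this toy input, three of the four transcripts pass the threshold after adjustment, while the transcript with the raw p-value of 0.8 is discarded.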

2.3 Data Mining and Ontology Classification

The annotated transcriptome and DEGs were explored using in-house awk scripts and BEDTools [19] commands and visualized with the Integrative Genomics Viewer (IGV) [22] and the Ensembl browser (ENSEMBL) [12]. Venn diagrams were designed using UGENT tools (footnote 1). Protein-coding (PTC) transcripts (Ensembl transcript type) were excluded from the analysis, so that only non-coding transcripts, whether from non-protein-coding or protein-coding genes, were considered. Only transcripts with a base mean above zero were retained. The overlaps between different transcriptomes were analyzed using bedtools intersect [19]. Non-coding potentials were calculated with Portrait [2], using a cutoff of 60%.
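The overlap analysis described above can be sketched in plain Python. This is only an illustration of the idea behind an interval intersection, not the bedtools commands actually used. It assumes BED-style, 0-based half-open coordinates and ignores strand, and all names and coordinates are hypothetical.

```python
from collections import defaultdict


def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end


def split_by_overlap(transcripts, genes):
    """Split transcripts into those overlapping an annotated gene and the rest.

    Both inputs are lists of (name, chrom, start, end) tuples.
    """
    genes_by_chrom = defaultdict(list)
    for _, chrom, start, end in genes:
        genes_by_chrom[chrom].append((start, end))

    overlapping, intergenic = [], []
    for name, chrom, start, end in transcripts:
        hit = any(overlaps(start, end, g_start, g_end)
                  for g_start, g_end in genes_by_chrom[chrom])
        (overlapping if hit else intergenic).append(name)
    return overlapping, intergenic


# Hypothetical coordinates, for illustration only.
genes = [("geneA", "chr1", 100, 500), ("geneB", "chr2", 1000, 2000)]
transcripts = [
    ("TU_1", "chr1", 450, 700),    # overlaps geneA
    ("TU_2", "chr1", 500, 900),    # starts where geneA ends: no shared base
    ("TU_3", "chr2", 1500, 1600),  # contained in geneB
    ("TU_4", "chr3", 10, 50),      # chromosome without annotated genes
]
overlapping, intergenic = split_by_overlap(transcripts, genes)
```

The linear scan over genes per chromosome is adequate for a small example. A genome-scale analysis would rely on sorted intervals or an interval tree, which is what dedicated tools provide.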

2.4 cDNA Synthesis and Transcript Testing

The cDNA was synthesized with the SuperScript IV Reverse Transcriptase kit (Invitrogen, Carlsbad, CA, USA), following the manufacturer’s guidelines, from total RNA extracted using the miRNeasy® Mini Kit (Qiagen, Valencia, CA, USA) [25] from T lymphocytes of three donors, treated or not with anti-CD3 antibodies. Primers were designed for the predicted transcripts lnc-DC and GAPLINC, which were amplified by PCR with Taq Platinum DNA Polymerase (Invitrogen, Carlsbad, CA, USA) under the following conditions: 35 cycles of denaturation at 94 \(^\circ \)C for 30 s, annealing at 52 \(^\circ \)C for 30 s, and elongation at 72 \(^\circ \)C for 1 min. The amplification cycles were performed in the SimpliAmp thermocycler (Applied Biosystems). The PCR products were ligated into the pGEM-T Easy Vector (Promega, Madison, Wisconsin) and transformed into XL1Blue competent cells. Cloning was confirmed by enzymatic digestion of the plasmids with Nco I and Not I. Plasmids were sequenced by the Sanger method.

2.5 Gene Expression Analysis by qPCR Assays

The quantitative PCR assays were performed in triplicate with total RNA isolated from T cells, which was used for cDNA synthesis with the RT2 First Strand Kit (Qiagen, Valencia, CA, USA). As previously described [25], gene expression was quantified using RT2 qPCR SYBR Green/ROX Master-Mix (Qiagen, Valencia, CA, USA) following the manufacturer’s instructions. The reference gene Beta-2 microglobulin (B2M) was used as the endogenous control. The assays were performed using the ABI Step One Plus Real-Time PCR System (Applied Biosystems). The \(2^{-\varDelta \varDelta \mathrm{Ct}}\) method was used to calculate the levels of the lncRNA transcripts (fold change), and the data were analyzed with the RT2 Profiler PCR Array Data Analysis software (SABiosciences, Frederick, MD, USA). Real-time qPCR p-values were calculated with Student’s t-test using the same software.
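As an illustration of the fold-change calculation, the following minimal sketch applies the \(2^{-\varDelta \varDelta \mathrm{Ct}}\) formula to replicate Ct values, normalizing the target transcript to a reference gene (B2M in this study) in each condition. The function name and the Ct values are hypothetical, and the sketch does not reproduce the internals of the RT2 Profiler software.

```python
from statistics import mean


def fold_change(target_treated, ref_treated, target_control, ref_control):
    """Relative expression by the 2^(-ddCt) method from lists of Ct values."""
    # dCt: target Ct normalized to the reference gene within each condition.
    dct_treated = mean(target_treated) - mean(ref_treated)
    dct_control = mean(target_control) - mean(ref_control)
    # ddCt: normalized treated value minus normalized control value.
    ddct = dct_treated - dct_control
    return 2 ** (-ddct)


# Hypothetical triplicate Ct values, for illustration only.
fc = fold_change(
    target_treated=[27.0, 27.0, 27.0],
    ref_treated=[18.0, 18.0, 18.0],
    target_control=[25.0, 25.0, 25.0],
    ref_control=[18.0, 18.0, 18.0],
)
```

Here the target crosses the threshold two cycles later in treated cells while the reference gene is unchanged, so the fold change is 2^(-2) = 0.25, that is, a four-fold repression.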

3 Results

The transcriptome reconstructed from the union of all mapped reads comprises 174,649 transcripts, of which about a third are protein-coding, nearly half (44.1%) correspond to known non-coding Ensembl/VEGA transcripts, and 20% are previously unannotated transcripts (Fig. 1). The distribution of Ensembl/VEGA transcript types among the known ncRNAs is summarized in Fig. 1-A. The most abundant types are transcripts with retained introns (IR) and Processed Transcripts (PT) that are not classified as belonging to one of the lncRNA classes.

Fig. 1. Several non-protein-coding transcripts are differentially expressed in anti-CD3 stimulated cells. (A) Gene type distribution of the ncRNA observed in all experimental samples. The distribution of non-protein-coding transcripts is detailed on the right. Volcano plots of DEL transcripts regulated by (B) OKT3; (C) FvFcR; and (D) FvFcM. Each group is compared to the non-treated sample. (E) Venn diagram comparing the statistically significant DELs in each experimental group and their intersections.

Long intergenic ncRNAs (LincRNA), antisense RNAs (AS), and pseudogenes were also observed in substantial numbers. The three antibody treatments resulted in a total of 726 differentially expressed lncRNAs (DELs). OKT3 regulated a larger number of ncRNAs than FvFcR and FvFcM, as observed in the volcano plots (Fig. 1-B to 1-D). The Venn diagram (Fig. 1-E) summarizes the DEL set seen in each treatment and their intersections. The number of statistically significant, differentially expressed transcripts after each antibody treatment is summarized in Table 1 for the five major categories of lncRNA. Anti-CD3 antibodies regulate only a small fraction of the total observed lncRNAs, ranging from 0.9% of retained intron transcripts (IR) after OKT3 stimulation to 0.02% of pseudogenes after FvFcM stimulation.

Table 1. Differentially expressed lncRNA after anti-CD3 treatment

3.1 LincRNAs

LincRNAs are known to be involved in critical processes in the differentiation of T cells. The whole T cell transcriptome reveals 6,552 LincRNAs, and a small proportion of them are regulated by anti-CD3 stimulation. Some of them are known to be expressed in T cells, such as NEAT1, a nuclear long ncRNA associated with Th2 [10] and Th17 differentiation [23]. NEAT1 is down-regulated after OKT3 and FvFcR treatment, but only OKT3 induces a statistically significant three-fold reduction in transcript level. Linc00861 was found to be expressed in several CD4 and CD8 cells. OKT3 stimulation also reduced linc00861 with high confidence (\(p<10^{-13}\)), while repression by FvFcR shows a barely significant adjusted p-value (<4\(\,\times \,10^{-3}\)).

Three lincRNAs were activated by all antibody treatments: AC017002.1, LINC00339, and LINC01132. Only the first two were previously found in T cells [5]. AC017002.1 was mostly, but not exclusively, detected in memory Treg. This lincRNA is in close genomic proximity to BCL2L11, a proapoptotic gene involved in T cell negative selection in the thymus, which is associated with T cell activation by high-affinity antigens [8].

3.2 Novel Predicted Transcripts

Among the reconstructed transcripts, about a fifth (35,149) remained unannotated (transcripts unannotated, TU) (Fig. 1-A). Most of these TUs overlapped with protein-coding and non-protein-coding genes (Fig. 2-A) and thus most likely constitute previously undescribed isoforms of known genes. However, 575 TUs do not overlap any annotated gene and are candidate novel genes. This set of non-overlapping TUs was further tested for coding potential, which suggested that at least 413 (72%) of them are novel lncRNA transcripts (Fig. 2-B). These transcripts contained between 1 and 15 exons, with a modal exon number of 2 per gene (Fig. 2-C). Among these putative novel transcripts revealed by RNA-seq, 77 are regulated by anti-CD3 stimulation: 72 by OKT3, 19 by FvFcR, and 13 by FvFcM (Fig. 2-D). Eight of them were regulated by all three antibodies.
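The per-antibody counts above add up to more than the 77 regulated transcripts because some transcripts are regulated by more than one antibody. The following minimal sketch shows how the regions of such a Venn diagram can be derived from three sets of transcript identifiers. The identifiers are hypothetical and do not correspond to the transcripts of this study.

```python
from collections import defaultdict


def venn_regions(groups):
    """Map each combination of group names to the items found in exactly those groups."""
    membership = defaultdict(set)
    for name, items in groups.items():
        for item in items:
            membership[item].add(name)
    regions = defaultdict(set)
    for item, names in membership.items():
        regions[frozenset(names)].add(item)
    return dict(regions)


# Hypothetical sets of regulated transcripts, for illustration only.
groups = {
    "OKT3": {"TU_1", "TU_2", "TU_3", "TU_4"},
    "FvFcR": {"TU_2", "TU_3", "TU_5"},
    "FvFcM": {"TU_3"},
}
regions = venn_regions(groups)
shared_by_all = regions[frozenset(groups)]            # regulated by all three
union_size = sum(len(v) for v in regions.values())    # distinct transcripts
```

In this toy example the three sets contain 4 + 3 + 1 = 8 entries, but only 5 distinct transcripts, one of which is shared by all three treatments.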

Interestingly, some genomic regions accumulate novel TUs together with CD4- and CD8-specific transcripts, suggesting a regulatory role for such loci. One example is TU35249, which lies close to the KLF3-AS1 and KLF3 loci (data not shown). All three transcripts are repressed during anti-CD3 treatment. TU35249 also seems to be coregulated with the antisense RNA and overlaps a CD4-specific and a CD8-specific transcript previously reported in [11, 20].

3.3 Testing Novel LncRNA Isoforms

Two of the computationally predicted TUs were tested experimentally for differential expression after T cell stimulation with anti-CD3. These transcripts were cloned and sequenced, and their expression levels were quantified by qPCR. The T cell RNA-seq data suggested transcriptional activity in the WFDC21P locus on chromosome 17. This locus is described as an ancient pseudogene with coding capacity in mammals, including primates, but not in humans [6]. In the genus Homo, this locus was reported to code for a lincRNA, Lnc-DC, found to regulate STAT3 activity in dendritic cells (DC) [27]. We observed several transcripts in this locus, and two novel isoforms, TU20859 and TU20860, were found to be repressed in OKT3-stimulated T cells (\(p_{\mathrm {adj}}<0.05\)). These isoforms differ from Lnc-DC in their use of novel exons at the 5’ end (Fig. 3). Transcriptional activity was validated in T cell RNA by qPCR analyses, which corroborate the RNA-seq data and demonstrate the presence of WFDC21P transcripts in resting T cells as well as their repression after anti-CD3 treatment. Moreover, the sequences of the cDNA amplicons are compatible with the TU20859 and TU20861 transcripts, but no cDNA clone was found consistent with the presence of TU20860, the transcript Lnc-DC (Genbank NR_030732.1). Thus, the WFDC21P transcript that is repressed after anti-CD3 stimulation may be a novel transcript distinct from the previously described Lnc-DC [27].

Fig. 2. A set of unannotated predicted transcripts may correspond to novel lncRNA. (A) A large fraction of the predicted transcriptome could be machine annotated (blue). Most of the unannotated transcripts overlap known gene loci (salmon), except for a small fraction of them (red). (B) Non-coding probability of the unannotated nonoverlapping transcripts. (C) Histogram of the exon content of the unannotated nonoverlapping transcripts. The number of transcripts is quoted following the number of predicted exons. (D) Venn diagram showing that part of the unannotated nonoverlapping transcripts is regulated by the anti-CD3 treatment. Transcripts regulated by OKT3 are shown in blue, by FvFcR in red, and by FvFcM in green. The yellow ellipse reflects the total number of unannotated nonoverlapping transcripts. (Color figure online)

GAPLINC is a lincRNA that has been described as a marker for gastric adenocarcinoma [9] and was not reported to be expressed in lymph nodes [5]. The T cell RNA-seq data presented here suggested the presence of GAPLINC transcripts in untreated cells and a significant reduction after anti-CD3 treatment. All four reference transcripts of GAPLINC were observed, but none showed significant differential expression except GAPLINC-204 (footnote 2), which was barely significantly repressed by OKT3 (\(p_{\mathrm {adj}}=0.0135\)). Along with them, two novel isoforms, TU21901 and TU21904, were detected. TU21901 was the most abundant transcript predicted in this locus and is a DEL, repressed as a result of OKT3, FvFcR, and FvFcM treatment (\(p_{\mathrm {adj}}\) values of 0.0001, 0.0487, and 0.0502, respectively). The qPCR quantification suggested that all antibodies induced some repression, especially with one particular primer pair, which detects all variants except GAPLINC-201 (Fig. 4).

The analysis of cDNA clones obtained from PCR for exons 1 to 4 and 3 to 4 yielded sequences that confirmed the presence of predicted transcripts. Three independent clones could unequivocally validate TU21901 with the same exon-exon junction. Two clones showed the unique junction of GAPLINC-204, where exon 2 uses an alternative donor splice site compared to TU21901. Three other predicted clones showed an exon-exon junction that is shared either by TU21904 or by the previously reported transcript GAPLINC-202. Therefore, the data showed that along with GAPLINC-202 and GAPLINC-204, at least the novel regulated transcript TU21901 could be found in non-stimulated T cells.

Fig. 3. The WFDC21P locus is depicted to reveal the intron-exon structure of Lnc-DC along with TU20859, TU20860, and TU20861. (A) The transcripts on the opposite strand of chromosome 17 were rotated to facilitate visualization. Primers used to check for transcripts are marked in red and green. Quantitative expression of lncRNAs by qPCR assay was performed with total RNA extracted from T cells stimulated with anti-CD3 antibodies. The results are expressed as the fold change relative to unstimulated T cells (n = 5; \(p<0.05\)). (B) Expression of transcripts detected with the primer pair for the first and third exon of Lnc-DC (red). (C) Expression of transcripts detected with the primer pair for the second and third exon of Lnc-DC (green). (Color figure online)

Fig. 4. Transcriptional activity of the GAPLINC locus. (A) Novel transcripts are depicted along with annotated transcripts. At the top, the genomic view of the transcripts with their exon-intron structure. Primers used to check for transcripts are marked in green and magenta. Quantitative expression of lncRNAs by qPCR assay was performed with total RNA extracted from T cells stimulated with anti-CD3 antibodies. The results are expressed as the fold change relative to unstimulated T cells (n = 5; \(p<0.05\)). (B) qPCR with a primer for the junction of the first and second exon and another for the third exon of GAPLINC-204 (red), detecting all transcripts except GAPLINC-201. (C) Expression of transcripts detected with a primer pair for the third and fourth exon of GAPLINC-202/TU21904 (green). (Color figure online)

4 Discussion

The stimulation of T lymphocytes with anti-CD3 antibodies changes the transcriptional landscape. In this study, we reanalyzed the data on anti-CD3 stimulated T cells [25] to unveil the lncRNA transcriptome. We remapped the reads and focused on the reconstruction of splicing isoforms. The number of annotated non-protein-coding transcripts observed in the cell transcriptome was close to that in the hg19 human genome assembly [12], suggesting good coverage of the known lncRNA repertoire. The focus of the research presented here, however, was not the complete lncRNA set but the differentially expressed transcriptome, with the aim of identifying the changes in genetic programming that follow antibody stimulation. Due to the limitations of our model system, we are not able to pinpoint specific T cell subpopulations. Nevertheless, considering the DELs observed in the whole T cell mixture in a PBMC context, we speculate that a particular T cell population is imposed (or polarized), either by expansion or differentiation.

LincRNA expression seems to be more cell type-specific than that of protein-coding genes [4]. We therefore inspected the ncRNA transcriptome for lincRNA DELs, which could support discrete changes in cell differentiation and explain the emergence of a suppressive phenotype. Among the annotated lincRNAs, only three (AC017002.1, LINC00339, LINC01132) were consistently activated by the three anti-CD3 molecules. AC017002.1 was associated with memory CD4 cells, and its upregulation may reflect the expansion of memory cells, a reported effect of commercial anti-CD3 [26].

The antibody molecule format has a significant impact on the expression profile. OKT3 is by far the most effective in regulating lncRNAs. Its chimeric format FvFcM and the humanized antibody fragment FvFcR nevertheless display a similar, although generally less intense, response. OKT3, a mouse mAb, induces a much stronger mitogenic response than the FvFc format [24, 25]. Despite the differences in the regulated gene sets, several DELs are consistently regulated by all antibody molecules. Other regulated lncRNAs are antibody specific, such as the FvFcM-specific DELs: the chimeric antibody is the only one that significantly regulates TU13951. Variation in the antibody’s paratope and Fc component may affect the strength and quality of TCR signaling, and further engineering may improve the regulatory bias while reducing the inflammatory response.

5 Conclusion

We investigated the lncRNA transcriptome of T lymphocytes to uncover the changes induced by anti-CD3 immunotherapy, which could result in a suppressive phenotype. Several lncRNAs, including known lincRNAs, were found to be regulated, and their role in the reversal of inflammation may be associated with the induction of a regulatory phenotype.

The successful release of novel anti-CD3 therapeutics will renew interest in the suppressive and tolerogenic effects of these immune pharmaceuticals in humans. The selective immunoregulation observed in the clinic after anti-CD3 treatment may become the basis of novel therapies for autoimmune disease, and clinical-grade markers could support this development. The lncRNA data revealed in this work may contribute novel markers to follow immunoregulation in the whole T cell context and may also shed light on potential regulation by non-coding RNA. Association of anti-CD3 data may yield new disease markers and treatment targets among the differentially expressed lncRNAs. Beyond simplifying therapeutic monitoring, biologically relevant ncRNAs could become pharmacological targets for future therapies. As a result, a new generation of more powerful pharmaceuticals to suppress and control the immune response may emerge.