Found 359 repositories (showing 30)
RNA vaccines have become a key tool in moving forward through the challenges raised both in the current pandemic and in numerous other public health and medical challenges. With the rollout of vaccines for COVID-19, these synthetic mRNAs have become broadly distributed RNA species in numerous human populations. Despite their ubiquity, sequences are not always available for such RNAs. Standard methods facilitate such sequencing. In this note, we provide experimental sequence information for the RNA components of the initial Moderna (https://pubmed.ncbi.nlm.nih.gov/32756549/) and Pfizer/BioNTech (https://pubmed.ncbi.nlm.nih.gov/33301246/) COVID-19 vaccines, allowing a working assembly of the former and a confirmation of previously reported sequence information for the latter RNA. Sharing of sequence information for broadly used therapeutics has the benefit of allowing any researchers or clinicians using sequencing approaches to rapidly identify such sequences as therapeutic-derived rather than host or infectious in origin. For this work, RNAs were obtained as discards from the small portions of vaccine doses that remained in vials after immunization; such portions would have been required to be otherwise discarded and were analyzed under FDA authorization for research use. To obtain the small amounts of RNA needed for characterization, vaccine remnants were phenol-chloroform extracted using TRIzol Reagent (Invitrogen), with intactness assessed by Agilent 2100 Bioanalyzer before and after extraction. Although our analysis mainly focused on RNAs obtained as soon as possible following discard, we also analyzed samples which had been refrigerated (~4 ℃) for up to 42 days with and without the addition of EDTA. Interestingly a substantial fraction of the RNA remained intact in these preparations. 
We note that the formulation of the vaccines includes numerous key chemical components which are quite possibly unstable under these conditions, so these data certainly do not suggest that the vaccine as a biological agent is stable. But it is of interest that chemical stability of RNA itself is not sufficient to preclude eventual development of vaccines with a much less involved cold-chain storage and transportation. For further analysis, the initial RNAs were fragmented by heating to 94℃, primed with a random hexamer-tailed adaptor, amplified through a template-switch protocol (Takara SMARTer Stranded RNA-seq kit), and sequenced using a MiSeq instrument (Illumina) with paired-end, 78-per-end sequencing. As a reference material in specific assays, we included RNA of known concentration and sequence (from bacteriophage MS2). From these data, we obtained partial information on strandedness and a set of segments that could be used for assembly. This was particularly useful for the Moderna vaccine, for which the original vaccine RNA sequence was not available at the time our study was carried out. Contigs encoding full-length spikes were assembled from the Moderna and Pfizer datasets. The Pfizer/BioNTech data [Figure 1] verified the reported sequence for that vaccine (https://berthub.eu/articles/posts/reverse-engineering-source-code-of-the-biontech-pfizer-vaccine/), while the Moderna sequence [Figure 2] could not be checked against a published reference. RNA preparations lacking dsRNA are desirable in generating vaccine formulations as these will minimize an otherwise dramatic biological (and nonspecific) response that vertebrates have to double-stranded character in RNA (https://www.nature.com/articles/nrd.2017.243). In the sequence data that we analyzed, we found that the vast majority of reads were from the expected sense strand. 
In addition, the minority of antisense reads appeared different from sense reads in lacking the characteristic extensions expected from the template switching protocol. Examining only the reads with an evident template switch (as an indicator for strand-of-origin), we observed that both vaccines overwhelmingly yielded sense reads (>99.99%). Independent sequencing assays and other experimental measurements are ongoing and will be needed to determine whether this template-switched sense read fraction in the SmarterSeq protocol indeed represents the actual dsRNA content in the original material. This work provides an initial assessment of two RNAs that are now a part of the human ecosystem and that are likely to appear in numerous other high throughput RNA-seq studies in which a fraction of the individuals may have previously been vaccinated. ProtoAcknowledgements: Thanks to our colleagues for help and suggestions (Nimit Jain, Emily Greenwald, Lamia Wahba, William Wang, Amisha Kumar, Sameer Sundrani, David Lipman, Bijoyita Roy). Figure 1: Spike-encoding contig assembled from BioNTech/Pfizer BNT-162b2 vaccine. Although the full coding region is included, the nature of the methodology used for sequencing and assembly is such that the assembled contig could lack some sequence from the ends of the RNA. Within the assembled sequence, this hypothetical sequence shows a perfect match to the corresponding sequence from documents available online derived from manufacturer communications with the World Health Organization [as reported by https://berthub.eu/articles/posts/reverse-engineering-source-code-of-the-biontech-pfizer-vaccine/]. The 5’ end for the assembly matches the start site noted in these documents, while the read-based assembly lacks an interrupted polyA tail (A30(GCATATGACT)A70) that is expected to be present in the mRNA.
moritzkoerber
A Covid-19 data pipeline on AWS featuring PySpark/Glue, Docker, Great Expectations, Airflow, and Redshift, templated in CloudFormation and CDK, deployable via GitHub Actions.
siddharth271101
The goal of this project is to analyse the impact of Covid-19 on the aviation industry through data engineering processes, using technologies such as Apache Airflow, Apache Spark, Tableau, and a couple of AWS services.
Lucas-Czarnecki
Cleaned daily reports and time series data from the 2019 Novel Coronavirus COVID-19 (2019-nCoV) Data Repository by Johns Hopkins University for Systems Science and Engineering (JHU CSSE).
ChristianConchari
This project was done as the final project for the artificial vision course IMT-344 of the Mechatronics Engineering program at the Universidad Catolica Boliviana San Pablo. Its main objective is to classify COVID-19 in chest X-ray images; it also classifies images of Viral Pneumonia and Lung Opacity. All data were collected from the COVID-19 Radiography Dataset on Kaggle.
ONSdigital
Data engineering pipeline for the household COVID-19 Infection Survey (CIS)
epi-center
Proposal for open science and data engineering around epidemiology, starting with COVID
datpham0412
Machine learning project aimed at predicting new COVID-19 cases using historical COVID-19 and mobility data. The project involves data fetching, migration, preprocessing, exploratory data analysis (EDA), feature engineering, data splitting, model training, and evaluation.
RecruiterRon
David Aplin Group, one of Canada's Best Managed Companies, has partnered with our client to recruit Junior Software Developers. New graduates or soon-to-graduate students are encouraged to apply! Our client is looking for Junior Software Developers to join their growing team. This position is responsible for the development, evaluation, implementation, and maintenance of new software solutions, including maintenance and development of existing applications. Applications involve data collection, data storage, machine learning, and data visualization. The Role: Designing, coding, and debugging software applications using front-end frameworks and enterprise applications - front-end, back-end, and full-stack development. Performing software analysis, code analysis, requirements analysis, software reviews, identification of code metrics, system risk analysis, software reliability analysis. Providing assistance with installations, system configuration, and third-party system integrations. Providing team members and clients with support and guidance. The Ideal Candidate: A Bachelor's degree or Diploma in Computer Science, Computer Engineering, Information Technology, or a similar field. Experience working with coding languages C#, JavaScript, Angular, React, Python, PHP jQuery, JSON, and Ajax. Solid understanding of web design and development principles. Good planning, analytical, and decision-making skills. A portfolio of web design, applications, and projects you have worked on including projects published on GitHub. Critical-thinking skills. In-depth knowledge of software prototyping and UX design tools. High personal code/development standards (peer testing, unit testing, documentation, etc). Team spirit and a sense of humour are always great. Goal-orientated and deadline-driven. COVID-19 considerations: All employees are currently working from home. Any equipment or materials required for work will be provided by the company via shipment to the employee's home. 
Company policy will continue to evolve through the COVID-19 pandemic and implement alternative working arrangements to ensure that all our people stay safe. If you are interested in this position and meet the above criteria, please send your resume in confidence directly to Jim Juacalla or Ron Cantiveros at Aplin Information Technology, A Division of David Aplin Group. We thank all applicants; however, only those selected for an interview will be contacted. Apply: https://jobs.aplin.com/job/409253/Junior-Software-Developers-New-Graduates
MahdisSep
A Data Engineering and Analysis project built with KNIME. Focuses on data cleansing, feature generation, aggregation, and visualization of public health datasets (e.g., COVID-19 testing and travel data) to derive key insights.
abdeslam272
No description available
Data Engineering Project on COVID-19 DataLake by AWS
skyprince999
No description available
tangybluff
Scalable data engineering pipeline built for clinical data. Uses Terraform, DLT, GCP (GCS + BigQuery), dbt, and Dagster to ingest, transform, and orchestrate COVID-19 patient data for analytics, visualization, and future ML applications.
ROSHANFAREED
End-to-end Azure Data Engineering pipeline using ADF, Databricks (PySpark), ADLS Gen2, Azure SQL, and Power BI for COVID-19 analytics
SHREYAS-SHETTY-KR
A comprehensive exploration of constructing an Extract, Transform, Load (ETL) pipeline in the AWS Cloud, with a focus on leveraging AWS COVID-19 dataset to showcase best practices in the field of data engineering.
D-Bits
A collection of COVID-19 data pipelines, built with Airflow.
Real-VeerSandhu
Data analysis on Covid-19 datasets (UofT Global Engineering Challenge)
CaRdiffR
R API to COVID-19 data from Johns Hopkins University Center for Systems Science and Engineering
lsbastos
Data analysis of COVID-19 using data provided by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE).
chamikamunithunga
This project analyzes the spread of COVID-19 using data from the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University. The application displays a time series of confirmed COVID-19 cases globally, visualized through a line plot.
AyanPoetsSinha
A data engineering solution using Azure Databricks and Azure Data Factory (ADF) for a real world problem of reporting Covid-19 trends and prediction of the spread of this virus.
This repository consists of a project to build an ETL pipeline for a data lake hosted on S3, using Spark to assess the impact of COVID-19 on the stock market. This project is for Udacity's Data Engineering Nanodegree.
sahsanaly
No description available
suryakesaram
No description available
wavemode
CSC 4110 Software Engineering project. Android app for tracking Covid-19 data by county in the United States.
Elysium33
Covid Data Engineering Project
EngKhallo
No description available
GabrielHenriqueCA
📊 COVID-19 Data Engineering project | AWS Data Lake with Medallion Architecture (Bronze→Silver→Gold) using Glue (PySpark), Athena, and Power BI | 1.1M records processed and analyzed
JinXuan-Wong
End-to-end real-time data engineering pipeline using Kafka, Spark, HDFS, MongoDB, and Neo4j for COVID-19 outbreak monitoring and analytics.