Found 685 repositories(showing 30)
AlexMog
A small mmo in Java, the goal was to provide an efficient scripting-to-entity system. The final goal was to let the players create all the game.
sanusanth
What is C++? C++ is a general-purpose, object-oriented programming language. It was created by Bjarne Stroustrup at Bell Labs circa 1980. C++ is very similar to C (invented by Dennis Ritchie in the early 1970s). C++ is so compatible with C that it will probably compile over 99% of C programs without changing a line of source code. Though C++ is a lot of well-structured and safer language than C as it OOPs based. Some computer languages are written for a specific purpose. Like, Java was initially devised to control toasters and some other electronics. C was developed for programming OS. Pascal was conceptualized to teach proper programming techniques. But C++ is a general-purpose language. It well deserves the widely acknowledged nickname "Swiss Pocket Knife of Languages." C++ is a cross-platform language that can be used to create high-performance applications. C++ was developed by Bjarne Stroustrup, as an extension to the C language. C++ gives programmers a high level of control over system resources and memory. The language was updated 3 major times in 2011, 2014, and 2017 to C++11, C++14, and C++17. About C++ Programming Multi-paradigm Language - C++ supports at least seven different styles of programming. Developers can choose any of the styles. General Purpose Language - You can use C++ to develop games, desktop apps, operating systems, and so on. Speed - Like C programming, the performance of optimized C++ code is exceptional. Object-oriented - C++ allows you to divide complex problems into smaller sets by using objects. Why Learn C++? C++ is used to develop games, desktop apps, operating systems, browsers, and so on because of its performance. After learning C++, it will be much easier to learn other programming languages like Java, Python, etc. C++ helps you to understand the internal architecture of a computer, how computer stores and retrieves information. How to learn C++? C++ tutorial from Programiz - We provide step by step C++ tutorials, examples, and references. Get started with C++. Official C++ documentation - Might be hard to follow and understand for beginners. Visit official C++ documentation. Write a lot of C++ programming code- The only way you can learn programming is by writing a lot of code. Read C++ code- Join Github's open-source projects and read other people's code. C++ best programming language? The answer depends on perspective and requirements. Some tasks can be done in C++, though not very quickly. For example, designing GUI screens for applications. Other languages like Visual Basic, Python have GUI design elements built into them. Therefore, they are better suited for GUI type of task. Some of the scripting languages that provide extra programmability to applications. Such as MS Word and even photoshop tend to be variants of Basic, not C++. C++ is still used widely, and the most famous software have their backbone in C++. This tutorial will help you learn C++ basic and the advanced concepts. Who uses C++? Some of today's most visible used systems have their critical parts written in C++. Examples are Amadeus (airline ticketing) Bloomberg (financial formation), Amazon (Web commerce), Google (Web search) Facebook (social media) Many programming languages depend on C++'s performance and reliability in their implementation. Examples include: Java Virtual Machines JavaScript interpreters (e.g., Google's V8) Browsers (e.g., Internet Explorer, Mozilla's Firefox, Apple's Safari, and Google's Chrome) Application and Web frameworks (e.g., Microsoft's .NET Web services framework). Applications that involve local and wide area networks, user interaction, numeric, graphics, and database access highly depend on C++ language. Why Use C++ C++ is one of the world's most popular programming languages. C++ can be found in today's operating systems, Graphical User Interfaces, and embedded systems. C++ is an object-oriented programming language which gives a clear structure to programs and allows code to be reused, lowering development costs. C++ is portable and can be used to develop applications that can be adapted to multiple platforms. C++ is fun and easy to learn! As C++ is close to C# and Java, it makes it easy for programmers to switch to C++ or vice versa Definition - What does C++ Programming Language mean? C++ is an object oriented computer language created by notable computer scientist Bjorne Stroustrop as part of the evolution of the C family of languages. Some call C++ “C with classes” because it introduces object oriented programming principles, including the use of defined classes, to the C programming language framework. C++ is pronounced "see-plus-plus." C++ Variables Variables are the backbone of any programming language. A variable is merely a way to store some information for later use. We can retrieve this value or data by referring to a "word" that will describe this information. Once declared and defined they may be used many times within the scope in which they were declared. C++ Control Structures When a program runs, the code is read by the compiler line by line (from top to bottom, and for the most part left to right). This is known as "code flow." When the code is being read from top to bottom, it may encounter a point where it needs to make a decision. Based on the decision, the program may jump to a different part of the code. It may even make the compiler re-run a specific piece again, or just skip a bunch of code. You could think of this process like if you were to choose from different courses from Guru99. You decide, click a link and skip a few pages. In the same way, a computer program has a set of strict rules to decide the flow of program execution. C++ Syntax The syntax is a layout of words, expression, and symbols. Well, it's because an email address has its well-defined syntax. You need some combination of letters, numbers, potentially with underscores (_) or periods (.) in between, followed by an at the rate (@) symbol, followed by some website domain (company.com). So, syntax in a programming language is much the same. They are some well-defined set of rules that allow you to create some piece of well-functioning software. But, if you don't abide by the rules of a programming language or syntax, you'll get errors. C++ Tools In the real world, a tool is something (usually a physical object) that helps you to get a certain job done promptly. Well, this holds true with the programming world too. A tool in programming is some piece of software which when used with the code allows you to program faster. There are probably tens of thousands, if not millions of different tools across all the programming languages. Most crucial tool, considered by many, is an IDE, an Integrated Development Environment. An IDE is a software which will make your coding life so much easier. IDEs ensure that your files and folders are organized and give you a nice and clean way to view them. Types of C++ Errors Another way to look at C++ in a practical sense is to start enumerating different kinds of errors that occur as the written code makes its way to final execution. First, there are syntax errors where the code is actually written in an illegible way. This can be a misuse of punctuation, or the misspelling of a function command or anything else that compromises the integrity of the syntax as it is written. Another fundamental type of error is a compiler error that simply tells the programmer the compiler was not able to do its work effectively. As a compiler language, C++ relies on the compiler to make the source code into machine readable code and optimize it in various ways. A third type of error happens after the program has been successfully compiled. Runtime errors are not uncommon in C++ executables. What they represent is some lack of designated resource or non-working command in the executable program. In other words, the syntax is right, and the program was compiled successfully, but as the program is doing its work, it encounters a problem, whether that has to do with interdependencies, operating system requirements or anything else in the general environment in which the program is trying to work. Over time, C++ has remained a very useful language not only in computer programming itself, but in teaching new programmers about how object oriented programming works.
TautvydasZilys
This project demonstrates how to use IL2CPP scripting backend when you want final UWP application to use a C# project.
scijava
Mirror of the final state of the java.net scripting project (http://java.net/projects/Scripting)
needle-mirror
[Mirrored from UPM, not affiliated with Unity Technologies.] 📦 Visual scripting is a workflow that uses visual, node-based graphs to design behaviors rather than write lines of C# script.Enabling artists, designers and programmers alike, visual scripting can be used to design final logic, quickly create prototypes, iterate on gameplay and create custom nodes to help streamline collaboration.Visual scripting is compatible with third-party APIs, including most packages, assets and custom libraries.
What is C++? C++ is a general-purpose, object-oriented programming language. It was created by Bjarne Stroustrup at Bell Labs circa 1980. C++ is very similar to C (invented by Dennis Ritchie in the early 1970s). C++ is so compatible with C that it will probably compile over 99% of C programs without changing a line of source code. Though C++ is a lot of well-structured and safer language than C as it OOPs based. Some computer languages are written for a specific purpose. Like, Java was initially devised to control toasters and some other electronics. C was developed for programming OS. Pascal was conceptualized to teach proper programming techniques. But C++ is a general-purpose language. It well deserves the widely acknowledged nickname "Swiss Pocket Knife of Languages." C++ is a cross-platform language that can be used to create high-performance applications. C++ was developed by Bjarne Stroustrup, as an extension to the C language. C++ gives programmers a high level of control over system resources and memory. The language was updated 3 major times in 2011, 2014, and 2017 to C++11, C++14, and C++17. About C++ Programming Multi-paradigm Language - C++ supports at least seven different styles of programming. Developers can choose any of the styles. General Purpose Language - You can use C++ to develop games, desktop apps, operating systems, and so on. Speed - Like C programming, the performance of optimized C++ code is exceptional. Object-oriented - C++ allows you to divide complex problems into smaller sets by using objects. Why Learn C++? C++ is used to develop games, desktop apps, operating systems, browsers, and so on because of its performance. After learning C++, it will be much easier to learn other programming languages like Java, Python, etc. C++ helps you to understand the internal architecture of a computer, how computer stores and retrieves information. How to learn C++? C++ tutorial from Programiz - We provide step by step C++ tutorials, examples, and references. Get started with C++. Official C++ documentation - Might be hard to follow and understand for beginners. Visit official C++ documentation. Write a lot of C++ programming code- The only way you can learn programming is by writing a lot of code. Read C++ code- Join Github's open-source projects and read other people's code. C++ best programming language? T he answer depends on perspective and requirements. Some tasks can be done in C++, though not very quickly. For example, designing GUI screens for applications. Other languages like Visual Basic, Python have GUI design elements built into them. Therefore, they are better suited for GUI type of task. Some of the scripting languages that provide extra programmability to applications. Such as MS Word and even photoshop tend to be variants of Basic, not C++. C++ is still used widely, and the most famous software have their backbone in C++. This tutorial will help you learn C++ basic and the advanced concepts. Who uses C++? Some of today's most visible used systems have their critical parts written in C++. Examples are Amadeus (airline ticketing) Bloomberg (financial formation), Amazon (Web commerce), Google (Web search) Facebook (social media) Many programming languages depend on C++'s performance and reliability in their implementation. Examples include: Java Virtual Machines JavaScript interpreters (e.g., Google's V8) Browsers (e.g., Internet Explorer, Mozilla's Firefox, Apple's Safari, and Google's Chrome) Application and Web frameworks (e.g., Microsoft's .NET Web services framework). Applications that involve local and wide area networks, user interaction, numeric, graphics, and database access highly depend on C++ language. Why Use C++ C++ is one of the world's most popular programming languages. C++ can be found in today's operating systems, Graphical User Interfaces, and embedded systems. C++ is an object-oriented programming language which gives a clear structure to programs and allows code to be reused, lowering development costs. C++ is portable and can be used to develop applications that can be adapted to multiple platforms. C++ is fun and easy to learn! As C++ is close to C# and Java, it makes it easy for programmers to switch to C++ or vice versa Definition - What does C++ Programming Language mean? C++ is an object oriented computer language created by notable computer scientist Bjorne Stroustrop as part of the evolution of the C family of languages. Some call C++ “C with classes” because it introduces object oriented programming principles, including the use of defined classes, to the C programming language framework. C++ is pronounced "see-plus-plus." C++ Variables Variables are the backbone of any programming language. A variable is merely a way to store some information for later use. We can retrieve this value or data by referring to a "word" that will describe this information. Once declared and defined they may be used many times within the scope in which they were declared. C++ Control Structures When a program runs, the code is read by the compiler line by line (from top to bottom, and for the most part left to right). This is known as "code flow." When the code is being read from top to bottom, it may encounter a point where it needs to make a decision. Based on the decision, the program may jump to a different part of the code. It may even make the compiler re-run a specific piece again, or just skip a bunch of code. You could think of this process like if you were to choose from different courses from Guru99. You decide, click a link and skip a few pages. In the same way, a computer program has a set of strict rules to decide the flow of program execution. C++ Syntax The syntax is a layout of words, expression, and symbols. Well, it's because an email address has its well-defined syntax. You need some combination of letters, numbers, potentially with underscores (_) or periods (.) in between, followed by an at the rate (@) symbol, followed by some website domain (company.com). So, syntax in a programming language is much the same. They are some well-defined set of rules that allow you to create some piece of well-functioning software. But, if you don't abide by the rules of a programming language or syntax, you'll get errors. C++ Tools In the real world, a tool is something (usually a physical object) that helps you to get a certain job done promptly. Well, this holds true with the programming world too. A tool in programming is some piece of software which when used with the code allows you to program faster. There are probably tens of thousands, if not millions of different tools across all the programming languages. Most crucial tool, considered by many, is an IDE, an Integrated Development Environment. An IDE is a software which will make your coding life so much easier. IDEs ensure that your files and folders are organized and give you a nice and clean way to view them. Types of C++ Errors Another way to look at C++ in a practical sense is to start enumerating different kinds of errors that occur as the written code makes its way to final execution. First, there are syntax errors where the code is actually written in an illegible way. This can be a misuse of punctuation, or the misspelling of a function command or anything else that compromises the integrity of the syntax as it is written. Another fundamental type of error is a compiler error that simply tells the programmer the compiler was not able to do its work effectively. As a compiler language, C++ relies on the compiler to make the source code into machine readable code and optimize it in various ways. A third type of error happens after the program has been successfully compiled. Runtime errors are not uncommon in C++ executables. What they represent is some lack of designated resource or non-working command in the executable program. In other words, the syntax is right, and the program was compiled successfully, but as the program is doing its work, it encounters a problem, whether that has to do with interdependencies, operating system requirements or anything else in the general environment in which the program is trying to work. Over time, C++ has remained a very useful language not only in computer programming itself, but in teaching new programmers about how object oriented programming works.
HasanMothaffar
Final project for the operating systems (1) course at Damascus University, which covers bash scripting.
ShaikhInaam
The Web Application Security Scanner is a web based application developed in Java-EE which checks vulnerabilities of other websites or web applications by getting website link in order to apply security injections on them like SQL Injection, Cross Side Scripting (XSS), Buffer Overflow and some other attacks as well. In this app, if a user can make his/her account, app will give him/her an API key, with this, a user can use our web services and app will give him/her a jar file of java classes which we have made along with research documents so user can continue further more research on this security project if he/she wants.This project is my final year project (FYP) of B.S (Software Engineering)
Besika40k
final project for the subject Scripting languages
allen813
This repository contains coursework and experiments from the Programming Fundamentals Course (CH0–CH7), including the Final Project: Snake Game (C++). Topics covered include C programming, Linux scripting, and cross-platform development using Visual Studio and Shell/Makefile tools.
Electronick79
Syllabus – COMP 596 Web Application Development Class Meeting Time & Location: Wednesday 4-6:40 Website: https://github.com/ http://blackboard.bridgew.edu/ Instructor: Dr. Jung (sjung@bridgew.edu) DMF339 Office Hours: 10-11 Tuesdays and Thursdays, 2-3:30 Wednesdays, or by appointment ___________________________________________________________________________________________ BSU Catalog Course Description: (http://catalog.bridgew.edu ) COMP 596 - Topics in Computer Science Prerequisite: Admission to the MS program in Computer Science or consent of instructor 3 credits, this course is elective in the category 3 Software Development https://www.bridgew.edu/graduate/computer-science-ms In this course, topics are chosen from program verification, formal semantics, formal language theory, concurrent programming, complexity or algorithms, programming language theory, graphics and other computer science topics. Repeatable for different topics. Offered as topics arise. This course covers web technologies for modern web application development; designing, developing, publishing websites on the World Wide Web; databases, client and server-side scripting, security and privacy issues. Course Outcomes (Learning objectives) At the end of the course, students should be able to • Describe the fundamentals of web development • Design and implement a web application with a variety of programming techniques • Demonstrate the ability to run and mange a web application Major Topics Covered in the Course • Client-side development • Server-side development • Working with Databases • Web Server Administration Instructional Methodologies This course will combine traditional lecturing with hands-on assignments that reinforce the lecture material. In particular, lectures will focus on concepts and ideas while the assignments will provide concrete experience and skills. Students will also have a term project, which allows them to apply what they learned from the lectures to interesting questions. Learning requires work. Research indicates that we learn through practice, teaching others, and discussion that corrects misconceptions. Assignments: • Further instructions will be given in each assignment. • Must be submitted on the due date. Assignments submitted late will be given a grade of 0. • Extensions will be handled on an individual basis, but typically reserved for special cases. The earlier you talk to me, the better your chances of receiving one. Do not expect to get an extension by requesting on the assignment due day. No extensions will be allowed after due. • All of the assignment scores will be entered into the grading formula. • No make-up assignments, no extra assignments for this class Exams: Make-ups for exams will be given ONLY to those students who (1) contact me BEFORE the exam with a reasonable excuse (documented by a note from a physician, court summons, etc.). All of these exam scores will be entered into the grading formula. You will fail the course (Letter grade F) if you miss any exam. Syllabus – COMP 596 Web Application Development Attendance Policy: Show up on time. Participate regularly. Attendance is not part of your grading. However, if you miss a lecture, it is your responsibility to learn what you missed. Excessive unexcused absences (more than 5 times) will result in failure for the course (letter grade F). If you arrive late to class, it is your responsibility to correct me on your attendance at the end of the class. Grade Components and Weights (TENTATIVE): Assignments 25% Exams 40% Group project & presentation 35% (Prototyping, Progress presentation, Final presentation & code) Accommodations https://studentbridgew.sharepoint.com/sites/SAS Based on legal requirements outlined in the Americans with Disabilities Act (ADA) and Section 504 of the Rehabilitation Act of 1973, all student-requests for reasonable accommodation must be processed through Student Accessibility Services (SAS). Instructor should NOT accommodate a student without official SAS paperwork. Mask up policy: Effective August 23, BSU mandates that all members of our campus community (including vaccinated students, employees, and visitors) wear a mask in indoor public places, including faculty offices. If a student fails to mask up even after instructor has requested, the faculty member may dismiss the student from class, and upon refusal of the student to leave, the faculty may dismiss the class and report the student to the student conduct officer. Faculty will also receive prior notice of students in their classes who may have a mask exemption which will be issued through the disability resource office. Academic Dishonesty: http://catalog.bridgew.edu/content.php?catoid=7&navoid=486 (BSU academic policy) All assignments in this course are considered take-home programming tests. You must do your own work, entirely. • You are permitted to discuss the meaning of assignments, general approaches and strategies. • You may not share code, pseudo-code, or documentation of any kind. • You may not use code from any other source, including the Internet. • Penalty (generally) o 1st strike: zero score (including the student who helped the other students) o 2nd strike: F in the class In-Class Conduct: Any sort of disruptive behavior in class will not be tolerated and dealt with according to the university policy. https://catalog.bridgew.edu/content.php?catoid=11&navoid=996#Academic_Integrity_and_Classroom_Conduct If disruptive behavior occurs, whether in the classroom or another academic environment, a faculty member has the right to remove the student from the classroom setting. Privacy Policy BSU privacy policy includes that you MUST use BSU account. It is students’ responsibilities to check BSU account. I will not respond any emails from outside BSU account
What is C++? C++ is a general-purpose, object-oriented programming language. It was created by Bjarne Stroustrup at Bell Labs circa 1980. C++ is very similar to C (invented by Dennis Ritchie in the early 1970s). C++ is so compatible with C that it will probably compile over 99% of C programs without changing a line of source code. Though C++ is a lot of well-structured and safer language than C as it OOPs based. Some computer languages are written for a specific purpose. Like, Java was initially devised to control toasters and some other electronics. C was developed for programming OS. Pascal was conceptualized to teach proper programming techniques. But C++ is a general-purpose language. It well deserves the widely acknowledged nickname "Swiss Pocket Knife of Languages." C++ is a cross-platform language that can be used to create high-performance applications. C++ was developed by Bjarne Stroustrup, as an extension to the C language. C++ gives programmers a high level of control over system resources and memory. The language was updated 3 major times in 2011, 2014, and 2017 to C++11, C++14, and C++17. About C++ Programming Multi-paradigm Language - C++ supports at least seven different styles of programming. Developers can choose any of the styles. General Purpose Language - You can use C++ to develop games, desktop apps, operating systems, and so on. Speed - Like C programming, the performance of optimized C++ code is exceptional. Object-oriented - C++ allows you to divide complex problems into smaller sets by using objects. Why Learn C++? C++ is used to develop games, desktop apps, operating systems, browsers, and so on because of its performance. After learning C++, it will be much easier to learn other programming languages like Java, Python, etc. C++ helps you to understand the internal architecture of a computer, how computer stores and retrieves information. How to learn C++? C++ tutorial from Programiz - We provide step by step C++ tutorials, examples, and references. Get started with C++. Official C++ documentation - Might be hard to follow and understand for beginners. Visit official C++ documentation. Write a lot of C++ programming code- The only way you can learn programming is by writing a lot of code. Read C++ code- Join Github's open-source projects and read other people's code. C++ best programming language? The answer depends on perspective and requirements. Some tasks can be done in C++, though not very quickly. For example, designing GUI screens for applications. Other languages like Visual Basic, Python have GUI design elements built into them. Therefore, they are better suited for GUI type of task. Some of the scripting languages that provide extra programmability to applications. Such as MS Word and even photoshop tend to be variants of Basic, not C++. C++ is still used widely, and the most famous software have their backbone in C++. This tutorial will help you learn C++ basic and the advanced concepts. Who uses C++? Some of today's most visible used systems have their critical parts written in C++. Examples are Amadeus (airline ticketing) Bloomberg (financial formation), Amazon (Web commerce), Google (Web search) Facebook (social media) Many programming languages depend on C++'s performance and reliability in their implementation. Examples include: Java Virtual Machines JavaScript interpreters (e.g., Google's V8) Browsers (e.g., Internet Explorer, Mozilla's Firefox, Apple's Safari, and Google's Chrome) Application and Web frameworks (e.g., Microsoft's .NET Web services framework). Applications that involve local and wide area networks, user interaction, numeric, graphics, and database access highly depend on C++ language. Why Use C++ C++ is one of the world's most popular programming languages. C++ can be found in today's operating systems, Graphical User Interfaces, and embedded systems. C++ is an object-oriented programming language which gives a clear structure to programs and allows code to be reused, lowering development costs. C++ is portable and can be used to develop applications that can be adapted to multiple platforms. C++ is fun and easy to learn! As C++ is close to C# and Java, it makes it easy for programmers to switch to C++ or vice versa Definition - What does C++ Programming Language mean? C++ is an object oriented computer language created by notable computer scientist Bjorne Stroustrop as part of the evolution of the C family of languages. Some call C++ “C with classes” because it introduces object oriented programming principles, including the use of defined classes, to the C programming language framework. C++ is pronounced "see-plus-plus." C++ Variables Variables are the backbone of any programming language. A variable is merely a way to store some information for later use. We can retrieve this value or data by referring to a "word" that will describe this information. Once declared and defined they may be used many times within the scope in which they were declared. C++ Control Structures When a program runs, the code is read by the compiler line by line (from top to bottom, and for the most part left to right). This is known as "code flow." When the code is being read from top to bottom, it may encounter a point where it needs to make a decision. Based on the decision, the program may jump to a different part of the code. It may even make the compiler re-run a specific piece again, or just skip a bunch of code. You could think of this process like if you were to choose from different courses from Guru99. You decide, click a link and skip a few pages. In the same way, a computer program has a set of strict rules to decide the flow of program execution. C++ Syntax The syntax is a layout of words, expression, and symbols. Well, it's because an email address has its well-defined syntax. You need some combination of letters, numbers, potentially with underscores (_) or periods (.) in between, followed by an at the rate (@) symbol, followed by some website domain (company.com). So, syntax in a programming language is much the same. They are some well-defined set of rules that allow you to create some piece of well-functioning software. But, if you don't abide by the rules of a programming language or syntax, you'll get errors. C++ Tools In the real world, a tool is something (usually a physical object) that helps you to get a certain job done promptly. Well, this holds true with the programming world too. A tool in programming is some piece of software which when used with the code allows you to program faster. There are probably tens of thousands, if not millions of different tools across all the programming languages. Most crucial tool, considered by many, is an IDE, an Integrated Development Environment. An IDE is a software which will make your coding life so much easier. IDEs ensure that your files and folders are organized and give you a nice and clean way to view them. Types of C++ Errors Another way to look at C++ in a practical sense is to start enumerating different kinds of errors that occur as the written code makes its way to final execution. First, there are syntax errors where the code is actually written in an illegible way. This can be a misuse of punctuation, or the misspelling of a function command or anything else that compromises the integrity of the syntax as it is written. Another fundamental type of error is a compiler error that simply tells the programmer the compiler was not able to do its work effectively. As a compiler language, C++ relies on the compiler to make the source code into machine readable code and optimize it in various ways. A third type of error happens after the program has been successfully compiled. Runtime errors are not uncommon in C++ executables. What they represent is some lack of designated resource or non-working command in the executable program. In other words, the syntax is right, and the program was compiled successfully, but as the program is doing its work, it encounters a problem, whether that has to do with interdependencies, operating system requirements or anything else in the general environment in which the program is trying to work. Over time, C++ has remained a very useful language not only in computer programming itself, but in teaching new programmers about how object oriented programming works.
AdekoyaOlatolokikiAyomide
How to share data with a statistician This is a guide for anyone who needs to share data with a statistician or data scientist. The target audiences I have in mind are: Collaborators who need statisticians or data scientists to analyze data for them Students or postdocs in various disciplines looking for consulting advice Junior statistics students whose job it is to collate/clean/wrangle data sets The goals of this guide are to provide some instruction on the best way to share data to avoid the most common pitfalls and sources of delay in the transition from data collection to data analysis. The Leek group works with a large number of collaborators and the number one source of variation in the speed to results is the status of the data when they arrive at the Leek group. Based on my conversations with other statisticians this is true nearly universally. My strong feeling is that statisticians should be able to handle the data in whatever state they arrive. It is important to see the raw data, understand the steps in the processing pipeline, and be able to incorporate hidden sources of variability in one's data analysis. On the other hand, for many data types, the processing steps are well documented and standardized. So the work of converting the data from raw form to directly analyzable form can be performed before calling on a statistician. This can dramatically speed the turnaround time, since the statistician doesn't have to work through all the pre-processing steps first. What you should deliver to the statistician To facilitate the most efficient and timely analysis this is the information you should pass to a statistician: The raw data. A tidy data set A code book describing each variable and its values in the tidy data set. An explicit and exact recipe you used to go from 1 -> 2,3 Let's look at each part of the data package you will transfer. The raw data It is critical that you include the rawest form of the data that you have access to. This ensures that data provenance can be maintained throughout the workflow. Here are some examples of the raw form of data: The strange binary file your measurement machine spits out The unformatted Excel file with 10 worksheets the company you contracted with sent you The complicated JSON data you got from scraping the Twitter API The hand-entered numbers you collected looking through a microscope You know the raw data are in the right format if you: Ran no software on the data Did not modify any of the data values You did not remove any data from the data set You did not summarize the data in any way If you made any modifications of the raw data it is not the raw form of the data. Reporting modified data as raw data is a very common way to slow down the analysis process, since the analyst will often have to do a forensic study of your data to figure out why the raw data looks weird. (Also imagine what would happen if new data arrived?) The tidy data set The general principles of tidy data are laid out by Hadley Wickham in this paper and this video. While both the paper and the video describe tidy data using R, the principles are more generally applicable: Each variable you measure should be in one column Each different observation of that variable should be in a different row There should be one table for each "kind" of variable If you have multiple tables, they should include a column in the table that allows them to be joined or merged While these are the hard and fast rules, there are a number of other things that will make your data set much easier to handle. First is to include a row at the top of each data table/spreadsheet that contains full row names. So if you measured age at diagnosis for patients, you would head that column with the name AgeAtDiagnosis instead of something like ADx or another abbreviation that may be hard for another person to understand. Here is an example of how this would work from genomics. Suppose that for 20 people you have collected gene expression measurements with RNA-sequencing. You have also collected demographic and clinical information about the patients including their age, treatment, and diagnosis. You would have one table/spreadsheet that contains the clinical/demographic information. It would have four columns (patient id, age, treatment, diagnosis) and 21 rows (a row with variable names, then one row for every patient). You would also have one spreadsheet for the summarized genomic data. Usually this type of data is summarized at the level of the number of counts per exon. Suppose you have 100,000 exons, then you would have a table/spreadsheet that had 21 rows (a row for gene names, and one row for each patient) and 100,001 columns (one row for patient ids and one row for each data type). If you are sharing your data with the collaborator in Excel, the tidy data should be in one Excel file per table. They should not have multiple worksheets, no macros should be applied to the data, and no columns/cells should be highlighted. Alternatively share the data in a CSV or TAB-delimited text file. (Beware however that reading CSV files into Excel can sometimes lead to non-reproducible handling of date and time variables.) The code book For almost any data set, the measurements you calculate will need to be described in more detail than you can or should sneak into the spreadsheet. The code book contains this information. At minimum it should contain: Information about the variables (including units!) in the data set not contained in the tidy data Information about the summary choices you made Information about the experimental study design you used In our genomics example, the analyst would want to know what the unit of measurement for each clinical/demographic variable is (age in years, treatment by name/dose, level of diagnosis and how heterogeneous). They would also want to know how you picked the exons you used for summarizing the genomic data (UCSC/Ensembl, etc.). They would also want to know any other information about how you did the data collection/study design. For example, are these the first 20 patients that walked into the clinic? Are they 20 highly selected patients by some characteristic like age? Are they randomized to treatments? A common format for this document is a Word file. There should be a section called "Study design" that has a thorough description of how you collected the data. There is a section called "Code book" that describes each variable and its units. How to code variables When you put variables into a spreadsheet there are several main categories you will run into depending on their data type: Continuous Ordinal Categorical Missing Censored Continuous variables are anything measured on a quantitative scale that could be any fractional number. An example would be something like weight measured in kg. Ordinal data are data that have a fixed, small (< 100) number of levels but are ordered. This could be for example survey responses where the choices are: poor, fair, good. Categorical data are data where there are multiple categories, but they aren't ordered. One example would be sex: male or female. This coding is attractive because it is self-documenting. Missing data are data that are unobserved and you don't know the mechanism. You should code missing values as NA. Censored data are data where you know the missingness mechanism on some level. Common examples are a measurement being below a detection limit or a patient being lost to follow-up. They should also be coded as NA when you don't have the data. But you should also add a new column to your tidy data called, "VariableNameCensored" which should have values of TRUE if censored and FALSE if not. In the code book you should explain why those values are missing. It is absolutely critical to report to the analyst if there is a reason you know about that some of the data are missing. You should also not impute/make up/ throw away missing observations. In general, try to avoid coding categorical or ordinal variables as numbers. When you enter the value for sex in the tidy data, it should be "male" or "female". The ordinal values in the data set should be "poor", "fair", and "good" not 1, 2 ,3. This will avoid potential mixups about which direction effects go and will help identify coding errors. Always encode every piece of information about your observations using text. For example, if you are storing data in Excel and use a form of colored text or cell background formatting to indicate information about an observation ("red variable entries were observed in experiment 1.") then this information will not be exported (and will be lost!) when the data is exported as raw text. Every piece of data should be encoded as actual text that can be exported. The instruction list/script You may have heard this before, but reproducibility is a big deal in computational science. That means, when you submit your paper, the reviewers and the rest of the world should be able to exactly replicate the analyses from raw data all the way to final results. If you are trying to be efficient, you will likely perform some summarization/data analysis steps before the data can be considered tidy. The ideal thing for you to do when performing summarization is to create a computer script (in R, Python, or something else) that takes the raw data as input and produces the tidy data you are sharing as output. You can try running your script a couple of times and see if the code produces the same output. In many cases, the person who collected the data has incentive to make it tidy for a statistician to speed the process of collaboration. They may not know how to code in a scripting language. In that case, what you should provide the statistician is something called pseudocode. It should look something like: Step 1 - take the raw file, run version 3.1.2 of summarize software with parameters a=1, b=2, c=3 Step 2 - run the software separately for each sample Step 3 - take column three of outputfile.out for each sample and that is the corresponding row in the output data set You should also include information about which system (Mac/Windows/Linux) you used the software on and whether you tried it more than once to confirm it gave the same results. Ideally, you will run this by a fellow student/labmate to confirm that they can obtain the same output file you did. What you should expect from the analyst When you turn over a properly tidied data set it dramatically decreases the workload on the statistician. So hopefully they will get back to you much sooner. But most careful statisticians will check your recipe, ask questions about steps you performed, and try to confirm that they can obtain the same tidy data that you did with, at minimum, spot checks. You should then expect from the statistician: An analysis script that performs each of the analyses (not just instructions) The exact computer code they used to run the analysis All output files/figures they generated. This is the information you will use in the supplement to establish reproducibility and precision of your results. Each of the steps in the analysis should be clearly explained and you should ask questions when you don't understand what the analyst did. It is the responsibility of both the statistician and the scientist to understand the statistical analysis. You may not be able to perform the exact analyses without the statistician's code, but you should be able to explain why the statistician performed each step to a labmate/your principal investigator. Contributors Jeff Leek - Wrote the initial version. L. Collado-Torres - Fixed typos, added links. Nick Reich - Added tips on storing data as text. Nick Horton - Minor wording suggestions.
AlejandraIbarboA
Final scripting
D13Karo
We are a group of 4 who is creating a site markup which is similar to shopping websites
GriffinGaKill
Griffin, Lee, Joey, Maddie
otooto101
No description available
des1-gner
No description available
noah-tah
No description available
Greg-RASTKLAN
No description available
Svanidze1
No description available
Octovalve
Final del curso de Scripting
narso0
Final project from my frontend web dev class
tmmorgan
Art & Scripting IV Final
John-Ortiz
Final Project Game Scripting
jeffreyally
No description available
korshunad
Three pages: Javascript concepts, jQuery concepts, jQuery drag'n'drop and fancybox plugins
Lmaslanka75
No description available
TelcoSYS
Lenguages de Scripting TP Final
frankonyemaorji
No description available