PyDS : Python for Data Science
Welcome to the First Edition of PyDS. The objective of this module is to provide the fundamental understanding of the Python programming language needed to follow an introductory course in Data Science.
You will start with the basics of Python programming, including Python data structures, functions and classes. We follow this with an introduction to Numerical Python (NumPy), and finally the course provides a basic introduction to linear regression from scratch.
Along the way, we will introduce foundational ideas of statistics, linear algebra and calculus.
At the end of this module, you will have the tools and the concepts needed to successfully undertake a rigorous course in machine learning.
This page introduces you to the team, the basic instructions, the schedule and various elements of our class.
Interested in joining?
[LINKS DON’T WORK AT THE MOMENT]
If you would like to apply to this course, please go here.
We also offer this course as part of our Masters and Accelerated programs. Check this link for more information and to apply.
The Team
NOTE: This course will be delivered by Dr. Pavlos Protopapas’ research group, StellarDNN.
To know more about StellarDNN, click here.
The Coursework
We have carefully designed the coursework to give you, the student, a well-rounded learning experience.
We will hold two 90-minute sessions every weekend for a total of five weeks.
Session - What to expect
Before the session begins, students are expected to complete a pre-class reading assignment and attempt a quiz based on it.
Each session follows the layout below, repeated three times:
- Approx. 15 minutes of live online instruction followed by approx. 15 minutes of Q&A + Quiz + Exercises.
- Sessions will help students develop the intuition for the core concepts, provide the necessary mathematical background, and provide guidance on technical details.
- Sessions will be accompanied by relevant examples to clarify key concepts and techniques.
After the session, students are expected to complete a short post-class quiz based on the principal concepts covered in class.
The Class
Welcome Session - Preparing for this class
Please check your mail for more information regarding the platform and the course.
High level course schedule
NOTE: All timings below are in IST.
Sessions:
- Saturday Series: 4:30 PM - 6:00 PM
- Sunday Series: 4:30 PM - 6:00 PM
Please find a more detailed course schedule here.
Course Topics
Basic Python: Data types, data structures, functions
Advanced Python: Python Classes
Probability & Statistics
Linear Algebra & Calculus
All exercises in this course will be done in Jupyter notebooks.
Note: Prior knowledge of programming is not necessary for this module.
Detailed Session-wise topics
Session 1:
- Introduction to Python
- Data types, iterators, Python operations
- Order of operations, logical operators
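As a rough preview of these Session 1 topics, here is a minimal sketch. The variable names and values are made up for illustration and are not taken from the course notebooks.

```python
# Data types: Python infers the type from the value.
count = 3            # int
ratio = 0.5          # float
label = "data"       # str
ready = True         # bool

# Order of operations: ** binds tighter than * and /, which bind tighter than + and -.
result = 2 + 3 * 4 ** 2      # evaluated as 2 + (3 * 16)
grouped = (2 + 3) * 4 ** 2   # parentheses are evaluated first

# Logical operators: `not` binds tightest, then `and`, then `or`.
check = ready and count > 2 or ratio > 1

# Iterators: iter() returns an object that next() advances one item at a time.
it = iter([10, 20, 30])
first = next(it)
second = next(it)
```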
Session 2:
- Python Data Structures - Lists, Dictionaries, Tuples
- List/dictionary comprehensions
- Enumeration
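The Session 2 topics might look something like the sketch below, which builds a list and a dictionary with comprehensions and numbers items with `enumerate`. The names and scores are hypothetical, chosen only for illustration.

```python
# Illustrative values only; not taken from the course notebooks.
scores = [72, 88, 95]            # list: ordered and mutable
names = ("ana", "bo", "cy")      # tuple: ordered and immutable

# List comprehension: build a new list from an existing one.
curved = [s + 5 for s in scores]

# Dictionary comprehension: map each name to its score.
by_name = {name: score for name, score in zip(names, scores)}

# enumerate yields (index, item) pairs; start=1 begins counting at 1.
ranked = [f"{i}: {name}" for i, name in enumerate(names, start=1)]
```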
Session 3:
- Python Functions - Arguments, keyword arguments, etc.
- Anonymous functions (lambda function)
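To give a flavour of the Session 3 topics, the sketch below defines a function with default (keyword) arguments and passes a lambda as a sort key. The function `describe` and its parameters are hypothetical examples, not course code.

```python
def describe(name, greeting="Hello", punctuation="!"):
    """Return a greeting; `greeting` and `punctuation` have default values."""
    return f"{greeting}, {name}{punctuation}"

# Positional call: both defaults are used.
positional = describe("Ada")

# Keyword arguments can be given in any order.
keyword = describe("Ada", punctuation="?", greeting="Hi")

# A lambda is a small anonymous function, often passed as an argument.
words = ["banana", "fig", "cherry"]
by_length = sorted(words, key=lambda w: len(w))
```

Because Python's sort is stable, words of equal length keep their original relative order.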
Session 4:
- Classes: Constructors vs. Instantiations
- Methods vs. Attributes
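One way to picture the Session 4 distinctions is the minimal sketch below. The `Circle` class is a hypothetical example and is not part of the course material.

```python
import math

class Circle:
    """Hypothetical example class used only for illustration."""

    def __init__(self, radius):
        # The constructor runs once for each new object and sets its attributes.
        self.radius = radius

    def area(self):
        # A method is a function defined on the class; it must be called.
        return math.pi * self.radius ** 2

# Instantiation: calling the class creates an object and invokes __init__.
small = Circle(1.0)
large = Circle(2.0)

r = large.radius   # attribute: data stored on the object, read without ()
a = large.area()   # method: behaviour, invoked with ()
```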
Session 5:
- Working with strings
- String formatting
- Reading & writing files
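The Session 5 topics could be sketched roughly as follows. The file name `scores.txt` and the values are assumptions made for illustration; the file is written to a temporary directory so nothing is left behind.

```python
import tempfile
from pathlib import Path

name, score = "Ada", 91.256

# String methods return new strings; the original is unchanged.
shout = name.upper()
parts = "a,b,c".split(",")

# f-strings: a format specifier follows the colon inside the braces.
# >5 right-aligns in a width of 5; .1f rounds to one decimal place.
line = f"{name:>5} scored {score:.1f}"

# Write a text file, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "scores.txt"  # hypothetical file name
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(line + "\n")
    with open(path, encoding="utf-8") as fh:
        contents = fh.read()
```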
Session 6:
- Debugging skills
- Exception handling
- Finding documentation
- Process of elimination
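As an illustration of the exception-handling and documentation topics in Session 6, here is a minimal sketch. The helper functions `safe_ratio` and `parse_age` are hypothetical and exist only for this example.

```python
def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or None if the division is undefined."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        # Catch only the error we expect; other exceptions still surface.
        return None

ok = safe_ratio(6, 3)
bad = safe_ratio(1, 0)

def parse_age(text):
    """Convert text to an int, raising a clearer error if that fails."""
    try:
        return int(text)
    except ValueError as err:
        # Re-raise with a more informative message, keeping the original cause.
        raise ValueError(f"age must be a whole number, got {text!r}") from err

# Finding documentation from within Python: functions carry their docstring.
doc = safe_ratio.__doc__
```

In a notebook, `help(safe_ratio)` prints the same docstring along with the function signature.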
Session 7: Stats & Probability
- Random Variable
- Probability Density Function
- Some ‘standard’ distributions and their mean/stdev (Normal, Binomial). Properties of mean and variance
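To make the mean and variance of a standard distribution concrete, the sketch below computes them for a Binomial distribution directly from their definitions. Note that the Binomial is discrete, so its analogue of a density is a probability mass function. The parameters `n = 10` and `p = 0.3` are arbitrary illustrative choices.

```python
from math import comb, sqrt

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.3  # illustrative parameters
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Mean: sum of k * P(X = k). Variance: sum of (k - mean)^2 * P(X = k).
mean = sum(k * pk for k, pk in enumerate(pmf))
variance = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
stdev = sqrt(variance)

# Closed forms for comparison: mean = n*p, variance = n*p*(1-p).
closed_mean = n * p
closed_variance = n * p * (1 - p)
```

Up to floating-point rounding, the values computed from the definitions agree with the closed forms.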
Session 8: NumPy
- Indexing / slicing
- Shape & reshape
- Zeros, ones, arbitrary array declaration
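A short sketch of these NumPy operations is given below, assuming NumPy is installed and imported under its conventional alias `np`. The array contents are arbitrary.

```python
import numpy as np

a = np.arange(12)          # 1-D array holding 0, 1, ..., 11
m = a.reshape(3, 4)        # same data arranged as 3 rows x 4 columns

first_row = m[0]           # indexing: row 0
last_col = m[:, -1]        # slicing: every row, last column
block = m[1:, :2]          # rows 1 onward, first two columns

z = np.zeros((2, 3))       # 2 x 3 array of 0.0
o = np.ones(4, dtype=int)  # length-4 array of integer 1s
custom = np.array([[1.5, 2.5], [3.5, 4.5]])  # array built from a nested list
```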
Session 9: Calculus and Linear Algebra
- Derivatives (including partial)
- Matrix Operations
- Matrix Multiplication
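The Session 9 ideas can be sketched in plain Python as below: a central-difference estimate of a partial derivative, and matrix multiplication on lists of rows. The helper names `partial_derivative` and `matmul`, the step size `h`, and the example function are assumptions made for illustration; in practice NumPy would normally handle the matrix work.

```python
def partial_derivative(f, point, index, h=1e-6):
    """Central-difference estimate of the partial derivative of f at `point`
    with respect to the argument at position `index`."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2 * h)

def f(x, y):
    # Illustrative function f(x, y) = x^2 * y, so df/dx = 2xy and df/dy = x^2.
    return x**2 * y

df_dx = partial_derivative(f, (3.0, 2.0), index=0)  # approximately 12
df_dy = partial_derivative(f, (3.0, 2.0), index=1)  # approximately 9

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    if len(A[0]) != len(B):
        raise ValueError("inner dimensions must match")
    # Entry (i, j) is the dot product of row i of A with column j of B.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
C = matmul(A, B)
```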
Diversity & Inclusion
We actively seek and welcome people of diverse identities, from across the spectrum of disciplines and methods, since Artificial Intelligence (AI) increasingly mediates our social, cultural, economic, and political interactions [1].
We believe in creating and maintaining an inclusive learning environment where all members feel safe, respected, and capable of producing their best work.
We commit to an experience for all participants that is free from harassment, bullying, and discrimination, which includes but is not limited to:
- Offensive comments related to age, race, religion, creed, color, gender (including transgender/gender identity/gender expression), sexual orientation, medical condition, physical or intellectual disability, pregnancy, national origin or ancestry.
- Intimidation, personal attacks, harassment, or unnecessary disruption of talks during any of the learning activities.
Reference:
[1] K. Stathoulopoulos and J. C. Mateos-Garcia, “Gender Diversity in AI Research,” SSRN Electronic Journal, 2019 [Online]. Available: http://dx.doi.org/10.2139/ssrn.3428240.
Logistics - What you need to begin?
We assume you have a Univ.AI account, created when you signed up at course.univ.ai. If not, email programs@univ.ai.
Education software we use
- Our lectures and labs are carried out via Zoom (install instructions).
- Quizzes & exercises will be conducted on the digital learning platform Ed.
- Occasionally, we may conduct in-class contests on Kaggle. Please register on Kaggle and familiarize yourself with it, if you haven’t already done so. This is a short video that will help you learn how to use Kaggle.
All exercises and homework in this course will be done in Jupyter notebooks. This link will help you set up JupyterLab and get acquainted with Jupyter notebooks.
Our module policies around collaboration and grading are listed here. Our expectations of you are also laid out in that document.
Parting Note
As you will learn in the course, programming for data science is not just about writing efficient code.
It requires proficiency in critical thinking, ideation & experimentation.
Keeping that in mind, you are advised to give your full active attention to every session.
We wish you well for the start of your data science journey.