OCS353 Syllabus - Data Science Fundamentals - 2021 Regulation - Open Elective | Anna University
OCS353 | DATA SCIENCE FUNDAMENTALS | L T P C: 2 0 2 3
COURSE OBJECTIVES:
● Familiarize students with the data science process.
● Understand the data manipulation functions in NumPy and Pandas.
● Explore different types of machine learning approaches.
● Understand and practice visualization techniques using tools.
● Learn to handle large volumes of data with case studies.
UNIT I | INTRODUCTION | 6
Data Science: Benefits and uses – Facets of data – Data Science Process: Overview – Defining research goals – Retrieving data – Data preparation – Exploratory data analysis – Building the model – Presenting findings and building applications – Data Mining – Data Warehousing – Basic statistical descriptions of data.
UNIT II | DATA MANIPULATION | 9
Python Shell – Jupyter Notebook – IPython Magic Commands – NumPy Arrays – Universal Functions – Aggregations – Computation on Arrays – Fancy Indexing – Sorting arrays – Structured data – Data manipulation with Pandas – Data Indexing and Selection – Handling missing data – Hierarchical indexing – Combining datasets – Aggregation and Grouping – String operations – Working with time series – High performance
UNIT III | MACHINE LEARNING | 5
The modeling process – Types of machine learning – Supervised learning – Unsupervised learning – Semi-supervised learning – Classification, regression – Clustering – Outliers and Outlier Analysis
UNIT IV | DATA VISUALIZATION | 5
Importing Matplotlib – Simple line plots – Simple scatter plots – Visualizing errors – Density and contour plots – Histograms – Legends – Colors – Subplots – Text and annotation – Customization – Three-dimensional plotting – Geographic Data with Basemap – Visualization with Seaborn
UNIT V | HANDLING LARGE DATA | 5
Problems – Techniques for handling large volumes of data – Programming tips for dealing with large data sets – Case studies: Predicting malicious URLs, Building a recommender system – Tools and techniques needed – Research question – Data preparation – Model building – Presentation and automation.
TOTAL: 60 PERIODS
COURSE OUTCOMES: At the end of this course, the students will be able to:
CO1: Gain knowledge of the data science process.
CO2: Perform data manipulation using NumPy and Pandas functions.
CO3: Understand different types of machine learning approaches.
CO4: Perform data visualization using tools.
CO5: Handle large volumes of data in practical scenarios.
TEXT BOOKS:
1. David Cielen, Arno D. B. Meysman, and Mohamed Ali, “Introducing Data Science”, Manning Publications, 2016.
2. Jake VanderPlas, “Python Data Science Handbook”, O’Reilly, 2016.
REFERENCES:
1. Robert S. Witte and John S. Witte, “Statistics”, Eleventh Edition, Wiley Publications, 2017.
2. Allen B. Downey, “Think Stats: Exploratory Data Analysis in Python”, Green Tea Press, 2014.