TIEE3045 Syllabus - Big Data Analytics - 2022 Regulation Anna University
TIEE3045 | BIG DATA ANALYTICS | L T P C | 2 0 2 3
COURSE OBJECTIVES:
• To understand big data.
• To learn and use NoSQL big data management.
• To learn MapReduce analytics using Hadoop and related tools.
• To work with MapReduce applications.
• To understand the usage of Hadoop-related tools for big data analytics.
UNIT I | UNDERSTANDING BIG DATA | 5
Introduction to big data – convergence of key trends – unstructured data – industry examples of big data – web analytics – big data applications – big data technologies – introduction to Hadoop – open source technologies – cloud and big data – mobile business intelligence – crowdsourcing analytics – inter- and trans-firewall analytics.
UNIT II | NOSQL DATA MANAGEMENT | 7
Introduction to NoSQL – aggregate data models – key-value and document data models – relationships – graph databases – schemaless databases – materialized views – distribution models – master-slave replication – consistency – Cassandra – Cassandra data model – Cassandra examples – Cassandra clients.
UNIT III | MAP REDUCE APPLICATIONS | 6
MapReduce workflows – unit tests with MRUnit – test data and local tests – anatomy of a MapReduce job run – classic MapReduce – YARN – failures in classic MapReduce and YARN – job scheduling – shuffle and sort – task execution – MapReduce types – input formats – output formats.
UNIT IV | BASICS OF HADOOP | 6
Data format – analyzing data with Hadoop – scaling out – Hadoop Streaming – Hadoop Pipes – design of Hadoop distributed file system (HDFS) – HDFS concepts – Java interface – data flow – Hadoop I/O – data integrity – compression – serialization – Avro – file-based data structures – Cassandra – Hadoop integration.
UNIT V | HADOOP RELATED TOOLS | 6
HBase – data model and implementations – HBase clients – HBase examples – praxis. Pig – Grunt – Pig data model – Pig Latin – developing and testing Pig Latin scripts. Hive – data types and file formats – HiveQL data definition – HiveQL data manipulation – HiveQL queries.
TOTAL: 60 PERIODS
COURSE OUTCOMES: After the completion of this course, students will be able to:
• Describe big data and use cases from selected business domains.
• Explain NoSQL big data management.
• Install, configure, and run Hadoop and HDFS.
• Perform map-reduce analytics using Hadoop.
• Use Hadoop-related tools such as HBase, Cassandra, Pig, and Hive for big data analytics.
TEXT BOOKS:
1. Michael Minelli, Michelle Chambers, and Ambiga Dhiraj, "Big Data, Big Analytics: Emerging Business Intelligence and Analytic Trends for Today's Businesses", Wiley, 2013.
2. Eric Sammer, "Hadoop Operations", O'Reilly, 2012.
3. Pramod J. Sadalage, "NoSQL Distilled", 2013.
REFERENCES:
1. E. Capriolo, D. Wampler, and J. Rutherglen, "Programming Hive", O'Reilly, 2012.
2. Lars George, "HBase: The Definitive Guide", O'Reilly, 2011.
3. Eben Hewitt, "Cassandra: The Definitive Guide", O'Reilly, 2010.
4. Alan Gates, "Programming Pig", O'Reilly, 2011.