Course Outline
Introduction to Data Analysis and Big Data
- What Makes Big Data "Big"?
- Velocity, Volume, Variety, Veracity (VVVV)
- Limits to Traditional Data Processing
- Distributed Processing
- Statistical Analysis
- Types of Machine Learning Analysis
- Data Visualization
Big Data Roles and Responsibilities
- Administrators
- Developers
- Data Analysts
Languages Used for Data Analysis
- R Language
- Why R for Data Analysis?
- Data manipulation, calculation and graphical display
- Python
- Why Python for Data Analysis?
- Manipulating, processing, cleaning, and crunching data
Approaches to Data Analysis
- Statistical Analysis
- Time Series Analysis
- Forecasting with Correlation and Regression models
- Inferential Statistics (estimating)
- Descriptive Statistics in Big Data sets (e.g. calculating mean)
- Machine Learning
- Supervised vs unsupervised learning
- Classification and clustering
- Estimating cost of specific methods
- Filtering
- Natural Language Processing
- Processing text
- Understanding the meaning of the text
- Automatic text generation
- Sentiment analysis / topic analysis
- Computer Vision
- Acquiring, processing, analyzing, and understanding images
- Reconstructing, interpreting and understanding 3D scenes
- Using image data to make decisions
Big Data Infrastructure
- Data Storage
- Relational databases (SQL)
- MySQL
- PostgreSQL
- Oracle
- Non-relational databases (NoSQL)
- Cassandra
- MongoDB
- Neo4j
- Understanding the nuances
- Hierarchical databases
- Object-oriented databases
- Document-oriented databases
- Graph-oriented databases
- Other
- Distributed Processing
- Hadoop
- HDFS as a distributed filesystem
- MapReduce for distributed processing
- Spark
- All-in-one in-memory cluster computing framework for large-scale data processing
- Structured streaming
- Spark SQL
- Machine Learning libraries: MLlib
- Graph processing with GraphX
- Scalability
- Public cloud
- AWS, Google, Aliyun, etc.
- Private cloud
- OpenStack, Cloud Foundry, etc.
- Auto-scalability
Choosing the Right Solution for the Problem
The Future of Big Data
Summary and Next Steps
Requirements
- A general understanding of math
- A general understanding of programming
- A general understanding of databases
Audience
- Developers / programmers
- IT consultants
Custom Corporate Training
Training solutions designed exclusively for businesses.
- Customized Content: We adapt the syllabus and practical exercises to the real goals and needs of your project.
- Flexible Schedule: Dates and times adapted to your team's agenda.
- Format: Online (live), In-company (at your offices), or Hybrid.
Price per private group, online live training, starting from 8000 € + VAT*
Contact us for an exact quote and to hear about our latest promotions
Testimonials (7)
How big data work, data programs, greater knowledge of how our current world works using data
Ozayr Hussain - Vodacom
Course - A Practical Introduction to Data Analysis and Big Data
The practical side of the training.
Patrick - Vodacom PTy Ltd
Course - A Practical Introduction to Data Analysis and Big Data
Interactive topics and the style used by the lecturer to simplify the topics for the students
Miran Saeed - Sulaymaniyah Asayish Agency
Course - A Practical Introduction to Data Analysis and Big Data
the trainer and his ability to lecture
ibrahim hamakarim - Sulaymaniyah Asayish Agency
Course - A Practical Introduction to Data Analysis and Big Data
Practical exercises
JOEL CHIGADA - University of the Western Cape
Course - A Practical Introduction to Data Analysis and Big Data
R programming
Osden Jokonya - University of the Western Cape
Course - A Practical Introduction to Data Analysis and Big Data
Overall the Content was good.