Big Data Storage Solution - NoSQL Training Course
When traditional storage technologies can't handle the amount of data you need to store, there are hundreds of alternatives. This course guides participants through the alternatives for storing and analyzing Big Data and explains their pros and cons.
This course is mostly focused on discussion and presentation of solutions, though hands-on exercises are available on demand.
Course Outline
Limits of Traditional Technologies
- SQL databases
- Redundancy: replicas and clusters
- Constraints
- Speed
Overview of database types
- Object Databases
- Document Store
- Cloud Databases
- Wide Column Store
- Multidimensional Databases
- Multivalue Databases
- Streaming and Time Series Databases
- Multimodel Databases
- Graph Databases
- Key Value
- XML Databases
- Distributed file systems
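To make a few of the database types above concrete, here is a minimal sketch in plain Python (no database involved) of how the same record might be laid out under key-value, document, wide-column, and graph models. The sample record, field names, and the `find_by_city` helper are hypothetical illustrations, not any product's API.

```python
import json

# Hypothetical sample record used only for illustration.
user = {"id": "u1", "name": "Ada", "city": "Madrid", "follows": ["u2"]}

# Key-value: an opaque value looked up by a single key.
key_value = {f"user:{user['id']}": json.dumps(user)}

# Document store: the value is a structured document whose fields can be queried.
documents = {user["id"]: user}

# Wide-column: row key -> column family -> column name -> value.
wide_column = {
    user["id"]: {
        "profile": {"name": user["name"], "city": user["city"]},
        "social": {f"follows:{f}": "" for f in user["follows"]},
    }
}

# Graph: nodes plus explicit, typed edges between them.
graph = {
    "nodes": {user["id"]: {"name": user["name"]}},
    "edges": [(user["id"], "FOLLOWS", f) for f in user["follows"]],
}


def find_by_city(docs, city):
    """Filter documents on a field inside the value (hypothetical helper)."""
    return [d["id"] for d in docs.values() if d.get("city") == city]
```

In this sketch the key-value layout can only fetch the record by its key, while the document layout can also filter on fields such as `city`. Real systems differ widely in how they index and distribute such data.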
Popular NoSQL Databases
- MongoDB
- Cassandra
- Apache Hadoop
- Apache Spark
- other solutions
NewSQL
- Overview of available solutions
- Performance
- Inconsistencies
Document Storage/Search Optimized
- Solr/Lucene/Elasticsearch
- other solutions
Requirements
Good understanding of traditional data storage technologies (MySQL, Oracle, SQL Server, etc.)
Open Training Courses require 5+ participants.
Testimonials (4)
How the trainer shows his knowledge of the subject he's teaching.
john ernesto ii fernandez - Philippine AXA Life Insurance Corporation
Course - Data Vault: Building a Scalable Data Warehouse
During the exercises, James explained every step in more detail wherever I was getting stuck. I was completely new to NiFi. He explained the actual purpose of NiFi, even the basics such as open source. He covered every concept of NiFi, from beginner level to developer level.
Firdous Hashim Ali - MOD A BLOCK
Course - Apache NiFi for Administrators
That I had it in the first place.
Peter Scales - CACI Ltd
Course - Apache NiFi for Developers
practice tasks
Pawel Kozikowski - GE Medical Systems Polska Sp. Zoo
Course - Python and Spark for Big Data (PySpark)
Related Courses
Unified Batch and Stream Processing with Apache Beam
14 Hours
Apache Beam is an open source, unified programming model for defining and executing parallel data processing pipelines. Its power lies in its ability to run both batch and streaming pipelines, with execution being carried out by one of Beam's supported distributed processing back-ends: Apache Apex, Apache Flink, Apache Spark, and Google Cloud Dataflow. Apache Beam is useful for ETL (Extract, Transform, and Load) tasks such as moving data between different storage media and data sources, transforming data into a more desirable format, and loading data onto a new system.
In this instructor-led, live training (onsite or remote), participants will learn how to implement the Apache Beam SDKs in a Java or Python application that defines a data processing pipeline for decomposing a big data set into smaller chunks for independent, parallel processing.
By the end of this training, participants will be able to:
- Install and configure Apache Beam.
- Use a single programming model to carry out both batch and stream processing from within their Java or Python application.
- Execute pipelines across multiple environments.
Format of the Course
- Part lecture, part discussion, exercises and heavy hands-on practice
Note
- This course will be available in Scala in the future. Please contact us to arrange.
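The Apache Beam course description above mentions decomposing a big data set into smaller chunks for independent, parallel processing. The sketch below illustrates only that general idea using the Python standard library. It does not use the Apache Beam SDK, and the function names (`chunked`, `process_chunk`, `run_pipeline`) and the sum-of-squares workload are assumptions chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_chunk(chunk):
    """Per-chunk work that needs no other chunk: a partial sum of squares."""
    return sum(x * x for x in chunk)


def run_pipeline(items, size=3, workers=4):
    """Process each chunk independently, then combine the partial results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(process_chunk, chunked(items, size)))
    return sum(partials)
```

Because each chunk is processed without reference to the others, the same structure could in principle be spread across many machines, which is the job a distributed back-end takes on.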
NoSQL Database with Microsoft Azure Cosmos DB
14 Hours
This instructor-led, live training in Spain (online or onsite) is aimed at database administrators or developers who wish to use Microsoft Azure Cosmos DB to develop and manage highly responsive and low latency applications.
By the end of this training, participants will be able to:
- Provision the necessary Cosmos DB resources to start building databases and applications.
- Scale application performance and storage by utilizing APIs in Cosmos DB.
- Manage database operations and reduce cost by optimizing Cosmos DB resources.
Data Vault: Building a Scalable Data Warehouse
28 Hours
In this instructor-led, live training in Spain, participants will learn how to build a Data Vault.
By the end of this training, participants will be able to:
- Understand the architecture and design concepts behind Data Vault 2.0, and its interaction with Big Data, NoSQL and AI.
- Use data vaulting techniques to enable auditing, tracing, and inspection of historical data in a data warehouse.
- Develop a consistent and repeatable ETL (Extract, Transform, Load) process.
- Build and deploy highly scalable and repeatable warehouses.
Apache Flink Fundamentals
28 Hours
This instructor-led, live training in Spain (online or onsite) introduces the principles and approaches behind distributed stream and batch data processing, and walks participants through the creation of a real-time, data streaming application in Apache Flink.
By the end of this training, participants will be able to:
- Set up an environment for developing data analysis applications.
- Understand how Apache Flink's graph-processing library (Gelly) works.
- Package, execute, and monitor Flink-based, fault-tolerant, data streaming applications.
- Manage diverse workloads.
- Perform advanced analytics.
- Set up a multi-node Flink cluster.
- Measure and optimize performance.
- Integrate Flink with different Big Data systems.
- Compare Flink capabilities with those of other big data processing frameworks.
Introduction to Graph Computing
28 Hours
In this instructor-led, live training in Spain, participants will learn about the technology offerings and implementation approaches for processing graph data. The aim is to identify real-world objects, their characteristics and relationships, then model these relationships and process them as data using a Graph Computing (also known as Graph Analytics) approach. We start with a broad overview and narrow in on specific tools as we step through a series of case studies, hands-on exercises and live deployments.
By the end of this training, participants will be able to:
- Understand how graph data is persisted and traversed.
- Select the best framework for a given task (from graph databases to batch processing frameworks).
- Implement Hadoop, Spark, GraphX and Pregel to carry out graph computing across many machines in parallel.
- View real-world big data problems in terms of graphs, processes and traversals.
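As a small illustration of what "persisted and traversed" can mean in the graph computing course above, the following sketch stores a graph as an adjacency list and walks it breadth-first. The sample graph and the `bfs_order` function are hypothetical. Graph databases and frameworks such as GraphX use their own storage formats and APIs.

```python
from collections import deque

# Hypothetical directed graph stored as an adjacency list.
adjacency = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}


def bfs_order(adj, start):
    """Return nodes reachable from `start` in breadth-first visiting order."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in adj.get(node, []):
            if neighbor not in seen:  # visit each node only once
                seen.add(neighbor)
                queue.append(neighbor)
    return order
```

Calling `bfs_order(adjacency, "a")` visits `a` first, then its direct neighbors, then nodes two hops away.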
Confluent KSQL
7 Hours
This instructor-led, live training in Spain (online or onsite) is aimed at developers who wish to implement Apache Kafka stream processing without writing code.
By the end of this training, participants will be able to:
- Install and configure Confluent KSQL.
- Set up a stream processing pipeline using only SQL commands (no Java or Python coding).
- Carry out data filtering, transformations, aggregations, joins, windowing, and sessionization entirely in SQL.
- Design and deploy interactive, continuous queries for streaming ETL and real-time analytics.
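Sessionization, mentioned in the KSQL objectives above, groups events into sessions separated by periods of inactivity. The sketch below shows the idea on a small in-memory batch in plain Python. It is not KSQL and does not process a live stream, and the `sessionize` function, the event tuples, and the gap value are assumptions for illustration.

```python
from collections import defaultdict


def sessionize(events, gap):
    """Group (user, timestamp) events into per-user sessions.

    A new session starts when the time since the user's previous
    event exceeds `gap` (same units as the timestamps).
    """
    by_user = defaultdict(list)
    for user, ts in events:
        by_user[user].append(ts)

    sessions = {}
    for user, stamps in by_user.items():
        stamps.sort()
        user_sessions = []
        current = [stamps[0]]
        for ts in stamps[1:]:
            if ts - current[-1] > gap:
                user_sessions.append(current)  # inactivity gap closes the session
                current = [ts]
            else:
                current.append(ts)
        user_sessions.append(current)
        sessions[user] = user_sessions
    return sessions
```

For example, with a gap of 30, events for one user at times 1, 5, and 40 fall into two sessions, because 35 time units pass between the second and third event.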
Apache NiFi for Administrators
21 Hours
In this instructor-led, live training in Spain (onsite or remote), participants will learn how to deploy and manage Apache NiFi in a live lab environment.
By the end of this training, participants will be able to:
- Install and configure Apache NiFi.
- Source, transform and manage data from disparate, distributed data sources, including databases and big data lakes.
- Automate dataflows.
- Enable streaming analytics.
- Apply various approaches for data ingestion.
- Transform Big Data into business insights.
Apache NiFi for Developers
7 Hours
In this instructor-led, live training in Spain, participants will learn the fundamentals of flow-based programming as they develop a number of demo extensions, components and processors using Apache NiFi.
By the end of this training, participants will be able to:
- Understand NiFi's architecture and dataflow concepts.
- Develop extensions using NiFi and third-party APIs.
- Develop their own custom Apache NiFi processor.
- Ingest and process real-time data from disparate and uncommon file formats and data sources.
Python and Spark for Big Data (PySpark)
21 Hours
In this instructor-led, live training in Spain, participants will learn how to use Python and Spark together to analyze big data as they work on hands-on exercises.
By the end of this training, participants will be able to:
- Use Spark with Python to analyze Big Data.
- Work on exercises that mimic real world cases.
- Use different tools and techniques for big data analysis using PySpark.
Spark Streaming with Python and Kafka
7 Hours
This instructor-led, live training in Spain (online or onsite) is aimed at data engineers, data scientists, and programmers who wish to use Spark Streaming features in processing and analyzing real-time data.
By the end of this training, participants will be able to use Spark Streaming to process live data streams for use in databases, filesystems, and live dashboards.