Overview of the big data universe & cloud computing
Motivation, presentation and applications
The purpose of this course is to introduce and demonstrate the basics of the Big Data ecosystem and Cloud Computing, and their applications in data science. More than a coherent set of tools and frameworks, this ecosystem is above all a response to the need to process and exploit large and varied data sets. The cloud, in turn, facilitates access to this ecosystem.
The course lasts 3 days and includes 12 hours of introduction to the concepts of Big Data, Cloud and Data Science, and 6 hours of practical application with the cloud via tutorials, giving learners a better understanding of the essential tools. Time will be set aside to reflect on learners' own data.
The course is currently taught in French; resources are in English.
The terms Big Data, Data Science and Cloud are now unavoidable in many organizations. Although they originated in the Web sector, these topics now pervade all sectors, including aeronautics and defense. The purpose of this short introduction is to define, specify, illustrate and apply the concepts of the world of Big Data and the myriad of services associated with Cloud Computing. The relevance of these approaches will be assessed in the specific context of aeronautics and/or defense. Far from forming a unified whole, the tools and methods of this universe are actually based on a set of new computing practices that support the various digital transitions experienced by companies. What situations are specific to Big Data? What are the tools? What are they for? What is Cloud Computing? This course will try to answer all these questions and to demonstrate some of these tools through hands-on computer sessions. The main purpose of the course is to familiarize learners with the entire Big Data/Cloud/Data Science ecosystem and to give them the means to make decisions regarding further investment in its technologies.
Course level: Basic/Advanced
Engineering training or equivalent required; basic knowledge of computer systems (databases, software engineering, GNU/Linux system) is useful but not necessary.
Dimitri BETTEBGHOR : ONERA
Day 1 BIG DATA AND DATABASES
- Big Data: sources, needs and challenges
- The Age of Big Data
- The emergence of Big Data tools
- Comparison with Traditional Data Organization (Data Warehouse vs. Data Lake)
- The promise of data value
- Structured data vs. Unstructured data
- Big Data in an organization
- Systems Design in the Age of Big Data
- Data as a link between different industries
- Impact on the organization
- Legal Aspects of Data (GDPR)
- Introduction to databases
- Relational databases
- CAP theorem
- NoSQL movement
- Different NoSQL solutions
- Key-Value Stores (e.g. Redis)
- Column-Oriented Databases (e.g. Cassandra)
- Document-Oriented Databases (e.g. MongoDB)
- Graph-Oriented Databases (e.g. Neo4j)
- Hands-on practice with Python/Java
Day 2 BIG DATA AND ARCHITECTURE
- The Hadoop/HDFS framework
- Resilience, fault tolerance
- Distribution and monitoring of processes
- MapReduce Pattern
- Hadoop components
- Streaming and hybrid architectures
- Lambda architecture
- Kappa architecture
- Big Data Tools and Machine Learning
- Overview of Apache Spark
- Hands-on practice with Spark (Python or Java) via Databricks
Day 3 CLOUD COMPUTING AND DEPLOYMENT
- Fundamentals of the Cloud
- IaaS vs. PaaS vs. SaaS
- Governance and security
- Overview of cloud offerings
- Azure, GCP, AWS
- Sovereign clouds?
- Modern deployment
- Containerization vs. Virtualization
- Orchestration with Kubernetes
- Exploration of Docker and a cloud provider
Scheduled in French:
PARIS: 11 to 13 April 2022
For an English-language session, please contact us.
€1 510 excluding tax (20% VAT)