IPLUSO 23556

Storage for Big Data

Computer Applications for Data Science
  • Presentation
    The curricular unit "Storage for Big Data" is a key part of the professional technical course in Computer Applications for Data Science. It teaches the principles, technologies, and strategies involved in storing the large volumes of data generated in Big Data environments. The unit focuses on advanced storage techniques for Big Data, scalable solutions, and cloud storage, and also covers data management strategies such as compression and partitioning. It works with current tools and frameworks that are central to data analysis. Given the growing weight of data in strategic decisions across industries, this unit is essential to the study cycle: it prepares students to become specialists who can extract valuable insights from raw data and turn them into impactful solutions.
  • Programme
    Big Data. Concepts and Terminology. Characteristics of Big Data. Different Types of Data. Business Motivations for Big Data Adoption. Big Data Planning. Big Data Analysis Lifecycle. Data Acquisition. Cloud. Privacy. Corporate Technologies. Online Transaction Processing (OLTP). Online Analytical Processing (OLAP). Extract, Transform, Load (ETL). Data Storage. Big Data Storage. Clusters. Hadoop HDFS. NoSQL. Sharding. Replication. CAP Theorem. ACID. BASE. Big Data Storage Technology. Disk Storage Devices. In-Memory Storage Devices. Integrated Project: Development of a data science project from start to finish, utilizing the acquired concepts and tools.
  • Objectives
    Knowledge: Students will deepen their understanding of storage technologies for big data, scalability and distribution techniques, data management strategies, and data security and privacy. Skills: Students will be able to manipulate large datasets, apply transformations and feature engineering, and optimize models for better performance in real-world environments. Competencies: Students will develop the ability to carry out complete data science projects, from data collection and cleaning to data storage and security, using modern tools and frameworks. They will be prepared to tackle complex challenges in the field of data storage, translating insights into strategic recommendations and data-driven solutions for organizations.
  • Bibliography
    Maribel Santos & Carlos Costa. Big Data: Concepts, Warehousing, and Analytics. River Publishers. 2020.
    Thomas Erl & Paul Buhler. Big Data Fundamentals: Concepts, Drivers & Techniques. Service Tech Press. 2016.
  • Methodology
    Practical Labs in Cloud Environments: Access to cloud platforms for real, scalable experimentation with storage and datasets. Interactive Peer Review: Students analyze and give feedback on each other's projects, promoting mutual learning. Immersion Journeys: Intensive sessions in which real company problems are presented to students to solve in real time. Project-Based Learning: Development of projects that cover the entire data science lifecycle, from acquisition to presentation of insights.
  • Language
    Portuguese
  • Type
    Semester
  • ECTS
    5
  • Nature
    Mandatory
  • Internship
    No