IPLUSO 23559

Analysis and Treatment of Multivariate Data

Computer Applications for Data Science
  • Presentation
    The curricular unit "Analysis and Treatment of Multivariate Data" is a core component of the professional technical course in Computer Applications for Data Science. It focuses on the study and manipulation of data sets with multiple variables and on the interrelationships between those variables. Its area of expertise covers advanced statistical techniques, machine learning algorithms and visualization methods for multivariate data. It addresses both the underlying theory and its practical application, using modern software and tools for processing multivariate data. Understanding and processing multivariate data is a central pillar of data science: it allows students to extract deeper insights and build more accurate predictive models from complex data sets.
  • Programme
    Introduction to Multivariate Data: basic concepts, types and structures of multivariate data.
    Exploratory Data Analysis (EDA): multivariate data visualization, outlier detection and statistical description.
    Correlation and Causality: differences, calculation methods and implications.
    Dimensionality Reduction: techniques such as Principal Component Analysis (PCA) and t-SNE.
    Clustering and Segmentation: algorithms such as K-means and DBSCAN.
    Multivariate Classification: introduction to models such as Multinomial Logistic Regression and Support Vector Machines.
    Validation and Interpretation of Models: evaluation methods, metrics and interpretation of results.
    Practical Applications: case studies and projects in specific domains, using tools such as R, Python and their libraries.
  • Objectives
    Knowledge: Students will acquire a deep understanding of the nature and complexity of multivariate data and of the statistical techniques and algorithms used to analyse it.
    Skills: Students will be able to perform exploratory analyses of multivariate data, identifying patterns, correlations and anomalies. They will also learn to apply dimensionality reduction methods, such as PCA and t-SNE, and clustering and classification techniques.
    Competences: Students will be proficient in the tools and software used to process multivariate data. They will be able to communicate the results of their analyses effectively, turning complex data into actionable insights and data-driven solutions to real-world problems. Overall, they will be able to make informed decisions based on the analysis of multivariate data sets and contribute to a data science team.
  • Bibliography
    Hair, J. F., Black, W. C., Babin, B. J., & Anderson, R. E. (2018). Multivariate Data Analysis (8th ed.). Cengage Learning.
    James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning: with Applications in R. Springer.
    McKinney, W. (2017). Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython (2nd ed.). O'Reilly Media.
  • Methodology
    Project-Based Learning (PBL): promotes practical application by having students work on real data sets, proposing and implementing solutions to concrete problems.
    Interactive Coding Platforms: uses tools such as Jupyter Notebooks or RStudio Cloud to support experimentation and real-time data visualization.
    Peer-to-Peer Discussion Groups: fosters the exchange of ideas and collaboration, so that students learn from each other and share different perspectives.
  • Language
    Portuguese
  • Type
    Semester
  • ECTS
    5
  • Nature
    Mandatory
  • Internship
    No