IPLUSO 15430

Data Mining

Information Systems Management
  • Presentation
    The fundamental goal of the Data Mining course is to give students the skills to transform data into decision-supporting information in the context of large databases. Data Mining tools aim to identify future behaviors and trends, supporting a proactive, knowledge-based decision process. They can also answer business questions that have traditionally been computationally very hard to solve. The course therefore covers the themes and issues normally associated with the terms Data Mining and Knowledge Discovery. It presents the main methodological aspects of Data Mining, as well as the most important tools used. The practical component is a fundamental part of the course, so the ability to translate knowledge into practical actions and analysis decisions is particularly valued.
  • Programme
    1. Introduction to Data Mining (6h)
       - Fundamental concepts: what is Data Mining
       - Differences between Data Mining, Big Data, Business Intelligence and Machine Learning
       - The Knowledge Discovery in Databases (KDD) process
       - Real-world use cases in various areas (healthcare, retail, finance, etc.)
    2. Data Preparation and Exploration (6h)
       - Data types and data quality
       - Cleaning, transformation and normalization
       - Exploratory analysis: basic statistics, histograms, boxplots
       - Sampling techniques
    3. Data Exploration Techniques – Part I (21h)
       3.1 Classification and Regression (9h)
       3.2 Clustering (6h)
       3.3 Association Rules (6h)
    4. Model Validation and Evaluation (6h)
       - Training/test split, cross-validation
       - Overfitting and underfitting
       - Confusion matrix, ROC/AUC curves
    5. Tools and Workflows in Data Mining (6h)
       - Presentation of graphical tools: KNIME, RapidMiner, Orange
       - Creation of visual Data Mining pipelines
       - Integration with external data sources
  • Objectives
    This curricular unit addresses the process of knowledge discovery in databases and the most common Data Mining methodologies. Students are expected to understand the main Data Mining tasks, namely classification, forecasting, trend analysis (time series), clustering, summarization (and visualization), and association. The unit also covers a set of techniques commonly used to implement Data Mining, such as decision trees, association rules, linear regression, artificial neural networks, genetic algorithms, and Bayesian networks. Another important goal is the use of an online platform to apply the theoretical concepts.
  • Bibliography
    Han, J., Kamber, M., Pei, J. (2012). Data Mining - Concepts and Techniques. Elsevier.
    Gama, J., et al. (2017). Extração de Conhecimento de Dados – Data Mining. Edições Sílabo.
  • Methodology
    Use of digital analytics applications and platforms to support the learning process, such as:
    - Microsoft Power BI
    - SAS Viya for Learners
    - The Python language
    - KNIME, RapidMiner, Orange
  • Language
    Portuguese
  • Type
    Semester-long
  • ECTS
    5
  • Nature
    Mandatory
  • Internship
    No