The rise of analytics
- Illustrate how the rise of data analytics capabilities is enabled by data availability, advances in computing power, new algorithms, and maturing analytics processes.
- Distinguish between descriptive, predictive, and prescriptive analytics.
- Explore the Python and Jupyter analytics ecosystem (exercise).
Data preparation and exploration
- Describe variables, relationships between variables, and underlying structure in the data (e.g., clusters).
- Prepare and explore data sets using appropriate descriptive and visual techniques.
- Explain the role of exploratory data analysis in business decision-making.
Analytical data architecture
- Outline the architecture of data warehouses.
- Explain dimensional modeling concepts and common schemata.
- Design fact and dimension schemata from business questions and operational data.
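As an illustration of the fact and dimension design named above, the following minimal sketch builds a tiny star schema with Python's built-in `sqlite3` module and answers one business question ("What is the revenue per product category?"). All table names, column names, and rows are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical star schema: one fact table referencing two dimension tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    quantity INTEGER,
    revenue REAL
);
""")

# Made-up example rows (not real data).
conn.executemany(
    "INSERT INTO dim_product VALUES (?, ?, ?)",
    [(1, "Espresso", "Coffee"), (2, "Latte", "Coffee"), (3, "Green tea", "Tea")],
)
conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?, ?)",
    [(1, 2024, 1), (2, 2024, 2)],
)
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
    [(1, 1, 10, 25.0), (2, 1, 4, 14.0), (3, 2, 6, 12.0), (1, 2, 8, 20.0)],
)


def revenue_by_category(connection):
    """Join the fact table to the product dimension and aggregate revenue."""
    rows = connection.execute("""
        SELECT p.category, SUM(f.revenue)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
    """).fetchall()
    return dict(rows)


result = revenue_by_category(conn)
print(result)
```

The measures (quantity, revenue) live in the fact table, while the descriptive attributes used for grouping and filtering live in the dimensions. This is one possible layout, not a prescribed one.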
Regression I
- Explain the stages of a model-based analytics workflow using linear regression as an example.
- Interpret linear regression models, including coefficients, OLS estimation, and model evaluation.
- Describe how regression models are implemented in Python.
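To make the OLS estimation step concrete, here is a minimal sketch of simple linear regression using the closed-form least-squares formulas in plain Python. The variable names (`ad_spend`, `sales`) and the numbers are made up for illustration. In practice a library would normally be used, so this shows only the underlying arithmetic.

```python
def fit_simple_ols(x, y):
    """Return (intercept, slope) minimizing the sum of squared residuals."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope = covariance of x and y divided by variance of x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical data: advertising spend and resulting sales.
ad_spend = [1, 2, 3, 4, 5]
sales = [3.1, 4.9, 7.2, 8.8, 11.0]

intercept, slope = fit_simple_ols(ad_spend, sales)
r2 = r_squared(ad_spend, sales, intercept, slope)
print(f"intercept={intercept:.2f}, slope={slope:.2f}, R^2={r2:.3f}")
```

The slope is read as the expected change in `sales` for a one-unit increase in `ad_spend`, and R² serves as a simple model-evaluation measure.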
Regression II
- Explain and assess multiple regression models, including refinement strategies and key assumptions.
- Distinguish between inference- and prediction-oriented modeling and their implications for business contexts.
- Explain logistic regression as a probabilistic classification model and its role as a bridge to machine learning.
Machine learning I
- Distinguish between supervised and unsupervised machine learning approaches and explain the generalization problem in supervised machine learning.
- Describe the workflow of supervised machine learning, including feature engineering, train–test splitting, model training, cross-validation, and evaluation.
- Assess the performance of machine learning models using the confusion matrix and metrics such as precision, recall, and F1 score.
- Connect conceptual machine learning procedures to Python implementations, including preprocessing, model training, and evaluation using scikit-learn (exercise).
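The evaluation metrics listed above can be sketched in a few lines of plain Python. The labels and predictions below are invented for illustration, and the helper names are hypothetical. scikit-learn offers equivalent functions in `sklearn.metrics`, but this sketch avoids the dependency to show how the numbers arise from the confusion matrix.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, false negatives, true negatives."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Derive precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical test-set labels and model predictions (1 = positive class).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
print(tp, fp, fn, tn, precision, recall, f1)
```

Precision asks how many predicted positives were correct, recall asks how many actual positives were found, and F1 is their harmonic mean.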
Machine learning II
- Compare selected supervised machine learning algorithms with respect to their assumptions, flexibility, interpretability, and typical application scenarios.
- Explain the mechanics of selected algorithms, including distance-based classification (k-NN), recursive partitioning (decision trees), and layered weighted transformations (neural networks).
- Select and justify an appropriate method for a given predictive task, taking into account data characteristics, performance metrics, and trade-offs between interpretability and predictive power.
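As one example of the mechanics named above, distance-based classification with k-NN can be sketched as follows. This is a minimal illustration using Euclidean distance and majority voting. The data points and labels are hypothetical, and a library implementation would add refinements such as feature scaling and efficient neighbor search.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical two-feature training data forming two well-separated groups.
train_points = [(1, 1), (1, 2), (2, 1), (6, 6), (7, 6), (6, 7)]
train_labels = ["low", "low", "low", "high", "high", "high"]

print(knn_predict(train_points, train_labels, (2, 2)))
print(knn_predict(train_points, train_labels, (6, 5)))
```

k-NN makes no assumption about the functional form of the relationship. Its predictions depend on the distance measure and on the choice of `k`.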
Big data I
- Explain the characteristics of big data and their implications for analytics workflows.
- Assess key challenges of big data, including scalability, heterogeneity, and data quality.
- Outline distributed data architectures and their role in large-scale analytics.
Big data II
- Explain the characteristics and challenges of unstructured data sources (e.g., text, social media, spatial data).
- Acquire and preprocess large-scale or unstructured data using APIs and appropriate tools in Python.
- Apply large language models or other scalable methods to extract, classify, or summarize unstructured data.
- Evaluate the reliability, bias, and limitations of API- and LLM-based analytics workflows.
Analytics in organizations
- Explain the strategic role of data analytics in data-driven organizations.
- Discuss the ethical and legal boundaries of data analytics in organizations.
Integration and exam preparation
- Summarize how the topics covered during the semester fit together.
- Resolve remaining questions for exam preparation.