The rise of analytics
- Illustrate how the rise of data analytics capabilities is enabled by data availability, advances in computing power, new algorithms, and maturing analytics processes.
- Distinguish between descriptive, predictive, and prescriptive analytics.
- Explore the Python and Jupyter analytics ecosystem (exercise).
Data preparation and exploration
- Describe variables, relationships between variables, and underlying structure in the data (e.g., clusters).
- Prepare and explore data sets using appropriate descriptive and visual techniques.
- Explain the role of exploratory data analysis in business decision-making.
Analytical data architecture
- Outline the architecture of data warehouses.
- Explain dimensional modeling concepts and common schemata.
- Design fact and dimension schemata from business questions and operational data.
Regression 1
- Explain the stages of a model-based analytics workflow using linear regression as an example.
- Interpret linear regression models: their coefficients, how ordinary least squares (OLS) estimates them, and how model fit is evaluated.
- Describe how regression models are implemented in Python.
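
As a minimal sketch of what OLS estimation looks like in Python, the example below fits a one-predictor regression with the closed-form solution and reports R² as a simple fit measure. The data values and the names `ad_spend`, `sales`, `fit_ols`, and `r_squared` are made up for illustration and are not taken from the course material.

```python
from statistics import mean


def fit_ols(x, y):
    """Closed-form OLS for one predictor (minimizes the sum of squared residuals)."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def r_squared(x, y, intercept, slope):
    """Share of the variance in y explained by the fitted line."""
    y_bar = mean(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Hypothetical toy data: advertising spend vs. sales.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 4.2, 5.9, 8.1, 9.9]

intercept, slope = fit_ols(ad_spend, sales)
r2 = r_squared(ad_spend, sales, intercept, slope)

# The slope reads as: expected change in sales per one-unit increase in spend.
print(f"intercept={intercept:.2f}, slope={slope:.2f}, R^2={r2:.3f}")
```

In practice a library such as statsmodels or scikit-learn would usually handle the estimation. The hand-written version only makes the link between the formula and the fitted coefficients visible.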
Regression 2
- Explain how business problems can be formulated as binary classification tasks and modeled using logistic regression.
- Explain how logistic regression combines a linear predictor with the sigmoid function to produce probabilities, how thresholds turn those probabilities into class labels, and how coefficients are interpreted as changes in log-odds.
- Evaluate classification models using confusion matrices and metrics such as accuracy, precision, recall, and F1 score.
- Apply model predictions to decision-making by selecting appropriate thresholds based on business costs and expected value.
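
The objectives above can be illustrated with a short sketch: a sigmoid applied to a linear predictor, a confusion matrix at a chosen threshold, the usual metrics, and a threshold chosen by expected value. All numbers are hypothetical: the labels, probabilities, coefficients, and the payoff values `value_tp`, `cost_fp`, and `cost_fn` are assumptions for illustration, and the helper names are not from any library.

```python
import math


def sigmoid(z):
    """Map a linear predictor (log-odds) to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(features, intercept, coefs):
    """Probability of the positive class from a linear predictor."""
    z = intercept + sum(c * f for c, f in zip(coefs, features))
    return sigmoid(z)


def confusion_counts(y_true, y_prob, threshold):
    """Return (tp, fp, tn, fn) when predicting 1 for probabilities >= threshold."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def expected_value(tp, fp, fn, value_tp, cost_fp, cost_fn):
    """Total payoff of acting on the predictions (true negatives assumed to be worth 0)."""
    return tp * value_tp - fp * cost_fp - fn * cost_fn


# Hypothetical one-feature model: a coefficient of 0.8 multiplies the odds by exp(0.8).
p_example = predict_proba([2.0], intercept=-1.0, coefs=[0.8])

# Hypothetical labels and predicted probabilities for eight customers.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.4, 0.65, 0.3, 0.2, 0.55, 0.8, 0.1]

tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold=0.5)
scores = metrics(tp, fp, tn, fn)

# Pick the candidate threshold with the highest expected value under assumed payoffs.
candidates = [0.25, 0.5, 0.75]
values = {}
for t in candidates:
    c_tp, c_fp, _c_tn, c_fn = confusion_counts(y_true, y_prob, t)
    values[t] = expected_value(c_tp, c_fp, c_fn, value_tp=100, cost_fp=20, cost_fn=50)
best_threshold = max(values, key=values.get)

print(scores, values, best_threshold)
```

With these assumed payoffs a missed positive costs more than a false alarm, so the lowest candidate threshold gives the highest expected value. A different cost structure would shift the choice.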
Machine learning 1
- Distinguish between supervised and unsupervised machine learning approaches and explain the generalization problem in supervised machine learning.
- Describe the workflow of supervised machine learning, including feature engineering, train–test splitting, model training, cross-validation, and evaluation.
- Connect conceptual machine learning procedures to Python implementations, including preprocessing, model training, and evaluation using scikit-learn (exercise).
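
One possible way to map the workflow above onto scikit-learn is sketched below: a train–test split, a preprocessing step and a model combined in a pipeline, cross-validation on the training data, and a final evaluation on the held-out test set. The synthetic data set, the choice of logistic regression, and all parameter values are assumptions for illustration and do not reproduce the course exercise.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data (stand-in for a real business data set).
X, y = make_classification(n_samples=500, n_features=8, random_state=0)

# Hold out 20% of the data; it is only used for the final evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# The pipeline refits the scaler inside each cross-validation fold,
# so no information from validation folds leaks into preprocessing.
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# 5-fold cross-validation on the training data estimates generalization performance.
cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy")

# Fit on the full training set, then evaluate once on the untouched test set.
model.fit(X_train, y_train)
test_accuracy = accuracy_score(y_test, model.predict(X_test))

print(f"CV accuracy: {cv_scores.mean():.3f}, test accuracy: {test_accuracy:.3f}")
```

Keeping preprocessing inside the pipeline is a design choice that ties the code back to the generalization problem: every step that learns from data is trained only on the training portion.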
Machine learning 2
- Explain the core mechanics of selected algorithms, including regularization in linear models, distance-based prediction (k-NN), recursive partitioning (decision trees), ensemble learning (random forests), and margin maximization (SVMs).
- Select and justify an appropriate modeling approach for a given predictive task, considering data characteristics, performance metrics, and the trade-off between interpretability and predictive power.
Big data 1
- Explain the characteristics of big data and their implications for analytics workflows.
- Compare data warehouse, data lake, and logical data warehouse architectures.
- Describe the text analytics pipeline from preprocessing to representation and modeling.
Analytics in organizations
- Explain the strategic role of data analytics in data-driven organizations.
- Discuss the ethical and legal boundaries of data analytics in organizations.