CS 4364 Lab 3
Due Wednesday, 02/04/2026, by midnight (23:59, EST)
Announcements
-
Class participation, Google Folder.
-
For the midterm assignment, use weeks 3 through 7 to write an 800-word abstract.
Goals
The goals for this lab assignment are:
-
Understand Classification
-
Understand Regression
-
Get familiar with the Iris dataset
-
Get familiar with the Midterm and Final Paper format
1. Iris Dataset: Supervised Learning Comparison
-
Read: scikit-learn User Guide (Section 1: Supervised Learning). Focus on the Iris dataset examples.
-
Locate and review the Iris examples in the following topics:
Logistic Regression (Regularization path of L1-logistic regression)
Support Vector Machines (SVM-Anova: SVM with univariate feature selection)
Nearest Neighbors Classification
Decision Trees (decision surface + tree structure)
Ensemble Methods (ensemble decision surfaces)
Voting Classifier
-
Compare the algorithms on a common setup: evaluate every method on Iris using the same train/test split (or the same 5-fold cross-validation folds), and report:
Accuracy (required)
Macro F1-score (recommended)
Confusion matrix (required for at least 2 models)
Decision boundary visualization (required for at least 2 models)
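The common-setup comparison above could be sketched as follows. This is a minimal illustration, not a required implementation: the model choices, hyperparameters, the fold seed, and the names `models`, `cv`, and `results` are all assumptions made for the example. It uses 5-fold stratified cross-validation with identical folds for every model and computes accuracy, macro F1, and a confusion matrix from the pooled out-of-fold predictions.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, f1_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Illustrative model choices (assumed, not prescribed by the lab).
# Scale-sensitive models are wrapped in a pipeline so that scaling is
# fit inside each training fold only.
models = {
    "logreg": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "svm": make_pipeline(StandardScaler(), SVC()),
    "knn": make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "tree": DecisionTreeClassifier(random_state=0),
    "forest": RandomForestClassifier(n_estimators=100, random_state=0),
}
# Hard-voting ensemble over the base models above (they are cloned on fit).
models["voting"] = VotingClassifier(estimators=list(models.items()), voting="hard")

# A fixed seed makes the splitter yield the same folds for every model.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, model in models.items():
    # Each sample is predicted exactly once, by a model that never saw it.
    y_pred = cross_val_predict(model, X, y, cv=cv)
    results[name] = {
        "accuracy": accuracy_score(y, y_pred),
        "macro_f1": f1_score(y, y_pred, average="macro"),
        "confusion": confusion_matrix(y, y_pred),
    }

for name, r in results.items():
    print(f"{name:<8} acc={r['accuracy']:.3f}  macroF1={r['macro_f1']:.3f}")

# Confusion matrices for two of the models (rows: true class, columns: predicted).
for name in ("tree", "svm"):
    print(name)
    print(results[name]["confusion"])
```

Note that macro F1 computed on pooled predictions can differ slightly from the mean of per-fold macro F1 scores; either is reasonable as long as the report states which one is used.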
-
Write a short summary (AAAI format): Submit a 1–2 page report in AAAI two-column format. Your report should include:
A comparison table of results across models
1–2 figures (decision boundaries or confusion matrices)
A short discussion: Which model worked best? Which was easiest to interpret? Why?
2. Midterm Paper Format
-
AAAI is one of the top four AI conferences
-
Read the guide from AAAI 2026
-
Use the LaTeX format
-
Share the project link: each team works in the same Overleaf project.
3. Find a dataset
-
Review the example dataset (EEGEyeNet, NeurIPS Datasets & Benchmarks 2021): read the paper and skim the GitHub repository. Pay attention to the task definition, dataset structure, labels, and the baseline results table.
-
Choose datasets for your Midterm + Final Project: identify two candidate datasets from the NeurIPS Datasets & Benchmarks track (2021–2024).
Start here: https://neurips.cc/Conferences/2025/CallForDatasetsBenchmarks
-
Present your dataset analysis: for each dataset, briefly summarize (1) what the ML task is, (2) what the inputs/labels are, and (3) what models are most appropriate. Confirm the dataset can support multiple ML methods we will cover in this course (e.g., traditional ML baselines, deep learning, sequence models, transformers).
-
Submit a short written summary (1 page max): explain why you selected these datasets, what makes them suitable for the midterm/final project, and which metrics you plan to use for evaluation.
4. Submission Guide
-
Each student submits only one file, lab_3_lastname.zip, which includes:
-
lab_3_lastname.pdf in AAAI format
-
Select your three favorite supervised learning code examples that use the Iris dataset, and explain your choices in your PDF.
-
Start a shared GitHub project for the midterm and final project, and add the instructor’s account 'xqu'.
5. Notes
-
Submit your zip file for the lab to Blackboard.
-
Lab assignments will typically be released on Thursday and will be due by midnight on the following Wednesday.