hudm5001_semester_project_guidelines.md

Project Specifications:

  • The instructor will place you into a group of 3-4 students
  • Pick a data set that you and your group find interesting. (Example sources found below. Feel free to select your data from any other source as appropriate.)
  • Form a research question
  • Perform data pre-processing, data cleaning, outlier removal, and so on to sanitize your data as necessary.
  • Save your data in a .csv file (or other format as appropriate for your data set and project scenario).
  • Explore your data to reveal interesting/useful information based on your project scenario.
  • Create at least 2 visualizations that you find interesting/useful.
  • Do at least one of the following, depending on your interests and background:
    • compute meaningful statistical quantities (e.g., means, correlations)
    • perform a statistical test on the data (e.g., t-test)
    • fit a model to the data (e.g., regression)
  • Write at least two Python classes, each of which has at least one method. These classes can be as simple as the ones in our lecture notes.
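
As a rough illustration of the class requirement, the sketch below shows two small classes, one for cleaning and one for summary statistics. The class names (`DataCleaner`, `SummaryStats`), the column names, and the sample data are all hypothetical and made up for this example; they are not taken from the lecture notes, and your own classes may look quite different.

```python
import csv
import io
import statistics

# Made-up sample data for illustration only; a real project would read its own file.
SAMPLE_CSV = "hours,score\n1,50\n2,\n3,70\nfour,80\n5,90\n"


class DataCleaner:
    """Hypothetical helper that parses CSV text and extracts usable numeric values."""

    def __init__(self, csv_text):
        # Each row becomes a dict keyed by the header names.
        self.rows = list(csv.DictReader(io.StringIO(csv_text)))

    def numeric_column(self, column):
        """Return the values of `column` as floats, skipping blank or non-numeric entries."""
        values = []
        for row in self.rows:
            try:
                values.append(float(row[column]))
            except (TypeError, ValueError):
                # Blank cells and text such as "four" cannot be converted, so drop them.
                continue
        return values


class SummaryStats:
    """Hypothetical helper that computes simple statistics for a list of numbers."""

    def __init__(self, values):
        self.values = values

    def mean(self):
        """Return the arithmetic mean of the values."""
        return statistics.mean(self.values)

    def stdev(self):
        """Return the sample standard deviation of the values."""
        return statistics.stdev(self.values)


if __name__ == "__main__":
    scores = DataCleaner(SAMPLE_CSV).numeric_column("score")
    stats = SummaryStats(scores)
    print(scores)        # [50.0, 70.0, 80.0, 90.0]
    print(stats.mean())  # 72.5
```

Each class here has at least one method, which is all the specification asks for. Splitting cleaning from statistics is only one possible design.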

Some Data Source Suggestions:

Deliverables:

1. WRITTEN REPORT (max of 10 pages, due Dec 17) containing:

  • Abstract: A one-paragraph outline describing your question, what you did, and what you learned
  • Introduction: Describe your project scenario. Starting out, what did you hope to accomplish/learn?
  • Data: Describe your dataset and its significance. Where did you obtain this dataset from?
    Why did you choose the dataset that you did?
    Indicate if you carried out any preprocessing/data cleaning/outlier removal, and so on to sanitize your data.
  • Data Processing Methodology: Describe briefly your process to obtain results/output.
  • Results:
    • Show at least two visualizations
    • Display and discuss the results. Describe what you have learned and mention the relevance/significance of the results you have obtained.
    • Classes: Describe the classes you wrote and the methods they contain. Show a sample run of one or two of your methods (screen captures or copied-and-pasted output are fine).
  • Conclusions: Summarize your findings, explain how these results could be used by others (if applicable), and describe ways you could improve your program. You could describe ways you might like to expand the functionality of your program if given more time.

2. CODE (due Dec 17)

  • Clearly document, organize, and name your code file or files
  • The files can be Jupyter Notebooks or Python scripts

SUBMISSION

  • By the deadline, submit (i) the written report and (ii) the code files together in one Zip file through Canvas.
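
If you would like to build the Zip file from Python rather than by hand, one possible approach uses the standard-library `zipfile` module. This is only a sketch: the function name `bundle_submission` and any file names you pass to it are hypothetical, and creating the archive with your operating system's own tools works just as well.

```python
import zipfile
from pathlib import Path


def bundle_submission(file_paths, zip_path):
    """Write the given files into one Zip archive and return the archived names."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for file_path in file_paths:
            path = Path(file_path)
            # Store each file under its bare name so the archive has no nested folders.
            archive.write(path, arcname=path.name)
    # Reopen the archive to confirm what was actually written.
    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()
```

For example, calling `bundle_submission(["report.pdf", "analysis.ipynb"], "submission.zip")` (with your own file names) would return the list of names stored in the archive, which is a quick way to check that nothing is missing before uploading.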

RUBRIC

Total Points = 60

| Assignment | Description | Possible Points |
|---|---|---|
| Paper | Paper includes abstract, introduction, and conclusions | 10 |
| | Paper discusses data source, data summary, and data processing methodology | 10 |
| | Paper includes at least two visualizations | 10 |
| | Paper includes answers to research questions | 10 |
| | Paper includes methods in user-defined classes | 10 |
| Code | Code is clear and well-documented and presents Python classes | 10 |