pandas-challenge: PyCitySchools Analysis
Background
• Objective: Acting as the district's Chief Data Scientist, analyze district-wide standardized test results.
• Goals: Help the school board and mayor make informed decisions about school budgets and priorities.
Project Overview
• Project Name: pandas-challenge
• Folder: PyCitySchools
• Primary Script: Jupyter notebook for analysis
Files
• Data Files: school_complete.csv, student_complete.csv
• Notebook: PyCitySchools_Analysis.ipynb – Main analysis script
Instructions
District Summary
• Calculations:
• Total number of unique schools
• Total students
• Total budget
• Average math score
• Average reading score
• % passing math
• % passing reading
• % overall passing
• DataFrame: district_summary
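The district-level calculations listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (school_name, budget, math_score, reading_score), the toy rows, and the passing mark of 70 are all assumptions, since this README does not specify them.

```python
import pandas as pd

# Toy stand-ins for school_complete.csv and student_complete.csv.
# Column names and values are assumptions made for illustration.
schools = pd.DataFrame({
    "school_name": ["Alpha High", "Beta High"],
    "type": ["District", "Charter"],
    "size": [3, 2],
    "budget": [1800, 1300],
})
students = pd.DataFrame({
    "school_name": ["Alpha High"] * 3 + ["Beta High"] * 2,
    "grade": ["9th", "10th", "11th", "9th", "12th"],
    "math_score": [65, 80, 90, 72, 68],
    "reading_score": [75, 60, 85, 90, 71],
})

PASSING = 70  # assumed passing threshold

# Boolean masks; the mean of a boolean Series is the fraction that is True.
passed_math = students["math_score"] >= PASSING
passed_reading = students["reading_score"] >= PASSING

district_summary = pd.DataFrame([{
    "Total Schools": schools["school_name"].nunique(),
    "Total Students": len(students),
    "Total Budget": schools["budget"].sum(),
    "Average Math Score": students["math_score"].mean(),
    "Average Reading Score": students["reading_score"].mean(),
    "% Passing Math": passed_math.mean() * 100,
    "% Passing Reading": passed_reading.mean() * 100,
    "% Overall Passing": (passed_math & passed_reading).mean() * 100,
}])
print(district_summary)
```

"Overall passing" here means a student passed both math and reading, which is why the two masks are combined with `&`.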
School Summary
• Calculations:
• School name
• School type
• Total students
• Total school budget
• Per student budget
• Average math score
• Average reading score
• % passing math
• % passing reading
• % overall passing
• DataFrame: per_school_summary
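One possible way to build the per-school table is to aggregate the student rows by school and then combine the result with the school metadata. The sketch below assumes the same hypothetical column names (school_name, type, size, budget, math_score, reading_score) and an assumed passing mark of 70; the real files may differ.

```python
import pandas as pd

# Toy stand-ins for the two CSV files (assumed column names, made-up rows).
schools = pd.DataFrame({
    "school_name": ["Alpha High", "Beta High"],
    "type": ["District", "Charter"],
    "size": [3, 2],
    "budget": [1800, 1300],
})
students = pd.DataFrame({
    "school_name": ["Alpha High"] * 3 + ["Beta High"] * 2,
    "math_score": [65, 80, 90, 72, 68],
    "reading_score": [75, 60, 85, 90, 71],
})

PASSING = 70  # assumed passing threshold
students = students.assign(
    passed_math=students["math_score"] >= PASSING,
    passed_reading=students["reading_score"] >= PASSING,
)
students["passed_both"] = students["passed_math"] & students["passed_reading"]

# One row per school; the mean of a boolean column is the pass rate.
by_school = students.groupby("school_name").agg(
    avg_math=("math_score", "mean"),
    avg_reading=("reading_score", "mean"),
    pct_math=("passed_math", "mean"),
    pct_reading=("passed_reading", "mean"),
    pct_overall=("passed_both", "mean"),
)

# Both pieces are indexed by school name, so the columns align row by row.
info = schools.set_index("school_name")
per_school_summary = pd.DataFrame({
    "School Type": info["type"],
    "Total Students": info["size"],
    "Total School Budget": info["budget"],
    "Per Student Budget": info["budget"] / info["size"],
    "Average Math Score": by_school["avg_math"],
    "Average Reading Score": by_school["avg_reading"],
    "% Passing Math": by_school["pct_math"] * 100,
    "% Passing Reading": by_school["pct_reading"] * 100,
    "% Overall Passing": by_school["pct_overall"] * 100,
})
print(per_school_summary)
```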
Highest-Performing Schools (by % Overall Passing)
• Sorting and Displaying:
• Sort schools by % Overall Passing in descending order
• Display top 5 rows
• DataFrame: top_schools
Lowest-Performing Schools (by % Overall Passing)
• Sorting and Displaying:
• Sort schools by % Overall Passing in ascending order
• Display the first 5 rows (the five lowest-performing schools)
• DataFrame: bottom_schools
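Both rankings (highest- and lowest-performing) come down to sorting on one column and keeping the first five rows. A small sketch, using a made-up stand-in for per_school_summary with single-letter school names:

```python
import pandas as pd

# Hypothetical stand-in for per_school_summary; the values are made up.
per_school_summary = pd.DataFrame(
    {"% Overall Passing": [91.3, 53.5, 89.2, 90.6, 52.9, 90.9, 54.3, 53.2]},
    index=pd.Index(list("ABCDEFGH"), name="school_name"),
)

# Highest first, then keep the first five rows.
top_schools = per_school_summary.sort_values(
    "% Overall Passing", ascending=False
).head(5)

# Lowest first, then keep the first five rows.
bottom_schools = per_school_summary.sort_values(
    "% Overall Passing", ascending=True
).head(5)

print(top_schools)
print(bottom_schools)
```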
Math Scores by Grade
• Calculations:
• Average math scores for each grade (9th, 10th, 11th, 12th) at each school
• DataFrame: math_scores_by_grade
Reading Scores by Grade
• Calculations:
• Average reading scores for each grade (9th, 10th, 11th, 12th) at each school
• DataFrame: reading_scores_by_grade
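The math and reading tables by grade share the same shape (schools as rows, grades as columns), so one helper can produce both. The sketch below uses a pivot table; the helper name scores_by_grade, the column names, and the toy rows are assumptions for illustration only.

```python
import pandas as pd

# Toy student rows (assumed column names, made-up scores).
students = pd.DataFrame({
    "school_name": ["Alpha High"] * 4 + ["Beta High"] * 4,
    "grade": ["9th", "9th", "10th", "11th", "9th", "10th", "12th", "12th"],
    "math_score": [70, 80, 90, 60, 75, 85, 95, 65],
    "reading_score": [88, 72, 64, 91, 79, 83, 70, 90],
})

GRADES = ["9th", "10th", "11th", "12th"]


def scores_by_grade(df, score_column):
    """Average one score column per school (rows) and grade (columns)."""
    table = df.pivot_table(
        index="school_name", columns="grade",
        values=score_column, aggfunc="mean",
    )
    # Reorder explicitly: alphabetical sorting would put "10th" before "9th".
    return table.reindex(columns=GRADES)


math_scores_by_grade = scores_by_grade(students, "math_score")
reading_scores_by_grade = scores_by_grade(students, "reading_score")

print(math_scores_by_grade)
print(reading_scores_by_grade)
```

A school with no students in a given grade shows NaN in that cell (as Alpha High does for 12th grade in this toy data).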
Scores by School Spending
• Creating Spending Ranges:
• Use provided bins and labels for categorizing spending
• Calculations:
• Average math score
• Average reading score
• % passing math
• % passing reading
• % overall passing
• DataFrame: spending_summary
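Spending ranges can be created with pd.cut and then averaged per range. The bins and labels below are hypothetical placeholders, since the provided ones are not reproduced in this README, and the school rows are made up.

```python
import pandas as pd

# Hypothetical stand-in for per_school_summary; all values are made up.
per_school_summary = pd.DataFrame({
    "Per Student Budget": [580.0, 610.0, 640.0, 655.0],
    "Average Math Score": [82.0, 70.0, 89.0, 76.0],
    "Average Reading Score": [81.0, 84.0, 78.0, 83.0],
    "% Passing Math": [90.0, 60.0, 95.0, 70.0],
    "% Passing Reading": [88.0, 92.0, 80.0, 86.0],
    "% Overall Passing": [80.0, 55.0, 78.0, 62.0],
}, index=pd.Index(["A", "B", "C", "D"], name="school_name"))

# Placeholder bins and labels; substitute the ones provided with the project.
spending_bins = [0, 600, 650, 700]
spending_labels = ["<$600", "$600-650", "$650-700"]

# Work on a copy so the per-school table itself is left unchanged.
school_spending = per_school_summary.copy()
school_spending["Spending Ranges (Per Student)"] = pd.cut(
    school_spending["Per Student Budget"],
    bins=spending_bins, labels=spending_labels,
)

score_columns = [
    "Average Math Score", "Average Reading Score",
    "% Passing Math", "% Passing Reading", "% Overall Passing",
]
# Average each metric across the schools that fall in each spending range.
spending_summary = school_spending.groupby(
    "Spending Ranges (Per Student)", observed=False
)[score_columns].mean()
print(spending_summary)
```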
Scores by School Size
• Creating Size Ranges:
• Use provided bins and labels for categorizing school size
• Calculations:
• Average scores and passing rates by school size
• DataFrame: size_summary
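The size breakdown follows the same binning idea, applied to the student count. In this sketch the bins, labels, and rows are again hypothetical. One detail worth checking against the provided bins: pd.cut includes the right edge of each bin by default, so a school with exactly 1000 students lands in the first range here.

```python
import pandas as pd

# Hypothetical stand-in for per_school_summary; all values are made up.
per_school_summary = pd.DataFrame({
    "Total Students": [1000, 1500, 1800, 3000],
    "Average Math Score": [80.0, 70.0, 90.0, 60.0],
    "% Overall Passing": [60.0, 75.0, 85.0, 50.0],
}, index=pd.Index(["A", "B", "C", "D"], name="school_name"))

# Placeholder bins and labels; substitute the ones provided with the project.
size_bins = [0, 1000, 2000, 5000]
size_labels = ["Small (<1000)", "Medium (1000-2000)", "Large (2000-5000)"]

school_size = per_school_summary.copy()
# Intervals are (0, 1000], (1000, 2000], (2000, 5000] because right=True
# is the default for pd.cut.
school_size["School Size"] = pd.cut(
    school_size["Total Students"], bins=size_bins, labels=size_labels,
)

size_summary = school_size.groupby("School Size", observed=False)[
    ["Average Math Score", "% Overall Passing"]
].mean()
print(size_summary)
```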
Scores by School Type
• Group and Average:
• Group by “School Type” and average the results
• DataFrame: type_summary
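Grouping by school type is a plain groupby followed by a mean over the score columns. A short sketch with made-up rows standing in for per_school_summary:

```python
import pandas as pd

# Hypothetical stand-in for per_school_summary; all values are made up.
per_school_summary = pd.DataFrame({
    "School Type": ["Charter", "District", "Charter", "District"],
    "Average Math Score": [82.0, 70.0, 89.0, 76.0],
    "Average Reading Score": [81.0, 84.0, 78.0, 83.0],
    "% Overall Passing": [80.0, 55.0, 78.0, 62.0],
}, index=pd.Index(["A", "B", "C", "D"], name="school_name"))

score_columns = [
    "Average Math Score", "Average Reading Score", "% Overall Passing",
]
# Average each metric across the schools of each type.
type_summary = per_school_summary.groupby("School Type")[score_columns].mean()
print(type_summary)
```

Note that this averages the per-school figures, so every school counts equally regardless of enrollment.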
Analysis
• Summary and Conclusions:
Additional Notes
• References: ChatGPT