Quartz sync: Oct 3, 2024, 6:35 PM
CarterT27 committed Oct 4, 2024
1 parent cccd9f4 commit df35cee
Showing 3 changed files with 206 additions and 0 deletions.
126 changes: 126 additions & 0 deletions content/Class Notes/DSC 80 Lecture 3.md
@@ -0,0 +1,126 @@
---
tags:
- "Type/Note"
- "Topic/Data_Science"
- "Class/DSC_80"
date:
- 2024-10-03
---

# Aggregating

## Adding and Modifying Columns

Adding a new column to a dataframe using `assign`
```python
dogs = pd.read_csv(Path('data') / 'dogs43.csv', index_col='breed')
dogs.assign(cost_per_year=dogs['lifetime_cost'] / dogs['longevity'])
```

Chain methods together instead of writing long, hard-to-read lines
- Need to wrap expression in parentheses to add newlines before every method call

```python
(dogs
.assign(cost_per_year=dogs['lifetime_cost'] / dogs['longevity'])
.sort_values('cost_per_year')
.iloc[:5]
)
```

Assign with special column names (spaces, special characters)
```python
dogs.assign(**{'cost per year 💵': dogs['lifetime_cost'] / dogs['longevity']})
```

`df.copy()` returns a new, independent copy of a dataframe

`df[col] = values` assigns a column in place

`df.assign()` assigns a column to a new dataframe
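A minimal sketch of the difference between these three operations. The two-row `dogs` frame below is a made-up stand-in for the lecture's dataset (the breed names and values are assumptions), used only to show which calls leave the original dataframe untouched.

```python
import pandas as pd

# Hypothetical stand-in for the dogs dataframe from the lecture.
dogs = pd.DataFrame(
    {'lifetime_cost': [20000.0, 15000.0], 'longevity': [10.0, 12.5]},
    index=pd.Index(['Breed A', 'Breed B'], name='breed'),
)

# assign returns a new dataframe; dogs itself is unchanged.
with_cost = dogs.assign(cost_per_year=dogs['lifetime_cost'] / dogs['longevity'])
assert 'cost_per_year' not in dogs.columns

# copy returns an independent dataframe, so edits made to the copy
# do not leak back into dogs.
dogs_copy = dogs.copy()
dogs_copy['cost_per_year'] = dogs_copy['lifetime_cost'] / dogs_copy['longevity']
assert 'cost_per_year' not in dogs.columns

# Bracket assignment modifies the dataframe it is applied to.
dogs['cost_per_year'] = dogs['lifetime_cost'] / dogs['longevity']
assert 'cost_per_year' in dogs.columns
```

Because `assign` never mutates its input, it is the form that fits inside a method chain.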

Avoid `inplace=True`: it is not good practice, and there are plans to remove it in future releases of pandas

`df[column].to_numpy()` returns the numpy array of a column

`dogs.max(axis=1)` won't work because each row mixes datatypes (strings and numbers), and pandas cannot compare them to find a maximum
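One possible workaround, sketched under assumptions: the frame below is a hypothetical miniature with one string column and two numeric ones (the values are made up). Restricting to numeric columns with `select_dtypes` before the row-wise `max` avoids comparing strings with numbers.

```python
import numpy as np
import pandas as pd

# Hypothetical rows; the values are made up for illustration.
dogs = pd.DataFrame(
    {
        'size': ['small', 'large'],
        'lifetime_cost': [20000.0, 15000.0],
        'longevity': [10.0, 12.5],
    },
    index=pd.Index(['Breed A', 'Breed B'], name='breed'),
)

# to_numpy drops the index and returns the column's values as an ndarray.
longevity = dogs['longevity'].to_numpy()
assert isinstance(longevity, np.ndarray)

# Keep only numeric columns, then take the maximum across each row.
row_max = dogs.select_dtypes(include='number').max(axis=1)
```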

## Data Granularity and the `groupby` method

- Fine granularity: small details
- Coarse granularity: bigger picture

You should opt for **finer granularity** for more detail if you have the resources to do so

How to go from fine to coarse granularity: **Aggregating**

> [!definition] Aggregation
> Aggregating is the act of combining many values into a single value

For example, `penguins.groupby('species')['body_mass_g'].mean()` computes the mean body mass for each species.

"Split-apply-combine" Paradigm

![Split-apply-combine diagram](https://dsc80.com/resources/lectures/lec03/imgs/image_0.png)
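To make the three steps concrete, here is a rough sketch that performs them by hand and compares the result with the built-in one-liner. The four-row `penguins_small` frame is an assumption with made-up values, not the lecture's data.

```python
import pandas as pd

# Hypothetical miniature of the penguins data; values are made up.
penguins_small = pd.DataFrame({
    'species': ['Adelie', 'Gentoo', 'Adelie', 'Gentoo'],
    'body_mass_g': [3000.0, 5000.0, 4000.0, 6000.0],
})

# Split: iterate over one sub-dataframe per species.
# Apply: reduce each group's body_mass_g column to its mean.
means = {
    species: group['body_mass_g'].mean()
    for species, group in penguins_small.groupby('species')
}

# Combine: collect the per-group results into a single Series.
manual = pd.Series(means, name='body_mass_g').rename_axis('species')

# The one-liner performs all three steps at once.
built_in = penguins_small.groupby('species')['body_mass_g'].mean()
```

Both `manual` and `built_in` hold one mean per species; the explicit loop is only for illustration, and the `groupby` form is the one to use in practice.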

```python
(penguins
.assign(is_dream = penguins['island'] == 'Dream')
.groupby('species')
['is_dream']
.mean()
)
```

```python
%%pt
penguins_small.groupby('species')
```

The `%%pt` cell magic (from Pandas Tutor) allows us to visualize `groupby` objects

Aggregation Methods
- `count()`
- `sum()`
- `mean()`
- `max()`
- `last()`
- `first()`
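A small sketch of what each of these methods returns per group. The frame is hypothetical (made-up values, with one missing body mass) so that the handling of missing values is visible.

```python
import pandas as pd

# Hypothetical miniature dataset; values are made up.
penguins_small = pd.DataFrame({
    'species': ['Adelie', 'Adelie', 'Gentoo', 'Gentoo', 'Gentoo'],
    'body_mass_g': [3000.0, 4000.0, 5000.0, None, 6000.0],
})

mass_by_species = penguins_small.groupby('species')['body_mass_g']

counts = mass_by_species.count()   # number of non-missing values per group
totals = mass_by_species.sum()     # missing values are skipped
means = mass_by_species.mean()
maxes = mass_by_species.max()
firsts = mass_by_species.first()   # first non-missing value in each group
lasts = mass_by_species.last()     # last non-missing value in each group
```

Note that `count()` reports 2 for Gentoo rather than 3, since it counts only non-missing values.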

```python
(penguins
.sort_values('body_mass_g')
.groupby('sex')
.last()
)
```

Generally, you should select column(s) directly after groupby
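A brief sketch of why the order matters, using a hypothetical three-row frame with made-up values. Both forms give the same answer here, but selecting first aggregates only the column that is needed.

```python
import pandas as pd

# Hypothetical miniature dataset; values are made up.
penguins_small = pd.DataFrame({
    'species': ['Adelie', 'Adelie', 'Gentoo'],
    'body_mass_g': [3000.0, 4000.0, 5000.0],
    'bill_length_mm': [38.0, 40.0, 47.0],
})

# Selecting first: only body_mass_g is aggregated.
select_first = penguins_small.groupby('species')['body_mass_g'].mean()

# Selecting last: every column is aggregated, then all but one is discarded.
select_last = penguins_small.groupby('species').mean()['body_mass_g']
```

On a wide dataframe the second form does unnecessary work, and it may fail outright if some of the other columns cannot be averaged (for example, string columns).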

## Beyond default aggregation methods

```python
(penguins
.groupby('species')
['body_mass_g']
.aggregate(['count', 'mean'])
)
```

```python
(penguins
.groupby('species')
.aggregate({'bill_length_mm': 'max', 'island': 'unique'})
)
```

```python
import numpy as np

def iqr(s):
    # Interquartile range: 75th percentile minus 25th percentile
    return np.percentile(s, 75) - np.percentile(s, 25)

(penguins
.groupby('species')
['body_mass_g']
.agg(iqr) # agg is short for .aggregate
)
```
71 changes: 71 additions & 0 deletions content/Class Notes/MATH 109 Discussion 1.md
@@ -0,0 +1,71 @@
---
tags:
- "Type/Note"
- "Topic/Mathematics"
- "Class/MATH_109"
date:
- 2024-10-03
---

# Necessary and Sufficient Conditions, Axiomatic Properties of the Real Numbers

> [!proposition] 2.3.1: $+$ and $\cdot$ for real numbers
> $a,b,c\in \mathbb{R}$
> 1. Commutativity: $a+b=b+a, a\cdot b = b\cdot a$
> 2. Associativity: $(a+b)+c=a+(b+c), (a\cdot b)\cdot c = a\cdot (b\cdot c)$
> 3. Distributivity: $a\cdot(b+c) = a\cdot b + a\cdot c, (a+b)\cdot c = a\cdot c + b \cdot c$
> 4. Zero Identity: $0\in \mathbb{R}, a+0=a\forall a \in \mathbb{R}$
> 5. Unity: $1\in \mathbb{R}, a\cdot 1 = a \forall a \in \mathbb{R}$
> 6. Subtractivity: $\forall a, \text{the equation } a+x=0\text{ has a unique solution, called }-a$ ($b-a$ is defined as $b+(-a)$)
> 7. Division: $\forall a \neq 0,\text{ the equation }a \cdot x = 1 \text{ has a unique solution }\frac{1}{a}$ ($\forall b$, $\frac{b}{a}$ is defined as $b\cdot\left(\frac{1}{a}\right)$)
>
> In one word, 1-7 collectively say $\mathbb{R}$ is a **field**.

> [!question] Which of the following conditions are necessary for the positive integer $n$ to be divisible by 6 (proofs are not necessary)?
> 1. 3 divides $n$
> 2. 9 divides $n$
> 3. 12 divides $n$
> 4. $n=12$
> 5. 6 divides $n^2$
> 6. 2 divides $n$ and 3 divides $n$
> 7. 2 divides $n$ or 3 divides $n$

> [!answer]-
> A condition is necessary when ($n$ is divisible by 6) $\implies$ (condition).
> Necessary conditions: 1, 5, 6, 7
> The strong condition (6) implies the weak condition (7).
>
> A condition is sufficient when (condition) $\implies$ ($n$ is divisible by 6).
> 1. 3 divides $n$ - necessary
> 2. 9 divides $n$ - neither
> 3. 12 divides $n$ - sufficient
> 4. $n=12$ - sufficient
> 5. 6 divides $n^2$ - necessary, sufficient
> 6. 2 divides $n$ and 3 divides $n$ - necessary, sufficient
> 7. 2 divides $n$ or 3 divides $n$ - necessary
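The classification above can be sanity-checked by brute force. The sketch below tests each condition for every $n$ from 1 to an arbitrary bound (`LIMIT`, an assumption chosen for illustration); a finite search like this can expose counterexamples but is not a proof.

```python
# Each condition from the question, written as a predicate on n.
conditions = {
    1: lambda n: n % 3 == 0,
    2: lambda n: n % 9 == 0,
    3: lambda n: n % 12 == 0,
    4: lambda n: n == 12,
    5: lambda n: (n * n) % 6 == 0,
    6: lambda n: n % 2 == 0 and n % 3 == 0,
    7: lambda n: n % 2 == 0 or n % 3 == 0,
}

LIMIT = 1000  # arbitrary search bound, chosen for illustration
candidates = range(1, LIMIT + 1)

def is_necessary(cond):
    # Necessary: (6 divides n) implies cond(n).
    return all(cond(n) for n in candidates if n % 6 == 0)

def is_sufficient(cond):
    # Sufficient: cond(n) implies (6 divides n).
    return all(n % 6 == 0 for n in candidates if cond(n))

necessary = [k for k, cond in conditions.items() if is_necessary(cond)]
sufficient = [k for k, cond in conditions.items() if is_sufficient(cond)]
```

Within this range, `necessary` contains conditions 1, 5, 6, 7 and `sufficient` contains conditions 3, 4, 5, 6, which agrees with the answer above.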
> [!question] Use the properties of addition and multiplication of real numbers given in Properties 2.3.1 to deduce that, for all real numbers $a$ and $b$,
> 1. $a\times 0 = 0 = 0 \times a$
> 2. $(-a)b=-ab=a(-b)$
> 3. $(-a)(-b)=ab$

> [!answer]
> **Problem 1.** $a\times 0 = 0 = 0\times a$
>
> $a\times (1+0) = a\times 1= a$
> $a\times (1+0) = a\times 1 + a\times 0= a + a\times 0$
> Subtracting $a$ from both sides:
> $a\times 0 = 0$
> Commutative: $0\times a = a \times 0 = 0$
>
> **Problem 2.** $(-a)b=-ab=a(-b)$
>
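> $(a+(-a))\times b = 0 \times b = 0$ (by Problem 1)
> $(a+(-a))\times b = a\times b + (-a) \times b$ (distributivity)
> So $a\times b + (-a)\times b = 0$. Subtracting $a\times b$: $(-a)\times b = -ab$
> Swapping the roles of $a$ and $b$ and using commutativity: $a\times(-b) = (-b)\times a = -(ba) = -ab$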
>
> **Problem 3.** $(-a)(-b)=ab$
>
> $(-a)(-b) = -(a\cdot(-b)) = -(-(ab))$ (applying Problem 2 twice)
> $-(-ab)$ and $ab$ both solve the equation $x+(-ab)=0$, so by the uniqueness in property 6, $-(-ab)=ab$
9 changes: 9 additions & 0 deletions content/Daily Notes/2024-10-03.md
@@ -0,0 +1,9 @@
---
tags:
- "Daily_Note"
---

- [[DSC 80 Lecture 3]]
- [[CSE 158 Lecture 3]]
- [[DSC 40B Lecture 3]]
- [[MATH 109 Discussion 1]]
