From the course: Data Cleaning in Python Essential Training


Bad values

- [Instructor] Your data will have bad values. When I say bad, I mean data that was generated by error. These can be out-of-scale values, such as a thousand degrees for body temperature, spelling mistakes, and others. Let's have a look at some metrics data. Here we have a CSV with time, name of the metric, and a value. Let's load it up. So we start by importing pandas as pd, and I'm going to hide the bar on the left side. And now, I'm going to read the CSV and parse the time column as dates. And I'm going to sample 10 random rows. And we see some memory, some CPU, and some values. Let's use groupby to have a look at statistics per metric. So let's run the cell, groupby name and describe. And we see that the CPU has one count with the mean, and we see that minus 32.14 is probably an error for a metric value, either CPU or memory. The value_counts method is a great way to find problems in categorical data such as the…
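
The steps described in the transcript can be sketched roughly as follows. This is a minimal illustration, not the instructor's notebook: the inline CSV text, its values, and the misspelled metric name `cppu` are made up for the example, and the column names `time`, `name`, and `value` are assumed from the description above.

```python
import io

import pandas as pd

# Hypothetical stand-in for the course's metrics CSV.
# Column names (time, name, value) are assumed from the transcript.
csv_text = """time,name,value
2021-07-13 14:00:00,cpu,30.5
2021-07-13 14:00:00,mem,227551548.0
2021-07-13 14:01:00,cpu,-32.14
2021-07-13 14:01:00,mem,227560000.0
2021-07-13 14:02:00,cpu,31.2
2021-07-13 14:02:00,cppu,29.8
"""

# Read the CSV and parse the time column as dates.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["time"])

# Look at a few random rows (the transcript samples 10; this toy data has 6).
print(df.sample(5, random_state=0))

# Summary statistics per metric: a negative minimum stands out as suspicious.
stats = df.groupby("name")["value"].describe()
print(stats)

# Count occurrences of each metric name: rare names hint at spelling mistakes.
counts = df["name"].value_counts()
print(counts)

# Rows with negative values, which make no sense for CPU or memory metrics.
negative = df[df["value"] < 0]
print(negative)
```

In this toy data, the per-metric `min` exposes the negative value, and the name counts expose the one-off misspelling.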