ENH: provide a null_counts keyword to df.info() to force showing of null counts #8701

Merged on Nov 2, 2014 (1 commit)

@jreback (Contributor) commented Nov 1, 2014

Allows run-time control over whether df.info() shows null counts. This is similar to what we allow with, for example, memory_usage: the keyword overrides the default options for that call.

In [2]: df = DataFrame(1,columns=range(10),index=range(10))

In [3]: df.iloc[1,1] = np.nan

The default for a small frame (currently) is to show the non-null counts:

In [5]: df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 10 entries, 0 to 9
Data columns (total 10 columns):
0    10 non-null float64
1    9 non-null float64
2    10 non-null float64
3    10 non-null float64
4    10 non-null float64
5    10 non-null float64
6    10 non-null float64
7    10 non-null float64
8    10 non-null float64
9    10 non-null float64
dtypes: float64(10)
memory usage: 880.0 bytes

# allow it to be turned off
In [6]: df.info(null_counts=False)
<class 'pandas.core.frame.DataFrame'>
Int64Index: 10 entries, 0 to 9
Data columns (total 10 columns):
0    float64
1    float64
2    float64
3    float64
4    float64
5    float64
6    float64
7    float64
8    float64
9    float64
dtypes: float64(10)
memory usage: 880.0 bytes

With a big frame, you currently have to set the max_info_rows option to control this:

In [7]: pd.set_option('max_info_rows',5)

In [8]: df.info()
<class 'pandas.core.frame.DataFrame'>
Int64Index: 10 entries, 0 to 9
Data columns (total 10 columns):
0    float64
1    float64
2    float64
3    float64
4    float64
5    float64
6    float64
7    float64
8    float64
9    float64
dtypes: float64(10)
memory usage: 880.0 bytes

# force it to show
In [9]: df.info(null_counts=True)
<class 'pandas.core.frame.DataFrame'>
Int64Index: 10 entries, 0 to 9
Data columns (total 10 columns):
0    10 non-null float64
1    9 non-null float64
2    10 non-null float64
3    10 non-null float64
4    10 non-null float64
5    10 non-null float64
6    10 non-null float64
7    10 non-null float64
8    10 non-null float64
9    10 non-null float64
dtypes: float64(10)
memory usage: 880.0 bytes
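
The behaviour shown in the sessions above can be summarised as a small decision function. This is a minimal sketch, not the code in this PR: the helper name `resolve_show_counts` and its parameters are hypothetical, and the exact boundary comparisons (`<` versus `<=`) are an assumption. It only illustrates that `None` falls back to the size-based options while an explicit `True` or `False` overrides them for that call.

```python
from typing import Optional


def resolve_show_counts(
    null_counts: Optional[bool],
    n_rows: int,
    n_cols: int,
    max_info_rows: int,
    max_info_columns: int,
) -> bool:
    """Decide whether per-column non-null counts should be shown.

    Hypothetical helper for illustration; boundary handling is assumed.
    """
    if null_counts is None:
        # No explicit request: fall back to the size-based defaults.
        return n_cols <= max_info_columns and n_rows < max_info_rows
    # An explicit True/False overrides the options for this call.
    return bool(null_counts)


# 10x10 frame with generous limits: counts are shown by default.
print(resolve_show_counts(None, 10, 10, max_info_rows=1000, max_info_columns=100))
# Same frame with max_info_rows=5: counts are hidden by default...
print(resolve_show_counts(None, 10, 10, max_info_rows=5, max_info_columns=100))
# ...unless the caller forces them on for this call.
print(resolve_show_counts(True, 10, 10, max_info_rows=5, max_info_columns=100))
```

The limits used in the calls are illustrative values, not pandas defaults.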

@jreback added the "API Design" and "Output-Formatting" (__repr__ of pandas objects, to_string) labels Nov 1, 2014
@jreback added this to the 0.15.1 milestone Nov 1, 2014

with option_context('display.max_info_rows',5,'display.max_info_columns',5):
    check(None, False)
    check(True, False)
Contributor Author

@jorisvandenbossche I think this is actually a bug: e.g., if you have a really small info display, it doesn't display any column counts. Probably a minor point, though.

Member

I don't think I follow. Shouldn't the new null_counts kwarg override the options when the frame is larger than max_info_rows/max_info_columns?

Contributor Author

It should, and it does, but the per-column display is simply not printed (too many columns), so the keyword is sort of not applicable here.

Member

Ah, yes, I see: in that case the null_counts option simply has no effect. I don't really think it is a bug. You could warn that it is ignored when it is specified in such a case, but that doesn't seem necessary.
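
The optional warning mentioned in this comment could look roughly like the following. This is a sketch only and not part of the PR: the helper `warn_if_null_counts_ignored` is hypothetical, and it assumes that the per-column listing is dropped whenever the frame has more columns than max_info_columns.

```python
import warnings
from typing import Optional


def warn_if_null_counts_ignored(
    null_counts: Optional[bool], n_cols: int, max_info_columns: int
) -> bool:
    """Warn when null_counts was passed but cannot affect the output.

    Returns True if a warning was issued. Hypothetical helper.
    """
    # Assumption: with more columns than max_info_columns, only a summary
    # is printed, so there is no per-column line to attach counts to.
    columns_listed = n_cols <= max_info_columns
    if null_counts is not None and not columns_listed:
        warnings.warn(
            "null_counts is ignored because the per-column listing is not shown",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False
```

Whether such a warning is worth the noise is exactly the judgment call raised in the comment; the thread settles on not adding it.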

Contributor Author

Yep. Are you OK with the name null_counts? This is really just for interactive sessions where you don't want to keep resetting max_info_rows.

Member

yep, ok on the name!

@jreback force-pushed the show_counts branch 2 times, most recently from 8c79317 to a95549e on November 2, 2014 14:56
jreback added a commit that referenced this pull request Nov 2, 2014
ENH: provide a null_counts keyword to df.info() to force showing of null counts
@jreback merged commit 1746df2 into pandas-dev:master Nov 2, 2014