Filtering and selecting from pivot tables made with Python pandas
Problem description
I'm struggling with hierarchical indexes in the Python pandas package. Specifically I don't understand how to filter and compare data in rows after it has been pivoted.
Here is the example table from the documentation:
import pandas as pd
import numpy as np
In [1027]: df = pd.DataFrame({'A' : ['one', 'one', 'two', 'three'] * 6,
                              'B' : ['A', 'B', 'C'] * 8,
                              'C' : ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'] * 4,
                              'D' : np.random.randn(24),
                              'E' : np.random.randn(24)})
In [1029]: pd.pivot_table(df, values='D', index=['A', 'B'], columns=['C'])
Out[1029]:
C             bar       foo
A     B
one   A -1.154627 -0.243234
      B -1.320253 -0.633158
      C  1.188862  0.377300
three A -1.327977       NaN
      B       NaN -0.079051
      C -0.832506       NaN
two   A       NaN -0.128534
      B  0.835120       NaN
      C       NaN  0.838040
I would like to analyze as follows:
1) Filter this table on column attributes, for example selecting rows with negative foo:
C             bar       foo
A     B
one   A -1.154627 -0.243234
      B -1.320253 -0.633158
three B       NaN -0.079051
two   A       NaN -0.128534
2) Compare the remaining B series values between the distinct A series groups. I am not sure how to access this information, {'one': ['A', 'B'], 'two': ['A'], 'three': ['B']}, and determine which B values are unique to each key, which are seen in multiple key groups, etc.
Is there a way to do this directly within the pivot table structure, or do I need to convert this back into a regular pandas DataFrame?
Update: I think this code is a step in the right direction. It at least lets me access individual values within this table, but I am still hard-coding the series values:
table = pd.pivot_table(df, values='D', index=['A', 'B'], columns=['C'])
table.loc[('one', 'A')]
Recommended answer
pivot_table returns a DataFrame, so you can filter it with an ordinary boolean mask:
In [15]: pivoted = pd.pivot_table(df, values='D', index=['A', 'B'], columns=['C'])
In [16]: pivoted[pivoted.foo < 0]
Out[16]:
C             bar       foo
A     B
one   A -0.412628 -1.062175
three B       NaN -0.562207
two   A       NaN -0.007245
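For reference, here is a minimal, self-contained sketch of the same filter on a current pandas release, where pivot_table takes index= and columns= (older releases called these rows= and cols=). The fixed values in D are an assumption made for illustration, since the question uses random data, and the names negative_foo and flat are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical fixed values for D (the question uses np.random.randn),
# chosen only so that the result is repeatable.
df = pd.DataFrame({
    "A": ["one", "one", "two", "three"] * 6,
    "B": ["A", "B", "C"] * 8,
    "C": ["foo", "foo", "foo", "bar", "bar", "bar"] * 4,
    "D": np.arange(24, dtype=float) - 12.0,
})

# The default aggregation is the mean of D for each (A, B, C) combination.
pivoted = pd.pivot_table(df, values="D", index=["A", "B"], columns=["C"])

# Boolean mask: rows where foo is NaN compare as False, so they are dropped too.
negative_foo = pivoted[pivoted["foo"] < 0]

# reset_index() moves the (A, B) MultiIndex back into ordinary columns
# if a flat DataFrame is easier to work with.
flat = negative_foo.reset_index()

print(negative_foo)
print(flat)
```

The filtered result keeps its (A, B) MultiIndex, so it is not strictly necessary to convert back to a flat frame; reset_index() is only a convenience when column-based operations are preferred.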
You can use something like
pivoted.loc['one']
to select every row in the A group 'one', or
pivoted.loc[('one', 'A')]
to select a single combination of A and B values.
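The second part of the question (which B values remain in each A group after filtering) can also be answered without hard-coding labels, by working on the index of the filtered table. The sketch below is one possible approach, not the only one. The D values are rounded from the table printed in the question, and names such as b_by_a, shared_b and unique_b are hypothetical.

```python
import pandas as pd

# Hypothetical long-format data; D values are rounded from the question's table.
rows = [
    ("one", "A", "bar", -1.2), ("one", "A", "foo", -0.2),
    ("one", "B", "bar", -1.3), ("one", "B", "foo", -0.6),
    ("one", "C", "bar", 1.2), ("one", "C", "foo", 0.4),
    ("three", "A", "bar", -1.3), ("three", "B", "foo", -0.1),
    ("three", "C", "bar", -0.8), ("two", "A", "foo", -0.1),
    ("two", "B", "bar", 0.8), ("two", "C", "foo", 0.8),
]
df = pd.DataFrame(rows, columns=["A", "B", "C", "D"])
pivoted = pd.pivot_table(df, values="D", index=["A", "B"], columns=["C"])

# Label-based selection on the (A, B) MultiIndex.
group_one = pivoted.loc["one"]          # all B rows where A == "one"
one_a = pivoted.loc[("one", "A")]       # a single row, returned as a Series
all_b = pivoted.xs("B", level="B")      # rows where B == "B", across every A group

# Keep rows with negative foo, then read the surviving labels off the index.
negative_foo = pivoted[pivoted["foo"] < 0]
index_frame = negative_foo.index.to_frame(index=False)  # columns: A, B

# Mapping from each A group to the B values that remain in it.
b_by_a = index_frame.groupby("A")["B"].apply(list).to_dict()
# -> {'one': ['A', 'B'], 'three': ['B'], 'two': ['A']}

# Number of distinct A groups in which each B value appears.
groups_per_b = index_frame.groupby("B")["A"].nunique()
shared_b = sorted(groups_per_b[groups_per_b > 1].index)   # seen in several A groups
unique_b = sorted(groups_per_b[groups_per_b == 1].index)  # seen in exactly one

print(b_by_a)
print(shared_b, unique_b)
```

With this data both 'A' and 'B' appear in two A groups, so shared_b holds both and unique_b is empty.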