首先尝试使用pivot:
import pandas as pd import numpy as np df = pd.DataFrame({'Name':['Mary', 'Josh','Jon','Lucy', 'Jane', 'Sue'], 'Age':[34, 37, 29, 40, 29, 31], 'City':['Boston','New York', 'Chicago', 'Los Angeles', 'Chicago', 'Boston'], 'Position':['Manager','Programmer','Manager','Manager','Programmer', 'Programmer']}, columns=['Name','Position','City','Age']) print (df) Name Position City Age 0 Mary Manager Boston 34 1 Josh Programmer New York 37 2 Jon Manager Chicago 29 3 Lucy Manager Los Angeles 40 4 Jane Programmer Chicago 29 5 Sue Programmer Boston 31 print (df.pivot(index='Position', columns='City', values='Age')) City Boston Chicago Los Angeles New York Position Manager 34.0 29.0 40.0 NaN Programmer 31.0 29.0 NaN 37.0
如果需要重置索引,请删除列名称并填写NaN值:
#pivoting by numbers - column Age print (df.pivot(index='Position', columns='City', values='Age') .reset_index() .rename_axis(None, axis=1) .fillna(0)) Position Boston Chicago Los Angeles New York 0 Manager 34.0 29.0 40.0 0.0 1 Programmer 31.0 29.0 0.0 37.0 #pivoting by strings - column Name print (df.pivot(index='Position', columns='City', values='Name')) City Boston Chicago Los Angeles New York Position Manager Mary Jon Lucy None Programmer Sue Jane None Josh