Reference code for tasks 4 and 5 is below; replace the sample data with your own:
import pandas as pd

df = pd.DataFrame({
    'Name': ['ab', 'cd', 'ef', 'cd', 'ab', 'fg'],
    'Movies': ['movie1', 'movie2', 'movie3', 'movie4', 'movie5', 'movie6'],
    'Ratings': [3, 4, 5, 1, 4, 2],
})
# print(df)

# Count the movies for each name and attach the count to every row
a = df.groupby('Name')['Movies'].apply(lambda x: x.count())
df['NumOfMovies'] = df['Name'].apply(lambda x: a[x])
print(df)

# Average rating for each name, attached to every row as Score
b = df.groupby('Name')['Ratings'].apply(lambda x: x.sum() / x.count())
df['Score'] = df['Name'].apply(lambda x: b[x])

# Sort by Score, highest first
df1 = df.sort_values(by='Score', ascending=False).reset_index()
print(df1)

# Optional: keep only one row per name before sorting
# res = df.drop_duplicates(['Name']).sort_values(by='Score', ascending=False).reset_index()
# print(res)
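The same per-name count and average can also be written with `groupby(...).transform`, which returns a result already aligned to the original rows, so the separate lookup step is not needed. This is only a sketch of an alternative, using the same sample data as above; the variable names `grouped` and `ranked` are my own, and `reset_index(drop=True)` is a choice made here to discard the old index rather than keep it as a column.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['ab', 'cd', 'ef', 'cd', 'ab', 'fg'],
    'Movies': ['movie1', 'movie2', 'movie3', 'movie4', 'movie5', 'movie6'],
    'Ratings': [3, 4, 5, 1, 4, 2],
})

grouped = df.groupby('Name')

# transform broadcasts each group's result back onto that group's rows
df['NumOfMovies'] = grouped['Movies'].transform('count')
df['Score'] = grouped['Ratings'].transform('mean')

# Highest average score first; drop=True discards the old row index
ranked = df.sort_values(by='Score', ascending=False).reset_index(drop=True)
print(ranked)
```

If only one row per name is wanted, `df.drop_duplicates(['Name'])` can be applied before sorting, as in the commented-out lines of the original code.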