## Business Data Analysis from Beginner to Job-Ready (7): Python Basic Data Structures and Their Operations

CuterCorley 2021-04-05 22:06:10

# I. Lists

• List
• Tuple
• Dictionary
• Set

## 1. Creating Lists

```python
# sequence of characters
s = "Corley!"
s[2:4]
```

```
'rl'
```

```python
# sequence of anything
p = ['C', 'o', 'r', 'l', 'e', 'y', '!']
p[2:4]
```

```
['r', 'l']
```

```python
def how_many_days(month):
    days_in_month = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
    return days_in_month[month - 1]

display(how_many_days(2), how_many_days(5), how_many_days(10))
```

```
28
31
31
```

```python
empty_list = []
another_empty_list = list()
display(empty_list, another_empty_list)
```

```
[]
[]
```

```python
weekday_str = 'Monday,Tuesday,Wednesday,Thursday,Friday'
weekdays = weekday_str.split(',')
weekdays
```

```
['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
```

```python
obj_list = ["string", 1, True, 3.14]
list_of_list = [empty_list, weekdays, obj_list]
list_of_list
```

```
[[], ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday'], ['string', 1, True, 3.14]]
```

```python
dal_member = [['Corley', 18], ['Jack', 18], ['Shirely', 48], ['Tom', 18]]
print(dal_member)
print(dal_member[0])
print(dal_member[0][1])
```

```
[['Corley', 18], ['Jack', 18], ['Shirely', 48], ['Tom', 18]]
['Corley', 18]
18
```

```python
display(weekdays[0], weekdays[1:3])
```

```
'Monday'
['Tuesday', 'Wednesday']
```

```python
weekdays[0] = "Sunday"
weekdays
```

```
['Sunday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
```

## 2. Deleting Elements

```python
del weekdays[0]
weekdays
```

```
['Tuesday', 'Wednesday', 'Thursday', 'Friday']
```

```python
al = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
del al[3:]
al
```

```
['A', 'B', 'C']
```

```python
weekdays.remove('Friday')
weekdays
```

```
['Tuesday', 'Wednesday', 'Thursday']
```

```python
weekdays.remove('Friday')
weekdays
```

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-20-b508ecb8e563> in <module>
----> 1 weekdays.remove('Friday')
      2 weekdays

ValueError: list.remove(x): x not in list
```

```python
if 'Friday' in weekdays:
    weekdays.remove('Friday')
else:
    print('element not exists')
```

```
element not exists
```

Further examples of the `in` operator:

```python
weekdays = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
'Friday' in weekdays
```

```
True
```

```python
'Fri' not in weekdays
```

```
True
```

```python
seasons = ['spring', 'summer', 'autumn', 'winter']
last_season = seasons.pop()
print("last_season = ", last_season, "\nseasons = ", seasons)
```

```
last_season =  winter
seasons =  ['spring', 'summer', 'autumn']
```

```python
first_season = seasons.pop(0)
print("first_season = ", first_season, "\nseasons = ", seasons)
```

```
first_season =  spring
seasons =  ['summer', 'autumn']
```

## 3. Adding Elements

```python
weekdays.append('Friday')
weekdays
```

```
['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Friday']
```

```python
weekdays.insert(0, 'Monday')
weekdays
```

```
['Monday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Friday']
```

```python
weekend = ['Saturday', 'Sunday']
weekdays = weekdays + weekend
weekdays
```

```
['Monday', 'Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Friday', 'Saturday', 'Sunday']
```

```python
display(weekdays.index('Thursday'), weekdays.index('Friday'))
```

```
4
5
```

## 4. Sorting Lists

```python
nums = [1, 4, 2, 5, 3]
sorted_nums = sorted(nums)
print("nums =", nums, "\nsorted_nums =", sorted_nums)
```

```
nums = [1, 4, 2, 5, 3]
sorted_nums = [1, 2, 3, 4, 5]
```

```python
nums.sort()
nums
```

```
[1, 2, 3, 4, 5]
```

```python
sorted_nums = sorted(nums, reverse=True)
nums.sort(reverse=True)
display(sorted_nums, nums)
```

```
[5, 4, 3, 2, 1]
[5, 4, 3, 2, 1]
```

```python
str_ls = ['apple', 'pear', 'lemon', 'peach']
str_ls.sort()
str_ls
```

```
['apple', 'lemon', 'peach', 'pear']
```

```python
display(sorted('joshuazhao1234'), ''.join(sorted('ACBDZYX')))
```

```
['1', '2', '3', '4', 'a', 'a', 'h', 'h', 'j', 'o', 'o', 's', 'u', 'z']
'ABCDXYZ'
```

The `join()` method concatenates the elements of the iterable passed to it, using the string it is called on as the separator.
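To make the separator behavior concrete, here is a minimal sketch. The names `parts` and `nums` and the chosen separators are illustrative assumptions, not taken from the original post.

```python
# Illustrative data; `parts` is a hypothetical name.
parts = ['2021', '04', '05']

dashed = '-'.join(parts)   # the string before .join() is placed between elements
plain = ''.join(parts)     # an empty separator simply concatenates
print(dashed)
print(plain)

# join() requires every element to be a string, so convert numbers first.
nums = [1, 2, 3]
csv_line = ','.join(str(n) for n in nums)
print(csv_line)
```

Calling `','.join(nums)` directly on the integer list would raise a `TypeError`, which is why the conversion with `str()` is needed.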

```python
obj_list = ["string",1, True, 3.14]
obj_list.sort()
obj_list
```

```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-8-18b4b015f0f5> in <module>
      1 obj_list = ["string",1, True, 3.14]
----> 2 obj_list.sort()
      3 obj_list

TypeError: '<' not supported between instances of 'int' and 'str'
```

```python
obj_list = [1, 3.14]
obj_list.sort()
obj_list
```

```
[1, 3.14]
```

## 5. List Assignment

```python
# assign vs copy
a = [1, 2, 3]
b = a
print("a = ", a, "\nb = ", b)
print()
a[0] = 2
print("a = ", a, "\nb = ", b)
print()
b[1] = 3
print("a = ", a, "\nb = ", b)
```

```
a =  [1, 2, 3]
b =  [1, 2, 3]

a =  [2, 2, 3]
b =  [2, 2, 3]

a =  [2, 3, 3]
b =  [2, 3, 3]
```

```python
a = [1, 2, 3]
b = a
c = a.copy()
d = a[:]
e = list(a)
print("a = ", a, "\nb = ", b, "\nc = ", c, "\nd = ", d, "\ne = ", e)
print()
a[0] = 2
print("a = ", a, "\nb = ", b, "\nc = ", c, "\nd = ", d, "\ne = ", e)
```

```
a =  [1, 2, 3]
b =  [1, 2, 3]
c =  [1, 2, 3]
d =  [1, 2, 3]
e =  [1, 2, 3]

a =  [2, 2, 3]
b =  [2, 2, 3]
c =  [1, 2, 3]
d =  [1, 2, 3]
e =  [1, 2, 3]
```
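The three copy idioms above (`copy()`, `[:]` and `list()`) all make shallow copies. As a hedged illustration with a made-up nested list, the sketch below shows that inner lists are still shared after a shallow copy, and that `copy.deepcopy` from the standard library is one way to avoid this.

```python
import copy

nested = [[1, 2], [3, 4]]       # hypothetical example data
shallow = nested.copy()         # new outer list, same inner lists
deep = copy.deepcopy(nested)    # recursively copies the inner lists too

nested[0][0] = 99               # mutate an inner list
nested.append([5, 6])           # mutate the outer list

print(shallow)  # the inner change is visible, the appended item is not
print(deep)     # unaffected by either change
```

For flat lists of numbers or strings, as in the examples above, a shallow copy is usually sufficient.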

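## 6. List Comprehensions

Python supports list comprehensions, which build lists in a natural, concise way, much as mathematicians often do. The general form is `[expression for item in iterable]`; an optional trailing `if condition` filters the items.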

```python
num_list = []
for i in range(0, 10):
    num_list.append(i)
num_list
```

```
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

```python
num_list = list(range(0, 10))
num_list
```

```python
num_list = [i for i in range(0, 10)]
num_list
```

```python
num_list = [i**2 for i in range(0, 10)]
num_list
```

```python
num_list = []
for i in range(0, 10):
    if i % 2 == 1:
        num_list.append(i)
num_list
```

```
[1, 3, 5, 7, 9]
```

```python
num_list = list(range(1, 10, 2))
num_list
```

```python
num_list = [i for i in range(0, 10) if i % 2 == 1]
num_list
```

```python
import math
num_list = [i * i for i in range(0, int(math.sqrt(500))) if i % 3 == 2]
num_list
```

```
[4, 25, 64, 121, 196, 289, 400]
```

```python
import random
num_list = random.sample(range(10), 10)
target = 5
print("list =", num_list)
print("target =", target)
l1 = [x for x in num_list if x < target]
l2 = [x for x in num_list if x >= target]
print("l1:", l1)
print("l2:", l2)
```

```
list = [0, 5, 8, 4, 1, 6, 9, 2, 3, 7]
target = 5
l1: [0, 4, 1, 2, 3]
l2: [5, 8, 6, 9, 7]
```

```python
l2 = [x for x in num_list if x not in l1]
print("l2:", l2)
```

```python
rows = range(1, 4)
cols = range(1, 3)
cells = []
for row in rows:
    for col in cols:
        cells.append((row, col))
cells
```

```
[(1, 1), (1, 2), (2, 1), (2, 2), (3, 1), (3, 2)]
```

```python
cells = [(r, c) for r in rows for c in cols]
cells
```

# II. Tuples

## 1. Basic Tuple Operations

```python
empty_tuple = ()
empty_tuple
```

```
()
```

```python
week_tuple = ('Monday', 'Tuesday')
week_tuple
```

```
('Monday', 'Tuesday')
```

```python
tpl = 123, 456, 789
tpl
```

```
(123, 456, 789)
```

```python
display(week_tuple[1], tpl[2])
```

```
'Tuesday'
789
```

```python
single_tuple1 = ('Monday',)
single_tuple2 = ('Monday')
display(type(single_tuple1), type(single_tuple2))
```

```
tuple
str
```

```python
tom_list = ['Tom', 'Male', 20]
name, gender, age = tom_list
print("Name = ", name, "\nGender = ", gender, "\nAge = ", age)
tom_tuple = ('Tom', 'Male', 20)
name, gender, age = tom_tuple
print("Name = ", name, "\nGender = ", gender, "\nAge = ", age)
```

```
Name =  Tom
Gender =  Male
Age =  20
Name =  Tom
Gender =  Male
Age =  20
```

```python
a = 1
b = 2
print("a = ", a, ", b = ", b)
a, b = b, a
print("a = ", a, ", b = ", b)
```

```
a =  1 , b =  2
a =  2 , b =  1
```

## 2. Tuples Compared with Lists

```python
week_tuple[1] = 'Thursday'
week_tuple
```

```
TypeError                                 Traceback (most recent call last)
<ipython-input-24-4df244291e3f> in <module>
----> 1 week_tuple[1] = 'Thursday'
      2 week_tuple

TypeError: 'tuple' object does not support item assignment
```

```python
tpl2 = week_tuple[0], 2
tpl2
```

```
('Monday', 2)
```
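One practical consequence of this difference, sketched here with made-up data: a tuple of hashable items can serve as a dictionary key, while a list cannot. The names `grid` and `error_message` are illustrative only.

```python
# Hypothetical lookup table keyed by (row, col) tuples.
grid = {(0, 0): 'start', (2, 3): 'goal'}
print(grid[(2, 3)])

error_message = None
try:
    bad = {[0, 0]: 'start'}     # lists are unhashable, so this fails
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```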

# III. Dictionaries

## 1. Creating Dictionaries

```python
empty_dict = {}
empty_dict
```

```
{}
```

```python
pizza = {
    "size": "medium",
    "type": "pepperoni",
    "crust": "Thick",
    "qty": 1,
    "deliver": True,
}
pizza
```

```
{'size': 'medium', 'type': 'pepperoni', 'crust': 'Thick', 'qty': 1, 'deliver': True}
```

```python
pizza = {
    "size": "small",
    "size": "medium",   # a repeated key keeps only the last value
    "type": "pepperoni",
    "crust": "Thick",
    "qty": 1,
    "deliver": True,
}
pizza
```

```
{'size': 'medium', 'type': 'pepperoni', 'crust': 'Thick', 'qty': 1, 'deliver': True}
```

## 2. Accessing Dictionaries

```python
print(pizza['type'])
```

```
pepperoni
```

```python
print(pizza['topping'])
```

```
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-31-ca25624dca51> in <module>
----> 1 print(pizza['topping'])

KeyError: 'topping'
```

```python
display(pizza.get('type'), pizza.get('topping'))
```

```
'pepperoni'
None
```
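`get()` also accepts an optional second argument that is returned when the key is missing. A small sketch, using a stand-in `order` dictionary (an assumed name) so that it runs on its own:

```python
order = {"size": "medium", "type": "pepperoni"}   # stand-in for `pizza`

topping = order.get("topping", "none")   # key absent: the default is returned
size = order.get("size", "large")        # key present: the stored value is returned
print(topping, size)

# get() never inserts the key; the dictionary is unchanged.
print("topping" in order)
```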

```python
display(pizza.keys(), pizza.values(), pizza.items())
```

```
dict_keys(['size', 'type', 'crust', 'qty', 'deliver'])
dict_values(['medium', 'pepperoni', 'Thick', 1, True])
dict_items([('size', 'medium'), ('type', 'pepperoni'), ('crust', 'Thick'), ('qty', 1), ('deliver', True)])
```

## 3. Updating Dictionaries

```python
pizza['topping'] = ['cheese', 'mushroom']
pizza
```

```
{'size': 'medium', 'type': 'pepperoni', 'crust': 'Thick', 'qty': 1, 'deliver': True, 'topping': ['cheese', 'mushroom']}
```

```python
pizza['qty'] = 10
pizza
```

```
{'size': 'medium', 'type': 'pepperoni', 'crust': 'Thick', 'qty': 10, 'deliver': True, 'topping': ['cheese', 'mushroom']}
```

## 4. Deleting from Dictionaries

```python
del pizza['topping']
pizza
```

```
{'size': 'medium', 'type': 'pepperoni', 'crust': 'Thick', 'qty': 10, 'deliver': True}
```

```python
pizza.clear()
pizza
```

```
{}
```
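Besides `del` and `clear()`, a single key can be removed with `pop()`, which also returns the removed value. The sketch below uses a stand-in `order` dictionary (an assumed name) rather than the now-empty `pizza`.

```python
order = {"size": "medium", "qty": 10, "deliver": True}   # stand-in data

qty = order.pop("qty")                 # removes the key and returns its value
missing = order.pop("topping", None)   # a default avoids KeyError for absent keys
print(qty, missing, order)
```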

# IV. Sets

## 1. Creating Sets

```python
empty_set = set()
empty_set
```

```
set()
```

```python
even_set = {2, 4, 6, 6, 8, 10}
even_set
```

```
{2, 4, 6, 8, 10}
```

## 2. Set Operations

```python
num_set = {3, 6, 9, 12, 15, 18}
display(num_set & even_set,
        num_set | even_set,
        num_set - even_set,
        even_set - num_set,
        even_set ^ num_set,
        (even_set | num_set) - (even_set & num_set))
```

```
{6}
{2, 3, 4, 6, 8, 9, 10, 12, 15, 18}
{3, 9, 12, 15, 18}
{2, 4, 8, 10}
{2, 3, 4, 8, 9, 10, 12, 15, 18}
{2, 3, 4, 8, 9, 10, 12, 15, 18}
```
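Each operator above has a method counterpart on `set`. The following sketch redefines the two sets so that it is self-contained; the result variable names are illustrative assumptions.

```python
even_set = {2, 4, 6, 8, 10}
num_set = {3, 6, 9, 12, 15, 18}

common = num_set.intersection(even_set)                # same as num_set & even_set
either = num_set.union(even_set)                       # same as num_set | even_set
only_num = num_set.difference(even_set)                # same as num_set - even_set
exactly_one = num_set.symmetric_difference(even_set)   # same as num_set ^ even_set
print(common, only_num)

# Unlike the operators, the methods accept any iterable, not only sets.
from_list = even_set.intersection([4, 5, 6])
print(from_list)

# Subset test with <=
is_subset = {2, 4} <= even_set
print(is_subset)
```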

# V. Operations on Data Structures

## 1. Converting to a List

```python
display(list('ababc'),
        list((1, 2, 3, 4)),
        list({'name': 'Ed', 'employer': 'Oracle'}),
        list({'name': 'Ed', 'employer': 'Oracle'}.values()),
        sorted(list({5, 6, 7, 8})))
```

```
['a', 'b', 'a', 'b', 'c']
[1, 2, 3, 4]
['name', 'employer']
['Ed', 'Oracle']
[5, 6, 7, 8]
```

## 2. Converting to a Dictionary

```python
dict('ababc')
```

```
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-47-f82ea82e76c5> in <module>
----> 1 dict('ababc')

ValueError: dictionary update sequence element #0 has length 1; 2 is required
```

```python
dict([1, 2, 3])
```

```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-48-e308b22febdd> in <module>
----> 1 dict([1,2,3])

TypeError: cannot convert dictionary update sequence element #0 to a sequence
```

```python
dict(['ab', 'cd', 'ef'])
```

```
{'a': 'b', 'c': 'd', 'e': 'f'}
```

```python
dict([['a', 'b'], ('c', 'd'), ('e', 'f')])
```

```
{'a': 'b', 'c': 'd', 'e': 'f'}
```

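## 3. zip

The `zip()` function takes two or more sequences and produces an iterator of tuples, each holding the items at the same position in the inputs.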

```python
s1 = 'abcdefg'
s2 = 'hijklmn'
list(zip(s1, s2))
```

```
[('a', 'h'), ('b', 'i'), ('c', 'j'), ('d', 'k'), ('e', 'l'), ('f', 'm'), ('g', 'n')]
```

```python
d = list(zip(s1, s2))   # the list of pairs shown above
s3, s4 = zip(*d)        # unzip the pairs back into two tuples
print(list(s3))
print(list(s4))
```

```
['a', 'b', 'c', 'd', 'e', 'f', 'g']
['h', 'i', 'j', 'k', 'l', 'm', 'n']
```
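Two further points about `zip()` are sketched below with made-up data: it stops at the shortest input, and its pairs can be passed straight to `dict()`. The names `keys`, `values`, `pairs` and `order` are illustrative.

```python
keys = ['size', 'type', 'qty']
values = ['medium', 'pepperoni', 1, 'extra']   # one more item than keys

pairs = list(zip(keys, values))   # zip stops at the shortest input
order = dict(zip(keys, values))   # each (key, value) pair becomes one entry
print(pairs)
print(order)
```

The unmatched `'extra'` item is silently dropped, so it is worth checking lengths first when every item matters.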

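## 4. Mutable and Immutable

Python data types are either mutable or immutable. Mutable types include lists, dictionaries and sets, as well as user-defined classes by default; to make a user-defined class immutable, you need to override the `__setattr__` and `__delattr__` methods inherited from `object`. Immutable types include integers, floats, booleans, strings and tuples.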
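As a rough sketch of the `__setattr__`/`__delattr__` override mentioned above, the hypothetical class `FrozenPoint` below blocks attribute assignment and deletion after construction. It only illustrates the idea and is not a complete immutability guarantee (for instance, `object.__setattr__` can still bypass it).

```python
class FrozenPoint:
    """Hypothetical example class whose attributes cannot be changed after creation."""

    def __init__(self, x, y):
        # Bypass our own __setattr__ during initialisation.
        object.__setattr__(self, 'x', x)
        object.__setattr__(self, 'y', y)

    def __setattr__(self, name, value):
        raise AttributeError("FrozenPoint is immutable")

    def __delattr__(self, name):
        raise AttributeError("FrozenPoint is immutable")


p = FrozenPoint(1, 2)
blocked = False
try:
    p.x = 10                # goes through the overridden __setattr__
except AttributeError as exc:
    blocked = True
    print("blocked:", exc)
print(p.x)
```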

```python
s = 'hello'
print("{} id is {}".format(s, id(s)))
s = 'Yello'
print("{} id is {}".format(s, id(s)))
s = s + 'w'
print("{} id is {}".format(s, id(s)))
```

```
hello id is 3116696357488
Yello id is 3116695670320
Yellow id is 3116696394800
```

```python
i = 1
print("{} id is {}".format(i, id(i)))
i = 10
print("{} id is {}".format(i, id(i)))
i = 10 + 1
print("{} id is {}".format(i, id(i)))
```

```
1 id is 140709660735264
10 id is 140709660735552
11 id is 140709660735584
```

```python
s = 'hello'
s[0] = 'b'
```

```
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-57-ee36c369d295> in <module>
      1 s = 'hello'
----> 2 s[0] = 'b'

TypeError: 'str' object does not support item assignment
```

```python
try_result = []  # cold refresh

def try_func(arg, result):
    print("inside before result is labeled {}, value {}".format(id(result), result))
    result.append(arg)
    print("inside after result is labeled {}, value {}".format(id(result), result))
    print("inside try_result is labeled {}, value is {}".format(id(try_result), try_result))

print("before try_result is labeled {}, value is {}".format(id(try_result), try_result))
try_func('a', try_result)
print("after try_result is labeled {}, value is {}".format(id(try_result), try_result))
try_result.append('b')
print(try_result)
try_func('a', try_result)
```

```
before try_result is labeled 3116696770752, value is []
inside before result is labeled 3116696770752, value []
inside after result is labeled 3116696770752, value ['a']
inside try_result is labeled 3116696770752, value is ['a']
after try_result is labeled 3116696770752, value is ['a']
['a', 'b']
inside before result is labeled 3116696770752, value ['a', 'b']
inside after result is labeled 3116696770752, value ['a', 'b', 'a']
inside try_result is labeled 3116696770752, value is ['a', 'b', 'a']
```

```python
imu_number = 10

def updateNumber(number):
    print("inside number is labeled {}, value {}".format(id(number), number))
    print("inside imu_number is labeled {}, value {}".format(id(imu_number), imu_number))
    number = 20
    print("inside number is {}, outside imu_number is {}".format(number, imu_number))

print("before imu_number is labeled {}, value {}".format(id(imu_number), imu_number))
updateNumber(imu_number)
print("after imu_number is labeled {}, value {}".format(id(imu_number), imu_number))
```

```
before imu_number is labeled 140709660735552, value 10
inside number is labeled 140709660735552, value 10
inside imu_number is labeled 140709660735552, value 10
inside number is 20, outside imu_number is 10
after imu_number is labeled 140709660735552, value 10
```

```python
def buggy(arg, result=[]):
    # The default list is created once and shared across calls.
    print(id(result))
    result.append(arg)
    print(result)

buggy('a')
buggy('b')
```

```
3116693617856
['a']
3116693617856
['a', 'b']
```

```python
def nonbuggy(arg, result=None):
    # Create a fresh list on each call when none is passed in.
    if result is None:
        result = []
    print(id(result))
    result.append(arg)
    print(result)

nonbuggy('a')
nonbuggy('b')
```

```
3116695253696
['a']
3116696770496
['b']
```

## 5. Iterating over Sequence Data Structures

```python
weekdays = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
i = 0
while i < len(weekdays):
    print(weekdays[i])
    i += 1
```

```
Monday
Tuesday
Wednesday
Thursday
Friday
```

```python
for i in range(len(weekdays)):
    print(weekdays[i])
```

```python
for day in weekdays:
    print(day)
```

```python
for key, day in enumerate(weekdays):
    print(str(key) + " " + day)
```

```
0 Monday
1 Tuesday
2 Wednesday
3 Thursday
4 Friday
```
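`enumerate()` also takes an optional `start` argument, which is handy when the numbering should begin at 1 instead of 0. A brief sketch; the list `numbered` is an illustrative name:

```python
weekdays = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']

numbered = []
for n, day in enumerate(weekdays, start=1):   # counting begins at 1
    numbered.append("{}. {}".format(n, day))
print(numbered)
```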

```python
pizza = {
    "size": "medium",
    "size": "small",
    "type": "pepperoni",
    "crust": "Thick",
    "qty": 1,
    "deliver": True,
}
for k in pizza:
    print(k)
print()
for k, v in pizza.items():
    print("key is {}, value is {}".format(k, v))
print()
for v in pizza.values():
    print(v)
```

```
size
type
crust
qty
deliver

key is size, value is small
key is type, value is pepperoni
key is crust, value is Thick
key is qty, value is 1
key is deliver, value is True

small
pepperoni
Thick
1
True
```

https://www.helloworld.net/p/9o4YT7phjmiJ3