I can do it!!

[Data Analysis with pandas] SK infosec Cloud AI Expert Training Course: class notes

gogoriver 2020. 9. 8. 09:04

In [1]:
%matplotlib inline
import pandas as pd
import matplotlib.pyplot as plt
In [2]:
##1. Load the CSV file (convenient_store.csv)
df = pd.read_csv('C:/Users/ka030/Documents/GitHub/python_analysis/sources/Day4/workbook/convenient_store.csv')
In [3]:
##2. Check the column info and whether any values are null
df.info()
 
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 177 entries, 0 to 176
Data columns (total 7 columns):
area           177 non-null object
company        177 non-null object
hourly_wage    177 non-null int64
area1          177 non-null object
area2          177 non-null object
outlier        177 non-null int64
name           177 non-null object
dtypes: int64(2), object(5)
memory usage: 9.8+ KB
In [4]:
##3. Check the count, mean, standard deviation, min, and max
df.describe()
Out[4]:
  hourly_wage outlier
count 177.000000 177.0
mean 5787.627119 0.0
std 352.318646 0.0
min 5580.000000 0.0
25% 5580.000000 0.0
50% 5600.000000 0.0
75% 6000.000000 0.0
max 7500.000000 0.0
In [5]:
##4. Statistics for area: count, number of unique values, and the most frequent area
df['area'].describe()
Out[5]:
count         177
unique        117
top       강남구 논현동
freq            7
Name: area, dtype: object
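The `top` and `freq` rows above come from counting how often each value appears. A minimal sketch with a small made-up Series (the values are illustrative, not rows from convenient_store.csv) shows the same numbers through `value_counts()`:

```python
import pandas as pd

# Hypothetical sample, not taken from convenient_store.csv
area = pd.Series(["강남구 논현동", "강남구 논현동", "중구 소공동", "강남구 삼성동", "강남구 논현동"])

counts = area.value_counts()   # frequency of each distinct value, most frequent first
summary = area.describe()      # count / unique / top / freq for an object column

print(counts)
print(summary)
```

`value_counts()` is handy when you want the whole ranking rather than only the single most frequent value.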
In [6]:
df.head()
Out[6]:
  area company hourly_wage area1 area2 outlier name
0 강남구 삼성동 gs25 오크우드점 5600 강남구 삼성동 0 gs25
1 강남구 삼성동 gs25 코엑스점 5700 강남구 삼성동 0 gs25
2 강서구 화곡동 gs25편의점 5600 강서구 화곡동 0 gs25
3 광진구 군자동 gs25 5580 광진구 군자동 0 gs25
4 광진구 중곡동 gs25중곡대원점 5580 광진구 중곡동 0 gs25
In [7]:
##5. Print the convenience stores with an hourly wage of at least 6500 won (first 10 only)
df[df['hourly_wage']>=6500].head(10)
Out[7]:
  area company hourly_wage area1 area2 outlier name
29 강남구 논현동 세븐일레븐편의점 7500 강남구 논현동 0 7/11
30 강남구 청담동 세븐일레븐 청담그린점 6500 강남구 청담동 0 7/11
37 강서구 등촌동 세븐일레븐 서울호서대점 6600 강서구 등촌동 0 7/11
53 구로구 구로4동 세븐일레븐 구로리공원점 6500 구로구 구로4동 0 7/11
60 도봉구 창동 세븐일레븐 6500 도봉구 창동 0 7/11
72 마포구 동교동 세븐일레븐 마포홍익점 6500 마포구 동교동 0 7/11
89 성동구 사근동 세븐일레븐 한양대학교병원점 6500 성동구 사근동 0 7/11
135 중구 명동2가 세븐일레븐 6690 중구 명동2가 0 7/11
137 중구 북창동 세븐일레븐 북창점 6500 중구 북창동 0 7/11
145 강남구 논현1동 CUBE pc방 6500 강남구 논현1동 0 CU
In [8]:
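##6. Sort by hourly wage, highest first (use sort_values(), first 10 only)
# Note: sort_values() is ascending by default, so the output below lists the lowest wages;
# pass ascending=False to get the highest first.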
df['hourly_wage'].sort_values().head(10)
Out[8]:
176    5580
109    5580
64     5580
111    5580
59     5580
58     5580
115    5580
116    5580
108    5580
55     5580
Name: hourly_wage, dtype: int64
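Out[8] lists 5580 ten times because `sort_values()` sorts in ascending order unless told otherwise. A small sketch, using made-up wages rather than the real data, of how a highest-first sort could look:

```python
import pandas as pd

# Hypothetical wages, not the real data set
wages = pd.Series([5580, 7500, 6000, 6900, 5600], name="hourly_wage")

lowest_first = wages.sort_values()                  # default: ascending=True
highest_first = wages.sort_values(ascending=False)  # highest wage first
top3 = wages.nlargest(3)                            # shortcut for the n largest values

print(highest_first.head(3))
```

Applied to the notebook, `df['hourly_wage'].sort_values(ascending=False).head(10)` would be the matching call.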
In [9]:
##7. Find the convenience stores in Yeongdeungpo-gu (영등포구) with an hourly wage of at least 6000 won
df[(df['hourly_wage']>=6000)&(df['area1']=='영등포구')]
Out[9]:
  area company hourly_wage area1 area2 outlier name
13 영등포구 영등포동 gs25 6300 영등포구 영등포동 0 gs25
106 영등포구 여의도동 세븐일레븐 여의역점 6000 영등포구 여의도동 0 7/11
107 영등포구 영등포동 세븐일레븐 영등포 2호점 6200 영등포구 영등포동 0 7/11
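The filter in In [9] combines two boolean masks with `&`. A minimal sketch of the same pattern on a made-up frame (only the column names match the notebook), together with an equivalent `query()` form:

```python
import pandas as pd

# Hypothetical rows using the same column names as the notebook
df_demo = pd.DataFrame({
    "area1": ["영등포구", "영등포구", "강남구", "중구"],
    "hourly_wage": [6300, 5580, 6500, 6000],
})

# Each condition needs its own parentheses because & binds tighter than >= and ==
mask = (df_demo["hourly_wage"] >= 6000) & (df_demo["area1"] == "영등포구")
by_mask = df_demo[mask]

# Same filter written with query(); @district refers to the local variable
district = "영등포구"
by_query = df_demo.query("hourly_wage >= 6000 and area1 == @district")

print(by_mask)
```

Dropping the parentheses is a common mistake: Python would then try to evaluate `6000 & df_demo["area1"]` first.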
In [10]:
##8. Print only the CU convenience stores (first 10 only)
df[df['name']=='CU'].head(10)
Out[10]:
  area company hourly_wage area1 area2 outlier name
141 강남구 논현동 CU 논현힐탑점 5600 강남구 논현동 0 CU
142 강남구 논현동 CU논현한미점 6000 강남구 논현동 0 CU
143 강남구 신사동 CU 로데오점 6100 강남구 신사동 0 CU
144 강남구 대치4동 CU 대치본점 6000 강남구 대치4동 0 CU
145 강남구 논현1동 CUBE pc방 6500 강남구 논현1동 0 CU
146 강북구 수유3동 CU수유중앙점 5600 강북구 수유3동 0 CU
147 강서구 방화동 CU방화오피스점 6000 강서구 방화동 0 CU
148 관악구 신원동 CU신림인석점 5580 관악구 신원동 0 CU
149 관악구 봉천동 CU관악상상점 5580 관악구 봉천동 0 CU
150 구로구 구로동 CU편의점(고대구로병원1호점) 5580 구로구 구로동 0 CU
In [11]:
##9. Add a location column, store the value "in Seoul" in it, and print the first 5 rows
# df.drop(['지역 컬럼(location)'],axis=1,inplace=True)
df['location']="in Seoul"
df.head(5)
Out[11]:
  area company hourly_wage area1 area2 outlier name location
0 강남구 삼성동 gs25 오크우드점 5600 강남구 삼성동 0 gs25 in Seoul
1 강남구 삼성동 gs25 코엑스점 5700 강남구 삼성동 0 gs25 in Seoul
2 강서구 화곡동 gs25편의점 5600 강서구 화곡동 0 gs25 in Seoul
3 광진구 군자동 gs25 5580 광진구 군자동 0 gs25 in Seoul
4 광진구 중곡동 gs25중곡대원점 5580 광진구 중곡동 0 gs25 in Seoul
In [12]:
##10. Add a more_than_6000 column holding True/False (print the first 20 rows)
# Note: the comparison is strict (>), so a wage of exactly 6000 is False, as the output shows.
df['more_than_6000'] = df['hourly_wage'] > 6000
df.head(20)
Out[12]:
  area company hourly_wage area1 area2 outlier name location more_than_6000
0 강남구 삼성동 gs25 오크우드점 5600 강남구 삼성동 0 gs25 in Seoul False
1 강남구 삼성동 gs25 코엑스점 5700 강남구 삼성동 0 gs25 in Seoul False
2 강서구 화곡동 gs25편의점 5600 강서구 화곡동 0 gs25 in Seoul False
3 광진구 군자동 gs25 5580 광진구 군자동 0 gs25 in Seoul False
4 광진구 중곡동 gs25중곡대원점 5580 광진구 중곡동 0 gs25 in Seoul False
5 구로구 구로동 gs25구로동양점 6000 구로구 구로동 0 gs25 in Seoul False
6 구로구 구로동 gs25구로동양점 5580 구로구 구로동 0 gs25 in Seoul False
7 동대문구 장안동 gs25장안중앙점 5600 동대문구 장안동 0 gs25 in Seoul False
8 마포구 서교동 gs25 홍대아트점 5600 마포구 서교동 0 gs25 in Seoul False
9 성동구 금호동4 gs25 서울숲푸르지오점 6000 성동구 금호동4 0 gs25 in Seoul False
10 성북구 동소문동 gs25동소문본점 5580 성북구 동소문동 0 gs25 in Seoul False
11 성북구 하월곡동 gs25 성북 푸르지오 5800 성북구 하월곡동 0 gs25 in Seoul False
12 송파구 송파동 gs25송파중앙점 5600 송파구 송파동 0 gs25 in Seoul False
13 영등포구 영등포동 gs25 6300 영등포구 영등포동 0 gs25 in Seoul True
14 은평구 신사동 gs25 5580 은평구 신사동 0 gs25 in Seoul False
15 종로구 관수동 gs25 국일관점 5600 종로구 관수동 0 gs25 in Seoul False
16 중구 소공동 gs25 5600 중구 소공동 0 gs25 in Seoul False
17 중구 신당동 gs25 5580 중구 신당동 0 gs25 in Seoul False
18 강남구 논현동 세븐일레븐 논현11호점 6200 강남구 논현동 0 7/11 in Seoul True
19 강남구 신사동 편의점/ 세븐일레븐 압구정 리갈펠리스점 5600 강남구 신사동 0 7/11 in Seoul False
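In Out[12], rows 5 and 9 earn exactly 6000 won and are still marked False, because the condition is `> 6000` rather than `>= 6000`. A short sketch on made-up values shows the difference; comparing the whole column at once also removes the need for a helper list:

```python
import pandas as pd

wages = pd.Series([5580, 6000, 6300], name="hourly_wage")  # hypothetical values

strictly_more = wages > 6000   # 6000 itself -> False (the behaviour seen in Out[12])
at_least = wages >= 6000       # 6000 itself -> True ("6000 won or more")

print(pd.DataFrame({"wage": wages, ">": strictly_more, ">=": at_least}))
```

Which one is right depends on how the exercise is read; the later outputs in these notes all follow the strict version.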
In [13]:
##11. Print the mean, count, standard deviation, etc. for the rows where more_than_6000 is True
df[df['more_than_6000']].describe()
Out[13]:
  hourly_wage outlier
count 25.000000 25.0
mean 6518.000000 0.0
std 343.923441 0.0
min 6100.000000 0.0
25% 6300.000000 0.0
50% 6500.000000 0.0
75% 6500.000000 0.0
max 7500.000000 0.0
In [14]:
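##12. Define a function named more_than_6000 that returns A_group for wages of 6000 won or more, otherwise B_group
# (note: the code below uses a strict >, so exactly 6000 falls in B_group)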

def more_than_6000(x):
    if x>6000:
        return "A_group"
    else:
        return "B_group"
In [15]:
##13. Create a more_than_6000_f column and store the result of the more_than_6000 function

df['more_than_6000_f'] = df['hourly_wage'].apply(more_than_6000)
In [16]:
##14. Print the first 10 rows of the result so far
df.head(10)
Out[16]:
  area company hourly_wage area1 area2 outlier name location more_than_6000 more_than_6000_f
0 강남구 삼성동 gs25 오크우드점 5600 강남구 삼성동 0 gs25 in Seoul False B_group
1 강남구 삼성동 gs25 코엑스점 5700 강남구 삼성동 0 gs25 in Seoul False B_group
2 강서구 화곡동 gs25편의점 5600 강서구 화곡동 0 gs25 in Seoul False B_group
3 광진구 군자동 gs25 5580 광진구 군자동 0 gs25 in Seoul False B_group
4 광진구 중곡동 gs25중곡대원점 5580 광진구 중곡동 0 gs25 in Seoul False B_group
5 구로구 구로동 gs25구로동양점 6000 구로구 구로동 0 gs25 in Seoul False B_group
6 구로구 구로동 gs25구로동양점 5580 구로구 구로동 0 gs25 in Seoul False B_group
7 동대문구 장안동 gs25장안중앙점 5600 동대문구 장안동 0 gs25 in Seoul False B_group
8 마포구 서교동 gs25 홍대아트점 5600 마포구 서교동 0 gs25 in Seoul False B_group
9 성동구 금호동4 gs25 서울숲푸르지오점 6000 성동구 금호동4 0 gs25 in Seoul False B_group
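The A_group/B_group labelling can also be done without calling a Python function per row. A minimal sketch, assuming made-up wages and using `numpy.where` as an alternative to `apply` (this is a swapped-in technique, not what the notebook does):

```python
import numpy as np
import pandas as pd

df_demo = pd.DataFrame({"hourly_wage": [5600, 6000, 6300]})  # hypothetical values

def more_than_6000(x):
    # Same rule as the notebook: strictly more than 6000 -> A_group
    return "A_group" if x > 6000 else "B_group"

via_apply = df_demo["hourly_wage"].apply(more_than_6000)
via_where = np.where(df_demo["hourly_wage"] > 6000, "A_group", "B_group")

print(via_apply.tolist())
print(via_where.tolist())
```

Both give the same labels; `np.where` evaluates the condition on the whole column at once.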
In [17]:
##15-1. Create a new DataFrame (data2) with the area and hourly wage of the rows where more_than_6000 is True
data2 = df[df['more_than_6000']][["area","hourly_wage"]]
data2

##15-2. Sort data2 by hourly wage (highest first)
data2.sort_values("hourly_wage",ascending=False)
Out[17]:
  area hourly_wage
173 용산구 이태원동 7500
29 강남구 논현동 7500
160 서대문구 신촌동 6900
135 중구 명동2가 6690
37 강서구 등촌동 6600
172 용산구 한남동 6600
30 강남구 청담동 6500
53 구로구 구로4동 6500
60 도봉구 창동 6500
161 서대문구 현저동 6500
72 마포구 동교동 6500
89 성동구 사근동 6500
145 강남구 논현1동 6500
137 중구 북창동 6500
121 종로구 관수동 6480
128 종로구 낙원동 6480
133 중구 다동 6300
151 금천구 독산동 6300
152 금천구 가산동 6300
61 동대문구 장안동 6300
13 영등포구 영등포동 6300
18 강남구 논현동 6200
107 영등포구 영등포동 6200
56 구로구 개봉동 6200
143 강남구 신사동 6100
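One detail worth noting: `sort_values()` returns a new DataFrame and leaves the original untouched, so `data2` itself is still in its original row order after In [17]. A small sketch with made-up rows:

```python
import pandas as pd

# Hypothetical stand-in for data2
data2_demo = pd.DataFrame({"area": ["a", "b", "c"], "hourly_wage": [6100, 7500, 6500]})

data2_demo.sort_values("hourly_wage", ascending=False)   # result is discarded
unchanged = data2_demo["hourly_wage"].tolist()            # still the original order

sorted_demo = data2_demo.sort_values("hourly_wage", ascending=False)  # keep the result

print(unchanged)
print(sorted_demo)
```

If the sorted order should end up in a saved file, the sorted result has to be assigned to a name first.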
In [18]:
##16. Save data2 to a CSV file (the exercise names it data2.csv; this cell writes output.csv)
data2.to_csv("output.csv",encoding='ms949', index=False)
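`ms949` is a Python alias for the `cp949` codec, so a file written this way has to be read back with the same encoding. A minimal round-trip sketch, assuming a temporary directory and a hypothetical file name (the two rows are copied from Out[17]):

```python
import os
import tempfile

import pandas as pd

data2_demo = pd.DataFrame({"area": ["용산구 이태원동", "강남구 논현동"], "hourly_wage": [7500, 7500]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "data2_demo.csv")   # hypothetical file name
    data2_demo.to_csv(path, encoding="ms949", index=False)
    restored = pd.read_csv(path, encoding="ms949")   # read with the same encoding

print(restored)
```

`index=False` keeps the row labels out of the file, which is why the restored frame has a fresh 0, 1 index.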
In [41]:
##17. Show the hourly wage as a histogram (using matplotlib hist())
hourly_wage_data = df['hourly_wage']
plt.hist(hourly_wage_data)

df.hourly_wage.hist(bins=10)
plt.show()
 
In [20]:
##18. Show the hourly wage as a box plot
plt.boxplot(hourly_wage_data)
Out[20]:
{'whiskers': [<matplotlib.lines.Line2D at 0x20a4d7c3f48>,
  <matplotlib.lines.Line2D at 0x20a4d7c88c8>],
 'caps': [<matplotlib.lines.Line2D at 0x20a4d7c8f48>,
  <matplotlib.lines.Line2D at 0x20a4d7c8fc8>],
 'boxes': [<matplotlib.lines.Line2D at 0x20a4d7c36c8>],
 'medians': [<matplotlib.lines.Line2D at 0x20a4d7cefc8>],
 'fliers': [<matplotlib.lines.Line2D at 0x20a4d7cef88>],
 'means': []}
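To read the box plot, it helps to work out where the whiskers can reach. A rough sketch using the quartiles from Out[4] and matplotlib's default whisker factor of 1.5:

```python
# Quartiles copied from the df.describe() output in Out[4]
q1, q3 = 5580.0, 6000.0

iqr = q3 - q1                    # interquartile range (height of the box)
lower_fence = q1 - 1.5 * iqr     # 1.5 is the default `whis` of plt.boxplot
upper_fence = q3 + 1.5 * iqr

print(iqr, lower_fence, upper_fence)
```

Wages beyond the upper fence, such as the 6690, 6900, and 7500 values visible in Out[17], are drawn as individual flier points, while 6600 still falls within whisker range.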
 
In [22]:
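##19. Box plot of hourly wage by name (sorted by name)
# Note: sorting by name and plotting a single Series still draws one box;
# the cells below group by name instead.
by_name_wages=df.sort_values(by=['name'])["hourly_wage"]
plt.boxplot(by_name_wages)
plt.title("시간당 급여(by name)")  # "Hourly wage (by name)"
plt.xlabel("이름순으로 정렬된 데이터")  # "Data sorted by name"
plt.ylabel("급여")  # "Wage"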
Out[22]:
Text(0, 0.5, '급여')
 
C:\Users\ka030\Anaconda3\lib\site-packages\matplotlib\backends\backend_agg.py:211: RuntimeWarning: Glyph 49884 missing from current font.
  font.set_text(s, 0.0, flags=flags)
 
In [28]:
# ##19. Box plot of hourly wage by name
# import matplotlib.pyplot as plt
import seaborn as sns
data = df[['hourly_wage','name']]
sns.boxplot(y = 'hourly_wage', x = 'name', data=data ,palette='rainbow') 
Out[28]:
<matplotlib.axes._subplots.AxesSubplot at 0x20a528cdac8>
 
In [43]:
# Step 19 again
df.boxplot(column='hourly_wage', by='name')
Out[43]:
<matplotlib.axes._subplots.AxesSubplot at 0x20a54f4b708>
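The grouped box plot has a numeric counterpart: `groupby` followed by `describe()` gives the quartiles for each name. A minimal sketch on made-up rows (only the column names match the notebook):

```python
import pandas as pd

# Hypothetical rows, not the real data set
df_demo = pd.DataFrame({
    "name": ["gs25", "gs25", "7/11", "7/11", "CU"],
    "hourly_wage": [5600, 5700, 6500, 7500, 6000],
})

# One row of summary statistics per name
per_name = df_demo.groupby("name")["hourly_wage"].describe()

print(per_name)
```

Reading the table next to the plot makes it easier to see which group a given box belongs to.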
 
In [31]:
##19. Box plot of hourly wage by area

import seaborn as sns
data = df[['hourly_wage','area1']]
sns.boxplot(y = 'hourly_wage', x = 'area1', data=data ,palette='rainbow') 
Out[31]:
<matplotlib.axes._subplots.AxesSubplot at 0x20a534ecb88>
 
C:\Users\ka030\Anaconda3\lib\site-packages\matplotlib\backends\backend_agg.py:211: RuntimeWarning: Glyph 44053 missing from current font.
  font.set_text(s, 0.0, flags=flags)
 
In [44]:
df.boxplot(column='hourly_wage', by='area1')
Out[44]:
<matplotlib.axes._subplots.AxesSubplot at 0x20a54f7d548>
 
In [35]:
##20. Set a matplotlib font so that Korean text is displayed
import matplotlib
import matplotlib.font_manager as fm
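The cell above only shows the imports. One way the font setup could look is sketched below; the candidate font names are assumptions (Malgun Gothic is common on Windows, AppleGothic on macOS, NanumGothic where it has been installed), and `pick_korean_font` is a hypothetical helper, not part of matplotlib:

```python
import matplotlib
import matplotlib.font_manager as fm

# Candidate font families are assumptions; adjust to what is installed
CANDIDATES = ["Malgun Gothic", "AppleGothic", "NanumGothic"]

def pick_korean_font(candidates=CANDIDATES):
    """Return the first candidate family that matplotlib knows about, else None."""
    installed = {font.name for font in fm.fontManager.ttflist}
    for family in candidates:
        if family in installed:
            return family
    return None

chosen = pick_korean_font()
if chosen is not None:
    matplotlib.rcParams["font.family"] = chosen
# Avoid a missing-glyph box for the minus sign with fonts that lack U+2212
matplotlib.rcParams["axes.unicode_minus"] = False

print(chosen)
```

If `chosen` comes back as None, none of the candidates is installed and the "Glyph missing from current font" warnings will keep appearing.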

 
In [36]:
##21. Box plot of hourly wage by area: run again so that the Korean text is rendered


import seaborn as sns
data = df[['hourly_wage','area1']]
sns.boxplot(y = 'hourly_wage', x = 'area1', data=data ,palette='rainbow')
Out[36]:
<matplotlib.axes._subplots.AxesSubplot at 0x20a53a70508>
 
In [46]:
df.boxplot(column='hourly_wage', by='area1')
Out[46]:
<matplotlib.axes._subplots.AxesSubplot at 0x20a54f7d508>
 
In [47]:
##22-1. Box plot by district

df.boxplot(column='hourly_wage', by='area1')
##22-2. Font size 6
plt.xticks(fontsize=6)
Out[47]:
(array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
        18, 19, 20, 21, 22, 23, 24, 25]),
 <a list of 25 Text xticklabel objects>)
 
In [48]:
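##23-1. Box plot by district, with the districts listed vertically

df.boxplot(column='hourly_wage', by='area1', vert=False)
##23-2. Font size 6
# Note: with vert=False the districts sit on the y-axis, so xticks here resizes the wage labels;
# plt.yticks(fontsize=6) would resize the district labels.
plt.xticks(fontsize=6)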
Out[48]:
(array([5250., 5500., 5750., 6000., 6250., 6500., 6750., 7000., 7250.,
        7500., 7750.]), <a list of 11 Text xticklabel objects>)
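Out[48] lists wage values (5250 to 7750) as the x ticks, which suggests that with `vert=False` the district names have moved to the y-axis. A small sketch on made-up data, assuming the goal is to shrink the district labels, sets the font size through `plt.yticks` instead:

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical rows using the notebook's column names
df_demo = pd.DataFrame({
    "area1": ["A", "A", "B", "B"],
    "hourly_wage": [5580, 6000, 6500, 7500],
})

df_demo.boxplot(column="hourly_wage", by="area1", vert=False)

# Group labels are on the y-axis for a horizontal box plot
locs, labels = plt.yticks(fontsize=6)
sizes = [label.get_fontsize() for label in labels]

print(sizes)
```

Newer matplotlib releases may warn that `vert` is being replaced by an `orientation` argument; the idea of targeting the y-axis stays the same.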