Let me introduce the req table.
req.head()
| | Unique Key | Created Date | Closed Date | Agency | Complaint Type | Descriptor | Location Type | Zip | Address | Address Type | City | Status | Resolution Description | Borough | X Coordinate (State Plane) | Y Coordinate (State Plane) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 32211354 | 12/16/2015 04:44:58 PM | 12/20/2015 10:43:23 AM | HPD | UNSANITARY CONDITION | PESTS | RESIDENTIAL BUILDING | 10034.0 | 118 VERMILYEA AVENUE | ADDRESS | NEW YORK | Closed | The Department of Housing Preservation and Dev... | MANHATTAN | 1005955 | 254952 |
1 | 32211886 | 12/16/2015 09:31:09 AM | 12/19/2015 12:05:54 PM | HPD | HEAT/HOT WATER | ENTIRE BUILDING | RESIDENTIAL BUILDING | 10461.0 | 1026 MORRIS PARK AVENUE | ADDRESS | BRONX | Closed | The Department of Housing Preservation and Dev... | BRONX | 1024296 | 248458 |
2 | 32212824 | 12/16/2015 03:35:43 PM | 12/18/2015 01:10:26 AM | HPD | HEAT/HOT WATER | ENTIRE BUILDING | RESIDENTIAL BUILDING | 10456.0 | 1221 COLLEGE AVENUE | ADDRESS | BRONX | Closed | The Department of Housing Preservation and Dev... | BRONX | 1008281 | 242832 |
3 | 32213779 | 12/16/2015 02:23:40 AM | 12/21/2015 01:33:29 PM | HPD | HEAT/HOT WATER | APARTMENT ONLY | RESIDENTIAL BUILDING | 10453.0 | 150 WEST 179 STREET | ADDRESS | BRONX | Closed | The Department of Housing Preservation and Dev... | BRONX | 1008161 | 250940 |
4 | 32214486 | 12/16/2015 03:07:57 AM | 12/16/2015 07:43:08 AM | NYPD | Blocked Driveway | No Access | Street/Sidewalk | 10459.0 | 1343 CHISHOLM STREET | ADDRESS | BRONX | Closed | The Police Department issued a summons in resp... | BRONX | 1013104 | 242126 |
1. To pull out and view only the column you want
req['Complaint Type'].head()
0    UNSANITARY CONDITION
1    HEAT/HOT WATER
2    HEAT/HOT WATER
3    HEAT/HOT WATER
4    Blocked Driveway
Name: Complaint Type, dtype: object
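As a minimal sketch of the column selection above: a single label returns a Series, while a list of labels returns a DataFrame. The `req` built here is a hypothetical stand-in holding three columns from the first five rows shown earlier, not the full dataset.

```python
import pandas as pd

# Hypothetical stand-in for req: three columns from the first five rows above.
req = pd.DataFrame({
    'Agency': ['HPD', 'HPD', 'HPD', 'HPD', 'NYPD'],
    'Complaint Type': ['UNSANITARY CONDITION', 'HEAT/HOT WATER', 'HEAT/HOT WATER',
                       'HEAT/HOT WATER', 'Blocked Driveway'],
    'City': ['NEW YORK', 'BRONX', 'BRONX', 'BRONX', 'BRONX'],
})

one_column = req['Complaint Type']             # single label -> Series
two_columns = req[['Complaint Type', 'City']]  # list of labels -> DataFrame

print(type(one_column).__name__)
print(type(two_columns).__name__)
print(two_columns.head())
```

The double brackets in the second selection are simply a Python list placed inside the indexing brackets.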
2. To see which distinct values a column contains
req['Complaint Type'].unique()
array(['UNSANITARY CONDITION', 'HEAT/HOT WATER', 'Blocked Driveway', 'Street Light Condition'], dtype=object)
The result is returned as a NumPy array.
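The sketch below illustrates `unique()` on a small stand-in for `req` (a hypothetical frame holding only the first five complaint types shown earlier). It also shows two related calls, `nunique()` and `value_counts()`, which may be useful when the number of distinct values or their frequencies are needed.

```python
import pandas as pd

# Hypothetical stand-in for req: the Complaint Type values of the first five rows.
req = pd.DataFrame({
    'Complaint Type': ['UNSANITARY CONDITION', 'HEAT/HOT WATER', 'HEAT/HOT WATER',
                       'HEAT/HOT WATER', 'Blocked Driveway'],
})

distinct = req['Complaint Type'].unique()        # values in order of first appearance
n_distinct = req['Complaint Type'].nunique()     # how many distinct values
frequency = req['Complaint Type'].value_counts() # rows per distinct value

print(distinct)
print(n_distinct)
print(frequency)
```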
3. To see the rows that have a particular value in a column
req[req['Complaint Type']=='Blocked Driveway'].head() # show the first 5 rows whose Complaint Type is Blocked Driveway
| | Unique Key | Created Date | Closed Date | Agency | Complaint Type | Descriptor | Location Type | Zip | Address | Address Type | City | Status | Resolution Description | Borough | X Coordinate (State Plane) | Y Coordinate (State Plane) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
4 | 32214486 | 12/16/2015 03:07:57 AM | 12/16/2015 07:43:08 AM | NYPD | Blocked Driveway | No Access | Street/Sidewalk | 10459.0 | 1343 CHISHOLM STREET | ADDRESS | BRONX | Closed | The Police Department issued a summons in resp... | BRONX | 1013104 | 242126 |
9 | 32218456 | 12/17/2015 06:36:58 AM | 12/17/2015 07:38:47 AM | NYPD | Blocked Driveway | No Access | Street/Sidewalk | NaN | 834 SAINT ANN'S AVENUE | ADDRESS | BRONX | Closed | The Police Department responded to the complai... | BRONX | 1009226 | 238620 |
39 | 32236868 | 12/19/2015 03:42:22 AM | 12/19/2015 07:24:18 AM | NYPD | Blocked Driveway | No Access | Street/Sidewalk | 11103.0 | 31-29 43 STREET | ADDRESS | ASTORIA | Closed | The Police Department responded and upon arriv... | QUEENS | 1007700 | 215945 |
67 | 32255745 | 12/22/2015 11:54:24 PM | 12/23/2015 02:42:57 AM | NYPD | Blocked Driveway | No Access | Street/Sidewalk | 10472.0 | 2218 CHATTERTON AVENUE | ADDRESS | BRONX | Closed | The Police Department issued a summons in resp... | BRONX | 1025937 | 241136 |
70 | 32260772 | 12/23/2015 02:55:43 PM | 12/24/2015 03:12:40 AM | NYPD | Blocked Driveway | Partial Access | Street/Sidewalk | 11377.0 | 41-40 60 STREET | ADDRESS | WOODSIDE | Closed | The Police Department responded to the complai... | QUEENS | 1010789 | 210010 |
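The filter above works in two steps, which this sketch separates: the comparison produces a boolean Series (a mask), and indexing with that mask keeps only the rows where it is True. The `req` here is a hypothetical stand-in using a few rows from the tables above, and `mask` and `blocked` are illustrative names.

```python
import pandas as pd

# Hypothetical stand-in for req: rows 0-4 and 9 from the tables above.
req = pd.DataFrame(
    {
        'Complaint Type': ['UNSANITARY CONDITION', 'HEAT/HOT WATER', 'HEAT/HOT WATER',
                           'HEAT/HOT WATER', 'Blocked Driveway', 'Blocked Driveway'],
        'Agency': ['HPD', 'HPD', 'HPD', 'HPD', 'NYPD', 'NYPD'],
    },
    index=[0, 1, 2, 3, 4, 9],
)

# Step 1: compare every row against the wanted value -> boolean Series.
mask = req['Complaint Type'] == 'Blocked Driveway'

# Step 2: index with the mask to keep only the True rows.
blocked = req[mask]

print(mask)
print(blocked.head())  # head() with no argument returns at most 5 rows
```

The original index labels (4 and 9 here) are preserved in the filtered result, which is why the table above does not start at 0.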
4. To see the row at a particular index value
req.loc[3] # fetch the row whose index is 3
Unique Key                    32213779
Created Date                  12/16/2015 02:23:40 AM
Closed Date                   12/21/2015 01:33:29 PM
Agency                        HPD
Complaint Type                HEAT/HOT WATER
Descriptor                    APARTMENT ONLY
Location Type                 RESIDENTIAL BUILDING
Zip                           10453
Address                       150 WEST 179 STREET
Address Type                  ADDRESS
City                          BRONX
Status                        Closed
Resolution Description        The Department of Housing Preservation and Dev...
Borough                       BRONX
X Coordinate (State Plane)    1008161
Y Coordinate (State Plane)    250940
Name: 3, dtype: object
5. To see only certain columns of the row at a given index
# from the row with index 3, fetch the columns from Complaint Type through Zip (both ends included)
req.loc[3, 'Complaint Type':'Zip']
Complaint Type    HEAT/HOT WATER
Descriptor        APARTMENT ONLY
Location Type     RESIDENTIAL BUILDING
Zip               10453
Name: 3, dtype: object
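One detail worth a small sketch: a label slice with `loc` includes both endpoints, whereas a positional slice with `iloc` excludes the stop position, as ordinary Python slicing does. The `req` below is a hypothetical stand-in with six columns from the first five rows above.

```python
import pandas as pd

# Hypothetical stand-in for req: six columns from the first five rows above.
req = pd.DataFrame({
    'Agency': ['HPD', 'HPD', 'HPD', 'HPD', 'NYPD'],
    'Complaint Type': ['UNSANITARY CONDITION', 'HEAT/HOT WATER', 'HEAT/HOT WATER',
                       'HEAT/HOT WATER', 'Blocked Driveway'],
    'Descriptor': ['PESTS', 'ENTIRE BUILDING', 'ENTIRE BUILDING',
                   'APARTMENT ONLY', 'No Access'],
    'Location Type': ['RESIDENTIAL BUILDING'] * 4 + ['Street/Sidewalk'],
    'Zip': [10034.0, 10461.0, 10456.0, 10453.0, 10459.0],
    'Address': ['118 VERMILYEA AVENUE', '1026 MORRIS PARK AVENUE', '1221 COLLEGE AVENUE',
                '150 WEST 179 STREET', '1343 CHISHOLM STREET'],
})

# Label slice: 'Zip' (the stop label) IS included.
by_label = req.loc[3, 'Complaint Type':'Zip']

# Positional slice: position 4 (the stop position, 'Zip') is NOT included.
by_position = req.iloc[3, 1:4]

print(by_label)
print(by_position)
```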
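6. Grouping
Next, the groupby() method.
Its grouping key can be a column, a list of columns, or the row index.
Once the rows are split into groups by that key, various operations can be applied to each group.
The following group operation methods are used frequently.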
- size(), count(): number of items
- mean(), median(), min(), max(): mean, median, minimum, maximum
- sum(), prod(), std(), var(), quantile(): sum, product, standard deviation, variance, quantile
- first(), last(): the first and the last item in each group
(Source: 데이터 사이언스 스쿨)
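`size()` and `count()` in the list above are easy to confuse, so here is a small sketch of the difference: `size()` counts every row in a group, while `count()` counts only non-missing values. The `req` below is a hypothetical stand-in built from the five Blocked Driveway rows shown earlier, one of which has a missing Zip.

```python
import pandas as pd

# Hypothetical stand-in for req: City and Zip of rows 4, 9, 39, 67 and 70 above.
req = pd.DataFrame(
    {
        'City': ['BRONX', 'BRONX', 'ASTORIA', 'BRONX', 'WOODSIDE'],
        'Zip': [10459.0, float('nan'), 11103.0, 10472.0, 11377.0],
    },
    index=[4, 9, 39, 67, 70],
)

grouped = req.groupby('City')

sizes = grouped.size()           # rows per group, missing values included
counts = grouped['Zip'].count()  # non-missing Zip values per group

print(sizes)
print(counts)
```

For BRONX the two results differ by one because row 9 has no Zip.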
Example 1)
# show the size (number of rows) of each City group
req.groupby('City').size()
City
ASTORIA      2
BRONX       89
NEW YORK    39
WOODSIDE     2
dtype: int64
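Example 2)
# split the rows by City and show the mean of each numeric column
# (numeric_only=True is needed in pandas 2.0 and later, where text columns would otherwise raise an error)
req.groupby('City').mean(numeric_only=True)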
City | Unique Key | Zip | X Coordinate (State Plane) | Y Coordinate (State Plane) |
---|---|---|---|---|
ASTORIA | 3.227122e+07 | 11104.500000 | 1.005759e+06 | 217099.500000 |
BRONX | 3.225789e+07 | 10461.917647 | 1.016048e+06 | 248364.651685 |
NEW YORK | 3.226228e+07 | 10030.914286 | 9.963616e+05 | 229870.076923 |
WOODSIDE | 3.227604e+07 | 11377.000000 | 1.009564e+06 | 210841.000000 |
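Two variations on the examples above, sketched on a hypothetical stand-in for `req` (three columns from the first five rows shown earlier): selecting one column before aggregating, so that only a meaningful average is computed, and grouping by a list of columns, which the groupby() description mentions as a valid key.

```python
import pandas as pd

# Hypothetical stand-in for req: three columns from the first five rows above.
req = pd.DataFrame({
    'City': ['NEW YORK', 'BRONX', 'BRONX', 'BRONX', 'BRONX'],
    'Complaint Type': ['UNSANITARY CONDITION', 'HEAT/HOT WATER', 'HEAT/HOT WATER',
                       'HEAT/HOT WATER', 'Blocked Driveway'],
    'Y Coordinate (State Plane)': [254952, 248458, 242832, 250940, 242126],
})

# Select a single column after grouping, then aggregate only that column.
mean_y = req.groupby('City')['Y Coordinate (State Plane)'].mean()

# Group by a list of columns: one group per (City, Complaint Type) pair.
pair_sizes = req.groupby(['City', 'Complaint Type']).size()

print(mean_y)
print(pair_sizes)
```

Grouping by a list produces a result with a two-level index, so an individual group is addressed with a tuple such as `('BRONX', 'HEAT/HOT WATER')`.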