테이블 다루기 (selection & query) :: A4용지




req 테이블을 소개해드리겠습니다.

1
req.head()
cs

Unique KeyCreated DateClosed DateAgencyComplaint TypeDescriptorLocation TypeZipAddressAddress TypeCityStatusResolution DescriptionBoroughX Coordinate (State Plane)Y Coordinate (State Plane)
03221135412/16/2015 04:44:58 PM12/20/2015 10:43:23 AMHPDUNSANITARY CONDITIONPESTSRESIDENTIAL BUILDING10034.0118 VERMILYEA AVENUEADDRESSNEW YORKClosedThe Department of Housing Preservation and Dev...MANHATTAN1005955254952
13221188612/16/2015 09:31:09 AM12/19/2015 12:05:54 PMHPDHEAT/HOT WATERENTIRE BUILDINGRESIDENTIAL BUILDING10461.01026 MORRIS PARK AVENUEADDRESSBRONXClosedThe Department of Housing Preservation and Dev...BRONX1024296248458
23221282412/16/2015 03:35:43 PM12/18/2015 01:10:26 AMHPDHEAT/HOT WATERENTIRE BUILDINGRESIDENTIAL BUILDING10456.01221 COLLEGE AVENUEADDRESSBRONXClosedThe Department of Housing Preservation and Dev...BRONX1008281242832
33221377912/16/2015 02:23:40 AM12/21/2015 01:33:29 PMHPDHEAT/HOT WATERAPARTMENT ONLYRESIDENTIAL BUILDING10453.0150 WEST 179 STREETADDRESSBRONXClosedThe Department of Housing Preservation and Dev...BRONX1008161250940
43221448612/16/2015 03:07:57 AM12/16/2015 07:43:08 AMNYPDBlocked DrivewayNo AccessStreet/Sidewalk10459.01343 CHISHOLM STREETADDRESSBRONXClosedThe Police Department issued a summons in resp...BRONX1013104242126


1. 여기서 내가 원하는 열만 뽑아서 보려면

1
req['Complaint Type'].head()
cs


0    UNSANITARY CONDITION
1          HEAT/HOT WATER
2          HEAT/HOT WATER
3          HEAT/HOT WATER
4        Blocked Driveway
Name: Complaint Type, dtype: object


2. 내가 원하는 열에 항목들이 어떤 것이 있는지 보려면

1
req['Complaint Type'].unique()
cs
array(['UNSANITARY CONDITION', 'HEAT/HOT WATER', 'Blocked Driveway',
       'Street Light Condition'], dtype=object)

array리스트로 보여주네요.


3. 내가 원하는 열에서 원하는 항목을 가진 행의 정보를 보고 싶다면

1
req[req['Complaint Type']=='Blocked Driveway'].head()
# complain Type이 Blocked Driveway인 행 상위 10개 보여줘
cs
Unique KeyCreated DateClosed DateAgencyComplaint TypeDescriptorLocation TypeZipAddressAddress TypeCityStatusResolution DescriptionBoroughX Coordinate (State Plane)Y Coordinate (State Plane)
43221448612/16/2015 03:07:57 AM12/16/2015 07:43:08 AMNYPDBlocked DrivewayNo AccessStreet/Sidewalk10459.01343 CHISHOLM STREETADDRESSBRONXClosedThe Police Department issued a summons in resp...BRONX1013104242126
93221845612/17/2015 06:36:58 AM12/17/2015 07:38:47 AMNYPDBlocked DrivewayNo AccessStreet/SidewalkNaN834 SAINT ANN'S AVENUEADDRESSBRONXClosedThe Police Department responded to the complai...BRONX1009226238620
393223686812/19/2015 03:42:22 AM12/19/2015 07:24:18 AMNYPDBlocked DrivewayNo AccessStreet/Sidewalk11103.031-29 43 STREETADDRESSASTORIAClosedThe Police Department responded and upon arriv...QUEENS1007700215945
673225574512/22/2015 11:54:24 PM12/23/2015 02:42:57 AMNYPDBlocked DrivewayNo AccessStreet/Sidewalk10472.02218 CHATTERTON AVENUEADDRESSBRONXClosedThe Police Department issued a summons in resp...BRONX1025937241136
703226077212/23/2015 02:55:43 PM12/24/2015 03:12:40 AMNYPDBlocked DrivewayPartial AccessStreet/Sidewalk11377.041-40 60 STREETADDRESSWOODSIDEClosedThe Police Department responded to the complai...QUEENS1010789210010

4. 내가 원하는 인덱스 값 행의 정보를 보고자 한다면

1
req.loc[3]
#인덱스 3의 정보를 가져와 달라
cs

Unique Key                                                             32213779
Created Date                                             12/16/2015 02:23:40 AM
Closed Date                                              12/21/2015 01:33:29 PM
Agency                                                                      HPD
Complaint Type                                                   HEAT/HOT WATER
Descriptor                                                       APARTMENT ONLY
Location Type                                              RESIDENTIAL BUILDING
Zip                                                                       10453
Address                                                    150 WEST  179 STREET
Address Type                                                            ADDRESS
City                                                                      BRONX
Status                                                                   Closed
Resolution Description        The Department of Housing Preservation and Dev...
Borough                                                                   BRONX
X Coordinate (State Plane)                                              1008161
Y Coordinate (State Plane)                                               250940
Name: 3, dtype: object

5. 내가 원하는 인덱스 값 행에서 원하는 열만 보고자 한다면

1
# 인덱스 3에서 Complaint Type와 Zip 열 사이의 정보를 가져와 달라
req.loc[3, 'Complaint Type':'Zip']
cs

Complaint Type          HEAT/HOT WATER
Descriptor              APARTMENT ONLY
Location Type     RESIDENTIAL BUILDING
Zip                              10453
Name: 3, dtype: object

6. 그룹화 하기

groupby() 함수를 소개하고자 한다.

이 함수의 파라미터로서는 열 또는 열 리스트, 행 인덱스가 가능하다.

기재한 파라미터를 기준을 그룹을 나눈 뒤 다양한 연산을 할 수 있다.


다음은 자주 사용되는 그룹 연산 메서드들이다.

  • size()count(): 갯수
  • mean()median()min()max(): 평균, 중앙값, 최소, 최대
  • sum()prod()std()var()quantile() : 합계, 곱, 표준편차, 분산, 사분위수
  • first()last(): 가장 첫번째 데이터와 가장 나중 데이터

(출처 : 데이터 사이언스 스쿨)

예시 1)

1
#City 열의 각 항목들의 크기(개수)를 보여줘
req.groupby('City').size()
cs
City
ASTORIA      2
BRONX       89
NEW YORK    39
WOODSIDE     2
dtype: int64

예시2)

1
#City를 기준으로 항목을 나누어서 실수 열의 평균을 보여줘
req.groupby('City').mean()
cs

Unique KeyZipX Coordinate (State Plane)Y Coordinate (State Plane)
City
ASTORIA3.227122e+0711104.5000001.005759e+06217099.500000
BRONX3.225789e+0710461.9176471.016048e+06248364.651685
NEW YORK3.226228e+0710030.9142869.963616e+05229870.076923
WOODSIDE3.227604e+0711377.0000001.009564e+06210841.000000


+ Recent posts