If you have a pandas DataFrame with an address column and want to pull the city and state out of each address, here is one way to do it:
|   | Name | Sex | Education | address | Age | Score | dates |
|---|------|-----|-----------|---------|-----|-------|-------|
| 0 | John | M | Bachelor | 123 Main St, San Francisco, CA 94102 | 37 | 99 | 2023-09-05 11:43:01 |
| 1 | Jane | F | Master | 456 Elm St, New York, NY 10001 | 30 | 83 | 2023-09-03 00:46:36 |
| 2 | Bob | M | PhD | 789 Oak St, Los Angeles, CA 90001 | 25 | 84 | 2023-09-18 05:10:26 |
| 3 | Emily | F | Bachelor | 101 Pine Ave, Chicago, IL 60611 | 28 | 77 | 2023-09-06 20:18:57 |
| 4 | Mike | M | High School | 202 Cedar Rd, Houston, TX 77002 | 36 | 97 | 2023-09-04 00:29:09 |
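The code that follows operates on a DataFrame named `data`, which is never constructed here. As a minimal sketch, the sample rows from the table above could be rebuilt like this. The values are copied from the table; keeping `dates` as plain strings is an assumption, since the original dtype is not shown.

```python
import pandas as pd

# Sample rows copied from the table above.
# Assumption: `dates` is stored as strings; the original dtype is not shown.
data = pd.DataFrame({
    "Name": ["John", "Jane", "Bob", "Emily", "Mike"],
    "Sex": ["M", "F", "M", "F", "M"],
    "Education": ["Bachelor", "Master", "PhD", "Bachelor", "High School"],
    "address": [
        "123 Main St, San Francisco, CA 94102",
        "456 Elm St, New York, NY 10001",
        "789 Oak St, Los Angeles, CA 90001",
        "101 Pine Ave, Chicago, IL 60611",
        "202 Cedar Rd, Houston, TX 77002",
    ],
    "Age": [37, 30, 25, 28, 36],
    "Score": [99, 83, 84, 77, 97],
    "dates": [
        "2023-09-05 11:43:01",
        "2023-09-03 00:46:36",
        "2023-09-18 05:10:26",
        "2023-09-06 20:18:57",
        "2023-09-04 00:29:09",
    ],
})
```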
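import pandas as pd

# `data` is the DataFrame shown above.
# Lambda way: the city is the second comma-separated field; strip() drops the leading space.
data['city'] = data['address'].apply(lambda x: x.split(',')[1].strip())

# Function way
def get_city(address):
    return address.split(',')[1].strip()

def get_state(address):
    # The third field looks like ' CA 94102'; split() with no arguments ignores the leading space.
    return address.split(',')[2].split()[0]

data['city_short'] = data['address'].apply(lambda x: get_city(x) + ' ' + get_state(x))
data.head()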
|   | Name | Sex | Education | address | Age | Score | dates | city | city_short |
|---|------|-----|-----------|---------|-----|-------|-------|------|------------|
| 0 | John | M | Bachelor | 123 Main St, San Francisco, CA 94102 | 37 | 99 | 2023-09-05 11:43:01 | San Francisco | San Francisco CA |
| 1 | Jane | F | Master | 456 Elm St, New York, NY 10001 | 30 | 83 | 2023-09-03 00:46:36 | New York | New York NY |
| 2 | Bob | M | PhD | 789 Oak St, Los Angeles, CA 90001 | 25 | 84 | 2023-09-18 05:10:26 | Los Angeles | Los Angeles CA |
| 3 | Emily | F | Bachelor | 101 Pine Ave, Chicago, IL 60611 | 28 | 77 | 2023-09-06 20:18:57 | Chicago | Chicago IL |
| 4 | Mike | M | High School | 202 Cedar Rd, Houston, TX 77002 | 36 | 97 | 2023-09-04 00:29:09 | Houston | Houston TX |
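Splitting on commas with `apply` works for these rows, but it relies on every address having exactly the same shape. As an alternative sketch (not the approach used above), pandas' vectorized `Series.str.extract` can pull all parts out in one pass with a regular expression. The pattern below is an assumption that fits the "street, city, ST 12345" layout of the sample addresses; rows that do not match come back as `NaN` instead of raising an error.

```python
import pandas as pd

# Addresses copied from the sample table.
addresses = pd.Series([
    "123 Main St, San Francisco, CA 94102",
    "456 Elm St, New York, NY 10001",
    "789 Oak St, Los Angeles, CA 90001",
    "101 Pine Ave, Chicago, IL 60611",
    "202 Cedar Rd, Houston, TX 77002",
])

# Assumed layout: "<street>, <city>, <2-letter state> <5-digit zip>".
pattern = (
    r"^(?P<street>[^,]+),\s*"
    r"(?P<city>[^,]+),\s*"
    r"(?P<state>[A-Z]{2})\s+"
    r"(?P<zip>\d{5})$"
)

# Named groups become column names in the resulting DataFrame.
parts = addresses.str.extract(pattern)

# Same "city + state" string as the city_short column.
city_short = parts["city"] + " " + parts["state"]
print(city_short.tolist())
```

The resulting `parts` columns can be assigned back to the original DataFrame, for example `data["city"] = parts["city"]`, provided both share the same index.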