A modern city lives, breathes and eats data. Big data, to be exact! Pick any urban center. You can't walk one block without feeling the presence of major tech companies. Even the places where you least expect it are passively collecting massive amounts of data: your local coffee shops, street lights and even post offices. In fact, a great amount of this data is publicly available for you to download and analyze.

Another quality that most major cities have in common is traffic: rush hours are filled with walkers, bikers, taxis and buses. To ease congestion, the city of Chicago's Department of Transportation created a bike-sharing service called Divvy. Key statistics are provided about each user's ride, including trip duration, geolocation, start and stop times and much more. Let's take a deeper look to see what we can find out using OmniSci!

Data Cleaning and PyMapD

A quick glance at the data reveals a great deal of useful information, like User Type, Starting and Stopping Station Locations and BikeID. However, data pre-processing unlocks the full value of the data and corrects some errors that are present in the raw download. For example, I noticed that the zip codes were incorrect in the Dept of Transportation's data. To fix this, I found a Python library that looked up zip codes using longitude and latitude (which were given correctly). I also grouped the zip codes into regions within the ChicagoLand area (Downtown, Southside, Northside, Near Northside, Far Northside, and Western Suburbs).

![chicago divvy chicago divvy](https://www.tripsavvy.com/thmb/c-BZhtAwu4Y8gksrOIgAaeHmRtg=/960x960/filters:no_upscale():max_bytes(150000):strip_icc()/Divvy-Chicago-56a398f65f9b58b7d0d2b33c.jpg)

Further data pre-processing revealed even more useful metrics. For example, Birth Year is one of the variables provided for a user. In my opinion, a more useful metric is the user's age. In Python, I calculated each user's age at the time of the ride and also grouped the ages into generations (GenZ, Millennials, GenX, Baby Boomers, and Traditional). During this process, I noticed that some rows were missing values for a few of the variables, so I removed those. This left me with 8,646,054 rows and 31 columns of data to dig into.

While at first I imported data directly into Immerse by compressing the files, I found that using the pymapd connector allowed me to insert data much faster. This made it significantly easier for me to create a new feature, instantly add it as a table and make new graphs. To see the specific details of the data, the full code is available at: Deep Dive into Divvy Data

![chicago divvy chicago divvy](https://static.wixstatic.com/media/06bb24_bf981f16ff7745ce97299023cf6a27e0~mv2_d_2711_2119_s_2.jpg)
![chicago divvy chicago divvy](https://www.planetizen.com/files/images/Divvy.jpg)

To understand the Divvy dataset, I needed to figure out the same broad questions that define transportation data at large. The last answer is rather obvious in this case: the Divvy bikes! The second-to-last answer is also somewhat obvious: people travel for either business (to go to and from work) or pleasure (to enjoy the cityscape).

To start the analysis and figure out where people are going, I created a simple Geo Heatmap that measured the average trip duration at each bike station. The Downtown bike stations have a greater proportion of riders who use bike stations closer to major transit stops throughout the year. The power of OmniSci's crossfiltering capabilities helped me query and instantly render graphics that compared ChicagoLand regions and transit stops. The histogram on the left shows that when we consider only the Downtown bike stations, a greater proportion of riders use stations close to major transit stops than across the entire ChicagoLand area.

![chicago divvy chicago divvy](https://www.chicago.gov/content/dam/city/depts/cdot/Divvy/phoneapp.jpg)
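
The age and generation features described above can be sketched in a few lines of plain Python. This is only an illustration, not the code from the original analysis: the field names `birth_year` and `start_year` are hypothetical, and the birth-year cutoffs for each generation are an assumption, since the post names the five groups but not the boundaries it used. Because only the birth year is known, the computed age is approximate to within a year.

```python
# Assumed birth-year cutoffs (first birth year of each generation).
# The post lists the group names only; these boundaries are illustrative.
GENERATION_START_YEARS = [
    (1997, "GenZ"),
    (1981, "Millennials"),
    (1965, "GenX"),
    (1946, "Baby Boomers"),
]


def generation(birth_year):
    """Map a birth year to a generation label using the assumed cutoffs."""
    for start_year, label in GENERATION_START_YEARS:
        if birth_year >= start_year:
            return label
    return "Traditional"  # anyone born before the earliest cutoff


def add_age_features(rides):
    """Drop rides with missing values, then add 'age' and 'generation'.

    `rides` is a list of dicts with hypothetical keys 'birth_year' and
    'start_year' (the year in which the ride started).
    """
    cleaned = []
    for ride in rides:
        birth_year = ride.get("birth_year")
        start_year = ride.get("start_year")
        if birth_year is None or start_year is None:
            continue  # skip rows with missing data, as described in the post
        cleaned.append({
            **ride,
            "age": start_year - birth_year,  # approximate age at ride time
            "generation": generation(birth_year),
        })
    return cleaned


sample_rides = [
    {"birth_year": 1990, "start_year": 2018},
    {"birth_year": None, "start_year": 2018},  # dropped: missing birth year
    {"birth_year": 1940, "start_year": 2017},
]
features = add_age_features(sample_rides)
```

On the real dataset the same logic would more naturally be written against a DataFrame, but the plain-Python version keeps the sketch free of dependencies.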