Tag: Data Engineering

In this workshop with PyData NYC, we will explore the 311 dataset, starting with exploratory charts (including maps), possibly creating a linked animation, and concluding with a simple interactive visualization. In doing so, we will unpack some of the fundamental concepts that underlie the architecture of Matplotlib, hopefully providing attendees with the foundation for creating effective visualizations using Matplotlib and the vocabulary to make more effective use of AI tools. Matplotlib is a big library, and it can be difficult to know where to start.

This demo-tutorial is a guided tour through many of the essential features and concepts of Matplotlib so you can get started making publication-quality, animated, and interactive figures. We will be using the 311 dataset as a case study.

This event will be held at the CUNY Graduate Center (365 5th Avenue) in Manhattan. Register here.

Tags Data Engineering, Data Science

City of Yes for Housing Opportunity is a city-wide zoning text amendment that addresses New York City’s housing crisis by making it possible to build a little more housing in every neighborhood. It was adopted by the City Council in December 2024 and is already being put to use to create homes across all five boroughs.

How do legislative changes translate to data changes? How can new and old zoning tools be reflected in land use data? What do people need to know about the City’s tax lots to make informed decisions?

In this session, the Data Engineering team from the NYC Department of City Planning (DCP) will share how the agency added new fields to one of its most popular datasets: PLUTO. New fields about Mandatory Inclusionary Housing (MIH) and transit zones will soon be available in PLUTO to give data users a more complete picture of the City’s zoning and land use.

DCP subject matter experts in zoning, housing, and transportation worked with engineers to understand the relevant Zoning Resolution text, the intentions of City of Yes amendments, and the data necessary to relate them to every tax lot in the City. Attendees will learn about the processes, decisions, and surprises that have been a part of this journey through legislation, code, and open data.

Tags Data Engineering, GIS / Mapping, Housing, Urban Planning/Transportation/Mobility

Christian Casazza is a data engineer who has built an open-source data platform on top of NYC Open Data. In this talk, he discusses using open source data engineering tools like Dagster, Polars, and DuckDB to ingest and clean government data such as NYC 311 records and data from the NYC Checkbook API. He will show participants how they can build on top of clean, curated government data to create applications for the public good.

Anyone who is interested in using government data to improve the city’s operations and citizens’ quality of life should attend.

The first part of the event will cover the core open source technologies anyone working with data should know. Understanding the logic behind open source tools is important to appreciate how much faster, cheaper, and simpler modern data app building is with them. These tools can be applied to anyone’s civic interests and day-to-day work. The second part of the event will discuss some of the tools I’ve built around open source data. We will discuss using QueryStation.app and NYCStats.app and how New Yorkers can go there to learn about their city.

Tags Data Engineering, Data Science

Open Data Week 2026 March 22-29, 2026 (and more!)

WeGovNYC’s Databook (databook.nyc) is a data pipeline that indexes, normalizes, and republishes over 60 NYC Open Data datasets as a publicly accessible API and as an interface that offers in-depth profiles of City agencies, public schools, civil service titles, contracts, and much more.

Our recent focus is providing tools for people interested in reforming civil service titles and technology procurement. We FOILed for civil service title descriptions, extracted their data, and integrated it into Databook along with data visualizations at databook.nyc/titles. We also extracted all data from PassportPublic and Checkbook.NYC to create a new section in Databook connecting vendors, solicitations, agencies, and contracts at https://databook.nyc/procurement.

We will discuss all our tools as well as how we’ve adopted generative AI (vibe coding) to accelerate our development.

Tags Artificial Intelligence (AI), Data Engineering

UnSchool of Data is BetaNYC’s open space unconference for networking, co-creating, and learning. It brings together city residents, technologists, civic leaders, students, advocates, policy nerds, government staff, elected officials, journalists, designers, and more to leverage open data to tackle some of the most pressing issues in NYC and beyond.

It’s a community-driven day for turning open data into civic solutions.

UnSchool of Data has these underlying goals:

Convene community members to share civic insights and ideas.
Create processes/projects that people will use for further action.
Foster formal and informal communities of practice and action.

Learn more about UnSchool of Data and how it works at www.schoolofdata.nyc/unschool.

Tags Art and Media, Artificial Intelligence, Data Engineering, Data Governance, Data Science, Demographics, Digital Literacy/Equity, Education, GIS / Mapping, Health/Environment, Smart Cities / Urban Tech, Urban Planning/Transportation/Mobility

Join Paul Reeping, Director of Research at Vital City, for an interactive session exploring Vital City’s new Crime Data Explorer, a multi-decade, precinct-level platform covering complaints, arrests, and shootings in New York City. Paul will demonstrate how the tool works, explain the analytic framework behind it, and highlight key findings from Vital City’s most recent end-of-year crime report. Participants will gain a clearer understanding of long-term crime trends, how different categories are measured, and how to responsibly interpret citywide and neighborhood-level data.

The session will also look ahead. After walking through the Explorer, Paul will preview upcoming data initiatives at Vital City and invite participants to help shape future tools for data visualization, public safety measurement, and open data accessibility. This event is ideal for researchers, journalists, policymakers, technologists, students, and anyone interested in understanding crime trends and building better public data tools. Expect a mix of live demonstration, substantive analysis, and collaborative discussion about what New York City should measure, visualize, and build next.

Tags Data Engineering, Data Science, GIS / Mapping

Using data insights to make decisions is what every organization seeks to do, but there are many reasons why this doesn’t happen in practice: data is hard to find, it is siloed and inaccessible, it is undocumented and difficult to understand, it is too large or complex for the skills and tools available. All these problems existed at the Metropolitan Transportation Authority (MTA) and were the motivation for the recent establishment of a central data team, which has the goal of facilitating analytical work for teams all across the company. Standing up such a team is challenging, especially for public sector agencies with many internal and external stakeholders, legacy systems, and limited resources. In this talk, Andy Kuziemko, who leads the Data & Analytics team at the MTA, will describe the progress to date at the agency, lessons learned along the way, and the remaining challenges the agency faces.

Tags Data Engineering, Data Governance, Urban Planning/Transportation/Mobility

NYC School of Data is BetaNYC’s community conference that demystifies the policies and practices around open data, technology, and service design. This year’s conference helps conclude NYC Open Data Week and features 40+ sessions organized by NYC’s civic technology, data, and design community! Our conversations and workshops will feed your mind and inspire you to improve your neighborhood.

To attend, you need to purchase tickets. The venue is accessible, and the content is all-ages friendly! If you have accessibility questions or needs, please email the BetaNYC team at [email protected].

Thank you to Reinvent Albany for their support as Lead Partner and for helping cover conference costs to make it possible to meet in 2026. Additional sponsors include HaydenAI, SVA Masters in Data Visualization and Communication, Nava, The Center for Urban Science + Progress (CUSP) at NYU Tandon, and Cyvl.

If you can’t join us in person, tune into the main stage live stream provided by the Internet Society New York Chapter. Follow the conversation at #NYCSoData on Bluesky.

Purchase your tickets here.

Tags Art and Media, Artificial Intelligence, Data Engineering, Data Governance, Data Science, Demographics, Digital Literacy/Equity, Education, GIS / Mapping, Health/Environment, Smart Cities / Urban Tech, Urban Planning/Transportation/Mobility