Scraper and Tests Build: Cincinnati City Council #9

vsspnkkvr · 2024-10-28T16:59:16Z

What's this PR do?

This PR updates the scraper for Cincinnati City Council because of changes to how they display their meeting schedule.

Why are we doing this?

Scraper requested from this spreadsheet.

Steps to manually test

Ensure the project is installed:

pipenv sync --dev

Activate the virtual env and enter the pipenv shell:

pipenv shell

Run the spider:

scrapy crawl cinoh_city_council -O test_output.csv

Monitor the output and ensure no errors are raised.
Inspect test_output.csv to ensure the data looks valid.

Are there any smells or added technical debt to note?

The one thing complex about this new parsing routine is the amount of links that need to be parsed. Right now they are all just put into an array like how the guide directs, but this may not be optimal or helpful. Some sort of link sorting method could help make that data more useful.

Summary by CodeRabbit

New Features
- Introduced a new spider for scraping meeting information from the Cincinnati City Council's Legistar calendar.
- Added a JSON file containing comprehensive meeting details, including dates, times, locations, and relevant links.
Tests
- Implemented a suite of unit tests for the new spider, validating various attributes of the parsed meeting data, including counts, titles, and statuses.

coderabbitai · 2024-10-28T16:59:29Z

Walkthrough

A new spider class named CinohCityCouncilSpider has been added to scrape meeting information from the Cincinnati City Council's Legistar calendar. This class includes methods for parsing meeting details, including classification, location, and relevant links. Additionally, a JSON file containing sample meeting data has been introduced, along with a test suite for the spider, validating various aspects of the parsed meeting items using the pytest framework.

Changes

File	Change Summary
city_scrapers/spiders/cinoh_city_council.py	Added `CinohCityCouncilSpider` class with methods for parsing meeting data from Legistar.
tests/files/cinoh_city_council.json	Added a JSON file containing meeting data for Cincinnati City Council.
tests/test_cinoh_city_council.py	Introduced unit tests for `CinohCityCouncilSpider`, verifying various attributes of parsed meetings.

Poem

In the city where council meets,
A spider weaves its web of feats.
With data fresh, it spins and twirls,
Gathering news for boys and girls.
Meetings in JSON, neat and bright,
A hop and a skip, all feels just right! 🐇✨

✨ Finishing Touches

📝 Generate Docstrings (Beta)

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

Review comments: Directly reply to a review comment made by CodeRabbit. Example:
- I pushed a fix in commit <commit_id>, please review it.
- Generate unit testing code for this file.
- Open a follow-up GitHub issue for this discussion.
Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
- @coderabbitai generate unit testing code for this file.
- @coderabbitai modularize this function.
PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
- @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
- @coderabbitai read src/utils.ts and generate unit testing code.
- @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
- @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

@coderabbitai pause to pause the reviews on a PR.
@coderabbitai resume to resume the paused reviews.
@coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
@coderabbitai full review to do a full review from scratch and review all the files again.
@coderabbitai summary to regenerate the summary of the PR.
@coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
@coderabbitai resolve resolve all the CodeRabbit review comments.
@coderabbitai configuration to show the current CodeRabbit configuration for the repository.
@coderabbitai help to get help.

Other keywords and placeholders

Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (`.coderabbit.yaml`)

You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
Please see the configuration documentation for more information.
If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

Visit our Documentation for detailed information on how to use CodeRabbit.
Join our Discord Community to get help, request features, and share feedback.
Follow us on X/Twitter for updates and announcements.

coderabbitai

Actionable comments posted: 10

🧹 Outside diff range and nitpick comments (3)

tests/test_cinoh_city_council.py (1)
23-75: Add docstrings to test functions

Adding docstrings to test functions would improve test clarity and documentation.

Example for one test:
 def test_count():
+    """Verify the correct number of meetings are parsed from the test data."""
     assert len(parsed_items) == 21
city_scrapers/spiders/cinoh_city_council.py (2)
11-12: Remove unused link_types attribute

The link_types attribute is defined but is empty and doesn't appear to be used in the spider. If it's not required, consider removing it to clean up the code.

Apply this change to remove the unused attribute:
     # Add the titles of any links not included in the scraped results
-    link_types = []
15-20: Update docstring to reflect actual methods used

The docstring refers to _parse_title and _parse_start methods, which are not defined in this spider. Update the docstring to accurately reflect the implemented methods.

Apply this change to correct the docstring:
         """
-        `parse_legistar` should always `yield` Meeting items.
-
-        Change the `_parse_title`, `_parse_start`, etc methods to fit your scraping
-        needs.
+        Parses Legistar events and yields Meeting items.
+
+        Uses helper methods `_parse_classification`, `_parse_location`, and `_parse_links`
+        to extract meeting details.
         """

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 975f98e and cd611e5.

📒 Files selected for processing (3)

city_scrapers/spiders/cinoh_city_council.py (1 hunks)
tests/files/cinoh_city_council.json (1 hunks)
tests/test_cinoh_city_council.py (1 hunks)

🧰 Additional context used

🪛 Ruff

tests/test_cinoh_city_council.py

6-6: city_scrapers_core.constants.COMMITTEE imported but unused

Remove unused import: city_scrapers_core.constants.COMMITTEE

(F401)

🔇 Additional comments (3)

tests/test_cinoh_city_council.py (1)
63-66: 🛠️ Refactor suggestion

Enhance link testing coverage

Given the PR's concern about handling multiple links, consider adding test cases for meetings with multiple links to ensure proper handling.

Add test cases for multiple links:
def test_links_multiple():
    # Test case for an item with multiple links
    multi_link_item = [item for item in parsed_items if len(item["links"]) > 1]
    if multi_link_item:
        assert isinstance(multi_link_item[0]["links"], list)
        assert all(
            {"title", "href"}.issubset(link.keys())
            for link in multi_link_item[0]["links"]
        )
Let's verify if there are any items with multiple links:
tests/files/cinoh_city_council.json (2)
1-1: Comprehensive test fixture with good coverage.

The test data includes a good variety of scenarios including:

Regular meetings

Special sessions

Cancelled meetings

Various committees

Different availability states for agendas/minutes

1-1: Verify the date format parsing in the scraper.

The test data uses MM/DD/YYYY format for dates. Ensure that the scraper correctly handles this format and converts it to a consistent internal representation.

✅ Verification successful

The date format parsing appears to be handled correctly

Based on the test file (tests/test_cinoh_city_council.py), the scraper correctly parses dates from MM/DD/YYYY format into proper datetime objects. The test case verifies that "10/30/2024" is correctly parsed into datetime(2024, 10, 30, 14, 0). Since this is inherited from the LegistarSpider base class (a core component), and the tests are passing, the date parsing implementation is working as expected.
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:
#!/bin/bash
# Search for date parsing logic in the scraper
ast-grep --pattern 'def parse_$_($$$) {
  $$$
  date$$$
  $$$
}'
Length of output: 66

Script:
#!/bin/bash
# Search for Python files that might contain date parsing logic
fd -e py | xargs rg -l "datetime|strptime|date"

# Then search for specific date parsing patterns
rg -A 5 -B 5 "strptime|datetime.strptime|parse.*date"
Length of output: 723

Script:
#!/bin/bash
# Let's find the actual scraper file that processes these dates
fd -e py | xargs rg -l "cincinnati|cinoh"

# Then examine its content for date handling
rg -A 10 -B 10 "class Cin.*Spider"
Length of output: 1451

Script:
#!/bin/bash
# Let's examine the LegistarSpider parent class since it's handling the date parsing
rg -l "class LegistarSpider"

# And then look at its implementation
rg -A 20 -B 5 "class LegistarSpider"
Length of output: 69

tests/test_cinoh_city_council.py

coderabbitai · 2024-10-28T17:04:30Z

tests/files/cinoh_city_council.json

@@ -0,0 +1 @@
+[{"Name": {"label": "Cincinnati City Council", "url": "https://cincinnatioh.legistar.com/DepartmentDetail.aspx?ID=38076&GUID=1CA48415-BFFD-4857-8A93-48AA89BD31C6"}, "Meeting Date": "10/30/2024", "iCalendar": {"url": "https://cincinnatioh.legistar.com/View.ashx?M=IC&ID=1229949&GUID=40029B38-4ED1-4770-8B4F-E76E6D0FE583"}, "Meeting Time": "2:00 PM", "Meeting Location": "Council Chambers, Room 300", "Meeting Details": "Meeting\u00a0details", "Agenda": "Not\u00a0available", "Agenda Packet": "Not\u00a0available", "Minutes": "Not\u00a0available", "Video": "Not\u00a0available"}, {"Name": {"label": "Healthy Neighborhoods", "url": "https://cincinnatioh.legistar.com/DepartmentDetail.aspx?ID=47057&GUID=94D4079D-9CEF-48C0-9FC8-6B42FE58BC0C"}, "Meeting Date": "10/29/2024", "iCalendar": {"url": "https://cincinnatioh.legistar.com/View.ashx?M=IC&ID=1230062&GUID=DEBE1F8D-8F23-470B-8BBF-9C0B3F21E2C7"}, "Meeting Time": "12:30 PM", "Meeting Location": "Council Chambers, Room 300", "Meeting Details": "Meeting\u00a0details", "Agenda": "Not\u00a0available", "Agenda Packet": "Not\u00a0available", "Minutes": "Not\u00a0available", "Video": "Not\u00a0available"}, {"Name": {"label": "Public Safety & Governance", "url": "https://cincinnatioh.legistar.com/DepartmentDetail.aspx?ID=47056&GUID=ADCB91E2-9BDF-482F-AAF6-6D5EF2A040FB"}, "Meeting Date": "10/29/2024", "iCalendar": {"url": "https://cincinnatioh.legistar.com/View.ashx?M=IC&ID=1229955&GUID=3D92E418-E420-456D-972A-85F33C138E1F"}, "Meeting Time": "9:30 AM", "Meeting Location": "Council Chambers, Room 300", "Meeting Details": "Meeting\u00a0details", "Agenda": "Not\u00a0available", "Agenda Packet": "Not\u00a0available", "Minutes": "Not\u00a0available", "Video": "Not\u00a0available"}, {"Name": {"label": "Budget and Finance Committee", "url": "https://cincinnatioh.legistar.com/DepartmentDetail.aspx?ID=38077&GUID=207ADF11-05C1-4163-AF58-E1C0119342BC"}, "Meeting Date": "10/28/2024", "iCalendar": {"url": "https://cincinnatioh.legistar.com/View.ashx?M=IC&ID=1229952&GUID=B1C0FDFB-15D4-411F-A817-A25E7B07ADFA"}, "Meeting Time": "1:00 PM", "Meeting Location": "Council Chambers, Room 300", "Meeting Details": "Meeting\u00a0details", "Agenda": "Not\u00a0available", "Agenda Packet": "Not\u00a0available", "Minutes": "Not\u00a0available", "Video": "Not\u00a0available"}, {"Name": {"label": "Cincinnati City Council", "url": "https://cincinnatioh.legistar.com/DepartmentDetail.aspx?ID=38076&GUID=1CA48415-BFFD-4857-8A93-48AA89BD31C6"}, "Meeting Date": "10/23/2024", "iCalendar": {"url": "https://cincinnatioh.legistar.com/View.ashx?M=IC&ID=1229959&GUID=47066CAE-AF74-45BF-914A-830667219A0C"}, "Meeting Time": "2:00 PM", "Meeting Location": "Council Chambers, Room 300", "Meeting Details": "Meeting\u00a0details", "Agenda": "Not\u00a0available", "Agenda Packet": "Not\u00a0available", "Minutes": "Not\u00a0available", "Video": "Not\u00a0available"}, {"Name": {"label": "Equitable Growth & Housing", "url": "https://cincinnatioh.legistar.com/DepartmentDetail.aspx?ID=47053&GUID=2F284759-56BA-4D9D-B640-E02600D83497"}, "Meeting Date": "10/22/2024", "iCalendar": {"url": "https://cincinnatioh.legistar.com/View.ashx?M=IC&ID=1230014&GUID=D3A25E8D-7EBB-4BD2-9930-29B6CCB3D68D"}, "Meeting Time": "1:00 PM", "Meeting Location": "Council Chambers, Room 300", "Meeting Details": "Meeting\u00a0details", "Agenda": "Not\u00a0available", "Agenda Packet": "Not\u00a0available", "Minutes": "Not\u00a0available", "Video": "Not\u00a0available"}, {"Name": {"label": "Climate, Environment & Infrastructure", "url": "https://cincinnatioh.legistar.com/DepartmentDetail.aspx?ID=47054&GUID=05A8515F-E080-4889-AF46-16001666DA90"}, "Meeting Date": "10/22/2024", "iCalendar": {"url": "https://cincinnatioh.legistar.com/View.ashx?M=IC&ID=1230015&GUID=739A5464-04D7-41C4-B470-CE3E10B47462"}, "Meeting Time": "10:00 AM", "Meeting Location": "Council Chambers, Room 300", "Meeting Details": "Meeting\u00a0details", "Agenda": "Not\u00a0available", "Agenda Packet": "Not\u00a0available", "Minutes": "Not\u00a0available", "Video": "Not\u00a0available"}, {"Name": {"label": "Budget and Finance Committee", "url": "https://cincinnatioh.legistar.com/DepartmentDetail.aspx?ID=38077&GUID=207ADF11-05C1-4163-AF58-E1C0119342BC"}, "Meeting Date": "10/21/2024", "iCalendar": {"url": "https://cincinnatioh.legistar.com/View.ashx?M=IC&ID=1229951&GUID=7B19DEC4-C430-4CD3-B462-BD6CBE4ABD49"}, "Meeting Time": "1:00 PM", "Meeting Location": "Council Chambers, Room 300", "Meeting Details": "Meeting\u00a0details", "Agenda": "Not\u00a0available", "Agenda Packet": "Not\u00a0available", "Minutes": "Not\u00a0available", "Video": "Not\u00a0available"}, {"Name": {"label": "Cincinnati City Council", "url": "https://cincinnatioh.legistar.com/DepartmentDetail.aspx?ID=38076&GUID=1CA48415-BFFD-4857-8A93-48AA89BD31C6"}, "Meeting Date": "10/17/2024", "iCalendar": {"url": "https://cincinnatioh.legistar.com/View.ashx?M=IC&ID=1235477&GUID=0CC17DD2-8A13-4EC6-A533-F86D41F010D3"}, "Meeting Time": "12:30 PM", "Meeting Location": "Council Chambers, Room 300 SPECIAL SESSION", "Meeting Details": {"label": "Meeting\u00a0details", "url": "https://cincinnatioh.legistar.com/MeetingDetail.aspx?ID=1235477&GUID=0CC17DD2-8A13-4EC6-A533-F86D41F010D3&Options=info|&Search="}, "Agenda": {"label": "Agenda", "url": "https://cincinnatioh.legistar.com/View.ashx?M=A&ID=1235477&GUID=0CC17DD2-8A13-4EC6-A533-F86D41F010D3"}, "Agenda Packet": {"label": "Agenda\u00a0Packet", "url": "https://cincinnatioh.legistar.com/View.ashx?M=PA&ID=1235477&GUID=0CC17DD2-8A13-4EC6-A533-F86D41F010D3"}, "Minutes": "Not\u00a0available", "Video": "Not\u00a0available"}, {"Name": {"label": "Cincinnati City Council", "url": "https://cincinnatioh.legistar.com/DepartmentDetail.aspx?ID=38076&GUID=1CA48415-BFFD-4857-8A93-48AA89BD31C6"}, "Meeting Date": "10/16/2024", "iCalendar": {"url": "https://cincinnatioh.legistar.com/View.ashx?M=IC&ID=1229958&GUID=94FF187B-82AF-490B-87EE-C9ECF62EBF84"}, "Meeting Time": "2:00 PM", "Meeting Location": "Council Chambers, Room 300", "Meeting Details": {"label": "Meeting\u00a0details", "url": "https://cincinnatioh.legistar.com/MeetingDetail.aspx?ID=1229958&GUID=94FF187B-82AF-490B-87EE-C9ECF62EBF84&Options=info|&Search="}, "Agenda": {"label": "Agenda", "url": "https://cincinnatioh.legistar.com/View.ashx?M=A&ID=1229958&GUID=94FF187B-82AF-490B-87EE-C9ECF62EBF84"}, "Agenda Packet": {"label": "Agenda\u00a0Packet", "url": "https://cincinnatioh.legistar.com/View.ashx?M=PA&ID=1229958&GUID=94FF187B-82AF-490B-87EE-C9ECF62EBF84"}, "Minutes": "Not\u00a0available", "Video": "Not\u00a0available"}, {"Name": {"label": "Healthy Neighborhoods", "url": "https://cincinnatioh.legistar.com/DepartmentDetail.aspx?ID=47057&GUID=94D4079D-9CEF-48C0-9FC8-6B42FE58BC0C"}, "Meeting Date": "10/15/2024", "iCalendar": {"url": "https://cincinnatioh.legistar.com/View.ashx?M=IC&ID=1232596&GUID=1CAC132B-2224-4A15-9F79-058CF4F3E913"}, "Meeting Time": "12:30 PM", "Meeting Location": "Council Chambers, Room 300", "Meeting Details": {"label": "Meeting\u00a0details", "url": "https://cincinnatioh.legistar.com/MeetingDetail.aspx?ID=1232596&GUID=1CAC132B-2224-4A15-9F79-058CF4F3E913&Options=info|&Search="}, "Agenda": {"label": "Agenda", "url": "https://cincinnatioh.legistar.com/View.ashx?M=A&ID=1232596&GUID=1CAC132B-2224-4A15-9F79-058CF4F3E913"}, "Agenda Packet": {"label": "Agenda\u00a0Packet", "url": "https://cincinnatioh.legistar.com/View.ashx?M=PA&ID=1232596&GUID=1CAC132B-2224-4A15-9F79-058CF4F3E913"}, "Minutes": {"label": "Minutes", "url": "https://cincinnatioh.legistar.com/View.ashx?M=M&ID=1232596&GUID=1CAC132B-2224-4A15-9F79-058CF4F3E913"}, "Video": "Not\u00a0available"}, {"Name": {"label": "Public Safety & Governance", "url": "https://cincinnatioh.legistar.com/DepartmentDetail.aspx?ID=47056&GUID=ADCB91E2-9BDF-482F-AAF6-6D5EF2A040FB"}, "Meeting Date": "10/15/2024", "iCalendar": {"url": "https://cincinnatioh.legistar.com/View.ashx?M=IC&ID=1229954&GUID=F227C97F-CA60-4F4B-8473-B93864250B47"}, "Meeting Time": "9:30 AM", "Meeting Location": "Council Chambers, Room 300", "Meeting Details": {"label": "Meeting\u00a0details", "url": "https://cincinnatioh.legistar.com/MeetingDetail.aspx?ID=1229954&GUID=F227C97F-CA60-4F4B-8473-B93864250B47&Options=info|&Search="}, "Agenda": {"label": "Agenda", "url": "https://cincinnatioh.legistar.com/View.ashx?M=A&ID=1229954&GUID=F227C97F-CA60-4F4B-8473-B93864250B47"}, "Agenda Packet": {"label": "Agenda\u00a0Packet", "url": "https://cincinnatioh.legistar.com/View.ashx?M=PA&ID=1229954&GUID=F227C97F-CA60-4F4B-8473-B93864250B47"}, "Minutes": {"label": "Minutes", "url": "https://cincinnatioh.legistar.com/View.ashx?M=M&ID=1229954&GUID=F227C97F-CA60-4F4B-8473-B93864250B47"}, "Video": "Not\u00a0available"}, {"Name": {"label": "Budget and Finance Committee", "url": "https://cincinnatioh.legistar.com/DepartmentDetail.aspx?ID=38077&GUID=207ADF11-05C1-4163-AF58-E1C0119342BC"}, "Meeting Date": "10/14/2024", "iCalendar": {"url": "https://cincinnatioh.legistar.com/View.ashx?M=IC&ID=1229950&GUID=C9733291-6F67-4456-B507-4FD3D037B229"}, "Meeting Time": "1:00 PM", "Meeting Location": "Council Chambers, Room 300 NOTICE OF CANCELLATION", "Meeting Details": {"label": "Meeting\u00a0details", "url": "https://cincinnatioh.legistar.com/MeetingDetail.aspx?ID=1229950&GUID=C9733291-6F67-4456-B507-4FD3D037B229&Options=info|&Search="}, "Agenda": {"label": "Agenda", "url": "https://cincinnatioh.legistar.com/View.ashx?M=A&ID=1229950&GUID=C9733291-6F67-4456-B507-4FD3D037B229"}, "Agenda Packet": "Not\u00a0available", "Minutes": "Not\u00a0available", "Video": "Not\u00a0available"}, {"Name": {"label": "Cincinnati City Council", "url": "https://cincinnatioh.legistar.com/DepartmentDetail.aspx?ID=38076&GUID=1CA48415-BFFD-4857-8A93-48AA89BD31C6"}, "Meeting Date": "10/9/2024", "iCalendar": {"url": "https://cincinnatioh.legistar.com/View.ashx?M=IC&ID=1229957&GUID=0BEEEA8D-EDFB-4D14-810B-C8303A53436C"}, "Meeting Time": "2:00 PM", "Meeting Location": "Council Chambers, Room 300", "Meeting Details": {"label": "Meeting\u00a0details", "url": "https://cincinnatioh.legistar.com/MeetingDetail.aspx?ID=1229957&GUID=0BEEEA8D-EDFB-4D14-810B-C8303A53436C&Options=info|&Search="}, "Agenda": {"label": "Agenda", "url": "https://cincinnatioh.legistar.com/View.ashx?M=A&ID=1229957&GUID=0BEEEA8D-EDFB-4D14-810B-C8303A53436C"}, "Agenda Packet": {"label": "Agenda\u00a0Packet", "url": "https://cincinnatioh.legistar.com/View.ashx?M=PA&ID=1229957&GUID=0BEEEA8D-EDFB-4D14-810B-C8303A53436C"}, "Minutes": {"label": "Minutes", "url": "https://cincinnatioh.legistar.com/View.ashx?M=M&ID=1229957&GUID=0BEEEA8D-EDFB-4D14-810B-C8303A53436C"}, "Video": "Not\u00a0available"}, {"Name": {"label": "Equitable Growth & Housing", "url": "https://cincinnatioh.legistar.com/DepartmentDetail.aspx?ID=47053&GUID=2F284759-56BA-4D9D-B640-E02600D83497"}, "Meeting Date": "10/8/2024", "iCalendar": {"url": "https://cincinnatioh.legistar.com/View.ashx?M=IC&ID=1230013&GUID=4C858055-B215-47C4-8792-B26EEF66EFE5"}, "Meeting Time": "1:00 PM", "Meeting Location": "Council Chambers, Room 300", "Meeting Details": {"label": "Meeting\u00a0details", "url": "https://cincinnatioh.legistar.com/MeetingDetail.aspx?ID=1230013&GUID=4C858055-B215-47C4-8792-B26EEF66EFE5&Options=info|&Search="}, "Agenda": {"label": "Agenda", "url": "https://cincinnatioh.legistar.com/View.ashx?M=A&ID=1230013&GUID=4C858055-B215-47C4-8792-B26EEF66EFE5"}, "Agenda Packet": {"label": "Agenda\u00a0Packet", "url": "https://cincinnatioh.legistar.com/View.ashx?M=PA&ID=1230013&GUID=4C858055-B215-47C4-8792-B26EEF66EFE5"}, "Minutes": {"label": "Minutes", "url": "https://cincinnatioh.legistar.com/View.ashx?M=M&ID=1230013&GUID=4C858055-B215-47C4-8792-B26EEF66EFE5"}, "Video": "Not\u00a0available"}, {"Name": {"label": "Climate, Environment & Infrastructure", "url": "https://cincinnatioh.legistar.com/DepartmentDetail.aspx?ID=47054&GUID=05A8515F-E080-4889-AF46-16001666DA90"}, "Meeting Date": "10/8/2024", "iCalendar": {"url": "https://cincinnatioh.legistar.com/View.ashx?M=IC&ID=1230012&GUID=6D6EACB3-C83A-4E42-890B-D4A9071298F9"}, "Meeting Time": "10:00 AM", "Meeting Location": "Council Chambers, Room 300", "Meeting Details": {"label": "Meeting\u00a0details", "url": "https://cincinnatioh.legistar.com/MeetingDetail.aspx?ID=1230012&GUID=6D6EACB3-C83A-4E42-890B-D4A9071298F9&Options=info|&Search="}, "Agenda": {"label": "Agenda", "url": "https://cincinnatioh.legistar.com/View.ashx?M=A&ID=1230012&GUID=6D6EACB3-C83A-4E42-890B-D4A9071298F9"}, "Agenda Packet": {"label": "Agenda\u00a0Packet", "url": "https://cincinnatioh.legistar.com/View.ashx?M=PA&ID=1230012&GUID=6D6EACB3-C83A-4E42-890B-D4A9071298F9"}, "Minutes": {"label": "Minutes", "url": "https://cincinnatioh.legistar.com/View.ashx?M=M&ID=1230012&GUID=6D6EACB3-C83A-4E42-890B-D4A9071298F9"}, "Video": "Not\u00a0available"}, {"Name": {"label": "Budget and Finance Committee", "url": "https://cincinnatioh.legistar.com/DepartmentDetail.aspx?ID=38077&GUID=207ADF11-05C1-4163-AF58-E1C0119342BC"}, "Meeting Date": "10/7/2024", "iCalendar": {"url": "https://cincinnatioh.legistar.com/View.ashx?M=IC&ID=1229948&GUID=CD6166C2-61D6-4722-AA2F-9B5264EA6199"}, "Meeting Time": "1:00 PM", "Meeting Location": "Council Chambers, Room 300", "Meeting Details": {"label": "Meeting\u00a0details", "url": "https://cincinnatioh.legistar.com/MeetingDetail.aspx?ID=1229948&GUID=CD6166C2-61D6-4722-AA2F-9B5264EA6199&Options=info|&Search="}, "Agenda": {"label": "Agenda", "url": "https://cincinnatioh.legistar.com/View.ashx?M=A&ID=1229948&GUID=CD6166C2-61D6-4722-AA2F-9B5264EA6199"}, "Agenda Packet": {"label": "Agenda\u00a0Packet", "url": "https://cincinnatioh.legistar.com/View.ashx?M=PA&ID=1229948&GUID=CD6166C2-61D6-4722-AA2F-9B5264EA6199"}, "Minutes": {"label": "Minutes", "url": "https://cincinnatioh.legistar.com/View.ashx?M=M&ID=1229948&GUID=CD6166C2-61D6-4722-AA2F-9B5264EA6199"}, "Video": "Not\u00a0available"}, {"Name": {"label": "Cincinnati City Council", "url": "https://cincinnatioh.legistar.com/DepartmentDetail.aspx?ID=38076&GUID=1CA48415-BFFD-4857-8A93-48AA89BD31C6"}, "Meeting Date": "10/2/2024", "iCalendar": {"url": "https://cincinnatioh.legistar.com/View.ashx?M=IC&ID=1229956&GUID=C4B25E37-661F-44A4-8E06-0554A1864CA4"}, "Meeting Time": "2:00 PM", "Meeting Location": "Council Chambers, Room 300", "Meeting Details": {"label": "Meeting\u00a0details", "url": "https://cincinnatioh.legistar.com/MeetingDetail.aspx?ID=1229956&GUID=C4B25E37-661F-44A4-8E06-0554A1864CA4&Options=info|&Search="}, "Agenda": {"label": "Agenda", "url": "https://cincinnatioh.legistar.com/View.ashx?M=A&ID=1229956&GUID=C4B25E37-661F-44A4-8E06-0554A1864CA4"}, "Agenda Packet": {"label": "Agenda\u00a0Packet", "url": "https://cincinnatioh.legistar.com/View.ashx?M=PA&ID=1229956&GUID=C4B25E37-661F-44A4-8E06-0554A1864CA4"}, "Minutes": {"label": "Minutes", "url": "https://cincinnatioh.legistar.com/View.ashx?M=M&ID=1229956&GUID=C4B25E37-661F-44A4-8E06-0554A1864CA4"}, "Video": "Not\u00a0available"}, {"Name": {"label": "Healthy Neighborhoods", "url": "https://cincinnatioh.legistar.com/DepartmentDetail.aspx?ID=47057&GUID=94D4079D-9CEF-48C0-9FC8-6B42FE58BC0C"}, "Meeting Date": "10/1/2024", "iCalendar": {"url": "https://cincinnatioh.legistar.com/View.ashx?M=IC&ID=1231940&GUID=D11B7136-1C39-474F-AC7A-ADF12060DA94"}, "Meeting Time": "5:30 PM", "Meeting Location": "Sayler Park Recreation Center, 6720 Home City Avenue, Cincinnati, Ohio 45233 SPECIAL MEETING", "Meeting Details": {"label": "Meeting\u00a0details", "url": "https://cincinnatioh.legistar.com/MeetingDetail.aspx?ID=1231940&GUID=D11B7136-1C39-474F-AC7A-ADF12060DA94&Options=info|&Search="}, "Agenda": {"label": "Agenda", "url": "https://cincinnatioh.legistar.com/View.ashx?M=A&ID=1231940&GUID=D11B7136-1C39-474F-AC7A-ADF12060DA94"}, "Agenda Packet": "Not\u00a0available", "Minutes": {"label": "Minutes", "url": "https://cincinnatioh.legistar.com/View.ashx?M=M&ID=1231940&GUID=D11B7136-1C39-474F-AC7A-ADF12060DA94"}, "Video": "Not\u00a0available"}, {"Name": {"label": "Healthy Neighborhoods", "url": "https://cincinnatioh.legistar.com/DepartmentDetail.aspx?ID=47057&GUID=94D4079D-9CEF-48C0-9FC8-6B42FE58BC0C"}, "Meeting Date": "10/1/2024", "iCalendar": {"url": "https://cincinnatioh.legistar.com/View.ashx?M=IC&ID=1229960&GUID=991B9783-6CA2-4AE3-BC94-64020D6263E6"}, "Meeting Time": "12:30 PM", "Meeting Location": "Council Chambers, Room 300 NOTICE OF CANCELLATION", "Meeting Details": {"label": "Meeting\u00a0details", "url": "https://cincinnatioh.legistar.com/MeetingDetail.aspx?ID=1229960&GUID=991B9783-6CA2-4AE3-BC94-64020D6263E6&Options=info|&Search="}, "Agenda": {"label": "Agenda", "url": "https://cincinnatioh.legistar.com/View.ashx?M=A&ID=1229960&GUID=991B9783-6CA2-4AE3-BC94-64020D6263E6"}, "Agenda Packet": "Not\u00a0available", "Minutes": "Not\u00a0available", "Video": "Not\u00a0available"}, {"Name": {"label": "Public Safety & Governance", "url": "https://cincinnatioh.legistar.com/DepartmentDetail.aspx?ID=47056&GUID=ADCB91E2-9BDF-482F-AAF6-6D5EF2A040FB"}, "Meeting Date": "10/1/2024", "iCalendar": {"url": "https://cincinnatioh.legistar.com/View.ashx?M=IC&ID=1229953&GUID=88A4DE50-7828-449F-8DAE-6B452906BBF3"}, "Meeting Time": "9:30 AM", "Meeting Location": "Council Chambers, Room 300", "Meeting Details": {"label": "Meeting\u00a0details", "url": "https://cincinnatioh.legistar.com/MeetingDetail.aspx?ID=1229953&GUID=88A4DE50-7828-449F-8DAE-6B452906BBF3&Options=info|&Search="}, "Agenda": {"label": "Agenda", "url": "https://cincinnatioh.legistar.com/View.ashx?M=A&ID=1229953&GUID=88A4DE50-7828-449F-8DAE-6B452906BBF3"}, "Agenda Packet": "Not\u00a0available", "Minutes": "Not\u00a0available", "Video": "Not\u00a0available"}]


🛠️ Refactor suggestion

Standardize meeting location format.

The meeting location field currently mixes the base location with status information (e.g., "NOTICE OF CANCELLATION", "SPECIAL SESSION"). Consider separating these into distinct fields for better parsing and consistency.

- "Meeting Location": "Council Chambers, Room 300 NOTICE OF CANCELLATION", + "Meeting Location": "Council Chambers, Room 300", + "Meeting Status": "CANCELLED",

Committable suggestion was skipped due to low confidence.

🛠️ Refactor suggestion

Consider standardizing "Not available" values.

Currently, unavailable items are represented as string "Not\u00a0available". Consider using null values instead, which is more idiomatic in JSON and allows for easier programmatic handling.

- "Agenda": "Not\u00a0available", + "Agenda": null,

Committable suggestion was skipped due to low confidence.

city_scrapers/spiders/cinoh_city_council.py

coderabbitai · 2024-10-28T17:04:31Z

city_scrapers/spiders/cinoh_city_council.py

+class CinohCityCouncilSpider(LegistarSpider):
+    name = "cinoh_city_council"
+    agency = "Cincinnati City Council"
+    timezone = "America/Chicago"
+    start_urls = ["https://cincinnatioh.legistar.com/Calendar.aspx"]


⚠️ Potential issue

Mismatch between PR objectives and code – City mismatch

The code is implementing a spider for "Cincinnati City Council" as indicated by the agency, start_urls, name, and class name. However, the PR objectives mention updating the scraper for Cleveland City Council. Please verify and update the code to align with the PR objectives.

Apply the following changes to update the code:

-class CinohCityCouncilSpider(LegistarSpider): +class ClevelandCityCouncilSpider(LegistarSpider): name = "cinoh_city_council" - agency = "Cincinnati City Council" + name = "cle_city_council" + agency = "Cleveland City Council" - timezone = "America/Chicago" + timezone = "America/New_York" - start_urls = ["https://cincinnatioh.legistar.com/Calendar.aspx"] + start_urls = ["https://clevelandoh.legistar.com/Calendar.aspx"]

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

class CinohCityCouncilSpider(LegistarSpider):

name = "cinoh_city_council"

agency = "Cincinnati City Council"

timezone = "America/Chicago"

start_urls = ["https://cincinnatioh.legistar.com/Calendar.aspx"]

class ClevelandCityCouncilSpider(LegistarSpider):

name = "cle_city_council"

agency = "Cleveland City Council"

timezone = "America/New_York"

start_urls = ["https://clevelandoh.legistar.com/Calendar.aspx"]

coderabbitai · 2024-10-28T17:04:31Z

city_scrapers/spiders/cinoh_city_council.py

+        if obj.get("Name"):
+            links.append({"title": "meeting page", "href": obj["Name"]["url"]})
+        elif obj.get("iCalendar"):
+            links.append({"title": "iCalendar", "href": obj["iCalendar"]["url"]})
+        elif obj.get("Meeting Details"):
+            links.append({"title": "Meeting Details", "href": obj["Meeting Details"]["url"]})
+        elif obj.get("Agenda"):
+            links.append({"title": "Agenda", "href": obj["Agenda"]["url"]})
+        elif obj.get("Agenda Packet"):
+            links.append({"title": "Agenda Packet", "href": obj["Agenda Packet"]["url"]})
+        elif obj.get("Minutes"):
+            links.append({"title": "Minutes", "href": obj["Minutes"]["url"]})
+        elif obj.get("Video"):
+            links.append({"title": "Video", "href": obj["Video"]["url"]})


⚠️ Potential issue

Modify link parsing to include all available links

Currently, the _parse_links method uses elif statements, so only the first available link is added. To include all available links, change elif to if.

Apply this change to include all relevant links:

links = [] if obj.get("Name") and obj["Name"].get("url"): links.append({"title": "Meeting Page", "href": obj["Name"]["url"]}) - elif obj.get("iCalendar") and obj["iCalendar"].get("url"): + if obj.get("iCalendar") and obj["iCalendar"].get("url"): links.append({"title": "iCalendar", "href": obj["iCalendar"]["url"]}) - elif obj.get("Meeting Details") and obj["Meeting Details"].get("url"): + if obj.get("Meeting Details") and obj["Meeting Details"].get("url"): links.append({"title": "Meeting Details", "href": obj["Meeting Details"]["url"]}) - elif obj.get("Agenda") and obj["Agenda"].get("url"): + if obj.get("Agenda") and obj["Agenda"].get("url"): links.append({"title": "Agenda", "href": obj["Agenda"]["url"]}) - elif obj.get("Agenda Packet") and obj["Agenda Packet"].get("url"): + if obj.get("Agenda Packet") and obj["Agenda Packet"].get("url"): links.append({"title": "Agenda Packet", "href": obj["Agenda Packet"]["url"]}) - elif obj.get("Minutes") and obj["Minutes"].get("url"): + if obj.get("Minutes") and obj["Minutes"].get("url"): links.append({"title": "Minutes", "href": obj["Minutes"]["url"]}) - elif obj.get("Video") and obj["Video"].get("url"): + if obj.get("Video") and obj["Video"].get("url"): links.append({"title": "Video", "href": obj["Video"]["url"]}) return links

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

if obj.get("Name"):

links.append({"title": "meeting page", "href": obj["Name"]["url"]})

elif obj.get("iCalendar"):

links.append({"title": "iCalendar", "href": obj["iCalendar"]["url"]})

elif obj.get("Meeting Details"):

links.append({"title": "Meeting Details", "href": obj["Meeting Details"]["url"]})

elif obj.get("Agenda"):

links.append({"title": "Agenda", "href": obj["Agenda"]["url"]})

elif obj.get("Agenda Packet"):

links.append({"title": "Agenda Packet", "href": obj["Agenda Packet"]["url"]})

elif obj.get("Minutes"):

links.append({"title": "Minutes", "href": obj["Minutes"]["url"]})

elif obj.get("Video"):

links.append({"title": "Video", "href": obj["Video"]["url"]})

if obj.get("Name"):

links.append({"title": "meeting page", "href": obj["Name"]["url"]})

if obj.get("iCalendar"):

links.append({"title": "iCalendar", "href": obj["iCalendar"]["url"]})

if obj.get("Meeting Details"):

links.append({"title": "Meeting Details", "href": obj["Meeting Details"]["url"]})

if obj.get("Agenda"):

links.append({"title": "Agenda", "href": obj["Agenda"]["url"]})

if obj.get("Agenda Packet"):

links.append({"title": "Agenda Packet", "href": obj["Agenda Packet"]["url"]})

if obj.get("Minutes"):

links.append({"title": "Minutes", "href": obj["Minutes"]["url"]})

if obj.get("Video"):

links.append({"title": "Video", "href": obj["Video"]["url"]})

city_scrapers/spiders/cinoh_city_council.py

cruznunez

Nice work here!

I would improve the _parse_links method so it accurately captures all available links. Also, I would add another test case for the COMMITTEE classification. The docstrings could also use some updating instead of just being the generated string. And finally, the PR description talks about Cleveland City Council but this PR is supposed to be for Cincinnati City Council I believe. Double check to make sure everything is correct there.

city_scrapers/spiders/cinoh_city_council.py

tests/test_cinoh_city_council.py

LienDang

There are issues with "timezone" and "status". Please double check!

LienDang · 2024-12-24T04:57:40Z

city_scrapers/spiders/cinoh_city_council.py

+class CinohCityCouncilSpider(LegistarSpider):
+    name = "cinoh_city_council"
+    agency = "Cincinnati City Council"
+    timezone = "America/Chicago"


Can you double check "what is the correct timezone for Cincinnati city"?

LienDang · 2024-12-24T05:23:54Z

city_scrapers/spiders/cinoh_city_council.py

+                source=self.legistar_source(obj),
+            )
+
+            meeting["status"] = self._get_status(meeting)


In case the meeting location contains the text "NOTICE OF CANCELLATION", for example the last meeting of the year 2024 Equitable Growth & Housing on 12/31/2024, meaning that this meeting is cancelled. So you need to override the method _parse_status(meeting) instead of using the inherited method self._get_status(meeting) to return status "CANCELLED"

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (3)

city_scrapers/spiders/cinoh_city_council.py (3)

20-32: ⚠️ Potential issue

Add error handling for missing keys in meeting data.

Direct dictionary access without safety checks could raise KeyError exceptions.

Apply these changes to safely handle missing data:

             meeting = Meeting(
-                title=obj["Name"]["label"],
+                title=obj.get("Name", {}).get("label", ""),
                 description="",
                 classification=self._parse_classification(obj),
                 start=self.legistar_start(obj),
                 end=None,
                 all_day=False,
                 time_notes="",
                 status=self._parse_status(obj),
                 location=self._parse_location(obj),
                 links=self._parse_links(obj),
-                source=self.legistar_source(obj),
+                source=self.legistar_source(obj) or "",
             )

50-55: ⚠️ Potential issue

Add error handling for location parsing.

Direct access to obj["Meeting Location"] could raise KeyError.

     def _parse_location(self, obj):
-        room = obj["Meeting Location"]
+        room = obj.get("Meeting Location", "Location not provided")
         return {
             "address": "801 Plum St. Cincinnati, OH 45202",
-            "name": f"{room}"
+            "name": room
         }

57-74: ⚠️ Potential issue

Fix link parsing to include all available links.

The current implementation using separate if statements with return only captures the first matching link. This means potentially important links are being missed.

     def _parse_links(self, obj):
-        
-        if obj.get("Name"):
-            return [{"title": "meeting page", "href": obj["Name"]["url"]}]
-        if obj.get("iCalendar"):
-            return [{"title": "iCalendar", "href": obj["iCalendar"]["url"]}]
-        if obj.get("Meeting Details"):
-            return [{"title": "Meeting Details", "href": obj["Meeting Details"]["url"]}]
-        if obj.get("Agenda"):
-            return [{"title": "Agenda", "href": obj["Agenda"]["url"]}]
-        if obj.get("Agenda Packet"):
-            return [{"title": "Agenda Packet", "href": obj["Agenda Packet"]["url"]}]
-        if obj.get("Minutes"):
-            return [{"title": "Minutes", "href": obj["Minutes"]["url"]}]
-        if obj.get("Video"):
-            return [{"title": "Video", "href": obj["Video"]["url"]}]
-        
-
+        links = []
+        if obj.get("Name", {}).get("url"):
+            links.append({"title": "Meeting Page", "href": obj["Name"]["url"]})
+        if obj.get("iCalendar", {}).get("url"):
+            links.append({"title": "iCalendar", "href": obj["iCalendar"]["url"]})
+        if obj.get("Meeting Details", {}).get("url"):
+            links.append({"title": "Meeting Details", "href": obj["Meeting Details"]["url"]})
+        if obj.get("Agenda", {}).get("url"):
+            links.append({"title": "Agenda", "href": obj["Agenda"]["url"]})
+        if obj.get("Agenda Packet", {}).get("url"):
+            links.append({"title": "Agenda Packet", "href": obj["Agenda Packet"]["url"]})
+        if obj.get("Minutes", {}).get("url"):
+            links.append({"title": "Minutes", "href": obj["Minutes"]["url"]})
+        if obj.get("Video", {}).get("url"):
+            links.append({"title": "Video", "href": obj["Video"]["url"]})
+        return links

🧹 Nitpick comments (1)

city_scrapers/spiders/cinoh_city_council.py (1)

13-18: Improve docstring clarity and fix typos.

The docstring could be more informative and contains typos:

"calander" should be "calendar"
Add information about the return type and parameters

Update the docstring to:

-        """
-        Parse upcoming and past meetings from the Cincinnati City Council meetings table.
-
-        Oftentimes, the columns: meeting details, agenda, minutes, and video are left blank on the calander
-        but when they are, they are in the form of links. 
-        """
+        """Parse upcoming and past meetings from the Cincinnati City Council meetings table.
+
+        Args:
+            response: JSON response from Legistar API containing meeting data
+
+        Yields:
+            Meeting: Scraped meeting items with details including title, time, location, and links.
+            
+        Note:
+            The columns (meeting details, agenda, minutes, and video) are sometimes empty on the calendar,
+            but when present, they contain links to the respective resources.
+        """

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cd611e5 and d508fb9.

📒 Files selected for processing (2)

city_scrapers/spiders/cinoh_city_council.py (1 hunks)
tests/test_cinoh_city_council.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

tests/test_cinoh_city_council.py

🔇 Additional comments (1)

city_scrapers/spiders/cinoh_city_council.py (1)

6-10: LGTM! Configuration is correct.

The spider is properly configured with the correct timezone for Cincinnati and the appropriate Legistar calendar URL.

coderabbitai · 2025-01-08T17:54:21Z

city_scrapers/spiders/cinoh_city_council.py

+    def _parse_status(self, obj):
+        if obj["Meeting Location"] == "Council Chambers, Room 300 NOTICE OF CANCELLATION":
+            return "cancelled"
+        else:
+            return "tentative"


🛠️ Refactor suggestion

Enhance cancellation detection logic.

The current implementation only checks for one specific cancellation notice format. Consider:

Case-insensitive matching

Different cancellation notice formats

Default to using the inherited self._get_status() for other cases

def _parse_status(self, obj): - if obj["Meeting Location"] == "Council Chambers, Room 300 NOTICE OF CANCELLATION": + location = obj.get("Meeting Location", "").lower() + if "notice of cancellation" in location or "canceled" in location: return "cancelled" - else: - return "tentative" + return self._get_status(obj)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def _parse_status(self, obj):

if obj["Meeting Location"] == "Council Chambers, Room 300 NOTICE OF CANCELLATION":

return "cancelled"

else:

return "tentative"

def _parse_status(self, obj):

location = obj.get("Meeting Location", "").lower()

if "notice of cancellation" in location or "canceled" in location:

return "cancelled"

return self._get_status(obj)

cruznunez

We still have some problems with the links parsing. Please fix.

city_scrapers/spiders/cinoh_city_council.py

lamle-ea

Please format the code using the black code formatter to fix the styling issues.

lamle-ea · 2025-01-16T08:59:07Z

city_scrapers/spiders/cinoh_city_council.py

+        else:
+            return COMMITTEE
+
+    def _parse_status(self, obj):


If the meeting date is earlier than the current date, return the status PASSED.

lamle-ea · 2025-01-16T09:04:51Z

city_scrapers/spiders/cinoh_city_council.py

+            links.append({"title": "iCalendar", "href": obj["iCalendar"]["url"]})
+
+        if obj.get("Meeting Details") == "Meeting\u00a0details":
+            links.append({"title": "Meeting Details", "href": "Not Available"})


If a document is unavailable, exclude it from the list of links. We can refactor this block by creating an array of possible document names and iterating through them, like this:

for document_name in ["Meeting Details", "Agenda", ...]: document = obj.get(document_name) if document.get("url"): links.append({"title": document_name, "href": document["url"]})

cruznunez

My comments have been fixed but we still have failing lint issues. Runpipenv run black . and other lint commands to fix the issues.

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (1)

city_scrapers/spiders/cinoh_city_council.py (1)

22-34: ⚠️ Potential issue

Add error handling for missing keys in meeting data.

Direct dictionary access without checking for key existence could raise KeyError exceptions.

Apply these changes to safely handle missing keys:

             meeting = Meeting(
-                title=obj["Name"]["label"],
+                title=obj.get("Name", {}).get("label", "No title provided"),
                 description="",
                 classification=self._parse_classification(obj),
                 start=self.legistar_start(obj),
                 end=None,
                 all_day=False,
                 time_notes="",
                 status=self._parse_status(obj),
                 location=self._parse_location(obj),
                 links=self._parse_links(obj),
-                source=self.legistar_source(obj),
+                source=self.legistar_source(obj) if obj else "",
             )

🧹 Nitpick comments (1)

city_scrapers/spiders/cinoh_city_council.py (1)

14-20: Enhance method documentation.

The docstring could be more descriptive about the method's parameters, return value, and specific behavior.

Apply this change:

     def parse_legistar(self, response):
         """
-        Parse upcoming and past meetings from the Cincinnati City Council meetings table.
-
-        Oftentimes, the columns: meeting details, agenda, minutes, and video are left blank on the calander
-        but when they are, they are in the form of links.
+        Parse meeting information from the Legistar API response.
+
+        Args:
+            response: JSON response from Legistar API containing meeting data
+
+        Yields:
+            Meeting: Scraped meeting items with details including title, date, location,
+                    status, and associated document links
+
+        Note:
+            Meeting documents (agenda, minutes, video) may be unavailable or present as links
+            in the calendar view.
         """

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d508fb9 and 3cb02fb.

📒 Files selected for processing (2)

city_scrapers/spiders/cinoh_city_council.py (1 hunks)
tests/test_cinoh_city_council.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

tests/test_cinoh_city_council.py

🧰 Additional context used

🪛 Ruff (0.8.2)

city_scrapers/spiders/cinoh_city_council.py

75-75: Use obj.get("Meeting Details") != "Meeting\xa0details" instead of not obj.get("Meeting Details") == "Meeting\xa0details"

Replace with != operator

(SIM201)

80-80: Use obj.get("Agenda") != "Not\xa0available" instead of not obj.get("Agenda") == "Not\xa0available"

Replace with != operator

(SIM201)

83-83: Use obj.get("Agenda Packet") != "Not\xa0available" instead of not obj.get("Agenda Packet") == "Not\xa0available"

Replace with != operator

(SIM201)

88-88: Use obj.get("Minutes") != "Not\xa0available" instead of not obj.get("Minutes") == "Not\xa0available"

Replace with != operator

(SIM201)

91-91: Use obj.get("Video") != "Not\xa0available" instead of not obj.get("Video") == "Not\xa0available"

Replace with != operator

(SIM201)

🔇 Additional comments (2)

city_scrapers/spiders/cinoh_city_council.py (2)

8-12: LGTM! Configuration is accurate.

The timezone is correctly set to "America/New_York" for Cincinnati, and other configuration parameters are properly defined.

46-60: 🛠️ Refactor suggestion

Enhance meeting status detection.

The status detection needs improvement to handle:

Case-insensitive cancellation notice matching

Different cancellation notice formats

Past meetings (already implemented)

Apply this change:
     def _parse_status(self, obj):
         date = obj["Meeting Date"]
         parsed_date = parse(date, fuzzy=True)
 
-        if (
-            obj["Meeting Location"]
-            == "Council Chambers, Room 300 NOTICE OF CANCELLATION"
-        ):
+        location = obj.get("Meeting Location", "").lower()
+        if "notice of cancellation" in location or "canceled" in location:
             return "cancelled"
         elif parsed_date < datetime.today():
             return "passed"
         else:
             return "tentative"
Likely invalid or redundant comment.

coderabbitai · 2025-01-17T17:36:16Z

city_scrapers/spiders/cinoh_city_council.py

+    def _parse_links(self, obj):
+
+        links = []
+
+        if obj.get("Name"):
+            links.append({"title": "meeting page", "href": obj["Name"]["url"]})
+
+        if obj.get("iCalendar"):
+            links.append({"title": "iCalendar", "href": obj["iCalendar"]["url"]})
+
+        if not obj.get("Meeting Details") == "Meeting\u00a0details":
+            links.append(
+                {"title": "Meeting Details", "href": obj["Meeting Details"]["url"]}
+            )
+
+        if not obj.get("Agenda") == "Not\u00a0available":
+            links.append({"title": "Agenda", "href": obj["Agenda"]["url"]})
+
+        if not obj.get("Agenda Packet") == "Not\u00a0available":
+            links.append(
+                {"title": "Agenda Packet", "href": obj["Agenda Packet"]["url"]}
+            )
+
+        if not obj.get("Minutes") == "Not\u00a0available":
+            links.append({"title": "Minutes", "href": obj["Minutes"]["url"]})
+
+        if not obj.get("Video") == "Not\u00a0available":
+            links.append({"title": "Video", "href": obj["Video"]["url"]})
+
+        return links


🛠️ Refactor suggestion

Refactor link parsing for better maintainability and reliability.

The current implementation has several issues:

Inefficient comparison operators (using not == instead of !=)

Repetitive code structure

Unsafe dictionary access

Apply this refactoring to improve the code:

def _parse_links(self, obj): links = [] - - if obj.get("Name"): - links.append({"title": "meeting page", "href": obj["Name"]["url"]}) - - if obj.get("iCalendar"): - links.append({"title": "iCalendar", "href": obj["iCalendar"]["url"]}) - - if not obj.get("Meeting Details") == "Meeting\u00a0details": - links.append( - {"title": "Meeting Details", "href": obj["Meeting Details"]["url"]} - ) - - if not obj.get("Agenda") == "Not\u00a0available": - links.append({"title": "Agenda", "href": obj["Agenda"]["url"]}) - - if not obj.get("Agenda Packet") == "Not\u00a0available": - links.append( - {"title": "Agenda Packet", "href": obj["Agenda Packet"]["url"]} - ) - - if not obj.get("Minutes") == "Not\u00a0available": - links.append({"title": "Minutes", "href": obj["Minutes"]["url"]}) - - if not obj.get("Video") == "Not\u00a0available": - links.append({"title": "Video", "href": obj["Video"]["url"]}) + documents = { + "Name": "meeting page", + "iCalendar": "iCalendar", + "Meeting Details": "Meeting Details", + "Agenda": "Agenda", + "Agenda Packet": "Agenda Packet", + "Minutes": "Minutes", + "Video": "Video" + } + + for key, title in documents.items(): + doc = obj.get(key, {}) + if isinstance(doc, dict) and doc.get("url"): + if key in ["Meeting Details"] and doc.get("label") == "Meeting\u00a0details": + continue + if doc.get("label") != "Not\u00a0available": + links.append({"title": title, "href": doc["url"]}) return links

This refactoring:

Uses a dictionary to define document types

Handles missing keys safely

Properly checks for unavailable documents

Reduces code duplication

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def _parse_links(self, obj):

links = []

if obj.get("Name"):

links.append({"title": "meeting page", "href": obj["Name"]["url"]})

if obj.get("iCalendar"):

links.append({"title": "iCalendar", "href": obj["iCalendar"]["url"]})

if not obj.get("Meeting Details") == "Meeting\u00a0details":

links.append(

{"title": "Meeting Details", "href": obj["Meeting Details"]["url"]}

)

if not obj.get("Agenda") == "Not\u00a0available":

links.append({"title": "Agenda", "href": obj["Agenda"]["url"]})

if not obj.get("Agenda Packet") == "Not\u00a0available":

links.append(

{"title": "Agenda Packet", "href": obj["Agenda Packet"]["url"]}

)

if not obj.get("Minutes") == "Not\u00a0available":

links.append({"title": "Minutes", "href": obj["Minutes"]["url"]})

if not obj.get("Video") == "Not\u00a0available":

links.append({"title": "Video", "href": obj["Video"]["url"]})

return links

def _parse_links(self, obj):

links = []

documents = {

"Name": "meeting page",

"iCalendar": "iCalendar",

"Meeting Details": "Meeting Details",

"Agenda": "Agenda",

"Agenda Packet": "Agenda Packet",

"Minutes": "Minutes",

"Video": "Video"

}

for key, title in documents.items():

doc = obj.get(key, {})

if isinstance(doc, dict) and doc.get("url"):

if key in ["Meeting Details"] and doc.get("label") == "Meeting\u00a0details":

continue

if doc.get("label") != "Not\u00a0available":

links.append({"title": title, "href": doc["url"]})

return links

🧰 Tools

🪛 Ruff (0.8.2)

75-75: Use obj.get("Meeting Details") != "Meeting\xa0details" instead of not obj.get("Meeting Details") == "Meeting\xa0details"

Replace with != operator

(SIM201)

80-80: Use obj.get("Agenda") != "Not\xa0available" instead of not obj.get("Agenda") == "Not\xa0available"

Replace with != operator

(SIM201)

83-83: Use obj.get("Agenda Packet") != "Not\xa0available" instead of not obj.get("Agenda Packet") == "Not\xa0available"

Replace with != operator

(SIM201)

88-88: Use obj.get("Minutes") != "Not\xa0available" instead of not obj.get("Minutes") == "Not\xa0available"

Replace with != operator

(SIM201)

91-91: Use obj.get("Video") != "Not\xa0available" instead of not obj.get("Video") == "Not\xa0available"

Replace with != operator

(SIM201)

lamle-ea

I noticed that the CI checks are still failing. Before pushing your commits, please run these code quality checks locally:

pipenv run isort . --check-only  # Check import sorting
pipenv run black . --check       # Check code formatting
pipenv run flake8 .             # Check code style

coderabbitai

Actionable comments posted: 1

♻️ Duplicate comments (4)

city_scrapers/spiders/cinoh_city_council.py (4)

25-37: ⚠️ Potential issue

Add error handling for missing keys.

The current implementation could raise KeyError exceptions if the response data is malformed.

Apply this change to handle missing keys safely:

             meeting = Meeting(
-                title=obj["Name"]["label"],
+                title=obj.get("Name", {}).get("label", "No title provided"),
                 description="",
                 classification=self._parse_classification(obj),
                 start=self.legistar_start(obj),
                 end=None,
                 all_day=False,
                 time_notes="",
                 status=self._parse_status(obj),
                 location=self._parse_location(obj),
                 links=self._parse_links(obj),
                 source=self.legistar_source(obj),
             )

49-63: 🛠️ Refactor suggestion

Enhance cancellation detection logic.

The current implementation:

Has unsafe key access
Uses a very specific cancellation notice format
Could miss other cancellation indicators

     def _parse_status(self, obj):
         date = obj["Meeting Date"]
         parsed_date = parse(date, fuzzy=True)
 
-        if (
-            obj["Meeting Location"]
-            == "Council Chambers, Room 300 NOTICE OF CANCELLATION"
-        ):
+        location = obj.get("Meeting Location", "").lower()
+        if "notice of cancellation" in location or "canceled" in location:
             return "cancelled"
         elif parsed_date < datetime.today():
             return "passed"
         else:
             return "tentative"

64-66: ⚠️ Potential issue

Add error handling for missing location.

The method could raise KeyError if the "Meeting Location" key is missing.

     def _parse_location(self, obj):
-        room = obj["Meeting Location"]
+        room = obj.get("Meeting Location", "No room provided")
         return {"address": "801 Plum St. Cincinnati, OH 45202", "name": f"{room}"}

68-97: 🛠️ Refactor suggestion

Refactor link parsing for better maintainability and reliability.

Current implementation has several issues:

Unsafe dictionary access
Inefficient comparison operators
Repetitive code structure

     def _parse_links(self, obj):
         links = []
-
-        if obj.get("Name"):
-            links.append({"title": "meeting page", "href": obj["Name"]["url"]})
-
-        if obj.get("iCalendar"):
-            links.append({"title": "iCalendar", "href": obj["iCalendar"]["url"]})
-
-        if not obj.get("Meeting Details") == "Meeting\u00a0details":
-            links.append(
-                {"title": "Meeting Details", "href": obj["Meeting Details"]["url"]}
-            )
-
-        if not obj.get("Agenda") == "Not\u00a0available":
-            links.append({"title": "Agenda", "href": obj["Agenda"]["url"]})
-
-        if not obj.get("Agenda Packet") == "Not\u00a0available":
-            links.append(
-                {"title": "Agenda Packet", "href": obj["Agenda Packet"]["url"]}
-            )
-
-        if not obj.get("Minutes") == "Not\u00a0available":
-            links.append({"title": "Minutes", "href": obj["Minutes"]["url"]})
-
-        if not obj.get("Video") == "Not\u00a0available":
-            links.append({"title": "Video", "href": obj["Video"]["url"]})
+        documents = {
+            "Name": "meeting page",
+            "iCalendar": "iCalendar",
+            "Meeting Details": "Meeting Details",
+            "Agenda": "Agenda",
+            "Agenda Packet": "Agenda Packet",
+            "Minutes": "Minutes",
+            "Video": "Video"
+        }
+
+        for key, title in documents.items():
+            doc = obj.get(key, {})
+            if isinstance(doc, dict) and doc.get("url"):
+                if key in ["Meeting Details"] and doc.get("label") == "Meeting\u00a0details":
+                    continue
+                if doc.get("label") != "Not\u00a0available":
+                    links.append({"title": title, "href": doc["url"]})
 
         return links

This refactoring:

Uses a dictionary to define document types
Handles missing keys safely
Properly checks for unavailable documents
Reduces code duplication

🧰 Tools

🪛 Ruff (0.8.2)

78-78: Use obj.get("Meeting Details") != "Meeting\xa0details" instead of not obj.get("Meeting Details") == "Meeting\xa0details"

Replace with != operator

(SIM201)

83-83: Use obj.get("Agenda") != "Not\xa0available" instead of not obj.get("Agenda") == "Not\xa0available"

Replace with != operator

(SIM201)

86-86: Use obj.get("Agenda Packet") != "Not\xa0available" instead of not obj.get("Agenda Packet") == "Not\xa0available"

Replace with != operator

(SIM201)

91-91: Use obj.get("Minutes") != "Not\xa0available" instead of not obj.get("Minutes") == "Not\xa0available"

Replace with != operator

(SIM201)

94-94: Use obj.get("Video") != "Not\xa0available" instead of not obj.get("Video") == "Not\xa0available"

Replace with != operator

(SIM201)

🧹 Nitpick comments (1)

city_scrapers/spiders/cinoh_city_council.py (1)

16-23: Enhance docstring with more details.

Add information about:

The method's parameters
The return value (yielded Meeting objects)
The structure of the response object

     def parse_legistar(self, response):
         """
-        Parse upcoming and past meetings from the
-        Cincinnati City Council meetings table.
+        Parse upcoming and past meetings from the Cincinnati City Council meetings table.
+
+        Args:
+            response: JSON response from Legistar API containing meeting data
+
+        Yields:
+            Meeting: Meeting object containing parsed meeting information
 
         Oftentimes, the columns: meeting details, agenda,
         minutes, and video are left blank on the calander
         but when they are, they are in the form of links.
         """

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3cb02fb and 221c31f.

📒 Files selected for processing (2)

city_scrapers/spiders/cinoh_city_council.py (1 hunks)
tests/test_cinoh_city_council.py (1 hunks)

🚧 Files skipped from review as they are similar to previous changes (1)

tests/test_cinoh_city_council.py

🧰 Additional context used

🪛 Ruff (0.8.2)

city_scrapers/spiders/cinoh_city_council.py

78-78: Use obj.get("Meeting Details") != "Meeting\xa0details" instead of not obj.get("Meeting Details") == "Meeting\xa0details"

Replace with != operator

(SIM201)

83-83: Use obj.get("Agenda") != "Not\xa0available" instead of not obj.get("Agenda") == "Not\xa0available"

Replace with != operator

(SIM201)

86-86: Use obj.get("Agenda Packet") != "Not\xa0available" instead of not obj.get("Agenda Packet") == "Not\xa0available"

Replace with != operator

(SIM201)

91-91: Use obj.get("Minutes") != "Not\xa0available" instead of not obj.get("Minutes") == "Not\xa0available"

Replace with != operator

(SIM201)

94-94: Use obj.get("Video") != "Not\xa0available" instead of not obj.get("Video") == "Not\xa0available"

Replace with != operator

(SIM201)

🔇 Additional comments (1)

city_scrapers/spiders/cinoh_city_council.py (1)

9-13: LGTM! Configuration is accurate.

The timezone "America/New_York" is correct for Cincinnati, and the Legistar calendar URL is properly configured.

coderabbitai · 2025-01-21T18:17:08Z

city_scrapers/spiders/cinoh_city_council.py

+    def _parse_classification(self, obj):
+        if obj["Name"]["label"] == "Cincinnati City Council":
+            return CITY_COUNCIL
+        else:
+            return COMMITTEE
+


⚠️ Potential issue

Add error handling for missing keys in classification.

The method could raise KeyError if the "Name" or "label" keys are missing.

def _parse_classification(self, obj): - if obj["Name"]["label"] == "Cincinnati City Council": + if obj.get("Name", {}).get("label") == "Cincinnati City Council": return CITY_COUNCIL else: return COMMITTEE

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def _parse_classification(self, obj):

if obj["Name"]["label"] == "Cincinnati City Council":

return CITY_COUNCIL

else:

return COMMITTEE

def _parse_classification(self, obj):

if obj.get("Name", {}).get("label") == "Cincinnati City Council":

return CITY_COUNCIL

else:

return COMMITTEE

scraper and tests first build

cd611e5

vsspnkkvr changed the title ~~Scraper and Tests Build: Cinoh City Council~~ Scraper and Tests Build: Cincinnati City Council Oct 28, 2024

coderabbitai bot reviewed Oct 28, 2024

View reviewed changes

cruznunez requested changes Nov 19, 2024

View reviewed changes

city_scrapers/spiders/cinoh_city_council.py Outdated Show resolved Hide resolved

city_scrapers/spiders/cinoh_city_council.py Outdated Show resolved Hide resolved

tests/test_cinoh_city_council.py Show resolved Hide resolved

changes

f3a20fc

LienDang requested changes Dec 24, 2024

View reviewed changes

cruznunez requested a review from lamle-ea January 7, 2025 18:10

fixes

d508fb9

coderabbitai bot reviewed Jan 8, 2025

View reviewed changes

cruznunez requested changes Jan 10, 2025

View reviewed changes

city_scrapers/spiders/cinoh_city_council.py Outdated Show resolved Hide resolved

links fix with test

fc12e00

lamle-ea requested changes Jan 16, 2025

View reviewed changes

cruznunez approved these changes Jan 16, 2025

View reviewed changes

lint, status, links fixes

3cb02fb

coderabbitai bot reviewed Jan 17, 2025

View reviewed changes

lamle-ea requested changes Jan 20, 2025

View reviewed changes

final flake8 and isort fixes

221c31f

coderabbitai bot reviewed Jan 21, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Scraper and Tests Build: Cincinnati City Council #9

Scraper and Tests Build: Cincinnati City Council #9

vsspnkkvr commented Oct 28, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 28, 2024 •

edited

Loading

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (`.coderabbit.yaml`)

Documentation and Community

coderabbitai bot left a comment

coderabbitai bot Oct 28, 2024

coderabbitai bot Oct 28, 2024

coderabbitai bot Oct 28, 2024

cruznunez left a comment

LienDang left a comment

LienDang Dec 24, 2024

LienDang Dec 24, 2024

coderabbitai bot left a comment

coderabbitai bot Jan 8, 2025

cruznunez left a comment

lamle-ea left a comment

lamle-ea Jan 16, 2025

lamle-ea Jan 16, 2025

cruznunez left a comment

coderabbitai bot left a comment

coderabbitai bot Jan 17, 2025

lamle-ea left a comment

coderabbitai bot left a comment

coderabbitai bot Jan 21, 2025

Scraper and Tests Build: Cincinnati City Council #9

Are you sure you want to change the base?

Scraper and Tests Build: Cincinnati City Council #9

Conversation

vsspnkkvr commented Oct 28, 2024 • edited by coderabbitai bot Loading

What's this PR do?

Why are we doing this?

Steps to manually test

Are there any smells or added technical debt to note?

Summary by CodeRabbit

Summary by CodeRabbit

coderabbitai bot commented Oct 28, 2024 • edited Loading

Walkthrough

Changes

Poem

Chat

CodeRabbit Commands (Invoked using PR comments)

Other keywords and placeholders

CodeRabbit Configuration File (.coderabbit.yaml)

Documentation and Community

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Oct 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Oct 28, 2024

Choose a reason for hiding this comment

coderabbitai bot Oct 28, 2024

Choose a reason for hiding this comment

cruznunez left a comment

Choose a reason for hiding this comment

LienDang left a comment

Choose a reason for hiding this comment

LienDang Dec 24, 2024

Choose a reason for hiding this comment

LienDang Dec 24, 2024

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Jan 8, 2025

Choose a reason for hiding this comment

cruznunez left a comment

Choose a reason for hiding this comment

lamle-ea left a comment

Choose a reason for hiding this comment

lamle-ea Jan 16, 2025

Choose a reason for hiding this comment

lamle-ea Jan 16, 2025

Choose a reason for hiding this comment

cruznunez left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Jan 17, 2025

Choose a reason for hiding this comment

lamle-ea left a comment

Choose a reason for hiding this comment

coderabbitai bot left a comment

Choose a reason for hiding this comment

coderabbitai bot Jan 21, 2025

Choose a reason for hiding this comment

vsspnkkvr commented Oct 28, 2024 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Oct 28, 2024 •

edited

Loading

CodeRabbit Configuration File (`.coderabbit.yaml`)