Fix date parsing, timezone, and type errors in Yahoo collector#2118

Open
Ayush10 wants to merge 1 commit into microsoft:main from
Ayush10:fix/issue-1981-yahoo-collector-bugs

Ayush10 commented Jan 31, 2026

Summary

  • Fix ValueError on mixed date formats (YYYY-MM-DD vs YYYY-MM-DD HH:MM:SS) by using pd.to_datetime(utc=True) which handles mixed formats across all pandas versions
  • Fix AttributeError: 'Index' object has no attribute 'tz_localize' by switching to tz_convert(None) after utc=True conversion
  • Fix TypeError: unsupported operand type(s) for /: 'str' and 'float' by adding pd.to_numeric(errors="coerce") before arithmetic operations on columns that may contain string data from CSV reads
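
The timezone handling described above can be sketched as follows. This is a minimal illustration, not the code from the PR: the sample timestamps and variable names are hypothetical, and the inputs deliberately share one string format so that only the offset handling is being shown.

```python
import pandas as pd

# Hypothetical sample data: timestamps carrying different UTC offsets.
raw = ["2024-01-02 09:30:00+08:00", "2024-01-03 09:30:00-05:00"]

# utc=True converts every value to UTC, so the result is a tz-aware
# DatetimeIndex rather than a generic object Index.
aware = pd.to_datetime(raw, utc=True)

# tz_convert(None) drops the timezone and keeps the UTC wall-clock time.
naive = aware.tz_convert(None)

print(naive[0])  # 2024-01-02 01:30:00
print(naive[1])  # 2024-01-03 14:30:00
```

Because the intermediate result is always tz-aware, `tz_convert(None)` applies here, whereas `tz_localize` would be the call for attaching a timezone to naive data. On a Series the same step is written through the `.dt` accessor, for example `series.dt.tz_convert(None)`.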

Changes

  • scripts/data_collector/yahoo/collector.py: Fix normalize_yahoo date handling (lines 395-396), add numeric coercion in adjusted_price and _manual_adj_data
  • scripts/data_collector/base.py: Fix Normalize._executor date filtering (line 308)
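
A rough sketch of the numeric coercion idea, under the assumption that a price column arrives as strings after a CSV read. The frame, the column names `close`, `adjclose`, and `factor`, and the values are hypothetical and only mimic the situation; they are not taken from the collector code.

```python
import pandas as pd

# Hypothetical frame: "close" was read back as strings, one entry is malformed.
df = pd.DataFrame(
    {
        "close": ["10.0", "20.0", "bad"],
        "adjclose": [5.0, 10.0, 4.0],
    }
)

# Without coercion, df["adjclose"] / df["close"] would raise a TypeError
# because one operand holds str values.
df["close"] = pd.to_numeric(df["close"], errors="coerce")

# Unparseable entries are now NaN, so the division succeeds and
# propagates NaN for the malformed row.
df["factor"] = df["adjclose"] / df["close"]

print(df["factor"].tolist())  # [0.5, 0.5, nan]
```

With `errors="coerce"`, bad values surface as NaN in the result instead of stopping the run, so downstream code may still want to check for unexpected missing values.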

Test plan

  • The fillna(method="ffill") deprecation warning mentioned in the issue is already fixed in the current codebase (.ffill() is used)
  • Performance improvements and --skip_download are feature requests beyond the scope of this bug fix PR

Fixes #1981

- Use pd.to_datetime(utc=True) + tz_convert(None) to handle mixed date
  formats and timezone-aware/naive inputs in pandas >= 2.0
- Add pd.to_numeric(errors="coerce") before arithmetic on columns that
  may contain string data from CSV reads
- Apply same utc=True fix in base.py Normalize._executor date filtering

Fixes microsoft#1981