PySpark, Intermediate
Too Many Small Files from Hourly Writes
Each hourly job writes many tiny files into the same date partition. As the files accumulate, metadata overhead (listing and opening each file) dominates scan time rather than reading the data itself.
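One common remedy is to compact a date partition into fewer, larger files once its hourly writes are done. The sketch below is a minimal illustration, not the exercise's reference solution. It assumes Parquet data, an existing SparkSession passed in as `spark`, a 128 MiB target file size, and a partition size in bytes that the caller already knows. The helper names `target_file_count` and `compact_partition` are hypothetical.

```python
import math

# Assumed target size per output file; tune for your storage and workload.
TARGET_FILE_BYTES = 128 * 1024 * 1024


def target_file_count(total_bytes, target_file_bytes=TARGET_FILE_BYTES):
    """Return how many output files keep each file near target_file_bytes."""
    if total_bytes <= 0:
        return 1
    return max(1, math.ceil(total_bytes / target_file_bytes))


def compact_partition(spark, src_path, dst_path, total_bytes):
    """Rewrite one date partition into fewer, larger Parquet files.

    `spark` is an existing SparkSession. The paths and the Parquet format
    are assumptions. Output goes to dst_path instead of overwriting
    src_path, because a Spark job should not overwrite a path it is
    reading from.
    """
    num_files = target_file_count(total_bytes)
    df = spark.read.parquet(src_path)
    # repartition() shuffles rows into num_files roughly even partitions,
    # and each partition is written as one file.
    df.repartition(num_files).write.mode("overwrite").parquet(dst_path)
    return num_files
```

For the hourly job itself, calling `coalesce` with a small partition count before the write is a lighter option, since it reduces the number of files per run without a full shuffle.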
Practice type
Log / Error Analysis
Estimated time
17 min
Skills
PySpark, Small Files, Compaction