I was Stuck
When I am stuck;
My emails start piling up;
Pending payments go unnoticed;
People get offended as I don't reply to their DMs;
Meetings are hastened or cancelled;
The error logs start increasing;
All the KMPs take a sudden nose-dive.
I get stuck fixing the music player while all the guests are waiting for dinner. I get stuck fixing the page layout in Excel when it is the last week of September and a dozen tax returns need to be filed.
I was in 6th or 7th Grade. We had a Physics test. I got stuck deriving the formula for Centigrade to Fahrenheit conversion. I don't know when the 30 minutes got over. I could answer no other questions and got a good zero.
Stuck in calculating median PE
Until now, the median PE ratio was calculated only on the year-end values. The 10-year median PE considered only 10 values. This was wrong! The correct median should consider all traded days over the last 10 years. It should be the median of 250 trading days x 10 years = 2,500 values.
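To make the difference concrete, here is a minimal sketch comparing the two methods. The PE numbers and the function names are made up for illustration (two "years" of five trading days each), and it uses Python's standard statistics module rather than the actual script.

```python
from statistics import median

# Hypothetical daily PE values: two "years" of five trading days each.
daily_pe = {
    2022: [10.0, 11.0, 12.0, 13.0, 30.0],
    2023: [14.0, 15.0, 16.0, 17.0, 8.0],
}


def year_end_median(pe_by_year):
    """Median of only the last value of each year (the old method)."""
    return median(values[-1] for values in pe_by_year.values())


def all_days_median(pe_by_year):
    """Median over every traded day in the window (the corrected method)."""
    return median(v for values in pe_by_year.values() for v in values)


old = year_end_median(daily_pe)
new = all_days_median(daily_pe)
print(old, new)  # 19.0 13.5
```

In this toy data the year-end values happen to be the outliers, so the two medians land far apart. With real prices the gap may be smaller, but the year-end version still throws away almost all of the data.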
The original script, which calculated the ratios for all 5000 companies, took around 24 minutes. The correct computation method increased the time to over 56 minutes. I wanted it to be under 25 minutes. That's where I got stuck.
After multiple hits-and-tries, brain-fuck moments, and 2kg more weight on the body, I was able to bring down the script time to under 22 minutes. Yup, this includes 250x more calculations and yet a shorter time.
There were 4 main optimisations:
- Dumping large tables into Pandas using CSV: loading with read_sql took 3+ minutes, versus 50 seconds using the SELECT ... INTO OUTFILE option in MySQL.
- Slicing vs querying in Pandas: filtering rows by slicing on a sorted index is 100x faster than queries
- Using a lot of lru_cache
- Changing the algorithm for historical computation
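Two of these ideas, slicing on a sorted index and lru_cache, can be sketched in a few lines. This is only an illustration under assumptions, not the actual script: the series, the dates, and the function names (window_by_mask, window_by_slice, median_pe) are all hypothetical. The CSV dump is left out because it needs a running MySQL server.

```python
from functools import lru_cache

import pandas as pd

# Hypothetical daily PE series for one company, indexed by date and sorted.
dates = pd.date_range("2023-01-02", periods=6, freq="D")
pe = pd.Series([10.0, 12.0, 11.0, 20.0, 22.0, 21.0], index=dates).sort_index()


def window_by_mask(start, end):
    """Boolean mask: compares every index value against both bounds."""
    return pe[(pe.index >= start) & (pe.index <= end)]


def window_by_slice(start, end):
    """Label slice on a sorted index: pandas can search for the bounds."""
    return pe.loc[start:end]


@lru_cache(maxsize=None)
def median_pe(start, end):
    """Memoised median; arguments must be hashable (Timestamps are)."""
    return float(window_by_slice(start, end).median())


start, end = pd.Timestamp("2023-01-03"), pd.Timestamp("2023-01-05")
first = median_pe(start, end)   # computed
second = median_pe(start, end)  # served from the cache
print(first, median_pe.cache_info().hits)  # 12.0 1
```

Both window functions return the same rows; only the lookup strategy differs. On six rows there is nothing to measure, and the speed-up on a real table depends on its size and index, so it is worth timing both on your own data.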