Computing
We are getting new computers, so Piset wants us to test the code we run to get a sense of what type of machine we might need.
I ran the script https://github.com/psrc/travel-studies/blob/master/2017/summary/hh_survey_2017_summary.py on my local machine, and it took 153 seconds (2.55 minutes).
On my “modeling” machine, it took 153 seconds; on the test machine, it took 156 seconds. On both machines, this particular script was clearly not maxing out the CPU. RAM use was about 10 GB on my machine but, for reasons I haven't pinned down, about 6 GB on the test machine.
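To make timings like these easier to repeat across machines, here is a minimal sketch of a timing harness. It is not how the numbers above were collected. The names `profile_call` and `example_workload` are hypothetical, and the workload is a stand-in for the real summary script. Note that `tracemalloc` only tracks memory allocated through Python's allocator, so the peak it reports will generally be lower than the RAM use shown in a system monitor.

```python
import time
import tracemalloc


def profile_call(func, *args, **kwargs):
    """Run func once and return (result, elapsed_seconds, peak_bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    try:
        result = func(*args, **kwargs)
    finally:
        # Record wall-clock time and peak traced memory even if func raises.
        elapsed = time.perf_counter() - start
        _, peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
    return result, elapsed, peak


def example_workload(n=200_000):
    # Hypothetical stand-in for the real script's work.
    return sum(x * x for x in range(n))


if __name__ == "__main__":
    _, elapsed, peak = profile_call(example_workload)
    print(f"elapsed: {elapsed:.2f} s ({elapsed / 60:.2f} min)")
    print(f"peak traced memory: {peak / 1e6:.1f} MB")
```

Running the same file on each candidate machine gives directly comparable elapsed times. Tracing allocations adds some overhead, so for a pure speed comparison the `tracemalloc` calls could be left out.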
The conclusion I’m coming around to is this:
Basic machines are probably fine for the data science team (members are Suzanne, Craig, Angela, Mike, Christy, Chris P., and Diana), but we need resources for those exceptional times when they have something computationally expensive to do.
Part of what is going on here is that some of the people who were traditionally modelers are now labeled data scientists, so we need a dedicated resource for them.
Here are my thoughts on how to deal with that: a) allocate a few of the modeling servers to the data science team, and/or b) set up an AWS server for the data science team. We can help you home in on the specs, which will be lower than the modeling server's.