Wildfire mitigation has become one of the most significant risk and safety initiatives for California investor-owned utilities in recent years. At San Diego Gas & Electric, several teams have leveraged a variety of analytical tools to focus and prioritize mitigation strategies. In the past year, however, our team has shifted its focus toward big data analytics, machine learning, and statistical modeling to address the growing demand for data-driven solutions. Since our first data scientist hire, we have developed and deployed predictive models that support real-time decision-making during Public Safety Power Shutoff (PSPS) events.
In this talk, we report on our full technical solution for model development and deployment using open-source Python libraries and Amazon Web Services, with emphasis on the technical, operational, and administrative challenges faced when pushing the limits of the systems common among utilities. These include the technical limitations of the data sources used for model training, the challenges of working with secure, on-premises data and how they charted our path to a cloud platform, and the considerations involved in hiring talent capable of using the latest tools in AI and machine learning. This year-one retrospective frames the progress SDG&E has made in deploying cloud-based models in relation to our data maturation journey. We will demonstrate key areas where we identified the need for better data science practices, such as version control, reproducibility, and statistical rigor, and how we ultimately addressed those needs.
- For teams that have yet to take the step into data science, this talk will serve as a preparatory guide to the factors to consider before the first hire; for utilities that are more mature in analytics, it will be a dialogue on best practices and common challenges.
- Technical challenges with the data utilities often collect for regulatory compliance
- Common pitfalls and the largest blockers to model deployment
- The skillsets to consider when hiring the first data scientist
- Areas where best practices and centralization improve efficiency and reduce the chance of error